PROGRESS IN
Nucleic Acid Research and Molecular Biology
Volume 55
PROGRESS IN
Nucleic Acid Research and Molecular Biology

edited by

WALDO E. COHN
Biology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee

KIVIE MOLDAVE
Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California

Volume 55

ACADEMIC PRESS
San Diego London Boston Sydney Tokyo Toronto New York
This book is printed on acid-free paper.

Copyright © 1996 by ACADEMIC PRESS
All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Academic Press, Inc.
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com

Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/

International Standard Serial Number: 0079-6603
International Standard Book Number: 0-12-540055-1

PRINTED IN THE UNITED STATES OF AMERICA
96 97 98 99 00 01 EB 9 8 7 6 5 4 3 2 1
Contents

ABBREVIATIONS AND SYMBOLS
SOME ARTICLES PLANNED FOR FUTURE VOLUMES

Experimental Analysis of Global Gene Regulation in Escherichia coli
Robert M. Blumenthal, Deborah W. Borst and Rowena G. Matthews
  I. What Is a Global Regulator?
  II. Methods for Experimental Analysis of Global Regulators
  III. Experimental Analysis of Global Regulators and Their Roles in Escherichia coli: Some Examples
  IV. Summary and Conclusions
  References

Eukaryotic Nuclear RNase P: Structures and Functions
Joel R. Chamberlain, Anthony J. Tranguch, Eileen Pagán-Ramos and David R. Engelke
  I. Ribonuclease P
  II. Yeast Nuclear RNase-P RNA
  III. Analysis of Mutations in RPR1
  References

Effects of the Ferritin Open Reading Frame on Translational Induction by Iron
David P. Mascotti, Lisa S. Goessling, Diane Rup and Robert E. Thach
  I. The IRE and IRPs Are Necessary for Iron Inducibility of Ferritin
  II. Sequences Downstream of the IRE Augment Its Function
  III. Comments and Future Directions
  References

Depletion of Nuclear Poly(ADP-ribose) Polymerase by Antisense RNA Expression: Influence on Genomic Stability, Chromatin Organization, DNA Repair, and DNA Replication
Cynthia M. G. Simbulan-Rosenthal, Dean S. Rosenthal, Ruchuang Ding, Joany Jackman and Mark E. Smulson
  I. Biological Roles of PARP as Assessed by Studies with Chemical Inhibitors
  II. Molecular Biological Approaches to Study Functional Roles of PARP
  III. Induction of PARP Antisense RNA Depletes Endogenous PARP mRNA, Protein Levels, and Activity at Selected Biological Time Frames
  IV. Influences of PARP Antisense RNA Expression on Chromatin Organization and Genomic Stability
  V. Effects of PARP Antisense RNA Expression on Nuclear DNA Repair, Replication, and Differentiation
  VI. Other Putative Roles of PARP Currently under Study: Apoptosis
  References

The Large Ribosomal Subunit Stalk as a Regulatory Element of the Eukaryotic Translational Machinery
Juan P. G. Ballesta and Miguel Remacha
  I. Components of the Eukaryotic Ribosomal Stalk
  II. The Cytoplasmic Pool of the Stalk Components
  III. The P1/P2-P0 Protein Complex
  IV. Exchange of P Proteins in the Ribosome
  V. Phosphorylation of the Stalk Proteins
  VI. Functional Roles of the Eukaryotic Stalk Components
  VII. Regulation of Ribosome Activity and Translation by the Eukaryotic Ribosomal Stalk
  VIII. Regulation of P1 and P2 Expression
  IX. Future Prospects
  References

Regulation and Function of Adenosine Deaminase in Mice
Michael R. Blackburn and Rodney E. Kellems
  I. Developmental and Tissue-specific Expression of Ada
  II. Regulation of Ada Gene Expression
  III. Physiological Role of ADA during Development
  IV. Role of ADA in the Murine Immune System
  V. Role of ADA in the Secondary Deciduum
  VI. Role of ADA in the Gastrointestinal Tract
  VII. Summary
  References

S1-Nuclease-sensitive DNA Structures Contribute to Transcriptional Regulation of the Human PDGF A-chain
Zhao-Yi Wang and Thomas F. Deuel
  I. Background
  II. S1-sensitive Sites in PDGF A-Chain Gene
  III. Summary and Perspective
  References

Minute Virus of Mice cis-Acting Sequences Required for Genome Replication and the Role of the trans-Acting Viral Protein, NS-1
Caroline R. Astell, Qingquan Liu, Colin E. Harris, John Brunstein, Hitesh K. Jindal and Pat Tam
  I. cis-Acting Sequences Required for MVM DNA Replication
  II. The Nonstructural Proteins of MVM
  III. Summary and Future Directions
  References

INDEX
Abbreviations and Symbols

All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editors endeavor to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3); they are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form in the frontmatter of Volumes 9-33 of this series. A recent expansion of the one-letter system (5) follows.

SINGLE-LETTER CODE RECOMMENDATIONSᵃ (5)

Symbol   Meaning                  Origin of symbol
G        G                        Guanosine
A        A                        Adenosine
T(U)     T(U)                     (ribo)Thymidine (Uridine)
C        C                        Cytidine
R        G or A                   puRine
Y        T(U) or C                pYrimidine
M        A or C                   aMino
K        G or T(U)                Keto
S        G or C                   Strong interaction (3 H-bonds)
Wᵇ       A or T(U)                Weak interaction (2 H-bonds)
H        A or C or T(U)           not G; H follows G in the alphabet
B        G or T(U) or C           not A; B follows A
V        G or C or A              not T (not U); V follows U
Dᶜ       G or A or T(U)           not C; D follows C
N        G or A or T(U) or C      aNy nucleoside (i.e., unspecified)
Q        Q                        Queuosine (nucleoside of queuine)

ᵃ Modified from Proc. Natl. Acad. Sci. U.S.A. 83, 4 (1986).
ᵇ W has been used for wyosine, the nucleoside of “base Y” (wye).
ᶜ D has been used for dihydrouridine (hU or H₂Urd).
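As an illustration of how the single-letter code above can be applied, the following is a minimal Python sketch, not part of the Recommendations. The names `IUPAC_CODES`, `expand`, and `matches` are hypothetical; T is used for T(U), and Q is omitted because it denotes a modified nucleoside rather than a set of standard bases.

```python
# Single-letter nucleoside codes from the table above.
# T stands for T(U); Q (queuosine) is deliberately left out.
IUPAC_CODES = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA",    # puRine
    "Y": "TC",    # pYrimidine
    "M": "AC",    # aMino
    "K": "GT",    # Keto
    "S": "GC",    # Strong interaction
    "W": "AT",    # Weak interaction
    "H": "ACT",   # not G
    "B": "GTC",   # not A
    "V": "GCA",   # not T
    "D": "GAT",   # not C
    "N": "GATC",  # aNy
}


def expand(symbol):
    """Return the set of bases that a single-letter symbol stands for."""
    return set(IUPAC_CODES[symbol.upper()])


def matches(pattern, sequence):
    """True if a plain G/A/T/C sequence is consistent with a degenerate pattern."""
    if len(pattern) != len(sequence):
        return False
    return all(base in expand(sym) for sym, base in zip(pattern, sequence.upper()))
```

For example, `matches("GANTC", "GAATC")` holds because N admits any nucleoside, whereas `matches("RY", "CT")` does not because C is not a purine.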
Enzymes

In naming enzymes, the 1984 recommendations of the IUB Commission on Biochemical Nomenclature (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).
REFERENCES

1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere. General.
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.
3. “Handbook of Biochemistry” (G. Fasman, ed.), 3rd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, 1975, Nucleic Acids, Vols. I and II, pp. 3-59. Nucleic acids.
4. “Enzyme Nomenclature” [Recommendations (1984) of the Nomenclature Committee of the IUB]. Academic Press, New York, 1984.
5. EJB 150, 1 (1985). Nucleic Acids (One-letter system).

Abbreviations of Journal Titles

Journals                                   Abbreviations used
Annu. Rev. Biochem.                        ARB
Annu. Rev. Genet.                          ARGen
Arch. Biochem. Biophys.                    ABB
Biochem. Biophys. Res. Commun.             BBRC
Biochemistry                               Bchem
Biochem. J.                                BJ
Biochim. Biophys. Acta                     BBA
Cold Spring Harbor                         CSH
Cold Spring Harbor Lab                     CSHLab
Cold Spring Harbor Symp. Quant. Biol.      CSHSQB
Eur. J. Biochem.                           EJB
Fed. Proc.                                 FP
Hoppe-Seyler’s Z. Physiol. Chem.           ZpChem
J. Amer. Chem. Soc.                        JACS
J. Bacteriol.                              J. Bact.
J. Biol. Chem.                             JBC
J. Chem. Soc.                              JCS
J. Mol. Biol.                              JMB
J. Nat. Cancer Inst.                       JNCI
Mol. Cell. Biol.                           MCBiol
Mol. Cell. Biochem.                        MCBchem
Mol. Gen. Genet.                           MGG
Nature, New Biology                        Nature NB
Nucleic Acid Research                      NARes
Proc. Natl. Acad. Sci. U.S.A.              PNAS
Proc. Soc. Exp. Biol. Med.                 PSEBM
Progr. Nucl. Acid. Res. Mol. Biol.         This Series
Some Articles Planned for Future Volumes

Structure and Transcription Regulation of Nuclear Genes for the Mouse Mitochondrial Cytochrome c Oxidase
NARAYAN G. AVADHANI, A. BASU, C. SUCHAROV AND N. LENKA

General Transcription Factors for RNA Polymerase II
RONALD C. CONAWAY AND JOAN W. CONAWAY

The Internal Structure of the Ribosome
BARRY S. COOPERMAN

RecA Protein in Recombinational DNA Repair
MICHAEL COX AND ALBERTO I. ROCA

Biochemistry and Molecular Genetics of Cobalamin Biosynthesis
JORGE C. ESCALANTE-SEMERENA

Intron-encoded snRNAs
MAURILLE J. FOURNIER AND E. STUART MAXWELL

Mechanisms for the Selectivity of the Cell’s Proteolytic Machinery
ALFRED GOLDBERG, MICHAEL SHERMAN AND OLIVER COUX

Structure/Function Relationships of Phosphoribulokinase and Ribulosebisphosphate Carboxylase/Oxygenase
FRED C. HARTMAN AND HILLEL K. BRANDES

The Nature of DNA Replication Origins in Higher Eukaryotic Organisms
JOEL A. HUBERMAN AND WILLIAM C. BURHANS

Function and Regulatory Properties of the MEK Kinase Family
GARY L. JOHNSON ET AL.

Changes in Gene Structure and Regulation of Cell Adhesion Molecules during Epithelial Tumorigenesis
YOUNG S. KIM AND JANUSZ JANKOWSKI

Developmental Genome Reorganization in Ciliated Protozoa
LAWRENCE A. KLOBUTCHER AND GLENN HERRICK

Mammalian DNA Polymerase Delta: Structure and Function
MARIETTA Y. W. T. LEE

mRNA Stability: Role in Human Hemoglobin Gene Expression
STEPHEN A. LIEBHABER

DNA Helicases: Roles in DNA Metabolism
STEVEN W. MATSON AND DANIEL W. BEAM

Molecular Genetics of Yeast TCA Cycle Enzymes
LEE MCALISTER-HENN

Bacterial and Eukaryotic DNA Methyltransferases
NORBERT O. REICH

Self-glucosylating Initiator Proteins and Their Role in Glycogen Biosynthesis
PETER J. ROACH

New and Atypical Families of Type I Interferons in Mammals
R. MICHAEL ROBERTS, LIMIN LIU AND ANDREI ALEXENKO

DNA Excision Repair Assays
AZIZ SANCAR AND DAVID MU

Chemical Synthesis and Structure of Small RNA Molecules
MATHIAS SPRINZL AND STEFAN LIMMER

Transcriptional Regulation of Small Nuclear RNA Genes
WILLIAM E. STUMPH

Bacillus subtilis as I Know It
NOBORU SUEOKA

Molecular Biology of Axon-Glia Interactions in the Peripheral Nervous System
UELI SUTER

Oligo- and Poly-nucleotides as Biologically Active Compounds
V. VLASSOV ET AL.

Molecular Genetic Approaches to Understanding Drug Resistance in Protozoan Parasites
DYANN WIRTH ET AL.

Molecular Regulation of Cytokine Gene Expression: Interferon γ as a Model System
HOWARD A. YOUNG AND PARITOSH GHOSH
Experimental Analysis of Global Gene Regulation in Escherichia coli

ROBERT M. BLUMENTHAL
Department of Microbiology, Medical College of Ohio, Toledo, Ohio 43699

DEBORAH W. BORST AND ROWENA G. MATTHEWS¹
Biophysics Research Division and Department of Biological Chemistry, The University of Michigan, Ann Arbor, Michigan 48109
I. What Is a Global Regulator?
  A. A Regulatory Paradigm
  B. Regulation of the Expression of Target Operons Generally Operates at the Level of Transcript Initiation
  C. How Do Global Regulators Differ from Local Regulators?
  D. How Are Global Regulators Controlled?
  E. What Are the Advantages of Using Global Regulators?
II. Methods for Experimental Analysis of Global Regulators
  A. How Are Regulon Members Identified and Confirmed?
  B. How Can Actions of the Regulator Be Studied in Vitro?
III. Experimental Analysis of Global Regulators and Their Roles in Escherichia coli: Some Examples
  A. Use of Two-dimensional Gel Electrophoretic Analyses to Study the Heat-Shock Regulon
  B. Nitrogen Source Utilization and Two-component Response Regulators
  C. The Leucine-responsive Regulatory Protein and Metabolite-modu...
Escherichia coli is the most studied organism on the planet [“Holy coli”! (A. Kornberg)] and has served as the paradigm for our knowledge about cellular regulation since the original formulation of the operon model (1). After such intensive study for more than three decades, one might expect that there would be little new to report. Instead, recent studies on the regulation of metabolism in E. coli reveal the amazing complexity and subtlety of regulation in this simple unicellular organism. As the focus has moved from description of local regulatory circuits to the coordination of the expression of many genes in global networks, the unique advantages of E. coli for both in vitro and in vivo studies of cellular regulation have kept it the organism of choice for such studies.

In this review, we focus on the experimental approaches to the study of global regulation in E. coli. We discuss specific global regulatory proteins and their regulons primarily to illustrate the portfolio of experimental approaches, rather than to provide a comprehensive review of specific regulatory networks. Our focus is on global regulators in general: why they are useful to bacteria, what the characteristics of each type might be, and how to identify and confirm genes that are controlled by them. For more detailed discussion of specific global regulators, the reader is advised to consult reviews on the heat-shock regulon and RpoH (2), RpoS (3), the nitrogen regulon and RpoN (4), cyclic AMP and Crp (5), Lrp (6), and two-component regulatory systems (7).

¹ To whom correspondence may be addressed.
I. What Is a Global Regulator?

I would have nobody to control me; I would be absolute: and who but I? Now, he that is absolute can do what he likes; he that can do what he likes can take his pleasure; he that can take his pleasure can be content; and he that can be content has no more to desire. [Miguel de Cervantes Saavedra, Don Quixote, Chap. xxiii, 1615 (Lockhart transl.)]

Freedom’s just another word for ‘nothing left to lose.’ (Kris Kristofferson and Fred Foster, Me and Bobby McGee)
A. A Regulatory Paradigm

1. WHAT IS A REGULON?

It is difficult to talk about global regulators without using the term regulon, because that is what a global regulator controls. The term is defined as follows in a classic text (8) on bacterial physiology: “A relatively simple network of operons that is controlled only by a common regulatory protein and its effector ligand. . . . The term has, however been used in a more general sense to refer even to complex systems involving multiple individual
regulators in addition to a pleiotropic regulatory protein.” As that text goes on to point out, an additional term (modulon) is sometimes used for regulons in cases in which the members are also controlled by additional proteins; but we find this distinction to be limited in usefulness and in use, and so we do not employ it in this review. A further issue is whether an operon must be directly controlled by a regulatory protein in order to be termed a member of the regulon. Our approach is to use the term target operon to refer to an operon subject to direct control, as demonstrated by DNA mobility-shift or footprinting assays, and by mutagenesis of the target operon control region. We also include in the regulon any other operons that show significantly altered regulatory behavior in a strain mutant for the relevant regulatory protein. This is the operational definition that seems most often used in recent literature.

Two caveats must be mentioned, however. First, the intent of this definition is to include in the regulon operons that are directly influenced by target operons (i.e., under second-generation control); this makes good sense when considering integrated cell physiology, for which the pattern of regulation is more relevant than the mechanism(s) used to achieve that regulation. This definition is not meant to include operons influenced by nonspecific effects of the regulatory mutation. For example, cells in which the crp gene for catabolite activator protein has been interrupted show profound reductions in growth rate (9), and genes that respond to changes in growth rate are not considered to be part of the Crp regulon. The second caveat is that this definition of regulon membership is open-ended in a quantitative sense: the limit is defined by the statistical significance of the measured regulatory change.

One other useful term is appropriate to introduce here. A stimulon is the set of operons that respond significantly to some environmental stimulus. This term is useful when nothing is known about the regulator(s) controlling individual genes, or when several regulatory proteins are involved in the response to a common environmental change. Membership in the stimulon is open-ended in the same sense as regulon membership.
2. STIMULUS-RESPONSE PATHWAY

A paradigm that provides a useful framework for the analysis of regulatory systems (8) is shown in Fig. 1. A stimulus (increase in temperature, presence of extracellular leucine, etc.) is detected by a sensor. The sensor converts this detection to a signal (chemical or conformational) that is passed, directly or via one or more transducers, to a regulator. The regulator acts on a number of target operons and [in a variation on the original figure (8)] the products of target operons may act on secondary operons to form a regulatory cascade. The production and activity of proteins specified by these operons constitute the response, which then can feed back to moderate response strength at the level of signal production, transducer activity, regulator activity, or some combination of these. The feedback or return portion of the paradigm allows system stability: either a return to the prestimulus level of expression, or stabilization at a new, more appropriate level of expression.

FIG. 1. A paradigm for analysis of regulatory systems. [Adapted, with permission, from Neidhardt et al. (8).] Described in the text.

This figure frames the key questions to be asked in the study of global regulons:

a. What is the regulon? Which genes and proteins belong to it?
b. What is the physiological role of the regulon? Under what conditions are its member genes expressed? What is the signal recognized by the regulator?
c. How does the regulator function? How is the expression of individual genes in the regulon controlled by the regulator? How is the regulator activated or inactivated by the signal?
d. How does the regulon function? What is the feedback circuit by which the magnitude of the response is modulated? Which other genes are controlled by members of this regulon?
e. How is the regulon integrated into the cell’s overall response to a given stimulus? Which other regulators control members of this regulon? How do different regulators interact with one another?
We begin by providing some background on the setting in which global regulation occurs, and then move to a consideration of the experimental methods used to answer these questions in the study of global regulation. In many instances, fundamental information needed to understand global regulation is not available; we try to point out those areas in which further research is needed.
B. Regulation of the Expression of Target Operons Generally Operates at the Level of Transcript Initiation

1. TRANSCRIPTION AND TRANSLATION ARE CONCURRENT IN BACTERIA, AND mRNA GENERALLY HAS A SHORT HALF-LIFE

In bacteria, which do not sequester DNA in a membrane-bounded nucleus, ribosomal initiation complexes bind to nascent mRNA as soon as a ribosome binding site has emerged from the transcription complex; initiating ribosomes interact with about 50 nucleotides of mRNA roughly centered on the initiation codon (10, 11). Unrestricted translating ribosomes generally move along the mRNA at the same speed that unrestricted RNA polymerase moves along the DNA template. Although the rate of translation varies widely (7.4 to 22 amino acids/second) for individual polypeptides, depending on the codon usage and tRNA availability (12-14), the step-time for the average translating ribosome has been estimated to be 17 amino acids/second (15). Similarly, the rate of transcript elongation varies with the template due to the occurrence of discrete pausing sites at which the pause length can be modulated (16-19), but the average transcribing RNA polymerase proceeds at 20-50 nt/second (14, 20, 21). Because 17 aa/second corresponds to 51 nt/second, the ribosomes can on average keep pace with RNA polymerase. A kinetic coupling between RNA polymerase and its trailing ribosomes has been proposed (22), such that a paused ribosome releases guanosine tetraphosphate (ppGpp) (23), which can bind to the adjacent RNA polymerase (24) and cause it to extend transcriptional pauses (16, 17). Indeed, when ribosomes do not keep pace with the RNA polymerase, transcriptional polarity or premature mRNA decay can result. The transcriptional polarity occurs when the exposed stretch of mRNA between the RNA polymerase and a ribosome is bound by termination factor Rho: the Rho hexamer moves along the mRNA and can cause transcript termination at a subset of template-dependent pause sites (18, 25). The premature decay of mRNA can occur when the exposed stretch of mRNA contains a substrate site for the ribonuclease RNase E (26-28).
The half-life of mRNA is, in any case, measured in minutes due to nucleolytic attack by polynucleotide phosphorylase and RNase II: pulse-chase experiments indicate an average mRNA half-life of about 4 minutes (29). In another study, mRNA decay was indirectly assayed by following the change in rates of synthesis of several polypeptides after addition of the transcription inhibitors rifampicin or streptolydigin, using two-dimensional electrophoresis; the apparent half-lives ranged from 40 seconds to 20 minutes (30). In fact, some of the longest transcripts have never been isolated from normal cells in full-length form because their half-lives are less than the time it takes for a transcribing RNA polymerase molecule to reach the end of the gene (26).

The facts that translation of an mRNA can begin while transcription is still in progress, and that most mRNAs have very short lifetimes, combine to ensure that translational choices available to the pool of ribosomes reflect the cell's transcriptional choices very closely. It also follows that the most energy-efficient focus for regulation of bacterial gene expression is at the level of controlling transcript initiation. In fact, this is the level at which most known regulation of gene expression occurs (8); this is the level at which the known global regulators act; and this is the level at which this review focuses. However, two points are important to note. First, energy efficiency is sometimes a secondary consideration; for example, when temporal responsiveness is the primary consideration, it is most efficient to modulate the activity of preformed enzyme, as is done in the case of glutamine synthetase (see Section III,B,1). Second, although the control of transcript initiation is clearly very important, we may have a biased knowledge base that underrepresents the true degree of regulation that occurs at the levels of transcript elongation and termination, RNA processing, initiation of translation, and post-translational processing.
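The rate comparison and the half-life argument in this subsection can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, mRNA decay is assumed to be simple first-order (exponential) decay, and the 12,000-nt transcript length is an invented example rather than a value from the text.

```python
NT_PER_CODON = 3  # three nucleotides of mRNA are read per amino acid added


def ribosome_rate_nt_per_s(aa_per_s):
    """Convert a translation rate (amino acids/s) to mRNA nucleotides traversed per second."""
    return aa_per_s * NT_PER_CODON


def fraction_remaining(t_min, half_life_min):
    """Fraction of an mRNA population left after t_min minutes (assumes first-order decay)."""
    return 0.5 ** (t_min / half_life_min)


def transcription_time_min(length_nt, rate_nt_per_s):
    """Minutes needed for RNA polymerase to traverse a template of length_nt nucleotides."""
    return length_nt / rate_nt_per_s / 60.0


# Average ribosome (17 aa/s) versus the 20-50 nt/s range quoted for RNA polymerase.
print("ribosome:", ribosome_rate_nt_per_s(17), "nt/s")

# Hypothetical long transcription unit (length is an assumption, not from the text):
# at the slow end of the polymerase range, synthesis takes longer than the
# average 4-minute half-life, so little full-length message would be expected.
t = transcription_time_min(12_000, 20)
print("time to transcribe:", t, "min")
print("fraction of 5' ends surviving that long:", fraction_remaining(t, 4.0))
```

The sketch treats the whole transcript as decaying with a single half-life, which is a simplification; it is meant only to show why a transcript whose synthesis time exceeds its half-life is unlikely to be recovered in full-length form.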
2. STEPS IN TRANSCRIPT INITIATION AT WHICH REGULATORS CAN ACT

Transcript initiation has come to be recognized as a complex multistage process. Thus, even if the bulk of gene regulation is focused on controlling transcript initiation, there are many potential mechanisms by which this can be accomplished (31). It is important to note that, for most known regulators, the exact mechanism of action has not yet been determined. Let us briefly go through the major steps of transcript initiation as they are now understood, and discuss how each can be modulated by a regulator. These steps are summarized in Fig. 2.

FIG. 2. A broad overview of the transcription cycle, focusing on steps in initiation. Note that steps a-d also correspond to subsections a-d of the text (Section I,B,2). (a) Free RNA polymerase associates with a σ factor and binds nonspecifically to the DNA. (b) The RNA polymerase locates and binds to a promoter (grey box), resulting in a closed complex. (c) The promoter is isomerized (strand separation, formation of open complex). (d) The σ factor is released, and RNA polymerase enters a cycle of abortive initiations, eventually escaping to proceed with transcript elongation. (e) Transcript elongation, with concomitant translation if the transcript is an mRNA. (f) Termination and release of RNA polymerase into the free pool.

a. The Pools of Sequestered and Free RNA Polymerase. We begin with a cell that, at any given moment, has a certain concentration of RNA polymerase and a certain concentration of promoters. This should be the simplest step in transcript initiation to describe, but our incomplete knowledge of conditions inside the cell makes the story rather complex. Nonetheless, this is a story worth exploring, because it describes the environment in which all global regulators act, and illustrates the limitations to our understanding forced by our relative ignorance of conditions inside the living cell.

i. DNA. First, we address the concentration of DNA inside an E. coli cell. It is quite high, ranging from 5 to 25 mg/ml (or ~7.5 to 40 mM bp) (32); the DNA is not evenly distributed throughout the cytoplasm, and within the nucleoid the DNA concentration may be as high as 30-100 mg/ml (~45-150 mM bp) (33). As this suggests, DNA concentration is not limiting for transcription, and is not so even in mutants with decreased DNA:mass ratios (32). If we take the average size of an E. coli transcription unit as being 2.5 kb (i.e., containing two average-size genes and a bit of untranslated 5’ sequence), and assume that there are few substantial gaps between transcription units [apparently functionless 600- to 800-bp “grey holes” between transcription units are rare (34-36)], then the total DNA concentration corresponds to a promoter concentration of 3-15 µM. However, this fairly straightforward calculation cannot be used with much confidence. Due apparently to histonelike proteins and nucleoid structural elements, the DNA is not uniformly accessible to binding proteins. An in vitro study with the E. coli HU protein revealed the formation of a nucleosome-like structure, with one HU tetramer per 60 bp of DNA (37). The estimated fraction of DNA occluded in vivo by bound proteins, including HU, ranges from 20 to 80% (38-40). Using the partitioning between specific sites and nonspecific DNA for five regulatory proteins, and comparing the in vitro to in vivo partitioning, the E. coli intracellular concentration of DNA effectively available to binding proteins was estimated to be about 100 µM bp (41), or about 1% of the chemical amount of DNA.
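The unit conversions behind these estimates can be reproduced with a short calculation. The Python sketch below is a hedged illustration, not the authors' method: the function names are hypothetical, and the average mass of 650 g/mol per base pair is an assumed round figure that does not appear in the text. The 2.5-kb transcription unit and the 100 µM bp effective concentration are taken from the paragraph above.

```python
BP_MASS_G_PER_MOL = 650.0     # ASSUMED average mass of one base pair (not from the text)
TRANSCRIPTION_UNIT_BP = 2500  # average transcription unit size used in the text


def mg_per_ml_to_mM_bp(mg_per_ml):
    """mg/ml equals g/L; dividing by g/mol gives mol/L, and x1000 gives mM of base pairs."""
    return mg_per_ml / BP_MASS_G_PER_MOL * 1000.0


def promoter_uM(mM_bp):
    """One promoter per transcription unit; convert mM bp to µM promoters."""
    return mM_bp * 1000.0 / TRANSCRIPTION_UNIT_BP


def effective_fraction(effective_uM_bp, chemical_mM_bp):
    """Fraction of the chemical DNA concentration that is effectively available."""
    return effective_uM_bp / (chemical_mM_bp * 1000.0)


for mg in (5, 25):
    mM = mg_per_ml_to_mM_bp(mg)
    print(f"{mg} mg/ml -> {mM:.1f} mM bp -> {promoter_uM(mM):.1f} uM promoters; "
          f"100 uM bp available = {effective_fraction(100, mM):.2%} of the chemical amount")
```

With the assumed base-pair mass, the two ends of the 5-25 mg/ml range come out close to the ~7.5-40 mM bp and 3-15 µM promoter figures quoted in the text, and 100 µM bp is on the order of 1% of the chemical concentration.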
Using two recombination assays, an analogous comparison of in vivo to in vitro behavior yielded results strikingly similar to those of the binding protein study: the effective DNA concentration in E . coli was 2.5-10% of the chemical concentration (42). Even if this DNA masking, presumably by bound proteins, affects 99% of the DNA, the effective concentration of available DNA in the cell would still be 75-400 pM bp (300-1000 p M bp in the nucleoid). A major uncertainty involves whether promoter regions differ from the rest of the DNA in their coverage by histonelike proteins. Promoters are disproportionately associated with intrinsically bent DNA (43),and it is possible this could selectively reduce the degree of their masking. If promoters are not different from bulk DNA in this coverage, then the effective promoter concentration at any instant should be about 1% of 3-15 p M , or 30-150 nM. On the one hand, overexpression of the histonelike protein H-NS does lead to a global decrease in transcription (though overexpression of HU does not have this effect) (44), suggesting that typical promoters are not immune
from these masking effects. On the other hand, it seems counterintuitive that, at any given moment, there is a 99% chance a given promoter will be nonspecifically occluded: this could substantially reduce temporal responsiveness to changing conditions. This apparent paradox may be resolved by the observation (37) that the interaction of HU with DNA is characterized by rapid equilibrium. If the areas of DNA masked by HU and other proteins rapidly shift, then a given promoter may be briefly but frequently exposed to specific binding proteins, including RNA polymerase. Why would this be advantageous, as opposed to simply leaving the DNA unmasked (beyond what is absolutely required for nucleoid folding)? The answer may have to do with the ability of sequence-specific proteins to discriminate between binding sites and the huge excess of nonspecific DNA. The fantastically high chemical concentration of DNA inside the cell is well above the apparent dissociation constants shown by most sequence-specific DNA-binding proteins toward nonspecific DNA; for example, 50% binding of Lac repressor to nonspecific DNA is seen (at physiological ionic strength) with ~170 µg/ml DNA (37), which is 0.1 to 1% of the in vivo DNA concentration. Thus these proteins could not efficiently occupy their binding sites unless the proteins were quite abundant. With the effective DNA concentration reduced through random dynamic masking, the concentration of nonspecific sites is reduced to a level close to the dissociation constant for nonspecific binding, and efficient site occupancy can occur. If the masking is in fact random, then the ratio of specific sites to nonspecific DNA is not improved.
In other words, the equilibrium distribution of the specific protein between specific sites and nonspecific DNA is improved at the price of reduced speed with which that equilibrium is reached, because the effective concentration of specific sites is reduced by the same factor as is the effective concentration of nonspecific DNA. This may, in part, explain the apparent inability to isolate E. coli strains deficient for all three of the major histonelike proteins IHF, H-NS, and HU (45). In summary, the DNA concentration is very high, but masking reduces the effective DNA concentration to about one-hundredth of the chemical concentration. Because the susceptibility of promoters to this masking is unclear, it is not currently possible to estimate accurately the instantaneous promoter concentration in the cell: it may be as high as 3-15 µM, or as low as 30-150 nM. In any case, as described in Section I,B,2,b, promoters vary greatly in the affinity RNA polymerase has for them, and this further clouds the meaning of “promoter concentration.” Furthermore, it should be noted that many of the strongest promoters (for genes specifying the translation machinery) are concentrated near the chromosomal origin of replication; because of nested initiations, the origin-proximal DNA is selectively amplified at higher growth rates and thus increases the proportion of very strong promoters (46) in a calculable manner
(47). It is also worth noting that the DNA concentration in an individual cell varies over the division cycle (46).

ii. RNA polymerase. The known facts regarding in vivo levels of RNA polymerase are strikingly similar to those for DNA: there is a substantial amount in the cell but the effective concentration is much lower than the chemical concentration. As the cell growth rate increases from 0.6 to 2.5 doublings per hour, the RNA polymerase concentration increases from 0.9 to 1.6% of total cell protein, or 1500-11,400 molecules per cell (14, 48). Over this same growth rate range, the median E. coli cell volume ranges from 0.65 to 2.25 µm³ (49), so the total concentration of RNA polymerase ranges from 3.8 to 8.5 µM. This range is quite similar to the estimates of promoter concentration. However, to consider binding reactions with promoters, we need the concentration of free RNA polymerase, because RNA polymerase molecules already engaged in transcription are not available to affect the promoter binding equilibria. The chemical concentration of free RNA polymerase has been estimated to be 400 nM, based on electrophoretic analysis of the RNA polymerase subunit content of minicells, which, being anucleate, contain only free RNA polymerase (50); this analysis assumes that free RNA polymerase is randomly distributed through the cytoplasm at the time the anucleate minicell pinches off from the mother cell. Thus, 400 nM free-form RNA polymerase represents ~5-10% of the total concentration, which seems to conflict with evidence that only 20-30% of RNA polymerase is actively transcribing DNA (i.e., in the process of elongating transcripts) at any given moment (48). What is the status of the remaining 60-75% of RNA polymerase, which is apparently neither free nor actively elongating transcripts?
Some may be nonspecifically bound to DNA in the process of seeking a promoter [e.g., (51); this form would affect promoter binding equilibria], and some may be trapped in the abortive initiation cycle that precedes promoter clearance (see Section I,B,2,d) or at pause sites within transcription units. Consistent with these possibilities, the amount of intracellular RNA polymerase associated with DNA appears to be higher than the amount calculated to be active (52). In addition, some RNA polymerase may be in pools sequestered in other ways, for example, bound to alternative initiation factors (see Section III,B,2) that target the RNA polymerase to genes whose promoters may at that time happen to be inaccessible (repressed). A comparison of in vivo and in vitro transcription frequencies for a series of promoters, which represents a functional test for the concentration of free RNA polymerase, reveals that growing E. coli cells appear to have an effective level of free RNA polymerase of ~30 nM (53). At least part of this 13-fold difference (400 vs. 30 nM)
could be due to DNA masking: the in vitro transcription frequencies were determined with pure DNA. Other DNA-binding proteins also show such discrepancies, as is discussed for Lrp in Section III,C,4,b. Although it may be difficult to determine the effective molar ratio of RNA polymerase to promoters, there is considerable experimental evidence that RNA polymerase activity is limiting in E. coli. Decreasing the DNA:mass ratio increases expression of a fully induced lac operon, consistent with transcription being limited by the amount of free RNA polymerase (32). This conclusion is reinforced by several additional studies on fully induced or decontrolled lac operon transcription; lac is a particularly useful choice because its position on the chromosome minimizes gene-dosage changes that depend on the growth rate (32, 47). For example, changes in growth rate over a broad range lead to only minor changes in expression of the decontrolled lac operon, consistent with limited transcription capacity (54). Low concentrations of the RNA polymerase initiation inhibitor rifampicin preferentially decrease the rate of lac operon transcription despite the fact that placUV5 is a moderately strong promoter (55). Similar results were seen when a mutant that was temperature sensitive for the major σ factor (σ70) was shifted to partially restricting temperatures (even after accounting for the effects of the heat-shock response) (56). One is led to wonder about the advantage to E. coli in having subsaturating RNA polymerase activity. On the one hand, one would expect saturating levels of RNA polymerase to make the cell temporally more responsive to changing growth conditions, because sufficient RNA polymerase would always be available for immediate transcription of any newly required gene. On the other hand, limiting the transcription capacity may help to reduce background levels of transcription, increasing the effectiveness of repressors and activators.
Limited capacity also establishes an environment that will strongly select for efficient and effective regulators of transcript initiation, which, in the long term, may result in a more efficient cell. It has been noted that the promoters for ribosomal genes are particularly sensitive to the concentration of free RNA polymerase, suggesting that subsaturating RNA polymerase plays a major role in limiting expression of these genes (22). To our knowledge, the effects of coordinately overexpressing the subunits of RNA polymerase, in order to provide the cell with saturating levels of transcription capacity, have not been determined. In any case, the subsaturating transcription capacity emphasizes the utility, when metabolic pathways require high temporal responsiveness, of control based primarily on activity modulation of preformed enzyme. Some aspects of this apparently passive interchange between sequestered and unsequestered (free) pools of RNA polymerase could be subject to some widespread active regulation, but to our knowledge this has not
yet been demonstrated. Alternatively, changes in the average transcript elongation rate could be a primary determinant of the amount of free RNA polymerase: for a given rate of initiation, faster elongation will result in RNA polymerase returning more rapidly to the free pool (22). Both ppGpp and accessory proteins (NusA, GreA, and GreB; see Section I,B,2,d) affect the effective elongation rate of RNA polymerase by modulating the length of pausing.
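The concentration estimates quoted in this section follow from simple unit conversions, and it can be helpful to re-derive them. The short Python sketch below is an illustration only, not part of the original analysis: the function names are hypothetical, and it assumes one promoter per 2.5-kb transcription unit, a uniformly mixed cell volume, and the rounded input values given in the text.

```python
AVOGADRO = 6.022e23  # molecules per mole


def promoter_conc_uM(dna_mM_bp, bp_per_promoter=2500.0):
    """Promoter concentration (µM), assuming one promoter per bp_per_promoter bp.

    dna_mM_bp is the DNA concentration in mM of base pairs.
    """
    return dna_mM_bp * 1000.0 / bp_per_promoter


def molar_conc_uM(molecules_per_cell, cell_volume_um3):
    """Concentration (µM) of a species, assuming it is spread over the whole cell."""
    litres = cell_volume_um3 * 1e-15  # 1 cubic micrometre = 1e-15 litre
    return molecules_per_cell / AVOGADRO / litres * 1e6


if __name__ == "__main__":
    # DNA: ~7.5 to 40 mM bp, with ~1% assumed accessible at any instant.
    for dna in (7.5, 40.0):
        total = promoter_conc_uM(dna)
        accessible_nM = total * 0.01 * 1000.0
        print(f"{dna} mM bp: {total:.1f} uM promoters, {accessible_nM:.0f} nM accessible")

    # RNA polymerase: molecules per cell and median cell volume at slow and fast growth.
    for molecules, volume in ((1500, 0.65), (11400, 2.25)):
        print(f"{molecules} molecules in {volume} um^3: "
              f"{molar_conc_uM(molecules, volume):.1f} uM")
```

With these inputs the results land close to the 3-15 µM promoter range and the 3.8-8.5 µM RNA polymerase range given above; small differences reflect rounding in the quoted values.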
b. Binding to a Promoter to Form a Closed Complex. Promoter binding involves three steps that are analogous to those presumably used in site recognition by other sequence-specific DNA binding proteins, including global regulators. As with the other steps in transcript initiation, our limited knowledge of in vivo conditions limits our understanding of in vivo behavior. In the first step, free RNA polymerase binds to DNA nonspecifically. It then begins a search for recognizable and nonoccluded promoter sites. This search is completed much more rapidly than could be explained by simple diffusion in three dimensions, and appears to involve both a one-dimensional scan of the DNA, called sliding (though the variety of histonelike proteins might limit the length of an individual slide), and direct or intersegment transfer, which is the sampling of DNA segments through a rapid binding and dissociation process that involves diffusion in three dimensions by DNA segments (57, 58). Direct transfer is coupled with “hopping,” the movement of 4-8 bp along the DNA with each transfer, which increases the effective target size of the promoter (58). With the following formula we can estimate the number of random DNA contacts an RNA polymerase molecule must make to have a given chance of contacting a promoter:

$$N = \frac{\ln(1 - P)}{\ln(1 - f)}, \tag{1}$$
where N is the number of DNA contacts required for a probability P of contacting a promoter, and f is the fraction of the DNA constituting promoters (adapted from Ref. 59). If promoters occur roughly every 2.5 kbp, or once every 5.0 kbp in the correct orientation for a given scanning RNA polymerase molecule, and if RNA polymerase must make direct contact with the promoter (assuming no sliding and 8 bp of hopping, so that f = 8/5000), it would take 433 random DNA contacts to have a 50% chance of contacting a promoter. At the DNA concentrations in the cell, this would take very little time, even if the promoter were occluded by histonelike proteins 99% of the time (see Section I,B,2,a). However, RNA polymerase bound nonspecifically presumably does not change its DNA ligand at every contact: the kinetic off-rate for nonspecific DNA has been measured in vitro as 0.3 sec-1 (51). Unless the in vivo off-rate is substantially higher, this would imply a half-time for promoter
contact of 24 minutes for a given RNA polymerase molecule: a full generation for E. coli growing in a rich medium! On top of direct transfer, however, sliding has been predicted from kinetic studies, particularly where direct transfer is reduced by using lower DNA concentrations (58, 60, 61); RNA polymerase sliding has been visualized as well (62). The effect of the mix of E. coli histonelike proteins on this sliding is not known, but RNA polymerase found target promoters at the same rate on DNA that had or had not been packaged in vitro into animal polynucleosomes (63). If we assume that every direct transfer by an RNA polymerase is followed by 60 bp of sliding [the distance between HU tetramers (37)], the number of random contacts needed for a 50% chance of promoter contact would be 29 and the off-rate of 0.3 sec-1 would give a half-time of 1.6 minutes. The effects of random dynamic masking of the DNA on this value depend on whether promoters are subject to the same degree of masking as bulk DNA (see Section I,B,2,a). Again, these arguments can be applied to other sequence-specific DNA-binding proteins, including the global regulators. Once RNA polymerase contacts a promoter, an initial binding can proceed at a rate that depends on the properties of the promoter. Some of these properties affect the isomerization step more than the binding step (see below), but it is worth noting that the dynamic range of E.
coli promoters spans four orders of magnitude, corresponding to a transcription initiation rate between once per generation and once per second (64). The properties that can affect promoter binding strength include the primary sequence of the promoter at the -10 and -35 conserved regions (65); the length and sequence of the DNA between these conserved regions (66); the relative orientation of the -35 and -10 regions, which can in some cases be altered by supercoiling (67), twisting by a bound regulatory protein, or even twisting due to transcription at a different, nearby promoter (68, 69); and the presence of favorable contacts provided by regulatory proteins bound nearby. The process of binding nonspecifically to DNA in order to search for a promoter is not known to be regulated, but the recognition and initial binding to a promoter most certainly are focal points for regulation. Anything that changes the sequence specificity of the RNA polymerase will act at the level of promoter binding (see Section III,A). However, in contrast to predictions of early models, repressors and activators (proteins that, respectively, decrease or increase transcription from a given promoter) do not all act by modulating this initial binding step; it is not even clear that a majority do so. Nonetheless, some regulators do act by interfering with RNA polymerase binding to the promoter: the cI repressor of bacteriophage λ at its target promoter (70), the LacI repressor at plac (71), and the LexA repressor at puvrA (72). These proteins
act by physically obstructing the promoter region. In contrast, some regulators appear to act by improving promoter binding via favorable protein-protein contacts with RNA polymerase (though in some cases such contacts might also favor conformational changes in the RNA polymerase that stimulate isomerization). The best studied example of this is Crp-cAMP (73-75). Other proteins may affect the promoter-binding step by bending the DNA template and not necessarily contacting the RNA polymerase directly at all. For reasons that are not yet clear, intrinsically bent DNA lacking obvious similarity to the consensus -35 sequence can yield a functional promoter when combined with an authentic -10 sequence (76). It has been suggested that the bend allows stabilizing interactions to take place between DNA upstream of the promoter and the “back” of the RNA polymerase (77). For our purposes, it is important simply to note that anything that bends the DNA can, depending on the direction of the bend relative to the promoter, improve or interfere with promoter binding by RNA polymerase, and some global regulators do act in this way. As an aside, some promoters are controlled at this binding step by their methylation status. If this were a more widespread phenomenon in E. coli than it appears to be, then the Dam methylase could be considered a global regulator. Dam methylates the sequence GATC at a low enough rate that such sequences remain, for some time following passage of a DNA replication fork, methylated only on the parental strand (hemimethylated) (78, 79). This is used by the mismatch repair system to identify the newly synthesized strand, but it is also used to limit transcription at some promoters to a short time following DNA replication [e.g., the chromosome replication initiation protein gene dnaA (80) and the transposase gene tnpA of IS10 (81)].
Dam methylation can also affect promoter activity indirectly, by affecting the binding of repressor or activator proteins (82).
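The promoter-search estimates earlier in this section (Eq. 1 combined with the measured nonspecific off-rate) can be checked numerically. The sketch below is illustrative only; the function names are hypothetical. It assumes that each nonspecific contact lasts on average 1/k_off, so that the half-time is the number of contacts for P = 0.5 divided by the off-rate. The fraction f = 8/5000 corresponds to 8 bp of hopping per 5.0 kbp. The value f = 0.024 used for the sliding case is an assumption chosen because it reproduces the quoted figure of 29 contacts; the text does not state how that fraction was obtained.

```python
import math


def contacts_needed(p, f):
    """Whole number of random DNA contacts giving at least probability p
    of touching a promoter, from N = ln(1 - p) / ln(1 - f)."""
    return math.ceil(math.log(1.0 - p) / math.log1p(-f))


def half_time_minutes(f, off_rate_per_s):
    """Time (minutes) for a 50% chance of promoter contact, assuming each
    nonspecific contact lasts 1/off_rate_per_s seconds on average."""
    return contacts_needed(0.5, f) / off_rate_per_s / 60.0


if __name__ == "__main__":
    OFF_RATE = 0.3  # per second, in vitro value quoted in the text

    f_hop = 8 / 5000   # 8 bp of hopping, one correctly oriented promoter per 5.0 kbp
    f_slide = 0.024    # ASSUMPTION: fraction that reproduces the quoted 29 contacts

    for label, f in (("hopping only", f_hop), ("with 60-bp sliding", f_slide)):
        n = contacts_needed(0.5, f)
        t = half_time_minutes(f, OFF_RATE)
        print(f"{label}: {n} contacts, half-time {t:.1f} min")
```

Under these assumptions the two cases give 433 and 29 contacts, matching the half-times of roughly 24 and 1.6 minutes quoted above.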
c. Isomerization of the Closed-to-Open RNA Polymerase-Promoter Complex. The isomerization step involves local strand separation in the template DNA, forming what is called an open complex. The binding affinity of a promoter for RNA polymerase and its rate constant for isomerization can vary independently of one another (55, 82-84). Experimentally, the isomerization step is most frequently detected in vitro by the concomitant reduction in the ability of the polyanionic molecule heparin to compete with promoter DNA for bound RNA polymerase (85). The kinetics of isomerization are estimated in an abortive initiation assay in which one nucleoside triphosphate is omitted from the reaction: the lag between mixing components and reaching maximal rate of nucleotide incorporation into short oligonucleotides is plotted versus the inverse of RNA polymerase concentration, and, from the extrapolated lag time at an infinite concentration of RNA polymerase, the isomerization rate constant can be derived (84). It has been suggested that the bacteriophage P22 Arc repressor acts by blocking the isomerization step, though this has only been cited as unpublished results (86). More recently it has been reported that MerR blocks isomerization (87), though this appears to result from bending the DNA between the -35 and -10 regions of the pT promoter not only such that the energetic cost of strand separation is raised but also such that the RNA polymerase fails to contact the -10 region of pT. MerR is converted from acting as a repressor to acting as an activator by the presence of Hg2+, and the MerR activation mechanism also appears to involve the isomerization step. In the presence of Hg2+, not only does the DNA bending disappear, allowing RNA polymerase to contact the -10 part of the promoter, but the spacer region between -10 and -35 is probably untwisted by about 50° relative to normal B-form DNA (87), which would reduce the energetic cost of isomerization to an open complex. Another example of a regulator that acts at the level of isomerization is the phosphoprotein NRI: this stimulates isomerization by a bound RNA polymerase-σ54 complex at the p2 promoter of glnA (88).
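The lag-time analysis described above amounts to fitting a straight line and reading off its intercept. The sketch below illustrates this with synthetic numbers. It rests on an assumption not spelled out in the text: that the lag follows the usual two-step relation, lag = 1/k_f + 1/(k_f K_B [RNAP]), where k_f is the isomerization rate constant and K_B the binding constant of the closed complex. The function names and the data are hypothetical and are not taken from any experiment.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def isomerization_parameters(rnap_nM, lag_s):
    """Estimate k_f (per second) and K_B (per nM) from lag times.

    ASSUMED model: lag = 1/k_f + 1/(k_f * K_B * [RNAP]).
    The intercept at 1/[RNAP] = 0 (infinite RNA polymerase) equals 1/k_f,
    and the slope equals 1/(k_f * K_B).
    """
    inverse_conc = [1.0 / c for c in rnap_nM]
    slope, intercept = fit_line(inverse_conc, lag_s)
    k_f = 1.0 / intercept
    k_b = intercept / slope
    return k_f, k_b


if __name__ == "__main__":
    # Synthetic, noise-free lag times generated with k_f = 0.05 /s and K_B = 0.02 /nM.
    concentrations = [20.0, 40.0, 80.0, 160.0]                 # nM RNA polymerase
    lags = [20.0 + 1000.0 / c for c in concentrations]         # seconds
    k_f, k_b = isomerization_parameters(concentrations, lags)
    print(f"k_f = {k_f:.3f} per s, K_B = {k_b:.3f} per nM")
```

With real data the points scatter about the line, and the uncertainty in the extrapolated intercept carries over directly into the estimate of the isomerization rate constant.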
d. Forming the Initial Transcribing Complex and Allowing Promoter Clearance. Once an open complex has been formed, the RNA polymerase begins to generate short oligomers with the same 5‘ end in a cycle of abortive initiation (89-91), and at this point the σ subunit is released (92). The RNA polymerase is now stably associated with the promoter, though it is still capable of dissociating (71). At some promoters, the precise initiation site varies with the nucleotide pools, and this can have regulatory consequences. For example, at the pyrC and pyrD promoters in Salmonella typhimurium, the CTP/GTP pool ratio determines which of two initiation points is used: transcripts beginning at the point farther upstream form a hairpin that reduces translation initiation (93). If this is found to be a widespread phenomenon, then one could consider RNA polymerase to be a combination sensor and global regulator, and nucleotide pools to be the regulatory signals. In vitro analysis of initiation from placUV5 suggests that only when a transcript 7 to 9 nt long is generated does the RNA polymerase escape from the cycle of abortive initiation and proceed through the remainder of the transcription unit (94). This futile cycle may be a consequence of the “inchworm” model for transcription (19, 89, 95), though the associated stretching and contraction of the RNA polymerase on the DNA are not intrinsic to transcription but occur in response to specific features in the template (95, 96). The rate of escape from the abortive initiation cycle is subject to kinetic
control, because the kinetic efficiency of interaction with the appropriate NTP substrate has been found to vary from position to position on the template: this parameter can range in value over four orders of magnitude (97, 98). For example, the concentration of UTP appears to modulate promoter clearance at the P2 promoter of the gal operon (99). The escape frequency might also turn out to be modulated by RNA polymerase accessory proteins. One of these is NusA, which lengthens transcriptional pauses (100, 101). Two others are GreA and GreB, both of which stimulate RNA polymerase to cleave off short 3'-terminal oligonucleotides from transcripts in paused transcription complexes; this cleavage allows transcription to resume from the new 3' end (102-106). This raises two basic questions. First, why should the initiation of transcription be gated again once RNA polymerase has jumped over the hurdles of promoter binding and isomerization? Second, is the escape step a possible target for regulators? To address the first question, two possibilities come to mind. The longer an RNA polymerase is stuck in the abortive initiation cycle, the longer that promoter is inaccessible to other RNA polymerase molecules. This may be important in some cases, e.g., with essential genes that must be expressed, but at low rates. An ideal way to ensure expression of these genes in the face of limiting transcription capacity would be to put them under the control of a strong (readily bound and isomerized) promoter, which would ensure a supply of RNA polymerase, but then to prevent high levels of expression through a long abortive initiation cycle. This effect could also be achieved, though perhaps with greater risk that the RNA polymerase would dissociate, by having a strong-binding, slowly isomerizing promoter. A second possible use of the abortive initiation cycle takes advantage of the ability of initial transcribing complexes to dissociate (71).
In this view, the abortive initiation cycle can function as a kinetic filter. If an induction signal occurs with a certain background frequency (such as results from cross-talk; see Section I,E,2), and this background cannot be filtered at the level of the regulator, then it can be filtered just after initiation. For example, if only half of the initiations escape the abortive cycle on average, then twice the induction signal frequency will be needed to give a transcriptional response. [This is distinct from the attenuators (107), which use the positions of ribosomes and RNA-binding proteins (108, 109) to gate RNA polymerase at controllable transcription terminators, well after promoter clearance.] The second question, whether this escape step can be actively regulated, now seems important to answer: it certainly appears that escape could be a productive target for regulation. GalR appears to act at the entry into the abortive initiation cycle, possibly at the point of the first phosphodiesterification (110, 111), but not at the escape step. The RNA polymerase accessory
proteins NusA, GreA, and GreB (see Section I,B,2,d) may modulate the escape frequency for subsets of promoters that tend to spend longer times in the abortive initiation cycle. In this regard, these three proteins may turn out to be global regulators; it is not yet known what controls the levels or activities of these proteins, nor have their possible target operons been defined.
C. How Do Global Regulators Differ from Local Regulators?

1. NUMBER AND TYPE OF TARGET OPERONS

It seems tautological to say that global regulators control a large number of target operons. This was certainly the originally intended meaning of the distinction between global and local regulators. Nonetheless, there is no clearly defined border between global and local, and it is worth noting that there are inhabitants of the gray zone between them. One example could be provided by ArgR, which controls at least 12 genes in eight operons (112). This is a substantial number of targets, but they are all involved in the same single specific metabolic process (arginine biosynthesis), and another implication of globality is that there is some breadth in the roles of the target operons. In summary, the number and type of target operons do not unambiguously distinguish global from local regulators, though in most cases these groups do differ substantially in these regards.

2. ABUNDANCE AND DEGREE OF DNA SEQUENCE SPECIFICITY
If global and local regulators are not clearly distinguished by their respective targets, perhaps their intrinsic properties can be used to distinguish between them. One might expect global regulators to be more abundant in the cell, or to be somewhat less specific than local regulators. In fact, gene regulatory proteins can take two basic approaches, which have been termed the “carpet bombing” (low specificity, high abundance) and “cruise missile” (high specificity, low abundance) approaches (113). (An alternative analogy for the less military minded might be mosquitoes versus tigers.) Members of the “carpet bombing” class of regulators in E. coli are indeed highly abundant. The abundance of IHF has been measured as a function of the growth phase of an E. coli culture (114). During exponential growth in LB medium, there are 8500-17,000 dimers per cell. As the growth rates slow, IHF concentrations rise, reaching a maximum in early stationary phase of ~100,000 dimers per cell. HU is present at ~30,000 dimers per cell (115), although measurements do not appear to have been made as a function of the growth phase of the culture. H-NS is also a non-sequence-specific DNA-binding protein, with an abundance of ~20,000 monomers per cell (115). Fis shows relaxed target specificity with a loose 15-nucleotide consensus sequence (116, 117). As described below, Fis levels can be as high as ~30,000 dimers per cell, but vary with the growth phase. However, these two groups, as defined by abundance and specificity, have a blurred border. For example, it has been suggested that Lrp [~3000 dimers/cell (118)] is an intermediate in a continuum from specific binding proteins to histonelike proteins (119). In any case, global regulators are found in both groups, so this too is not a clear distinction.
3. CAN LOCAL REGULATORS BE RECRUITED AS GLOBAL REGULATORS?
It seems likely that global regulators (aside from the histonelike class and possibly the alternative σ factors) are local regulators that were recruited into regulating additional genes. This process can be imagined to have involved natural selection of cells that, by random mutation, developed sufficiently functional binding sites for a regulator upstream of a gene for which the resulting regulation was appropriate. This process may be continual, as new genes arise via duplication and subsequent divergence, or by horizontal transfer. If the new gene does not simply integrate into an operon that is already under appropriate regulation, then variants of the gene’s upstream region may yield sufficiently functional binding sites for appropriate regulators. There is no difference, in this regard, between adopting a local regulator and adopting a global regulator. It could be simply that regulators responding to more fundamental features of physiological status will tend to be adopted more often, because more genes will be appropriately regulated by such fundamental features. Thus an incoming catabolic gene is more likely to be best regulated in response to the presence or absence of glucose (Crp) than the presence or absence of tryptophan (TrpR), because tryptophan levels only very indirectly reflect fundamental cell status. In this view what are now called global and local regulators differ only in the range of genes for which the propagated regulatory signal is useful.
D. How Are Global Regulators Controlled?

There are two basic strategies for controlling a given regulator: varying the amount and varying the activity (Fig. 3). These are not mutually exclusive, and there are many cases in which both the amount and the specific activity of a regulator are modulated. A good example of this is the leucine-responsive regulatory protein, Lrp (see Section III,C).
[Figure 3: diagram not reproduced. One panel, "Control AMOUNT of regulator," carries the labels expression, amino acids, active regulator, and turnover, dilution. Examples: RpoH, RpoS, LexA, Fis, H-NS. The other panel, "Control ACTIVITY of regulator," shows interconversion of active and inactive regulator by phosphorylation-dephosphorylation, coregulator binding-release, multimerization-dissociation, sequestration-release, oxidation-reduction, etc. Examples: Crp, Fnr, NtrC, Arc, PhoR.]

FIG. 3. Basic modes by which global regulatory proteins are themselves controlled. In many cases both modes are used.
1. CONTROLLING THE AMOUNT OF THE REGULATOR PROTEIN
In this approach, expression of the regulator’s own gene (or turnover of an unstable regulator) in response to some physiological signal determines the degree of regulator activity. For example, the induction of DNA-damage-inducible SOS genes occurs when RecA protein binds single-stranded DNA, is thereby activated, and proteolyzes the repressor LexA (120). LexA autogenously represses lexA (and also represses recA), and so its levels are restored soon after the DNA damage signal disappears (121). Most of the alternative σ factors and the histonelike regulatory proteins are controlled by varying their concentrations. For example, the rpoH gene, which specifies the alternative σ factor involved in the heat-shock response, is transcribed at a markedly higher rate following a temperature increase (122), while the rpoS gene, which specifies the alternative σ factor involved in the response to starvation, is expressed in response to the metabolic signal molecule homoserine lactone (123). Both histonelike proteins H-NS and Fis repress their own genes, and fis is also subject to stringent control (124-126).
2. CONTROLLING THE ACTIVITY OF THE REGULATOR PROTEIN
In this approach, the activity of preformed regulator is modulated in response to some physiological signal. There is a wide variety of means by which this is accomplished. For our purposes, activation of a regulator means converting it to a form that actively affects gene transcription. Often, but not always, this is synonymous with regulating the ability to bind DNA. In the case of MerR, for example, the unliganded protein is a repressor and the Hg2+-liganded protein is an activator of transcription (87).
a. Covalent Modification. The most common form of covalent modification for activity control involves phosphorylation-dephosphorylation, particularly as seen with the two-component response regulators (Section III,B); this is a powerful means of regulation in that the modification in each direction can be under independent kinetic control (in contrast to coregulator binding-dissociation as a control mechanism). Another example of this type of regulation is provided by SoxR, which controls a regulon of genes whose products prevent or repair oxidative damage. SoxR contains an iron-sulfur ([2Fe-2S]) cluster, and oxidation of this cluster is linked to the ability of SoxR to activate transcription of soxS (127, 128). As an aside, SoxS is the activator of the other genes in the regulon, and the potential for wild fluctuations posed by this system of an activator acting on the gene for an activator is eliminated by having SoxS autogenously repress soxS (129-131).
b. Coregulator Binding. A great many regulators are controlled by binding or releasing a coregulator molecule in an equilibrium mechanism, and a variety of regulatory signals can be detected in this way. Coregulator binding can either activate or inactivate the regulator. Regulators can respond to simple concentrations of a coregulator, as in the case of Crp and cAMP. Lrp (Section III,C) fits into this group as well; although Lrp can respond to both leucine and alanine (6), both amino acids have the same regulatory effects. Alternatively, regulators can respond in different ways to different coregulators, for example, if two potential ligands compete for the same binding site. For instance, PurR exhibits complex interactions with multiple coregulators (110): hypoxanthine and guanine each bind cooperatively to this dimeric protein, but with differing affinities; if both purines are present together, cooperativity is lost and the affinity for either purine is decreased. Thus PurR responds differently to hypoxanthine alone, guanine alone, and mixtures of the two. Finally, the conformational changes associated with coregulator binding or release may be slow for some regulatory proteins, leading to hysteresis (132), and could thus provide a temporal buffering mechanism for signals that have a noisy fluctuation pattern.
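The difference between simple and cooperative coregulator binding can be illustrated with a minimal sketch based on the Hill equation. The Hill equation is used here only as a convenient phenomenological description; the function names, the half-saturation constant `k_half`, and the Hill coefficients are illustrative assumptions and are not measured values for PurR or any other regulator.

```python
def fractional_saturation(ligand, k_half, hill_n=1.0):
    """Fraction of regulator occupied by coregulator (Hill equation).

    ligand and k_half are in the same (arbitrary) concentration units.
    hill_n = 1 is simple noncooperative binding; hill_n > 1 is an
    assumed, phenomenological description of cooperative binding.
    """
    if ligand < 0 or k_half <= 0 or hill_n <= 0:
        raise ValueError("ligand must be >= 0; k_half and hill_n must be > 0")
    x = (ligand / k_half) ** hill_n
    return x / (1.0 + x)


def response_span(hill_n):
    """Fold change in ligand needed to go from 10% to 90% saturation."""
    # 10% saturation corresponds to x = 1/9 and 90% to x = 9, a ratio of 81.
    return 81.0 ** (1.0 / hill_n)


if __name__ == "__main__":
    for n in (1.0, 2.0):  # hypothetical Hill coefficients
        print(f"n = {n}: 10%-90% span is {response_span(n):.0f}-fold in ligand")
        for ligand in (0.1, 1.0, 10.0):
            theta = fractional_saturation(ligand, k_half=1.0, hill_n=n)
            print(f"  ligand = {ligand:5.1f}  saturation = {theta:.3f}")
```

Under these assumptions, a cooperative regulator switches between mostly free and mostly liganded over a narrower range of coregulator concentration than a noncooperative one, which is one way a loss of cooperativity in ligand mixtures could change the regulatory response.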
c. Multimerization. Many regulators are active as homomultimers. The multimerization is a second-order reaction, and given the appropriate subunit association constant, the extent of multimerization can vary widely over the range of subunit concentrations normally seen in the cell. Multimerization can depend on phosphorylation of the monomers, as in NtrC and PhoB (133), making the regulatory response much more sensitive to the rates of phosphorylation and dephosphorylation than would be the case if DNA binding depended simply on phosphorylation of preformed dimers. Multimerization can be DNA dependent, when subunits cooperatively bind the adjacent half-sites for a recognition site and then bind to one another to stabilize the DNA binding (134). In some cases, the relevant multimerization on the DNA involves preformed dimers that bind cooperatively to adjacent sites; this, too, can provide exquisite sensitivity to changes in the concentration of free regulator. Multimerization can be dependent on both coregulator and DNA binding, as in TyrR (135), which binds as a dimer to three nearby sites on the DNA and forms a tyrosine-dependent hexamer. Finally, multimerization of sequence-specific DNA-binding proteins can be specifically stimulated (136) or inhibited (137) by regulatory polypeptides.
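The statement that the extent of multimerization can vary widely with subunit concentration can be made concrete with a minimal sketch of a monomer-dimer equilibrium. The sketch assumes the simplest case, 2 M <-> D with a single dissociation constant Kd = [M]^2/[D]; the function names and the concentrations used are hypothetical and chosen only for illustration.

```python
import math


def free_monomer(total_subunits, kd):
    """Free monomer concentration for 2 M <-> D with Kd = [M]^2 / [D]."""
    if total_subunits < 0 or kd <= 0:
        raise ValueError("total_subunits must be >= 0 and kd must be > 0")
    # Mass balance: total = [M] + 2[D] = [M] + 2[M]^2 / Kd, a quadratic in [M].
    return kd * (math.sqrt(1.0 + 8.0 * total_subunits / kd) - 1.0) / 4.0


def fraction_in_dimer(total_subunits, kd):
    """Fraction of all subunits that are present in dimers."""
    if total_subunits == 0:
        return 0.0
    return 1.0 - free_monomer(total_subunits, kd) / total_subunits


if __name__ == "__main__":
    kd = 1.0  # hypothetical dissociation constant, arbitrary units
    for total in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(f"total = {total:6.2f} x Kd  dimer fraction = "
              f"{fraction_in_dimer(total, kd):.3f}")
```

In this simple model half of the subunits are dimeric when the total subunit concentration equals Kd, and the dimer fraction rises from a few percent to more than 90% as the total concentration is varied over about four orders of magnitude around Kd.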
d. Sequestration. Sequence-specific DNA-binding proteins have distinct abilities to discriminate between their recognition sequences and nonspecific DNA. One form of controlling the effective concentration of a regulator involves varying the DNA concentration as opposed to the protein concentration. As described in Section I,B, the DNA concentration varies with the growth rate of the cell. DNA-binding proteins with relatively low discrimination between sites and nonspecific DNA could be titrated out at higher DNA concentrations, and this could lead to replication-linked derepression (repressor) or loss of expression (activator) of a target operon. Sequestration can be used to damp a response: the alternative σ factor RpoH leads to accumulation of the heat-shock proteins DnaJ and DnaK (see Section III,A), and DnaJ catalytically activates stable DnaK binding of RpoH (138). Another strategy, not yet known to be used by a global regulator but employed by the local regulator PutA, is sequestration at the cytoplasmic membrane. PutA is a single FAD-containing polypeptide that carries out both steps in the catabolism of proline to glutamate; it is loosely associated with the cytoplasmic membrane where it is coupled to the electron transport chain (139). When proline is not present, or for any other reason the oxidized FAD cannot be reduced, the relative hydrophobicity of PutA is decreased and it leaves the membrane; at this point it can dimerize, bind to DNA, and repress transcription of its own gene and that of a proline-specific permease (140, 141).
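The titration of a poorly discriminating regulator by nonspecific DNA can be sketched with a simple competitive-binding calculation. This is an illustrative model only: it assumes that nonspecific sites are in large excess over the regulator, that the specific operator sites are too few to deplete the free pool, and that each class of site is described by a single dissociation constant. All names and numerical values are hypothetical.

```python
def operator_occupancy(r_total, k_specific, nonspecific_sites, k_nonspecific):
    """Occupancy of a specific site when nonspecific DNA competes for regulator.

    All concentrations and dissociation constants share arbitrary units.
    Assumes nonspecific sites are in large excess over the regulator and
    that operator sites are too few to deplete the free regulator pool.
    """
    if min(r_total, nonspecific_sites) < 0 or min(k_specific, k_nonspecific) <= 0:
        raise ValueError("concentrations must be >= 0 and constants > 0")
    # Regulator left free after partitioning onto nonspecific DNA.
    free = r_total / (1.0 + nonspecific_sites / k_nonspecific)
    return free / (free + k_specific)


if __name__ == "__main__":
    # Hypothetical regulator: 10 units total, operator Kd of 1 unit.
    for k_ns in (1000.0, 10.0):  # good versus poor discrimination
        print(f"nonspecific Kd = {k_ns}")
        for dna in (100.0, 200.0, 400.0):  # DNA content rising with growth rate
            occ = operator_occupancy(10.0, 1.0, dna, k_ns)
            print(f"  nonspecific sites = {dna:5.0f}  occupancy = {occ:.2f}")
```

With these assumed numbers, occupancy by the well-discriminating regulator stays high as the DNA concentration increases, whereas occupancy by the poorly discriminating regulator falls substantially, which is the titration effect described above.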
3. DESIRABLE DESIGN FEATURES

Biochemical systems analysis predicted that certain general features will be found in the intrinsic properties and feedback controls operating on a regulator (142). A full treatment of this type of analysis is beyond the scope of this review. However, systems analysis makes some specific predictions against which the properties of various global regulators can be measured, bearing in mind that the global regulators are not expected to differ from local regulators with regard to these predictions. First, when is repression likely to be used, and when is activation the better alternative? Savageau (142) bases his argument not on intrinsic properties of the regulators, but rather on the consequences of mutation over time. The argument is illustrated by Table I and distinguishes between genes based on how often (not in what quantity) the gene product is needed by the cell. Thus, over time, genes coding for continually needed products should tend to be controlled by activators, whereas genes coding for rarely needed products should tend to be controlled by repressors. Note that this conclusion is not at all affected by the fact that a given regulator can activate some target operons and repress others; the prediction deals with how the regulator is expected to interact with a given target operon, based on the frequency of expression of that operon. A second prediction is that different types of regulators are likely to have their own structural genes regulated in different ways. Activator-controlled operons, with the activator produced on a constitutive basis, are superior to equivalent systems with autogenously regulated synthesis of the activator.
TABLE I
SELECTION PRESSURES ON CONTROL BY GLOBAL REGULATORS(a)

Selection pressure: gene product needed at low frequency
    Gene controlled by repressor: Mutation of repressor leads to constitutive production; disadvantageous to cell because product is rarely needed: strong selection against mutants
    Gene controlled by activator: Mutation of activator abolishes production; not very disadvantageous to cell because product is rarely needed: weak selection against mutants

Selection pressure: gene product needed at high frequency
    Gene controlled by repressor: Mutation of repressor leads to constitutive production; not very disadvantageous to cell because product is usually needed: weak selection against mutants
    Gene controlled by activator: Mutation of activator abolishes production; disadvantageous to cell because product is usually needed: strong selection against mutants

(a) From Ref. 142.
This superiority holds for all criteria of functional effectiveness, including stability and temporal responsiveness. In contrast, repressor-controlled operons are expected to be associated with autogenously regulated repressor genes: systems with autogenously regulated repressor are superior to equivalent systems with constitutive repressor synthesis by the same criteria. One way to think about this is that autogenous activation can lead to uncontrolled amplification (formally analogous to the “feedback” seen when a microphone is placed in front of the amplified speaker), whereas autogenous repression is intrinsically self-limiting. This prediction is somewhat complicated by the fact that many global regulators function as both activators and repressors. Although a protein may act positively and negatively on various target operons, it can nonetheless act in a single mode on its own gene; we would thus expect autogenously controlled global regulators to act as repressors in this capacity. The third prediction also deals with stability of the system. It appears that extant regulatory systems have had system stability (avoidance of large fluctuations in output levels in response to changes in stimulus levels or to mutations or other system perturbations) as a major focus of selection. As Savageau states (142), “It is interesting that stability, one of the principal concerns of engineers designing technological systems, also appears to be of prime importance in the design of biochemical systems by natural selection.” This is reflected in the nature of the coregulator for inducible catabolic systems: a system designed with stability as the foremost concern would use an intermediate as the inducer; where temporal responsiveness is most important the substrate should be the inducer.
A classic example of this is provided by the lac operon: the substrate (lactose) is actually an antiinducer, and the inducer (1,6-allolactose) results from a side reaction between lactose and the catabolic enzyme β-galactosidase (143). There are many examples of catabolic operons having the substrate as coregulator, for example, the proline utilization system (139). However, the transport of this substrate into the cell is generally controlled by the same regulatory protein as are the catabolic enzymes, so intracellular substrate is an intermediate in this context (142); this is true for proline utilization. In summary, biochemical systems analysis makes a number of specific predictions regarding the regulation of regulatory proteins, the types of regulation likely to be seen for certain types of genes, and the types of features expected to increase system stability.
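The self-limiting character of autogenous repression noted in the second prediction can be illustrated with a minimal steady-state sketch. The model is an assumption made for illustration and is not Savageau's formalism: regulator x is made at a maximal rate `beta`, is lost with first-order rate constant `alpha`, and, in the autorepressed case, inhibits its own synthesis by the factor 1/(1 + x/k_rep). All names and parameter values are hypothetical.

```python
import math


def steady_state_constitutive(beta, alpha=1.0):
    """Steady state of dx/dt = beta - alpha * x (no feedback)."""
    return beta / alpha


def steady_state_autorepressed(beta, k_rep, alpha=1.0):
    """Steady state of dx/dt = beta / (1 + x / k_rep) - alpha * x."""
    if beta < 0 or k_rep <= 0 or alpha <= 0:
        raise ValueError("beta must be >= 0; k_rep and alpha must be > 0")
    # Setting dx/dt = 0 gives (alpha / k_rep) x^2 + alpha x - beta = 0.
    disc = alpha * alpha + 4.0 * alpha * beta / k_rep
    return k_rep * (math.sqrt(disc) - alpha) / (2.0 * alpha)


if __name__ == "__main__":
    k_rep = 1.0  # hypothetical repression constant
    for beta in (100.0, 200.0):  # e.g., a mutation that doubles promoter strength
        c = steady_state_constitutive(beta)
        a = steady_state_autorepressed(beta, k_rep)
        print(f"beta = {beta:5.0f}  constitutive = {c:6.1f}  autorepressed = {a:5.2f}")
```

In this sketch, doubling the maximal synthesis rate doubles the constitutive steady state but raises the autorepressed steady state by well under twofold, so the perturbation is partly buffered. Autogenous activation is not modeled here.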
E. What Are the Advantages of Using Global Regulators?

The need for gene regulation in bacteria has several well-appreciated bases:
1. Efficient utilization of resources: transcribing the average 1-kb E. coli gene just once, and translating it with just five ribosomes, would consume about 7000 “high-energy” phosphates, not to mention the biosynthetic energy cost of replacing the 1000 ribonucleosides and over 1600 amino acids (though these can be recycled), or the energy cost that might be associated with the functioning of the five protein molecules that were made.
2. Balancing the production of components designed to work together: this could include the equimolar synthesis of components for a transport complex, or the assurance that a biosynthetic enzyme that makes a toxic intermediate is not produced unless the enzyme catalyzing the next step is also made.
3. Making sure that genes involved in temporal responses are expressed in specific temporal hierarchies: this is particularly important for colonization by pathogenic bacteria (144), and in adaptation to nutrient starvation (145); the potential complexity of these time-sensitive regulatory networks has recently been illustrated with striking clarity for bacteriophage λ (146).

In this review, however, we are not asking why genes are regulated, but rather why cells have global regulators controlling a large fraction of genes. What can regulation by global regulators provide that could not be better provided by local regulators highly tailored to the optimal regulatory pattern for a target gene? We suggest three broad answers.
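The order of magnitude of the cost quoted in item 1 above can be checked with a rough calculation. The per-step costs used below (two high-energy phosphates per nucleotide polymerized and three per amino acid incorporated) are assumptions made for this sketch; the text does not state how the figure of about 7000 was derived, and other accounting conventions give somewhat different totals. The function and parameter names are likewise hypothetical.

```python
def expression_cost(gene_length_nt=1000, transcripts=1, ribosomes_per_transcript=5,
                    phosphates_per_nucleotide=2, phosphates_per_amino_acid=3):
    """Rough count of high-energy phosphates spent expressing one gene.

    The two per-step costs are assumed values, not taken from the source.
    """
    codons = gene_length_nt // 3                    # about 333 codons in 1 kb
    proteins = transcripts * ribosomes_per_transcript
    amino_acids = proteins * codons
    transcription = transcripts * gene_length_nt * phosphates_per_nucleotide
    translation = amino_acids * phosphates_per_amino_acid
    return {
        "amino_acids": amino_acids,
        "transcription": transcription,
        "translation": translation,
        "total": transcription + translation,
    }


if __name__ == "__main__":
    cost = expression_cost()
    for key, value in cost.items():
        print(f"{key:>13}: {value}")
```

With these assumed costs, one transcript read by five ribosomes yields 1665 amino acids polymerized ("over 1600") and a total of several thousand high-energy phosphates, most of it spent on translation.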
1. COORDINATED RESPONSES BY A LARGE NUMBER OF GENES
As noted in Section I,C,1, there is no precise dividing line between global and local regulators in terms of the number of target operons (8). A regulator with a single target (e.g., LacI) is clearly local, but what about ArgR? At the higher end of this spectrum, the translation machinery involves at least 150 gene products scattered among numerous operons, all subject to coordinate regulation. It could be argued that, in this case, there really is no conceptual difference between local and global: the number of target operons is large, but their products all play a role in the same cell function (translation, in this example). What makes a global regulator distinctive is not just the number of its targets but also the coordination it provides across the physiological sectors of the cell. Thus (as discussed below) the catabolite activator protein (Crp) controls a large number of operons in response (indirectly) to the presence or absence of glucose. Much of the action of Crp is to ensure that glucose, the preferred carbon/energy source for E. coli, is used preferentially: operons for the catabolism of other carbon/energy sources are expressed at low levels when glucose is present. However, Crp also controls expression of the flagellar machinery; this may help to keep E. coli cells in an area containing glucose, but there is quite a functional distinction between (say) flagellin and β-galactosidase. Thus, genes may require responses to very different regulatory signals, some of which reflect the status of fundamental physiological parameters. In theory, this could be addressed perfectly well with two local regulators, one of them sensitive to a global signal. That this is not often done reflects both the inefficiency of using two proteins where one will do, and also the surprising range of regulatory responses that can be achieved with a given regulator by varying its binding location and affinity.
2. POSSIBLE REGULATORY INTEGRATION AS A RESULT OF CROSS-REGULATION OR REGULATION OF ONE ANOTHER’S GENES

Operons can belong to more than one regulon, and thus be controlled by two or more unrelated regulators. In addition, the site at which a given regulator acts may be occupied by a closely related but distinct regulator, providing a sort of regulatory cross-talk that can allow additional regulatory signals to “fine tune” the expression of an operon. As a matter of terminology, however, the term “cross-talk” has been used to refer to the undesired but unavoidable noise that occurs when two closely related regulatory systems interfere with one another; when this situation is believed to play some desirable role it is termed “cross-regulation” (147). It is not always clear whether a given situation represents cross-talk or cross-regulation. For example, the GcvA activator for the glycine cleavage enzyme genes can also activate a β-lactamase gene that is normally regulated by its own activator (AmpR) (148). A second type of regulatory cross-talk or cross-regulation involves the transducer rather than the regulator (see Fig. 1). In this case, the sensor for a particular two-component regulatory system phosphorylates the response regulator for a different two-component regulatory system. This is believed to represent cross-regulation in the case of CreC and PhoR both phosphorylating PhoB in the Pho regulon (147), and cross-talk in the case of CheA, EnvZ, and NtrB phosphorylation of one another’s partners (149), with the distinction made in part on the basis of the various sensor-regulator affinities. A third type of cross-regulation does not involve misrecognition, but regulation of the structural gene for one global regulator by another global regulator. An example of this is the regulation of lrp by the histonelike protein H-NS (150) and possibly by IHF (151). This sort of cross-regulation involving global regulators would allow regulatory signals, reflecting fundamental physiologic parameters of the cell, to influence a broader network of operons.
3. IMPROVED GENETIC FLEXIBILITY
The use of global regulators could have two consequences for microbial genetics: one is that improvements to a single locus can have broad consequences, and the second is that conservation of global regulators across species would facilitate the interspecies transfer of genes from the relevant regulons. The extant regulatory networks in E. coli reflect not only what is most efficient in giving the optimal regulatory outcomes, but also what was most readily derived and horizontally spread through the course of evolution. Thus if a regulatory system is multicomponent, one would expect the component genes to be contiguous on the chromosome. In addition, if a regulator affects the expression of a substantial group of target operons, a powerful means of improving the regulation is to select spontaneous improved variants of the regulator (one locus) rather than adding a new regulator and a plethora of new binding sites at each target operon. As McAdams and Shapiro noted (146):

Genetic circuits exhibit hierarchical organization: Regulons control operons, which control gene groupings. Electronic circuit designers structure complex systems as hierarchical structures to facilitate reuse of modular functions and simplified control by a few signals. The multigene genetic subfunctions in the hierarchy are points of high leverage for evolutionary adaptability because a single mutation in circuit logic can change the control of a large genetic cascade, thereby amplifying evolutionary consequences.
These improvements would probably be limited to the regulator’s signal detection properties, because altering its DNA sequence specificity could have negative consequences for regulation of a large number of genes. One might thus expect to find that homologous global regulators from different species have highly conserved DNA sequence specificities. It is interesting that the E. coli SOS regulator LexA appears to regulate damage-inducible genes when introduced into a variety of other bacterial species (152). In contrast, homologous global regulators from different species may have substantial differences in their coregulator binding properties or in the regulation of their own structural genes. If homologous global regulators from different species really do tend to have highly conserved DNA sequence specificities, genes belonging to a global regulon can move from cell to cell, in some cases even across species, and maintain proper regulation. This would increase the effective mobility of these genes, and give added breadth to the globality of global regulators. In this regard, the reciprocal experiments to the study of LexA function in different species showed that the LexA-controlled gene recA was regulated appropriately when moved from E. coli into various other species (153).
4. SUMMARY
The advantages and disadvantages of using global regulators can be illustrated by using an economic analogy. In an unrestricted capitalist free market, there are the disadvantages of redundancies and disproportionate allocation of resources, and minimal coordination across various economic sectors, but the profound advantage that capital resources can be rapidly redistributed to take advantage of new developments. In a socialist economy completely under central control, there is the advantage of coordinated responses across many sectors of the economy and a minimum of redundancy, but this efficiency comes at the price of poor responsiveness to new developments. The reality is that there are few current examples of either economic model in pure form, precisely because in pure form the disadvantages of each system are profound. As Galbraith has written (154), “If the world is lucky enough to enjoy peace, it may even one day make the discovery, to the horror of doctrinaire free-enterprisers and doctrinaire planners alike, that what is called capitalism and what is called socialism are both capable of working quite well.” Unlike economic theoreticians, bacterial cells must constantly answer to physical reality. Bacteria, which often live in highly competitive environments and are not protected by the homeostases provided to cells of multicellular organisms, cannot afford to have a welter of uncoordinated responses to a change in (say) temperature or nutrient availability. Neither can they afford, for the sake of efficiency in the number of regulators required, to force the regulatory pattern for every gene into a Procrustean bed defined by the best available approximation of ideal regulation provided by an extant global regulator. It should come as no surprise, then, that gene regulation in bacteria uses both approaches: broad and overlapping central controls that modulate or, in some cases, can even be overridden by local regulation.
In this sense, the solutions found by bacteria and by economies are quite analogous.
II. Methods for Experimental Analysis of Global Regulators
A. How Are Regulon Members Identified and Confirmed?

1. IDENTIFICATION OF REGULON MEMBERS
One of the first questions encountered in studying global regulation involves the breadth of the particular regulon, i.e., how many genes are regulated. There are a number of methods available, both in vitro and in vivo, to help answer that question. Each method has its own strengths and weaknesses, but a combination of several methods often serves to gain a good understanding of a regulon’s size and sometimes even a hint of the class to which the relevant regulator belongs. In this section we attempt to review briefly the types of in vitro and in vivo experiments currently used in studying global regulators. Where appropriate, we discuss improvements and new techniques needed to increase our understanding of regulation.

a. Isolation of Operon Fusions to Reporter Genes. Gene expression and promoter activity can be assayed through gene fusions that allow the study of genes for which there is no simple method of directly assaying the gene product. These fall into two classes: operon fusions, which link a promoterless reporter gene to the target transcription unit, and protein fusions, which also make reporter gene expression dependent on the translational initiator of the target gene. For the purposes of this review, we are interested primarily in the operon fusions. Casadaban originally developed a method for introducing a λ phage containing a promoterless lac operon into the chromosome of a lac deletion strain lysogenic for bacteriophage Mu (155). His λplacMu phage also contained a piece of the genome of bacteriophage Mu, to permit homologous recombination. This methodology revolutionized bacterial genetics, because it placed the power of lac genetics, including screening and selection methods, at the service of investigations of the regulation of any gene for which a disruption could be tolerated. Indeed, Casadaban himself used an araC::lacZY fusion to show that araC is autogenously regulated by its gene product, and that araC is also induced by cAMP and the catabolite repressor protein (156).
The lactose operon is probably the best understood prokaryotic regulatory system, and lacZ is the most commonly used reporter gene in operon fusions. One reason for this is the ability to substitute various derivatives of the substrate lactose that allow colorimetric or fluorometric detection when cleaved by the lacZ product, β-galactosidase; these include 5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-Gal) and o-nitrophenyl-β-D-galactoside (ONPG) (157). Historical overviews of the use of gene fusions with the lac operon, including reviews of many of the novel studies resulting from these gene fusions, have been published (157, 158). Numerous plasmid cloning vectors with promoterless reporter genes and upstream multiple cloning sites are commercially available and many are based on Casadaban’s original cloning vectors (159). In addition, a number of methods exist for creating gene fusions in the chromosome: the use of insertion sequence mutations, antibiotic resistance markers, and λ and Mu prophage insertions (reviewed in Ref. 160). In general, creation of a chromosomal fusion to an essential gene will not be detected due to its lethality. However, there are reports of the detection of Mu-1 insertions into essential genes (161); this would be expected where the insertion is so close to the end of the gene that its truncated product is still active, or where an additional copy of the essential gene exists on the chromosome. It should also be noted that many fusions cause transcriptional polarity, so that any essential genes in an operon that are downstream of the fusion may not be transcribed.

Other reporter genes that are frequently used include cat (chloramphenicol acetyltransferase) (162), galK (galactokinase) (163), luxAB (bacterial luciferase) (164), luc (firefly luciferase) (165), phoA (alkaline phosphatase) (166), and uidA (β-glucuronidase) (167). Unique to the luciferase reporter gene systems is the ability to assay activity as bioluminescence (165, 168). An advantage of using this system is the ability to use bioluminescence to make real-time, in vivo measurements of gene expression (169). Alkaline phosphatase translational fusions are particularly useful for identifying genes whose products are exported from the cytoplasm, because alkaline phosphatase activity depends on formation of a disulfide bond, which generally only occurs following export of the fusion protein (166). This bond cannot form in the reducing environment of the bacterial cytoplasm under normal growth conditions, but when cell growth is stopped active alkaline phosphatase may be produced in the cytoplasm (170), requiring caution in interpreting plate-based results. A unique reporter gene system, recently developed (171), involves transcriptional fusions to tnpR, which encodes resolvase, a site-specific recombinase from transposon γδ. When these fusions are induced, resolvase is produced and a linked tetracycline-resistance reporter gene is excised, resulting in tetracycline-sensitive descendant bacteria.
This system is unique in that induction in a single cell can be assayed at a later time because there is a growth-amplified inheritable marker of prior gene expression. This transcriptional fusion approach was developed with the goal of in vivo studies of gene expression of pathogenic organisms within animal hosts, where it is useful to measure gene expression of small numbers of organisms at a later time and a different place (172), but it is applicable to other systems as well.

Random insertion of reporter genes into the chromosome can be used to identify members of a regulon. For example, Lin et al. (173) used λplacMu9 mutagenesis to identify genes in the Lrp regulon by making protein fusions. Random λplacMu insertions, into genes whose expression was affected by leucine (a coregulator in this regulon) and/or Lrp, were isolated by screening the kanamycin-resistant fusion library on replica plates containing X-Gal with or without leucine, and looking for blue-white changes. This type of approach allows the detection of previously unsuspected regulon members. A concern, however, is that transposon or phage insertion is not really random, and may fail to identify a subset of genes in the regulon. In Salmonella typhimurium, insertion of transposons Tn5, Tn10, and bacteriophage Mu is inhibited by active transcription of some target sequences (174).

Gene fusions exhibiting the desired regulatory pattern can be identified by a variety of methods. If the fusion results in an obvious phenotypic defect, such as an auxotrophy, this can help in identifying the interrupted gene. Phage transduction mapping can then be used to test the tentative identification; in a particularly useful approach, a phage P1 lysate grown on a strain carrying Tn10 near the candidate gene (175) can be used to try to transduce the fusion strain, scoring concomitant gain of tetracycline resistance, loss of kanamycin resistance, and loss of the ability to grow on lactose (175). Given the large and growing fraction of the E. coli chromosome that has a known nucleotide sequence, sequencing across the fusion junction is often useful for identification of the target gene. One of the first methods for identification of fusions by sequencing was developed by Wanner et al. (176). This analysis was complicated by the fact that early construction of Mu-based fusion vectors (including λplacMu and the Mudl,$p vectors, but not the mini-Mu constructs) inadvertently introduced a 48-bp duplication at the S end. This forms a strong hairpin in single-stranded DNA, and makes it difficult to sequence across the fusion junction toward the promoter. Three approaches have been developed to deal with this. In one, a low-titer lysate is generated by UV irradiation of the fusion strain, followed by purification of the phage and sequencing from the c end (177). During Mu excision, host DNA adjacent to the S end is included; following circularization this DNA is adjacent to the c end, which lacks the long repeat.
A second approach involves cleaving the chromosomal DNA of a fusion strain with a restriction enzyme that cuts uniquely within lacZ (EcoRI was used), circularizing the resulting fragments by ligation under dilute conditions, and using the polymerase chain reaction (PCR) and oppositely oriented primers to the retained portion of lacZ. Because the DNA is circular, this results in an amplified linear fragment (178, 179). This inverse PCR product can be sequenced, but depends on the difficult sequencing across the inverted repeat in Mu S. It is possible that PCR generates deletions within the hairpin, otherwise the primer used in that method (which was complementary to one arm of the hairpin and could thus pair to both strands of the PCR duplex) should have primed in both directions at once. The third approach is useful in that it generates a plasmid clone of the fusion, such that the regulatory phenotype of the sequenced material can be directly confirmed (S. P. Bhagwat, R. G. Matthews and R. M. Blumenthal, unpublished experiments). This method uses the suicide vector pIVET1, which carries a promoterless lacZ gene, a gene for ampicillin resistance, and an origin of replication that functions only in the presence of the λpir product (172). When this vector is introduced into λplacMu fusion strains (which are λpir-), ampicillin-resistant transformants can result from homologous recombination between the lacZ genes. At this point, cleaving the chromosomal DNA with a restriction enzyme (such as BglII) that cuts uniquely in pIVET1, circularizing by dilute ligation, and transforming a λpir+ host strain yields a replicating AmpR plasmid that carries the desired promoter region linked to the intact lacZ gene. This can be characterized phenotypically (confirming the relevance of the junction to be sequenced), sequenced from either end of the promoter region, and introduced into a new λpir- host cell in which selection for ampicillin resistance can yield a chromosomal integrant by homologous recombination with the chromosomal copy of the fused gene. An alternative single-step method of cloning chromosomal fusions (constructed by the use of pIVET vectors) uses transduction with Salmonella phage P22 (172).

Once the promoter of an operon fusion has been identified, further study is required to distinguish between direct and indirect regulation by the global regulator. Generally, one must first establish a direct interaction between the regulatory protein and the target promoter region using the well-established methods of DNA mobility-shift assay and footprinting and other protection methods (180). Mutational analysis of the DNA binding sites is necessary to demonstrate that binding of the regulatory protein affects transcription of the regulated gene.

Certain cautions are required in the use of gene fusions to study global regulation. First, it is possible to have multiple fusions in the same strain, giving a mixed regulatory phenotype and making it possible to isolate and sequence the wrong fusion: fusions should either be cloned and confirmed, or transduced to a clean background by P1 transduction.
Second, it is also important to note that fusion libraries should be amplified under more than one set of conditions: for example, some genes may be essential only in rich media and fusions to those genes will be lost from libraries amplified in rich media, but would be maintained in libraries amplified in minimal media. Third, there have been reports of both position-dependent and reporterspecific effects in gene fusions. Use of the ZuxAB reporter gene system showed that the reporter gene is responsible for activation or repression of transcription from some specific promoters (181).This effect is probably due to intrinsically bent DNA in the 5' coding region of ZuxA and does not affect all fusions. Thus, it is essential to be cautious in interpreting in uitro fusion data, because it is possible to get different results depending on the choice of a reporter system. Finally, data are lacking on the extent to which the magnitude and pattern of expression of the reporter gene depend on the position of the insertion within the target operon. Positional effects of both
transcriptional and translational rpoS::lacZ fusions (182) and of translational rpoH::lacZ fusions (183, 184) have been observed; for example, induction of σS by increased medium osmolarity was abolished in an rpoS::lacZ translational fusion at amino acid 23, but seen in a fusion at amino acid 247 (182). The observed positional effects on translational rpoH::lacZ and rpoS::lacZ fusions are thought to derive from cis-acting sequences in the mRNAs that inhibit translational initiation, but the origin of positional effects in transcriptional fusions is not currently established. More systematic studies of the effect of position on the activities measured with a reporter gene are needed. One potential problem involves the effects that different fused RNA sequences have on the stability of the reporter gene mRNA. In one case this led to artifactual variations in activity between fusions at different positions in the same operon; the problem was solved by placing an RNase III processing site just upstream of the reporter gene (185). It would be very useful to have such sites added to a wider range of transcriptional fusion vectors.
b. Differential Rate of Polypeptide Synthesis: Use of Comparative Two-dimensional Electrophoretic Analysis to Elucidate Stimulon or Regulon Membership. Neidhardt and co-workers (186) first examined the effects of temperature shifts on the rates of synthesis of selected individual polypeptides in a wild-type E. coli strain. Individual polypeptides were labeled and resolved by two-dimensional polyacrylamide gel electrophoresis in order to determine their differential rate of synthesis. Because temperature shifts led to widespread changes in the net rate of synthesis of proteins, against which specific changes had to be detected, a double-label protocol was employed. Polypeptides were uniformly labeled with [14C]arginine and [14C]lysine during steady-state growth at 28°C, and the unincorporated labeled amino acids were "chased" with a 50-fold excess of unlabeled lysine and arginine. At various times before and after a shift to 42°C, portions of the culture were pulse-labeled with [3H]valine, [3H]leucine, and [3H]isoleucine, then chased with an excess of the corresponding unlabeled amino acids. The 3H/14C ratio of individual polypeptides was determined and divided by the 3H/14C ratio of total cellular protein; this number represents the differential rate of synthesis of the polypeptide.

Once a global regulatory protein has been identified, comparative two-dimensional electrophoretic analysis can be used to determine the size of a regulon and the pattern of regulation of its component genes. One of the first instances in which this method was employed was an analysis of the patterns of polypeptide expression in the heat-shock regulon (187). Wild-type (htpR+) and mutant (htpR) strains were shifted from 28 to 42°C, and samples were pulse-labeled before and 4 minutes after the temperature shift. Protein extracts were prepared and polypeptides were separated by isoelectric point
and by size using two-dimensional polyacrylamide gels that were then fixed, stained, dried, and exposed to X-ray film. By comparison of the intensity of individual spots on the autoradiograms from samples at 28 and 42°C and in mutant versus wild-type strains, 13 polypeptides were shown to be induced at 42°C in the htpR+ strain but not in the isogenic htpR strain, and their induction ratios were measured. These polypeptides are given an alphanumeric designation, the letter indicating their position in the horizontal isoelectric focusing dimension of the gel and the number indicating their molecular weight from the vertical dimension of the gel, in which the polypeptides are separated by electrophoresis in the presence of sodium dodecyl sulfate. The molecular weights can be determined by comparison with standards (188).

The use of two-dimensional electrophoretic analysis allows identification of polypeptides by comparison of the coordinates of a spot on the autoradiogram with a reference gel of E. coli gene products run under standard conditions (189). Over the past two decades, Neidhardt, VanBogelen, and their colleagues have systematically identified E. coli gene products from the reference gel, producing a gene-protein database that now includes 386 identified polypeptides and 305 gene products (190). The fifth edition of the gene-protein database was published in 1992 (189), and the sixth edition will be published in the second edition of "Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology" (190) and is available from the National Center for Biotechnology Information in the database repository as Eco2DBase. The expansion of the gene-protein index makes the use of two-dimensional electrophoretic analysis to identify polypeptides whose expression is modulated by global regulatory proteins more and more powerful, provided the gels are run under conditions that allow comparison with the reference gels.
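The differential rate of synthesis defined for the double-label protocol (the 3H/14C ratio of a polypeptide divided by the 3H/14C ratio of total cellular protein) is a ratio of ratios, and the arithmetic can be sketched briefly. The following is only a minimal illustration, not the authors' analysis: the function names and the counts are hypothetical, and expressing an induction ratio as the quotient of two differential rates is an assumption made for the example.

```python
def differential_rate(spot_3h, spot_14c, total_3h, total_14c):
    """Differential rate of synthesis of one polypeptide.

    Computed as (3H/14C of the spot) / (3H/14C of total cellular protein).
    The 14C label reports the steady-state amount of each protein; the 3H
    pulse label reports synthesis during the sampling window.
    """
    if min(spot_14c, total_3h, total_14c) <= 0:
        raise ValueError("reference counts must be positive")
    return (spot_3h / spot_14c) / (total_3h / total_14c)


def induction_ratio(rate_after, rate_before):
    """Assumed definition: differential rate after the shift / rate before."""
    return rate_after / rate_before


# Hypothetical counts (cpm) for one spot before and after a temperature shift.
before = differential_rate(200.0, 1000.0, 50000.0, 250000.0)   # 1.0
after = differential_rate(1500.0, 1000.0, 60000.0, 250000.0)   # about 6.25
print(before, after, induction_ratio(after, before))
```

Dividing by the ratio for total protein is what corrects for the widespread change in net protein synthesis after the shift: in the hypothetical numbers above, total 3H incorporation rises by 20%, so the spot's raw 7.5-fold increase in 3H/14C corresponds to a 6.25-fold change in differential rate.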
By 1993, a two-dimensional electrophoretic analysis of the Lrp/leucine regulon (191) identified more than 30 polypeptides whose expression was altered by the absence of Lrp, and which therefore are presumed to be regulated, either directly or indirectly, by Lrp. Eight of these polypeptides corresponded to proteins that had been identified in the gene-protein database, and four of the polypeptides had not previously been identified as members of the regulon.

For the identification of regulon members, useful innovations would include a more reproducible means of analyzing polypeptides with basic isoelectric points by two-dimensional electrophoretic analysis. With equilibrium two-dimensional electrophoretic analysis (IEF-SDS analysis), the first dimension involves isoelectric focusing to equilibrium using an ampholine mixture; this results in good resolution of hundreds of spots. However, at present, only polypeptides with an isoelectric point of less than ~8 can be separated in the first dimension by isoelectric focusing using carrier ampholytes. Nonequilibrium two-dimensional electrophoretic analysis can be used for basic polypeptides (192), but it has the disadvantage of limited reproducibility. Techniques using immobilized pH gradients (IPGs) for isoelectric focusing (IPG-SDS analysis), which allow better resolution of many, but not all, basic polypeptides (193), are being developed. Lrp and most ribosomal proteins are examples of important proteins that are not detectable using either IEF-SDS analysis or IPG-SDS analysis.

The use of comparative two-dimensional electrophoretic analysis has some advantages compared to the use of operon fusions. Knowledge of the regulatory phenotype is not needed, and regulon members not displaying the expected phenotype can be identified. Additionally, the response of many genes and proteins to different conditions can be studied simultaneously (194). Genes are not disrupted by fusion to a reporter gene, so regulation of essential genes will be detected. Also, identification of genes in the regulon can be made directly from a single autoradiogram, thus avoiding time-consuming mapping or sequencing strategies. A potential disadvantage is that cellular proteins in very low abundance may not be detected by comparative electrophoretic analysis; fusion to a sensitive reporter gene could be used to reveal these regulon members. Furthermore, if the regulated polypeptide is not in the gene-protein database, identification is sometimes difficult.

Various methods can be used for verification of the identities of polypeptide spots on two-dimensional gels.
For example, if one believes that a given polypeptide responsive to Lrp levels is lysyl-tRNA synthetase form II (191), this can be tested in three ways: by coelectrophoresis of unlabeled purified lysyl-tRNA synthetase with labeled cell extract (the labeled spot should be visibly diluted); by electrophoretic analysis of a known mutant for lysyl-tRNA synthetase (in many cases the spot will move or disappear; transposon insertion mutants are particularly useful in this regard); and/or by N-terminal sequence analysis of the excised and eluted spot (195).

Reverse genetic procedures are increasingly being used to identify unknown polypeptides on two-dimensional gels. After electrophoretic separation, the polypeptides are electroblotted onto a polyvinylidene difluoride (PVDF) membrane, and pieces of the membrane corresponding to specific polypeptides are excised and subjected to N-terminal sequence analysis (195). A tentative identification can be made if the sequence matches a sequence currently available in databases such as GenBank or EMBL. Alternatively, Southern blotting with a degenerate oligonucleotide probe derived from the N-terminal sequence data can be used with the Kohara library (196) to locate the position of the corresponding gene, which can then be cloned and sequenced by traditional methods. This technique of reverse genetics was
used to clone, map, and sequence the gene encoding the universal stress protein (UspA) in E. coli after two-dimensional electrophoretic analysis revealed a polypeptide spot whose synthesis was greatly increased during growth inhibition (197). Gene mapping membranes containing all the clones of the miniset Kohara collection are now available and help to simplify this method (198). An alternative method using mass spectrometry for the identification of proteins isolated by two-dimensional electrophoretic analysis has the ability to detect comigrating and covalently modified proteins (199). At present these methods are most appropriate for relatively abundant, unidentified polypeptides, but improved technologies may soon make almost any detectable polypeptide susceptible to sequence analysis.

Once a tentative identification of a regulated polypeptide has been made, it is necessary to confirm the identification by chemical or genetic methods. Enzyme activity of regulated gene products can often be assayed biochemically in strains that are either wild type or mutant for the regulator gene. A null mutation or overexpression of the regulator gene, if not lethal, should result in a change in expression of the regulon members. Likewise, and particularly important, mutation of the target operator should lead to a loss of sensitivity to the proposed regulator. Finally, changing the physiological conditions that lead to a response by the regulator gene should result in a change in expression of the regulon members.
c. Analysis of Gene Expression by Nucleic Acid Hybridization. An entirely different method of identifying genes in the heat-shock regulon has been developed (200, 201). Dot blots were prepared from overlapping λ clones spanning the genome of E. coli K-12. Preparations of total mRNA from a control strain (e.g., wild-type cells grown at 16°C) and experimental strains (e.g., wild-type and rpoH strains grown at 42°C) were used to prepare cDNAs that were labeled with 32P by random priming. The labeled cDNAs were used to probe the DNA dot blots. Each λ clone contains between 9 and 21 kb of DNA, enough to code for more than a dozen genes. By comparison of the dot blots of the control and experimental strains, λ clones containing genes that are strongly induced by σ32 and/or by heat can be identified. In this manner, 26 new heat-shock genes of E. coli were identified (201), in addition to 10 genes that had previously been identified by other methods. This procedure has the advantage that the new heat-shock genes are simultaneously detected, cloned, and mapped. Furthermore, it is suitable for the identification of genes required for cell viability. It has the disadvantage of a relatively low signal-to-noise ratio, because most of the genes coded for by a given insert in the λ clone will not be affected by either heat or σ32. It was estimated that genes induced or repressed more than twofold by heat shock
and/or σ32 were best detected. Furthermore, this method of identification does not provide information about whether the effect of σ32 on transcription of the gene is direct or indirect.

Chuang and Blattner (201) extended the analysis to proteins produced by active λ clones in response to heat shock. They looked at patterns of expression of genes in the λ clone by using UV irradiation to inactivate the chromosomal DNA as a transcription template and then infecting with the λ clone and pulse labeling. In these experiments, the host cells were either grown at 16°C or pretreated by heat shocking before irradiation. They also looked at patterns of expression from λ clones by in vitro transcription with σ70- and/or σ32-RNA polymerase holoenzyme. In this way, 16 of their 26 new heat-shock proteins were shown to be regulated by σ32-RNA polymerase. Using Southern blotting, they were able to map the heat-shock-induced sequences to within a few kilobases in the λ clones. This procedure provides a very powerful way to identify members of a regulon regardless of the phenotypes of the target genes, and will become increasingly useful as the sequencing of the E. coli genome is completed.
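The signal-to-noise limitation noted for the dot-blot method can be made concrete with a small sketch. The code below is hypothetical and is not the published procedure: it assumes, purely for illustration, that every gene on an insert contributes equally to the basal signal, and it applies a twofold cutoff to the clone-level signal. The clone names and intensities are invented.

```python
def clone_level_ratio(n_genes, gene_fold):
    """Fold change in total clone signal when one gene changes gene_fold-fold.

    Assumes (hypothetically) equal basal expression of all n_genes genes on
    the insert, with the other n_genes - 1 genes unaffected.
    """
    if n_genes < 1:
        raise ValueError("a clone must carry at least one gene")
    return (n_genes - 1 + gene_fold) / n_genes


def flag_clones(control, experimental, fold=2.0):
    """Return {clone: ratio} for clones whose signal changes >= fold-fold."""
    flagged = {}
    for clone, ctrl in control.items():
        expt = experimental.get(clone)
        if expt is None or ctrl <= 0 or expt <= 0:
            continue  # skip clones without a usable pair of measurements
        ratio = expt / ctrl
        if ratio >= fold or ratio <= 1.0 / fold:
            flagged[clone] = ratio
    return flagged


# One gene induced 10-fold among a dozen genes raises the dot signal 1.75-fold.
print(clone_level_ratio(12, 10.0))

# Invented dot intensities for a control and a heat-shocked culture.
control = {"clone_A": 100.0, "clone_B": 80.0, "clone_C": 120.0}
heat = {"clone_A": 350.0, "clone_B": 88.0, "clone_C": 40.0}
hits = flag_clones(control, heat)
print(sorted(hits))
```

Under the equal-expression assumption, a tenfold induction of a single gene is diluted to a 1.75-fold change in the signal of a twelve-gene clone, which illustrates why unaffected genes on the same insert reduce sensitivity.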
d. Limitations and Warnings. In searching for regulon members, it is important to distinguish between transient and steady-state regulation. Regulation of target genes may occur at discrete points in the growth curve if the concentration of the regulator(s) varies with the growth cycle. Although many regulators are produced throughout exponential growth, some regulators, such as Fis, are maximally produced during the first few cell divisions (124); others, such as IHF, are maximal during transition to the stationary phase of growth (114); and yet others, such as RpoS, have their maximal expression during the stationary phase of growth (202-204). Genes regulated at points during the growth curve other than exponential growth are considered to be transiently regulated. Tests for specific regulon members need to be conducted during the portion of the growth curve in which the regulator is expressed. This is particularly relevant when libraries are screened on petri plates; for example, lacZ fusions to starvation-induced genes expressed only long after nutrient exhaustion (205) may not turn blue on X-Gal plates for some time after the colony has appeared. Induction of genes after a shift in conditions is also often transient, and appropriate studies must be conducted to determine when the effect will be maximal (see Section III,A).

Finally, it is important to recognize the limitations experimentally imposed by a narrow selection of growth media and conditions. Medium choice may have a profound effect on the degree of regulation observed. Regulation in response to specific nutritional conditions will be very sensitive to slight changes in composition of the medium. To some extent, our limited understanding of the natural environments encountered by E. coli (and our limited
ability to mimic those environments in the laboratory) parallels our limited understanding of conditions inside the living cell. This represents another area where technical innovations would be useful.
B. How Can Actions of the Regulator Be Studied in Vivo?

When studying gene regulation, the ability to understand what is actually occurring inside the cell, rather than what potentially could occur inside the cell, is of the greatest interest. This distinction marks the difference between in vivo and in vitro experiments. Many DNA-binding proteins have similar binding motifs and specificities, and many promoters have overlapping DNA-binding regions. This complexity can make interpretation of in vitro binding results tricky: the fact that a purified protein binds to a cloned DNA region in vitro does not establish that it is responsible for the observed in vivo regulation of that gene. The ability to do in vivo experiments makes it possible to begin to elucidate the mechanism of regulatory control under the conditions leading to gene regulation. The coupling of in vivo and in vitro experiments provides particularly strong evidence of how regulons work.
1. DETERMINING THE CONCENTRATION OF THE REGULATORY PROTEIN AND ITS COREGULATORS

To make any sense of regulatory patterns shown by a regulon member, it is essential to know how levels of the regulator vary in vivo. Expression of the regulator can be quantitated by Western blot analysis with specific antisera (206). To understand the physiology of global regulators, it is very important to have information on their levels of expression under different conditions. Knowledge of the variation in expression levels during growth in culture is particularly important. If the concentration of a regulatory protein varies during the growth cycle, and cultures are analyzed at different stages of growth and compared, highly misleading conclusions may be drawn about the effects of the global regulatory protein.

The Fis protein shows particularly dramatic variation in expression as a function of growth phase (124). Johnson and colleagues measured the concentration of Fis in cells by Western blotting using a polyclonal antibody to Fis. A stationary overnight culture was diluted 100-fold into prewarmed Luria broth and incubated at 37°C. Samples were removed at intervals to monitor the number of viable cells and the concentration of Fis. An important control in these studies was the addition of known amounts of purified Fis to extracts of cells from a strain lacking the Fis protein, so that a standard curve for Fis recovery could be determined. Cells in stationary phase contained less than 50 dimers of Fis per cell. Fis levels were low during the lag
phase of growth (<5000 dimers per cell) and rose rapidly as cells entered exponential growth to a maximal level of ~30,000 dimers per cell. The maximal concentration of Fis occurred at approximately the time of the first cell division, as measured by the increase in optical density of the culture. When cell density exceeded 1.2 × 10^8 viable cells/ml, Fis levels began to decline, even though exponential growth continued for several more hours. Further experiments demonstrated that this decline in Fis levels was not due to instability of the protein, but rather to cessation of synthesis and subsequent dilution of the intracellular Fis concentration by cell division.

Methods such as Western immunoblotting determine the concentration of a regulatory protein, and not its activity. Because DNA-binding proteins are bound to chromosomal and plasmid DNA both specifically and nonspecifically, and because the occupancy of any given site is determined both by the affinity of that site for the regulatory protein and by the free concentration of the protein, the measured concentration and the effective concentration of the regulatory protein may differ by orders of magnitude (207), as was discussed for RNA polymerase in Section I,B,2. It is more difficult, and perhaps more critical, to determine the concentration of regulator that is actually available within the cell (the “effective” concentration; one method for determining the effective concentration of a regulatory protein is described in Section III,C; another is described below). One must be cautious in interpreting regulation observed in vitro using concentrations of regulator or DNA that are orders of magnitude different from the effective in vivo concentrations.

In addition to knowing the in vivo concentration of protein regulators, it is also often important to determine the in vivo concentration of metabolites acting as coregulators.
Thus, for the Lrp regulon in which leucine affects regulation of target genes by Lrp, the effective intracellular concentration of leucine is a critical variable. Quay et al. (208a), in their work on transport systems in amino-acid metabolism, describe the measurement of intracellular levels of the neutral and acidic amino acids. Their method involves the suspension of cell pellets in a solution containing norleucine. The acidic and neutral amino acids are then analyzed with an amino-acid analyzer using the norleucine as an internal control. Schutt and Holzer (208) also describe a method for the determination of intracellular metabolite levels, including amino acids and ATP. Using an amino-acid analyzer and ion-exchange resins, they determined the amino-acid concentration by measuring the concentration difference between medium with cells and medium without cells. ATP concentrations within cells were determined using a luciferin-luciferase reaction and a scintillation counter to quantitate emitted light. Such painstaking analyses are often disdained as being “old-fashioned biochemistry,” but these measurements are essential for understanding the physiology of global
regulation. As our knowledge of genetic regulation grows, our ability to ask more detailed questions will also increase. We need to gain a better understanding of how the actual conditions within the cell (i.e., high DNA concentrations, changes in degree of supercoiling, high concentrations of many autologous DNA-binding proteins) affect the interactions between a particular regulatory protein and its target operon(s).

One way of assessing whether the in vivo concentration of regulator protein, as determined by one of the above methods, is reasonable for the regulation of its target genes is to perform in vivo titrations with the regulatory protein. Such titrations may also provide information about the mechanism of regulation. Haggerty et al. (209) used an amber mutation in araC, present in a strain with temperature-sensitive amber suppressors, to vary the intracellular AraC level, and then calculated the inducibility of the araBAD operon. Using three temperature-sensitive suppressor strains and temperatures between 31 and 40°C, they obtained a 250-fold range in suppressor activity. They observed a direct linear relationship between the level of AraC and the inducibility of araBAD, and they determined that the inducibility of araBAD never went above the wild-type level. Their results indicated that the normal intracellular level of AraC was just sufficient for the observed regulation of the araBAD operon.

Isopropyl thiogalactoside (IPTG) titration using either the lacUV5 promoter or the tacI promoter has been described as an effective method for modulating expression of cloned genes by varying the concentration of inducer (210). We are currently using a similar system to vary the intracellular level of Lrp in order to study the effect that changes in Lrp concentration may have on the pattern of regulation of Lrp-regulated genes by the coregulator, leucine. We have proposed that sensitivity to leucine depends on the effective intracellular Lrp level.
Target promoters with a high affinity for binding Lrp are less sensitive to leucine at normal in vivo concentrations of Lrp than are low-affinity promoters (211). This model can be tested by varying the intracellular Lrp concentration from very low levels to several times the wild-type level. To achieve this, we are using E. coli strain AAEC546 (212), which contains a chromosomal IPTG-inducible transcriptional fusion between the lacUV5 promoter and lrp. A library containing random lacZ operon fusions was constructed in strain AAEC546 (213), and Lrp-regulated fusions were identified. Induction with a ramp of increasing IPTG concentrations results in a corresponding concentration ramp of Lrp within the cells (verified by Western immunoblot analysis of cell extracts). The effect of the different Lrp concentrations on target gene expression is monitored by measuring β-galactosidase expression and can be compared to experiments conducted in vitro with known concentrations of purified Lrp and purified plasmid-encoded target DNA. In this manner, the effective
concentration of Lrp under in vivo conditions can be estimated and in vivo titration curves of the expression of different Lrp-regulated fusions as a function of Lrp concentration can be constructed (S. P. Bhagwat, D. W. Borst, R. M. Blumenthal, and R. G. Matthews, unpublished data). The data thus obtained are likely to be physiologically relevant, because environmental conditions of varying nutrient richness in the case of E. coli (gut vs. gutter) appear to result in different intracellular Lrp levels.
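The logic of comparing an in vivo titration with in vitro binding data can be sketched with the simplest possible binding model. The code below is an illustrative assumption, not the model used by the authors: it treats promoter occupancy as noncooperative single-site binding, equates the fraction of maximal reporter response with occupancy, and uses invented dissociation constants in arbitrary concentration units.

```python
def occupancy(free_conc, kd):
    """Fractional occupancy of a single site: [R] / ([R] + Kd)."""
    if free_conc < 0 or kd <= 0:
        raise ValueError("free_conc must be >= 0 and kd must be > 0")
    return free_conc / (free_conc + kd)


def effective_concentration(fraction_of_max, kd):
    """Invert the binding curve: free regulator that gives this response.

    fraction_of_max is the in vivo response as a fraction of the saturated
    response; kd is taken from in vitro measurements (an assumption).
    """
    if not 0.0 < fraction_of_max < 1.0:
        raise ValueError("fraction_of_max must lie strictly between 0 and 1")
    return kd * fraction_of_max / (1.0 - fraction_of_max)


# Hypothetical dissociation constants for two target promoters.
KD_HIGH_AFFINITY = 5.0
KD_LOW_AFFINITY = 500.0

# Halving the free regulator barely changes a nearly saturated site,
# but almost halves the occupancy of a weak site.
for kd in (KD_HIGH_AFFINITY, KD_LOW_AFFINITY):
    print(kd, occupancy(100.0, kd), occupancy(50.0, kd))

# Reading an effective concentration back from an observed response.
print(effective_concentration(0.5, KD_HIGH_AFFINITY))  # 5.0
```

With these invented numbers, the high-affinity site loses under 5% of its occupancy when the free regulator is halved, whereas the low-affinity site loses about 45%. This is a general property of saturable binding and shows why an estimate of the free, rather than the total, regulator concentration matters when titration curves are interpreted.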
2. HOW DOES THE REGULATORY PROTEIN INTERACT WITH TARGET OPERONS in Vivo?

a. DNA-Protein Interactions. The ability to detect, under different experimental conditions, in vivo binding of proteins to a regulated gene is strong evidence for inclusion of that gene as a member of a specific regulon and is extremely informative in studies of the mechanism of gene regulation. In order to obtain in vivo footprinting data, one must have a reagent probe that readily permeates the cell membrane, is nondisruptive to protein-DNA interactions, is completely nonspecific in its DNA interactions, has an easy method of detection, and is not unduly toxic to those carrying out the experiments. Although a large number of footprinting reagents are currently in use (reviewed in Refs. 214 and 215), none of them has all the properties listed above. Reagents that are currently used for in vivo footprinting studies include psoralens, alkylating agents such as dimethyl sulfate or N-ethyl-N-nitrosourea, and ultraviolet radiation. Less commonly used are potassium permanganate and osmium tetroxide. Dimethyl sulfate is probably the most frequently used probe for in vivo footprinting because of its membrane permeation and its well-understood reactivity with nucleotides. Methylation by dimethyl sulfate occurs predominantly at the N-7 position of guanine in the major groove, and there is lesser reactivity at the N-3 position of adenine in the minor groove. There are several methods of determining the footprint following methylation of these bases (214).

In analyzing the results of in vivo experiments, it is essential to consider how similar the experimental system is to the wild-type situation. If a target gene is present on a multicopy plasmid and the regulator is not a highly abundant protein, it is quite possible to “titrate out” the regulator.
Similarly, if the regulator is overproduced by cloning its gene downstream of a strong promoter, the sensitivity of the target operons to the concentration of the regulator and the responsiveness of the regulator to varied physiological conditions in the cell may be greatly altered. Even if a gene has been placed on the chromosome at a position other than its native site, there may still be copy-number effects due to its location on the chromosome (46, 47): genes located near the origin of replication are preferentially amplified during growth in richer media.
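The position-dependent copy-number effect can be illustrated with the standard Cooper-Helmstetter description of chromosome replication, which is not part of the text above and is brought in here only as an assumed model. In that description, the average dosage of a gene relative to the terminus is 2 raised to the power C(1 - m)/τ, where C is the time needed to replicate the chromosome, τ is the doubling time, and m is the fractional distance of the gene from the origin (0) to the terminus (1). The C value of 40 minutes used below is a placeholder, and C is treated as constant, which is itself an approximation.

```python
def dosage_relative_to_terminus(m, doubling_time_min, c_period_min=40.0):
    """Average gene dosage relative to a terminus-proximal gene.

    m: fractional map position (0.0 at the origin, 1.0 at the terminus).
    c_period_min: assumed chromosome replication time; 40 min is a
    placeholder, not a value taken from the text.
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie between 0 and 1")
    if doubling_time_min <= 0 or c_period_min <= 0:
        raise ValueError("times must be positive")
    return 2.0 ** (c_period_min * (1.0 - m) / doubling_time_min)


# Origin-proximal gene in a fast-growing culture (20-min doubling time).
print(dosage_relative_to_terminus(0.0, 20.0))   # 4.0
# The same gene in a slow-growing culture (80-min doubling time).
print(dosage_relative_to_terminus(0.0, 80.0))   # about 1.41
```

Under these assumptions, moving a gene from the terminus to the origin changes its average dosage fourfold in rich medium but only about 1.4-fold in poor medium, so the same relocated construct can behave differently depending on growth rate.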
b. Growth Conditions. Reproducibility of results is of great importance in studies of gene regulation. A review (216) discusses steady, balanced, and exponential states of growth, and points out the experimental consequences of each. Most microbiologists studying regulation use batch cultures (i.e., flasks of liquid medium with a low initial inoculum maintained at a constant temperature, with aeration provided by shaking at a constant rpm) for their experiments. Such batch cultures exhibit the typical phases of lag, exponential, and stationary growth. Even during the exponential phase of growth, however, such cultures are not in true steady-state growth because the composition of the growth medium changes as metabolites are consumed and waste products are produced. If the cultures are successively diluted into fresh, prewarmed medium, the resulting extended exponential phase can approximate steady-state growth, although the stringent definition of true steady-state growth (all properties of the system are invariant) can only be met by the use of a turbidostat.

The complexity of growth in batch culture is illustrated by studies on the constancy of growth on simple and complex media in batch culture (217). Continuous and very precise measurements with a specialized spectrophotometric apparatus revealed changes in the growth rate of batch cultures during exponential growth. The fluctuations in the specific growth rates were dependent on the medium: with succinate minimal medium, cultures exhibited a gradual increase in growth rate that never became constant. With both glucose-M9 nutrient broth and Luria broth, growth rates slowed progressively during exponential growth, probably due to nutritional diauxies. These fluctuations in growth rate may be a source of variability among experiments even if the researcher attempts to sample cultures at the same point in growth.
The effect of limiting substrates and inhibitory metabolic products on the regulation of growth has also been studied (218). As mentioned above, such chemical factors result from growth in batch culture. By calculating changes in cellular concentrations between cultures that were successively diluted with culture filtrate (so that cell density decreased but the chemical composition was unaltered) and undiluted cultures, a method for identifying growth control density factors in the culture medium was developed. Using monocultures of E. coli growing in glucose medium, it was shown that glucose regulation (due to changes in glucose concentration during growth) was responsible for only 0.1-40% of the changes in the “feedback level” of the cells (measured spectrophotometrically as changes in biomass concentration). These results indicate that in glucose medium, there are factors other than glucose that are produced or consumed by the bacteria that affect their growth.

In contrast to batch cultures, continuous culture techniques allow greater control of the culture. Continuous cultures can be obtained by the use of chemostats or turbidostats. Chemostats regulate cell growth by limiting the supply of a growth factor (external control), whereas with a turbidostat all nutrients are in excess, so the cells reach and maintain their maximum growth rate (internal control) (216). Cells grown in a turbidostat are in true steady-state growth because the composition of the growth medium is constant and the cells are growing exponentially.

We are using a modified turbidostatic approach to determine how Lrp levels vary throughout the growth cycle in both rich and minimal media. Batch cultures are inoculated to a low density with a wild-type E. coli strain and their growth is followed spectrophotometrically. At various culture densities, fresh, prewarmed medium is pumped into the flasks and old medium is pumped out at the same rate to maintain the cultures at a precise optical density. After a length of time corresponding to two cell doublings, samples are removed and Lrp concentration is determined by Western immunoblot analysis with purified antibody to Lrp. The cultures are then allowed to increase in density to the next desired sampling point, where the process is repeated. Using this method, we hope to determine true steady-state levels of Lrp at specific points during the growth cycle. These levels will be compared to those obtained by standard growth in a batch culture.
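The quantities involved in following growth and in holding a culture at constant optical density can be sketched as follows. This is a hedged illustration under simple assumptions (exponential growth between readings, a constant culture volume, and optical density proportional to biomass); the function names, readings, and 100-ml volume are hypothetical and do not describe the apparatus used in the cited studies.

```python
import math


def specific_growth_rate(od_start, od_end, minutes):
    """Specific growth rate (per min) assuming exponential growth."""
    return math.log(od_end / od_start) / minutes


def doubling_time(mu):
    """Doubling time in minutes for a specific growth rate mu (per min)."""
    return math.log(2.0) / mu


def pump_rate_ml_per_min(mu, volume_ml):
    """Flow that holds density constant: dilution rate F/V must equal mu."""
    return mu * volume_ml


def windowed_rates(times_min, ods):
    """Growth rate over each successive interval of an OD time series."""
    return [
        specific_growth_rate(ods[i], ods[i + 1], times_min[i + 1] - times_min[i])
        for i in range(len(ods) - 1)
    ]


# Hypothetical readings: the last interval shows slowing growth.
times = [0.0, 30.0, 60.0, 90.0]
ods = [0.05, 0.10, 0.20, 0.36]
rates = windowed_rates(times, ods)

mu = rates[0]
hold_minutes = 2.0 * doubling_time(mu)      # two doublings, about 60 min
flow = pump_rate_ml_per_min(mu, 100.0)      # about 2.3 ml/min for 100 ml
print(rates, hold_minutes, flow)
```

Computing the rate interval by interval, rather than fitting a single line to the whole exponential phase, is what exposes the kind of gradual drift in growth rate described for batch cultures.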
III. Experimental Analysis of Global Regulators and Their Roles in Escherichia coli: Some Examples

At first, the array of different mechanisms by which global regulators exert their function in E. coli may seem bewildering. Some regulons are controlled by choosing an alternate sigma factor to reprogram the promoter specificity of RNA polymerase. Others are controlled by two-component regulatory systems in which phosphorylation or dephosphorylation of the response regulator is used to modulate its effect on transcription. Many regulators are allosterically modulated by cofactors, which influence DNA binding or the effect of the regulator once bound. Furthermore, few operons are controlled by just one regulator. In this section we attempt to provide some generalizations about the specific properties of three classes of global regulators, and how they differ from each other, in the context of analyzing how various experimental approaches have been brought to bear on selected global regulators.

In the interest of brevity, we do not discuss the histone-like proteins, such as Fis, IHF, H-NS, and HU, although they pose very interesting questions
in their own right (7, 8, 117). They are associated both with specific regulation and, to varying degrees, with global alterations in chromosome structure. As described in Section I,C,2, these regulatory proteins are present in very high copy number in cells. Several of them, e.g., Fis and IHF, are regulated in response to the growth phase of the culture (114, 124), suggesting important physiological roles for these proteins. Their very high concentrations lead to extensive nonspecific DNA binding as well as specific binding, making it difficult to distinguish specific effects on gene expression from general effects on chromosomal organization or on masking effects, described in Section I,B,2.
A. Use of Two-dimensional Gel Electrophoretic Analyses to Study the Heat-Shock Regulon

1. BACKGROUND
One strategy for global regulation is reprogramming of the specificity of RNA polymerase by substituting an alternate sigma factor for σ⁷⁰. In E. coli, four alternate sigma factors have now been identified. Two alternate sigma factors, σ³² [the htpR or rpoH gene product (219)] and σᴱ (220, 221), are induced when E. coli is exposed to high temperatures. Transcription of genes and operons associated with nitrogen assimilation and conservation using RNA polymerase containing σ⁵⁴ (the rpoN or ntrA gene product) is stimulated when the growth of E. coli is limited by nitrogen availability (222). σˢ (the katF or rpoS gene product) is induced when E. coli cells make the transition from exponential growth to stationary phase due to carbon starvation (203, 223). These alternate sigma factors each compete with σ⁷⁰, and with one another, for binding to core RNA polymerase both in vitro (223, 224) and in vivo (225). They have distinct specificities for consensus sequences in the promoter regions of their target genes (67, 220, 224), and thus redirect RNA polymerase to transcribe genes and operons that may not be transcribed at high levels by σ⁷⁰-containing holoenzyme. If expression of the alternate sigma factor is sufficiently high, a large portion of the RNA polymerase may be sequestered for the transcription of the subset of genes and operons recognized by that sigma factor, and transcription from σ⁷⁰-dependent promoters may be depressed (225). Alternate sigma factors can be used to redirect cellular transcription capacity drastically when expression of a completely new set of genes is required, and the redirection is often associated with decreased levels of expression of σ⁷⁰-dependent genes.

2. THE HEAT-SHOCK REGULON

We first discuss the heat-shock regulon controlled by RpoH (also called HtpR). Our emphasis in discussing transcriptional regulation by RpoH is to
focus on the experimental methods used to obtain information about this regulon and its ability to reallocate gene expression capacity. We focus particularly on information about the regulon that has been obtained by two-dimensional electrophoretic analysis. We believe that the approaches employed are generally applicable to the study of global regulation.

A mutant, originally isolated as being temperature-conditional lethal in a background that contained a temperature-sensitive nonsense suppressor (226), was unable to induce a whole set of polypeptides that are normally induced on shift to high temperature (227, 228). The rpoH (htpR) gene was cloned by complementation of the mutant phenotype and mapped to 75 minutes on the E. coli chromosome (187). Nucleotide sequence analysis of the heat-shock regulatory gene indicated similarities with σ⁷⁰ of RNA polymerase (219), and the purified htpR gene product could be mixed with core RNA polymerase and used to initiate transcription at heat-shock promoters (229). Purified RNA polymerase holoenzyme containing RpoH (σ³²) exhibited promoter selectivity clearly distinct from that of the σ⁷⁰ holoenzyme (224). It was concluded that the spectrum of gene expression in E. coli is under dynamic control by intracellular levels of competing σ subunits.

Weissbach and co-workers (225) isolated RNA polymerase from cells grown at 33°C, cells shifted from 33 to 40°C, and cells shifted from 33 to 45°C. They then used immunoblot analysis with antibodies to σ⁷⁰ and σ³² to quantitate the relative amounts of the two sigma factors bound to RNA polymerase. The fraction of RNA polymerase associated with σ⁷⁰ was very significantly decreased in cells that had been shifted to 40°C, and even more decreased (to 40% of the preshift level) in cells that had been shifted to 45°C.
In contrast, the fraction of RNA polymerase associated with σ³² increased seven- to eightfold; after a shift to 40°C the amounts of bound σ⁷⁰ and σ³² were approximately 60 and 40% of the total holoenzyme, respectively, whereas after a shift to 45°C approximately 75% of the RNA polymerase holoenzyme contained bound σ³², and 25% contained σ⁷⁰. Based on immunoblot analysis of the level of the RNA polymerase core ββ′ and α subunits, the amount of RNA polymerase core in the cell extracts did not change following shifts to higher temperatures, but the total amount of σ⁷⁰ in whole-cell extracts increased slightly. These results are consistent with the demonstration that σ⁷⁰ and σ³² compete for binding to the core RNA polymerase (224), and suggest that a very significant fraction of the core polymerase is sequestered for the transcription of heat-shock genes following a shift to 45°C.

Is the role of σ³² merely to determine promoter specificity, while other transcription factors control the activation or repression of heat-shock genes? Or is σ³² a regulatory protein that directly controls the level of transcription of the heat-shock genes? If it is a regulatory protein, the level of active σ³²
must be regulated, and must correlate with the levels of expression of heat-shock proteins in the cell. Indeed, many studies show that the expression of σ³² is regulated at both transcriptional and post-transcriptional levels. Furthermore, these studies have shown that elevation of the concentration of σ³² in the cell leads to increased synthesis of heat-shock proteins. Overexpression of σ³² by induction of a strain carrying a transcriptional fusion of either the P, promoter of phage λ or Plac to rpoH induced the synthesis of heat-shock proteins (112, 230), and S-30 extracts from cells that had been shifted from 33 to 45°C initiated the transcription of heat-shock genes approximately eightfold better than extracts from cells grown at 33°C (225).

Several lines of evidence indicate that the synthesis of σ³² is primarily regulated at the translational level. Temperature-regulated synthesis of σ³² is not seen in rpoH::lacZ transcriptional fusions but is seen in rpoH::lacZ translational fusions, provided that most of the rpoH gene is present in the fusion (183, 184, 231). These findings have been interpreted as indicating that the rpoH mRNA contains internal cis-acting sequences that can block translational initiation under noninducing conditions. And although rpoH message accumulates after a rise in temperature, the increase in σ³² levels precedes the increase in rpoH message accumulation (232), indicating that transcriptional regulation is not the important factor in regulation of σ³². Perhaps the most important mode of control of σ³² levels is achieved by sequestration of σ³² in a complex with DnaK, DnaJ, and GrpE (reviewed in Ref. 233). Formation of this complex prevents efficient binding of σ³² to RNA polymerase in vitro and can disrupt a σ³²-RNA polymerase complex.
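The competition between sigma factors for core RNA polymerase, discussed earlier in this section, can be illustrated with a simple equilibrium calculation. The sketch below assumes that core polymerase is limiting and fully occupied, so that each sigma factor captures a share of core proportional to its concentration divided by its dissociation constant. This is a textbook competitive-binding approximation rather than the analysis used in the cited immunoblot study, and the function name, species labels, and numerical values are hypothetical.

```python
def holoenzyme_fractions(sigma_conc, k_d):
    """Fraction of core RNA polymerase bound by each competing sigma factor.

    Assumes limiting, saturated core at equilibrium: each sigma factor's share
    is proportional to concentration / dissociation constant (arbitrary units).
    """
    weights = {name: sigma_conc[name] / k_d[name] for name in sigma_conc}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

if __name__ == "__main__":
    # Hypothetical concentrations and equal affinities, for illustration only.
    concentrations = {"sigma70": 1.0, "sigma32": 3.0}
    affinities = {"sigma70": 1.0, "sigma32": 1.0}
    for name, fraction in holoenzyme_fractions(concentrations, affinities).items():
        print(f"{name}: {fraction:.2f}")
```

With equal affinities, a threefold excess of one sigma factor gives a 75:25 split of holoenzyme. The split reported after the shift to the highest temperature has the same proportions, although in the cell it reflects both the concentrations and the relative affinities of the two factors.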
Any mechanism for induction of transcription of the heat-shock genes by σ³² must also explain the transient nature of the increase in the relative differential synthesis of the heat-shock proteins after a temperature shift (186). The net rate of synthesis of σ³² decreases during the shut-off phase of the heat-shock response (230). This decrease requires the heat-shock protein DnaK, and involves post-transcriptional control of rpoH expression. It is very likely that formation of the complex between DnaK, DnaJ, and σ³² leads to degradation of the sigma factor by targeting it for ATP-dependent proteolysis (138, 233).

FIG. 4. The signal transduction pathway for the heat-shock response. One particular model, the DnaK titration model, for the detection of heat and the regulation of the concentration of σ³², is presented here (234). According to this model, an elevation in temperature is detected as an imbalance between the rate of protein synthesis and the rate of protein folding. DnaK, which acts as a molecular chaperone, is bound to the unfolded protein, and is therefore unavailable for formation of the DnaK-DnaJ-GrpE complex that sequesters and degrades σ³². Levels of σ³² then increase, the transcription of the heat-shock genes is elevated, and protein folding is enhanced. Once balance between protein synthesis and protein folding is achieved, DnaK concentrations increase again and the rpoH message is sequestered and degraded at a level that reflects the new steady-state requirements for molecular chaperones and proteases.

Figure 4 summarizes our current understanding of the flow of information in the heat-shock regulon controlled by σ³². The stimulus, which may be heat, ethanol, or expression of a heterologous protein ( I E ), leads to an imbalance of the rate of protein folding with the rate of translation, and the appearance of unfolded or misfolded proteins in the cytoplasm of the cell. The sensor of this stimulus is not definitely identified, nor is the signal that is transmitted to induce the accumulation of σ³². However, an attractive model for detection of the heat-shock signal, referred to as the DnaK titration model, has been proposed (234), and recent evidence provides strong support for this model (138, 233). When the temperature rises sufficiently that proteins fail to fold properly, or the rate of protein synthesis outstrips the rate of protein folding, the concentration of free DnaK chaperone in the cell is proposed to fall, and most of the cellular DnaK will be bound to unfolded protein. Under these circumstances, DnaK is not available to bind to σ³². The molecular chaperone composed of the DnaK, DnaJ, and GrpE polypeptides appears to play a direct role in the shut-off of σ³² production after a temperature upshift (138, 230, 232, 233), and physical interactions between these proteins and σ³² have been demonstrated (138, 233). Thus when the concentration of unfolded cellular proteins is elevated, the concentration of the free DnaK will be low, and degradation of σ³² will be slowed. The heat-shock response is self-limiting because when σ³² stimulates sufficient synthesis of heat-shock proteins, including DnaK, the balance between protein synthesis and protein folding will be restored, the concentration of unfolded
proteins in the cell will drop, and DnaK will be released and bind to σ³² with resultant sequestration and degradation of the sigma factor.
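The self-limiting behavior predicted by the DnaK titration model can be illustrated with a toy simulation. The sketch below is not a published model: the rate laws, the parameter values, and all names (`simulate_heat_shock`, `load_after`, `free_dnak`) are hypothetical choices made only to show the qualitative feedback loop. It assumes that σ³² is made at a constant rate and degraded in proportion to free DnaK, that DnaK follows σ³² with a lag, and that the fraction of DnaK left free falls as the unfolded-protein load rises. Concentrations are in arbitrary units scaled so that the preshift steady state is 1.

```python
def simulate_heat_shock(load_after=6.0, t_shift=5.0, t_end=45.0, dt=0.01, lag_rate=0.5):
    """Toy Euler simulation of the DnaK titration feedback loop (arbitrary units).

    sigma32 : made at constant rate 1, degraded at rate free_dnak * sigma32
    dnak    : relaxes toward sigma32 at rate lag_rate (sigma32-driven synthesis)
    load    : unfolded-protein burden, zero before the shift, load_after afterwards
    """
    sigma32, dnak = 1.0, 1.0                  # preshift steady state
    times, sigma32_trace = [], []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps + 1):
        t = step * dt
        times.append(t)
        sigma32_trace.append(sigma32)
        load = load_after if t >= t_shift else 0.0
        # Unfolded protein titrates DnaK; more DnaK also clears the load faster.
        free_dnak = dnak * dnak / (dnak + load)
        d_sigma32 = 1.0 - free_dnak * sigma32
        d_dnak = lag_rate * (sigma32 - dnak)
        sigma32 += dt * d_sigma32
        dnak += dt * d_dnak
    return times, sigma32_trace

if __name__ == "__main__":
    times, trace = simulate_heat_shock()
    peak = max(trace)
    print(f"preshift sigma32: {trace[0]:.2f}")
    print(f"peak sigma32:     {peak:.2f} at t = {times[trace.index(peak)]:.1f}")
    print(f"final sigma32:    {trace[-1]:.2f}")
```

In this sketch σ³² rises sharply after the shift, overshoots, and then settles at a new, higher steady state as DnaK accumulates. This matches the transient induction followed by partial shut-off described above, although the magnitudes carry no quantitative meaning.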
3. IDENTIFICATION OF POLYPEPTIDES INDUCED BY HEAT SHOCK AND STUDY OF THE KINETICS OF THEIR INDUCTION
The effects of temperature shifts on the rates of synthesis of selected individual polypeptides in a wild-type E. coli strain were first examined by Neidhardt and colleagues (186). Individual polypeptides were radioactively labeled and resolved by two-dimensional electrophoresis to determine their differential rates of synthesis. The differential rates of synthesis of four polypeptides increased transiently after the shift from 28 to 42°C during growth in a synthetic rich medium (186). The alphanumerics and, where known, the identities of the polypeptides induced after a shift to 42°C are shown in Table II (189, 236, 237). The two most strongly induced polypeptides, GroEL (235) and ClpB (236), were subsequently identified as heat-shock proteins expressed from rpoH-dependent promoters. The maximal differential rates of their synthesis were observed within 7 minutes of the temperature shift, and decreased over the next 5 minutes, although not to preshift levels. In contrast, transient reduction in the relative differential rates of a variety of proteins was observed, as summarized in Table III (238, 239). Maximum decreases in the differential rates of synthesis of these peptides were observed 4 to 8 minutes after the temperature shift, followed by return toward the preshift levels. These transient effects are changes in differential rates of synthesis, and therefore cannot be explained by a general change in protein synthesis rate.

TABLE II
POLYPEPTIDES SHOWING INCREASED SYNTHESIS AFTER A SHIFT FROM 28 TO 42°Cᵃ

Alphanumeric   Gene (protein)                        Relative rate of synthesis (maximal)   Identification reference
B56.5          groEL (GroE, large subunit)           11.5                                   (235)
F24.5          ompA (outer membrane porin A)         2.0                                    (189)
F84.1          clpB (ATP-dependent protease)         35.7                                   (236)
G32.8          Not identified                        3.2                                    -

ᵃ From Ref. 186.

TABLE III
POLYPEPTIDES SHOWING REDUCED SYNTHESIS AFTER A SHIFT FROM 28 TO 42°Cᵃ

Alphanumeric   Gene (protein)                                Relative rate of synthesis (minimum level)   Identification reference
E58.0          argS (arginyl-tRNA ligase)                    0.7                                          237
F58.5          aspS (aspartyl-tRNA ligase)                   0.5                                          189
G61.0          glnS (glutamine-tRNA ligase)                  0.8                                          189
F47.8          gltM (glutamate-tRNA ligase)                  0.5                                          237
E77.5          glyS (glycine-tRNA ligase, β subunit)         0.85                                         237
F107           ileS (isoleucine-tRNA ligase)                 0.55                                         237
D100           leuS (leucine-tRNA ligase)                    0.65                                         237
D58.5          lysS (lysine-tRNA ligase, form I)             0.5                                          237
G36.0          pheS (phenylalanine-tRNA ligase, α subunit)   0.4                                          237
D94.0          pheT (phenylalanine-tRNA ligase, β subunit)   0.35                                         237
E106           valS (valine-tRNA ligase)                     0.6                                          237
D84.0          fusA (elongation factor G)                    0.4                                          188
C30.7          tsf (elongation factor Ts)                    0.6                                          188
B65.0          rpsA (ribosomal subunit protein 1)            0.4                                          238
B40.7          rpoA (RNA polymerase, α subunit)              0.8                                          36
D157           rpoB (RNA polymerase, β subunit)              0.3                                          36
C137           metH (methionine synthase)                    0.4                                          189
E140           Not determined                                0.6                                          -
F178           Not determined                                0.8                                          -
G93.0          Not determined                                0.2                                          -
G97.0          sucA (α-ketoglutarate dehydrogenase)          0.5                                          -

ᵃ From Ref. 186.

To test whether the response of proteins involved in transcription and translation showed decreased rates of protein synthesis because of transient accumulation of guanosine tetraphosphate (ppGpp) after the temperature shift, measurements of relative differential rates of synthesis of these polypeptides were made in synthetic rich media fully supplemented with 20 amino acids or lacking leucine, isoleucine, and valine. Much more ppGpp is produced in the medium lacking leucine, isoleucine, and valine, but the differential rate of synthesis of most polypeptides after the temperature shift was little affected. Thus allocation of gene expression capacity away from the genes of the translation machinery in response to heat shock does not depend on the control system that normally regulates these genes.

These seminal experiments developed the methodology for examining relative differential rates of synthesis of individual polypeptides, and provided the first indication that a subset of polypeptides in E. coli are transiently induced on a shift from 28 to 42°C against a background of transiently decreased synthesis of other polypeptides. It should be noted that the ability to measure the kinetics of induction and/or repression of polypeptide synthesis in response to a change in growth conditions is a strength of using electrophoretic analysis. Such information is generally not obtainable by using operon fusions to reporter genes, although the use of luciferase as a reporter gene could theoretically provide such information because instantaneous flux can be measured. Knowledge of the kinetics of a response to changed growth conditions is essential if maximal effects are to be observed; the maximal response to a temperature shift from 37 to 42°C is seen ~7 minutes after the shift, and is much diminished after 15 minutes (186).

In a subsequent paper, the levels of 133 proteins during steady-state growth at temperatures ranging from 13.5 to 46°C were assayed (239). The levels of expression of these proteins were compared with those of a standard culture grown under the same conditions at 37°C. Of the 111 polypeptides whose expression was characterized, five showed increased expression at 46 as compared to 37°C, as shown in Table IV. The relative levels of proteins required for translation and transcription were reduced at very low and very high temperatures, consistent with the reduced steady-state growth rates at these temperatures. When global patterns of expression at 37 and 46°C were compared, the summed decrease in the levels of these proteins, from 38 to 22% of the cell's total protein mass, was found to be equivalent to the summed mass increase in just three proteins, B56.5 (GroEL), B66.0 (DnaK), and F24.5 (OmpA), from 4 to 20% of the cell's total protein mass.

The method for identifying members of a regulon developed in Blattner's laboratory (200, 201) (see Section II,A) is equally useful for monitoring repression of genes following heat shock.
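The quantity reported in the tables of this section, the relative differential rate of synthesis, can be stated explicitly. The sketch below assumes the usual definition: the differential rate is the fraction of total incorporated label recovered in a single gel spot, and the relative rate is the ratio of that fraction after the shift to the fraction before it. The function name and the counts in the example are hypothetical and are not data from Ref. 186.

```python
def relative_differential_rate(spot_pre, total_pre, spot_post, total_post):
    """Ratio of post-shift to preshift differential synthesis rates.

    A differential rate is the share of total incorporated label found in one
    polypeptide spot, so a uniform change in overall protein synthesis cancels
    out and only reallocation between polypeptides changes the ratio.
    """
    differential_pre = spot_pre / total_pre
    differential_post = spot_post / total_post
    return differential_post / differential_pre

if __name__ == "__main__":
    # Hypothetical counts: the spot holds 1% of label before and 11.5% after the shift.
    induced = relative_differential_rate(100, 10_000, 1_150, 10_000)
    # Halving total incorporation and the spot together leaves the ratio at 1.
    unchanged = relative_differential_rate(100, 10_000, 50, 5_000)
    print(f"induced spot:   {induced:.1f}")
    print(f"unchanged spot: {unchanged:.1f}")
```

The second call shows why a general slowdown in protein synthesis cannot account for the transient effects: a polypeptide whose share of total synthesis is unchanged keeps a relative differential rate of 1.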
The clone most repressed by heat shock was found at minute 72.6 and contains operons for most of the ribosomal proteins, which were also heat-depressed operons (186). Note, however, that the method developed by Blattner and colleagues (200, 201) is not suitable for detecting global repression of most protein synthesis, but only for detecting relative differential repression of synthesis of particular proteins following a heat shock.

TABLE IV
POLYPEPTIDES WITH INCREASED SYNTHESIS DURING STEADY-STATE GROWTH AT ELEVATED TEMPERATURES

Alphanumeric   Gene (protein)                        Relative rate of synthesis (46/37°C)   Identification reference
A165           Not identified                        25.0                                   -
B56.5          groEL (GroE, large subunit)           7.2                                    235
D40.7          livJ (Leu/Ile/Val binding protein)    4.4                                    189
F24.5          ompA (outer membrane porin A)         2.4                                    189
F32.3          Not identified                        9.0                                    -
F84.1          clpB (ATP-dependent protease)         4.7                                    236

4. IDENTIFICATION OF GENES AND PROTEINS IN THE HEAT-SHOCK REGULON
The initial identification of proteins in the heat-shock (or high-temperature) regulon was made by comparing the patterns of protein expression observed by two-dimensional electrophoretic analysis before and after a shift from 28 to 42°C (187). The post-shift expression patterns of three strains were compared: a wild-type strain, a strain containing a nonsense mutation in htpR together with a temperature-sensitive suppressor, and the same strain complemented by a plasmid carrying the wild-type allele of htpR. Thirteen peptides, heat inducible only in cells bearing a normal htpR gene, were identified and were designated as members of this regulon. As noted in Section I,A,1, this definition of a regulon included both target operons that are directly transcribed by the σ³²-RNA polymerase holoenzyme, and operons that may be indirectly controlled by htpR.

A different strategy for the identification of genes with heat-shock promoters used a promoter cloning vector to identify promoters that were responsive to σ³² (240). Chromosomal DNA was cleaved with HaeIII, and HindIII linker arms were added to the fragments. These fragments were cloned into plasmid pFF6, which contains β-lactamase (ampR) and a lacZ gene lacking a promoter, with a HindIII site upstream of it. Transformants were selected on L agar containing ampicillin and X-Gal at 37°C and screened for blue colonies. The transformants producing blue colonies under these conditions were further screened for those that exhibited enhanced lacZ expression on a temperature shift from 30 to 42°C. One of the clones contained a DNA fragment containing the region upstream of clpB. A promoter upstream of the clpB gene, coding for the large subunit of an ATP-dependent protease, was also first identified by homology to the heat-shock consensus sequence and confirmed by measuring β-galactosidase activity from an operon fusion in rpoH+ and rpoH- strains. The clpB promoter was transcribed in vivo by σ³²-RNA polymerase, thus establishing that clpB expression is under direct control of σ³². ClpB was identified as F84.1 (236), one of the polypeptides induced by temperature increases as demonstrated by two-dimensional gel electrophoresis (241).

The reader will note that one method frequently used to identify genes in a regulon, namely insertional mutagenesis, has not been widely employed in the search for heat-shock genes. This method, which is very effective in identifying genes in other global regulons, is not generally suitable for the identification of genes required for cell viability. Many of the genes induced
by a shift to elevated temperature, enumerated in Table V, are required for cell viability. One of the metabolic complications of a shift to a higher growth temperature, and particularly to a temperature above the optimal growth temperature (37°C) for E. coli, is that nascent polypeptides begin to misfold, because the elevated temperature leads to an increased rate of translation and decreased strengths of the hydrogen bonds essential for secondary structure formation. Among those polypeptides stimulated are the molecular chaperones GroE, DnaK, DnaJ, GrpE, and HtpG, which appear to assist folding and degradation of proteins as well as disassembly of protein complexes (11, 242). DnaK, DnaJ, and GrpE have recently been shown to constitute a cellular chaperone system for protein folding (10). Several ATP-dependent proteases or subunits of these proteases are also induced by heat shock, including Lon (La), ClpB, ClpP, and possibly HslV (236, 240, 243-246), and these proteases are thought to be involved in degrading misfolded proteins. In addition, two of the heat-shock polypeptides, IbpA and IbpB, are host components of the inclusion bodies formed when heterologous proteins are expressed in E. coli and fail to fold properly (144). Thus almost all the genes and operons in the heat-shock stimulon that have been identified appear to be involved in protein folding and/or the degradation or compartmentation of misfolded proteins.

As discussed in Section III,A,2, elevation in the concentration of σ³² sequesters part of the RNA polymerase pool for the transcription of heat-shock genes and operons, and results in increased transcription of these target operons. Elevations in the concentrations of the heat-shock proteins follow within a few minutes. Most of the operons in the heat-shock regulon appear to be controlled directly by the availability of σ³², and are transcribed by σ³² holoenzyme.
However, when the rpoH gene is placed under the control of the lac promoter, and σ³² synthesis is induced by IPTG, several heat-shock proteins do not accumulate to the extent that they do after a shift to 42°C (112). These proteins include LysU, which is not induced at all by IPTG, and the inclusion body binding proteins IbpA and IbpB, which accumulate at much reduced levels. It has been postulated that a metabolic signal, present during heat induction but not during RpoH induction by IPTG, is required for the synthesis of these three heat-shock proteins (80), which may thus be the products of indirectly controlled operons, or operons whose transcription by σ³²-RNA polymerase requires additional factors.
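Identification of a promoter "by homology to the heat-shock consensus sequence", as described above for clpB, amounts to scanning upstream DNA for near matches to a short motif. The sketch below illustrates the idea with a plain mismatch count. The motif string, the example sequence, and the function name are placeholders chosen for illustration; they are not taken from this review, and a real search would use the published consensus for both promoter elements together with their spacing.

```python
def find_consensus_hits(sequence, consensus, max_mismatches=1):
    """Return (start, matched_window, mismatches) for each near match.

    'N' in the consensus matches any base. Positions are 0-based offsets
    into the sequence.
    """
    sequence = sequence.upper()
    consensus = consensus.upper()
    hits = []
    for start in range(len(sequence) - len(consensus) + 1):
        window = sequence[start:start + len(consensus)]
        mismatches = sum(
            1 for base, expected in zip(window, consensus)
            if expected != "N" and base != expected
        )
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

if __name__ == "__main__":
    # Placeholder motif and made-up upstream sequence, for illustration only.
    upstream = "aactTGAAggcttgaatt"
    for hit in find_consensus_hits(upstream, "CTTGAA", max_mismatches=0):
        print(hit)
```

A sequence match of this kind only nominates a candidate. As the text notes, the clpB promoter was confirmed by measuring reporter activity from an operon fusion and by showing transcription by σ³²-RNA polymerase.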
TABLE V
MEMBERS OF THE HEAT-SHOCK STIMULON

Protein

Gene/operon

Alphanumeric

Method of identification

Ref.

GrpE, chaperone subunit

grpE

B25.3

GroE chaperonin, large subunit (GroEL; Hsp60) DnaK chaperone subunit, ATPase (Hsp70) σ⁷⁰ subunit, RNA polymerase Inclusion body binding protein IbpB

groESL

B56.5

dnaKJ

GroE chaperonin, small subunit (GroES) HtpG chaperone (Hsp90)

2-D gel electrophoresis, nucleotide exchange factor 2-D gel electrophoresis In vitro transcription

241 224

B66.0

2-D gel electrophoresis In vitro transcription

241 229

rpoD

B83.0

ibpAB

C14.7

241 229 144 187

groESL

C15.4

2-D gel electrophoresis In vitro transcription 2-D gel electrophoresis Cloning and promoter consensus 2-D gel electrophoresis In vitro transcription

htpG

C62.5

2-D gel electrophoresis Cloning, binding to σ³²

241 189, 194

hslIJ

D33.4

hslVU

D48.5

2-D gel electrophoresis Global transcriptional analysis 2-D gel electrophoresis Global transcriptional analysis Sequencing/promoter consensus 2-D gel electrophoresis 2-D gel electrophoresis 2-D gel electrophoresis

Lysyl-tRNA synthetase, form II ATPase subunit, ATP-dependent protease ATP-dependent protease regulatory subunit ClpB Inclusion body binding protein IbpA

lysU clpP

D60.5 F10.1 F21.5

clpB

F84.1

ibpAB

G13.5

hslVU

G21.0

187

241 224

241 201

187 201 192 241 241 187, 195, 205, 225, 236, 241

Promoter cloning 2-D gel electrophoresis

190

2-D gel electrophoresis Cloning and promoter consensus 2-D gel electrophoresis Global transcriptional analysis

241

173

114 241 203

DnaJ, chaperone subunit ATP-dependent protease La

245

dnaKJ

H26.5

Sequencing/promoter consensus 2-D gel electrophoresis

lon

H94.0

2-D gel electrophoresis Cloning and promoter consensus Global transcriptional analysis Global transcriptional analysis Global transcriptional analysis Global transcriptional analysis Global transcriptional analysis Global transcriptional analysis

241 196

hslC

hslD hslEFGH

hslK hslLMN hslXYZ

241

201

201 201 201 201 201

5. OTHER ALTERNATE SIGMA FACTORS AND THE REGULONS THEY CONTROL: DIFFERENCES IN REGULATORY PATTERNS

A detailed discussion of each sigma factor-controlled regulon is beyond the scope of this review. However, there are some respects in which regulation by other alternate sigma factors differs from the model presented for σ³². Perhaps the major difference is that the concentration of σ³² is regulated, and varies as a function of the growth temperature, while the concentration of other alternate sigma factors may be constant, and regulation of the transcriptional activity of their target operons may involve additional factors. For example, there is currently no evidence that the concentration of σ⁵⁴ (also called RpoN or NtrA), the alternate sigma factor required for transcription of genes induced during nitrogen limitation, is increased when cells are exposed to limiting concentrations of ammonia (247, 248). Instead, binding of σ⁵⁴ and core RNA polymerase to the promoter of a target operon results in the formation of a closed complex, but unlike σ⁷⁰, σ⁵⁴ does not, by itself, confer the ability to form open complexes. Formation of an open complex requires binding of a transcriptional enhancer, NRI, or NtrC, and the activity of this transcriptional enhancer is regulated by phosphorylation, which creates the form active as an enhancer, or by dephosphorylation. The regulation of the activity of NtrC is discussed in more detail in the next section of this review. However, σ⁵⁴ sequesters a relatively constant fraction of the core RNA polymerase in the cell at the promoters of genes in the nitrogen utilization regulon, and this fraction of RNA polymerase is released only when
transcription is initiated on binding of phosphorylated NtrC. In this case, the role of the alternate sigma factor is to specify the target genes in the regulon, but not to redirect transcriptional capacity.
B. Nitrogen Source Utilization and Two-component Response Regulators

In E. coli, detection of a signal and the response to it often involve a two-component regulatory system. It has been estimated that there are over 50 two-component regulatory systems, often identified by their sequence similarities to well-studied systems (7). The two components are a sensor kinase and a response regulator. The sensor kinase is phosphorylated on a histidine residue in response to the signal. This component often contains both periplasmic and cytoplasmic domains, connected by a transmembrane section, and permits detection of an extracellular signal and transmission of the signal to the intracellular kinase domain. The phosphate on the active-site histidine of the sensor kinase is transferred to an aspartate residue on the response regulator, which results in the activation of the response regulator. Because aspartyl phosphate residues are intrinsically unstable, the lifetime of the active response regulator is relatively short. Two-component regulatory systems are ideally suited for controlling genes that must be expressed at rapidly fluctuating levels. The response regulator is further regulated by enzyme-catalyzed dephosphorylation. Often the sensor kinase is a bifunctional enzyme with phosphatase activity as well as kinase activity, but in other cases a separate protein catalyzes dephosphorylation of the response regulator. Some well-characterized two-component regulatory systems include the EnvZ/OmpR system that regulates porin expression in response to osmolarity of the medium, the NtrB/NtrC (NRII/NRI) system that regulates glutamine synthetase expression in response to ammonia limitation, and the CheA/CheY/CheB system that regulates bacterial chemotaxis.

1. CONTROL OF NITROGEN SOURCE UTILIZATION

In this section, we discuss the two-component regulatory system responsible for controlling glutamine synthetase in response to the availability of ammonia (reviewed in Ref. 249).
Genes that are regulated by this two-component regulatory system are listed in Table VI (247, 250-253). Again, our emphasis is on the experimental techniques that have been used to elucidate the regulatory mechanism. Here, an extremely complex signal transduction pathway has been elucidated by the systematic cloning, expression, and purification of the components and by biochemical characterization of their interactions in vitro. Much of the pathway can now be reconstituted in vitro, and the results of
TABLE VI
GENES IN THE NITROGEN REGULON

Regulated by
Gene

glnA ntrC

ntrB gabPDT glnHPQ codA

speB pat prr

Protein specified

NRI

σ⁵⁴

Ref.

Glutamine synthetase (GlnA) NRI NRII γ-Aminobutyrate transport High-affinity glutamine transport Cytosine deaminase Agmatine ureohydrolase Putrescine aminotransferase Δ-Pyrroline dehydrogenase

Yes Yes Yes

Yes Yes Yes

P

Yes -

247 247 247 250 251 252 253 253 253

Yes Yes Yes Yes Yes
this reconstitution can be compared with earlier studies on the changes in the levels of intracellular metabolites and enzyme activities that follow a change in nitrogen availability. Of all the global regulons, this is probably the one whose physiological significance is best understood, and for which the pathway linking the stimulus with the response is most thoroughly elucidated (249). The signal transduction pathway for the nitrogen regulon is summarized in Fig. 5. Growth of E. coli in the presence of low concentrations of ammonia (<1 mM) requires the action of two enzymes, glutamine synthetase [Eq. (2)] and glutamate synthase [Eq. (3)].
glutamate + NH₃ + ATP → glutamine + ADP + H₂PO₄⁻    (2)

α-ketoglutarate + NADPH + glutamine → 2 glutamate + NADP⁺    (3)
The net reaction catalyzed by these two enzymes is given by Eq. (4).
α-ketoglutarate + NH₃ + NADPH + ATP → glutamate + NADP⁺ + ADP + H₂PO₄⁻    (4)
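The net reaction can be checked by adding the glutamine synthetase and glutamate synthase reactions and cancelling the species that appear on both sides. The sketch below does this bookkeeping with a counter. The species labels (for example "phosphate" and "a-ketoglutarate") and the function name are arbitrary choices made for this illustration.

```python
from collections import Counter

def net_reaction(*reactions):
    """Sum reactions given as (substrates, products) dicts of stoichiometries.

    Returns (consumed, produced) with intermediates cancelled out.
    """
    balance = Counter()
    for substrates, products in reactions:
        for species, count in substrates.items():
            balance[species] -= count
        for species, count in products.items():
            balance[species] += count
    consumed = {species: -n for species, n in balance.items() if n < 0}
    produced = {species: n for species, n in balance.items() if n > 0}
    return consumed, produced

# Eq. (2): glutamine synthetase
glutamine_synthetase = (
    {"glutamate": 1, "NH3": 1, "ATP": 1},
    {"glutamine": 1, "ADP": 1, "phosphate": 1},
)
# Eq. (3): glutamate synthase
glutamate_synthase = (
    {"a-ketoglutarate": 1, "NADPH": 1, "glutamine": 1},
    {"glutamate": 2, "NADP+": 1},
)

if __name__ == "__main__":
    consumed, produced = net_reaction(glutamine_synthetase, glutamate_synthase)
    print("consumed:", consumed)
    print("produced:", produced)
```

Glutamine cancels completely and one of the two glutamates made in Eq. (3) offsets the glutamate used in Eq. (2), leaving the net synthesis of one glutamate at the cost of one ATP and one NADPH, as stated in Eq. (4).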
Cellular energy in the form of ATP is used to drive the incorporation of ammonia into glutamate when the concentration of ammonia in the cell is low. At higher levels of ammonia, glutamate dehydrogenase can substitute for glutamate synthase as shown in Eq. (5).

α-ketoglutarate + NH₃ + NADPH → glutamate + NADP⁺ + H₂O    (5)
However, when cellular levels of ammonia are low, this reaction becomes thermodynamically unfavorable. As might be expected for a central enzyme in nitrogen metabolism,
56
ROBERT M. BLUMENTHAL ET AL.
ammonia
1 Glutamine synthetase (GlnA)
1
gln4L.G glnHPQ, etc.
FIG. 5. The signal transduction pathway for regulation of glutamine synthetase and the nitrogen regulon. The sensor for ammonia is glutamine synthetase, and the signal is the product of the glutamine synthetase reaction, glutamine. Glutamine modulates the activity of the uridylyltransferase/uridylyl-removingenzyme (UTase/UR), inhibiting uridylyltransferase activity to PI, and stimulating uridylyl removal from P,,. PI, in turn regulates the activity ofboth the adenylyltransferase/adenylyl-removingenzyme, which is responsible for the covalent modification of glutamine synthetase, and the transcriptional activator NR,. Free PI, activates the adenylyltransferase activity of the ATase/AR, which in turn adenylylates glutamine synthetase and reduces its activity. Free PI, also forms a complex with NR,, that elicits the phosphatase activity of NR,,, which leads to the dephosphorylation of NR,-phosphate and a reduction in the transcription of glnALG and other operons in the nitrogen regulon. This in turn leads to a decreased rate of conversion of ammonia and glutamate to glutamine and completes the feedback loop for regulation of intracellular glutamine levels.
glutamine synthetase is extensively regulated, both at the level of activity and at the level of transcription. Glutamine synthetase is covalently modified by transfer of the AMP moiety of ATP to a specific tyrosyl residue on the enzyme (254). This adenylylation reaction is catalyzed by an adenylyltransferase (the glnE gene product), and results in inhibition of enzyme turnover under physiological conditions. The adenylyltransferase is also capable of deadenylylating glutamine synthetase, restoring its catalytic activity. Regulating the adenylylation/deadenylylation activity of GlnE requires that
it form a complex with another protein, PII (the glnB gene product). PII activity is controlled by uridylylation of a tyrosine residue, catalyzed by a uridylyltransferase (the glnD gene product) using UTP as the uridylyl donor. The uridylyltransferase is also responsible for removal of the uridylyl group, and its activities are regulated in response to the glutamine concentration in the cell. When the concentration of glutamine in the cell is high, an indication of an adequate supply of ammonia, the uridylyl-removing activity of the uridylyltransferase is activated, the uridylyl group is removed from PII, and the adenylyltransferase activity of GlnE is induced. The net result is the inhibition of glutamine synthetase by adenylylation. When the concentration of glutamine is low, indicating nitrogen limitation, the uridylyltransferase activity is activated, PII is uridylylated and forms a complex with the adenylyltransferase, the deadenylylation activity of the transferase is induced, and glutamine synthetase activity is restored.

Glutamine synthetase (GlnA) is also regulated at the level of transcription in response to the levels of glutamine and α-ketoglutarate in the cell. The glnA gene is the first gene in an operon that also codes for the sensor kinase (NRII, or NtrB) and the response regulator (NRI, or NtrC). Three promoters of the glnALG operon have been identified, glnAp1, glnAp2, and glnLp (255). The glnAp1 and glnLp promoters are recognized by σ70-RNA polymerase and are transcribed when abundant ammonia is present. The level of glutamine synthetase in the cell is much lower when ammonia levels are high than when they are low. Phosphorylated NRI inhibits transcription from these promoters. When ammonia becomes limiting, transcription of glnALG is immediately activated from glnAp2. Transcription from this promoter requires a σ54-RNA polymerase complex and phosphorylated NRI (256).
The phosphorylation and dephosphorylation of NRI are regulated by the sensor kinase, NRII, in response to the levels of glutamine and α-ketoglutarate in the cell. The levels of these two metabolites are sensed by the same pair of proteins required for covalent modification of glutamine synthetase, the uridylyltransferase/uridylyl-removing enzyme and PII. Here PII modulates the phosphatase activity of NRII, functioning as a dissociable regulatory subunit (257). When the uridylyl group of PII is removed, PII binds NRII and the phosphatase activity of NRII is induced. This leads to the deactivation of NRI. When PII is uridylylated, it dissociates from NRII, and the phosphatase activity of the sensor kinase is diminished.

2. DETECTION OF PHOSPHORYLATED INTERMEDIATES
A landmark paper in the elucidation of regulation by two-component regulatory systems is the study of Ninfa and Magasanik establishing that NRI activity is regulated by covalent modification (256). This study was made
possible by the availability of purified preparations of NRI, NRII, RpoN (σ54), PII, and core polymerase, and used single-cycle in vitro transcription assays to measure the formation of open complexes when NRI is bound upstream of the glnAp2-σ54 closed complex. In these experiments, RpoN, NRI, core polymerase, ATP, GTP, and CTP were incubated for 2 minutes at 37°C, NRII was added, and after a defined length of time labeled UTP and heparin were added. Heparin competes with RNA polymerase for DNA in closed complexes, and thus only those complexes that are already in the open form will be transcribed. Transcription from glnAp2 was monitored by electrophoresis on urea-acrylamide gels and autoradiography for detection of the 309-nt labeled transcript expected for initiation at this promoter. Ninfa and Magasanik first examined the dependence of open complex formation on the concentration of NRII with a 7-minute incubation between the time of addition of NRII and the addition of UTP and heparin. Using 240 nM NRI and 510 nM supercoiled glnAp2 template DNA, they observed that initiation of transcription became constant at sufficiently high levels of NRII and was half-maximal at 1.5 nM NRII. At suboptimal concentrations of NRII, increased transcription was observed if longer periods elapsed between the addition of NRII and the addition of heparin and UTP. These observations suggested that NRII might be catalyzing the activation of NRI. Incubation of NRI and NRII in the presence of ATP significantly shortened the period of time required for maximal open complex formation, and activation of NRI was correlated with the transfer of the γ-phosphate of ATP to the protein. Also, addition of PII to the incubation mixture resulted in the dephosphorylation of NRI and the loss of transcriptional activation, but dephosphorylation of NRI in the presence of PII could be prevented by a mutation of NRII.
Thus they concluded that NRII had both kinase and phosphatase activity toward NRI and that PII is required to activate the phosphatase activity. We now know that NRI is the kinase, and can use acetyl phosphate, carbamoyl phosphate, phosphoramidate, or the phosphohistidine of NRII as the source of phosphate (258). The first-order rate constant for dephosphorylation of NRI-phosphate in the absence of NRII is 0.14-0.19 min⁻¹ (half-life of 3.6-5 minutes) (259). Because the half-life of denatured NRI-phosphate is about 5.5 hours (260), it is apparent that the native enzyme destabilizes the phosphoryl group. Incubation of NRII with ATP in the presence of magnesium results in transfer of the γ-phosphate of ATP to a histidine residue on NRII (260), and this reaction is freely reversible. Incubation of phosphorylated NRII with NRI resulted in the transfer of the phosphate group to an aspartyl residue on NRI, and this reaction also required magnesium. Thus the role of the sensor kinase, NRII, in this two-component regulatory system was established.
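The half-lives quoted above follow from the first-order relation t½ = ln 2 / k. The short sketch below reproduces that conversion and compares the native and denatured proteins. The function names are illustrative, and the comparison assumes that the 5.5-hour figure for the denatured protein also reflects simple first-order decay.

```python
import math

def half_life(k):
    """Half-life for first-order decay with rate constant k (same time unit)."""
    return math.log(2) / k

def rate_constant(t_half):
    """First-order rate constant corresponding to a given half-life."""
    return math.log(2) / t_half

# Native NRI-phosphate: k = 0.14-0.19 per minute (values from the text)
for k in (0.14, 0.19):
    print(f"k = {k:.2f} min^-1  ->  t1/2 = {half_life(k):.2f} min")

# Denatured NRI-phosphate: half-life of about 5.5 hours, converted to minutes
k_denatured = rate_constant(5.5 * 60)
print(f"denatured: k = {k_denatured:.4f} min^-1")

# Fold acceleration of phosphoryl loss in the native protein
for k in (0.14, 0.19):
    print(f"native/denatured rate ratio at k = {k:.2f}: {k / k_denatured:.0f}")
```

Under these assumptions the native protein loses its phosphoryl group roughly two orders of magnitude faster than the denatured protein, which is the sense in which the native enzyme "destabilizes" the phosphoryl group.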
3. MAPPING OF THE NtrA BINDING SITES UPSTREAM OF glnA AND ELUCIDATION OF THE REQUIREMENT FOR PHOSPHORYLATION
A parallel line of investigation resulted in the identification of three strong binding sites for NRI near the glnAp2 promoter; sites 1 and 2 are spaced ~130 and 100 bp upstream of the transcription start site from glnAp2 and overlap the glnAp1 promoter (261, 262); a third binding site (Lp) overlaps the σ70-dependent promoter glnLp. Movement of sites 1 and 2 more than 1000 bp upstream does not diminish the ability of NRI to stimulate transcription at glnAp2 (262), and they can also function downstream of the start site of transcription (88). Sufficiently high concentrations of NRI can activate formation of the open complex even when all NRI binding sites are removed by mutagenesis (88). These are the properties of enhancers and enhancer binding proteins in eukaryotes, and the NRI sites were the first enhancers to be identified in prokaryotes. Ninfa et al. (88) went on to suggest that binding of NRI to these high-affinity sites resulted in DNA bending that brought the bound enhancer binding proteins in contact with σ54-RNA polymerase bound just upstream of the start site of transcription. The NRI binding sites facilitate open complex formation when they are bound by NRI on one ring of a singly linked catenane and the glnAp2 promoter with bound σ54-RNA polymerase is located on the other, but not when the two rings are decatenated (263). Electron microscopy provided further evidence for a direct interaction between NRI and σ54-RNA polymerase holoenzyme bound to the glnAp2 promoter region that involves looping out of the intervening DNA (264). Still unresolved, however, was the way in which NRI contact facilitates open complex formation in the bound holoenzyme-promoter complex. NRI-dependent facilitation of open complex formation requires hydrolysis of ATP (265). Although binding of NRI to specific sites on DNA does not require phosphorylation, only NRI-phosphate can hydrolyze ATP and activate transcription.
The cooperativity of binding of NRI to adjacent binding sites 1 and 2 is dramatically affected by phosphorylation, which decreases the cooperativity constant 50,000-fold, although it does not affect the affinity of NRI for the isolated site 1 (266). Furthermore, very high concentrations of NRI-phosphate are required to activate open complex formation when bound to a single strong site (Lp), whereas much lower concentrations suffice when sites 1 and 2 are both present. Measurements of activation by NRI on templates that contain no binding sites, one binding site, or two binding sites all give sigmoidal curves, indicating that in each case dimers are interacting to form an oligomer, presumably a tetramer. Thus it was concluded
(266) that activation of transcription requires the formation of a tetramer, and that the enhancer sequences facilitate tetramer formation by placing the two dimers on the same face of the DNA helix, separated by three turns. Further studies established that the required ATP hydrolysis associated with transcriptional activation by NRI-phosphate also exhibits a sigmoidal dependence on NRI concentration, suggesting that ATP hydrolysis might require a tetramer to be formed (267). NRI-phosphate and ATP hydrolysis are required prior to the initiation of transcription and the formation of the open complex (265). A series of experiments (268) tested the ability of a mutant form of NRI unable to bind DNA to activate transcription by interacting with a mutant form of NRI that binds DNA and has some ability to activate transcription without phosphorylation. Neither mutant form of NRI could activate transcription effectively at low concentrations, but if added together a marked synergy in transcriptional activation was seen. It was proposed (268) that phosphorylation of NRI alters the properties of the protein so as to favor protein-protein interactions that lead to oligomerization and to generate a complex capable of ATP hydrolysis. What remains to be understood is how ATP hydrolysis is coupled to a change in conformation of the bound RNA polymerase that leads to open complex formation and the initiation of transcription. The observations arising from these detailed studies of the activation of transcription of glnA by NRI have many implications for global regulation in general. One of the problems of global regulatory proteins is that they must select specific target sites from among many closely related sequences in chromosomal DNA.
As illustrated in this system, placing such sites in tandem, on the same side of the DNA helix and approximately three helical turns apart, can greatly favor specific over nonspecific binding by allowing cooperative interactions between the two adjacent dimers.

4. TRACING THE PATH OF SIGNAL TRANSDUCTION

The pathway of signal transduction that links the availability of ammonia to the expression of glutamine synthetase is one of the best understood of all the signal transduction pathways in E. coli, as summarized in Fig. 5. As discussed above, expression of genes in the nitrogen regulon is regulated largely in response to the glutamine concentration in the cell. When nitrogen is limiting, the cellular level of glutamine is low, and when ammonia is added to the medium the level rises dramatically within ~15 seconds (208). On addition of ammonia to a nitrogen-limited culture growing on proline/glycerol medium the rise in glutamine concentration is synchronized with an approximately equivalent drop in the cellular concentration of glutamate and of ATP (208). It is, of course, the activity of glutamine synthetase, the enzyme responsible for the synthesis of glutamine from ammonia and
glutamate, that determines the glutamine level in the cell. Thus glutamine synthetase is the sensor of cellular ammonia levels, and glutamine, the product of its reaction, is the signal. This may represent a general principle, in that enzymes have properties that make them effective sensors. The activity of the uridylyltransferase/uridylate-removing enzyme is regulated in response to α-ketoglutarate and glutamine, with glutamine inhibiting the uridylyltransferase and stimulating the uridylate-removing enzyme (reviewed in Ref. 269). Recent studies have identified the uridylyltransferase/uridylate-removing enzyme as the direct target of glutamine regulation (270). The uridylyltransferase activity of a purified preparation of the enzyme is inhibited by glutamine, whereas the uridylyl-removing activity is activated. When ammonia is added to a nitrogen-limited medium the uridylyl-removing activity will be activated as the cellular glutamine level rises. This leads to the removal of the uridylyl residues from PII-UMP. PII can then activate the adenylyltransferase activity of GlnE, which leads to the covalent modification of glutamine synthetase and a decrease in glutamine synthetase activity. This decrease in glutamine synthetase specific activity is observed within 30 seconds after the addition of ammonia to the medium (270). As discussed in Section III,B,1, the protein kinase/phosphatase activity of NRII is also regulated by the dissociable regulatory subunit PII, and the ability of PII to bind NRII is in turn regulated by the uridylyltransferase/uridylate-removing enzyme GlnD. PII-UMP does not elicit phosphatase activity from NRII, but PII does (270). Thus, the rise in cellular glutamine levels will also be associated with activation of the phosphatase, and conversion of the active transcriptional regulator of glnA, NRI-phosphate, to the inactive NRI.
The resulting decrease in the transcription of glutamine synthetase leads to a lowering of the cellular concentration of the enzyme as cell division leads to dilution of the existing enzyme pools.
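The qualitative logic of this pathway can be summarized as a two-state lookup keyed on the cellular glutamine level. The sketch below is a deliberately simplified illustration: it treats glutamine as simply "high" or "low", ignores α-ketoglutarate, and uses class and field names (`NitrogenState`, `nitrogen_response`) that are invented here for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NitrogenState:
    utase_ur_activity: str      # dominant activity of the GlnD enzyme
    pii_form: str               # PII or PII-UMP
    atase_ar_activity: str      # dominant activity of the GlnE enzyme
    glutamine_synthetase: str   # covalent state of GlnA
    nrii_phosphatase: str       # whether PII elicits NRII phosphatase activity
    nri_form: str               # NRI or NRI-phosphate
    glnALG_transcription: str   # transcription from glnAp2

def nitrogen_response(glutamine_high: bool) -> NitrogenState:
    """Qualitative steady-state outcome described in the text (two-state sketch)."""
    if glutamine_high:
        # Adequate ammonia: uridylyl groups are removed and free PII accumulates.
        return NitrogenState(
            utase_ur_activity="uridylyl-removing",
            pii_form="PII",
            atase_ar_activity="adenylyltransferase",
            glutamine_synthetase="adenylylated (less active)",
            nrii_phosphatase="elicited",
            nri_form="NRI",
            glnALG_transcription="reduced",
        )
    # Nitrogen limitation: PII is uridylylated and cannot bind NRII.
    return NitrogenState(
        utase_ur_activity="uridylyltransferase",
        pii_form="PII-UMP",
        atase_ar_activity="adenylyl-removing",
        glutamine_synthetase="deadenylylated (active)",
        nrii_phosphatase="not elicited",
        nri_form="NRI-phosphate",
        glnALG_transcription="activated",
    )

for level in (True, False):
    print("glutamine high" if level else "glutamine low", "->", nitrogen_response(level))
```

The table form makes the point that a single signal, glutamine, sets both the covalent state of the existing enzyme and the transcriptional state of its gene through the shared UTase/UR and PII components.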
5. UNIQUE PROPERTIES OF THE NITROGEN REGULON
Perhaps the salient property of the nitrogen regulon is the speed with which adaptation to changes in nitrogen availability can occur. As first demonstrated (96), addition of ammonia in the medium results in elevated levels of cellular glutamine within 15 seconds, and in inactivation of glutamine synthetase by covalent adenylylation within 30 seconds, as shown in Fig. 6. The speed of the response here rests on the availability of the regulatory proteins: adenylyltransferase/adenylyl-removing enzyme and uridylyltransferase/uridylyl-removing enzyme are already present in the cell, and their activities are modulated in response to cellular levels of glutamine. No proteins need be synthesized before a response can be generated. The rapidity of the response is essential; addition of ammonia to the medium would otherwise result in depletion of cellular pools of ATP and
[Figure 6 plot; horizontal axis: time (sec).]
FIG. 6. Response of cells to addition of ammonia to the medium during growth in a nitrogen-depleted medium (glutamine as nitrogen source). [Adapted, with permission, from Schutt and Holzer (208).] During growth in a nitrogen-depleted medium, the level of glutamine synthetase is high and the enzyme is maximally active. Immediately after addition of ammonia (10 mM) at zero time (arrow), an enormous flux of ammonia into glutamine occurs, as evidenced by the rise in glutamine, the fall in cellular ATP, and the transient fall in cellular glutamate. These changes occur within 15 seconds of the addition of ammonia. In response to the elevated glutamine level, the adenylyltransferase activity of the ATase/AR enzyme is activated, and glutamine synthetase is adenylylated within the next 15 seconds, as evidenced by the fall in the cellular activity of glutamine synthetase. Inactivation of glutamine synthetase allows restoration of the cellular levels of glutamate and ATP and a fall in the intracellular concentration of glutamine, although not to the level seen in cells growing in nitrogen-depleted medium. Rapid covalent modification of glutamine synthetase prevents the depletion of cellular stores of ATP and glutamate and possible cell damage due to the loss of these central metabolites.
glutamate due to increased activity of glutamine synthetase (208, 271). The rapidity of the response is also facilitated by dual layers of regulation, with covalent modification rapidly inactivating preexisting glutamine synthetase and decreased transcription more gradually lowering the concentration of glutamine synthetase in the cell. Here, the system seems designed for rapid response to nitrogen excess, rather than nitrogen depletion. Nitrogen excess results in too much glutamine synthetase activity, and transcriptional regulation is slow to deal with excess capacity. If the protein is stable, three cell divisions in the complete absence of further synthesis of the protein would be required before the cellular levels of glutamine synthetase would drop to one-eighth. Secondary methods such as protein degradation or inactivation are required for rapid responses to excess enzymatic activity. In contrast, transcriptional regulation
seems much better suited to rapid production of proteins in response to a signal, especially to proteins that are not required in high concentrations. The two-component system consisting of NRI and NRII functions at the level of the transcriptional regulation of glutamine synthetase. In response to nitrogen limitation, the pool of inactive adenylylated glutamine synthetase is activated, and the transcription of glnALG is induced. Induction is facilitated because σ54-RNA polymerase is already sequestered at the promoter for glnA. It is also facilitated because NRI can catalyze its own phosphorylation using acetyl phosphate as the phosphate donor (258), bypassing the need for NRII and ATP.
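The dilution argument made above (three cell divisions to reach one-eighth of the starting level) is simple halving arithmetic. The sketch below restates it under the same assumptions as the text: the protein is completely stable and none is synthesized after the shift. The function names are illustrative.

```python
def fraction_remaining(divisions: int) -> float:
    """Per-cell fraction of a stable protein left after a number of cell
    divisions, assuming no further synthesis and no degradation."""
    return 0.5 ** divisions

def divisions_needed(target_fraction: float) -> int:
    """Smallest number of divisions after which the per-cell level is at or
    below target_fraction of the starting level (dilution only)."""
    divisions, fraction = 0, 1.0
    while fraction > target_fraction:
        fraction /= 2.0
        divisions += 1
    return divisions

for n in range(5):
    print(f"{n} divisions -> {fraction_remaining(n):.4f} of initial level")

print("divisions to reach one-eighth:", divisions_needed(1 / 8))
```

Because each halving takes a full generation time, dilution alone acts on a timescale of generations, whereas the covalent modification described earlier acts within seconds.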
C. The Leucine-responsive Regulatory Protein and Metabolite-modulated Regulators

In the nitrogen regulon, the linkage between elevation of cellular glutamine and decreased transcription of glutamine synthetase requires the action of five proteins: uridylyltransferase, PII, NRII, NRI, and σ54, in addition to core RNA polymerase. Coregulator-controlled transcriptional regulators offer a simpler solution to transcriptional regulation, where an intracellular metabolite directly modulates the activity of the regulatory protein. Because the authors have studied the regulon controlled by the leucine-responsive regulatory protein (Lrp), we will use this regulon to illustrate some general features of co-regulator-controlled global regulation. Three issues are particularly stressed in this section. First, what is the regulon and which genes and proteins belong to it? We continue our discussion of identification of genes and proteins in a global regulon, focusing here on the ability to sample adequately the diversity of phenotypes in a regulon with particular screening methods. Second, which other genes are controlled by members of this regulon? We discuss the experimental distinction between genes that are directly controlled by the global regulator and those that are indirectly controlled, and consider the degree to which indirectly controlled genes may be important in the physiology of a regulon. Third, we consider the criteria for establishing whether in vitro observations are useful in predicting in vivo behavior, discussing the kinds of experiments necessary to decide if conclusions based on in vitro data are valid.
1. BACKGROUND

The leucine/Lrp regulon was first detected in several studies that suggested a special role for leucine as a regulatory metabolite (272-274). Leucine affects the expression of several operons whose gene products catalyze reactions unrelated to leucine biosynthesis and transformation, e.g., sdaA [L-serine deaminase (272, 273)], tdh [threonine dehydrogenase (275, 276)], lysU (277), ilvIH (278, 279), livJ and livKHMGF (274, 280), and oppA (281).
The lrp gene was first identified by a mutation, then called livR, that abolished the repression of high-affinity branched-chain amino-acid transport when leucine was present in the medium (282). Insertional mutations in lrp were subsequently shown to prevent the induction of acetohydroxyacid synthase III (IlvIH) (283) and 3-phosphoglycerate dehydrogenase (SerA) (275) in media lacking leucine, and to prevent the repression of serine deaminase (SdaA) (275), threonine dehydrogenase (Tdh) (275), lysyl-tRNA synthetase form II (LysU) (284), and oligopeptide permease (OppABCDF) (285) in media containing leucine.
2. IDENTIFICATION OF TARGET GENES IN THE Lrp REGULON
Three different methods have been used to identify genes in the regulon controlled by Lrp. Initial members of the regulon were identified as genes whose expression was affected by leucine, as described in the preceding section. A genetic screen was used to identify additional genes in the Lrp regulon (173). Protein fusions were constructed by random insertion of phage λplacMu9 into the chromosome, placing β-galactosidase expression under the control of heterologous promoters and translation initiators. Lrp-regulated genes were initially identified by observing differential expression of β-galactosidase in the presence and absence of leucine, and membership in the Lrp regulon was confirmed by the abolition of leucine sensitivity when an lrp::Tn10 mutation was transduced into the strain. This genetic screen identified 22 strains with λplacMu insertions into Lrp-regulated genes. A subsequent study (178) made use of inverse PCR to identify many of these Lrp-regulated genes by sequencing short pieces adjacent to the insert. Among the Lrp-regulated genes identified in this manner were gltD, livJ, livH, leuA, leuB, sdaC, and malF. Two-dimensional gel electrophoresis was used to identify polypeptides whose expression is modulated by Lrp (191). In these experiments, patterns of expression of E. coli polypeptides were compared in isogenic lrp+ and lrp strains, both in media lacking leucine and in the presence of leucine. These comparisons differentiated positive from negative control of expression of the polypeptide by Lrp. A polypeptide whose expression is increased by Lrp will be low in an lrp strain, whether or not leucine is added to the medium, whereas a polypeptide whose expression is inhibited by Lrp will be constitutively high in an lrp strain. These comparisons also indicate the degree to which leucine affects Lrp-dependent modulation of polypeptide expression.
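The four-way comparison described above (lrp+ versus lrp, with and without leucine) amounts to a small decision rule, sketched below. This is an illustration only: the function name, the twofold threshold, and the expression values in the usage lines are hypothetical and are not taken from the cited studies.

```python
def classify_lrp_control(wt_no_leu, wt_leu, mut_no_leu, mut_leu, fold=2.0):
    """Classify a polypeptide from relative expression levels in an lrp+ strain
    (wt_*) and an lrp strain (mut_*), each without and with leucine.

    Returns (control, leucine_sensitive). `fold` is an arbitrary illustrative
    threshold for calling a difference significant."""
    mutant_level = (mut_no_leu + mut_leu) / 2.0  # Lrp absent: leucine should not matter
    if wt_no_leu >= fold * mutant_level:
        control = "activated by Lrp"    # low in the lrp strain regardless of leucine
    elif mutant_level >= fold * wt_no_leu:
        control = "repressed by Lrp"    # constitutively high in the lrp strain
    else:
        control = "no Lrp effect detected"
    high, low = max(wt_no_leu, wt_leu), min(wt_no_leu, wt_leu)
    leucine_sensitive = high > 0 and high >= fold * low
    return control, leucine_sensitive

# Hypothetical expression values in arbitrary units
print(classify_lrp_control(40, 20, 1, 1))    # activated, leucine antagonizes
print(classify_lrp_control(1, 5, 10, 10))    # repressed, leucine relieves repression
print(classify_lrp_control(10, 10, 1, 1))    # activated, leucine-insensitive
```

The third case corresponds to the leucine-insensitive class, which a screen based solely on the response to leucine would not be expected to detect.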
Genetic studies indicated that leucine generally antagonizes the effect of Lrp on expression of members of the regulon, decreasing both activation and repression by Lrp (275, 283, 285). One exception has been noted: leucine is required for repression of high-affinity branched-chain amino-acid transport by Lrp (286). The patterns of expression observed in two-dimensional gels
for members of the regulon that had been studied at that time, i.e., LivJ (high-affinity branched-chain amino-acid transport), Tdh (threonine dehydrogenase) and LysU (lysyl-tRNA synthetase form II), mirrored the patterns inferred from genetic studies. But quite surprisingly, most polypeptides identified by two-dimensional electrophoretic analysis appeared relatively insensitive to leucine; expression was dependent on Lrp but not on leucine. All four polypeptides newly identified as Lrp-regulated by electrophoretic analysis, the small subunit of glutamate synthase (GltD), glutamine synthetase (GlnA), and the outer membrane porins OmpC and OmpF (191), fell in the leucine-insensitive class. Thus electrophoretic analysis suggested that genetic screens in which target genes were identified by effects of leucine on expression might miss a large proportion of the genes in the regulon.

As discussed in Section II,B,2, we have developed an alternative approach to genetic screening for genes in the Lrp regulon, in which Lrp-controlled lacZ operon fusions are identified by varying the concentration of Lrp within the cell. This type of screen does not require leucine sensitivity in order to identify genes in the regulon. Thus far we have identified 10 Lrp-dependent fusions (213), and cloning and identification of these is now in progress. The cloning method employed (see Section II,A,1) has the particular advantage that the expression pattern of the cloned fusion can be studied in vivo as a function of the Lrp concentration in the cell by titrating the IPTG concentration in the medium. While we have clearly identified some Lrp-regulated genes that were not detected by prior genetic screens, we have thus far not found any completely leucine-insensitive fusions. Is our failure to identify a class of leucine-insensitive genes in the regulon because all genes in the regulon are at least partially sensitive to leucine?
Recently, gltD was detected by an operon fusion screen based on differential expression in the presence or absence of leucine (178, 284). Because this polypeptide was one of the original leucine-insensitive polypeptides identified by electrophoretic analysis (191), and is induced ~40-fold by Lrp and only repressed -%fold by leucine, its identification by a genetic screen based on leucine suggests that only minimal sensitivity to leucine may be required for detection of an Lrp-regulated gene.
3. INDIRECT VERSUS DIRECT REGULATION
None of the analyses of the Lrp regulon described thus far allow a distinction to be made between direct and indirect regulation of the operon by Lrp. Gel-shift assays and DNA footprinting analyses can be used to demonstrate that Lrp binds to the promoter of a gene whose expression is affected by Lrp, but mutational analysis of the control regions contacted by Lrp is required to prove that Lrp binding is needed for alterations in gene expression (see Section II,A).
Of the four genes in the Lrp regulon identified by electrophoretic analysis, the promoter regions of gltBDF (211) and ompC (287) have subsequently been shown to be directly contacted by Lrp, and ompF (287) and glnA (191) have been shown to be indirectly regulated by Lrp. Such studies suggest that a large number of polypeptides identified by electrophoretic analysis may in fact be indirectly regulated by a global regulatory protein. We have suggested, in another military analogy, that regulons obey the Genghis Khan principle: a master regulon has lieutenants, each of which regulates its own platoon (191). The physiological significance of such secondary regulation may be illustrated by a discussion of the regulation of glnA by Lrp (191). Glutamine synthetase (GlnA) is one of the proteins whose expression is increased by Lrp. Electrophoretic analysis distinguishes the adenylylated and unadenylylated subunits of GlnA, which differ in charge and mass. In an lrp strain, GlnA accumulates in the adenylylated form and the total level of both forms decreases. Enzyme assays also can distinguish the adenylylated and unadenylylated forms of glutamine synthetase (288), and assays that determine the total of unmodified and modified enzyme activity show that the level of glutamine synthetase in an lrp strain is about one-fourth that in an lrp+ strain during growth on glucose-minimal MOPS medium with glutamine as the sole nitrogen source. The average number of adenylyl residues per dodecamer is 8.6 in an lrp strain and 2.6 in an lrp+ strain. Thus the physiological activity of glutamine synthetase (only unmodified enzyme is active under physiological conditions) is about 7% of the activity in an lrp+ strain. The effect of Lrp on glutamine synthetase is indirect. In an E. coli strain with an NtrB (NRII) deletion, Lrp has no effect on the total activity of glutamine synthetase (adenylylated and unadenylylated), which is constitutively elevated.
Remember that when NRII is absent, NRI is constitutively activated by acetyl-phosphate-dependent phosphorylation (Section III,B). Thus the effect of Lrp on glutamine synthetase expression requires a functional NRII-NRI two-component regulatory system. The effect of Lrp on GlnA expression could be explained by the observation that Lrp regulates glutamate synthase expression (191). Glutamate synthase has subsequently been shown to be directly regulated by Lrp (211, 289). Strains carrying mutations that abolish glutamate synthase activity are unable to induce proteins required for growth on poor nitrogen sources such as glutamine, or on low concentrations of ammonia (290). Because these strains are unable to convert glutamine to glutamate by the action of glutamate synthase, they have high intracellular levels of glutamine, which leads to high levels of adenylylation of GlnA and decreased levels of expression of the enzyme. In fact, lrp strains are thought to be unable to induce any of the genes in the nitrogen regulon controlled by NRI and NRII (191). Thus the regulation of
glutamine synthetase by Lrp is secondary to a primary effect of Lrp on gltBDF.

Is this secondary effect of Lrp on glutamine synthetase expression physiologically significant? We have argued that it is (211). As discussed in Section III,B, the nitrogen regulon is controlled in response to the availability of ammonia. But during growth on rich media containing nitrogen in the form of amino acids and nucleotides, the need for ammonia is greatly reduced. Under these conditions, the level of Lrp in cells is greatly reduced (6, 284) and the levels of glutamate synthase are lower. Accumulation of intracellular glutamine then leads to decreases in the expression and activity of glutamine synthetase. Thus lowered levels of Lrp during growth in rich media can override the induction of nitrogen-regulated operons by ammonia limitation. Were this not occurring, the Ntr regulon might be needlessly expressed in a nutritionally rich medium.
4. ARE CONCLUSIONS BASED ON in Vitro DATA VALID DESCRIPTIONS OF in Vivo PHYSIOLOGY?
a. Is Leucine the Physiological Coregulator of Lrp? Attempts to demonstrate the effects of leucine on regulation of expression by Lrp have typically employed very high levels of leucine. For electrophoretic analysis of in vivo expression of cellular polypeptides, 10 mM leucine was added to the medium (191), while the effect of leucine in interactions between purified Lrp and gltBDF DNA in vitro was half-maximal at 3 mM, and required concentrations of 10 mM or more for maximal effect (211). The use of such high extracellular concentrations of leucine to obtain effects raises the question of whether such concentrations are physiologically relevant, and whether the actual co-regulator is not leucine but a product derived from leucine, such as leucyl-tRNA. The affinity of Lrp for leucine has been assayed by equilibrium dialysis using highly purified protein, yielding a dissociation constant for leucine of 12 μM, and 2 mol of leucine were bound per dimer (291). The intracellular concentration of leucine in a wild-type strain of E. coli during steady-state growth in glucose-minimal MOPS medium is 1.7 mM; after addition of 0.4 mM exogenous leucine to the medium the intracellular concentration rises to 11.7 mM within 10 minutes, and then slowly decreases to a new steady-state level of 5.3 mM (292). These effects are expected if addition of leucine initially leads to a high influx of leucine via the high-affinity branched-chain amino-acid transport system, followed by down-regulation of the liv operons by Lrp and leucine, and a reduced rate of uptake of leucine from the medium. Quantitative studies of the effect of leucine on Lrp binding to gltBDF and ilvIH (211) indicate that the level of leucine required for maximal antagonism of Lrp binding is exceeded during the transient rise in intracellular
leucine following addition of 0.4 mM leucine to the medium, and thus that modulation of gltBDF and ilvIH expression in response to leucine could be physiologically significant. However, such quantitative studies of the effects of leucine on the expression of other operons are lacking. Both in vivo and in vitro studies of this kind are needed if we are to understand the interplay of Lrp and leucine in regulation of genes controlled by Lrp. If the dissociation constant for leucine binding to Lrp is 12 µM, why should 10 mM leucine be required to see a maximal effect of leucine on Lrp binding during in vitro gel-shift and DNase I footprinting analyses? Taking the interaction of Lrp with gltBDF as an example, 10 mM leucine abolishes the binding of dimeric Lrp to a specific site upstream of gltBDF DNA as analyzed by DNase I footprinting (289). Thus we may consider a competition for this site involving three species: Lrp-Leu, Lrp, and Lrp-DNA. Lrp binds to DNA as a dimer, and also exists as a dimer in solution whether or not leucine is present. It is not clear whether the decreased binding of Lrp to DNA requires one or two molecules of leucine to be bound to the dimer, but for the purposes of this calculation we shall assume that one leucine ligand is sufficient.

$$\mathrm{Lrp_2{\cdot}Leu} \;\overset{12\ \mu\mathrm{M}}{\rightleftharpoons}\; \mathrm{Lrp_2} \;\overset{2\ \mathrm{nM}}{\rightleftharpoons}\; \mathrm{Lrp_2{\cdot}DNA} \tag{6}$$
Assume a 2 nM affinity of the Lrp dimer for the specific site on DNA, which is a number consistent with the observed binding of Lrp dimer to gltBDF DNA. In order to reduce the binding of Lrp to DNA to one-half, the concentration of leucine must be 6000-fold higher than the concentration of the DNA target. For mobility-shift experiments, we typically use a DNA concentration of about 240 pM, and for half-maximal dissociation of Lrp from the DNA we would require 1.4 mM leucine. If binding of 2 mol of leucine per Lrp dimer is required to antagonize DNA binding, the required concentration of leucine for half-maximal dissociation of Lrp from DNA would be even higher. Thus the measured affinities of Lrp dimer for DNA and of Lrp for leucine, and the concentration of leucine required for a half-maximal effect on Lrp binding in vitro are completely consistent with the assumption that leucine is the physiological co-regulator. The requirement for high concentrations of leucine to see maximal effects on Lrp binding in vitro and in vivo arises simply because of the high affinity of Lrp for its target sites on DNA (typically in the range of 1-20 nM) and the relatively weak affinity of leucine for Lrp. According to this analysis, leucine-insensitive operons will have high-affinity binding sites for Lrp dimer, such that physiological concentrations of leucine are not high enough to
cause appreciable dissociation from these sites, and leucine-sensitive operons will have lower affinity sites (211).
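The two equilibria in the competition scheme can be put in numbers. The following is a minimal sketch, assuming simple 1:1 equilibria, one leucine per Lrp dimer, and the two dissociation constants quoted in the text; the function and constant names are illustrative and are not taken from the cited work.

```python
# Sketch of the Lrp/leucine/DNA competition, assuming simple 1:1 equilibria.
# Constants are the values quoted in the text; all names are illustrative.

KD_LEUCINE_M = 12e-6  # dissociation constant of Lrp dimer for leucine (M)
KD_DNA_M = 2e-9       # assumed dissociation constant of Lrp dimer for its DNA site (M)


def affinity_ratio(kd_weak_m: float, kd_tight_m: float) -> float:
    """Return how many times weaker the first interaction is than the second."""
    return kd_weak_m / kd_tight_m


def fraction_liganded(ligand_m: float, kd_m: float) -> float:
    """Fraction of DNA-free Lrp dimer carrying ligand (single-site isotherm)."""
    return ligand_m / (kd_m + ligand_m)


if __name__ == "__main__":
    ratio = affinity_ratio(KD_LEUCINE_M, KD_DNA_M)
    print(f"Leucine/DNA dissociation-constant ratio: {ratio:.0f}")
    # Intracellular leucine concentrations reported in the text (292), in mol/L.
    levels = [("steady state", 1.7e-3), ("transient peak", 11.7e-3),
              ("new steady state", 5.3e-3)]
    for label, leucine_m in levels:
        bound = fraction_liganded(leucine_m, KD_LEUCINE_M)
        print(f"{label}: {bound:.3f} of DNA-free Lrp carries leucine")
```

Under these assumptions the ratio of the two constants is the 6000-fold factor used in the text, and Lrp that is not bound to DNA is almost fully liganded at millimolar leucine, which is why the affinity of the DNA site, rather than the affinity for leucine, would decide whether an operon responds.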
b. Binding of Lrp to the gltBDF and ilvIH Operons: A Model for Transcriptional Activation Based on in Vitro Studies of the Binding Affinities of the Target Operons for Lrp. As discussed above, a simple analysis suggests that specific binding of Lrp to sites with high affinity will result in insensitivity of occupancy to physiological levels of leucine. Such a model predicts that positively regulated leucine-sensitive operons will have a lower affinity of binding of Lrp to one or more sites required for transcriptional activation than will leucine-insensitive genes. This hypothesis was initially tested by measuring apparent Kd values for Lrp binding to gltBDF and ilvIH using quantitative mobility-shift assays (211). Occupancy of the three sites upstream of gltBDF, or of the two proximal sites required for transcriptional activation of ilvIH, was compared at varying concentrations of Lrp (Fig. 7). These experiments were conducted using homogeneous Lrp and purified DNA templates, in both the absence and the presence of 30 mM leucine; Lrp is in excess over the concentration of DNA (240 pM) at all but the lowest concentrations, where very little Lrp is bound to DNA, so that the concentration of Lrp bound is approximately equal to the total concentration of Lrp throughout the range of these experiments. Several inferences can be made from these experiments. First, in the absence of leucine, the sites upstream of gltBDF have a higher affinity for Lrp (~2 nM) than the proximal sites upstream of ilvIH (~7 nM). Thus at low concentrations of Lrp, saturation of gltBDF sites is greater than saturation of ilvIH sites. Second, addition of leucine results in decreased occupancy of both gltBDF and ilvIH sites (a rightward shift of the binding curves in Fig. 7B). In the presence of saturating leucine (30 mM), a higher percentage of sites for Lrp are occupied on gltBDF than on ilvIH at any given Lrp concentration, as predicted.
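The ordering of occupancies described above can be illustrated with a short calculation. This is a sketch only: it assumes a single-site hyperbolic isotherm and uses the apparent Kd values given in the legend to Fig. 7, whereas the published curves were fitted as described in Ernsting et al. (211) and may follow a different model. The names `occupancy` and `occupancy_table` are hypothetical, and the output is not intended to reproduce the predicted fold changes in expression.

```python
# Single-site binding isotherm applied to the apparent Kd values of Fig. 7.
# The hyperbolic model is an assumption made purely for illustration.

APPARENT_KD_NM = {
    # promoter: (Kd without leucine, Kd with 30 mM leucine), in nM
    "gltBDF": (2.0, 6.6),
    "ilvIH": (6.9, 14.1),
}


def occupancy(lrp_nm: float, kd_nm: float) -> float:
    """Fraction of DNA bound, taking free Lrp as approximately total Lrp."""
    return lrp_nm / (kd_nm + lrp_nm)


def occupancy_table(lrp_nm: float) -> dict:
    """Occupancy of each promoter without and with leucine at one Lrp level."""
    return {
        promoter: (occupancy(lrp_nm, kd_minus), occupancy(lrp_nm, kd_plus))
        for promoter, (kd_minus, kd_plus) in APPARENT_KD_NM.items()
    }


if __name__ == "__main__":
    for lrp_nm in (1.0, 5.0, 20.0):
        for promoter, (without_leu, with_leu) in occupancy_table(lrp_nm).items():
            print(f"{lrp_nm:5.1f} nM Lrp  {promoter:7s} "
                  f"-Leu {without_leu:.2f}  +Leu {with_leu:.2f}")
```

With any positive Lrp concentration, this model gives higher occupancy for gltBDF than for ilvIH both with and without leucine, simply because the gltBDF Kd is the smaller one in each condition.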
Thus studies of Lrp binding to two target operons, as measured by mobility-shift assays, are consistent with the hypothesis that the DNA binding affinity of Lrp for target promoter sites that are required for transcription determines leucine sensitivity. These studies also permitted an estimate of the effective in vivo concentration of Lrp during exponential growth in glucose-minimal MOPS medium. This estimate involved comparison of the leucine sensitivity observed by in vivo measurements of the effect of addition of leucine to the medium using gltBDF and ilvIH operon fusions to β-galactosidase with the in vitro mobility-shift data shown in Fig. 7B. At 5.5 nM Lrp, the in vitro studies predicted that saturating leucine would reduce expression of gltBDF::lacZ [...]-fold, and would reduce expression of ilvIH::lacZ 5.5-fold. These predicted
effects exactly mimic the effect of 10 mM exogenous leucine observed in in vivo experiments. Thus these experiments suggest that the effective concentration of Lrp in the cell under these growth conditions is ~5.5 nM (211). Western blot analysis of cell extracts has revealed an intracellular Lrp concentration more than 1000-fold higher (118). The necessity for this "excess" of Lrp within the cell could be explained if large amounts of Lrp are unavailable for regulation due to nonspecific interactions with DNA or other proteins (see Section I,B,2).

[Figure 7: (A) mobility-shift gel, lanes 1-16; (B) percent DNA bound plotted against log [Lrp dimer].]

FIG. 7. (A) Use of DNA mobility-shift assays to measure the binding of Lrp to the promoter regions of the gltBDF and ilvIH operons. These measurements were made in the absence of leucine. [Reprinted, with permission, from Ernsting et al. (211).] (B) Effect of leucine on Lrp binding to the gltBDF and ilvIH promoters. Data from mobility-shift assays carried out in the absence (A) or presence (not shown) of 30 mM leucine were quantified by Phosphorimager scanning and plotted as percent DNA bound at various concentrations of Lrp. Theoretical binding curves were fitted to the data to provide estimates of the apparent dissociation constants for Lrp binding as described in Ernsting et al. (211). The apparent Kd for Lrp binding to the low-affinity complex with ilvIH (the more retarded band), required for transcriptional activation of the ilvIH promoter, is 6.9 nM in the absence of leucine and 14.1 nM in the presence of 30 mM leucine. The apparent Kd for Lrp binding to the gltBDF promoter is 2.0 nM in the absence of leucine and 6.6 nM in its presence. The vertical line is drawn at 5.5 nM Lrp, the estimated effective intracellular concentration of Lrp (see the text). [Reprinted, with permission, from Ernsting et al. (211).]

However, there are several limitations to such in vitro studies. The first limitation is that DNA mobility-shift assays do not measure Kd values directly (293). Observing a shifted complex requires that the DNA-binding protein remain bound during electrophoresis. Because the electrophoresis buffer generally does not contain unbound binding protein (unbound Lrp is strongly positively charged at pH 8, like most DNA-binding proteins, and does not enter the gel), in practice, observing a mobility shift requires that dissociation of the binding protein be slow enough that the half-life of the complex exceeds the time of analysis. Thus when leucine decreases the percentage of DNA which is shifted, it may be increasing the rate of association and dissociation of the binding protein and/or changing the Kd for binding. If only the unshifted band is quantitated in gel-shift experiments, the problem of dissociation of bound protein during the run may be avoided, but for mobility-shift assays in which more than one complex is formed, much information is lost if only the unshifted band can be quantitated. While less easy to perform, quantitative footprinting assays are more accurate measures of Kd values for binding of proteins to DNA (294, 295).

A still more serious limitation is that binding of a regulatory protein to target DNA is necessary but not sufficient for transcriptional activation. Thus leucine might alter Lrp binding to target DNA very little, but have a profound effect on transcriptional activation by altering Lrp-leucine or Lrp-polymerase contacts. Very recent studies in our laboratory (D. Wiese, unpublished) suggest that another determinant of leucine sensitivity is the degree to which a site whose occupancy is decreased by leucine is essential for transcriptional activation. Only one of the three sites for Lrp upstream of gltBDF is sensitive to leucine, and if this site is unoccupied, transcription is reduced to about 30% of the level seen in the absence of leucine. Although the mobility-shift assays tell us that leucine decreases the affinity of Lrp for the sites upstream of gltBDF, what they could not tell us is that leucine is also decreasing the transcriptional activation at apparently saturating concentrations of Lrp.

One needs a way to compare the effect of varying intracellular Lrp concentrations on transcription in vivo with the effects of these same concentrations of Lrp on binding in vitro. The first obstacle to such experiments is varying the intracellular Lrp concentration in a controlled fashion. As previously described, the approach we have taken is to use a chromosomal lacUV5-lrp fusion, inducible by IPTG, and to vary the concentration of IPTG in the medium (296). We have shown that the intracellular concentration of Lrp can be varied reproducibly in this way, and the actual total concentration of Lrp in the cell can be determined by Western blotting. Using a suitable reporter construct, e.g., a gltBDF::lacZ operon fusion, one can then examine the expression of gltBDF as a function of the Lrp concentration in the cell. Such studies are currently in progress in our laboratory (296), and will permit comparison of in vitro binding with in vivo expression. In vivo studies are also extremely important for establishing benchmarks for developing in vitro transcription assays that reflect transcription of target operons in vivo.
There are several problems with in vitro studies of transcription, the most salient being that the dependence of transcription on the concentration of a regulatory protein may be very different in vitro than in vivo. For example, there is an 11-fold difference in the levels of ilvIH mRNA between isogenic lrp+ and lrp strains, but in vitro transcription from supercoiled plasmid templates shows only a 2- to 5-fold stimulation by Lrp from promoter P1, and Lrp actually repressed transcription at another promoter that is not used to initiate transcription in vivo (297). In contrast, in vivo assays of β-galactosidase expression from an ilvIH::lacZ operon fusion indicated a 31-fold stimulation of transcription when Lrp is present in the cell (211, 283). Discrepancies between in vivo and in vitro studies may arise because additional factors are important in regulating transcription in vivo and are not included in the in vitro system, or because supercoiling of the template DNA plays an important role in vivo (298). Although in vitro transcription analyses can be performed on supercoiled templates (266, 299), in vivo superhelix density is significantly lower than that measured for DNA extracted from the cell (300). In addition, plasmid supercoiling may vary depending on the conditions under which the cells are grown prior to isolation of the DNA (300). In vitro transcription may also be sensitive to the exact concentration of salt and choice of ion (e.g., glutamate or chloride, polyamine or K+) employed for the studies (301). Thus an important challenge for detailed studies of in vitro transcriptional regulation is to mimic successfully the properties of the system in vivo. Once such an appropriate in vitro system has been developed, studies can provide additional insights not available from in vivo studies. They can establish at what stage the regulatory protein accelerates transcription (see Section I,B,2) and whether additional factors are required. They also permit cross-linking studies to be performed to identify contacts between the regulatory protein and RNA polymerase (302, 303).
5. THERE MAY BE A CONTINUUM BETWEEN HISTONELIKE REGULATORY PROTEINS, GLOBAL COREGULATOR-DEPENDENT REGULATORY PROTEINS, AND LOCAL REGULATORY PROTEINS

The levels of the histonelike regulatory proteins are sufficiently high that one may expect most of the molecules to be nonspecifically bound to DNA. Thus, it may be very hard to distinguish between effects on target genes due to specific binding of a histonelike regulatory protein such as Fis, and effects that reflect changes in chromosomal organization induced by Fis. This is particularly true when the effects on target genes are relatively small. Global regulatory proteins such as Lrp are present in high enough concentrations that observed effects on target genes may also be due to nonspecific interactions and indirect effects on chromosomal organization (119, 304). In this context, it should be noted that many of the observed effects of Lrp on target operons are relatively small: ompC expression is induced approximately twofold in an lrp::Tn10 strain (287) and expression of malT, malEFG, and malB-lamB-malK is decreased two- to fivefold (178). Thus, just as there is a continuum between local and global regulatory proteins, there may also be a continuum between global and histonelike regulatory proteins.
IV. Summary and Conclusions

This review focuses on experimental studies of global regulation in E. coli. In the introductory section, we presented a series of questions about global regulons as an organizing framework.
1. What is the regulon? Which genes and proteins belong to it?
2. What is the physiological role of the regulon? Under what conditions are its member genes expressed? What is the signal recognized by the regulator?
3. How does the regulator function? How is the expression of individual genes in the regulon controlled by the regulator? How is the regulator activated or inactivated by the signal?
4. How does the regulon function? What is the feedback circuit by which the magnitude of the response is modulated? Which other genes are controlled by members of this regulon?
5. How is the regulon integrated into the cell's overall response to a given stimulus? Which other regulators control members of this regulon? How do different regulators interact with one another?

As we have seen, there are now numerous methods for determining which genes and proteins belong to a given regulon. Although the ability of any one method to sample the diversity of the regulon may be limited, taken together, these methods will almost certainly be adequate to define the membership of global regulons. The signal transduction pathway in the nitrogen regulon has been elucidated in detail by in vitro studies with purified components, and the interaction of the transcriptional activator, NRI, with at least one target gene, glnA, is understood at the molecular level. Our understanding of signal transduction in other global regulons is less advanced and in some cases woefully lacking. However, the approach that has been so successful in the study of the nitrogen regulon can be expected to provide similar information on other global regulons. The greatest challenge that remains in understanding global regulation is to relate in vitro experiments to in vivo conditions. We need much more information on in vivo conditions; for example, we presently do not really know the concentration of active RNA polymerase in the cell relative to the concentration of promoters available for transcription. We lack adequate methods to determine the effective concentrations of DNA-binding proteins and their co-regulators in vivo.
And we need more methods to allow in vivo analyses of protein-DNA, protein-RNA, and protein-protein interactions involved in gene regulation. We also need to develop methods that allow us to compare transcription of genes in vivo with transcription in vitro. The other area for which information is woefully lacking is the integration of various regulons in the cell's overall response to a stimulus. Although we know that the membership of individual global regulons overlaps and many regulators control a given gene, we still know rather little about how this control is integrated. Nor do we fully understand how the levels of the global regulators themselves are controlled, except in the case of the nitrogen regulon. Such fundamental questions as how the cell integrates its responses to the availability of carbon and nitrogen remain
largely unanswered. Let us end, as we began, with a quote from “The Physiology of the Bacterial Cell” (8):

Many circumstances in nature produce multiple changes in the environment of microbes. For example, excretion of E. coli cells from the mammalian gut may involve, within a short time period, shift-down in temperature, decrease of nutrient availability, decrease in concentration of toxic substances (bile salts and waste products of microbial metabolism), increase in oxygen concentration, change in pH, increase in redox potential, and exposure to UV irradiation.
How do cells adapt to simultaneous changes of this sort?
V. Glossary

AmpR: Repressor for a β-lactamase (ampicillin resistance) gene
Arc (A,B): Two-component regulatory system for aerobic respiration control
Che (A,B,Y): Parts of the chemotaxis two-component regulatory system; CheA is the sensor kinase, CheB and CheY are alternative response regulators
[term missing]: Heat-shock-inducible ATP-dependent proteases
[term missing]: Sensor kinase from a two-component regulatory system
Crp: cAMP receptor protein (synonymous with CAP, or catabolite activator protein)
Dam: DNA-adenine methyltransferase, which modifies the sequence 5'-GATC-3'
Dna (J,K): Heat-shock-inducible proteins that help to assemble multisubunit proteins or target proteins for degradation; defects in these genes block DNA synthesis during lytic growth of bacteriophage λ
EnvZ: Controls synthesis of envelope proteins called porins (see below); it is the sensor kinase that acts on the response regulator OmpR
Fis: Protein that binds DNA with moderate specificity and bends it (factor for inversion stimulation)
GcvA: Glycine cleavage enzyme activator
GlnA: Glutamine synthetase
GlnB: Regulatory protein PII, which controls the adenylylation activity of GlnE
GlnD: Uridylyltransferase/uridylyl-removing enzyme; involved in regulation of GlnA activity and synthesis
GlnE: Adenylyltransferase/adenylyl-removing enzyme; involved in covalent modification of glutamine synthetase
Glt (B,D,F): Products of the glutamate-biosynthetic operon; GltB and GltD are the two subunits of glutamate synthase; the role of GltF is unclear
Gre (A,B): RNA polymerase accessory proteins that stimulate cleavage of nascent transcripts in paused transcription complexes
GroEL: Chaperonin; a heat-shock-inducible protein that refolds denatured proteins; mutants show a growth defect for some bacteriophages
GrpE: Heat-shock-inducible protein that interacts with DnaJK (see above) to form the active complex
H-NS: Histonelike protein that binds DNA with moderate specificity and bends it
HslV: Heat-shock-inducible ATP-dependent protease
Htp: Heat-shock-inducible proteins; HtpR is synonymous with RpoH
HU: Histonelike protein that binds DNA with low specificity and bends it; forms nucleosome-like structures
Ibp: Heat-shock-inducible proteins that appear to promote aggregation of denatured proteins; inclusion body proteins
IHF: Protein that binds DNA with moderate specificity and bends it; some mutants are defective for integration of λ DNA into the chromosome: integration host factor
Ilv (I,H): Subunits of acetohydroxyacid synthase III; involved in biosynthesis of branched-chain amino acids
IPG: Immobilized pH gradient
IPTG: Isopropyl-β-D-thiogalactopyranoside, a nonmetabolizable lactose analog used to induce expression of genes that are controlled by LacI
Lac (Z,Y,I,UV5): Parts of the operon responsible for lactose metabolism; LacZ specifies β-galactosidase, LacY is the lactose permease, LacI (not in the operon) is the repressor, and UV5 refers to a UV-induced promoter mutation that makes lac transcription independent of Crp
LexA: Repressor for several genes induced by DNA damage
Liv (H,M,G,F): Components of the high-affinity branched-chain amino-acid transport system
Liv (J,K): Periplasmic amino-acid-binding proteins that confer specificity on the Liv transport system
Lon: Heat-shock-inducible ATP-dependent protease
Lrp: Leucine-responsive regulatory protein
LysU: Lysyl-tRNA synthetase form II
MerR: Mercury resistance regulator protein
Ntr: Nitrogen regulon; important in conditions of ammonia limitation; controlled by a two-component regulatory system that includes the sensor kinase NRII (NtrB or GlnL) and the response regulator NRI (NtrC or GlnG)
NusA: RNA polymerase accessory protein that lengthens transcriptional pauses; some mutants behave as if the bacteriophage λ N protein was undersupplied
Omp (A,C,F,R): OmpA, OmpC, and OmpF are outer membrane proteins called porins; OmpR is a response regulator phosphorylated by EnvZ
Opp: Members of the oligopeptide transport operon
Pho (A,B,R): Components of the phosphate stimulon; PhoA is alkaline phosphatase, PhoB and PhoR are components of a regulatory kinase cascade
porins: Proteins that form pores (water-filled channels) through the outer membrane of Gram-negative bacteria such as E. coli, allowing small hydrophilic molecules to enter by diffusion
ppGpp: Guanosine 5',3'-bisphosphate; guanosine tetraphosphate
PutA: A multifunctional protein involved in proline utilization; it catalyzes two catabolic reactions and can act as a repressor
PVDF: Polyvinylidene difluoride; PVDF membranes bind proteins with high avidity
Rpo (H,N,S): RNA polymerase genes coding for alternative sigma (σ) factors: RpoH controls the heat-shock response, RpoN controls the nitrogen starvation response, and RpoS controls the adaptation to stationary phase (no growth)
σ (σ32, σ54, σ70): Polypeptides that bind to RNA polymerase and determine its promoter specificity; the superscript indicates the molecular mass in kDa; σ32 is RpoH, σ54 is RpoN, and σ70 is the sigma factor that predominates during normal growth (RpoD)
SdaA: Serine deaminase
SDS: Sodium dodecyl sulfate, a detergent
SerA: 3-Phosphoglycerate dehydrogenase, the first enzyme in serine biosynthesis
SOS: The response to DNA damage, involving induction of many genes and inhibition of cell division; from the nautical distress signal “save our ship”
SoxR: Regulator for genes involved in preventing or repairing oxidative damage
tac: A strong hybrid promoter containing the -35 region of trp and the -10 region of lac
Tdh: Threonine dehydrogenase
TyrR: Repressor for the tyrosine biosynthetic operon
ACKNOWLEDGMENTS

Research from the authors’ laboratories is funded by National Science Foundation Grant MCG 9506911. We thank Alexander Ninfa (University of Michigan) for critical review of a preliminary draft of this manuscript, and Ruth VanBogelen (Parke-Davis Division of Warner-Lambert) for information on the current status of the gene-protein database.
REFERENCES

1. F. Jacob and J. Monod, JMB 3, 318 (1961).
2. W. H. Mager and A. J. J. De Kruijff, Microbiol. Rev. 59, 506 (1995).
3. P. C. Loewen and R. Hengge-Aronis, Annu. Rev. Microbiol. 48, 53 (1994).
4. M. R. Atkinson and A. J. Ninfa, in “Transcription: Mechanisms and Regulation” (R. C. Conaway and J. W. Conaway, eds.), p. 323. Raven Press, New York, 1994.
5. J. L. Botsford and J. G. Harman, Microbiol. Rev. 56, 100 (1992).
6. J. M. Calvo and R. G. Matthews, Microbiol. Rev. 58, 466 (1994).
7. J. A. Hoch and T. J. Silhavy, eds., “Two-component Signal Transduction.” ASM Press, Washington, DC, 1995.
8. F. C. Neidhardt, J. L. Ingraham and M. Schaechter, in “Physiology of the Bacterial Cell: A Molecular Approach” (F. C. Neidhardt, J. L. Ingraham and M. Schaechter, eds.), p. 351. Sinauer, Sunderland, MA, 1990.
9. S. Kumar, J. Bact. 125, 545 (1976).
10. M. L. Sprengart, H. P. Fatscher and E. Fuchs, NARes 18, 1719 (1990).
11. G. J. Murakawa and D. P. Nierlich, Bchem 28, 8067 (1989).
12. S. Varenne, J. Buc, R. Lloubes and C. Lazdunski, JMB 180, 549 (1984).
13. M. A. Sørensen, C. G. Kurland and S. Pedersen, JMB 207, 365 (1989).
14. H. Bremer and P. P. Dennis, in “Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1527. American Society for Microbiology, Washington, DC, 1987.
15. N. O. Kjeldgaard and K. Gausing, in “Ribosomes” (M. Nomura, A. Tissières and P. Lengyel, eds.), p. 369. CSH Lab, CSH, NY, 1974.
16. R. E. Kingston, W. C. Nierman and M. J. Chamberlin, JBC 256, 2787 (1981).
17. U. Vogel and K. F. Jensen, JBC 269, 16236 (1994).
18. T. D. Yager and P. H. von Hippel, in “Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1241. American Society for Microbiology, Washington, DC, 1987.
19. M. J. Chamberlin, Harvey Lect. 88, (1994).
20. M. J. Chamberlin, in “RNA Polymerase” (R. Losick and M. Chamberlin, eds.), p. 17. CSH Lab, CSH, NY, 1976.
21. S. L. Gotta, O. L. Miller, Jr. and S. L. French, J. Bact. 173, 6647 (1991).
22. K. F. Jensen and S. Pedersen, Microbiol. Rev. 54, 89 (1990).
23. M. Cashel and K. E. Rudd, in “Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1410. American Society for Microbiology, Washington, DC, 1987.
24. P. S. Reddy, A. Raghavan and D. Chatterji, Mol. Microbiol. 15, 255 (1995).
25. M. T. Hansen, P. M. Bennett and K. von Meyenburg, JMB 77, 589 (1973).
26. A. M. Patel and S. D. Dunn, J. Bact. 174, 3541 (1992).
27. O. Yarchuk, I. Iost and M. Dreyfus, Biochimie 73, 1533 (1991).
28. O. Yarchuk, N. Jacques, J. Guillerez and M. Dreyfus, JMB 226, 581 (1992).
29. W. P. Donovan and S. R. Kushner, PNAS 83, 120 (1986).
30. S. Pedersen and S. Reeh, MGG 166, 329 (1978).
31. B. M. Herschbach and A. D. Johnson, Annu. Rev. Cell Biol. 9, 479 (1993).
32. G. Churchward, H. Bremer and R. Young, J. Bact. 150, 572 (1982).
33. E. Kellenberger, Res. Microbiol. 142, 229 (1991).
34. H. J. Sofia, V. Burland, D. L. Daniels, G. R. Plunkett and F. R. Blattner, NARes 22, 2576 (1994).
35. F. R. Blattner, V. Burland, G. R. Plunkett, H. J. Sofia and D. L. Daniels, NARes 21, 5408 (1993).
36. D. L. Daniels, G. R. Plunkett, V. Burland and F. R. Blattner, Science 257, 771 (1992).
37. S. S. Broyles and D. E. Pettijohn, JMB 187, 47 (1986).
38. K. Drlica and J. Rouviere-Yaniv, Microbiol. Rev. 51, 301 (1987).
39. D. E. Pettijohn, JBC 263, 12793 (1988).
40. D. E. Pettijohn, Cell 30, 667 (1982).
41. D. F. Stickle, K. M. Vossen, D. A. Riley and M. G. Fried, J. Theor. Biol. 168, 1 (1994).
42. E. R. Hildebrandt and N. R. Cozzarelli, Cell 81, 331 (1995).
43. K. Tanaka, S. Muramatsu, H. Yamada and T. Mizuno, MGG 226, 367 (1991).
44. V. McGovern, N. P. Higgins, R. S. Chiz and A. Jaworski, Biochimie 76, 1019 (1994).
45. K. Yasuzawa et al., Gene 122, 9 (1992).
46. S. Cooper and C. E. Helmstetter, JMB 31, 519 (1968).
47. R. M. Blumenthal, Doctoral Dissertation (appendix), University of Michigan, Ann Arbor (1977).
48. N. Shepherd, G. Churchward and H. Bremer, J. Bact. 141, 1098 (1980).
49. W. D. Donachie and A. C. Robinson, in “Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1578. American Society for Microbiology, Washington, DC, 1987.
50. W. Rünzi and H. Matsura, J. Bact. 125, 1237 (1976).
51. P. Singer and C. W. Wu, JBC 262, 14178 (1987).
52. J. Ryals, R. Little and H. Bremer, J. Bact. 151, 879 (1982).
53. W. R. McClure, in “Biochemistry of Metabolic Processes” (D. L. F. Lennon, F. W. Stratman and R. N. Zahlten, eds.), p. 207. Elsevier, New York, 1983.
54. B. L. Wanner, R. Kodaira and F. C. Neidhardt, J. Bact. 130, 212 (1977).
55. R. M. Blumenthal and P. P. Dennis, MGG 165, 79 (1978).
56. R. M. Blumenthal and P. P. Dennis, J. Bact. 142, 1049 (1980).
57. O. G. Berg, Biopolymers 23, 1869 (1984).
58. P. H. von Hippel and O. G. Berg, JBC 264, 675 (1989).
59. L. Clarke and J. Carbon, Cell 9, 91 (1976).
60. T. Ruusala and D. M. Crothers, PNAS 89, 4903 (1992).
61. P. U. Giacomoni, EJB 98, 557 (1979).
62. H. Kabata et al., Science 262, 1561 (1993).
63. R. Hannon, E. G. Richards and H. J. Gould, EMBO J. 5, 3313 (1986).
64. W. R. McClure, ARB 54, 171 (1985).
65. M. E. Mulligan, D. K. Hawley, R. Entriken and W. R. McClure, NARes 12, 789 (1984).
66. S. E. Warne and P. L. deHaseth, Bchem 32, 6134 (1993).
67. B. C. Hoopes and W. R. McClure, in “Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1231. American Society for Microbiology, Washington, DC, 1987.
68. J. Tan, L. Shu and H. Y. Wu, J. Bact. 176, 1077 (1994).
69. D. Chen, R. Bowater and D. M. Lilley, J. Bact. 176, 3757 (1994).
70. D. K. Hawley, A. D. Johnson and W. R. McClure, JBC 260, 8618 (1985).
71. P. J. Schlax, M. W. Capp and M. T. Record, Jr., JMB 245, 331 (1995).
72. E. Bertrand-Burggraf, J. F. Lefevre and M. Daune, JMB 193, 293 (1987).
73. A. Kumar et al., JMB 235, 405 (1994).
74. A. Attey et al., NARes 22, 4375 (1994).
75. W. Niu, Y. Zhou, Q. Dong, Y. W. Ebright and R. H. Ebright, JMB 243, 595 (1994).
76. L. Bracco, P. Kotlartz, A. Kolb, D. S. and H. Buc, EMBO J. 8, 4289 (1989).
77. M. Buckle, H. Buc and A. A. Travers, EMBO J. 11, 2619 (1992).
78. M. G. Marinus and E. B. Konrad, MGG 149, 273 (1976).
79. G. E. Herman and P. Modrich, J. Bact. 145, 644 (1981).
80. R. E. Braun and A. Wright, MGG 202, 246 (1986).
81. D. Roberts, B. C. Hoopes, W. R. McClure and N. Kleckner, Cell 43, 117 (1985).
82. B. A. Braaten, X. Nou, L. S. Kaltenbach and D. A. Low, Cell 76, 577 (1994).
83. G. M. Studnicka, BJ 252, 825 (1988).
84. W. R. McClure, PNAS 77, 5634 (1980).
85. W. Zillig et al., CSHSQB 35, 47 (1971).
86. A. K. Vershon, S. Liao, W. R. McClure and R. T. Sauer, JMB 195, 323 (1987).
87. A. Z. Ansari, J. E. Bradner and T. V. O’Halloran, Nature 374, 371 (1995).
88. A. J. Ninfa, L. J. Reitzer and B. Magasanik, Cell 50, 1039 (1987).
89. B. Krummel and M. J. Chamberlin, Bchem 28, 7829 (1989).
90. D. E. Johnston and W. R. McClure, in “RNA Polymerase” (R. R. Losick and M. J. Chamberlin, eds.), p. 101. CSH Lab, CSH, NY, 1976.
91. A. J. Carpousis and J. D. Gralla, Bchem 19, 3245 (1980).
92. N. Shimamoto, T. Kamigochi and H. Utiyama, JBC 261, 11859 (1986).
93. K. I. Sørensen, K. E. Baker, R. A. Kelln and J. Neuhard, J. Bact. 175, 4137 (1993).
94. L. M. Munson and W. S. Reznikoff, Bchem 20, 2081 (1981).
95. E. Nudler, A. Goldfarb and M. Kashlev, Science 265, 793 (1994).
96. D. Wang et al., Cell 81, 341 (1995).
97. J. R. Levin and M. J. Chamberlin, JMB 196, 61 (1987).
98. J. Lee and A. Goldfarb, Cell 66, 793 (1991).
99. D. J. Jin, JBC 269, 17221 (1994).
100. M. C. Schmidt and M. J. Chamberlin, Bchem 23, 197 (1984).
101. K. Liu and M. M. Hanna, PNAS 92, 5013 (1995).
102. S. Borukhov, A. Polyakov, V. Nikiforov and A. Goldfarb, PNAS 89, 8899 (1992).
103. S. Borukhov, V. Sagitov and A. Goldfarb, Cell 72, 459 (1993).
104. J. Sparkowski and A. Das, NARes 18, 6443 (1990).
105. J. Sparkowski and A. Das, J. Bact. 173, 5256 (1991).
106. M. Orlova, J. Newlands, A. Das, A. Goldfarb and S. Borukhov, PNAS 92, 4596 (1995).
107. R. Landick and C. Yanofsky, in “Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1276. American Society for Microbiology, Washington, DC, 1987.
108. P. Babitzke, J. T. Stults, S. J. Shire and C. Yanofsky, JBC 269, 16597 (1994).
109. P. Gollnick, Mol. Microbiol. 11, 991 (1994).
110. H. E. Choi and S. Adhya, PNAS 89, 11264 (1992).
111. J. A. Goodrich and W. R. McClure, JMB 224, 15 (1992).
112. W. K. Maas, Microbiol. Rev. 58, 631 (1994).
113. T. Kodadek, Chem. Biol. 2, 267 (1995).
114. M. D. Ditto, D. Roberts and R. A. Weisberg, J. Bact. 176, 3738 (1994).
115. M. B. Schmid, Cell 63, 451 (1990).
116. P. Hubner and W. Arber, EMBO J. 8, 577 (1989).
117. S. E. Finkel and R. C. Johnson, Mol. Microbiol. 6, 3257 (1992).
118. D. A. Willins, C. W. Ryan, J. V. Platko and J. M. Calvo, JBC 266, 10768 (1991).
119. E. B. Newman, R. D’Ari and R. T. Lin, Cell 68, 617 (1992).
120. T. Horii et al., Cell 27, 515 (1981).
121. R. Brent, Biochimie 64, 565 (1982).
122. K. Tilly, J. Erickson, S. Sharma and C. Georgopoulos, J. Bact. 168, 1155 (1986).
123. G. W. Huisman and R. Kolter, Science 265, 537 (1994).
124. C. A. Ball, R. Osuna, K. C. Ferguson and R. C. Johnson, J. Bact. 174, 8043 (1992).
125. O. Ninnemann, C. Koch and R. Kahmann, EMBO J. 11, 1075 (1992).
126. C. Ueguchi, M. Kakeda and T. Mizuno, MGG 236, 171 (1993).
127. J. Wu, W. R. Dunham and B. Weiss, JBC 270, 10323 (1995).
128. E. Hidalgo and B. Demple, EMBO J. 13, 138 (1994).
129. T. Nunoshiba, E. Hidalgo, C. F. Amabile Cuevas and B. Demple, J. Bact. 174, 6054 (1992).
130. T. Nunoshiba, E. Hidalgo, Z. Li and B. Demple, J. Bact. 175, 7492 (1993).
131. J. Wu and B. Weiss, J. Bact. 173, 2864 (1991).
132. J. Ricard and J. Buc, EJB 176, 103 (1988).
133. U. Fiedler and V. Weiss, EMBO J. 14, 3696 (1995).
134. D. Wilson, G. Sheng, T. Lecuit, N. Dostatni and C. Desplan, Genes Dev. 7, 2120 (1993).
135. T. J. Wilson, P. Maroudas, G. J. Howlett and B. E. Davidson, JMB 238, 309 (1994).
136. S. Wagner and M. R. Green, Science 262, 395 (1993).
137. G. M. Adams and R. M. Blumenthal, Gene 157, 193 (1995).
138. K. Liberek, D. Wall and C. Georgopoulos, PNAS 92, 6224 (1995).
139. S. R. Maloy, in “Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1513. American Society for Microbiology, Washington, DC, 1987.
140. P. Ostrovsky de Spicer and S. Maloy, PNAS 90, 4295 (1993).
141. A. M. Muro-Pastor and S. Maloy, JBC 270, 9819 (1995).
142. M. A. Savageau, “Biochemical Systems Analysis. A Study of Function and Design in Molecular Biology.” Advanced Book Program, Addison-Wesley, Reading, MA, 1976.
143. A. Jobe and S. Bourgeois, JMB 69, 397 (1972).
144. S. P. Allen, J. O. Polazzi, J. K. Gierse and A. M. Easton, J. Bact. 174, 6938 (1992).
145. A. Matin, Mol. Microbiol. 5, 3 (1991).
146. H. H. McAdams and L. Shapiro, Science 269, 650 (1995).
147. B. L. Wanner, in “Two-component Signal Transduction” (J. A. Hoch and T. J. Silhavy, eds.), p. 203. ASM Press, Washington, DC, 1995.
148. M. Everett, T. Walsh, G. Guay and P. Bennett, Microbiology 141, 419 (1995).
149. A. J. Ninfa et al., PNAS 85, 5492 (1988).
150. T. Oshima, K. Ito, H. Kabayama and Y. Nakamura, MGG 247, 521 (1995).
151. M. Freundlich, N. Ramani, E. Mathew, A. Sirko and P. Tsui, Mol. Microbiol. 6, 2557 (1992).
152. S. G. Sedgwick and P. A. Goodwin, Mutat. Res. 145, 103 (1985).
153. A. R. Fernandez de Henestrosa, S. Calero and J. Barbe, MGG 226, 503 (1991).
154. J. K. Galbraith, in “Years of the Modern” (J. W. Chase, ed.). Longmans and Green, New York, 1949.
155. M. J. Casadaban, JMB 104, 541 (1976).
156. M. J. Casadaban, JMB 104, 557 (1976).
157. T. J. Silhavy and J. R. Beckwith, Microbiol. Rev. 49, 398 (1985).
158. P. Bassford et al., in “The Operon” (J. H. Miller and W. S. Reznikoff, eds.), p. 245. CSH Lab, CSH, NY, 1978.
159. M. J. Casadaban and S. N. Cohen, JMB 138, 179 (1980).
160. A. I. Bukhari, J. A. Shapiro and S. L. Adhya, eds., “DNA Insertion Elements, Plasmids, and Episomes.” CSH Lab, CSH, NY, 1977.
161.
M. Nomura and F. Engbaek, PNAS 69, 1526 (1972). 162. T. J. Close and R. L. Rodriguez, Gene 20, 305 (1982). 263. K. McKenny et al., in “Gene Amplification and Analysis” (J. S. Chirikjianand T. S. Papas, eds.), p. 384. Elsevier, New York, 1981. 164. J. Engelbrecht, M. Simon and M. Silverman, Science 927, 1345 (1985). 165. A. J. Palomares, M. A. DeLuca and D. R. Helinski, Gene 81,55 (1989). 166. C. Manoil, J. J. Mekalanos and J. Beckwith, J . Bact. 172, 515 (1990). 167. R. A. Jefferson, S. M. Burgess and D. Hirsh, PNAS 83, 8447 (1986). 168. R. 0. Baldwin et al., Bchem 23, 3663 (1984). 169. W. H. R. Langridge, A. Escher, C. Koncz, J. Schell and A. A. Szalay, Technique 3, 91 (1991). 170. A. I. Derman and J. Beckwith, J . Bact. 177,3764 (1995). 171. A. Camilli, D. T. Beattie and J. J. Mekalanos, PNAS 91, 2634 (1994). 172. M. J. Mahan, J. M. Slauch and J. J. Mekalanos, J. Bact. 175, 7086 (1993). 173. R. Lin, R. D’Ari and E. B. Newman, J . Bact. 174, 1948 (1992). 174. J. Casadesus and J. R. h t h , MGG 216, 204 (1989). 175. M. Singer et al., Microbid. Rev. 53, 1 (1989).
GLOBAL GENE REGULATION
83
176. W. W. Metcalf, P. M. Steed and B. L. Wanner, J . Bact. 172, 3191 (1990). 177. R. N. Roy, S. Mukhopahyay, L. I. C. Wei and H. E. Schellhorn, NARes 23,3076 (1995). 178. E. Tchetina and E. B. Newman, J. Bact. 177, 2679 (1995). 179. H. Ochman, A. S. Gerber and D. L. Hartl, Genetics 120, 621 (1988). 180. G. G. Kneale, ed., “DNA-protein Interactions,” Vol. 30. Humana Press, Totawa, NJ, 1994. 181. A. J. Forsberg, G. D. Pavitt and C. F. Higgins, J. Bact. 176, 2128 (1994). 182. R. Lange and R. Hengge-Aronis, Genes Deu. 8, 1600 (1994). 183. A. S. Kamath-Loeb and C. A. Gross, J . Bact. 173, 3904 (1991). 184. H. Nagai, H. Yzawa and T. Yura, PNAS 88, 10515 (1991). 185. T. Linn and R. St. Pierre,]. Bact. 172, 1077 (1990). 186. P. G. Lemaux, S. L. Herendeen, P. L. Bloch and F. C. Neidhardt, Cell 13, 427 (1978). 187. F. C. Neidhardt, R. A. VanBogelen and E. T. Lau, 1. Bact. 153, 597 (1983). 188. S. Pederson, P. L. Bloch, S. Reeh and F. C. Neidhardt, Cell 14, 179 (1978). 189. R. A. VanBogelen, P. Sankar, R. L. Clark, J. A. Bogan and F. C. Neidhardt, Electrophoresis 13, 1014 (1992). 190. R. A. VanBogelen, K. Z. Abshire, A. Perselides, R. L. Clark and F. C. Neidhardt, in Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.). 2nd ed. American Society for Microbiology, Washington, DC, 1996 (in press). 191. B. R. Emsting, M. R. Atkinson, A. J. Ninfa and R. G. Matthews, J . Bact. 174,1109 (1992). 192. P. Z. O’Farrell, M. M. Goodman and P. H. O’Farrell, Cell 12, 1133 (1977). 193. A. Gorg, G. Bogutb, C. Obermaier, A. Posch and W. Weiss, Electrophoresis 16, 1079 (1995). 194. R. A. VanBogelen and F. C. Neidhardt, FEMS Microbiol. Ecol. 74, 121 (1990). 195. P. Matsudaira, JBC 262, 10035 (1987). 196. S. Tabata et al., J. Bact. 171, 1214 (1989). 197. T. Nystrom and F. C. Neidhardt, Mol. Microbiol. 6, 3187 (1992). 198. A. Noda, BioTechniques 10, 475 (1991). 199. K. R. Clauser et al., PNAS 92, 5072 (1995). 200. S.-E. Chuang, D. L. 
Daniels and F. R. Blattner, J . Bact. 175, 2026 (1993). 201. S.-E. Chuang and F. R. Blattner, J. Bact. 175, 5242 (1993). 202. R. Hengge-Aronis, CeU 72, 165 (1993). 203. R. Lange and R. Hengge-Aronis, Mol. Microbiol. 5, 49 (1991). 204. M. R. Mulvey, J. Switala, A. Borys and P. C. Loewen, J. Bact. 172, 6713 (1990). 205. D. Weichart, R. Lange, N. Henneberg and R. Hengge-Aronis, Mol. Microbiol. 10,407 (1993). 206. H. Towbin, T. Staehelin and J. Gordon, PNAS 76, 4350 (1979). 207. M. Ptashne, “A Genetic Switch: Phage A and Higher Organisms.” Cells Press & Blackwell, Cambridge, MA, 1992. 208. H. Schutt and H. Holzer, EJB 26, 68 (1972). 208a. S. C. Quay, T. E. Dick and D. L. Oxender, J. Bact. 129, 1257 (1977). 209. D. M. Haggerty, M. P. Oeschger and R. F. Schleif,J. Bact. 135, 775 (1978). 210. P. R. Jensen, H. V. Westerhoff and 0. Michelsen, EJB 211, 181 (1993). 211. B. R. Emsting, J. W. Denninger, R. M. Blumenthal and R. G. Matthews, J. Bact. 175, 7160 (1993). 212. I. C. Blomfield, P. J. Calie, K. J. Eberhardt, M. S. McClainandB. I. Eisenstein,]. Bact. 175, 27 (1993). 213. S. P. Bhagwat, M. R. Rice, R. G. Matthews and R. M. Blumenthal, J. Cell Biochem. 19A, 99 (Abst. A2-109) (1995). 214. I. L. Cartwright and S. E. Kelly, BioTechniques 11, 188 (1991).
84
ROBERT M. BLUMENTHAL ET AL.
P. E. Nielsen, BioEssoys 11, 152 (1989). I. Fishov, A. Zaritsky and N. B. Grover, Mol. Microbwl. 15, 789 (1995). C. H. Wang and A; L. Koch, J. Bact. 136, 969 (1978). A. G. Degermendzhy, V. V. Adamovich and V. A. Adamovich, J . Gen. Microbiol. 139, 2027 (1993). 219. R. Landick et al., Cell 38, 175 (1984). 220. J. W. Erickson and C. A. Gross, Genes Deu. 3, 1462 (1989). 221. S. Raina, D. Missiakas and C. Georgopoulos, E M B O ] . 14, 1043 (1995). 222. T.P. Hunt and B. Magasanik, PNAS 82, 8453 (1985). 223. L. H. Nguyen, D. B. Jensen, N. E. Thompson, D. R. Gentry and R. R. Burgess, Bchem 32, 11112 (1993). 224. N. Fujita, T.Nomura and A. Ishihama, JBC 262, 1855 (1987). 225. S. Skelly, T. Coleman, C.-F. Fu, N. Brot and H. Weissbach, PNAS 84, 8365 (1987). 226. S. Cooper and T. Ruettinger, MGG 139, 167 (1975). 227. F. C . Neidhardt and R. A. VanBogelen, BBRC 100, 894 (1981). 228. T.Yamamori and T. Yura, PNAS 79, 860 (1982). 229. A. D. Grossman, J. W. Erickson and C. A. Gross, Cell 38, 383 (1984). 230. A. D. Grossman, D. B. Straus, W. A. WalterandC. A. Gross, GenesDeu. 1, 179(1987). 231. D. B. Straus, W. A. Walter and C. A. Gross, Nature 329, 348 (1987). 232. J. W. Erickson, V. Vaughn, W. A. Walter, F. C. Neidhardt and C. A. Gross, Genes Deu. 1, 419 (1987). 233. K. Liberek and C. Georgopoulos, PNAS 90, 11019 (1993). 234. E. A. Craig and C. A. Gross, T1BS 16, 135 (1991). 235. K. Tilly, R. A. VanBogelen, C. Georgopoulis, and F. C. Neidhardt, J. Bact. 154, 1505 (1983). 236. C. L. Squires, S. Pedersen, B. M. Ross and C. Squires, J. Bact. 173, 4254 (1991). 237. F. C. Neidhardt, P. L. Bloch, S. Pedersen and S. Reeh, J. Bact. 129, 378 (1977). 238. R. M. Blumenthal, P. G. Lemaux, F. C. Neidhardt and P. P. Dennis, MGG 146, 291 (1976). 239. S. L. Herendeen, R. A. VanBogelen and F. C. Neidhardt, J. Bact. 139, 185 (1979). 240. M. Kitagawa, C. Wada, S. Yoshioka and T. Yura, J. Bact. 173, 4247 (1991). 241. F. C. Neidhardt, R. A. VanBogelen and V. Vaughn, ARGen 18, 295 (1984). 242. K. 
Nadeau, A. Das, and C. T. Walsh, JBC 268, 1479 (1993). 243. H. E. Kroh and L. D. Simon,]. B u t . 172, 6026 (1990). 244. D. T. Chin, S. A. Goff, T Webster, T. Smithand A. L. Goldberg, JBC 263, 11718(1988). 245. S. E. Chuang, V. Burland, P. G., D. L. Daniels and F. R. Blattner, Gene 134, 1 (1993). 246. B. Lipinska, J. King, D. Ang and C. Georgopoulos, NARes 16, 7545 (1988). 247. B. Magasanik and F. Neidhardt, in Eschrichiu cok and Salmonella typhimurium: Cellular and Molecular Biology” (F. C. Neidhardt et al., eds.), p. 1318. American Society for Microbiology, Washington, DC, 1987. 248. L. J. Reitzer and B. Magasanik, in Escherichia coli and Salmonella typhimurium (F. C. Neidhardt et al., eds.), p, 303. American Society for Microbiology, Washington, DC, 1987. 249. A. J. Ninfa, M. R. Atkinson, E. S. Kamberov, J. Feng and E. 6. Ninfa, in ”Two-component Signal Transduction” (J. A. Hoch and T. J. Silhavy, eds.), p. 67. ASM Press, Washington, DC, 1995. 250. S. Kahane, R. Levitz and Y. S. Halpern, J. Bact. 135, 295 (1978). 251. F. Claverie-Martin and B. Magasanik, PNAS 88, 1631 (1991). 252. L. Andersen, M. Kilstrup and J. Neuhard, Arch. Microbiol. 152, 115 (1989). 253. E. Shaibe, E. Metzer and Y. S. Halpern, J. Bact. 163, 938 (1985).
215. 216. 217. 218.
GLOBAL GENE REGULATION
85
254. E. R. Stadtman and A. Ginsburg, in “The Enzymes” (P. D. Boyer, ed.), p. 755. Academic Press, New York, 1974. 255. B. Magasanik, TIBS 13, 475 (1988). 256. A. Ninfa and B. Magasanik, PNAS 83, 5909 (1986). 257. M. R. Atkinson, E. S. Kamberov, R. L. Weiss and A. J. Ninfa, JBC 269, 28288 (1994). 258. J. Feng et al., J. Bact. 174, 6061 (1992). 259. J. Keener and S. Kustu, PNAS 85, 4976 (1988). 260. V. Weiss and B. Magasanik, PNAS 85, 8919 (1988). 261. G. F.-L. Ames and K. Nikaido, E M B O J . 4, 539 (1985). 262. L. J. Rietzer and B. Magasanik, Cell 45, 785 (1986). 263. A. Wedel, D. S. Weiss, D. Popham, P. Droge and S. Kustu, Science 248, 486 (1990). 264. W. Su, S. Porter, S. Kustu and H. Echols, PNAS 87, 5504 (1990). 265. D. L. Popham, D. Szeto, J. Keener and S. Kustu, Science 243, 629 (1989). 266. V. Weiss, F. Claverie-Martin and B. Magasanik, PNAS 89, 5088 (1992). 267. D. S. Weiss, J. Batut, K. E. Klose, J. Keener and S. Kustu, Cell 67, 155 (1991). 268. S. C. Porter, A. K. North, A. B. Wedel and S. Kustu, Genes Deu. 7, 2258 (1993). 269. E. R. Stadtman, E. Mura, P. B. Chock and S. G. Rhee, in “Glutamine: Metabolism, Enzymology, and Regulation” (J. Mora and R. Palacios, eds.), p. 41. Academic Press, New York, 1980. 270. E. S. Kamberov, M. R. Atkinson and A. J. Ninfa, J B C 270, 17797 (1995). 271. S. Kustu, J. Hirschman, D. Burton, J. Jeleski and J. C. Meks, MGG 197, 309 (1984). 272. A. B. Pardee and L. S. Prestidge, J. B a t . 70, 667 (1955). 273. J. Fraser and E. B. Newman, J. Bact. 122, 810 (1975). 274. S. C. Quay and D. L. Oxender, “Regulation of Membrane Transport.” Plenum, New York, 1980. 275. R. T. Lin, R. D’Ari and E. B. Newman, J . B a t . 172, 4529 (1990). 276. J. H. Rex, B. D. Aronson and R. L. Somerville, J. B a t . 173, 5944 (1991). 277. I. N . Hirshfield, F.-M. Yeh and L. E. Sawyer, PNAS 72, 1364 (1975). 278. M. DeFelice, C. T. Lago, C. H. Squires and J. M. Calvo, Ann. Microbial. Paris 133A, 251 (1982). 279. C. H . Squires, M. DeFelice, S. R. 
Wessler and J. M. Calvo, J . Bact. 147, 797 (1981). 280. J. J. Anderson and D. L. Oxender, J . B a t . 130, 384 (1977). 281. J. C. Andrews, T. C. Blevins and S. A. Short, J . Bact. 165, 428 (1986). 282. J. J. Anderson, S. C. Quay and D. L. Oxender, J. Bact. 126, 80 (1976). 283. J. V. Platko, D. A. Willins and J. M. Calvo, J. B a t . 172, 4563 (1990). 284. H. Lin et ul., J. Bact. 174, 2779 (1992). 285. E. A. Austin, J. C. Andrews and S. A. Short, “Abstracts: Molecular Genetics of Bacteriophage,” 11. 153. CSH Lab, CSH, NY, 1989. 286. S. A. Haney, J. V. Platko, D. L. Oxender and J. M. Calvo, J. B Q C ~174, . 108 (1992). 287. M. Ferrario e t al., J. Bact. 177, 103 (1995). 288. E. R. Stadtman e t al., Adti. Enzyme Regul. 8, 99 (1970). 289. D. E. Wiese, 11, B. R. Ernsting, R. M. Blumenthal and R. G. Matthews, J. Cell. Biochem. 19A, (1995). 290. G. Pahel, A. D. Zelenetz and B. M. Tyler, J. Bact. 133, 139 (1978). 291. R. Marasco et al., J. Bact. 176, 5197 (1994). 292. S. C. Quay, T. E. Dick and D. L. Oxender, J . Bact. 129, 1257 (1977). 293. J. Carey, PNAS 85, 975 (1988). 294. M. Brenowitz, D. F. Senear, M. A. Shea and G. K. Ackers, PNAS 83,8462 (1986). 295. D. F. Senear, M. Brenowitz, M. A. Shea and G. K. Ackers, Bchent 25, 7344 (1986).
86
ROBERT M. BLUMENTHAL ET AL.
296. D. W. Borst, R. M. BlumenthalandR. G. Matthews,J. Cell. Biochem. 19A, 100(Abstr. 112-112)(1995). 297. D. A. Willins and J. M. Calvo, J. Bact. 174, 7648 (1992). 298. C. F. Higgins, C. J. Dorman and N. Ni Bhriain, in “The Bacterial Chromosome” (K. Orlica and M. Riley, eds.), p. 421. American Society for Microbiology, Washington, DC, 1990. 299. E. Richet and 0. Raibaud, J M B 218,529 (1991). 300. C. F. Higgins et al., Cell 52, 569 (1988). 301. S. Cayley, B. A. Lewis, H. J. Guttman and M. T.Record, Jr., JMB 222, 281 (1991). 302. Y. Chen, Y. W. Ebright and R. H. Ebright, Science 265, 90 (1994). 303. P. S. Pendergrast, Y. Chen, Y. W. Ebright, and R. H. Ebright, PNAS 89, 10287 (1992). 304. R. D’Ari, R. T.Lin and E. B. Newman, TIBS 18, 260 (1993).
Eukaryotic Nuclear RNase P: Structures and Functions

JOEL R. CHAMBERLAIN,* ANTHONY J. TRANGUCH,† EILEEN PAGAN-RAMOS† AND DAVID R. ENGELKE*,†,1

*Program in Cellular and Molecular Biology
†Department of Biological Chemistry
The University of Michigan Medical School
Ann Arbor, Michigan 48109
I. Ribonuclease P
   A. Enzyme Structure
   B. Substrate Recognition
   C. Catalytic Mechanism
II. Yeast Nuclear RNase-P RNA
   A. Structure and Expression of the RPR1 Gene
   B. Phylogenetic Studies
   C. Solution Structure An...
III. Analysis of Mutations in RPR1
   A. RPR1 Mature Domain Replacements
   B. Randomization Mutagenesis of Universally Conserved Positions
   C. A Subdomain Involved in Catalysis
   D. An RNase-P Mutation Affects rRNA Processing
References
Over the past decade, ribonucleic acids have gained recognition as the catalysts for an increasing number of biochemical reactions (1, 2). Examples of the types of reactions performed by catalytic RNAs, also known as “ribozymes,” include RNA self-splicing (e.g., group I and group II introns), RNA self-cleavage (e.g., hammerhead, hairpin, hepatitis delta, Neurospora VS, and tRNA ribozymes), and tRNA 5′ end maturation (e.g., RNase P). With the exception of RNase P, most ribozymes act intramolecularly in vivo. Recognition and alignment of the appropriate cleavage site proceed through Watson-Crick base-pairing between regions of complementary sequence within the same RNA chain. In contrast, the function of RNase P in vivo is to bind and cleave all of the various tRNA precursors (pre-tRNAs). To accomplish this in the absence of base-pairing, common determinants in the three-dimensional structures of the substrates become important. Because of the complex mode of substrate recognition by RNase P, a knowledge of the tertiary structures of both the catalytic and substrate RNAs, as well as an understanding of how these structured RNAs interact, is necessary for determination of its catalytic mechanism.

1 To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 55
Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. Ribonuclease P
A. Enzyme Structure

Ribonuclease P (RNase P; EC 3.1.26.5) is a ribonucleoprotein enzyme that endonucleolytically cleaves precursor tRNA (pre-tRNA) molecules, generating mature 5′ termini (3). The enzyme is essential, well-conserved functionally, and has been found in every organism tested. In all but one case (4), RNAs copurify with RNase-P activity, including examples from eubacteria (5, 6), archaebacteria (7, 8), fungi (9, 10), vertebrates (11-13), and mitochondria (14, 15). Although RNase-P activity has been observed in these phylogenetically diverse sources, the composition and features of these enzymes appear to differ (16).

RNase P has been studied most extensively in eubacteria, particularly in the organisms Escherichia coli (5, 17, 18) and Bacillus subtilis (6, 19). The eubacterial holoenzyme is composed of a single RNA subunit (~400 nt) and a single 14-kDa protein subunit. Temperature-sensitive mutations in each of these subunits have shown that both components are required for in vivo activity (20), but, under elevated Mg2+ and monovalent salt conditions, the eubacterial RNA component alone can catalyze its reaction in vitro (21, 22). Historically, this provided one of the first two examples of catalysis by RNA (21, 23).

Secondary-structure models for the catalytic RNA subunit from E. coli RNase P have been proposed, based on minimum energy calculations (5) and structure probing with a variety of enzymatic (24) and chemical (25, 26) reagents. However, many structures are equally consistent with the structure probing data. Phylogenetic comparative analysis has proved to be the more powerful approach to inferring the secondary structure of the RNase-P RNAs (27). This type of analysis is based on the theory that homologous functions imply homologous structures (28, 29). If this is applicable, RNase-P RNAs from diverse species should have similar higher order structures.
In those regions of an RNA molecule where the structure rather than the primary sequence is important, compensatory base changes that maintain base-pairing potential are observed in homologous RNAs. Furthermore, structural inferences from phylogenetic analysis already take into account all functional aspects of the cellular environment (e.g., protein partners). This comparative approach has revealed a conserved core of secondary structure elements for eubacterial RNase-P RNA (30), even though some of these RNAs share less than 50% sequence identity. The secondary structure models for the eubacterial RNAs continue to be refined as more phylogenetically variable sequences are added to the database (31, 32).

Despite the demonstration of catalysis mediated by RNA alone, in conjunction with the low percentage of protein (~10% by density) in the holoenzyme (19, 33), the eubacterial protein subunit was found to be essential for RNase-P activity in vivo. Because addition of either high monovalent salt or the eubacterial protein increased kcat by facilitating product release, it was proposed that the protein is likely to assist with electrostatic shielding between the intermolecular phosphate backbones (34-36). The protein may confer a better electronic and/or structural environment for catalysis, although additional roles within the cell have not been ruled out (e.g., intracellular localization, 37). Comparisons of the eubacterial RNase-P proteins revealed little overall sequence homology (<25%), with conservation of basic and aromatic residues at only a few positions (31, 38). This was unexpected considering that these proteins can form functional holoenzymes with most of the eubacterial RNase-P RNAs in vitro (21) and in vivo (39, 40). Structures for the protein subunits have not yet been proposed, nor has the structure of the eubacterial holoenzyme been elucidated, although footprinting analysis has suggested some regions of interaction between the RNA and protein moieties (41, 42).
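The compensatory-base-change criterion behind phylogenetic comparative analysis can be illustrated with a minimal sketch. It scans an alignment for pairs of columns that remain pairable in every sequence while the identity of the pair changes. The function name `covarying_columns`, the `min_sep` parameter, the decision to count G-U wobble pairs, and the toy alignment are all assumptions made for illustration; they are not taken from the studies cited above, and real analyses use far larger alignments and statistical scoring.

```python
from itertools import combinations

# Watson-Crick pairs plus the G-U wobble pair (counting wobble is an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def covarying_columns(alignment, min_sep=4):
    """Return (i, j, pairs) for column pairs that are pairable in every
    sequence and use at least two different base pairs across the alignment.

    min_sep is the minimum distance between paired columns; it excludes
    neighbours that could not close a hairpin loop.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i, j in combinations(range(length), 2):
        if j - i < min_sep:
            continue
        observed = {(seq[i], seq[j]) for seq in alignment}
        # Every sequence must pair, and the pair must vary between sequences.
        if observed <= PAIRS and len(observed) >= 2:
            hits.append((i, j, sorted(observed)))
    return hits


# Hypothetical toy alignment (not real RNase-P RNA sequences).
toy = [
    "GCAAAAAGC",
    "CGAAAAACG",
    "AUAAAAAAU",
]
for i, j, pairs in covarying_columns(toy):
    print(i, j, pairs)
```

In the toy alignment, columns 0 and 8 and columns 1 and 7 change together while staying complementary, which is the pattern that supports a helix. An invariant pair would be pairable too, but it carries no comparative evidence, which is why the sketch requires at least two distinct pairs.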
In contrast to the situation for the eubacterial RNase P, the RNA components from the archaebacterial and eukaryotic enzymes have not exhibited catalysis that is independent of protein. In fact, it was originally disputed whether these RNAs were functionally or structurally homologous to their eubacterial counterparts. Although the archaebacterial RNAs were later shown to conform to the phylogenetically derived eubacterial structure, it remains uncertain why these RNAs do not catalyze the RNase-P reaction in vitro (8). Nuclear RNase-P RNAs were previously proposed to contain some regions of structural homology to the eubacterial RNAs (43), although sequence variability among the small number of eukaryotic RNAs available at that time was so great that sequence alignment and structure prediction were quite tentative (9-11). Distinct RNase-P activities have also been found in eukaryotic mitochondria (14, 15, 44, 45) and chloroplasts (4), organelles that encode their own unique set of tRNAs. Unfortunately, it has been difficult to derive an unequivocal structure for the mitochondrial RNase-P RNA subunits due to their extremely high (>85%) A + U sequence content (46), and it is currently unclear whether RNase P from chloroplasts contains an RNA subunit at all (4). The apparent lack of an RNA component should be viewed with caution, because there have been previous examples of nuclease-insensitive RNase-P holoenzymes that were subsequently shown to contain essential RNA subunits (7, 12).

Whereas the eubacterial protein subunit comprises only a small percentage of the RNase-P holoenzyme, protein accounts for ~50% (by density) of the eukaryotic holoenzymes (12, 13, 26, 47, 48) and at least one of the archaebacterial holoenzymes (7). To date, relatively little is known about the protein structures of noneubacterial RNase Ps. One protein that has been isolated and cloned, and whose gene has been sequenced, is a single, ~100-kDa polypeptide from Saccharomyces cerevisiae mitochondrial RNase P (49, 50). This protein is considerably larger than the eubacterial subunit (14 kDa) and displays no obvious sequence homology. Other evidence from yeast also supports the notion of a large protein subunit as a component of the nuclear holoenzyme. A 100-kDa protein has been identified that copurifies with nuclear RNase-P activity of Schizosaccharomyces pombe (51), although further characterization of this protein has not been reported. The gene for a 100-kDa protein from S. cerevisiae has been cloned and sequenced. This gene, POP1, might encode a subunit common to both nuclear RNase P and RNase MRP (a related ribonuclease involved in rRNA processing). A temperature-sensitive pop1 allele displays defects in both tRNA and rRNA processing, and antibodies directed against a protein fusion product of Pop1p immunoprecipitate both RNase-P and RNase-MRP RNAs (RPR1 and NME1 RNAs) (52). Coimmunoprecipitation of RNase-P and RNase-MRP RNAs is consistent with previous reports that sera from autoimmune patients can immunoprecipitate both RNase-P and RNase-MRP RNAs from human cells (53).
Considering the lack of catalysis mediated by RNA alone and the increase in the relative percentage of the protein moiety (~50% vs. ~10%) for the eukaryotic and archaeal RNase Ps, it is possible that the noneubacterial RNase-P protein subunits may have assumed some or all of this enzyme’s substrate binding and catalytic roles. However, it has been demonstrated biochemically (9, 10, 44) and genetically (54-56) that these RNA components are functionally indispensable, making it equally plausible that these protein subunit(s) might play a more vital role in stabilizing the proper structure of the catalytically competent RNA. Apart from RNA subunit stabilization, the protein components may be important for functions unique to higher organisms (e.g., subcellular localization). Because activity has not been reconstituted from separate RNA and protein components of the noneubacterial RNase Ps (J.-Y. Lee, unpublished observation), any meaningful structural analysis of noneubacterial RNase-P RNAs must be carried out in the context of the holoenzyme.
B. Substrate Recognition

The most common natural substrates for RNase P are pre-tRNA molecules, although RNase P has also been shown to cleave pre-RNAs that are structurally analogous to pre-tRNAs, including 4.5-S RNA precursor (57, 58), 10Sa RNA (of unknown function) (59, 60), an mRNA from E. coli (61), and certain viral RNAs (62-64). This section focuses on the binding determinants of pre-tRNAs and their interactions with RNase P.

Within a given organism, there can be 30-150 different tRNA species (65), and RNase P recognizes and cleaves all of them. Beyond recognition of the organism’s own tRNAs, RNase P from any given organism generally can process pre-tRNA substrates from other species (10, 21, 66, 67). This prompted a search for sequence and/or structural elements common to pre-tRNAs that would confer their ability to be recognized by RNase P. The most logical place to look for these binding determinants was in regions of conserved sequences in the substrate and RNase-P RNAs that might be complementary to one another. Standard base-pairing is commonly used by other ribozymes for specific substrate recognition. Currently, there are almost 2000 tRNA sequences known from >200 organisms and organelles (68). Most of these tRNAs exhibit widely divergent primary sequences, yet can be folded into the cloverleaf model of secondary structure as well as the L-shaped model of tertiary structure (Fig. 1). Comparison of tRNA sequences within the context of the cloverleaf structure revealed extreme sequence heterogeneity, including positions surrounding the RNase-P cleavage site (located at the 5′ end of the mature tRNA shown in Fig. 1). However, two short regions of contiguous nucleotides are conserved among the E. coli tRNAs: GTΨCR in the T loop, and CCA at the mature 3′ terminus.
These sequences influence binding to eubacterial RNase-P RNAs (69-72) but are not essential, leaving no sequences in the tRNA molecules absolutely required for RNase-P interaction. Similar to the primary sequence divergence observed among tRNAs, comparison of the eubacterial RNase-P RNAs revealed only a few dispersed patches of short, contiguous sequence conservation (see Section II,B). Surprisingly, it was possible to remove any individual residue in these catalytic RNAs without completely abolishing the ability to bind and cleave substrate, as judged from a collection of deletion mutants from E. coli (73, 74) and B. subtilis (75, 76) RNase-P RNAs. This suggested that the elements that comprise the substrate binding pocket and catalytic cleft are dispersed throughout the primary sequence of an RNase-P RNA. Together, these observations of sequence heterogeneity and dispensability of conserved regions in both substrate and RNase-P RNAs undermine the hypothesis of intermolecular base-pairing as the main mode of recognition. It is more probable that tertiary structural determinants are responsible for the interaction of enzyme and substrate.

FIG. 1. Secondary and tertiary structure of tRNA. (Left) Invariant nucleotides in the cloverleaf structure and base-pairing (dashes) are indicated. (Right) The L-shaped model of tertiary structure. [Reprinted from Söll (65), with permission.]

As mentioned above, tRNAs have a common three-dimensional structure, deduced from studies of mature tRNAs (77-80). Structural studies of pre-tRNAs also suggest that the “mature” portion of these molecules displays approximately the same tertiary conformation as that of mature tRNAs (81). In support of the significance of the tertiary structure of the mature region for recognition of substrates, RNase-P binding studies using mature tRNAs as competitive inhibitors of pre-tRNAs revealed that Ki ≈ Km (34). This suggested that most of the binding energy comes from the “mature” portion of tRNAs.

It seems reasonable that RNase P must recognize certain conserved tertiary features of the folded tRNA structure, and many of the data support this hypothesis. In general, mutations in a tRNA that disrupt its tertiary structure lead to a decrease in its ability to be cleaved by RNase P (82, 83). Similarly, mutations in RNase-P RNA that have the greatest effects on its tertiary structure also result in the most significant impairments in substrate binding and catalysis (84-87). These observations support the notion that recognition by RNase P is a complex interaction that must involve the tertiary structures of both substrate and catalytic RNAs.

The question concerning tRNA structure recognition now becomes a question of tertiary determinant recognition. A variety of experimental approaches have been taken to address this issue. Mutational and deletion studies reveal that the coaxial stack formed by the acceptor and T-stems is the only part of the tRNA consensus structure that is absolutely required for recognition by the eubacterial holoenzyme (69, 72, 88-94). Although it has been suggested that the eukaryotic holoenzymes might require additional determinants such as the D-stem (95) or the 5′ leader (96), it is unclear whether these other elements are directly involved in recognition. Chemical modification studies have identified nucleotide positions located in the T-stem/loop and acceptor stem that might be involved in close contacts with RNase P (97-100). Conversely, cross-linking studies using modified tRNA substrates have designated nucleotides within eubacterial RNase-P RNAs that are located in the vicinity of various portions of the tRNA (67, 101); however, due to the presence of the 9-Å cross-linking arm, residues of the substrate binding pocket have not been defined with certainty. Nucleotide base-pairing and chemistry in pre-tRNAs near the cleavage site affect the fidelity of cleavage-site selection (93, 94, 102-105). Attempts to arrive at a simple formula for cleavage site determination, such as measuring the length of the coaxially stacked acceptor and T-stems, have been unsuccessful.

In summary, RNase P recognizes conserved tertiary structure determinants in its pre-tRNA substrates, rather than aligning the cleavage site through standard base-pairing.
Many features of the complex three-dimensional architecture of the substrate, mainly located in its T-stem/loop and acceptor stem, contribute to binding and alignment of the cleavage site by RNase P, although the precise contacts have not been elucidated. It should be noted, however, that individual details of this architecture may vary in importance for different pre-tRNA substrates.
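The competitive-inhibition comparison mentioned in this section, in which mature tRNA competes with pre-tRNA for the enzyme, can be illustrated with the textbook Michaelis-Menten rate law for a competitive inhibitor. The sketch below is only an illustration under that standard model: the function name, parameter names, and numbers are hypothetical and are not taken from the cited study (34).

```python
def cleavage_rate(substrate, vmax, km, inhibitor=0.0, ki=float("inf")):
    """Michaelis-Menten rate with a competitive inhibitor.

    A competitive inhibitor raises the apparent Km by the factor
    (1 + [I]/Ki) and leaves Vmax unchanged. All concentrations must
    share the same (arbitrary) units. Names here are illustrative.
    """
    km_apparent = km * (1.0 + inhibitor / ki)
    return vmax * substrate / (km_apparent + substrate)


# Hypothetical numbers: if Ki equals Km, adding inhibitor at a
# concentration equal to Ki doubles the apparent Km.
uninhibited = cleavage_rate(substrate=1.0, vmax=1.0, km=1.0)
inhibited = cleavage_rate(substrate=1.0, vmax=1.0, km=1.0, inhibitor=1.0, ki=1.0)
```

Under this model, an inhibition constant for mature tRNA that is comparable to the Km for pre-tRNA is what one would expect if the leader contributes little to binding, which is the inference drawn in the text.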
C. Catalytic Mechanism
RNase-P cleavage generates the mature 5’ terminus of tRNAs. Cleavage of the substrate leaves a 5’ phosphate and a 3’ hydroxyl (17, 106) and, in this respect, is similar to the RNA-catalyzed reactions of group I and group II self-splicing introns (cf. 2’,3’ cyclic phosphate species generated by self-cleaving ribozymes). In contrast to the catalytic mechanism of self-splicing introns, the RNase-P cleavage reaction proceeds through hydrolysis, rather than transesterification, and forms no covalently linked intermediate species. Similar to other known ribozymes, RNase P is a metalloenzyme with an
absolute requirement for divalent cations (106). Mg2+ is the most efficient divalent cofactor, but Mn2+ or Ca2+ can also promote the reaction (34, 107-109). A photocross-linking assay has been developed that covalently links RNase-P RNA-tRNA complexes and can distinguish substrate binding from catalysis (34, 67). By use of this assay, the role of Mg2+ was determined to be catalytic, not structural, because high concentrations of monovalent cations (in the absence of any divalent cation) support substrate binding but not catalysis (34). This type of assay also revealed that as many as three Mg2+ ions are required for optimal cleavage in the reaction mediated by RNA alone (110). Although the exact mechanistic details of the RNase-P reaction are not known, the postulated cleavage intermediate (based on required substrate stereochemistry) is a trigonal bipyramid with all three hexacoordinated Mg2+ ions used to stabilize the intermediate (110). A hydroxyl ligand from one Mg2+ would presumably serve as a general base, thereby activating a water molecule or a hydroxide ion that, in turn, attacks the phosphorus center of the scissile bond by means of SN2 displacement (106). It has also been proposed that the 2' hydroxyl of the nucleotide 5' to the cleavage site may act as a ligand for one of the catalytically important Mg2+ ions (110, 111), although catalysis continues, albeit at a lower rate, when this 2' hydroxyl is absent (112) or modified (110, 113). It is possible that proper coordination of only one or two of the three catalytically involved Mg2+ ions is needed to sustain partial activity. This possibility is consistent with eubacterial RNase-P RNA deletion experiments demonstrating that no region of the catalytic RNA molecule is absolutely essential for catalysis (73-76).
Another interpretation of these results would be that deletion of potentially crucial regions simply brings other residues that could provide hydrogen donors and acceptors into the active site, partially substituting for the original nucleotide. Although it is currently unknown which residues in the RNase-P RNA coordinate any of the catalytic Mg2+ ions, some candidate nucleotide positions have been identified in the RNase-P RNA from E. coli. Self-cleavage at a few discrete sites (~5 out of 377) can be detected when the RNA is incubated with Mg2+ at elevated pH (114). Similarly, self-cleavage is observed at only a few positions when eubacterial RNase-P RNAs are incubated with Pb2+ (86, 87, 115), and many of these sites are one to two positions away from some of the Mg2+ cleavage sites. This has led to the speculation that these regions might be involved in forming a specific binding pocket for divalent cations (86); similar speculations attempting to correlate Mg2+ and Pb2+ cleavage data have been made for tRNA molecules (116) and self-splicing group I introns (117). For identification of essential features of the active site, future structural and mechanistic studies should focus on how the catalytic Mg2+ ions are coordinated and what, if any, dynamic conformational changes may be required of the substrate or enzymatic RNAs during catalysis (36, 118).
II. Yeast Nuclear RNase-P RNA
Yeast is a single-cell eukaryotic organism that contains two versions of RNase P: a nuclear form that processes nuclear-encoded pre-tRNAs destined for the cytoplasm, and a mitochondrial form that processes pre-tRNAs encoded by the mitochondrial genome. Mitochondrial RNase P from several different yeasts has been studied (14, 15, 46, 49, 50, 54, 97, 119) and is not discussed here, except to note that none of the RNA or protein subunits appear to be held in common between the nuclear and mitochondrial enzymes. The nuclear version of RNase P has been studied in S. pombe, a fission yeast (9, 47, 51, 56, 120), and in S. cerevisiae, a budding yeast (10, 55, 83, 121-125). This section focuses on our efforts to understand the structure of the RNA subunit from the S. cerevisiae nuclear enzyme.
A. Structure and Expression of the RPR1 Gene
Initially, we partially purified the nuclear RNase-P holoenzyme from S. cerevisiae and characterized a 369-nt RNA that copurified with the activity (10). Unlike the RNA subunits of the eubacterial RNase Ps, this yeast RNA by itself lacked catalytic activity (J.-Y. Lee, unpublished), a phenomenon observed with the other known eukaryotic RNase-P RNAs. In addition, this RNA displayed little sequence conservation with RNase-P RNAs from other eukaryotes (nuclear or mitochondrial) or with those from eubacteria, leaving it questionable whether this was the true RNA subunit of nuclear RNase P from S. cerevisiae. To prove the identity of the RNA and provide tools for subsequent analysis, the gene for this RNA (RPR1) was isolated by synthesizing a cDNA from the isolated RNA and using antisense probes generated from the cloned cDNA to screen a genomic DNA library (55). Hybridization to yeast chromosome blots and subsequent DNA sequencing showed that the gene was single-copy and located on chromosome V very near the URA3 locus (Fig. 2A). The RPR1 gene was found to be essential by disruption and tetrad analysis. Isolation of a temperature-sensitive allele and characterization of its steady-state tRNA populations revealed that the RPR1 mutation adversely affected the RNase-P-dependent cleavage of nuclear precursor tRNAs, providing strong support that the RPR1 gene product is the functional nuclear RNase-P RNA.
[Fig. 2A: restriction map of the RPR1-URA3 region, not reproduced. Fig. 2B: sequence of the RPR1 gene region; only the flanking rows below are legible, and the rows covering the mature RNA are not reproduced.]
-307 attggtcata aaaatcaatc aatcatcgtg tgttttatat gtctcttatc taagtataag aatatccata gttaatattc acttacgcta ccttttaacc
-207 tgtaatcatt gtcaacaggt atgttaacga cccacattga taaacgctag tatttctttt tcctcttctt attggccggc tgtctctata ctcccctata
-107 gtctgtttct tttcgtttcg attGTTTTAC GTTTGAGGCC TCGTGGCGCA CATGGTACGC TGTGGTGCTC GCGGCTGGGA ACGAAACTCT GGGAGCTGCG (A box and B box lie within this row)
 ... ATCCAACTTC CAATTTAATC
 407 TTTCTTtttt aattttcact tatttgcgat acagaaagaa aaaagcgata gtaactattg aattttgttt ggatttggtt agattagata tggtttctct
 507 ttatatttac atgctaaaaa tgggctacac cagagataca taattagata tatatacgcc agtacacctt atcggcccaa gccttgtccc aaggcagcgt
FIG. 2. Location and sequence of the RPR1 gene region. (A) The RPR1 gene was located on chromosome V immediately adjacent to URA3. A brief restriction map (Bg, BglII; H, HindIII) and the relative orientations of the RPR1 and URA3 coding regions are shown. (B) The sequence of the RPR1 gene region with the nucleotide positions numbered relative to the first nucleotide in the major RPR1 RNA (+1). Capitalized nucleotides correspond to the longest RNA transcript with the first nucleotide indicated by an asterisk. The mature-length RNA is denoted by bold, underlined letters. A and B box RNA polymerase III-like promoter elements are indicated by overlining.
Analysis of RPR1 gene expression revealed that its pattern of products resulted from initial transcription of a large RNA that itself is processed. Two predominant RPR1 species appear on blot analysis of RNA from wild-type cells, a 369-nucleotide form (underlined sequence in Fig. 2B) and a less abundant RNA containing an 84-nucleotide 5’ leader and 16-30 nucleotides of 3’ trailing sequences (all nucleotides shown as uppercase letters in Fig. 2B). Both forms of the RNA appear to be contained in ribonucleoprotein particles of very similar size and chromatographic properties (10; J. R. Chamberlain, unpublished observation), suggesting that the longer RNA is a precursor and is (or can be) assembled into an RNP particle prior to processing. In support of a precursor-to-product relationship, the RPR1 gene was found to have only one identifiable transcription unit, and its initiation and termination sites are compatible with the longer RNA. Further examination of RPR1 transcription led to the identification of an unprecedented type of promoter (121). On discovering that we were unable to express the RPR1 “mature” domain from an RNA polymerase II promoter, we demonstrated that RPR1 RNA production was inhibited in a strain possessing a temperature-sensitive allele of the RNA polymerase III large subunit. The promoter structure was defined through a series of deletion and directed point mutations, followed by analysis of expression and transcription factor binding in vivo and in vitro. Transcription was entirely dependent on two short sequences in the 84-nucleotide 5’ leader region of the gene (termed the A box and B box in Fig. 2B). These sequences bore a notable resemblance to the internal promoter segments of the same names from eukaryotic tRNA genes. This similarity was subsequently confirmed by DNA footprint experiments using the tRNA gene transcription factor, TFIIIC.
While the sequences upstream of the transcription start influenced transcription efficiency, the major promoter determinants were in the intragenic A and B boxes, reminiscent of the tRNA gene structure. This appears to make RPR1 the only currently known case of a “disposable internal promoter,” because processing of the primary transcript results in loss of the 5’ sequences containing the promoter. It is not currently known whether the upstream region specifically binds any transcription factors other than TFIIIB. Several short segments have sequences similar or identical to those found upstream of the S. cerevisiae U6 small nuclear RNA gene that is also transcribed by RNA polymerase III using internal promoters. However, these upstream segments are not conserved in other Saccharomyces species (123), leaving their importance in doubt. Formation of the 3’ end of the RPR1 transcript is accomplished through polymerase termination following the synthesis of 5 to 6 U residues, as expected for an RNA polymerase III transcript. This also holds true for RNA transcripts from other species of Saccharomyces (see Section II,B).
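The termination signal described above, a run of T residues in the DNA immediately downstream of the mature coding region, is simple to locate computationally. The following is a minimal sketch, assuming that a run of at least five consecutive T residues is the feature of interest; the function name and the threshold parameter are illustrative choices, and the test sequence is the short stretch of S. cerevisiae downstream sequence that is legible in Fig. 2B.

```python
import re


def poly_t_runs(dna, min_len=5):
    """Return (start, length) for every run of at least min_len T residues.

    Positions are 0-based offsets into the supplied string. The threshold
    of five is an assumption based on the "5 to 6 U residues" in the text.
    """
    pattern = re.compile("T{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(dna.upper())]


# Downstream S. cerevisiae sequence as it appears in Fig. 2B (spaces removed).
downstream = "ATCCAACTTCCAATTTAATCTTTCTTttttaattttcact"
runs = poly_t_runs(downstream)
```

The search is case-insensitive because the figure mixes uppercase (transcribed) and lowercase (flanking) letters within the same T run.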
B. Phylogenetic Studies
The first step toward understanding RNA structure is to determine the base-paired secondary structure. Because the individual elements (e.g., double helices) that make up a complex folded RNA are typically stable by themselves, they are considered to be the building blocks of the three-dimensional RNA structure (126, 127). Although secondary structure models can be derived by using theoretical energy minimizations (128) or structure-specific probing reagents (129), the most incisive method of deducing secondary structure has been the use of phylogenetic sequence comparisons (29). Phylogenetic analysis produced a consensus secondary structure for the eubacterial RNase-P RNAs (27). This recently refined consensus structure (32), in combination with intra- and intermolecular distance constraints garnered from cross-linking studies, has provided the basis for three-dimensional model building of eubacterial RNase-P RNA in complex with tRNA (67, 101; N. Pace, personal communication). Although attempts were made to fold eukaryotic RNase-P RNAs into structures resembling the eubacterial consensus (43), these structures have been regarded with skepticism because their primary sequences were too evolutionarily divergent from each other and from the eubacterial RNAs for a convincing alignment. A preliminary structure model of the RPR1 RNA had been suggested (10), based on computer folding algorithms and superficial similarity to the eubacterial consensus, but this was acknowledged to be quite speculative. To approach this problem, one group initiated a phylogenetic comparison of several closely related Schizosaccharomyces species (120). The genes from Schizosaccharomyces malidevorans, Schizosaccharomyces japonicus, and Schizosaccharomyces versatilis were found to be identical to S. pombe, and the gene from Schizosaccharomyces octosporus diverges by only ~20%.
Consequently, only a few structural elements within the Schizosaccharomyces RNAs could be phylogenetically supported. This study was unable to provide enough information to confirm structural homology of the eukaryotic RNAs with their eubacterial counterparts. A very recent attempt to perform this same sort of analysis with vertebrate RNAs also met with limited success (130) due to insufficient sequence divergence. To provide a structural hypothesis for future experimentation we undertook a phylogenetic comparative analysis of a number of closely-to-distantly related budding yeasts with the hope of establishing a reasonable RNA secondary structure model (123). The yeasts were first surveyed by RNA blot analysis using S. cerevisiae RPR1 probes to determine if their genes shared sufficient homology to be identified by hybridization. RNA from Candida glabrata and S. pombe did not produce a detectable hybridization signal, suggesting there is little sequence conservation across genus lines. Of the
Saccharomyces species examined, Saccharomyces bayanus, Saccharomyces uvarum, and Saccharomyces diastaticus appeared to be closely related by size and strength of hybridization signal. Cloning and sequencing of these three RNase-P genes confirmed the sequence similarity, with only a few nucleotide identity changes within the mature domain. RNAs from other Saccharomyces species were more divergent, with Saccharomyces carlsbergensis 86% identical, Saccharomyces kluyveri 70% identical, and Saccharomyces globosus 65% identical to S. cerevisiae. These species provided sufficient nucleotide conservation for the purpose of aligning the sequences of the mature domains and the leader regions. This alignment is shown in Fig. 3, along with the alignment of the S. pombe (Spo) and S. octosporus (Soc) sequences made possible by the Saccharomyces consensus. There is no significant conservation of sequences upstream of the Saccharomyces transcription units, consistent with the notion of internal transcription signals. In each case, a transcribed leader was found containing conserved A box and B box sequences. There is also a short GTTTG sequence of unknown function found 9-11 nucleotides upstream of the A box in each gene. A poly(T) terminator for RNA polymerase III transcription is found immediately downstream of each mature RNA coding region. Alignment of the sequences allowed us to propose a consensus structure that fit all of the yeast sequences (Fig. 4; Fig. 5, left) and resulted in identification of positions conserved among nuclear RNA subunits in yeast. This consensus structure is also compatible with the eubacterial consensus structure (Fig. 5, right) (30). The comparison identified a far smaller number of positions where the identity of the nucleotide appears to be conserved across kingdom lines. These conserved positions were the first to be considered when creating directed mutations in the RNA (see Section III,B).
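Percent-identity figures such as those quoted above depend on how alignment gaps are treated, and the text does not state the convention used. The sketch below shows one common convention, assumed here purely for illustration: columns in which either sequence has a gap are skipped, and identity is the fraction of the remaining columns that match. The function name, the gap characters (dots, as in the Fig. 3 alignment, and dashes), and the toy sequences are all illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap_chars=".-"):
    """Percent identity between two rows of a sequence alignment.

    Columns where either row holds a gap character are ignored
    (an assumed convention; other conventions count gaps as mismatches).
    Comparison is case-insensitive.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x in gap_chars or y in gap_chars:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example, not real RPR1 sequence: 7 of 8 ungapped columns match.
identity = percent_identity("ACGU.ACGU", "ACGA.ACGU")
```

Counting gapped columns as mismatches instead would lower the values for the more divergent species, so published numbers should be compared only when the convention is known.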
C. Solution Structure Analysis of the Holoenzyme RNA
Because the phylogenetic consensus still held a number of ambiguities, the model was refined by structure-sensitive solution footprinting of the RNA while in the holoenzyme complex (119). A relevant RNA structure was likely to exist only in the context of the holoenzyme because the eukaryotic RNA subunit is not active in the absence of protein. The regions accessible to single-strand-specific chemicals [dimethyl sulfate (DMS), kethoxal (KE), carbodiimide (CMCT)], single-strand-specific nuclease (RNase ONE), and double-strand-specific nuclease (RNase V1) are shown in Fig. 6A. Almost all of the sensitivities of the holoenzyme are consistent with the phylogenetic model. Exceptions to this caused us to revise the proposed structure in ways compatible with the phylogenetic data (compare the structure in Fig. 6 to
[Fig. 3, panels A-D: aligned upstream, leader, mature, and downstream sequences of the nuclear RNase-P RNA genes (rows Sce, Sca, Skl, and Sgl, with Spo, Soc, and a Con consensus row in the mature-region panel); the alignment itself is not reproduced here.]
those in Figs. 4 and 5). The changes included the formation of a base pair between G23 and C317; rearrangement of the local stem structure from G47 to C71; opening of previous individual base pairs at G134-U139, A16[?]-U189, A17[?]-U1[?]6, and U203-G250; opening of the stem G269A270C271-G301U302C303; and rearrangement of the stem C280-G291. It is clear from the differential sensitivities to chemical and enzymatic reagents that although some regions are solvent accessible, as judged by chemical sensitivity, they are sufficiently covered by protein or buried in the RNA tertiary structure that recognition by nucleases is restricted. The sites accessible to nucleases lie primarily in stem domains that presumably protrude into the solvent. The protruding stem at positions 123-150 was predicted to be dispensable by the phylogenetic analysis, and directed deletions from the RPR1 coding region have shown that this stem is dispensable. Also dispensable is stem 217-240, whereas 165-188 and 199-254 are essential (E. Pagán-Ramos, unpublished observations). Nuclease and chemical sensitivities were exaggerated when protein was digested away by treatment with proteinase K under mild solution conditions (Fig. 6B). Although the stems predicted and observed in the holoenzyme remained intact, most of the interior of the RNA molecule became accessible to nuclease and chemical attack. This was an indication that those
FIG. 3. Gene structure and structural alignment of nuclear RNase-P RNA homologs from yeast. The genes from S. cerevisiae (Sce), S. carlsbergensis (Sca), S. kluyveri (Skl), and S. globosus (Sgl) were organized into upstream (A), leader (B), mature (C), and downstream (D) regions. Sequence was determined for both strands of the regions shown. The numbering scheme (Sce#) is aligned according to the RPR1 sequence from S. cerevisiae, with the first nucleotide of the S. cerevisiae mature domain designated as +1. (A) Upstream regions. Yeast retrotransposon insertion site consensus sequences (ycaaca or its complement tgttgr) are underlined. The upstream sequences are not otherwise aligned due to insufficient similarity. (B) Leader regions. Dots [or dashes in the line labeled Con (consensus)] indicate alignment gaps. Invariant positions are denoted by uppercase letters, and are also indicated in the line labeled Con. Sequences on the line labeled Box indicate consensus nucleotides from the RNA polymerase III-like promoter elements; the most highly conserved positions are double underlined. The italicized cCg at the 3’ end of the S. kluyveri leader region B box comprises the first three nucleotides of its mature region. This cCg is duplicated at the beginning of the mature domain sequence in B to indicate this. (C) Mature regions. The homologous gene sequences from S. pombe (Spo) and S. octosporus (Soc) are included for the purpose of structural comparison. Invariant nucleotides in all lines are denoted by uppercase letters and are also shown in a consensus (Con) sequence; lowercase letters in the Con sequence indicate nucleotides conserved among the Saccharomyces only. (D) Downstream regions. Nucleotide sequences downstream of the inferred mature domain are shown for a sufficient distance to include one or more poly(T) stretches that could serve as pol III transcription termination sites.
FIG. 4. Secondary structure models for the nuclear RNase-P RNAs from Saccharomyces and Schizosaccharomyces based on phylogenetic comparisons. Circled nucleotides in the Saccharomyces RNAs are invariant among the Saccharomyces. Circled nucleotides in the Schizosaccharomyces RNAs indicate conserved positions between the Saccharomyces and Schizosaccharomyces. Lines connecting nucleotides denote proposed canonical (Watson-Crick) base pairings. Filled dots indicate proposed G·U base pairs, whereas empty dots indicate proposed noncanonical, non-G·U base pairs. The complementary sequences connected by the brackets denote long-distance base pairings between the indicated loops in the structures.
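The pairing categories used in the Fig. 4 legend (canonical Watson-Crick, G·U, and other noncanonical pairs), together with the idea that a proposed pairing gains phylogenetic support when both positions vary between species while remaining complementary, can be expressed as a short check. This is a deliberately simplified sketch: the function names and the specific support criterion (both alignment columns vary, and every species shows a Watson-Crick or G·U pair) are assumptions made for illustration, not the criteria applied in the analysis described here.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def _rna(base):
    """Normalize one nucleotide to uppercase RNA (T is read as U)."""
    return base.upper().replace("T", "U")


def classify_pair(x, y):
    """Label a proposed pair as 'canonical', 'GU', or 'noncanonical'."""
    pair = (_rna(x), _rna(y))
    if pair in WATSON_CRICK:
        return "canonical"
    if pair in WOBBLE:
        return "GU"
    return "noncanonical"


def covariation_supported(column_i, column_j):
    """Simplified test for covariation support of a proposed pairing.

    column_i and column_j hold the nucleotides found at the two paired
    positions, one entry per aligned species. Support is reported only if
    both columns vary and no species shows a noncanonical combination.
    """
    xs = [_rna(b) for b in column_i]
    ys = [_rna(b) for b in column_j]
    if len(xs) != len(ys):
        raise ValueError("columns must cover the same species")
    if any(classify_pair(x, y) == "noncanonical" for x, y in zip(xs, ys)):
        return False
    return len(set(xs)) > 1 and len(set(ys)) > 1
```

A pair that is identical in every species (for example G-C throughout) is neither supported nor contradicted by this test, which is why alignments of very similar sequences support few helices.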
[Fig. 5: core structure diagrams (Saccharomyces, left; eubacterial, right) not reproduced. Key entries: invariant nucleotides between eukaryotes and prokaryotes; nucleotides conserved within eukaryotes or prokaryotes; a c g u, nucleotides conserved within Saccharomyces only; variable nucleotides; pairings supported by covariations; unsupported pairings; long-distance interactions; bulged nucleotide.]
FIG. 5. Comparison of the conserved RNase-P RNA cores from Saccharomyces and from eubacteria. A consensus structure based primarily on phylogenetic support from the yeast nuclear RNAs is proposed and compared with the previously postulated eubacterial core (30, 32). Symbols used are defined in the key in the upper left. Positions are referred to in the text by the S. cerevisiae (Sce) position numbers.
regions are protected in the presence of protein. However, it is not clear whether this occurred because protein is directly obscuring the RNA or because the RNA tertiary structure unfolds in the absence of the protein. If the latter explanation of unfolding is true, it could provide at least a partial explanation of why the eukaryotic RNA subunits are inactive in the absence of protein subunits. The RNAs may not be able to adopt anything resembling the appropriate tertiary structure.
[Fig. 6, panels A and B: secondary structure diagrams not reproduced. Key entries: RNase V1, RNase ONE, and chemical probes; intensity of cleavage or modification in the RNase-P holoenzyme scored as weak, moderate, or strong.]
FIG. 6. (A) Structure-specific enzymatic and chemical probing of the RNase-P holoenzyme. The RNase-P holoenzyme was exposed to a variety of reagents that preferentially target double-stranded (RNase V1) or single-stranded (RNase ONE, DMS, KE, and CMCT) regions of the RNA. “Hits” were mapped by primer extension analysis. Premature primer extension stops seen in control samples are denoted by a small filled circle next to the nucleotide; all other symbols are defined in the key. Indicated reactivities (weak, moderate, or strong) were from visual estimates of band intensities. The secondary structure model has been adjusted (cf. Fig. 1) to be consistent with both the holoenzyme structure-probing data and the phylogenetic comparisons. (B) Changes in RNase-P RNA sensitivity to structure probes subsequent to deproteinization. RNase-P holoenzyme was incubated with proteinase K prior to exposure to structure-specific reagents. “Hits” were identified by primer extension analysis, and the reactivities were compared with those found in the holoenzyme (shown in A). Premature primer extension stops seen in control samples prevented assignments at those positions and are denoted by a small filled circle next to the nucleotide; all other symbols are defined in the key. Note that the shading of symbols (grey or black) refers to an increase in sensitivity to probes when the deproteinized RNase-P RNA is compared to the holoenzyme, rather than to absolute strength of the hit. Enhanced reactivity is interpreted as exposure of protein-protected regions or rearrangement in the RNA structure following digestion of the RNase-P protein subunit(s). Asterisks next to open symbols denote decreased reactivities, which could be interpreted as structural rearrangements.
III. Analysis of Mutations in RPR1
A. RPR1 Mature Domain Replacements
Initial tests of functional hypotheses took two forms: the deletions of stem structures already alluded to above, and replacement of the entire RPR1 coding region with the homologous mature coding regions from other RNase Ps. Complementation of an RPR1 deletion strain required that the replacement RNAs be synthesized, assembled into RNPs, processed, and catalytically competent. This was attempted with the coding regions for the S. carlsbergensis, S. kluyveri, S. globosus, S. pombe, Homo sapiens, and E. coli RNase-P RNAs (55, 125). Because all of the genes retained the native RPR1 promoter, all the RNAs were expressed. The yeast RNAs examined were stable at approximately the same level of expression. However, the ability of the RNAs to complement correlated directly with whether they were processed: RNAs processed to mature form could substitute for RPR1, whereas those that remained the primary transcript size did not complement. It is not clear whether this was due to failure to assemble, process, or function. The complementing RNAs were from S. carlsbergensis (86% identity with S. cerevisiae) and S. kluyveri (70% identity with S. cerevisiae). The RNAs that did not complement were from S. globosus (65% identity with S. cerevisiae) and S. pombe (51% identity with S. cerevisiae). As expected, the human and bacterial RNAs also did not complement (55). On the basis of these data, it was possible to postulate that a significant percentage of the RNA sequence is not required for function in vivo, or for interaction with the protein component(s).
B. Randomization Mutagenesis of Universally Conserved Positions
In order to obtain more subtle mutations, but ones directed against the heart of the enzyme, we chose to heavily mutagenize only those few positions in RPR1 that are conserved across kingdoms (see Fig. 5). It seemed reasonable to postulate that these nucleotides serve essential functions in RNase P. To do this most effectively, three small regions of sequence, containing the majority of the conserved residues, were targeted for mutagenesis. These regions, referred to in Table I and Fig. 7 as A, B, and C, are at positions 87-94, 309-316, and 339-349, respectively, in the S. cerevisiae (Sce) RNA. For each region, a library of cloned genes was prepared from DNA fragments in which the four to five invariant positions had been randomized by PCR-based mutagenesis. The members of the gene library were then screened for an ability to serve as the sole copy of RPR1 in the cell (131). It was possible that no variation in the sequence would be permissible at these positions due to their strict conservation. In fact, we found that most of the 14 positions tested could be altered within some sequence context. Although only 2-4% of each library were viable genes, only positions G3[?]0 and G[?] were intolerant of changes in what appeared to be a near-saturating screen. The sequences recovered from the viable strains are shown in Table I, along with their growth phenotypes. Over half of the variants, although viable, either grow slowly or are temperature sensitive. These variants are being used in genetic selections for intragenic and extragenic suppressors to study intramolecular and protein interactions. RNase-P holoenzymes have now been obtained in sufficient purity for kinetic analysis from each of the strains harboring variant RPR1 copies. Investigation of the substrate binding and cleavage properties should prove useful in assessing the contributions of these positions to functional aspects of the enzyme.
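The scale of the randomization libraries described above follows from simple counting: fully randomizing n positions over four nucleotides gives 4^n possible sequences, so four to five positions correspond to 256 to 1024 variants per region. The sketch below computes this, along with a standard sampling estimate of how many independent clones must be screened for any one given sequence to be seen at least once with a chosen probability. The equal-representation assumption, the 95% default, and the function names are illustrative and are not taken from the original screen.

```python
import math


def library_size(n_positions, alphabet_size=4):
    """Number of distinct sequences when n_positions are fully randomized."""
    return alphabet_size ** n_positions


def clones_for_coverage(n_positions, probability=0.95):
    """Clones needed so that a given variant is sampled at least once.

    Assumes every variant is equally represented and clones are sampled
    independently: N = ln(1 - P) / ln(1 - 1/size), rounded up.
    """
    size = library_size(n_positions)
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - 1.0 / size))


sizes = {n: library_size(n) for n in (4, 5)}
clones_needed = {n: clones_for_coverage(n) for n in (4, 5)}
```

Real PCR-generated libraries are rarely uniform, so estimates of this kind are best read as lower bounds on the screening effort implied by a "near-saturating" screen.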
C. A Subdomain Involved in Catalysis
Table I also contains results from screening a fourth region randomization library, termed D, directed at positions not originally identified as having counterparts in the bacterial RNAs (Fig. 5). The stem/internal loop/stem structure from positions Sce 199-254 is one of the most conserved structural features of yeast nuclear RNase-P RNAs, yet it was not clear initially that there existed an analogous structure in the eubacterial consensus. However, further refinement of the bacterial consensus led to the refolding of the center domain of the bacterial RNAs into a structure not unlike the “orphan” yeast stem (32). In addition, a strong bacterial consensus on the 5’ side of the internal loop, ACAGAPuA, is similar in sequence and position to the conserved CAGAAA in the yeast internal loop (Sce 206-211). The resemblance was furthered when we used three-dimensional models of the E. coli and B. subtilis RNAs (132, 133) as templates from which to build a model of the S. cerevisiae RNA (134). When equivalent helices were assumed from the phylogenetic analysis, the yeast RNA could adopt a structure similar to the bacterial RNAs in most of its lowest free-energy forms using the YAMMP
TABLE I
SEQUENCES AND PHENOTYPES OF VIABLE RPR1 VARIANTS
Columns: Library; Mutant; Sequence; Growth; Freq.
[Table entries are not legible in this copy. The surviving fragments show sequence windows spanning positions 84-96 (library A), 306-318 (library B), 338-350 (library C), and 203-214 (library D), with growth phenotypes scored as WT, poor, ts, or poor/ts.]
The names of the mutants represent the position and identity of the new nucleotide. Mutated positions are indicated by boxes on the DNA sequences. Also listed are the growth phenotypes and frequencies with which transformants were retrieved bearing the indicated sequences.
computer program (135). Although the new stem/internal loop/stem domain (containing the D library positions Sce 206-211) had not previously been placed in the structure because of lack of cross-linking distance constraints, most of the resulting structures for both bacterial and yeast RNAs had this domain in a position similar to the single example depicted for the yeast RNA in Fig. 7. This domain occurs near the side of the substrate cleavage site opposite the other essential regions (A, B, C). This would position it to participate directly in substrate binding or catalysis and is consistent with results from mutagenesis of Sce region 206-211. When library D positions 206-211 were randomized, a much larger variation in sequence was tolerated than was expected for such a conserved sequence (Table I). Although viable variants still amounted to less than 4% of the clones, no positions were absolutely unchangeable. Two of the positions, G208 and A2”, rarely varied from wild type, and then only in certain sequence contexts. A full analysis of the sequence variation is in progress (E. Pagán-Ramos and Y. Lee, unpublished). This library also contained a number of variants that conferred aberrant growth properties on the yeast. Unlike the previous libraries, this library was deliberately selected to provide this particular growth phenotype. For several reasons, we suspected that Sce region 206-211 would be primarily catalytic, rather than involved in substrate binding; more specifically, its function might be to coordinate one or more of the active site magnesium ions (110). Although strongly conserved in the bacterial RNAs, the region did not cross-link well to stably bound substrate and product tRNAs, suggesting a more transient functional interaction than other conserved domains of the ribozyme. Several lines of biochemical evidence suggested that the region was a site of preferential and functional interactions with divalent cations (86, 114, 115, 136).
To test our hypothesis, we screened for viable variants of library D mutants using synthetic media containing 100 mM MgCl2 (the original parental strain grows well at concentrations up to 300 mM) for at least a modest elevation in intracellular magnesium. We reasoned that if a small percentage of mutations caused inefficient magnesium usage, additional magnesium might rescue growth. As suspected, growth improved for most of the slow-growing or temperature-sensitive strains containing altered RPR1, and remained normal for the wild-type strain. Because many types of defects might lead to improved growth in the presence of high magnesium concentration (e.g., osmotic sensitivity), we isolated nuclear RNase-P holoenzyme from five of the most affected strains in which deficient growth was restored in the presence of elevated magnesium levels. Determination of their magnesium optima revealed that all the enzymes had optima at 10-12 mM MgCl2, comparable to wild type, but the growth-defective variants generally had mildly to severely defective activity at low concentrations of magnesium (less than 4 mM). Kinetic constants were determined for these holoenzymes using a pre-tRNA substrate. Defects in catalysis were primarily reflected in decreases to one-half to one-seventh in kcat in three of the five variants, with two of the holoenzymes having less than a twofold increase in Km (E. Pagán-Ramos and Y. Lee, unpublished). These data are consistent with the Sce 206-211 domain coordinating at least one of the magnesium cofactors involved in catalysis and may eventually be applicable to other catalytic mechanisms beyond RNase P. The ACAGAPuA sequence in this loop is identical to the conserved sequence in U6 small nuclear RNA at the catalytic heart of nuclear message splicing (137, 138), whose function in splicing has not been fully elucidated. It remains to be determined whether this sequence participates in coordination of a divalent cation cofactor.
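The kinetic comparison above can be made concrete with a small calculation. The sketch below assumes simple Michaelis-Menten behavior and expresses a variant's catalytic efficiency (kcat/Km) relative to wild type from the fold changes in each constant. The kcat range (one-half to one-seventh of wild type) comes from the text; pairing each value with a Km increase of onefold or twofold is an assumption made only for illustration, and the function names are hypothetical.

```python
def michaelis_menten_rate(kcat: float, km: float, enzyme: float, substrate: float) -> float:
    """Initial velocity v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


def relative_efficiency(kcat_fold: float, km_fold: float) -> float:
    """kcat/Km of a variant relative to wild type, given fold changes in kcat and Km."""
    return kcat_fold / km_fold


if __name__ == "__main__":
    # kcat fold changes are taken from the text; the Km fold changes are assumed values.
    for kcat_fold in (1 / 2, 1 / 7):
        for km_fold in (1.0, 2.0):
            eff = relative_efficiency(kcat_fold, km_fold)
            print(f"kcat x{kcat_fold:.3f}, Km x{km_fold:.1f} -> relative kcat/Km = {eff:.3f}")
```

Because Km changes by less than twofold in the cases reported, most of the loss in efficiency in this sketch comes from the kcat term, which is the pattern the text attributes to a catalytic rather than a binding defect.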
FIG. 7. Secondary and tertiary structure models of nuclear RNase-P RNA from the yeast Saccharomyces cerevisiae. The upper panel shows the proposed secondary structure determined by phylogenetic and structure-probing analyses. (Reprinted with permission from A. J. Tranguch et al., Biochemistry 33, 1778, Copyright 1994 American Chemical Society.) This structure was used to derive a tertiary structure model for the enzyme-substrate complex of yeast RNase P (lower panel), extrapolating structural and distance constraints from eubacteria. Base-paired regions are shown as cylinders labeled P1-P15 in both panels. The tRNA substrate is shown as striped cylinders and a star represents the cleavage site. Regions of the RNA subjected to randomization mutagenesis in this work are labeled A-D in both panels, with asterisks denoting the actual randomized positions in the secondary structure.

D. An RNase-P Mutation Affects rRNA Processing
In addition to their use in dissecting the RNase-P reaction mechanism, the RPR1 mutations provide tools with which to examine the role of RNase P in the eukaryotic cell. While examining small RNA production in the RPR1-defective strains we noticed a pronounced accumulation of an aberrant ribosomal RNA product in one of the strains displaying a temperature-sensitive growth phenotype in addition to defective pre-tRNA maturation (J. R. Chamberlain, unpublished). On investigation, it was found that the unusual accumulated RNA was a version of 5.8-S rRNA with about 35 unprocessed nucleotides at its 3’ terminus (Fig. 8). There was also a much less dramatic accumulation of RNAs that had received the early A2 cleavage in the internal transcribed spacer 1 (ITS1), but had not been further processed to give the mature 5’ terminus of 5.8-S rRNA. To test whether preribosomal RNA transcripts might be a substrate for RNase P, we used nuclear RNase P separated from other contaminating ribonucleases (including ribonuclease MRP) and demonstrated cleavage in vitro of pre-rRNA spanning ITS1, 5.8-S rRNA, and ITS2. The two strongest cleavage sites of several throughout this region were located between the A2 site and the 5’ end of 5.8-S rRNA. There is currently no information as to which of these sites might be used in vivo, but the ability of RNase P to cleave pre-rRNA, in conjunction with accumulation of an aberrant rRNA in the RNase-P mutant, suggests that defective 5.8-S 3’-end maturation in the mutant may be the result of the lack of pre-rRNA cleavage by RNase P or an associated enzyme. Support for additional roles for RNase P in the cell has come from studies of RNase-P temperature-sensitive mutants in E. coli. These studies reveal that RNase P processes several RNAs in vivo known to fold into structures resembling pre-tRNAs. Two of the three alternate RNA substrates are 4.5-S RNA, an RNA that is ribosome-associated and that has been implicated in translation initiation (57, 58), and the polycistronic mRNA from the histidine operon (61). A small stable RNA from E. coli (10Sa) is also an RNase-P substrate, but its function in bacteria is unknown.

FIG. 8. The yeast rRNA transcription unit. An enlargement of the internal transcribed spacer regions (ITS1 and ITS2) and the 5.8-S rRNA sequence is shown below the drawing of the full rRNA transcription unit. Previously characterized in vivo processing sites are shown beneath (143, 145). The extent of the aberrant rRNA from the RNase-P mutant is indicated by a double-headed arrow.

A nuclear location for eukaryotic RNase P has been postulated from studies of pre-tRNA splicing mutants (139, 140). When pre-tRNAs are not spliced in these mutants, the accumulated products in the nucleus are end-processed tRNAs. Other evidence for the nuclear localization of at least a portion of the RNase P in the cell was obtained from in situ hybridization in human cells using fluorescent antisense RNA probes directed against RNase P (144). Fluorescence was detected in a subnuclear location adjacent to the nucleolus, with some fluorescence also seen in the nucleolus. In this same study, another eukaryotic ribonucleoprotein endonuclease, RNase MRP, was shown to colocalize with RNase P by these methods. The colocalization of RNase P and RNase MRP was not unexpected, because a strong link had already been established in vivo. This connection between RNase P and RNase MRP was first uncovered through examination of the RNA species coimmunoprecipitated by sera from autoimmune patients (53). The Th antigen, recognized by the autoantibodies, associated with both RNase-P and RNase-MRP RNAs. RNase MRP subsequently was shown to function in the later stages of rRNA processing (141-143). Because the RNA components of both RNase P and RNase MRP associate with the same protein, Pop1p (52), share structural similarities (43), and colocalize in the eukaryotic cell (144), it seems plausible that they also share aspects of function.

A model has been proposed in an attempt to explain an evolutionary relationship between RNase P and RNase MRP (145). It is likely that RNase P and RNase MRP are different versions of what was once one enzyme. RNase P is probably the more ancient enzyme, because it is found in all species in all cellular compartments where tRNAs are synthesized. RNase MRP has been found only in eukaryotes, which suggests a more modern origin.
The model is derived from the bacterial rDNA transcription unit in which a tRNA gene has been retained in the rRNA primary transcript between the large rRNA coding regions. The tRNA coding region may continue to reside at this position in the bacterial unit as a survival safety mechanism of the prokaryotic cell to ensure separation of the mature rRNAs by RNase-P cleavage. RNase MRP eventually evolved from RNase P and acquired its rRNA processing function, performing cleavage at the comparable eukaryotic site now critical in eukaryotic rRNA processing. RNase P appears to have retained a role, direct or indirect, in rRNA processing and may exist in a complex with RNase MRP as a component in this essential cellular pathway. Analysis of mutants from conserved sequence randomizations has already begun to provide information regarding function of the eukaryotic RNA subunit. One of these mutants has given us a glimpse of additional roles for
RNase P in the eukaryotic cell. It is reasonable to believe that other mutants will give further clues to aspects of this enzyme. In addition to studies of the RNase-P RNA subunit, dissection of the associated protein by mutation in yeast should provide a powerful tool in understanding the structure and function of eukaryotic RNase P.
ACKNOWLEDGMENTS
We thank Norman Pace, the Pace laboratory members, and James Nolan for valuable collaboration. This work was supported by National Institutes of Health Grant R01 GM34869 to D. R. E. J. R. C. was supported by National Institutes of Health Pre-doctoral Training Grant T32 GM07315. A. J. T. was supported by a Young Scientist M.D./Ph.D. Scholarship provided by the Life and Health Insurance Medical Research Fund. E. P.-R. was supported by the Rackham Merit Fellowship and the Merck Minority Graduate Student Fellowship.
REFERENCES
1. T. R. Cech and B. L. Bass, ARB 55, 599 (1986).
2. R. H. Symons, ARB 61, 641 (1992).
3. S. Altman, Adv. Enzymol. Relat. Areas Mol. Biol. 62, 1 (1989).
4. M. J. Wang, N. W. Davis and P. Gegenheimer, EMBO J. 7, 1567 (1988).
5. R. E. Reed, M. F. Baer, C. Guerrier-Takada, H. Donis-Keller and S. Altman, Cell 30, 627 (1982).
6. C. Reich, K. J. Gardiner, G. J. Olsen, B. Pace, T. L. Marsh and N. R. Pace, JBC 261, 7888 (1986).
7. S. C. Darr, B. Pace and N. R. Pace, JBC 265, 12927 (1990).
8. D. T. Nieuwlandt, E. S. Haas and C. J. Daniels, JBC 266, 5689 (1991).
9. G. Krupp, B. Cherayil, D. Frendewey, S. Nishikawa and D. Söll, EMBO J. 5, 1697 (1986).
10. J.-Y. Lee and D. R. Engelke, MCBiol 9, 2536 (1989).
11. M. Bartkiewicz, H. Gold and S. Altman, Genes Dev. 3, 488 (1989).
12. M. Doria, G. Carrara, P. Calandra and G. P. Tocchini-Valentini, NARes 19, 2315 (1991).
13. G. P. Jayanthi and G. C. Van Tuyle, ABB 296, 264 (1992).
14. M. J. Hollingsworth and N. C. Martin, MCBiol 6, 1058 (1986).
15. H.-H. Shu, C. A. Wise, G. D. Clark-Walker and N. C. Martin, MCBiol 11, 1662 (1991).
16. S. C. Darr, J. W. Brown and N. R. Pace, TIBS 17, 178 (1992).
17. H. D. Robertson, S. Altman and J. D. Smith, JBC 247, 5243 (1972).
18. B. C. Stark, R. Kole, E. J. Bowman and S. Altman, PNAS 75, 3717 (1978).
19. K. Gardiner and N. R. Pace, JBC 255, 7507 (1980).
20. R. Kole, M. F. Baer, B. C. Stark and S. Altman, Cell 19, 881 (1980).
21. C. Guerrier-Takada, K. Gardiner, T. Marsh, N. Pace and S. Altman, Cell 35, 849 (1983).
22. C. Guerrier-Takada and S. Altman, Science 223, 285 (1984).
23. K. Kruger, P. J. Grabowski, A. J. Zaug, J. Sands, D. E. Gottschling and T. R. Cech, Cell 31, 147 (1982).
24. C. Guerrier-Takada and S. Altman, Bchem 23, 6327 (1984).
25. H. Shiraishi and Y. Shimura, EMBO J. 7, 3817 (1988).
A. K. Knap, D. Wesolowski und S. Altman. Hiochirrrie 72. 779 (1990). B. D. James, G. J. Olsen. J. Liu and N . R. l’iice. Ce// 52, 19 (1988). B. D. James, G. J. Olseii and N . H. I’ace. . W c f / i o d y Eriz!yrnd. 180, 227 (1989). C. R. Woese and N . R. Pace, in “The H N A \VorkI (H.F. Crstelantl ;ind J. F.Atkins. c d a . ) , p. 91. CHSLab Press, Plainview. NY. 1993. 30. J. W. Brown and N. R. Pace. Hiochitnie 73, ti89 (1991). 31. J. W. Brown and N . R. Pace. NAHC.s 20, 14.51 (199.2). 32. E. S. Haas, J. W. Brown, C. I’itulle antl N. R. I’iice. P.\AS 91. 2527 (1994). 33. E. Akaboshi, C. Guerrier-Takada and S. A l t i i w i . HHHC 96, 831 (19801. 34. D. Smith, A. B. Burgin, E. S. Hails nntl N. II. 1 ’ ; ~ . J H C 267. 2429 (I<J921. 35. A. Tallsjo and L. A. Kirsel)om. SAHcv 21. 51 (1983). 36. C. Reich, G. J. Olsen, B. €?ice iiiitl N . II. 1’;ic.t.. Scicwcc, 239, 178 (19881. 37. A. Miczak, R. A. Srivastava and 11. Apirifni. l f d . .\licrcJ/Jio/. 5, I801 j 1!KJ1 I. 38. S. J. Talbot and S. A h a n . Hc/icwr 33. I.lO(i (19+J41. 39. N . P. Lawrence, A. Hichinan. H. Ainini iintl S. . ~ I t n i i i n .I’S.4.S 84. 6825 (19871, 40. D. S. Waugh and N. H. I’iict.. J . Acrct. 174. K3lfi (1990). 41. A. Vioque, J. Arnez und S. .411in;ui. J l I H 202. 835 (1988). 42. S. J. Talbot and S. Altn~iiii.Nc/ic.rn 33. 1399 (195J4). 43. A. C . Forster and S. .4ll1ni111. 62. 407 (1990). 44. C.-J. Doersen, C. (:iic.l.ric.I.~lithiul;i. ~ 1 i i i ; u i;ind (;. .4tturdi. JH<: 260. 5042 I 1HS.5). 45. S. Manam and C . C. \hi ’ lii! It.. J/N; 462. 1027.) (1987). 46. M . J. Morales, C. A. \\’ist.. \ I . J. tlollinpwirth iuntl S . C. Slartin. S.4Hv.v 17. Wi.5 1lW91. 47. L Kline, S. Nishikawa a i d I>. will. JSC 256. 5058 (1981). 48. N . Lawrence, D. Wesohski. f-1. <;oltl. 31. Bsrtkit.\c.iw. C. (;iirrric.i-liika[l~i. \\’. H. McClain and S. A h a n . C S / l S ( > / j 52, 23:3 (1987). 49. M. J. Morales, Y. L. Dang. Y. C. Lou, 1’. Sulo ;incl S . C. Martin. PS..\S 89, 987.5 (1992)). SO. Y. L. Dang and N. 
C . Martin, JHC 268, 19791 11993). 51. S. Zimmerly, D. Drainas, L. A. Svlvers ;md 1). Siill. EJR 217. 501 (lCfiJ3I. 52. Z. Lygerou, P. MitchelI, E. Petfalski, B. Seraphin and D. Tollervey, Genes Dev. 8, 1423 (1994). 53. H. A. Gold, J. N. Topper, D. A. Clayton and J. Craft, Science 245, 1377 (1989). 54. D. L. Miller and N. C. Martin, Cell 34, 911 (1983). 55. J.-Y. Lee, C. E. Rohlman, L. A. Molony and D. R. Engelke, MCBiol 11, 721 (1991). 56. B. Cherayil, G. Krupp, P. Schuchert. S. Chur iintl I). Siill. < h c , 60. 1.37 (1987). 57. A. L. Bothwell, R. L. Garber and S. Altman. J H C 251. 7709 (1976l. 58. D. B. Bourgaize and M. J. Fournier. Nofrrre 325, 281 (19871. 59. Y. Komine, M. Kitabatake, T. Yokogawa. K. Nichikawa and €1. lnokiiclii. / .\AS 91. 922:3 (1994). 60. J. W. Brown, D. A. Hunt and N . H. I?ice. NARes 18, 2820 (19%)). 61. P. Alifano, F. Rivellini, C. Piscitelli. C. hl. Arraiano, C. B. Bruni ;riitl 51. S. (~~irluniii~mi. Genes Deu. 8, 3021 (1994). 62. C. J. Green, B. S. Vold, M. 1). Morch. R. L. Josh antl A,-L. flwnni. J H C 263. 11617 (1988). 63. K. M. W., Mans, C. Giierrier-Xik;da. S . Altrn;in and C. \I:I’lt*ij. .Y.-\Hcs IN. 3479 I 19W. 64. Y. Kikuchi, N. Sasaki and Y. Aiitlo-~;iin;i~;iiiii.P.\AS 87, 8105 (19lM)l. 65. D. Soll, in “The RNA World“ ( H . F. (~estrl;~ntl and J. F. Atkiiis. t d b . ). 1). 15;. CSII Lid) Press, CSH, NY, 1993. 66. R. A. Koski, A. L. M. Bothwell ;inti S. Alfmiin. Cd/ 9, 101 ( I C J X ) . 67. A. B. Burgin and N. R. I’iice. I~.\fRO/.9, 4111 (1990). 68. S. Steinberg, A. Misch and M. Sprinzl. SAHes 21, :3011 (19931.
26. 27. 28. 29.
<:(a//
69. M. F. Baer, R. M. Reilly, G. M. McCorkle. T.-Y. Hai. S . Altman and U. L. HujBhantlary JBC 263, 2344 (1988). 70. C. Guerrier-Takada, W. H. McClaiti and S. A h a n . Cell 38, 219 (1984). 71. C. J. Green and B. S. Vold. JBC 263, 652 (1988). 72. B. S. Vold and C. J. Green. JHC 263, 14390 (1’3x8). 73. N. P. Lawrence and S. Altninn. J M R 191, 1&3 (1986). 74. C. Guerrier-Takada and S. Altmnn. C d l 45, 177 (1986). 75. D. S. Waugh and N. R. I’we, FASER 1. 7, IW (1W3). 76. D. S. Waugh, C. J. Green and N. H. Pace. Scicwct*244. 1569 ( 1989). 77. S. H. Kim, S. L. Suddath. C. J. Quigley. A. hlcPhersoii, J. L. Sussinan. A. H.J. \\brig. N. C. Seeman and A. Rich, Science 185, 425 (1974). 78. R. W. Schevitz, A. D. Podjariiv, N. Krishnainwliiiri. J. J. Hughes. 1’. 8. S i g h ond J. L. Sussman, Nature 278, 188 (1979). 79. N. H. Woo, B. A. Roe, and A. Rich. Ntttrtre 286. 346 (19XO). 80. D. Moras, M. B. Cornarmond, J. Fischer. R. Weiss. J. C. Thierr!; J. 1’. Elwl iuitl H.Giege. Nature 288, 669 (1980). 81. J. D. Smith, This Series 16, 25 (1976). 82. N. Leontis, A. DaLio, M. Strobel and I). Engelke. NARCS16, 2537 (1988). 83. 1. Willis, D. Frendewey, M. Nicliols, A. Htttiiiger-~Verleii.J. Schl\ack inid I). Siill. J H C 261, 5878 (1986). 84. N . Lumelsky and S. Altman, JMR 202, 443 (1988). 85. S. C. Darr, K. Zito, D. Smith and N. R. Pwe. Bclietti 31, 328 (19921. 86. K. Zito, A. Huttenhofer and N . H. Pace, NARes 21, 5916 (1993). 87. A Tallsjo, S. G. Svlrd, J. Kufel and L. A. Kirselwni, NARCS21. 3927 (199:3!. 88. C. K. Surratt, Z. Lesniktwski. A. L. Schifinan. F. J. Schmiclt ;ind S. 11. liecht. / R C 265, 22506 (1990). 89. W. H. McClain, C. Gnerrier-Tiikach and S. Altinan. Scicvico 238, 527 (I987I. 90. S. G. Svird and L. A. Kirwlmn. J.VR 227. 1019 (1992). 91. J. Schlegl, J. P. Furste, R. Bald. \’. A. Erdinann and R. K. Hartmann. .ViiHc,s 20, 5%3 (1992). 92. S. G. Svard and L. A. Kirsrhmi. SARes 21, 327 (1993). 93. P. S. Holm and G. Krupp, NARes 20, 421 (1992). 94. L. A. 
Kirsebom and S. G. Svird, NARes 20, 425 (1W2). 95. W.-D. Hardt, J. Schlegl, V. A. Erdmann and R. K. Hartmann, Bchem 32, 13046 (1993). 96. M. J. Hollingsworth and N. C. Martin, NARes 15, 8845 (1987). 97. D. Kahle, U. Wehmeyer and G. K ~ p p EMBO , J. 9, 1929 (1990). 98. D. Kahle, U. Wehmeyer, S. Char and G. Krupp, NARes 18, 837 ( 1W)O). 99. D. L. Thurlow, D. Shilkowski and T. L. Marsh. .\’ARc*.s 19. 6M (1991). 100. R. K. Gaur and G. Krupp, NARes 21, 21 (1993). 101. J. M. Nolan, D. H. Burke and N. R. I’ace. Scicwcc. 261, 762 (1993). 102. U. Burkard and D. SOH, NARCS 16, 11617 (1988). 103. G. Carrara, P. Calandra. P. Friisrwloni. hi. Doria und C. 1’. ~Kuliini-\~leiitini, Cell 58, (1989). 104. B. J. Carter. B. S. Vold and S. \ I . H e c k JHC 265, 7100 (1990). 10.5. G. Krupp, D. Kahle, T. \‘ogt iind S. Char. J M R 217, 637 (1991). 106. A. M. Pyle, Science 261, 709 (1993). 107. K. J. Gardiner, T. L. Marsh and N . H. P x e . JRC 260, Fi115 (1’3x5). 108. C. Guerrier-Takada, K. Haydock, L. Allen and S. Altman, Bchem 25, 1509 (1986). 109. C. K. Surratt, B. J. Carter, R. C. Payne and S. M. Hecht, JBC 265, 22513 (1990). 110. D. Smith and N. R. Pace, Bchem 32,5273 (1993).
111. 112. 113. 114. 115. 116. 117. 118. 119. 120. 121. 122. 123. 124.
J.-P. Perreault and S. Altman, JAlR 226, 399 (1992). A. C. Forster and S. Altman, Science 249, 783 (19%)). R. G. Kleineidam, C. Pitulle. B. Sproat antl C . Eriipp. SAHc-s 21. 1097 1191J31. S. Kazakov and S. Altman, PNAS 88, 9193 (1991). J. Ciesiolka, W.-D. Hardt, J. Schlegl, 1: A. Erclniann iind it. E . Hiirtnimn. b:/H 219, 4<J
(1994). J. R. Rubin and M. Sundaralingam. J. Hiortiol. Sfrirc/./ h / t i . 1. (A19 i l<Jh:3). B. Streicher, U. von Ahsen and R. Sc.hrortlrr. A’AHCT21, 3 1 1 1199:)). S. Altman and C. Guerrier-Takadii. Hchetrr 25, 1205 (198(i). C. A. Wise and N. C. Martin. JRC 266, 191Fi( 11991). S. Zimmerly, V. Gamulin. U. Burkurrl and I). Soll. FEHS Ixrr. 271, I89 11WH)). J.-Y. Lee, C. F. Evans and I). R. Engelke. PXAS 88, (iY86 (1991). J.-Y. Lee, Ph.D. Thesis. l’nivtwity of hlicliigiin. A n n Arbor (1991). A. J. Tranguch and 11. R. Engrlkr. J H C 268, 14045 (1993). A. J. Tranguch, D. W. Kilitlel1)rrger.C. E. 1toliIni;in. ].-I:Lrr d 1). I\. i
33, 1778 (1994). 125. E. Pa&-Ramos, A. J. Triingtdi. 1). \\’. Einclrllwrger ;itid 1). It. Engc.lkt*..\:\/h~s 22. 2Ol) (1994). 126. H. A. Heus and A. I?irdi. J,\f/3 217, 11:) (1991). 127. J. A. Jaeger, J. SantaLncin. Jr. iind I. Titwo. Jr.. AHR 62. 255 (1W3). 128. D. H. Turner, N. Sitginioto ;ind S. .\I. Frrirr. h t w . H ~ T .B i o p h y . 5 . C/wtn. 17, 167 (l<JSh). 129. C. Ehresrnann, F. Baudin. h l . hlougrl. 1’. ltonil~y.I.-]’. KIWI iind H. E:lir~*~nienn. .Y:\Hvs 15, 9109 (1987). 130. S. Altman, D. Wesolowski and R. S. Puranam, Genomics 18, 418 (1993). 131. E. PagAn-Ramos, Y. Lee and D. R. Engelke, RNA, in press. 132. M. E. Harris, J. M. Nolan, A. Malhotra, J. W. Brown, S. C. Harvey and N. R. Pace, EMBO J . 13, 3953 (1994). 133. E. Westhof and S. Altman. PNAS 91, 5133 (1994). 134. A. J. Tranguch, Ph.D. Thesis. C’niwrsity of .\lieliigiin. A n n Arbor (1996). 135. A. Malhotra, R. K.-Z. Tan and S. C. Harvey. P S A S 87. 1950 (19%)). 136. W.-D. Hardt, J. M. Erdmann. \’. A. Ertlmann antl H. E. H;irtniiinn. b X H O /. 14. 29:)s (1995). 137. H. D. Madhani and C. Gutherie. Cell 71, 803 (1992). 138. J. A. Wise, Science 262, 1978 (1993). 139. 6. Knapp, J. S. Beckmann, P. F. Johnson. S. A. Fuhrman and J. Abrlsoii. (:d/14. 221 (1978). 140. A. K. Hopper and F. Banks. Cell 14, 211 (1978). 141. K. Shuai and J. R. Warner. NARes 19, 5059 (1991). 142. M. E. Schmitt and D. A. Clayton. .\fCHio/ 13. 733.5 (1993). 143. S. Chu, R. H. Archer, J. M. Zrngel ;ind L. Lintlalil. PSAS 91 (i5Y IlHW). 144. A. G . Matera, M. R. Frey I(. hlargdot and S. L. \Vttlin. /. C c # Rid. 129. 1181 1995). 145. J. P. Morrisey and D. Tollervey. TlHS 20, 78 (1995).
Effects of the Ferritin Open Reading Frame on Translational Induction by Iron
DAVID P. MASCOTTI,1 LISA S. GOESSLING, DIANE RUP AND ROBERT E. THACH2
Department of Biology
Washington University
St. Louis, Missouri 63130

I. The IRE and IRPs Are Necessary for Iron Inducibility of Ferritin
II. Sequences Downstream of the IRE Augment Its Function
III. Comments and Future Directions
References
Ferritin, a multimeric iron-storage protein, is evolutionarily conserved from prokaryotes to eukaryotes (1). It has been proposed that ferritin acts both to store iron for later use and to defend the cytosol against generation of potentially toxic free radicals via Fenton oxygen chemistry (2-5). This may have important implications for human health. Humans afflicted with hemochromatosis accrue extremely high iron levels due to deregulated iron uptake in the intestine. Genetic hemochromatosis is not uncommon in humans. Approximately 1 in 10 persons of European ancestry is heterozygous, and 1 in 300 is homozygous, for defects in the HLA-A locus of chromosome 6 (6). In addition, recent evidence suggests a positive correlation between iron levels and coronary heart disease (7, 8). Thus, a clearer understanding of the expression of ferritin is crucial to sorting out the effect of iron on a variety of diseases as well as aging (8). The regulation of ferritin synthesis is tightly coupled to changes in intracellular iron concentrations in both cultured cells and model vertebrate organisms (9). Cells in vertebrate organisms respond to excess iron chiefly by post-transcriptional mechanisms (9-11). Factors that modulate the effect of iron on ferritin expression are cytokines [e.g., interleukin-1β (IL-1β) and tumor necrosis factor-α (TNF-α)], nitric oxide (NO), and oxidative agents (e.g., H2O2). The relationship between these factors and iron on translational expression of ferritin is complex, but recent results are helping to sort out the connections.

1 Present address: Richard Stockton College of New Jersey, NAMS Office, Pomona, NJ 08240-0195.
2 To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 55. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
1. The IRE and IRPs Are Necessary for Iron Inducibility of Ferritin All ferritin mRNAs in vertebrate organisms contain an iron-responsive element (IRE) near the 5’ cap (12).The IRE forms a stem-loop structure wherein the loop contains the consensus sequence CAGUGH (where H may be C, U, or A), adjacent to a canonical 5-base pair (bp) stem (Fig. 1)(13-15). Within the loop sequence, a base-pair interaction occurs between positions 1 and 5 (C and G, in the consensus sequence) (16, 17).In addition, there is a conserved unpaired cytosine residue on the 5’ side of the IRE immediately preceding the 5-bp stem. An IRE located in the 5’ untranslated region (5’UTR) of a mRNA serves as a binding site for two related types of iron-responsive proteins (IRP-1 and IRP-21, which, when bound to the IRE, serve to repress translation of that mRNA (9-12, 18, 19). Binding of IRPs to the IRE may prevent access of eIF-4F (the cap binding protein) to the 5’ cap structure, thereby blocking initiation (20-23). When iron levels increase in the cell, IRPs are induced to dissociate from the IRE, thereby allowing initiation of translation of ferritin mRNA (9, 10, 12, 18, 19). Expressed homologs of IRP-1 are found in organisms as diverse as chickens, frogs, fish, flies, humans, and mice (24), and cDNAs for IRP-2 have been obtained from rats and humans (25).The reason for the existence of two IRPs is unclear. Rat IRP-2 is 61%homologous to rat IRP-1 and the primary difference between the two is a 73-aminoacid sequence found in IRP-2 that is absent from IRP-1. IRP-2 binds to the same wild-type IRES as does IRP, but its specificity varies slightly (16). Messages other than ferritin contain a functional IRE in their 5‘ UTRs that serve to depress translation when iron levels are low. 
These messages encode erythroid b-aminolevulinic acid synthase (&ALAS) (22,26)and mitochondrial aconitase (mAcon) (27).The message for transferrin receptor (TfR; the major receptor protein involved in the cellular uptake of iron &om serum) contains several functional IREs in the 3’ untranslated region (3’ UTR) of its mRNA (28).While bound to the IREs in TfR mRNA, IRP confers protection from degradation by nucleases (29-31). Dissociation of IRP by iron uptake results in rapid degradation of TfR mRNA, which correlates in time with an activation of ferritin translation and creates a cycle for iron uptake/storage (9). In addition to its role as a binding site for the IRP, the IRE acts as a
FERRITIN TRANSLATIONAL REGULATORY SEQUENCE ELEMENTS
123
3
G5
2+*.A ...,,
IRE lC
8 to 10 nucleotides; some base-paired with opposing strand
C
E
K1
Elk
1
Base-paired Flanking Region
FIG. 1. A schematic representation of ii C O I I S ~ I I S IIHE. I~ The interaction lietween C - I and G-5 within the loop is indicated. H represents ;in\ iiiicleotitle except C. Also showii is ii Ixuepaired flanking sequence located 8-10 iiiiclrotitles froin the lower stein of the IHE. Iw)stubtetl to enhance iron inducibilitv of the IHE (1.2, .3.2).
translational enhancer element for ferritin in vitro in the absence of IRP (12, 32). Thus, the IRE can confer both positive and negative regulation of open reading frame (ORF) translation. A manifestation of this activity in vivo is that polysome formation on control mRNAs not containing an IRE is less efficient than on analogous mRNAs containing an IRE in cells treated with iron (33). The correlation of translational efficiency with the presence of an IRE is interesting, but of unknown significance. It has been postulated that an IRE adjacent to a 5' cap structure might enhance cap accessibility and thereby promote binding of translation initiation factors (12, 21). Direct binding of translation initiation factors to the IRE seems unlikely. In vitro RNA-bandshift experiments using lysates from heme- or iron-treated cells reveal no specific complex formation with a minimal IRE structure. These observations do not preclude a low-affinity interaction of some protein with the IRE that occurs in the absence of IRP. The weak binding of such a protein could be stabilized by association of the 40S ribosomal subunit. Regions flanking the IRE account for a roughly twofold effect in translational induction of ferritin synthesis in vitro (Fig. 1) (12, 34), consistent with observations that the affinity of IRP-1 for the ferritin IRE is increased when flanking regions are extended (35). However, studies using plasmid constructs containing genes with a complete IRE (containing flanking sequences) followed by a heterologous reporter ORF routinely show induction by iron of no more than sixfold, whereas endogenous ferritin synthesis is induced 25- to 50-fold (11). Therefore, the difference between the iron inducibility found with an IRE alone versus that seen with endogenous ferritin must be due to sequences downstream from the IRE.
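The consensus loop sketched in Fig. 1 lends itself to a simple pattern search. The following is a minimal sketch only: it assumes the loop sequence 5'-CAGUGH-3' (H = A, C, or U) commonly given for the IRE in the general literature, it ignores the stem, the bulged C, and the flanking region entirely, and the function name and the toy sequence are hypothetical illustrations rather than anything taken from the studies cited here.

```python
import re

# Assumed consensus IRE loop, 5'-CAGUGH-3' with H = A, C or U.
# The lookahead lets overlapping candidates be reported as well.
IRE_LOOP = re.compile(r"(?=(CAGUG[ACU]))")


def find_ire_loops(rna: str) -> list[int]:
    """Return 0-based start positions of candidate IRE loop motifs.

    Accepts RNA or DNA letters in either case; T is read as U.
    Only the six-nucleotide loop is checked, not the stem structure.
    """
    seq = rna.upper().replace("T", "U")
    return [m.start() for m in IRE_LOOP.finditer(seq)]


# Hypothetical toy sequence (not a real ferritin 5' UTR):
# the first CAGUG is followed by U (allowed), the second by G (excluded).
toy = "AAACAGUGUAAACAGUGGAAA"
print(find_ire_loops(toy))
```

A hit from such a search marks only a candidate; whether a sequence behaves as an IRE depends on the base-paired stem and flanking context discussed above.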
A. Effects of Iron and Hemin on the IRPs
The question of how IRPs sense iron availability has been addressed by several laboratories. One widely accepted model for IRP-1 regulation (the "aconitase" model) holds that chelatable iron is directly involved in the iron-sensing mechanism (9, 10, 19). This model was originally based on the homology of IRP-1 with mitochondrial aconitase (mAcon), which forms an iron-sulfur cluster and has aconitase activity (36, 37). It has since been shown that IRP-1 can form an iron-sulfur cluster and exhibit aconitase activity like that of mAcon (37-41). IRP-1 with a fully loaded iron-sulfur cluster (i.e., cAcon) has only very weak affinity for IREs; therefore, iron serves to reduce the affinity of IRP-1 for IREs and thereby increases translation of ferritin, its major storage protein. In another model of IRP-1 regulation (the "heme" model), it is postulated that influx of iron causes cytoplasmic heme levels to rise. Excess heme binds to IRP-1 and triggers its proteolysis (42). This process is thermodynamically irreversible, and new IRP-1 must be synthesized in order to restore the repressed state. IRP binds to heme in vitro and in vivo (43, 44). The heme-IRP-1 complex apparently forms several large species that are subsequently degraded proteolytically in an ill-defined pathway (42, 45). For IRP-1, this process is detectable only under certain cell growth conditions, so its significance is unclear (45). The iron response of IRP-2 differs strikingly from the aconitase model and more closely resembles the heme model for IRP-1 (25, 46-52). Recent evidence indicates that, unlike IRP-1, IRP-2 cannot acquire aconitase activity in vivo or in vitro (48). Iron administration down-regulates the IRE-binding activity of IRP-2 by way of a specific degradation of IRP-2 by the proteasome degradation pathway (25, 52). 
Also, inclusion of succinyl acetone (an inhibitor of heme biosynthesis) prevents the breakdown of IRP-2 by iron treatment but not by hemin treatment in RAB-9 cells (L. S. Goessling and R. E. Thach, unpublished results), suggesting that heme may be required for the degradation of IRP-2.
B. Effects of Agents Other Than Iron on IRP Activity
Nitric oxide (NO) is an extremely potent second messenger of many cytokine and neurotransmitter actions (53-56). NO affects many iron-containing proteins (55, 57-59), including aconitases (53, 56, 60, 61). NO acts on iron-containing proteins via metabolites of NO and oxygen (57, 58). For example, NO disrupts the iron-sulfur cluster in cAcon and enhances the RNA-binding ability of IRP-1 (56, 60, 61). It has been suggested that one function of cAcon may be to sense NO and transduce this signal to alter iron metabolism (11, 59). This possibility is supported by results observed when rat cerebella are treated with N-methyl-D-aspartate (NMDA), a neurotransmitter that, among other effects, induces nitric oxide synthase activity. NMDA activates the binding of IRP-1 to the IRE by action of NO (56). In contrast, recent results indicate that the IRE-binding activity of IRP-1 is not altered in rat liver during acute inflammation, when high levels of NO accrue (50). This suggests that NO may not convert cAcon to IRP-1 in acutely inflamed rat liver (50). Unlike that of IRP-1, the RNA-binding activity of IRP-2 increases in inflamed rat liver, and TfR mRNA accumulates for approximately 18 hours. Treatment with NG-nitro-L-arginine (an NO synthase inhibitor) blocks these effects, indicating a direct involvement of NO in this process. Although IRP-2 contains the conserved cysteine residues required for iron-sulfur coordination, it has not been possible to detect aconitase activity under any condition (25, 48). Therefore, IRP-2 may not bind iron to form an iron-sulfur cluster for NO to disrupt. Thus, in two systems that address the question of whether cAcon transduces the NO signal to alter iron metabolism, cAcon can be either unaffected or converted to IRP-1 by NO. It is not known whether the outcome is dependent on cell type or pathway of NO induction. 
Nevertheless, because NO synthase activity is related to intracellular iron availability in murine macrophages (62), it is certain that iron metabolism and the inflammation response are connected, and that a key mediator is NO. Hydrogen peroxide is a natural by-product of cellular metabolism, and is a potent oxidizing agent. Levels of H2O2, like those of NO, increase during the inflammatory response (63). Recently, it was observed that the IRE-binding activity of IRP-1 is increased by treatment of cultured B6 murine fibroblasts with 100 µM H2O2 (64). H2O2 did not alter the intracellular iron levels, indicating that iron depletion is not likely to account for this effect on the IRE-binding activity of IRP-1. The effect on RNA-binding ability was not observed in cell lysates, suggesting a role for cellular mechanisms initiated by
H2O2 rather than a direct action of H2O2 on IRP (64). Very recent evidence suggests that H2O2 can act as a signal-transducing molecule for many cellular activities in mammals (65). Alternatively, H2O2 could contribute to production of Fenton-derived oxygen radicals, which may act directly on the iron-sulfur cluster of cAcon to transform it into IRP; however, cell lysates may lack metabolic pathways to reduce the oxidized IRP. The ultimate effect of both NO and H2O2 on ferritin translation and/or TfR message stability seems to reverse the effects of iron treatment, although each appears to act on a different IRP by different mechanisms. One ramification of the recent results regarding the effect of NO and/or H2O2 on IRP-1 and -2 during inflammation is that, at least in liver, inflammation induces a state whereby the IRE-binding activity of IRP is induced and TfR levels increase. Cells in the liver, therefore, take up serum iron, if it is available, during inflammation. Consistent with this picture, chronic inflammation from various sources is often accompanied by anemia in humans. Another factor to consider in iron metabolism is the growth status of the cells. IRP-1 can undergo phosphorylation mediated by protein kinase C (PKC) (66). Phosphorylation correlates with increased IRE-binding affinity of IRP-1 in vitro; however, the phosphorylatable serines of human IRP-1 are not present in human IRP-2 (66). The significance of this observation is not clear. Because mitosis is linked to PKC activity, the phosphorylation of IRP may also be related to cell proliferation and to cytokine stimulation of cells. We have begun to ask whether IRP-1 might be degraded by the hemin pathway only when phosphorylated. This would be consistent with reports that hemin induces more IRP-1 degradation in cells that are rapidly growing (45). 
The results from the aforementioned studies underscore the difficulty in unraveling the complex relationships between iron metabolism and other metabolic pathways.
II. Sequences Downstream of the IRE Augment Its Function
A. Cytokine-responsive RNA Sequences in the 5' UTR of Ferritin
Cytokines such as IL-1β modulate the expression of ferritin in the presence of iron (67, 68). The regulation is accomplished by both transcriptional (69-71) and translational mechanisms (72, 73). For instance, in human hepatoma cells, IL-1β increases the synthesis of both H- and L-ferritin (72) without an increase in the corresponding mRNA levels (70-73). The induction of L-ferritin synthesis is accompanied by a shift of L-ferritin mRNA from
monosomes to polysomes, indicating an enhancement of its translatability (72). The individual effects of cytokines vary, as reviewed previously (11, 68). A recent report shows that a 20-nucleotide sequence within the ferritin H-chain 5' UTR, distinct from the IRE, is responsible for an approximately twofold enhancement of translation in vivo following treatment with IL-1β (74). This 20-nucleotide sequence is similar to sequences found in many mRNAs that encode acute-phase response proteins; thus it has been named the acute box (67-73). When the iron chelator Des is added in conjunction with IL-1β, no change in L-ferritin translation is observed (72). Thus, the translational induction by IL-1β must occur after release of IRP from the IRE. The acute box is thought to act at the level of translational initiation.
B. Iron-responsive RNA Sequences in the Ferritin mRNA ORF Augment the Function of the IRE
In early studies, it was clearly demonstrated that the 28-nt IRE is both necessary and sufficient to confer iron inducibility on mRNAs (75, 76). However, it was noted that the full range of inducibility, exemplified by endogenous ferritin, is not achieved. Consistent with this observation, several reports over the past several years implicate sequences other than the IRE in optimal ferritin translational induction (12, 33, 34, 73, 77-79). Only one report has suggested that full iron inducibility of a heterologous ORF can be conferred by the IRE (80). More typically, endogenous ferritin synthesis undergoes a 25- to 50-fold induction on iron stimulation, whereas an IRE placed within the 5' UTR of a human growth hormone (hGH) cDNA transfected into HeLa cells conferred only a 5- to 8-fold iron induction on hGH production (20, 75, 81). Similarly, an IRE placed within the 5' UTR of a CMV-promoted CAT gene transfected into Chinese hamster ovary (CHO) cells confers only 3-fold iron induction (compared to 27-fold for the endogenous ferritin) (33). A slightly better iron inducibility was obtained in HepG2 cells (8- to 16-fold) when a construct containing the ferritin H-chain IRE was placed at the 5' end of a message encoding CAT (73); however, this is still lower than that expected for endogenous ferritin. Furthermore, sequences within the 5' UTR do not enhance the iron inducibility of translation beyond that of an IRE alone (73, 81). The studies mentioned above, which show lower iron inducibility, relative to endogenous ferritin, of mRNAs bearing an IRE and ferritin 5' UTR, lead to the hypothesis that sequences within the ferritin ORF direct maximal iron inducibility. This possibility has been tested using stably transfected mouse C127 cells (Table I) (78). Transfection of a plasmid DNA construct [pBMCF(APst)T] containing all of the rabbit ferritin light chain (rfL) 5' UTR
TABLE I. RATES OF REPORTER PROTEIN SYNTHESIS IN MOUSE C127 CELLS STABLY TRANSFECTED WITH PLASMIDS. Columns: Plasmid; mRNA produced; Fold induction (iron/no iron); mRNA half-life (hours). Protein is rabbit ferritin light chain (rfL) or CAT. For details, see Ref. 78.
and ORF proximal to a downstream CAT ORF results in messages that are iron inducible approximately 15-fold. The mRNA produced from transcription of pBMCF(APst)T is also quite stable (t1/2 of 20 hours). In contrast, transfection of a plasmid [pBMCF(ASacA)T] that contains only the rfL 5' UTR (including the IRE), followed by a CAT ORF, results in messages that are inducible approximately 6-fold (Table I). The mRNA produced from this construct is less stable than that produced from pBMCF(APst)T (t1/2 of 5 hours vs. 20 hours). These results strongly suggest that the IRE alone, or with the 5' UTR, provides only 40%, at best, of the inducibility of endogenous ferritin synthesis, in agreement with other data (33, 73, 81). Additionally, sequences downstream from the IRE may increase message stability. An interesting phenomenon observed with pBMCF(APst)T is that translation of the second ORF, CAT, is severely restricted. The ratio of rfL to CAT synthesis rates in transfected C127 cells grown in the presence of iron was greater than 500:1 on a molar basis (78). Other messages in which the rfL ORF was fused in-frame with a different reporter gene [encephalomyocarditis virus polypeptide (EMC), which served as an antigenic "tag"] were also examined for this effect on downstream ORF suppression. The result was a ratio of rfL-EMC fusion protein to CAT synthesis of over 500:1, which was similar to that observed for pBMCF(APst)T. In contrast, a message that contained only the rfL IRE followed by the antigenic reporter gene displayed a ratio of 20:1 for expression of the first ORF vs. the downstream CAT ORF, whereas a similar message entirely lacking the IRE produced a ratio of 2:1 for the same two polypeptides. This
loss of suppression of downstream ORF translation correlates strongly with the loss of iron inducibility and message stability (78). The basis for the effect of the downstream ferritin sequences on iron inducibility, mRNA stability, and downstream ORF suppression is unknown; however, it is plausible to interpret these observations in terms of mRNA and mRNP structure (see Fig. 2). In general, mRNA in the cytoplasm is known to be associated with an array of proteins. The associated proteins stabilize given conformations of the RNA and protect it from nucleases (82). Ribosome migration over an mRNA must necessarily unwind secondary/tertiary structures and probably induces the dissociation of protein factors. If the equilibrium is shifted by the ribosomes such that the stabilizing proteins do not reassociate between passes of ribosomes, the mRNA could be exposed to nucleases. This effect is even more striking when an mRNA contains a specific destabilizing element, such as exists for a related set of mRNAs, pBMCF(ASacA) and pBMCF(APst). See reference 78 for details. Given these generally accepted observations about mRNP particles, the effect of the ferritin downstream sequences on iron inducibility of translation, message stability, and downstream ORF suppression might be explained by the following model (Fig. 2). Here, cis-acting sequences or structures (regulatory sequence elements, RSEs) within the ferritin ORF serve as binding sites for one or more specific proteins (regulatory sequence proteins, RSPs). An RSP could interact with the IRP to stabilize a ternary complex, thereby increasing the range of iron inducibility. In addition, an RSP might mask the mRNA from nucleases as well as inhibit the translation of downstream ORFs. Figure 2B depicts a scenario wherein an RSP is interacting with an RSE within the ferritin ORF of a bicistronic mRNA. 
Bound IRP prevents association of the small ribosomal subunit, and an array of proteins, including RSP, protects the untranslated RNA from degradation. On induction by iron, the IRP dissociates and translation of the message occurs. If an RSP remains associated with an mRNP complex that covers the second ORF, only translation of the first ORF will occur. Additionally, the RSP could transiently associate with the RSE between passes of ribosomes to stabilize the mRNA against degradation. This is consistent with results obtained with pBMCF(APst)T, wherein the mRNA is very stable and highly iron-inducible, and translation of the downstream ORF is greatly suppressed (Table I). Recent data corroborate the results shown here (D. P. Mascotti, L. S. Goessling, and D. Rup, unpublished data). Using a HeLa cell line stably transfected with the gene encoding the tet transactivator protein (tTa) as well as plasmids containing genes under control of the tTa, we have found that translational iron inducibility of a ferritin-luciferase (LUC) fusion protein is 14- to 20-fold. However, using the same system with a gene containing only
FIG. 2. Potential roles for regulatory sequence elements (RSEs) are shown schematically. (A) A scenario where a bicistronic message containing an IRE at the 5' end is translated on iron treatment. Lacking RSEs, the message is degraded during translation. (B) A scenario like that of A, except that there is an RSE located within the first ORF. The RSE provides a site for an RSP to bind the RNA to provide protection from nucleolytic attack as well as stabilize the mRNP covering the second ORF.
the LUC ORF downstream from an IRE, there is only a 3- to 5-fold inducibility. This corroborates the results obtained earlier using a different transcriptional induction protocol (78). More work is under way to map the RSEs that are downstream from the IRE.
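The comparisons in this section rest on simple arithmetic, which can be made explicit. The sketch below is illustrative only: it uses the approximate fold inductions and half-lives quoted in the text for the C127 constructs, the function names are hypothetical, and the first-order (exponential) decay of mRNA is an assumption made here for illustration, since the kinetics are not stated in the text.

```python
def fraction_of_reference(fold: float, reference_fold: float) -> float:
    """Fold induction of one construct as a fraction of a reference value."""
    return fold / reference_fold


def fraction_remaining(hours: float, half_life_hours: float) -> float:
    """Fraction of an mRNA pool left after `hours`.

    Assumes simple first-order decay (an assumption, not a result
    reported in the text).
    """
    return 0.5 ** (hours / half_life_hours)


# Approximate values quoted in the text (Table I, Ref. 78).
IRE_ONLY_FOLD = 6.0       # rfL 5' UTR (with IRE) followed by CAT ORF
IRE_PLUS_ORF_FOLD = 15.0  # rfL 5' UTR and rfL ORF followed by CAT ORF

# The "40%" figure corresponds to 6-fold relative to 15-fold.
print(fraction_of_reference(IRE_ONLY_FOLD, IRE_PLUS_ORF_FOLD))

# Relative to the 25- to 50-fold range of endogenous ferritin,
# the same 6-fold induction is a smaller fraction still.
for endogenous_fold in (25.0, 50.0):
    print(endogenous_fold, fraction_of_reference(IRE_ONLY_FOLD, endogenous_fold))

# Message remaining after 20 hours for half-lives of 5 and 20 hours.
for t_half in (5.0, 20.0):
    print(t_half, fraction_remaining(20.0, t_half))
```

Under the first-order assumption, a message with a 5-hour half-life is reduced to about one-sixteenth of its starting level in the time a message with a 20-hour half-life falls to one-half.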
III. Comments and Future Directions
There have been numerous reports demonstrating that the effect of the ferritin IRE on iron inducibility of heterologous open reading frames is not as potent as its effect on endogenous ferritin mRNAs. The difference in inducibility is accounted for by sequences downstream from the IRE, and probably within the ferritin ORF (Table I); however, mRNAs that contain the ferritin ORF but lack an IRE have little or no iron inducibility. This indicates a requirement for the IRP to bind the IRE, and this interaction serves as the fundamental iron-responsive component of the translational regulatory system for ferritin. Treatment of cells with iron or hemin results in either formation of cAcon or degradation of IRP; in both cases new IRP must generally be synthesized to re-repress ferritin translation (45). It thus appears that the IRP is a cornerstone of the untranslated ferritin mRNP. Beyond the interaction of IRPs with the IRE, other interactions are postulated here for proteins with regulatory sequence elements located within the ferritin ORF. We have speculated that these proteins may serve to stabilize the binding of the IRP and to protect the mRNA from endonucleolytic attack. It would be interesting to know whether IRP-1 and IRP-2 interact in the same way with these proteins. These questions require further investigation.
ACKNOWLEDGMENT
DPM is supported by an NIH postdoctoral fellowship (F32-DK08987).
REFERENCES
1. M. J. Grossman, S. M. Hinton, V. Minak-Bernero, C. Slaughter and E. I. Stiefel, PNAS 89, 2419 (1992).
2. B. Halliwell and J. M. C. Gutteridge, Methods Enzymol. 186(B), 1 (1990).
3. R. R. Crichton, in "Inorganic Biochemistry of Iron Metabolism" (R. R. Crichton, ed.), p. 213. Horwood Publishing, New York, 1991.
4. G. Balla, H. S. Jacob, J. Balla, M. Rosenberg, K. Nath, F. Apple, J. W. Eaton and G. M. Vercellotti, JBC 267, 18148 (1992).
5. V. Herbert, S. Shaw, E. Jayatilleke and T. Stopler-Kasdan, Stem Cells 12, 289 (1994).
6. J. W. Halliday and L. W. Powell, in "Iron and Human Disease" (R. B. Lauffer, ed.), p. 131. CRC Press, Boca Raton, FL, 1992.
7. J. L. Sullivan, Am. Heart J. 117, 1177 (1989).
8. R. B. Lauffer, in "Iron and Human Disease" (R. B. Lauffer, ed.), p. 1. CRC Press, Boca Raton, FL, 1992.
9. R. D. Klausner, T. A. Rouault and J. B. Harford, Cell 72, 19 (1993).
10. O. Melefors and M. W. Hentze, BioEssays 15, 85 (1993).
11. D. P. Mascotti, D. Rup and R. E. Thach, Annu. Rev. Nutr. 15, 239 (1995).
12. E. C. Theil, BioFactors 4, 87 (1993).
13. Y.-H. Wang, S. R. Sczekan and E. C. Theil, NARes 18, 4463 (1990).
14. A. J. E. Bettany, R. S. Eisenstein and H. N. Munro, JBC 267, 16531 (1992).
15. S. R. Jaffrey, D. J. Haile, R. D. Klausner and J. B. Harford, NARes 21, 4627 (1993).
16. B. R. Henderson, E. Menotti, C. Bonnard and L. C. Kuhn, JBC 269, 17481 (1994).
17. H. Sierzputowska-Gracz, R. A. McKenzie and E. C. Theil, NARes 23, 146 (1995).
18. E. A. Leibold and B. Guo, Annu. Rev. Nutr. 12, 345 (1992).
19. H. Munro, Nutr. Rev. 51, 65 (1993).
20. B. Goossen, S. W. Caughman, J. B. Harford, R. D. Klausner and M. W. Hentze, EMBO J. 9, 4127 (1990).
21. R. E. Thach, Cell 68, 177 (1992).
22. C. R. Bhasker, G. Burgiel, B. Neupert, A. Emery-Goodman, L. C. Kuhn and B. K. May, JBC 268, 12699 (1993).
23. N. K. Gray and M. W. Hentze, EMBO J. 13, 3882 (1994).
24. S. Rothenberger, E. W. Mullner and L. C. Kuhn, NARes 18, 1175 (1990).
25. B. Guo, J. D. Phillips, Y. Yu and E. A. Leibold, JBC 270, 21645 (1995).
26. B. K. May, C. R. Bhasker, M. J. Bawden and T. C. Cox, Mol. Biol. Med. 7, 405 (1990).
27. T. Dandekar, R. Stripecke, N. K. Gray, B. Goossen, A. Constable, H. E. Johansson and M. W. Hentze, EMBO J. 10, 1903 (1991).
28. J. B. Harford, in "Control of Messenger RNA Stability" (J. Belasco and G. Brawerman, eds.), p. 239. Academic Press, San Diego, CA, 1993.
29. J. L. Casey, D. M. Koeller, V. C. Ramin, R. D. Klausner and J. B. Harford, EMBO J. 8, 3693 (1989).
30. E. W. Mullner, B. Neupert and L. C. Kuhn, Cell 58, 373 (1989).
31. D. M. Koeller, J. L. Casey, M. W. Hentze, E. M. Gerhardt, L.-N. L. Chan, R. D. Klausner and J. B. Harford, PNAS 86, 3574 (1989).
32. D. J. Dix, P.-N. Lin, Y. Kimata and E. C. Theil, Bchem 31, 2818 (1992).
33. R. M. R. Coulson and D. W. Cleveland, PNAS 90, 7613 (1993).
34. D. J. Dix, P.-N. Lin, A. R. McKenzie, W. E. Walden and E. C. Theil, JMB 231, 230 (1993).
35. H. A. Barton, R. S. Eisenstein, A. Bomford and H. N. Munro, JBC 265, 7000 (1990).
36. M. W. Hentze and P. Argos, NARes 19, 1739 (1991).
37. T. A. Rouault, C. D. Stout, S. Kaptain, J. B. Harford and R. D. Klausner, Cell 64, 881 (1991).
38. D. J. Haile, T. A. Rouault, C. K. Tang, J. Chin, J. B. Harford and R. D. Klausner, PNAS 89, 7536 (1992).
39. D. J. Haile, T. A. Rouault, J. B. Harford, M. C. Kennedy, G. A. Blondin, H. Beinert and R. D. Klausner, PNAS 89, 11735 (1992).
40. A. Constable, S. Quick, N. K. Gray and M. W. Hentze, PNAS 89, 4554 (1992).
41. M. C. Kennedy, L. Mende-Mueller, G. A. Blondin and H. Beinert, PNAS 89, 11730 (1992).
42. L. S. Goessling, S. Daniels-McQueen, M. Bhattacharyya-Pakrasi, J. J. Lin and R. E. Thach, Science 256, 670 (1992).
43. J. J. Lin, M. M. Patino, L. Gaffield, W. E. Walden and R. E. Thach, PNAS 88, 6068 (1991).
44. D. P. Mascotti, L. S. Goessling, D. Rup and R. E. Thach, in "Metal Ions in Gene Regulation" (S. Silver and W. E. Walden, eds.). Chapman & Hall, New York (in press).
45. L. S. Goessling, D. P. Mascotti, M. Bhattacharyya-Pakrasi, H. Gang and R. E. Thach, JBC 269, 4343 (1994).
46. B. R. Henderson, C. Seiser and L. C. Kuhn, JBC 268, 27327 (1993).
47. G. Cairo and A. Pietrangelo, JBC 269, 6405 (1994).
48. B. Guo, Y. Yu and E. A. Leibold, JBC 269, 24252 (1994).
49. F. Samaniego, J. Chin, K. Iwai, T. A. Rouault and R. D. Klausner, JBC 269, 30904 (1994).
50. G. Cairo and A. Pietrangelo, EJB 232, 358 (1995).
51. B. R. Henderson and L. C. Kuhn, JBC 270, 20509 (1995).
52. H.-Y. Kim, R. D. Klausner and T. A. Rouault, JBC 270, 4983 (1995).
53. N. Welsh, D. L. Eizirik, K. Bendtzen and S. Sandler, Endocrinology 129, 3167 (1991).
54. C. J. Lowenstein and S. H. Snyder, Cell 70, 705 (1992).
55. Y. Henry, M. Lepoivre, J.-C. Drapier, C. Ducrocq, J.-L. Boucher and A. Guissani, FASEB J. 7, 1124 (1993).
56. S. R. Jaffrey, N. A. Cohen, T. A. Rouault, R. D. Klausner and S. H. Snyder, PNAS 91, 12994 (1994).
57. L. Castro, M. Rodriguez and R. Radi, JBC 269, 29409 (1994).
58. A. Hausladen and I. Fridovich, JBC 269, 29405 (1994).
59. J. S. Stamler, Cell 78, 931 (1994).
60. J.-C. Drapier, H. Hirling, J. Wietzerbin, P. Kaldy and L. C. Kuhn, EMBO J. 12, 3643 (1993).
61. G. Weiss, B. Goossen, W. Doppler, D. Fuchs, K. Pantopoulos, G. Werner-Felmayer, H. Wachter and M. W. Hentze, EMBO J. 12, 3651 (1993).
62. G. Weiss, G. Werner-Felmayer, E. R. Werner, K. Grunewald, H. Wachter and M. W. Hentze, J. Exp. Med. 180, 969 (1994).
63. G. Camussi, E. Albano, C. Tetta and F. Bussolino, EJB 202, 3 (1991).
64. K. Pantopoulos and M. W. Hentze, EMBO J. 14, 2917 (1995).
65. M. Sundaresan, Z.-X. Yu, V. J. Ferrans, K. Irani and T. Finkel, Science 270, 296 (1995).
66. R. S. Eisenstein, P. T. Tuazon, K. L. Schalinske, S. A. Anderson and J. A. Traugh, JBC 268, 27363 (1993).
67. J. T. Rogers, in "Iron and Human Disease" (R. B. Lauffer, ed.), p. 77. CRC Press, Boca Raton, FL, 1992.
68. S. V. Torti and F. M. Torti, Adv. Inorg. Biochem. 10, 119 (1994).
69. Y. Wei, S. C. Miller, Y. Tsuji, S. V. Torti and F. M. Torti, BBRC 169, 289 (1990).
70. L. L. Miller, S. C. Miller, S. V. Torti, Y. Tsuji and F. M. Torti, PNAS 88, 4946 (1991).
71. E. L. Kwak, D. A. Larochelle, C. Beaumont, S. V. Torti and F. M. Torti, JBC 270, 15285 (1995).
72. J. T. Rogers, K. R. Bridges, G. P. Durmowicz, J. Glass, P. E. Auron and H. N. Munro, JBC 265, 14572 (1990).
73. J. T. Rogers, J. L. Andriotakis, L. Lacroix, G. P. Durmowicz, K. D. Kasschau and K. R. Bridges, NARes 22, 2678 (1994).
74. M. Fahmy and S. P. Young, BJ 296, 175 (1993).
75. M. W. Hentze, T. A. Rouault, S. W. Caughman, A. Dancis, J. B. Harford and R. D. Klausner, PNAS 84, 6730 (1987).
76. M. W. Hentze, S. W. Caughman, T. A. Rouault, J. G. Barriocanal, A. Dancis, J. B. Harford and R. D. Klausner, Science 238, 1570 (1987).
77. C. M. Harrell, A. R. McKenzie, M. M. Patino, W. E. Walden and E. C. Theil, PNAS 88, 4166 (1991).
78. S. Daniels-McQueen, L. S. Goessling and R. E. Thach, Gene 122, 271 (1992).
79. I. Toth, J. T. Rogers, J. A. McPhee, S. M. Elliott, S. L. Abramson and K. R. Bridges, JBC 270, 2846 (1995).
80. S. W. Caughman, M. W. Hentze, T. A. Rouault, J. B. Harford and R. D. Klausner, JBC 263, 19048 (1988).
81. B. Goossen and M. W. Hentze, MCBiol 12, 1959 (1992).
82. G. Dreyfuss, Annu. Rev. Cell Biol. 2, 459 (1986).
Depletion of Nuclear Poly(ADP-ribose) Polymerase by Antisense RNA Expression: Influence on Genomic Stability, Chromatin Organization, DNA Repair, and DNA Replication
CYNTHIA M. G. SIMBULAN-ROSENTHAL, DEAN S. ROSENTHAL, RUCHUANG DING, JOANY JACKMAN AND MARK E. SMULSON¹
Department of Biochemistry and Molecular Biology
Georgetown University School of Medicine
Washington, D.C. 20007
I. Biological Roles of PARP as Assessed by Studies with Chemical Inhibitors
II. Molecular Biological Approaches to Study Functional Roles of PARP
III. Induction of PARP Antisense RNA Depletes Endogenous PARP
IV. Influences of Organization and Genomic Stability
V. Effects of PARP Antisense RNA Expression on Nuclear DNA Repair, ...
VI. Other Putative Roles of PARP Currently under Study: Apoptosis
Poly(ADP-ribosyl)ation, a post-translational modification of nuclear proteins wherein ADP-ribose moieties from NAD are covalently attached to acceptor proteins, plays an important role in the regulation of DNA strand-break rejoining in a variety of nuclear processes, such as DNA replication, differentiation, recombination, and, particularly, DNA repair (1-19). This modification is catalyzed by poly(ADP-ribose) polymerase (PARP), a chromatin-associated, highly conserved enzyme that binds tightly to and is activated by DNA breaks, and modulates the structure and function of numerous nuclear proteins involved in either chromatin architecture or DNA metabolism (20). PARP encompasses three functional domains: a 46-kDa amino-terminal DNA-binding domain with two zinc-finger motifs acting as a sensor for DNA strand breaks and a bipartite nuclear localization signal; a central 22-kDa autopoly(ADP-ribosyl)ation domain; and a 54-kDa carboxy-terminal catalytic or NAD-binding domain (21).
¹ To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 55. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. Biological Roles of PARP as Assessed by Studies with Chemical Inhibitors
Many insights into the functional roles of PARP in various nuclear processes have been obtained largely from studies using competitive inhibitors of PARP, i.e., benzamide and its derivatives. It has been proposed that PARP plays a role primarily in DNA repair, because treatment of cells with PARP inhibitors hypersensitizes these cells to DNA-damaging agents. Inhibition of PARP activity by chemical inhibitors in cells exposed to DNA damage increases the frequency of DNA strand breaks and sister chromatid exchanges, and enhances DNA repair replication (1-6) as well as DNA amplification (22). A role for PARP in cell proliferation is supported by studies showing that PARP expression and activity are markedly enhanced in various proliferating cells, newly replicated chromatin, and replication-fork-enriched DNA fragments (7-13). Addition of PARP inhibitors results in a G , arrest of the cell cycle (12). Likewise, PARP plays a role in differentiation, as shown by induction of PARP expression and activity early in the differentiation process (17, 19), and a decrease is correlated with terminal differentiation (18). Inhibitor studies have also shown that PARP inhibitors markedly inhibit differentiation of 3T3-L1 cells into adipocytes (23). Finally, a potential negative role for PARP in apoptosis (programmed cell death) is indicated in recent studies showing that PARP undergoes proteolytic cleavage into 85- and 25-kDa fragments during chemotherapy-induced apoptosis (24) as well as during spontaneous apoptosis (25). Inhibitors of PARP have previously been reported to affect significantly the extent of apoptosis (26, 27), and it has been suggested that activation of PARP by DNA strand breaks rapidly depletes cells of NAD and ATP, leading to cell death (28, 29). We have recently purified a protease termed apopain that is responsible for the cleavage of PARP during apoptosis (25). This protease is composed of two subunits of 17 and 12 kDa derived from a common proenzyme termed CPP-32β (25, 30), which is related to the interleukin-1β-converting enzyme and CED-3 (products of genes required for apoptosis).
II. Molecular Biological Approaches to Study Functional Roles of PARP
Although inhibitor studies have proved useful in the elucidation of the biological functions of PARP and poly(ADP-ribosyl)ation, results are often contradictory, and these inhibitors have recently been shown to have limitations because of their lack of specificity, as well as their inhibition of mono(ADP-ribosyl)ation reactions (31-33). Cloning of the PARP gene, however, has allowed the development of more specific molecular genetic approaches to modulate the expression and activity of PARP within cells in order to gain further insights into its biological roles. Some of these molecular approaches include the use of deletion mutants of PARP (6, 34), the use of "knockout" mice with a disrupted PARP gene (35), trans-dominant inhibition of PARP activity by overexpression of its DNA-binding domain (4, 36-38), and depletion of endogenous PARP protein and activity by antisense RNA induction at selected biological time frames (5, 39-41). The latter approach is the focus of discussion in this essay.
A. Deletion Mutants of PARP
Using bacterially expressed mutants of PARP with deletions in the three functional domains of the enzyme, it was shown that deletion of the NAD-binding domain abolishes catalytic activity, whereas deletions in the DNA-binding or automodification domain only partially inhibit it (34). Furthermore, although deletion mutants with an intact DNA-binding domain inhibit DNA repair, those with a deletion in this domain have no effect on the in vitro repair assay (6). Inhibition of repair exhibited by mutants with deletions in the NAD-binding or the automodification domain was not alleviated by NAD. These results support the proposed model of the mechanism of PARP action in DNA strand-break rejoining, wherein unmodified PARP binds tightly to DNA breaks, and subsequent automodification releases the enzyme from DNA, allowing access to DNA repair enzymes (2).
B. Generation of Knockout Mice
Another molecular genetic approach recently employed to study PARP function uses homologous recombination in embryonic stem cells to generate knockout mice with a disrupted PARP gene. These mice are viable and fertile despite the fact that they lack PARP, which constitutes one of the
CYNTHIA M. C. SIMBULAN-ROSENTHAL ET AL.
major nonhistone nuclear proteins in eukaryotes, and exhibit no poly(ADP-ribosyl)ation in any tissue. However, whereas DNA repair is unaffected in mutant embryonic fibroblasts, PARP-deficient mutant fibroblasts and thymocytes exhibit proliferation deficiencies following DNA damage, and older mice develop spontaneous skin lesions, suggesting that PARP plays an ancillary role in cellular responses to environmental disturbances (35). Apparently, isolated cell systems show the more subtle effects of the lack of PARP that are not evident in whole animals.
C. Trans-dominant Inhibition of PARP by Overexpression of DNA-binding Domain
Another molecular genetic approach recently employed to inhibit poly(ADP-ribosyl)ation is overexpression of the PARP DNA binding domain (DBD) in cells in either transient or stable transfections. This peptide causes a trans-dominant inhibition of resident PARP activity due to competition between endogenous PARP and the overexpressed DBD for DNA strand breaks (36). Inducible, high-level expression of this domain in stably transfected cell lines leads to 90% reduction of poly(ADP-ribosyl)ation and sensitizes cells to the cytotoxic effects of DNA-damaging agents, but has no effect on cellular proliferation (37). Trans-dominant inhibition blocks DNA repair synthesis induced by alkylation damage (base excision), but not by UV irradiation (nucleotide excision) (4). When HeLa cell lines were established constitutively expressing either the wild-type DBD or a point-mutated version of the DBD, leading to trans-dominant inhibition of PARP activity, both cell lines had a markedly increased sensitivity to DNA-damaging agents as evidenced by an increase in doubling time, G2/M arrest, a decrease in cell survival, apoptotic DNA ladders, and chromosomal instability in damaged cells (38). However, the mechanism of action of dominant-negative mutants appears to involve more than DNA binding, because lines expressing a mutated DBD lacking DNA-binding activity had the same effects as the wild-type DBD.
D. Depletion of Nuclear PARP by Antisense RNA Expression
In the trans-dominant inhibition approach described above, although poly(ADP-ribosyl)ation is almost completely inhibited, the overexpressed PARP DBD has no effect on the endogenous levels of the PARP protein. Because cells are known to contain far more PARP than is required for poly(ADP-ribose) synthesis (42), other approaches may be necessary to elucidate fully the biological roles of PARP. The antisense RNA expression strategy involves the establishment of cell lines stably transfected with PARP antisense cDNA under the control of an inducible promoter, which selectively reduces the levels of endogenous PARP mRNA transcripts. Consequently, this lowers the endogenous PARP protein levels as well as its activity at selected biological time-frame windows. It has been suggested that antisense RNA depletes cells of their complementary endogenous mRNA by forming unstable hybrid sense-antisense duplex RNA molecules that are rapidly degraded or are incapable of initiating protein translation (43), thereby drastically reducing the levels of protein in the cell. Using this strategy, we have established and characterized a variety of mammalian cell lines, including HeLa cells (5, 39), keratinocytes (40), and 3T3-L1 preadipocytes (41), which are stably transfected with an inducible PARP antisense RNA construct, under the strict control of the long terminal repeat of the mouse mammary tumor virus (MMTV LTR).
III. Induction of PARP Antisense RNA Depletes Endogenous PARP mRNA, Protein Levels, and Activity at Selected Biological Time Frames
A. PARP Antisense Vectors and Cell Transfection
Either a 3.9-kb XhoI full-length human PARP cDNA (for the human HeLa cell and keratinocyte transfections), or a 1.1-kb fragment of murine PARP cDNA encoding the PARP DBD and the N-terminal automodification domain (for the mouse 3T3-L1 cell transfections), was subcloned in an antisense orientation in the expression vector pMAM-neo (Clontech), under the control of the dexamethasone-inducible MMTV promoter (Fig. 1). These pMAM-As or pMAM-neo (control) plasmids were transfected into cells by calcium phosphate precipitation or by lipofection. G418-resistant colonies were screened for reduced PARP activity following dexamethasone induction, and DNA analysis of these positive clones was performed for correct integration of the PARP antisense sequence. In transfections with HeLa cells, keratinocytes, or 3T3-L1 cells, intact integration of pMAM-As into genomic DNA occurred in high copy numbers; in HeLa cells, multiple integration of the exogenous gene was also observed (5).
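At the sequence level, "antisense orientation" can be illustrated with a minimal Python sketch. The 12-nucleotide sequence below is hypothetical and is not taken from PARP cDNA, and the function names are illustrative only. The sketch shows that a transcript read from a reverse-oriented insert is the reverse complement of the sense mRNA and can therefore base-pair with it along its full length.

```python
# Toy illustration of antisense orientation.
# The sequence is hypothetical (NOT PARP cDNA); names are illustrative only.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def transcribe(coding_strand: str) -> str:
    """RNA has the same sequence as the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def is_fully_complementary(rna_a: str, rna_b: str) -> bool:
    """True if rna_a pairs base-for-base with rna_b in antiparallel orientation."""
    if len(rna_a) != len(rna_b):
        return False
    return all(RNA_PAIR[x] == y for x, y in zip(rna_a, reversed(rna_b)))


sense_cdna = "ATGGCGGAGTCT"  # hypothetical insert, sense orientation
sense_mrna = transcribe(sense_cdna)

# Cloning the insert in reverse orientation makes the promoter read the
# opposite strand, so the transcript is the reverse complement of the mRNA.
antisense_rna = transcribe(reverse_complement(sense_cdna))

if __name__ == "__main__":
    print("sense mRNA   :", sense_mrna)
    print("antisense RNA:", antisense_rna)
    print("can form duplex:", is_fully_complementary(sense_mrna, antisense_rna))
```

The check at the end models only perfect Watson-Crick pairing; it says nothing about duplex stability or degradation in cells.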
B. Expression and Stability of PARP Antisense RNA in Transfected Cells
The kinetics of antisense PARP RNA expression was studied by Northern analysis using riboprobes that hybridize specifically to PARP antisense RNA. Antisense RNA transcripts were detected 5 hours after induction with dexamethasone and remained relatively constant up to 48 hours in HeLa cells (Fig. 2B), or were significantly decreased by 48 hours in 3T3-L1 cells (Fig. 2A), and were partially or totally degraded by 72 hours (Fig. 2). In contrast, as expected, antisense RNA was not detected in control cells in the absence or presence of dexamethasone, nor in antisense cell lines in the absence of dexamethasone, confirming that the MMTV promoter is under tight control in these cells (Fig. 2) and that this inducible expression system is not “leaky.”
FIG. 1. Structure and restriction sites of the pMAM-As plasmids containing (A) human or (B) murine PARP cDNA in antisense orientation downstream of the MMTV LTR. The expression vector contains the glucocorticoid-inducible MMTV promoter ligated to a 3.9-kb human (A) or a 1.1-kb murine (B) PARP 5’ cDNA fragment, in reverse orientation. The PARP sequence was flanked downstream by the SV-40 early splicing and polyadenylation regions, and a transcription start site within the MMTV LTR was located 260 bp upstream of the cloning site. The entire expression plasmid is 12.2 kb (A) or 9.4 kb (B).
FIG. 2. Expression and stability of PARP antisense RNA in (A) 3T3-L1 preadipocytes or (B) HeLa cells stably transfected with plasmid pMAM-As, following induction with dexamethasone. Control (nontransfected) cells and cells stably transfected with pMAM-As were incubated with or without dexamethasone (1 µM) for the indicated time periods, after which total RNA was isolated and 10 µg was subjected to Northern transfer and hybridization with a 32P-labeled riboprobe that detects PARP antisense RNA.
C. Effects on Endogenous PARP mRNA Transcripts, Protein Levels, and Activity
When the effects of PARP antisense RNA expression on the levels of endogenous PARP mRNA transcripts were determined either by Northern analysis (Fig. 3A) or by a more sensitive RNase protection assay (Fig. 3B), it was found that dexamethasone treatment for 48 to 72 hours had no effect on steady-state concentrations of PARP mRNA in control cells, whereas the levels of endogenous PARP transcripts in induced antisense cells were reduced by 80-95% in the three antisense cell lines studied. However, normal PARP mRNA concentrations were restored 16 hours after removal of dexamethasone (39). A significant reduction in immunologically detectable PARP protein was observed concomitant with the depletion of endogenous PARP mRNA transcripts in induced antisense cells, as evidenced by immunoblot analysis of total cellular proteins extracted from control and antisense cells induced with 1 µM dexamethasone for various periods up to 96 hours (Fig. 4). The blots were probed with polyclonal antibodies to either human or murine PARP. The levels of PARP protein in all antisense cell lines studied were reduced by 80-90% after 48 hours of exposure to dexamethasone, whereas PARP protein levels were essentially the same in treated control cells or in
FIG. 3. PARP antisense RNA expression depletes endogenous PARP mRNA. Control and antisense cell clones of 3T3-L1 cells (A) and keratinocytes (B) were incubated with or without dexamethasone (1 µM) for the indicated time periods, and total RNA was then isolated and analyzed by Northern analysis (A) or by ribonuclease protection assay (B), using radiolabeled PARP mRNA-specific probes.
[FIG. 4. Panels A and B: control cells and antisense cells, with dexamethasone for 0 to 96 hours or without dexamethasone (No Dex).]
uninduced antisense cells. To achieve significant depletion of PARP protein levels, an induction period of antisense expression of at least 48 hours is necessary, because it is known that PARP has a half-life of about 48 to 72 hours in cells (44). PARP protein levels in induced antisense cells correlated with a significant reduction in endogenous PARP catalytic activity, which was measured directly in sonicated total cell extracts. As summarized in Table I, PARP activity decreased by about 50-60% by 48 hours after induction with dexamethasone, with maximal inhibition (70-80%) by 72 hours in both HeLa cells and 3T3-L1 antisense cell lines. The effect was even more significant in keratinocyte antisense cell lines, where 80% reduction in enzyme activity was observed by 48 hours after induction. In contrast, uninduced antisense cells, or control cells treated with dexamethasone for the same time periods, exhibited no reduction in endogenous PARP activity. In the case of keratinocyte control cells, a slight reduction (13%) in endogenous PARP activity was observed. These results are consistent with reports that translatable endogenous sense mRNA is depleted by complementary antisense RNA when they hybridize and form duplex RNA molecules in nuclei (45), followed by inhibition
TABLE I
PARP ACTIVITY IS DEPLETED IN ANTISENSE 3T3-L1 PREADIPOCYTES, HELA CELLS, AND KERATINOCYTESᵃ

PARP activity of antisense cells (% activity relative to uninduced cells)

Time after Dex induction (hours)    3T3-L1    HeLa    Keratinocytes
0                                   100       100     100
24                                  65        nd      100
48                                  50        43      20
72                                  30        17      nd

ᵃ Antisense cell lines were induced by dexamethasone (1 µM) for the indicated time periods, sonicated, and initial velocity assays of [32P]NAD incorporation were performed as described previously (68). Values represent the average of triplicate determinations. nd, Not determined.
of nuclear to cytoplasmic transport of the sense mRNA and/or its enhanced degradation by RNase H, which is activated in the process (43). Consequently, inhibition of protein translation results in eventual depletion of endogenous protein levels and activity within the cell.
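The link between a 48- to 72-hour half-life and the need for at least 48 hours of induction can be sketched with a simple decay calculation. This is an idealized illustration only. It assumes that new PARP synthesis stops completely at the start of induction and that the existing protein decays with first-order kinetics, neither of which is stated in the text, and the function name is hypothetical.

```python
# Idealized first-order decay of a pre-existing protein pool.
# Assumptions (not from the source): synthesis stops completely at t = 0
# and the protein decays with a single, constant half-life.


def fraction_remaining(hours_since_block: float, half_life_hours: float) -> float:
    """Fraction of the initial protein pool left after a given time."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours_since_block / half_life_hours)


if __name__ == "__main__":
    for half_life in (48.0, 72.0):  # range of half-lives quoted in the text
        for t in (24, 48, 72, 96):
            pct = 100.0 * fraction_remaining(t, half_life)
            print(f"half-life {half_life:.0f} h, t = {t:2d} h: {pct:5.1f}% remaining")
```

Under these assumptions a protein with a 48-hour half-life is only half depleted after 48 hours, which is why short inductions cannot be expected to lower the level of a long-lived protein appreciably. Measured depletion in cells need not follow this idealized curve.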
IV. Influences of PARP Antisense RNA Expression on Chromatin Organization and Genomic Stability
We have shown that the induction of antisense RNA expression can selectively lower the levels of endogenous PARP mRNA transcripts and protein, as well as total catalytic activity, in stably transfected mammalian cells. The use of these PARP-depleted cells has allowed us to gain insights into some of the biological roles of PARP. Because the antisense RNA strategy depletes the enzyme protein itself, when required, rather than simply inhibiting its activity, as in studies with PARP inhibitors or in the trans-dominant inhibition approach, this method is advantageous because PARP has essential structural as well as catalytic roles in chromatin. A structural role of PARP in chromatin organization has been supported by earlier studies showing that PARP levels and activity per unit of chromatin change with polynucleosome chain length, and that PARP is bound in internucleosomal regions of chromatin (46) with a fixed periodicity of approximately one enzyme molecule per 8 to 10 nucleosomes (47, 48). Depletion of PARP in HeLa cells by induction of antisense RNA evidently alters chromatin structure, as demonstrated by increased susceptibility to either DNase digestion or micrococcal nuclease digestion (Table II). This hypersensitivity of chromatin can be attributed to localized changes in the nuclear organization of the 300-Å chromatin fiber. PARP also complexes with and modifies histone H1 (47), which plays an important role in the maintenance of higher order chromatin structure. Poly(ADP-ribosyl)ation of histone H1 results in the formation of H1 dimers cross-linked by polymer chains, and is implicated in polynucleosome condensation (49) and relaxation of chromatin (50). Depletion of nuclear PARP by antisense RNA expression in HeLa cells decreases poly(ADP-ribosyl)ation of histone H1, although nuclei of antisense cells are still capable of H1 cross-linking (39). The extent of total protein poly(ADP-ribosyl)ation in induced antisense cell nuclei is also significantly lower than in control cells or in uninduced antisense cells, and newly synthesized polymer chains in antisense cell nuclei are markedly shorter, indicating reduced initiation and elongation of polymer chains (Table II). Depletion of PARP by induction of antisense RNA therefore alters chromatin structure, presumably by changing the periodicity of PARP in chromatin, and concomitantly decreases the extent of poly(ADP-ribosyl)ation of nuclear acceptor proteins.
TABLE II
EFFECTS OF PARP DEPLETION BY ANTISENSE RNA INDUCTION ON CHROMATIN SENSITIVITY TO DNASE AND MICROCOCCAL NUCLEASE DIGESTION, TOTAL POLY(ADP-RIBOSE) SYNTHESIS, POLYMER CHAIN LENGTH, AND CELL RESISTANCE TO PALAᵃ

                         DNA digested (%)                Total poly(ADP-ribose)    Maximal         Number of PALA-resistant colonies
Cells                    DNase    Micrococcal nuclease   synthesis (× 10^5 cpm)    polymer size    (per 10^5 inoculated cells)
Antisense cells + Dex    45       75                     250                       21-25           15-20
Control cells + Dex      20       40                     430                       36-40           0-3
Antisense cells - Dex    nd       45                     550                       36-40           2-5
Control cells - Dex      nd       40                     440                       31-35           0-3

ᵃ Antisense and control HeLa cells were induced with 1 µM dexamethasone for 72 hours, labeled with [3H]thymidine (after which nuclei were isolated), and digested for 5 minutes with DNase (50 units/ml) or for 60 minutes with micrococcal nuclease; released DNA was then measured by scintillation spectroscopy, and the percent DNA digested was calculated. Induced and uninduced control and antisense nuclei were also pulsed with [32P]NAD (2 µCi) for 45 seconds and chased with cold NAD for 5 to 10 minutes, after which part of the nuclei were lysed, precipitated with TCA, and total radioactivity determined by scintillation spectroscopy; or ADP-ribose polymers were released from proteins and subjected to electrophoresis on a DNA sequencing gel, and polymer chain length was determined. Cells were repeatedly induced with 1 mM dexamethasone and continuously exposed to PALA at a dose of 9 × LD50, and the number of PALA-resistant colonies was counted after 4 to 5 weeks.
Altered chromatin structure due to PARP depletion in antisense cells may in turn cause changes in the frequency of gene amplification, an indicator of genomic instability in neoplastic cells (51). Inhibition of PARP by 3-aminobenzamide enhances MNNG-induced DNA amplification 2- to 6-fold in SV-40-transformed cells (22). Consistent with this, amplification of genes coding for selectable markers (i.e., the CAD² gene) is greatly increased in the induced antisense HeLa cells, as shown by a 10-fold increase in the number of PALA³-resistant colonies (Table II) and a significantly larger average colony size, indicative of multiple amplified copies of the CAD gene (22). Measurement of the CAD gene copy number of induced PALA-resistant antisense cells by slot-blot hybridization further showed an average of 3- to 5-fold amplification of the gene compared to resistant uninduced cells (39). Depletion of PARP by antisense RNA expression thus causes an increase in gene amplification and genomic instability in these cells, which may be attributed to a number of mechanisms such as inhibition of NAD depletion, accumulation of DNA strand breaks, cell-cycle perturbation, and alterations in the functions of SV-40 T antigen (22).
V. Effects of PARP Antisense RNA Expression on Nuclear DNA Repair, Replication, and Differentiation
A. DNA Strand-break Rejoining, DNA Repair, and Cell Survival after DNA Damage
Numerous studies support the view that the major role for PARP and poly(ADP-ribosyl)ation is in the regulation of DNA strand-break rejoining, DNA repair, and cell recovery from DNA damage (20). PARP activity increases in cells treated with genotoxic DNA alkylating agents, such as methyl methane sulfonate (MMS), and with ionizing radiation, both of which induce single-strand breaks (SSBs) (&), and PARP has high affinity for DNA containing SSBs (48). Using the antisense-RNA-expression approach, we have shown that the initial rate of DNA repair after exposure to MMS is significantly inhibited in PARP-depleted HeLa cells, which cannot undergo DNA strand-break rejoining (Table III). However, there was no difference in the extent of DNA repair at later time points (90 minutes), indicating that even the low concentrations of PARP in the antisense nuclei are still sufficient to allow DNA
² CAD, Carbamoyl phosphate synthase, aspartate transcarbamoylase, dihydroorotase. ³ PALA, (Phosphonoacetyl)-L-aspartate.
TABLE III
EFFECTS OF PARP DEPLETION BY ANTISENSE RNA INDUCTION ON SURVIVAL OF MMS-TREATED CELLS AND REPAIR OF MMS-INDUCED SSBs

                    Cell survival (%)ᵃ                        SSB repaired (%) after 2 mM MMS treatmentᵇ
                                                              Repair time (minutes)
Cells               MMS (2 mM)    Nitrogen mustard (10 µM)    10      20      45      90
Antisense + Dex     7             0.42                        1       3       25      90
Control + Dex       65            3.53                        35      80      90      90
Antisense - Dex     60            3.74                        33      58      70      90
Control - Dex       55            3.53                        34      79      80      90

ᵃ Antisense and control cells were induced by dexamethasone (1 µM) for 48 hours, washed and treated with 2 mM MMS for 1 hour or grown in MMS-free medium for 2-3 weeks, and the number of colonies containing >40 cells was determined to assess colony-forming ability.
ᵇ Control and antisense cells were treated with 2 mM MMS for 1 hour subsequent to the usual incubation with or without dexamethasone, and the treated cells were allowed to repair DNA for the indicated times, after which MMS-induced SSBs were quantitated by alkaline elution (34).
repair. Consistently, the trans-dominant inhibition of endogenous PARP activity by transient overexpression of the PARP DNA binding domain (DBD) described earlier causes a similar block in DNA repair synthesis induced by alkylation damage (base excision pathway) (4). Aside from blocking SSB rejoining, inhibition of PARP by PARP DBD results in enhanced cytotoxicity of DNA-damaging agents, increased doubling time, G2/M accumulation, a significant reduction in cell survival on exposure to DNA-alkylating agents such as MNNG, as well as higher frequencies of sister chromatid exchanges (38). These results are consistent with our observations that induction of antisense RNA increases the doubling time of induced antisense cells by 50% relative to uninduced cells (5). Induced antisense cells depleted of PARP are also 10-fold more sensitive to MMS compared to uninduced cells, and there is a 90% reduction in cell survival of PARP-depleted antisense cells exposed to MMS or nitrogen mustard (Table III). In response to DNA damage from DNA alkylating agents or ionizing radiation, there is usually a marked increase in PARP activity and the extent of poly(ADP-ribosyl)ation in cell nuclei, which is generally assayed by incorporation of NAD into cells. This was confirmed recently by double immunofluorescent studies using human keratinocyte control and antisense cell lines induced with dexamethasone, treated with MNNG to induce DNA strand breaks (40), and then stained with polyclonal antibodies specific for
human PARP as well as for poly(ADP-ribose) polymer. Treatment of control cells with MNNG caused a marked increase in the polymer levels as well as PARP protein, whereas induced antisense cells exposed to MNNG exhibited a drastic decrease in the amount of polymer and PARP compared to both control cells and uninduced antisense cells (40). Thus, keratinocytes depleted of PARP by antisense RNA expression show a reduced response to DNA damage.
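The "10-fold more sensitive" and "90% reduction" figures quoted above can be checked approximately against the survival values listed in Table III for antisense cells with and without dexamethasone. The short Python sketch below does this arithmetic. The numbers are transcribed from the table as read here, and the function and variable names are hypothetical.

```python
# Rough arithmetic on Table III survival values (antisense cells, + Dex vs - Dex).
# Values are transcribed from the table as read here; names are illustrative.

survival_percent = {
    "MMS (2 mM)": {"induced": 7.0, "uninduced": 60.0},
    "Nitrogen mustard (10 uM)": {"induced": 0.42, "uninduced": 3.74},
}


def fold_sensitization(uninduced: float, induced: float) -> float:
    """How many times lower survival is in induced (PARP-depleted) cells."""
    return uninduced / induced


def percent_reduction(uninduced: float, induced: float) -> float:
    """Percent drop in survival of induced cells relative to uninduced cells."""
    return 100.0 * (1.0 - induced / uninduced)


if __name__ == "__main__":
    for agent, s in survival_percent.items():
        fold = fold_sensitization(s["uninduced"], s["induced"])
        drop = percent_reduction(s["uninduced"], s["induced"])
        print(f"{agent}: {fold:.1f}-fold lower survival, {drop:.0f}% reduction")
```

With these inputs the ratios come out close to 9-fold and just under 90% for both agents, in line with the rounded figures given in the text.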
B. Differentiation and DNA Replication
Competitive inhibitors of PARP markedly inhibit differentiation of 3T3-L1 cells into adipocytes by preventing a transient increase in PARP activity that appears to be essential for entering into the differentiation program (23). PARP mRNA levels are significantly enhanced 12 to 24 hours after HL-60 cells are induced to differentiate with retinoic acid or Me2SO (17), whereas a decrease in PARP mRNA occurs on their terminal differentiation into granulocytes (18). Consistently, stably transfected 3T3-L1 cells expressing antisense RNA do not show the increase in PARP protein and activity normally apparent 24 hours after exposure to inducers of differentiation (Fig. 5A), and they eventually fail to differentiate into adipocytes (41). The failure of antisense cells to synthesize PARP during the first 24 hours of differentiation appears related to their inability to undergo the round of DNA replication characteristic of the initial stages of differentiation, as assessed by either cell growth data (Fig. 5B) or incorporation of thymidine into nascent DNA (Fig. 5C). Consistently, flow cytometric data revealed that, while control 3T3-L1 cells go through one round of DNA replication prior to the onset of terminal differentiation, antisense cells appear to be blocked at the G0/G1 phase of the cell cycle (55). Double immunofluorescent studies likewise demonstrated significant incorporation of BrdU and PARP protein expression 24 hours after induction of differentiation in control cells, but not in antisense cells, indicating temporal coincidence of PARP expression and the onset of DNA replication. DNA replication is thought to be the primary decision point in the initiation of differentiation, because it results in reconfiguration of chromatin and consequently sets and changes committed patterns of gene expression (53). The failure of antisense 3T3-L1 cells to differentiate terminally into adipocytes may therefore be attributed partially to their inability to undergo sufficient replication-associated reconfiguration of chromatin.
Alternatively, PARP may be required for differentiation-linked DNA replication partly because it forms a complex with DNA polymerase α during this initial stage of differentiation, as evidenced by coimmunoprecipitation of
FIG. 5. Time courses of PARP activity (A), cell proliferation (B), and in vivo DNA replication (C) in control and antisense 3T3-L1 cells after incubation with inducers of differentiation. Two days after achieving confluency, 3T3-L1 cells were exposed to inducers of differentiation (dexamethasone, methylisobutylxanthine, and insulin). At the indicated times, cells were assayed for PARP activity by the sonication method (A), counted with a hemocytometer and trypan blue dye and with a Coulter counter (B), or pulse labeled for 15 minutes with [3H]TdR (0.2 µCi/ml), and the acid-insoluble radioactivity was then measured by liquid scintillation spectroscopy (C). Values are the means of duplicate determinations.
both proteins in control cells but not in PARP-depleted antisense cells (Fig. 6). PARP specifically binds and dose-dependently stimulates the activity of DNA polymerase α in vitro, an effect that is lost when PARP is automodified; both activities copurify in a larger complex, indicating that the two proteins associate in vivo (14). A part of the activity of topoisomerase I, a major acceptor of poly(ADP-ribosyl)ation in vivo, also copurifies with PARP (54). Complex formation between PARP and DNA polymerase α suggests that PARP may exist within nuclear replicative complexes, which was further confirmed by the colocalization of PARP protein with distinct intranuclear granular foci associated with DNA replication centers, using double immunofluorescence staining and confocal microscope image analysis of 3T3-L1 cells at S phase (55). More significantly, when purified replicative complexes from other cell types that had been characterized for their ability to catalyze viral DNA replication in vitro were analyzed for the presence of PARP, it was found to copurify exclusively through a series of centrifugation and chromatography steps with the core proteins of a multiprotein replication complex (MRC) from mouse FM3A cells and a corresponding MRC from HeLa cells (55). These MRCs have been previously well characterized and shown to contain replicative enzymes necessary for leading- and lagging-strand DNA synthesis, including DNA polymerases α and δ, DNA primase, DNA ligase, DNA helicase, and topoisomerases I and II, as well as accessory proteins such as PCNA, RF-C, and RP-A (56-59). It has been suggested that the interaction of PARP with the replicative apparatus implies that it may function as a molecular nick sensor, controlling the progression of the replication fork when DNA strand breaks are present, such as during DNA damage, ensuring that lesions are not replicated before repair (21). Interestingly, immunoblot analysis of MRCs from both cell types with antipolymer antibody revealed the presence of about 15 poly(ADP-ribosyl)ated proteins (55), thus indicating that, aside from its role as a molecular nick sensor, PARP may also play a regulatory role within the replicative apparatus by modulating component replicative enzymes or factors in the complex, either by directly associating with them or by catalyzing their poly(ADP-ribosyl)ation. Although various replicative enzymes are acceptors for poly(ADP-ribose) in vitro, further studies are being directed at elucidating which of the replicative enzymes and protein factors in the MRC serve as physiological poly(ADP-ribose) acceptor proteins in vivo, and ultimately how the modification affects their functions within the complex. Furthermore, because only the complex isolated during S phase is capable of supporting in vitro DNA replication, although MRC-component enzymes exist in complex form throughout the cell cycle (59), modification of the MRC-component proteins by poly(ADP-ribosyl)ation may be partly responsible for conversion of the MRC from the latent to an active form.
FIG. 6. PARP and DNA polymerase α associate during early stages of differentiation in control cells but not in antisense cells. Cells were lysed prior to differentiation (time 0) or 24 hours after induction of differentiation, and aliquots of cell extracts (50 µg) were immunoprecipitated with an anti-DNA pol α antibody. The immunocomplex was then separated by SDS gel electrophoresis, transferred to nitrocellulose, probed with a polyclonal anti-PARP antibody, and detected by enhanced chemiluminescence (ECL). Lanes 1, 3, 4, 7, and 8 are samples from control cells at time 0 (lane 1) or 24 hours after induction of differentiation (lanes 3, 4, 7, and 8); lane 7, control cell extracts incubated with only protein A-Sepharose; lane 8, preimmune serum was used instead of anti-DNA pol α antibody; lanes 2, 5, and 6, samples from antisense cells either at time 0 (lane 2) or at 24 hours (lanes 5 and 6); lanes 3 and 4, as well as lanes 5 and 6, are duplicate determinations.
VI. Other Putative Roles of PARP Currently under Study: Apoptosis
PARP has been implicated early in the processes relating to DNA damage processing. Antisense studies against PARP demonstrate that cells in which the level of PARP is suppressed display delays in strand-break rejoining, alterations in chromatin conformation, and decreased cell survival following genotoxic stress (5, 39). Although these studies only indirectly imply a role for PARP in apoptosis, more recent studies in Caenorhabditis elegans provide a more direct link between PARP and the processes relating to apoptosis. In C. elegans, developmentally regulated apoptosis is directly controlled by the products of three genes, CED-3, CED-4, and CED-9 (60). Both CED-3 and CED-4 are absolutely required for apoptotic cell death during development, and CED-9, in turn, is required to prevent indiscriminate apoptosis. Subsequent cloning and comparative sequence analysis revealed that the human homolog of the apoptosis-suppressor gene, CED-9, is the protooncogene, bcl-2, whose importance in blocking apoptosis has been well established (61). CED-4 as yet has no known human homolog, but sequence analysis revealed that this novel gene encodes a putative Ca2+ binding site (62). Therefore, it is probable that this gene is important in regulation of Ca2+ fluxes, which occur during apoptosis and may be required for the function of one or more of the endonucleases activated during apoptosis. The key apoptotic gene product, CED-3, is highly homologous to the human interleukin-1β-converting enzyme (ICE) protease (62). Since this initial observation, homologous regions between CED-3 and ICE have been used to isolate several new members of this growing family of cysteine proteases, including ICE, Nedd-2, and CPP-32 (30, 63-66). In general, all members of this protease family cleave proteins at aspartate residues, are synthesized as a proenzyme whose cleavage produces the two polypeptide subunits that form the enzyme, and are made up of two heterodimeric subunits that contain the consensus sequence QACRG at their active site (64). Alteration of the cysteine at the active site abolishes protease activity. Cleavage of these pro-forms of the proteases produces a cascade, or perhaps a more apt term is “rock slide,” of activation of downstream cleavage events, leading to the apoptotic phenotype. Searches of the database for other known proteins containing the cleavage site for pro-IL-1β that is recognized and cleaved by ICE revealed a similar site within the PARP protein (DEVD 216-G 217). Several members of the ICE-like family of proteases, with the exception of ICE itself, can cleave PARP in vitro into two fragments. Recently, we developed an assay for PARP cleavage utilizing 35S-labeled in vitro-transcribed and -translated PARP, and a synthetic tetrapeptide aldehyde with the amino-acid sequence of the PARP cleavage site (Ac-DEVD-CHO) was found to be a very effective inhibitor of the protease (I50 = 1 nM).
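The kind of motif search described above, looking for a DEVD tetrapeptide followed by glycine, can be sketched in a few lines of Python. This is an illustration only. The test sequence is a hypothetical toy peptide rather than the PARP sequence, the function name is an assumption, and real database searches rely on dedicated sequence-analysis tools.

```python
# Minimal motif scan for a DEVD|G site (cleavage after the second Asp).
# The example peptide is hypothetical; it is NOT the PARP sequence.
import re
from typing import List


def find_devd_g_sites(protein: str) -> List[int]:
    """Return 1-based positions of the Asp immediately before the scissile bond.

    A site is reported wherever the residues D-E-V-D are directly followed
    by G. A lookahead is used so that overlapping matches are not missed.
    """
    sequence = protein.upper()
    # m.start() is the 0-based index of the first D of DEVD; the Asp that
    # precedes the cleaved bond is 3 residues later, i.e. 1-based start + 4.
    return [m.start() + 4 for m in re.finditer(r"(?=DEVDG)", sequence)]


if __name__ == "__main__":
    toy_peptide = "MKDEVDGAA"  # hypothetical
    print(find_devd_g_sites(toy_peptide))
```

Reporting the position of the residue before the cleaved bond mirrors the "DEVD 216-G 217" notation used in the text, where cleavage falls between residues 216 and 217.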
Using a biotinylated Ac-DEVD-CHO column, the active form of CPP-32 (termed CPP-32β or apopain) that cleaves PARP and is necessary for apoptosis was further purified and characterized (25). In vivo cleavage occurs concurrent with the induction of apoptosis; addition of synthetic peptides corresponding to the PARP cleavage site, DEVD, completely inhibited the cleavage of endogenous PARP and the concomitant induction of internucleosomal fragmentation (25, 66). In immunofluorescence studies to further elucidate the cascade of events promoting activation of the ICE-like protease and cleavage of PARP during spontaneous apoptosis in osteosarcoma cells, we recently observed that the cleavage of PARP and induction of internucleosomal fragments is preceded by a burst of PARP activity in these cells (67). Thus, DNA strand breaks and PARP activation appear to be early events in apoptosis, whereas inactivation and cleavage of PARP occur at a later stage when the cells are irreversibly committed to apoptosis, presumably as a result of increased unregulated
binding of the cleaved PARP DNA-binding domain to DNA ends (67). It is as yet unclear whether the burst of ADP-ribosylation that temporally precedes DNA fragmentation is required to stimulate the cleavage of PARP in vivo. If this is a universal circumstance, it may be hypothesized that one of the participants of the protease cleavage cascade is a substrate for PARP. Although a specific functional role for PARP in apoptosis has yet to be defined, several hypotheses come to mind when examining the cleavage products of PARP. The cleavage of PARP into two fragments occurs between the N-terminal zinc-finger DNA-binding domain and the automodification and NAD-binding domains at the carboxy end of the protein (25). Other studies, using truncated fragments from recombinant PARP protein, demonstrate that the break recognition and the DNA end-binding function of this protein reside completely within this fragment (4, 68). Linking the NAD-binding and automodification domains to the DNA-binding region permits targeting of and regulation by the catalytic regions of the protein, and removal of the automodification domain from the DNA-binding domain results in the release of an unregulated DNA-binding function within cells. This activity has been described in other studies using trans-dominant mutants in which the 46-kDa DNA-binding domain and a portion of the automodification domain are ectopically expressed. In HeLa cells transfected with the truncated DNA-binding domain, enhanced cytotoxicity was evident following exposure to DNA-damaging agents (37, 38). However, when a trans-dominant mutant that contains a mutation in the DNA-binding site that abolished its in vitro affinity for DNA is used, enhancement of apoptosis and cytotoxicity is observed, as with the intact trans-dominant DNA-binding domain protein.
This suggests that unregulated DNA binding may not be the only apoptotic consequence or function attributable to this fragment (38). Nevertheless, incubation of bacterially expressed PARP DNA-binding domain protein interferes with repair of UV-damaged plasmids in repair-competent extracts (34), indicating that the regulation of the DNA-binding function is crucial to timely DNA repair. Alternatively, when PARP is rendered catalytically inactive by apopain proteolysis prior to extensive fragmentation of DNA, this ensures that the normally very high pools of NAD (derived from ATP) are not depleted by an active PARP. Accordingly, it appears that both ATP and possibly NAD are required for cells to terminate apoptosis. Despite these important functions for PARP in apoptosis, it is important to note that "knockout" mice, which are nullizygous for the PARP gene, developed normally, suggesting that PARP cleavage may not be required during developmentally regulated apoptosis (35). PARP cleavage was also not required for apoptotic depletion of irradiated thymocytes in nullizygous animals. These studies suggest that PARP cleavage alone may not be sufficient
to sustain apoptosis but may work in concert with other cleavage products, leading to a threshold of events resulting in cellular collapse (64).
ACKNOWLEDGMENTS
This work was supported in part by funding from Grants CA25344 and CA13195 from the National Cancer Institute, from the United States Air Force Office of Scientific Research (Grant AFOSR-89-0053), and from the United States Army Medical Research and Development Command (Contract DAMD17-90-C-0053). The authors thank Dr. Helmuth Hilz for the polyclonal antibody to murine PARP, and Dr. Veronica Kang for contributions to the data in this review.
REFERENCES
1. E. Jacobson, K. M. Antol, H. Juarez-Salinas and M. Jacobson, JBC 258, 103 (1983).
2. M. S. Satoh and T. Lindahl, Nature 356, 356 (1992).
3. M. S. Satoh, G. G. Poirier and T. Lindahl, JBC 268, 5480 (1993).
4. M. Molinete, W. Vermeulen, A. Burkle, J. Menissier-de Murcia, J. Kupper, J. Hoejmakers and J. de Murcia, EMBO J. 12, 2109 (1993).
5. R. Ding, Y. Pommier, V. Kang and M. Smulson, JBC 267, 12804 (1993).
6. M. Smulson, N. Istock, R. Ding and B. Cherney, Bchem 33, 6186 (1994).
7. D. M. Gill, JBC 247, 5964 (1972).
8. L. Burzio and S. S. Koide, FEBS Lett. 20, 29 (1972).
9. A. R. Lehman, S. Kirk-Bell, S. Shall and W. J. Whish, Exp. Cell Res. 83, 63 (1974).
10. B. Anachkova, G. Russev and G. G. Poirier, Cytobios 59, 19 (1989).
11. G. de Murcia, J. Jongstra-Bilen, M. E. Ittel, P. Mandel and E. Delain, EMBO J. 2, 543 (1983).
12. P. R. Stone and S. Shall, Exp. Cell Res. 91, 95 (1975).
13. C. F. Cesarone, L. Scarabelli, I. Scovassi, R. Izzo, M. Menegazzi, A. C. DeProti, M. Orunesu and U. Bertazzoni, BBA 1087, 241 (1990).
14. C. M. G. Simbulan, M. Suzuki, S. Izuta, T. Sakurai, E. Savoysky, K. Kojima, K. Miyahara, Y. Shizuta and S. Yoshida, JBC 268, 93 (1993).
15. A. M. Ferro and B. M. Olivera, JBC 259, 547 (1984).
16. F. Farzaneh, R. Meldrum and S. Shall, NARes 15, 3493 (1987).
17. K. Bhatia, Y. Pommier, C. Giri, A. J. Fornace, M. Imaizumi, T. R. Breitman, B. Cherney and M. E. Smulson, Carcinogenesis 11, 123 (1990).
18. H. Suzuki, K. Uchida, H. Shima, T. Sato, T. Okamoto, T. Kimura, T. Sugimura and M. Miwa, "ADP-Ribose Transfer Reactions: Mechanisms and Biological Significance." Springer-Verlag, New York, 1989.
19. A. I. Caplan and M. J. Rosenberg, PNAS 72, 1852 (1975).
20. F. R. Althaus and C. Richter, Mol. Biol. Biochem. Biophys. 37, 1 (1987).
21. G. de Murcia and J. de Murcia, TIBS 19, 172 (1994).
22. A. Burkle, T. Meyer, H. Hilz and H. Zur, Cancer Res. 47, 3632 (1987).
23. J. Lewis, Y. Shimizu and N. Shimizu, FEBS Lett. 146, 37 (1982).
24. S. Kaufmann, S. Desnoyers, Y. Ottaviano, N. Davidson and G. Poirier, Cancer Res. 54, 3976 (1994).
25. D. Nicholson, A. Ali, N. Thornberry, J. Vaillancourt, C. Ding, M. Gallant, Y. Gareau, P. Griffin, M. Labelle, Y. Lazebnik, N. Munday, S. Raju, M. Smulson, T. Yamin, V. Yu and D. Miller, Nature 376, 37 (1995).
26. C. Nosserrie, S. Coppola and L. Ghibelli, Exp. Cell Res. 212, 367 (1994).
27. W. G. Rice, C. D. Hillyer, B. Harten, C. Schaeffer, M. Dorminy, D. Lackey, E. Kirsten, J. Mendeleyev, K. Buki, A. Hakam and E. Kun, PNAS 89, 7703 (1992).
28. G. Alvarez, R. Eichenberger and F. Althaus, BBRC 138, 1051 (1986).
29. K. Wielckens, A. Schmidt, E. George, R. Bredehorst and H. Hilz, JBC 257, 12872 (1982).
30. M. Tewari, L. Quan, T. O'Rourke, S. Desnoyers, Z. Zend, D. Beidler, G. Poirier, G. Salvesen and V. Dixit, Cell 81, 801 (1995).
31. D. J. Hunting, B. J. Gowans and J. F. Henderson, Mol. Pharmacol. 28, 200 (1985).
32. K. M. Milam, G. H. Thomas and J. E. Cleaver, Exp. Cell Res. 165, 260 (1986).
33. J. E. Cleaver, K. M. Milam and W. F. Morgan, Radiat. Res. 101, 16 (1985).
34. B. Cherney, B. Chaudry, K. Bhatia, T. Butt and M. Smulson, Bchem 30, 10420 (1991).
35. Z. Wang, B. Auer, L. Stingl, H. Berghammer, D. Haidacher, M. Schweiger and E. Wagner, Genes Dev. 9, 509 (1995).
36. J. Kupper, G. de Murcia and A. Burkle, JBC 265, 18721 (1990).
37. J. Kupper, M. Muller, M. Jacobson, J. Miyajima, D. Coyle, E. Jacobson and A. Burkle, MCB 15, 3154 (1995).
38. V. Schreiber, D. Hunting, C. Trucco, B. Gowans, P. Grunwald, G. de Murcia and J. de Murcia, PNAS 92, 4753 (1995).
39. R. Ding and M. Smulson, Cancer Res. 54, 4627 (1994).
40. D. S. Rosenthal, T. Shima, G. Celli, L. De Luca and M. Smulson, J. Invest. Dermatol. 105, 38 (1995).
41. M. Smulson, V. Kang, J. Ntambi, D. Rosenthal, R. Ding and C. Simbulan, JBC 270, 119 (1995).
42. H. Yamanaka, C. Penning, D. Willis, B. Wasson and D. Carson, JBC 263, 3879 (1988).
43. J. G. Izant and H. Weintraub, Cell 36, 1007 (1984).
44. M. Smulson, P. Schein, D. Mullins and S. Sudhakar, Cancer Res. 37, 3006 (1977).
45. S. Kim and B. Wold, Cell 42, 129 (1985).
46. C. Giri, M. West and M. Smulson, Bchem 17, 3495 (1978).
47. T. Butt, J. Brothers, C. Giri and M. Smulson, NARes 5, 2775 (1978).
48. T. Butt, D. Jump and M. Smulson, PNAS 76, 1628 (1979).
49. M. Wong, N. Malik and M. Smulson, EJB 128, 209 (1982).
50. G. Poirier, G. de Murcia, J. Jongstra-Bilen, C. Niedergang and P. Mandel, PNAS 79, 3423 (1982).
51. E. Otto, S. McCord and T. Tlsty, JBC 264, 119 (1989).
52. D. Jump and M. Smulson, Bchem 19, 1024 (1980).
53. P. Villareal, Microbiol. Rev. 55, 512 (1991).
54. A. Ferro, N. Higgins and B. Olivera, JBC 258, 6000 (1983).
55. C. Simbulan-Rosenthal, D. Rosenthal, H. Hilz, R. Hickey, L. Malkas, N. Applegren, Y. Wu, G. Bers and M. Smulson, Bchem, in press (1996).
56. Y. Wu, R. Hickey, K. Lawlor, P. Wills, F. Yu, H. Ozer, R. Starr, J. Y. Quan, M. Lee and L. Malkas, J. Cell Biochem. 54, 32 (1994).
57. L. H. Malkas, R. J. Hickey, C. J. Li, N. Pedersen and E. F. Baril, Bchem 29, 6362 (1990).
58. N. Applegren, R. J. Hickey, A. M. Kleinschmidt, Q. Zhou, J. Coll, P. Wills, R. Swaby, Y. Wei, J. Y. Quan, M. Y. Lee and L. Malkas, J. Cell Biochem. 59, 91 (1995).
59. R. Hickey, T. Tom, P. Wills and L. Malkas, J. Cell Biochem., in press (1996).
60. H. Ellis and H. Horvitz, Cell 44, 817 (1986).
61. J. Yuan and H. Horvitz, Dev. Biol. 138, 33 (1990).
62. J. Yuan, S. Shaham, S. Ledoux, H. Ellis and H. Horvitz, Cell 75, 641 (1993).
63. T. Fernandez-Alnemri, G. Litwack and E. Alnemri, JBC 269, 30761 (1994).
64. S. Martin and D. Green, Cell 82, 349 (1995).
65. L. Wang, M. Miura, L. Bergeron, H. Zhu and J. Yuan, Cell 78, 739 (1994).
66. Y. Lazebnik, S. Kaufmann, S. Desnoyers, G. Poirier and W. Earnshaw, Nature 371, 346 (1994).
67. D. Rosenthal, R. Ding, C. Simbulan-Rosenthal, B. Cherney, J. Vaillancourt, D. Nicholson, P. Vanek and M. Smulson, Cancer Res., in press (1996).
68. B. Cherney, O. McBride, D. Chen, H. Alkhatib, K. Bhatia, P. Hensley and M. Smulson, PNAS 84, 8370 (1987).
The Large Ribosomal Subunit Stalk as a Regulatory Element of the Eukaryotic Translational Machinery

JUAN P. G. BALLESTA AND MIGUEL REMACHA

Centro de Biologia Molecular Severo Ochoa
Canto Blanco
28049 Madrid, Spain

I. Components of the Eukaryotic Ribosomal Stalk
II. The Cytoplasmic Pool of the Stalk Components
III. The P1/P2-P0 Protein Complex
IV. Exchange of P Proteins in the Ribosome
V. Phosphorylation of the Stalk Proteins
   A. P-protein Phosphorylation
   B. Protein Kinases Involved in P-protein Phosphorylation
   C. Role of Phosphorylation on the Function of P Proteins
VI. Functional Roles of the Eukaryotic Stalk Components
   A. Function of the P1 and P2 Acidic Proteins
   B. Functional Analysis of Protein P0
Ribosomal Stalk
IX. Future Prospects
References
One of the most characteristic structural features of the large ribosomal subunit is the stalk, a highly flexible lateral protuberance that, in bacterial particles, is formed by a pentameric protein complex of two dimers of proteins L7 and L12, and one copy of protein L10. It has been proposed that the stalk participates in the interaction of elongation factors with the ribosome during protein synthesis, although it is not part of the ribosomal factor interaction site that has been located precisely at its base (Fig. 1) (1, 1a). Proteins L7 and L12, the N-terminal acetylated and nonacetylated forms of a unique, strongly acidic polypeptide, form in solution stable dimers that have been proposed to have an elongated structure with a globular carboxyl
FIG. 1. Large ribosomal subunit model derived from the Escherichia coli model (1). The most relevant structural elements are indicated. The probable locations of proteins P0, P1, P2, and L15, as deduced from the position of their bacterial equivalents (14), are also marked.
domain and a rigid amino end joined by a flexible "hinge" (1b). The two L7/L12 dimers interact with the carboxyl domain of protein L10 through their amino ends, forming an extraordinarily stable pentameric complex that resists very high urea concentrations. The (L7/L12)2-L10 complex binds through the amino domain of protein L10 to the 23-S rRNA, exposing to the cytoplasm the globular carboxyl region that forms the tip of the stalk. The pentameric complex binding region extends from around position 1000 to position 1200 in the 23-S rRNA nucleotide sequence and is highly conserved in the large ribosomal RNA from all species. This region of the 23-S rRNA overlaps extensively with the interaction site of protein L11 and that of thiostrepton, an inhibitor of the elongation factor interaction with the ribosome. The ribosomal region, including protein L10, protein L11, and thiostrepton binding sites, has been defined as the GTPase domain and is an important part of the EF-G and EF-Tu interaction sites. The corresponding components of the stalk and the GTPase domain in the eukaryotic ribosomes have been identified and characterized. They show characteristics that resemble closely those of their bacterial counterparts, and it is generally accepted that they must play an analogous functional role in the basic protein synthesis machinery, facilitating also the interaction of the elongation factors. However, the eukaryotic proteins have some peculiarities that strongly suggest they have evolved to a more sophisticated
structure that may modulate the ribosomal activity, probably regulating the translational process. In this report, the structural and functional properties of the different eukaryotic stalk components are reviewed, focusing on the Saccharomyces cerevisiae system, for which an extensive analysis has been carried out.
I. Components of the Eukaryotic Ribosomal Stalk
A. The 12-kDa P1 and P2 Acidic Ribosomal Proteins
A number of proteins with pI values ranging from 3.0 to 4.0 and molecular masses close to 12 kDa have been found in all the eukaryotic ribosomes studied (2). These acidic proteins have been called by different names, depending on the species and on the analysis system used, but in this report they are designated as P proteins, following the recently proposed nomenclature (3). Two types of P proteins, P1 and P2, have been described in higher eukaryotes, including humans (4), rat liver (3), invertebrates (5-7), and slime molds (8). However, the number of acidic proteins seems to be higher in some lower eukaryotes. Three have been described in plants (9), four in S. cerevisiae (10-12) and Schizosaccharomyces pombe (13), and the number is apparently higher in protozoa such as Trypanosoma cruzi (14). Nevertheless, amino-acid sequence comparisons of the proteins in these multimembered families reveal that they fall into two groups showing high sequence similarity to the mammalian P1 and P2 proteins, respectively. Thus, two of the four S. cerevisiae proteins, YP1α and YP1β, are similar to mammalian P1, and the other two, YP2α and YP2β, to mammalian P2. The yeast proteins YP2α and YP2β were formerly called either L44 and L45 (15) or L35 and L36 (16), respectively. YP1β was considered to be the protein originally called L44′ (17). However, L44′ has recently been found to be a degradation product of the native protein designated Ax (18). In spite of the close amino-acid sequence similarity and apparently analogous function, YP2β is four amino acids larger than YP2α, which, however, has the same 106 amino acids as the two P1 proteins (Fig. 2). The carboxyl end is highly similar in all P proteins. The terminal peptide DMGFGLFD is found in practically all eukaryotic organisms, from yeast to humans, and only minor deviations from this sequence are found among the plant proteins and protozoa (Table I).
The high degree of conservation of this structure suggests it plays an important functional role, and because it seems to be exposed to the external medium (19), it probably interacts with the
FIG. 2. Amino-acid sequence comparison of Saccharomyces cerevisiae acidic proteins YP1α, YP1β, YP2α, and YP2β. The putative hinge region is underlined, lysine residues insensitive to trypsin are circled, and phosphorylated serines are outlined.
elongation factors. Interestingly, this peptide is very antigenic and is an important antigen for sera from patients having autoimmune and infectious diseases such as systemic lupus erythematosus (20), Chagas' heart disease (21, 22), and leishmaniosis (23). Apart from the C terminus, the members of each of the two P protein families show the highest similarity at the amino-terminal end, except in the case of the plant P1 polypeptides, which seem to diverge extensively from the consensus sequence (Table II); also, the protozoan P2 N-terminal sequence shows notable differences, and in some cases is somewhat longer than the average protein. An additional interesting difference between the two P protein families is the processing of the first methionine residue followed by the acetylation of the next residue, usually a serine, that seems to take place only in P1 (18, 24). This is an interesting differentiating structural feature, considering that the blocking of the amino-terminal end is the only difference between the two bacterial acidic protein forms. Although this modification seems to be functionally irrelevant for the bacterial proteins (25, 26), its highly conserved character suggests that it might be involved in some as yet undiscovered acidic protein function. In general, the amino-terminal region of the P1 proteins seems to be very sensitive to degradation. Thus, the previously mentioned protein L44′ is a degradation product of the native YP1β (Ax) protein that lacks the first eight amino acids, and seems to be generated during purification, because its proportion is affected by the preparation conditions (18).

It must be stressed that in spite of the analogous function that the bacterial and the eukaryotic acidic proteins seem to play, they show no relevant sequence similarity. Based on low sequence homologies between different parts of the proteins from the two kingdoms, it was proposed that a transposition of domains might have taken place in the eukaryotic polypeptides in such a way that the eukaryotic C terminus would correspond to the bacterial N domain (27). However, the fact that, like the bacterial polypeptides, the eukaryotic P proteins interact with the ribosome through their amino domain and expose the C end to the medium (19) does not support that hypothesis, or at least indicates that if the transposition occurred it did not affect the function of the different protein domains. The archaebacterial acidic proteins are structurally closer to the eukaryotic than to the bacterial polypeptides.

The eukaryotic P proteins form stable dimers in solution (5, 28), resembling in this aspect the bacterial protein L7/L12. However, little is known about the structure of the eukaryotic acidic proteins. Because they are similar in function, it has generally been assumed that the eukaryotic polypeptides are also similar in structure to the bacterial L7/L12. The bacterial proteins have a globular carboxyl domain and an elongated amino domain linked by a highly flexible hinge (29). Preparation of both domains is simply achieved by means of controlled proteolysis of the sensitive hinge and crystallization of the carboxyl part (30). However, an analogous treatment of the yeast YP2β protein yields only one fragment that corresponds to the amino domain, whereas the carboxyl region seems to be easily degraded (J. Zurdo and J. P. G. Ballesta, unpublished). On the other hand, assuming that the hinge region corresponds to the alanine- and glycine-rich region around positions 70 to 80 (Fig. 2), the carboxyl domain must be limited to the last 20 amino acids, including the string of very acidic residues and the highly conserved DMGFGLFD peptide. This domain is considerably smaller than the bacterial counterpart and, moreover, due to its highly acidic character, it is difficult to envision how it can have the stable tertiary structure of the Escherichia coli C-terminal fragment (31).

An interesting peculiarity of the yeast P proteins is the low overall polypeptide structure, almost a random coil, detected by CD and NMR at neutral pH. However, at a pH of around 3.0, the polypeptides show a reasonable degree of structure. Neither phosphorylation nor interaction with other yeast acidic P proteins seems to help in inducing a higher level of structure at neutral pH (J. Zurdo and J. P. G. Ballesta, unpublished). Because the proteins must have a certain degree of structure when bound to the ribosome in the cell, the question arises: how is this attained at the neutral cytoplasmic pH? Therefore, the scarce data available do not support a very close structural similarity between bacterial and eukaryotic acidic ribosomal proteins, although additional experimental evidence is required to substantiate this conclusion.

TABLE I
CARBOXYL-TERMINUS AMINO-ACID SEQUENCE OF PROTEINS P1, P2, AND P0 (a)

Organism: Saccharomyces cerevisiae, Schizosaccharomyces pombe, Cladosporium herbarum, Dictyostelium discoideum, Babesia bovis, Leishmania infantum, Trypanosoma cruzi, Tetrahymena thermophila, Chlamydomonas reinhardtii, Alternaria alternata, Arabidopsis thaliana, Chenopodium rubrum, Parthenium argentatum, Oryza sativa, Zea mays, Polyorchis penicillatus, Artemia salina, Caenorhabditis elegans, Drosophila melanogaster, Gallus gallus, Mus musculus, Rattus norvegicus, Homo sapiens

P1
Accession number   Sequence
P05318   KEEEEAKEESDDDMGFGLFD
P10622   EKEEEAAEESDDDMGFGLFD
P17476   KEEAKEEEESDEDMGFGLFD
P17477   KEEAKEEEESDEDMGFGLFD
P22684   KKEEVKKEESDDDMGMGLFD
P26643   kAAKI(EEEEEDDDMGFGLFD
P24002   KKEEPKEEETclMwvIG .DLFG
P29763   KKEEKKEPSEEEDMGFSLFD
218207   EEKKKEESEEEEDFGFDLFG
D15562   EKKEEAKEESDDDMGFSLFD
P27464   EKXAESEDESDDDMG . .LFD
P02402   EEKKEESEEEDEDMGFGLFD
T02149   KKKEEPKEESDDDMGFGLFD
P08570   KKXEEESDQSDDDMGPGLFD
P18660   EEXWEESEESDDDMGFGLFD
P19944   EAKXEESEESEDDMGFGLFD
P05386   EAKKEESEESDDDMGFGLFD

P2
Accession number   Sequence
P05319   MEEEAAEESDDDMGFGLFD
P02400   MEEEAKEESDDEMGFGLFD
P08094   KEEAKEEEESDEJBGFGLFD
P17478   AKEEEAAEESDECWGFGLFD
P42038   AAKEEEKEESDDEMGFGLFD
P42039   EKAEEMEESDDEMGFGLFD
P22683   KWE-SDDDMCXLFD
P27055   KKPEAEPEEEEDEMGFSLFD
Q06382   AKKDEPEEEADDEMGFGLFD
Q06383   KKEEPFZEEADDWFGLFD
P23632   APAAAAEEEEDDDMGFGLFD
P26795   AD-DIMGFGLFD
P42037   -SDEDEFGLFD
226542   EEKKEEKEESDDDMGFSLFE
P41099   EEKSE..EESDEELX;FSLFDDN
U29383   EEKVEMEESDDDMGFSLFD
P02399   EEKKEESEEEDEDMGFGLFD
P05389   EEKKEESESEDDDMGFALFE
P02401   DEKKEESEESDDDMGFGLFD
P05387   DEKXEESEESDDDMGFGLFD

P0
Accession number   Sequence
P05317   EAAAEEEEESDDWFGLFD
P22685   VWEEKKEESDD-LFD
P39097   KEEPEESDEDDFGM .GLF
P26796   AEPEEEDDDDDm.ALF
226534   VEEKEESDEEDYGGDFGLFDEE
P29764   MKEEPEEESDDDIGFSLFDD
P41095   EKXEEPEEESIXDIAUSLFD
P14869   AEAKEESEESD=FGLFD
P19945   W E S E E S D - F G L F D
P05388   VEAKEESEESD-FGLFD

(a) Amino-acid sequences either from Swiss-Prot or translated from the nucleotide sequences in GenBank.

TABLE II
AMINO-TERMINAL SEQUENCE OF PROTEINS P1 AND P2 (a)

Organism: Saccharomyces cerevisiae, Schizosaccharomyces pombe, Cladosporium herbarum, Dictyostelium discoideum, Babesia bovis, Leishmania infantum, Trypanosoma cruzi, Tetrahymena thermophila, Chlamydomonas reinhardtii, Alternaria alternata, Arabidopsis thaliana, Parthenium argentatum, Oryza sativa, Triticum aestivum, Zea mays, Polyorchis penicillatus, Artemia salina, Brugia malayi, Caenorhabditis elegans, Drosophila melanogaster, Gallus gallus, Rattus norvegicus, Homo sapiens

P1
Accession number   Sequence
P05318   MS.TESALsYAALILAD
P10622   MSDSIISFAAFILAD
P17476   MSASELATSYSALILAD
P17471   MSASELATSYSALIWID
P22684   MSEIKTEELACNSGLLLQD
P26643   MSSKQQIACTYAALIWID
P24002   ~IEKWKGASY~
P29763   MSTSELWTYAALILHD
218201   MGVPSWCKSKGGEWI'A
D48782   MSSSEVACPLAALILIID
D15754   MGWTFVCRSSGDEWI'A
P27464   MADSSTSELACWSACILIID
P02402   MASI(DELAcVyAAL1LLD
Xi16480  MA NQELACWAALILQD
T02149   MASNQBLACWAALILQD
P08570   MsTKABLAsvyAsLILvD
P18660   MASVSELACIYSALILHD
P19944   MASVSELACIYSALILHD
P05386   MASVSELACIYSALILIID

P2
Accession number   Sequence
P05319   lawIAA~GN.TPDA
P02400   MKYLAAYLLLVQGGNAAPSA
P08094   MKYLAAYLLLTVGGKDSPSA
P11478   MKYLAAYLLLTVGGKQSPSA
P42038   MKyMAAYLLuiLAGNSSPSA
P42039   MKYLAAFLLu;LFGNSSPSA
P22683   M K Y L ? a A r L L A s ~
P27055   MAUCYVSSYLLAVAAWENPSV
Q06382   MSTKYLFAYALlSLS.KASPSQ
Q06383   MQYLAAYALVALSGK TPSK
P23632   MKYLAAYALVGLSG GTPSK
P26795   MsMKyLAAyALAsLN KPTFGA
P42037
217464
P41099
D22651
D15912
P05390
U29383
P02399
P05389
P02401
P05387

(a) Amino-acid sequences either from Swiss-Prot or translated from the nucleotide sequences in GenBank.
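The conservation of the carboxyl-terminal peptide DMGFGLFD described above can be checked directly against a few of the Table I entries. The following is a minimal sketch in Python, not part of the original analysis: the function names `mismatches` and `summarize` are hypothetical, only sequences that are fully legible in Table I are used, and the accession numbers are written on the assumption that the leading "PO" in the printed table stands for "P0".

```python
# Consensus carboxyl-terminal peptide quoted in the text.
CONSENSUS = "DMGFGLFD"

# Carboxyl-terminal sequences copied from Table I (legible entries only).
# Accession spelling (P0...) is an assumption about the printed "PO...".
C_TERMINI = {
    "P05318": "KEEEEAKEESDDDMGFGLFD",
    "P10622": "EKEEEAAEESDDDMGFGLFD",
    "P22684": "KKEEVKKEESDDDMGMGLFD",
    "P29763": "KKEEKKEPSEEEDMGFSLFD",
    "P05386": "EAKKEESEESDDDMGFGLFD",
}


def mismatches(seq, consensus=CONSENSUS):
    """Count positions where the last residues of seq differ from the consensus."""
    tail = seq[-len(consensus):]
    return sum(a != b for a, b in zip(tail, consensus))


def summarize(seqs):
    """Map each accession number to its number of deviations from the consensus."""
    return {acc: mismatches(s) for acc, s in seqs.items()}


if __name__ == "__main__":
    for acc, n in summarize(C_TERMINI).items():
        print(acc, C_TERMINI[acc][-len(CONSENSUS):], n)
```

A simple position-by-position comparison is used here because the sequences in Table I are already aligned at the carboxyl end; entries with terminal extensions or deletions would need a proper alignment instead.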
B. The Ribosomal Protein P0
Protein P0 was first reported in chicken ribosomes as a nonacidic 40-kDa polypeptide cross-reacting with antibodies to the acidic P1 and P2 proteins (32). A similar protein has been found in other eukaryotes (3, 4, 12, 33-38) as well as in archaebacteria (39). Although the amino-acid sequence similarity is not striking, the eukaryotic ribosomal protein P0 has been proposed as the functional homolog of bacterial protein L10 (12). P0 is larger than L10, and has a carboxyl-terminal prolongation, missing in the bacterial polypeptide, which has an amino-acid sequence highly similar to that of the small acidic P proteins. This similarity is especially notable in the last amino acids that show the consensus DMGFGLFD sequence, characteristic of the P1/P2 proteins (Table I). From a structural point of view, P0 resembles the fusion of one L10-like protein and one acidic protein. In fact, it has been proposed that the present P0 and L10 proteins may be the derivatives of an ancestral protein resulting from the fusion of the primitive L10 and acidic protein genes; the modern bacterial L10 probably lost the carboxyl domain as a result of a different evolutionary process (39). Interestingly, archaebacteria have a eukaryotic P0-like protein (39), confirming a closer phylogenetic relationship to eukaryotes than to eubacteria in this respect. The amino-acid sequence similarity of the P0 and P1/P2 carboxyl regions explains the anti-P protein sera cross-reactivity of P0 as well as the fact that P0 is also an antigen for autoimmune and Chagas' disease sera (20-22). However, in some organisms there are differences in the sequences of these regions in the two protein types (Table I) that affect their immunoreactivity. Thus, in T. cruzi, P0 is not fully recognized by antibodies to the eukaryotic carboxyl-end consensus sequence that react with the P1/P2 proteins (22).
Interestingly, the Trypanosoma P0 carboxyl-end sequence is closer to the corresponding archaebacteria sequence than to other eukaryotic P0 sequences (14). As expected, protein P0 interacts with the acidic P1/P2 proteins, confirming its functional similarity to the bacterial protein L10. A P0-acidic protein complex has been reported in mammalian cell extracts (4, 20). Moreover, P0 can be cross-linked to proteins P1 and P2 in Artemia salina ribosomes, supporting the existence of a (P1)2-P0-(P2)2 protein complex (40). Protein P0, like protein L10, binds directly to the rRNA in the so-called GTPase center (41). This interaction, as opposed to the situation previously described for the pentameric protein complex, is much stronger than in the
bacterial ribosome. Thus, P0 remains bound to the ribosome in conditions that completely remove protein L10 from the particles (42), e.g., washing with NH4Cl-ethanol (32).
C. Eukaryotic L11-like Ribosomal Protein
Although protein L11 is not a part of the stalk, it binds to the GTPase center, overlapping extensively with the L10-(L7/L12)2 pentameric complex interaction site. In fact, a cooperative effect in the binding of L10 and L11 to the ribosome has been proposed (42). Moreover, cross-linking studies have shown the physical proximity of protein L11 to proteins L7/L12 (43). Protein L11 is located at the base of the stalk, forming part of the elongation factor binding site (44, 45), and in this way is functionally and structurally related to the acidic proteins. Yeast ribosomal protein L15, and consequently its mammalian homolog L12, are equivalent to bacterial protein L11 in different ways. Thus, there is immunological cross-reactivity between L11 and L15 (28), and both show similar structural properties, such as being the most methylated proteins in the ribosome (46, 47). Protein L15, like bacterial L11 and contrary to P0, is removed from the ribosome by NH4Cl-ethanol washing (48) and can form a complex with the P proteins in solution (49). In addition, L15 binds to the GTPase region in the 26-S rRNA (50), which was afterward confirmed for the equivalent rat protein L12 (41).
II. The Cytoplasmic Pool of the Stalk Components
None of the ribosomal components, neither rRNA nor proteins, is usually found free in the cytoplasm. It seems that cells adjust the amount of each component required for proper ribosome assembly by degrading any excess that may have leaked out from other control mechanisms (51). However, the acidic ribosomal proteins are an exception to this rule.
A. The 12-kDa Acidic Protein Pool
In bacteria, a cytoplasmic pool of L7 and L12 has been reported (52). Similarly, in A. salina (24), S. cerevisiae (53-55), and mammals (20, 56), there are free P1 and P2 proteins in the cytoplasm. The size of the eukaryotic acidic protein pool has not been precisely determined, and the available data show notable differences. By radioimmunoassay using rabbit specific antisera, about 75% of the acidic P proteins in the cell is in the supernatant of A. salina after removal of ribosomes (24). A similar proportion was estimated in S. cerevisiae extracts by immunoprecipitation (53), but only 0.3% of the
ribosome-bound acidic protein was detected free by immunoblotting in the same species (55). Important differences have also been detected when the amount of each individual acidic protein was estimated using specific monoclonal antibodies. According to these data, YP2β is the most abundant free acidic protein in the cytoplasm, with the amount estimated by ELISA in the S100 fraction being about 60% of the total in the cell. On the other hand, only 20% of YP2α and 3% of YP1β are detected free in the supernatant (57). Discrepancies may also be due to the metabolic state and to the genetic background of the cells; considerable differences have been found for the same protein among cells collected at different stages in the growth curve as well as in different yeast strains (B. L. Ortiz-Reyes and J. P. G. Ballesta, unpublished results). In spite of these discrepancies, the results clearly indicate the presence of an important and unusual cytoplasmic pool of acidic P1/P2 proteins.
B. The Pool of Protein P0
As for the other components of the GTPase center, proteins P0 and L15, data on the existence of a cytoplasmic pool are even scarcer. P0 has not been detected in the cytoplasmic fractions from S. cerevisiae either by immunoprecipitation (58) or by immunoblotting (59), using specific antibodies. Moreover, P0 does not seem to accumulate in the cytoplasm in conditions that result in an important overexpression of P0 mRNA (59), indicating that the excess protein is degraded, as has been reported for many other ribosomal proteins in similar conditions (51). In similar overexpressing conditions, the acidic proteins accumulate in the cytoplasm at even higher concentrations than in standard conditions (60). Therefore, P0 does not seem to have the structural characteristics that protect P1/P2 from the cellular mechanisms that identify and degrade all non-ribosome-bound ribosomal components. However, it must be noted that a complex of P0 and P1/P2 was found by column chromatography in supernatant fractions derived from mammals (4, 20). Also, protein P0 immunogenic determinants have been reported on the surface of the mammalian cytoplasmic membrane (61). In S. cerevisiae, P0 has been detected in membrane preparations, but the estimated amount does not seem to exceed that corresponding to the ribosomes that usually are found associated with those cellular fractions (C. Santos and J. P. G. Ballesta, unpublished results). On the whole, the results indicate the existence of substantial differences in the functional role of protein P0 in higher and lower eukaryotes that should be explored further.
C. The Pool of the Eukaryotic L11-like Protein
Using specific monoclonal antibodies, we have not found noticeable amounts of protein L15 in ribosome-free extracts of wild-type S. cerevisiae. This protein is detected in a relatively high proportion in total cell extracts of disrupted strains carrying one or more inactivated acidic protein genes (B. L. Ortiz-Reyes and J. P. G. Ballesta, unpublished results). Independent of the role that these results suggest for the acidic proteins in the control of L15 expression, they clearly indicate that L15 may escape, like P1 and P2 and contrary to P0, from the scavenging mechanism that destroys unbound ribosomal components.
III. The P1/P2-P0 Protein Complex
A. Structure of the Complex
As previously noted, P0 interacts with P1 and P2, forming an association analogous to the bacterial L10-L7/L12 complex (62), also present in archaebacteria (63). This complex was first found free in extracts from dog pancreas (20) and HeLa cells (4). Afterward, it was established by cross-linking that, in the A. salina ribosome, protein P0 is flanked by P1 and P2 homodimers, which led to the proposal of a P0-(P2)2/(P1)2 pentameric structure for the complex (40). The estimation by ELISA, using a rabbit antibody reacting with the carboxyl-terminal peptide common to P0 and P1/P2, of about five antigenic molecules per ribosome (64) is also in agreement with the pentameric structure of the complex. In bacterial (65, 66) and archaebacterial (67) systems, the acidic proteins interact with protein L10 through their amino-terminal domain. The interaction of the acidic proteins with the ribosome in S. cerevisiae has been explored using lacZ gene fusions. Different fragments of the four acidic protein genes were fused to the reporter gene, and the distribution of the fused proteins in the cell, followed through the β-galactosidase activity, was estimated (19). Fused proteins carrying at least 50 amino acids from the amino-terminal domain are bound to the particles, indicating that the carboxyl domain is not required for interaction with the particles (19). In addition, removal of the last 20 amino acids from the C terminus of YP2β does not affect its capacity to bind to the ribosome, whereas, on the contrary, a fragment of the protein comprising about 40 amino acids from the C end is not found in the ribosome (T. Naranda, J. Sanchez, and M. Remacha, unpublished results). All these results confirm that, in eukaryotic ribosomes, acidic protein binding also takes place through the N-terminal domain, and the C-terminal part does not seem to have an important role in this interaction (19).
B. Stoichiometry and Stability of the Complex
The total amount of acidic ribosomal proteins in the ribosome-bound complex seems to be variable. This variability, which has not been reported for the bacterial acidic proteins (68), was initially attributed to the release of the P1/P2 proteins from the ribosomes due to subunit dissociation during preparation (69). Afterward, it was found that the amount of acidic proteins in S. cerevisiae is different in various ribosomal fractions; thus, it is lower in 80-S ribosomes than in polysomes (64), and native 60-S subunits are totally deprived of acidic proteins (S. Zinker, personal communication). Moreover, the number of P protein molecules per ribosome in the whole ribosome preparation from yeast stationary phase cells, estimated using rabbit antisera specific to the conserved carboxyl end of the acidic proteins (also including P0), was about three, whereas five were estimated in the ribosomes from exponentially growing cells (64). On the contrary, in all the cases tested, the amount of P0 in the ribosomes does not seem to be affected, independent of the preparation. These results show that proteins P1/P2, but not P0, can be easily dissociated from the ribosome, indicating a lower stability of the eukaryotic P0-P1/P2 complex in comparison to the bacterial counterpart. This relatively low stability is also supported by the fact that the P0-P1/P2 structure does not resist the strong denaturing conditions that the L10-L7/L12 complex can withstand. The bacterial proteins remain associated even under standard conditions of electrophoresis for ribosomal proteins, and become visible in the gels as an independent spot that was initially considered to be a different ribosomal protein (70). An equivalent situation has not been reported for any eukaryotic system.
Moreover, as commented previously, NH4Cl-ethanol washing only dissociates P1 and P2 from the eukaryotic ribosome (32), whereas it releases the whole L10-L7/L12 complex from the bacterial particles (42). The difficulty in detecting the stalk in eukaryotic ribosomes may also be connected with the instability of the P0-P1/P2 complex. This structure is very prominent in bacterial ribosomes (71), but seems to be much less evident in eukaryotic ribosomes as observed by electron microscopy. It can be detected only in some views of the 60-S subunit (72) and is not discerned in the random-conical reconstruction model of the 80-S ribosome (73). The high degree of flexibility of the acidic proteins (74) could allow the stalk to fold over the body of the particle, which may hide this structure in the eukaryotic ribosome; however, considering the lability of the eukaryotic P0-P1/P2 complex, absence of the acidic proteins from a fraction of the ribosome population cannot be excluded. In addition to the relative instability of the eukaryotic stalk, an interesting conclusion that can be derived from the results on the variability of the acidic protein content is the apparent relationship between the amounts of P1/P2 proteins in the ribosome and the activity of the particle. The ribosomal particles either actively involved in translation (polysomes) or obtained from metabolically active cells (exponential growth phase) contain higher amounts of acidic proteins (64). However, the ratio of the P1/P2 acidic proteins is independent of the total amount of these components in the ribosome and does not change substantially in the different ribosome preparations. This fact, together with the requirement for the presence of both P1 and P2 for binding to the ribosome (which will be dealt with later), indicates that a lower content of acidic proteins in a ribosome preparation probably reflects the presence of a fraction of particles totally depleted of these ribosomal components. Therefore, the experimental data suggest that roughly half of the ribosome population in stationary phase cells may totally lack P1 and P2 proteins.
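The "roughly half" estimate can be checked with a short calculation. The sketch below assumes a two-population model that the text implies but does not state as a formula: each ribosome carries either the full pentameric complex (about five antigenic molecules, P0 plus four P1/P2 copies) or P0 alone (one antigenic molecule). The function name and its parameters are illustrative and are not taken from the cited work.

```python
def fraction_with_full_stalk(mean_copies, full=5.0, depleted=1.0):
    """Return the fraction f of ribosomes carrying the full stalk complex.

    Assumed model (hypothetical, for illustration only):
        f * full + (1 - f) * depleted = mean_copies
    where `full` is the number of antigenic P molecules on a complete
    ribosome (P0 + 4 P1/P2) and `depleted` is the number on a ribosome
    that retains only P0.
    """
    f = (mean_copies - depleted) / (full - depleted)
    if not 0.0 <= f <= 1.0:
        raise ValueError("mean_copies lies outside the range spanned by the model")
    return f


# About three P molecules per ribosome were estimated in stationary phase
# cells, versus five in exponentially growing cells (64).
stationary = fraction_with_full_stalk(3.0)
print(f"Fraction with full stalk in stationary phase: {stationary:.2f}")
print(f"Fraction lacking P1 and P2: {1.0 - stationary:.2f}")
```

Under these assumptions, a mean of three antigenic molecules per ribosome corresponds to half of the particles lacking P1 and P2, in line with the estimate given in the text.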
IV. Exchange of P Proteins in the Ribosome
A. Exchange of P1 and P2 Proteins
An early study of ribosomal proteins in S. cerevisiae found that incorporation of the labeled acidic proteins (then called proteins L35 and L36) into the particles takes place in conditions of ribosome assembly inhibition (16). The simplest interpretation of these results was that the labeled proteins present in the cytoplasm were exchanged with the unlabeled proteins in the preexisting mature ribosomes. This exchange process, which seemed to require active protein synthesis to occur, was limited to the 12-kDa acidic proteins and to a 40-kDa protein that is considered below. Similarly, using a different experimental approach, an exchange of acidic proteins between the ribosome and a cytoplasmic pool of proteins was also confirmed in regenerating rat liver (56) and in plant cell cultures (9). As mentioned previously, the presence of a cytoplasmic pool of acidic proteins, required for the exchange to take place, was confirmed subsequently in different eukaryotic organisms, including yeast and mammals. The existence of this acidic protein exchange is also an important functional feature differentiating the bacterial and eukaryotic acidic protein systems. This process has not been detected in E. coli in spite of the fact that it also has a cytoplasmic pool of L7/L12 (75).
The acidic protein exchange has also been confirmed in vitro using a yeast mRNA-dependent translation system supplemented with exogenous radioactive acidic proteins. It was possible to show that about 35% of the particles had incorporated one acidic protein molecule in conditions of protein synthesis; inhibition of translation with cycloheximide largely prevented the incorporation of exogenous P proteins into the particles. The incorporation of exogenous P1/P2 proteins took place in a much smaller proportion (only 12% of the ribosomes had one radioactive protein) in a poly(U)-dependent polyphenylalanine-synthesizing system; this low-level incorporation was insensitive to inhibition of the polymerization process and can be considered unspecific in the sense of not being linked to the translation process (76). Because it is accepted that the phenylalanine-polymerization system proceeds in the absence of the standard initiation and termination steps, these results suggest that exchange of acidic proteins probably requires one of these processes. Taking into consideration that, as mentioned previously, native 60-S subunits are deficient in acidic proteins, it seems reasonable to conceive that the exchange may take place before subunit association during initiation, although additional experimental evidence is required to confirm this proposal. The scarcity of experimental data makes it difficult to determine at this moment whether all acidic proteins are involved in the exchange and whether the process takes place in every ribosome after every round of translation. Data on the exchange rate would be of help in obtaining a clearer picture of the process and proposing a model. Unfortunately, an estimation of either the in vivo or in vitro exchange rates has not been reported.
The exchange process is probably controlled by the cell, but understanding the control mechanism, as with many other aspects of the exchange process, requires additional experimental data. It is, nevertheless, tempting to think that the exchange process is somehow related to the population of acidic-protein-depleted ribosomes present in the cells, perhaps through a phosphorylation/dephosphorylation mechanism, as will be discussed below (Section V,C).
B. Exchange of Protein P0
As indicated previously, a 40-kDa protein has been detected that is also exchanged in yeast under conditions similar to those for the 12-kDa acidic protein exchange (16). Although P0 had not yet been described when the 40-kDa protein was detected, the size of P0 and its capacity to be phosphorylated (Section V,A) make it possible to maintain that it corresponds to this 40-kDa ribosomal protein. However, the absence of a detectable cytoplasmic pool of protein P0 in S. cerevisiae (58, 59), as well as the strong affinity of P0 for the ribosome (resistant to strong washing conditions), make it questionable that such an exchange process can take place. In fact, subsequent data seem to indicate that this 40-kDa protein is probably not P0 but rather the 5-S rRNA binding protein L3 (77). Therefore, it seems improbable that an exchange of P0, analogous to the one found for P1 and P2, is taking place in S. cerevisiae. However, the situation is much less certain in the case of higher eukaryotes. The presence of a P0 cytoplasmic pool in dogs and humans (4, 20) would be compatible with an exchange with the protein in the ribosome, but this has yet to be shown experimentally. Nevertheless, the possibility that P0 may have a different behavior in higher eukaryotes cannot be totally excluded.
V. Phosphorylation of the Stalk Proteins
A. P-protein Phosphorylation
One of the most interesting features of eukaryotic acidic proteins is their capacity to be phosphorylated. Proteins P1 and P2 are found phosphorylated in the ribosome (2), with the only reported exception of Tetrahymena pyriformis (78). If precautions are taken to avoid dephosphorylation, the four P proteins from S. cerevisiae are found totally modified in the ribosome. Curiously, bacterial proteins L7/L12 can be phosphorylated in vitro by a rabbit protein kinase that uses GTP as a donor substrate (79), although this is metabolically irrelevant because these proteins are not found modified in the cell. No further analysis of the phosphorylation sites in the bacterial proteins has been performed, and it is therefore impossible to conclude to what extent the bacterial modification is structurally related to the eukaryotic process. A careful estimation of the phosphate groups in the acidic proteins has been carried out only in the case of S. cerevisiae (17, s). Serine was the only modified amino acid, and only one phosphate per protein was estimated, which is in agreement with the presence of only one phosphorylated form of the proteins in isoelectrofocusing gels (17). The A. salina P proteins also seem to have a unique phosphate group (24). On the other hand, P1 and P2 proteins from mammals are multiphosphorylated in the ribosomes; up to eight different bands are detected when resolved by the standard 2-D PAGE for ribosomal proteins; the bands are reduced to only two after phosphatase treatment or in SDS-PAGE (80). No precise analysis of the number of phosphorylation sites has been performed on mammalian P proteins, but at least three phosphorylation sites must exist in rat liver (27).
It must be noted that the acidic proteins in the cytoplasmic pool are nonphosphorylated in S. cerevisiae (53, 54), although they seem to be at least partially modified in the cytoplasm of mammals (81). The first report on the phosphorylation sites in the 12-kDa acidic proteins indicated that A. salina eL12, equivalent to P2, was modified at the last serine within the conserved carboxyl end (24). Afterward, an equivalent serine in the P proteins from ascites was shown to be susceptible to casein kinase II in vitro (81) and also a probable site of phosphorylation in vivo. In S. cerevisiae, an amino-terminal sequencing of tryptic phosphopeptides indicated that proteins YP1α and YP2α are modified in peptides containing serines around position 70 (82). However, further mutagenesis studies to confirm the phosphorylation sites in these proteins indicated that proteins mutated at the initially proposed phosphorylation sites, namely, serine 71/79 in YP2α and serine 62 in YP1α, could still be phosphorylated (83). These results led to a repetition of the previous tryptic analysis, this time carrying out a complete sequencing of the labeled peptides. It was found that, in spite of the conditions used to ensure total trypsin digestion of proteins (10:1 enzyme/substrate ratio, at 37°C for 18 hours in 1 M guanidine), the last lysine residue in the sequence, probably due to its highly acidic environment (Fig. 2), is totally insensitive to the enzyme. In these conditions, the tryptic phosphopeptide derived from the proteins extended from around position 60 to the end of the sequence, including the complete carboxyl-terminal domain, containing the last serine residue in either position 96, in proteins YP1α, YP1β, and YP2α, or position 100 in YP2β (Fig. 2) (83).
An analysis of phosphopeptides using staphylococcal V8 protease has clearly shown that the only phosphopeptide derived from the four yeast acidic proteins, labeled in vitro as well as in vivo, contains only the last serine (Fig. 2) (R. Zambrano, E. Briones and J. P. G. Ballesta, unpublished). Therefore, these results indicate that S. cerevisiae acidic proteins are phosphorylated in the same position as those from A. salina and ascites. However, it must be noted that additional phosphorylation sites must be present in the ascites P proteins because multiphosphorylated forms of these proteins occur in mammalian cells (27, 80), although no data on their location are available. The fact that Tetrahymena, the only eukaryotic organism in which phosphorylation of the acidic proteins has not been found (78), has P proteins that lack the last serine in the sequence (84) confirms the involvement of this residue in the protein modification. Protein P0 also seems to be phosphorylated. As stated above, a phosphorylated band moving around the 40-kDa position in SDS-PAGE was found when the in vivo-labeled proteins from the large ribosomal subunit were analyzed (16). Because P0 has a molecular mass of 38 kDa, it has been assumed that the phosphorylated band corresponded to this polypeptide, although it may also correspond to protein L3, as has just been indicated. Nevertheless, two factors support the idea that the polypeptide is also modified in the cell, namely, the presence in P0 of the same structure that is the phosphorylation site in proteins P1 and P2 (Table I), together with the capacity of the protein to be phosphorylated in vitro (81). Moreover, we have been able to show that, in yeast in vivo-phosphorylated ribosomes, a radioactive band coincides with the protein recognized in Western blots by an antibody specific to P0, and the antisera can immunoprecipitate a radioactive protein of the expected molecular mass (C. Santos and J. P. G. Ballesta, unpublished results). Information on the possible phosphorylation sites in protein P0 is very scarce. Considering that this protein contains the same carboxyl end as the P1/P2 proteins, including the phosphorylatable serine (Table I), it is reasonable to assume that this residue will also be modified in P0. In agreement with this assumption, a truncated P0 lacking the carboxyl end is not phosphorylated in the ribosome (C. Santos and J. P. G. Ballesta, unpublished results), although the possibility that the truncation affects the modification at a different position cannot be ruled out.
B. Protein Kinases Involved in P-protein Phosphorylation
A number of protein kinases can phosphorylate the ribosomal acidic proteins (81, 85-91). Not surprisingly, considering the acidic character of the P proteins, casein kinase II (CKII) is one of them (81, 88, 92). CKII phosphorylates the four acidic proteins from yeast in vitro to roughly the same extent (92), as well as the P proteins from HeLa cells (81). Phosphorylation of mammalian proteins takes place at the last serine in the carboxyl terminus (84), and the equivalent serine is also modified in the yeast YP2β and YP1α proteins. This serine is in an amino-acid sequence that fits the consensus for the CKII phosphorylation site (93). Two protein kinases that can phosphorylate the acidic proteins in S. cerevisiae were found some time ago (86) but, to our knowledge, no further studies were reported. However, the purification of a different enzyme, a 60S subunit-specific kinase (PK60), having similar specificity toward the P proteins, but with somewhat different properties, was reported later by the same laboratory (89). This new protein kinase, contrary to CKII, phosphorylates the yeast acidic proteins to a different extent, with YP2α and YP1β being strongly modified and with YP2β practically unmodified (92); however, the proteins are phosphorylated by PK60 at the last serine in the amino-acid sequence (R. Zambrano, E. Briones and J. P. G. Ballesta, unpublished results), indicating that this enzyme has the same specificity as CKII with respect to the modification site. The insensitivity of protein YP2β to PK60 is unexpected because it is readily phosphorylated by CKII. These results, in addition to confirming the nonidentity of PK60 and CKII, suggest that PK60 does not exclusively recognize the almost identical carboxyl end of the proteins, but can detect differences in the rest of the protein structure that do not affect the activity of CKII. In this respect, one of the most obvious structural features that distinguishes YP2β from the other three acidic proteins is that the former is four amino acids longer. The acidic protein kinase purified from maize seems to phosphorylate only protein P2 (91), confirming the existence of protein kinases specific for individual P proteins. Regardless of the importance this difference in sensitivity to phosphorylation may have for the role of the various acidic proteins in vivo, which must wait for experimental confirmation, it reveals the existence of structural differences that are functionally detectable. It is especially notable that these differences in S. cerevisiae are detected between two members of the same type, YP2α and YP2β, which suggests that duplication of the acidic proteins in yeast is not a simple evolutionary accident without physiological consequences, but probably has some metabolic meaning. In addition to CKII and PK60, other protein kinase activities can phosphorylate the yeast acidic proteins. In fact, three of these ribosomal acidic protein (RAP) kinase activities are being studied in our laboratory.
One of them, RAPI, has been highly purified, and shows functional and structural properties different from CKII and PK60; like CKII, RAPI phosphorylates the four acidic proteins to the same extent but is not related to it either structurally or immunologically (90). RAPIII is especially interesting in that it is the only one that does not phosphorylate casein (G. Bou and J. P. G. Ballesta, unpublished results); all the other acidic protein kinases, including PK60, although preferentially modifying P proteins, phosphorylate casein to a certain extent, suggesting that, from that point of view, they can be considered as members of the casein kinase family. Independently of the number of kinases that phosphorylate the P proteins in vitro, little is known about the enzyme responsible for their in vivo phosphorylation. Some experiments have been performed in our laboratory using yeast strains carrying a temperature-sensitive CKII. A clear reduction of the P protein phosphorylation rate has not been detected on shifting the cells to the restrictive temperature, suggesting that CKII is probably not involved in the in vivo modification of the yeast P proteins (E. Briones and J. P. G. Ballesta, unpublished results), but further studies are required to confirm this conclusion.
C. Role of Phosphorylation on the Function of P Proteins
Phosphatase treatment of acidic proteins extracted from ribosomes in conditions that excluded effects other than the removal of the phosphate groups shows that phosphorylation is required for protein activity. Thus, phosphatase-treated S. cerevisiae P proteins, but not untreated proteins, were unable to reconstitute the EF2-dependent activities of ribosomes stripped of their original acidic proteins by salt-ethanol washing; this inactivation of the proteins was due to the inability of the dephosphorylated polypeptides to bind to the ribosomal core particles (17, 54). Similarly, dephosphorylated rat liver P proteins are inactive in reconstituting active ribosomes, but they are reactivated by in vitro phosphorylation (94). These data from lower eukaryotes as well as from mammals clearly indicate that phosphorylation is required for the binding of the P1 and P2 proteins to the ribosome in vitro and consequently for optimal activity of the particles. At the time these experiments were performed, protein P0 had not been identified as a component of the stalk-forming protein complex, and therefore, the question about the possible effect of its phosphorylation state on the interaction of the P1 and P2 proteins was not addressed. However, because under the conditions used to remove the acidic proteins from the ribosome P0 is not released (32, 48), and the phosphatase treatment was performed only on the split protein fraction, it is reasonable to assume that this polypeptide was present in the core particles in the phosphorylated form; therefore, the binding of the acidic proteins to the cores was carried out in the presence of a phosphorylated P0. It would be interesting to know whether P0 phosphorylation has any role in the interaction of P1 and P2, but at this moment there are no data either supporting or ruling out that possibility.
In order to test whether phosphorylation is also a requirement for the acidic protein activity in vivo, a site-directed mutation study was performed on yeast proteins YP1β and YP2β. It was found that only substitution of YP1β serine 72 and YP2β serine 19 makes the proteins unable to bind to the ribosomes in the mutated strains, and because the initial tryptic peptide analysis seemed to indicate these serines were the phosphorylation site in the respective polypeptides (82, 95), the conclusion reached was that phosphorylation is also a requirement for the interaction of the acidic proteins in vivo (82, 95). However, the recent demonstration that the four yeast acidic proteins are in fact phosphorylated at the last serine in the sequence has led us to reappraise the previous results and to perform new mutagenesis experiments to test the effect of the substitution of that residue on the protein activity in vivo. The results (83) show that a mutation in serine 96 of YP1α, YP1β, and YP2α, and in serine 100 of YP2β, does not affect the capacity of the protein to bind to the ribosome. Ribosomes from strains carrying each one of these mutations separately contain apparently normal amounts of the mutated protein, mostly in its nonphosphorylated form. The presence of a minor band in the position of the phosphorylated protein suggests that there may exist a small part of the protein that is phosphorylated at an alternative site, but it does not alter the conclusion that phosphorylation does not seem to be a prerequisite for binding of the individual acidic proteins to the ribosome. The previous results showing an effect of the substitution of serines 19 and 72 on the interaction of proteins YP2β and YP1β with the ribosome are probably due to an alteration of the protein conformation. The explanation for the apparent contradiction between the in vitro data, indicating a requirement for phosphorylation, and the in vivo results, which do not show such a requirement, may lie in the different conditions in which the protein binding takes place. The in vitro phosphatase treatment simultaneously dephosphorylates all the proteins in the preparation, and, therefore, the binding to the ribosome takes place in the presence of only nonphosphorylated polypeptides. In the mutant strains, on the contrary, only one of the acidic proteins is dephosphorylated at one time, and because the proteins form a P1/P2 complex, a possible helping effect of the phosphorylated proteins on the unmodified polypeptide may take place. This possibility could be tested in vivo by preparing an S. cerevisiae strain carrying only unphosphorylatable P proteins; an attempt to obtain this type of cell is being made using mutant strains lacking P proteins (96-98).
VI. Functional Roles of the Eukaryotic Stalk Components
A. Function of the P1 and P2 Acidic Proteins
Detailed analyses of the function of the eukaryotic stalk and its individual components in translation are scarce. The data available indicate that the acidic ribosomal proteins are required for activity of the elongation factors in A. salina (5), S. cerevisiae (48), and rat (99), which has led to the assumption that they play the same role that L7/L12 play in bacteria (100). If the basic role of the acidic proteins in translation is the same in all systems and can be performed by just one protein, as in bacteria and archaebacteria, their evolution into two different protein types, P1 and P2, in eukaryotes may be related to other types of functions. Eukaryotes control gene expression at the level of translation more frequently than do prokaryotes, using different types of mechanisms, and it is possible that duplication of the eukaryotic P proteins may be a consequence of their involvement in a regulatory process. The data available on the P proteins are compatible with this hypothesis.
1. DISRUPTION OF YEAST 12-kDa ACIDIC PROTEIN GENES
The four S. cerevisiae acidic proteins, YP1α, YP1β, YP2α, and YP2β, are each encoded by a single, independent gene, and the role of each one of them has been studied by gene disruption. The effect of these inactivations on cell growth and on the amount of the remaining acidic proteins in the total cell extracts and in the ribosome has been assessed, and the data (96-98) are summarized in Table III. The results indicate that none of the four acidic proteins is essential for cell growth and consequently for ribosome activity. Nevertheless, the independent inactivation of each gene produces a different but relatively small effect on cell growth. Thus, although the absence of YP2α scarcely affects the growth of mutant D4, inactivation of YP1α causes an increase of about 50% in the D7 doubling time. Considering the existence of a cytoplasmic pool of acidic proteins that could replace the missing polypeptide, the negative effect on cell growth of the absence of an acidic protein is an indication of the specificity of its function, a function that cannot be performed by any of the remaining P proteins. This conclusion is reinforced by results of the estimation of the proteins in ribosomes from mutant strains, which showed no compensatory increase in the other acidic polypeptides.
2. ACIDIC PROTEINS MUST BIND TO THE RIBOSOME AS A P1-P2 COMPLEX
Simultaneous inactivation of two P protein genes causes a harmful effect on cell growth that is roughly the sum of the effects separately induced by inactivation of genes of individual proteins. However, when the two proteins belong to the same group, either P1 as in mutant D67 or P2 as in mutant D45, a synergistic effect of both mutations is found; these mutants grow approximately a third as fast as do the wild-type cells. Moreover, no P protein was found in the ribosomes from these strains, not even the acidic proteins whose genes are intact, which must be free in the supernatant because they are found in the total cell extracts. The presence of these proteins in the supernatant, especially conspicuous in the case of the mutant D67, indicates that the proteins of one type, either P1 or P2, do not interact strongly with the ribosome; binding of the P
TABLE III
CHARACTERISTICS OF Saccharomyces cerevisiae DISRUPTED STRAINS DEFECTIVE IN ONE OR MORE ACIDIC RIBOSOMAL PROTEINS

Strain     Missing protein(s)      Growth rate(a)
W303-1b    None                    1
D4         YP2α                    1.05
D5         YP2β                    1.22
D6         YP1β                    1.38
D7         YP1α                    1.61
D45        YP2α/β                  2.61
D46        YP2α, YP1β              1.35
D47        YP2α, YP1α              1.77
D56        YP2β, YP1β              1.77
D57        YP2β, YP1α              1.83
D67        YP1α/β                  2.83
D456       YP2α/β, YP1β            2.88
D457       YP2α/β, YP1α            3.16
D467       YP1α/β, YP2α            3.16
D567       YP1α/β, YP2β            3.66
D4567      YP1α/β, YP2α/β          3.20

[Columns "Estimation of proteins(b)" (YP1α, YP1β, YP2α, and YP2β, each in C.E. and in the ribosome) are not recoverable from the extracted text; YP1α in C.E. is n.e. for all strains.]

(a) Growth rate in YEP rich medium expressed as the ratio of mutant doubling time to W303-1b doubling time (90 minutes).
(b) Proteins were estimated in the ribosomes by isoelectrofocusing (17), and in the total cell extracts (C.E.) by ELISA using specific monoclonal antibodies (57); YP1α could not be estimated (n.e.) due to the lack of a specific antibody. The amount of protein is proportional to the number of plus signs (+); +/-, trace; -, not detected.
REGULATORY FUNCTION OF THE RIBOSOMAL STALK
proteins requires the formation of a complex containing at least one protein of each type, P1-P2. It seems, therefore, that the composition of the stalk protein complex, P0-P1-P2, detected in standard A. salina ribosomes by chemical cross-linking (40), is obligatory, and not the consequence of both acidic proteins being present at the same time. At least in S. cerevisiae, one of the proteins cannot act as a substitute for the other in forming the structure of the stalk. Nevertheless, the presence of ribosome-bound P proteins in all the other doubly disrupted mutants, D46, D47, D56, and D57, indicates that the two forms, α and β, of the yeast proteins are equivalent in this respect, and can form a complex with either form of the other type with equal probability.
3. P1 AND P2 PROTEINS ARE NOT ESSENTIAL FOR CELL VIABILITY
One of the most surprising results from these gene disruption studies is
the viability of the mutant strain D4567 carrying four inactivated P protein genes. In fact, it is not more affected than the previously noted slow-growing mutants that have P-protein-depleted ribosomes but still express one or two of them, like the double disruptants D45 and D67 and the four triple disruptants. The presence of free P proteins does not improve the growth rate, and in some cases, like strain D567, cells expressing one P protein even grow more slowly than the quadruply disrupted D4567 (Table III). It is, therefore, evident that all these slow-growing mutant strains, including D4567, contain ribosomes that are functional in spite of being totally deprived of P proteins. In this respect, eukaryotes differ from prokaryotes. Although bacterial ribosomes deprived of protein L7/L12 are active under certain unusual conditions (101-103), these proteins seem to be required for ribosomal activity under standard conditions, and they are not among the numerous ribosomal proteins that are not absolutely necessary for ribosome activity (104). The activity of the P-protein-depleted ribosomes derived either from the double disruptants D45 and D67 or from D4567 has been tested in vitro; as expected, they are functional in polymerizing systems using synthetic as well as natural mRNAs, showing a level of activity around 35% of that of the control particles. This activity is strongly stimulated by exogenous P proteins (97, 98). The absence of acidic proteins from yeast ribosomes affects, as indicated, the efficiency of the particles but does not apparently alter their accuracy, because no significant increase has been detected in their in vitro misreading activity. Moreover, the capacity of the mutants to suppress nonsense codons in vivo is equally unaffected (98).
JUAN P. G . BALLESTA AND MIGUEL REMACHA
4. P1 AND P2 PROTEINS AFFECT THE PATTERN OF PROTEIN EXPRESSION
Notable differences were observed when the patterns of proteins expressed by the D4567 strain and the wild-type cells were compared by two-dimensional gel electrophoresis (98), and similar although not identical alterations are found in the other slow-growing double and triple disruptants (A. Jimenez-Diaz, M. Remacha and J. P. G. Ballesta, unpublished results). Expression of a number of proteins is notably reduced or even suppressed in the mutant strains, but other proteins are specifically expressed by these cells. A change in the proteins expressed was also detected in vitro when mutant and wild-type ribosomes were used to translate the same preparation of total yeast mRNA; the wild-type pattern is restored on addition of exogenous P1/P2 proteins to the mutant extracts (98). An effort is being made to identify these differentiating proteins. A search of the protein 2D-electrophoresis data bank has indicated that one set of proteins that is notably increased in the D4567 pattern (76) seems to correspond to some heat-shock proteins that are also expressed preferentially in stationary phase cells (105). A change in the protein expression pattern induced by the absence of acidic proteins in the ribosome is congruent with the phenotypes of the mutant strains. Thus, some are cold sensitive, do not use some carbon sources, and, when in the diploid state, are unable to sporulate (98). These phenotypic characters, which might be the result of suppression of the expression of any of the proteins involved in the respective processes, can be reversed by transformation with plasmids that carry the gene encoding a complementary acidic protein, whose expression allows the formation of the P1-P2 protein complex that can bind to the ribosome.
It seems, in summary, that the eukaryotic ribosomes depleted of acidic P1 and P2 proteins can become involved in the normal protein synthesis machinery, but translate the mRNA pool with different efficiency, either higher or lower, than the standard particles; this results in a different pattern of expressed proteins, affecting some metabolic pathways. The effect of the acidic P proteins on the translation efficiency of the ribosomes and, consequently, on the pattern of protein expression by the cells is important with respect to the role of these ribosome components in translation. The availability of ribosomes lacking acidic proteins is, therefore, a useful tool for studying this function further. However, what makes these results especially interesting is the fact that this type of P-protein-depleted ribosome seems to exist in the cell, and increases under certain metabolic conditions. Thus, as previously noted, ribosomes from stationary phase yeast contain roughly half the amount of P
proteins, and this reduction seems to affect the four acidic proteins equally. Because all the previous considerations about the composition of the stalk suggest that ribosomes cannot contain less than one dimer of P1 and one dimer of P2, this reduction in the total amount of P proteins implies the accumulation in stationary cells of ribosomes (roughly half of the population) that do not contain P proteins. These ribosomes may take part in a translation control mechanism specific for stationary cells; the expression of stationary-phase-specific proteins by strains containing P-protein-depleted ribosomes, mentioned above, would support this hypothesis.
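The arithmetic behind this estimate can be written out explicitly. The sketch below assumes, as argued in the text, that a ribosome carries either a complete set of four 12-kDa acidic proteins (one P1 dimer and one P2 dimer) or none at all. The symbols f (fraction of complete ribosomes) and \bar{n} (mean number of P1/P2 copies per ribosome) are introduced here only for illustration.

```latex
\bar{n} = 4f + 0\,(1 - f) = 4f,
\qquad
\bar{n}_{\mathrm{stationary}} \approx \tfrac{1}{2} \times 4 = 2
\;\Longrightarrow\;
f \approx \tfrac{1}{2}
```

Under these assumptions, halving the average P1/P2 content is equivalent to about half of the ribosomes being completely depleted, rather than to every ribosome carrying a partial stalk.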
B. Functional Analysis of Protein P0
1. P0 IS ESSENTIAL FOR CELL VIABILITY
Protein P0 is the other fundamental component of the eukaryotic stalk protein complex. The structural and functional role of P0 has also been analyzed by gene disruption methods. P0 is encoded in S. cerevisiae by a single gene. All attempts to knock out the P0 gene, using either haploid or diploid strains, have failed. At the same time, these results showed a little-understood capacity of the gene to be duplicated during the disruption process (59), and indicated that the gene is probably required for cell viability. A conditional null mutant strain in which the genomic P0 gene was placed under the control of an inducible GAL1 promoter was then constructed. Shifting the mutant cells from galactose to glucose media resulted in a growth halt as well as in a dramatic increase in cell death, confirming that, in contrast with the 12-kDa acidic P1 and P2 proteins, P0 is essential for ribosome activity (59). In the absence of P0 synthesis, defective 60-S particles, lacking P1 and P2 as well as P0, accumulate, which results in the formation of half-mer polysomes. Afterward, degradation of the defective particles takes place (59).
2. FUNCTION OF THE P0 CARBOXYL-TERMINAL DOMAIN
Taking advantage of the conditional null mutant, the role of the C-terminal domain of the P0 protein was tested. Successive deletions of the 3' region of the gene were performed, resulting in a series of truncated forms of P0 lacking different portions of the carboxyl end. The deleted genes were tested for their capacity to support the growth of the conditional mutant strain in glucose. In this way, it was shown that removal of the last 21 amino acids, almost totally identical to those in the P1 and P2 proteins (Table I), has little effect on ribosome function, affecting cell viability very mildly; however, a progressive reduction of the C-end extension increasingly affects the stability of the stalk complex, making the ribosome less active, and reduces
the cell growth rate notably (Fig. 3). Finally, when 132 amino acids are removed, representing practically the whole extension, proteins P1 and P2 cannot interact with the truncated form of protein P0. In these conditions, the ribosomes become inactive and the cells are nonviable in glucose (106). It seems, therefore, that the part of the P0 extension corresponding to the amino-terminal domain of the P1/P2 proteins has a key role in the interaction of the 12-kDa acidic proteins and in the formation of the stalk. On the contrary, the last amino acids do not seem to have a relevant part in the stabilization of the complex. This is in agreement with the previously mentioned results obtained with the 12-kDa acidic proteins, which indicated the involvement of the N terminus but not of the C terminus in the interaction of these polypeptides with the ribosome (19). However, the presence of the P0 C-terminal peptide is essential for ribosome activity and cell viability if the 12-kDa P1/P2 proteins are not found on the ribosome. This was shown when the effect of the P0 deletions was tested in a conditional P0 null mutant strain that also carries P1/P2 gene disruptions, and contains P1/P2-deficient ribosomes. In this strain, the truncated polypeptide lacking only the last 21 amino acids, which was functional in the wild-type strain, was unable to restore growth of the mutant cells (106).
[Fig. 3 panel labels. Genetic background and viability: (A) wild type, YES; (B) D67, YES; (C) P0ΔC21 in wild type, YES; (D) P0ΔC21 in D67, NO; (E) P0ΔC132 in wild type, NO.]
FIG. 3. Diagram of the eukaryotic stalk structure in different stalk-protein mutant strains and effect on ribosome activity and cell viability. (A) In wild-type ribosomes it is proposed that the stalk is formed by the association of one P1 dimer and one P2 dimer with the carboxyl extension of protein P0. Proteins P0 and L15, by binding to the 26-S rRNA GTPase center (heavy black line), form the base of the stalk structure. (B) P1 and P2 can be eliminated from the ribosome in different disrupted strains, like D67, reducing the growth rate but without an effect on cell viability. (C) The region encoding the last 21 aa of P0 (P0ΔC21) can be deleted in wild-type ribosomes, but (D) the same deletion is lethal in disruptant strains lacking P1 and P2 in the ribosome. (E) Deletion of the complete P0 C-terminal extension (P0ΔC132) impedes the formation of the P0-P1/P2 complex and is lethal in all cases.
Together with the data on the activity of the ribosomes from the quadruple disruptant strain discussed earlier, these results indicate that, as schematically shown in Fig. 3, the eukaryotic P0-P1/P2 complex, and, therefore, the large ribosomal stalk, contains five structurally and functionally similar C termini, four provided by the two P1/P2 dimers and one by P0. Supporting the pentameric structure of the C-end complex is the estimation of close to five molecules of P proteins per ribosome, using antibodies that cross-react mainly with the conserved carboxyl termini of P0 and P1/P2 proteins (64). Optimal activity of the ribosome is achieved in the wild-type strain by the simultaneous presence of the five C-end moieties; however, protein P0 alone is able to provide the minimal stalk required for the activity of the ribosome in the disruptant strains (Fig. 3). The addition of P1 and P2 increases the efficiency of the ribosome, but does not seem to affect the accuracy of the particle (98). The removal of the C end from P0 only slightly affects the activity of the particle in the wild type, because it simply reduces to four the number of copies of this domain in the stalk, but it is obviously lethal in the disruptants that, in these conditions, would totally lack this essential element.
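The counting argument above can be sketched as a small model. This is only an illustration under stated assumptions: the names P0_FORMS, conserved_c_termini, and predicted_viable are hypothetical, only the three P0 forms described in the text are covered, and the viability rule (at least one conserved C terminus in the stalk) is a simplification of the interpretation given for Fig. 3, not a measured quantity.

```python
# Illustrative model only: identifiers and the viability rule are assumptions
# drawn from the interpretation of Fig. 3, not from any published software.

P0_FORMS = {
    # form: (P0 keeps its own conserved C terminus, P0 can still bind P1/P2)
    "full": (True, True),     # intact P0
    "dC21": (False, True),    # last 21 residues removed
    "dC132": (False, False),  # whole C-terminal extension removed
}


def conserved_c_termini(p0_form: str, p1p2_available: bool) -> int:
    """Count conserved C termini in the stalk under the pentameric model."""
    has_own_tail, binds_p1p2 = P0_FORMS[p0_form]
    count = 1 if has_own_tail else 0
    if p1p2_available and binds_p1p2:
        count += 4  # one P1 dimer plus one P2 dimer
    return count


def predicted_viable(p0_form: str, p1p2_available: bool) -> bool:
    """Assumed rule: the stalk needs at least one conserved C terminus."""
    return conserved_c_termini(p0_form, p1p2_available) >= 1


if __name__ == "__main__":
    # The five situations drawn in Fig. 3 (P1/P2 absent corresponds to D67).
    cases = [
        ("full", True),
        ("full", False),
        ("dC21", True),
        ("dC21", False),
        ("dC132", True),
    ]
    for form, p1p2 in cases:
        n = conserved_c_termini(form, p1p2)
        print(form, "P1/P2 present" if p1p2 else "P1/P2 absent", n,
              predicted_viable(form, p1p2))
```

With these assumptions the model reproduces the pattern reported for Fig. 3: five, one, and four C termini in the three viable situations, and none in the two lethal ones. It says nothing about the intermediate deletions, for which the text reports a graded loss of stalk stability.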
C. Functional Exchange of Heterologous Acidic Proteins
The analogous function that the acidic proteins have in their respective cells, together with their highly conserved primary structure, strongly supports the idea that the proteins from different eukaryotic species would be easily interchangeable. However, attempts to confirm this experimentally have shown the idea to be erroneous. Thus, the acidic proteins P1 and P2 from Dictyostelium discoideum, in spite of the close sequence similarity to those from S. cerevisiae, are unable to reverse the negative effect caused by the absence of their counterparts when transformed into the appropriate yeast disruptant strains. Whereas D. discoideum P1 protein can bind to the P1-deficient yeast ribosomes, although the binding is not functional, protein P2 does not bind to the yeast ribosomes and is immediately degraded (107). Similar results have been obtained when rat liver P1 and P2 proteins are expressed in yeast disruptant strains (B. Bermejo, M. Remacha, and J. P. G. Ballesta, unpublished results). In both cases the results suggest that the heterologous proteins are unable to bind properly to the ribosome, in other words, to form a functional heterologous P0-P1/P2 complex. Transformation of S. cerevisiae P0 conditional null mutants with plasmids carrying D. discoideum P0 allows growth of the cells under restrictive conditions, namely, in the presence of glucose, indicating that in this case the heterologous protein is functional (M. A. Rodriguez-Gabriel and J. P. G. Ballesta, unpublished results). It seems that D. discoideum P0 binds correctly to the yeast 26-S rRNA, and the bound protein can perform its function as part of the stalk. In fact, this result is not totally surprising considering that the P0 binding site, the GTPase center, is one of the more conserved regions in the large rRNA (50, 108). Therefore, the RNA binding site at the amino-terminal domain of the protein can also be expected to be highly conserved. On the other hand, as indicated several times before, the P0 C terminus is also highly conserved and practically identical to the one in the yeast polypeptide. Altogether, the heterologous expression data indicate that although the P0 rRNA binding site in the N-terminal domain and the functional carboxyl end are highly conserved, the P1/P2 binding site in the intermediate region of the protein has evolved to an extent that only the homologous acidic proteins are able to interact and form an active protein complex. In a similar way, the C end of the P1/P2 proteins has been conserved, but the amino domain, implicated in the formation of the P0-P1/P2 complex, has coevolved with its partner site in the P0 protein. The P0-P1/P2 complex, the stalk, must therefore be considered as a structural and functional unit that during evolution has been forced to maintain unaltered its interfaces with the other elements in the protein synthesis machinery. These interfaces correspond to the P0 N terminus interacting with the ribosome, and the C end of the proteins forming the tip of the stalk, essential for the elongation factor interaction. Apparently, mutations in the regions implicated in the interaction of the complex components were less damaging for cell viability, which has allowed a substantial evolutionary drift in the sequence of these regions, leading consequently to an incompatibility of the components from relatively close species.
VII. Regulation of Ribosome Activity and Translation by the Eukaryotic Ribosomal Stalk
All the experimental data on the structure and function of the eukaryotic stalk previously noted are compatible with the existence of a mechanism that may control the ribosomal activity and, therefore, the translation process, by regulating the amount of protein P1 and P2 bound to the ribosome. As shown schematically in Fig. 4, ribosomes carrying a full complement of proteins P1 and P2 are probably in equilibrium with particles deprived of these components. This equilibrium might be controlled by either protein kinase(s) or protein phosphatase(s) that would regulate the level of phosphorylation of the 12-kDa acidic proteins. Variation of the amount of acidic proteins in the ribosome as a function of the metabolic state of the cells
[Fig. 4 labels: KINASE; PHOSPHATASE; 2D-electrophoresis panels from W303-1b and D4567.]
FIG. 4. Scheme of the proposed ribosomal activity regulation by the P1 and P2 proteins. P1/P2-depleted 60-S ribosomal subunits are in equilibrium with complete particles, through the acidic protein cytoplasmic pool, possibly by a phosphorylation/dephosphorylation process. Both subunit types can participate in the translation process, but translate the pool of mRNAs in the cell with different efficiencies. Many mRNAs (b) seem to be translated to a similar level, but some (a), perhaps due to their more complex structure, seem to be exclusively translated by the complete particles; others (c), for still unknown reasons (specific structure?), are expressed only by the deficient ribosomes. As a result of these differences, the protein patterns expressed by cells carrying complete and depleted particles are not the same; some proteins are absent from the P1/P2-depleted system and a few are expressed only in this system, as indicated in the panels (bottom) selected from 2D-electrophoresis gels from wild-type (W303-1b) and P1/P2-deficient (D4567) strains.
suggests that these enzymatic activities are also integrated in the general cellular control network and modulated in response to the protein synthesis requirements. In actively growing cells, ribosomes containing a full complement of acidic proteins would accumulate, whereas in nongrowing conditions (stationary phase in the case of yeast) P1/P2-depleted ribosomes will prevail. Ribosomes containing normal amounts of P1 and P2, and therefore a complete stalk, will be involved in polysomes translating to different extents a set of mRNAs in cells growing exponentially, resulting in a defined 2D-electrophoresis pattern of expressed proteins. Depleted ribosomes will also be engaged in polysomes but will be translating a partially different set of mRNAs, producing a partially different pattern of protein expression. A large number of proteins are equally represented in extracts from both systems, indicating that their respective mRNAs are similarly translated by the two types of ribosomes. A substantial number of proteins are absent or notably reduced in the P1/P2-defective systems, and in these cases their mRNAs must be poorly translated by the incomplete ribosomes; in a few cases the proteins present in the defective systems are absent in the wild-type extracts. In this last case, the corresponding mRNAs seem to be exclusively translated by the P1/P2-depleted ribosomes. Although the experimental data are compatible with this regulatory model, there are a number of aspects that must be confirmed. First is the mechanism by which the binding of the P proteins to the ribosomes is regulated. The effect of dephosphorylation on the in vitro affinity of the proteins for the ribosome, noted earlier, strongly suggests that this modification is involved in the process; however, the in vivo implication of P protein phosphorylation is still not totally confirmed.
Second, it has not been defined which protein kinases and phosphatases are responsible for the modification of the proteins, and therefore it is not known whether they are also regulated. Finally, the mechanisms by which the differential translation by the P1/P2-defective ribosomes takes place have not as yet been explored in detail. The reduction or even the total absence of some proteins in the P1/P2-defective systems may be due to a more complex secondary structure of their mRNAs, imposing some serious constraints on their translation by the less efficient defective ribosomes. This possibility is being explored by comparing the expression of various CAT mRNAs carrying a 5' UTR with double-stranded structures of different stabilities (109). The preliminary results indicate no substantial differences in the relative levels of expression of the same mRNAs in wild-type strains and in P1/P2 disruptants (E. Guarinos and J. P. G. Ballesta, unpublished results), indicating that at least the structure of the untranslated leader region does not notably affect the efficiency of
their translation by complete and P1/P2-defective ribosomes. These results are in some ways not totally unexpected and confirm that the large ribosomal subunit does not seem to have a role in scanning the mRNA 5' leader region, as indicated by the generally accepted model for eukaryotic protein synthesis initiation (110). Probably, the structure of the mRNA coding region will be more relevant in clarifying this mechanism, and experiments using mRNAs differing in the stability of the secondary structure of this part will also be performed. In any case, the greater difficulty in opening up complex double-stranded mRNA structures during translation cannot explain the expression of some specific proteins only by the defective ribosomes, because simple mRNAs should also be equally well translated by normal ribosomes. In these cases a positive selection of some mRNAs by the defective ribosomes must be considered. How the acidic proteins can affect the mRNA selection by the ribosomes is not clear, but the data confirm an implication of the large ribosomal subunit in the translation initiation process, as suggested by previous reports (111). The regulatory mechanism proposed, although analyzed in more detail in yeast, also seems to be present in other eukaryotic organisms and may have been developed as a way to regulate the expression of proteins required for stationary phase or, more generally, for conditions of low metabolic cell activity. In agreement with this suggestion is the fact that the ribosomes from the cells at the first stages of plant germination seem not to have acidic proteins, which are incorporated in a phosphorylated state at later stages of development (112). It would be interesting to explore whether this mechanism is also working in other types of resting cells.
VIII. Regulation of P1 and P2 Expression
Expression of the acidic P1 and P2 proteins seems to be controlled by the same mechanism used for the rest of the ribosomal proteins. In S. cerevisiae, expression of the YP2β gene and the structure of its promoter were analyzed in detail and did not show any particular differences with respect to other ribosomal proteins studied in yeast (113). Similarly, expression of protein YP2α is controlled in a way similar to that of other ribosomal components in secretory pathway mutants wherein the synthesis of ribosomes is affected (114). Nevertheless, in S. cerevisiae, the relative expression of the four 12-kDa proteins changes with the metabolic cellular state. Using lacZ fusions, expression of the four yeast P genes was tested in different growth conditions, including nutritional shifts and heat-shock treatments (115). The results indicated that the carbon source affects the synthesis of the four proteins differently, and, thus, P2 protein expression seems to be significantly inhibited in acetate media. Similarly, on heat shock, YP1β and YP2α levels drop notably, whereas those of YP1α and YP2β increase. Moreover, the optimal ratio of different acidic proteins also seems to be reached by a specific mechanism. Thus, the disruption of the genes encoding the P1 proteins in S. cerevisiae results in a strong stimulation of the P2 proteins, and more specifically of protein YP2β, which increases to four to six times its normal level in the total cell extracts; on the contrary, elimination of the P2 proteins results in a drastic decrease of the amount of P1 proteins in the cell (60). These results indicate that P2 proteins strongly stimulate the synthesis of P1, while P1 proteins repress the expression of P2. By means of this mutual and opposite regulation, the ratio of P1 and P2 proteins in cells reaches an equilibrium that is probably optimal for the specific conditions in which the cells are growing. The mechanism by which this mutual regulation takes place is presently being studied, but is only poorly understood. P1 proteins might work at the level of transcription because the amount of YP2β mRNA is increased up to threefold on disruption of the P1 protein genes, although the increase in the amount of protein is substantially higher. On the contrary, disruption of the P2 genes causes only a limited reduction of the P1 mRNA, which would not account for the reduction of the amounts of these proteins detected in the corresponding disruptants, suggesting the existence, in this case, of some kind of post-transcriptional control. In any event, considering the acidic character of the proteins, it is quite improbable that they interact directly with the nucleic acids and, therefore, they probably work through interaction with other proteins.
In this sense, it is interesting to note that in order to carry out their repressive effect, P1 proteins must be able to bind to the ribosome, because mutated YP1β proteins that do not bind to the particles are unable to repress the overexpression of P2 proteins (M. A. Rodriguez-Gabriel and J. P. G. Ballesta, unpublished results). All these observations together indicate that, in S. cerevisiae, the ratio of the four acidic proteins seems to be regulated, and it changes as a function of the metabolic conditions of the cell. At this time, there are no data indicating what the physiological meaning of these changes may be, but in the same way that the presence or absence of the acidic proteins affects the pattern of protein expression, as noted above, we can speculate that the presence of different acidic proteins in the ribosome may also affect the capacity of the particles to select the mRNAs. In this way, a variation in the metabolic conditions may induce a change in the ratio of the two P1 and two P2 proteins, which will result in ribosomes translating a pattern of proteins better adapted for the new physiological state of the cells.
In fact, acidic proteins regulate their own expression by a mechanism that may be a case of control through the ribosome-bound P proteins. Thus, as we have just indicated, expression of YP2β is increased on removal of P1 proteins from the ribosomes, whereas expression of YP1β is strongly stimulated when P2 proteins are bound to ribosomes. It may be that the YP2β mRNA is preferentially translated by P-protein-depleted ribosomes but the YP1β mRNA requires complete particles for translation. If this is so, variation in the expression of the different acidic proteins after heat shock, indicated previously, may also be due to a change in the amount of P proteins bound to ribosomes caused by the sudden temperature alteration; in fact, the apparent expression of some heat-shock proteins by the P-protein-depleted ribosomes establishes a connection between these deficient particles and temperature stress. These possibilities are presently being explored.
IX. Future Prospects
The stalk plays an essential role during protein synthesis, facilitating the interaction of the supernatant factors with the ribosome. This function has been conserved during evolution and, in all species tested, the stalk is required for elongation-factor-dependent activities. Nevertheless, the data clearly indicate that the ribosomal proteins forming the stalk in the eukaryotic ribosome, the proteins P0, P1, and P2, have a number of peculiarities that distinguish them from their prokaryotic counterparts. It seems that in addition to performing their basic role in the supernatant factor interaction, they may modulate the activity of the particle. The presence of P1 and P2 in the ribosome affects the efficiency of the particle but also seems to play a role in the selection of some specific mRNAs. However, this additional regulatory function has been studied in detail only in the case of S. cerevisiae, and although the data suggest that there is a similar function in other species, they also indicate that the regulatory mechanism may not be exactly the same in all organisms. Thus, the level of phosphorylation of the P1 and P2 proteins increases with the evolution of organisms and is notably higher in mammals than in yeast. It would be important to know whether this difference has any functional significance in relation to the regulatory function of the proteins. Another remarkable difference among eukaryotes is the existence of more than one acidic protein of each type, P1 and P2, in some species. The number seems to be especially high in some protozoa. The S. cerevisiae data indicate that the relative expression of the different acidic proteins is altered by the nutritional conditions of the cells. Are the different proteins implicated in the translation of specific mRNAs?
If phosphorylation of acidic proteins is required for their regulatory role, it is obvious that characterization of the protein kinases and phosphatases responsible for their in vivo modification is essential in order to know how the ribosome regulatory mechanism is connected to the overall metabolism control network. In any case, regardless of the questions that must still be answered to confirm and clarify the regulatory process in which the eukaryotic acidic proteins seem to be involved, they are a good example of how the structure and function of some ribosomal components have evolved from having only a role in the basic protein synthesis machinery to also performing regulatory activities. From this perspective, it will be highly interesting to study in detail the structure and function of the ribosomal stalk components in the archaebacteria, where they are encoded in bacteria-like operons (116-119) but have a structure closer to the eukaryotic proteins (39).
ACKNOWLEDGMENTS
We thank M. C. Fernandez Moyano for expert technical assistance. This project was supported by Grant PB94-0032 from the Dirección General de Política Científica (Spain) and by an institutional grant to the Centro de Biología Molecular from the Fundación Ramón Areces (Madrid).
REFERENCES
1. M. Radermacher, T. Wagenknecht, A. Verschoor and J. Frank, EMBO J. 6, 1107 (1987).
1a. M. Stoffler-Meilicke and G. Stoffler, in “The Ribosome: Structure, Function and Evolution” (W. E. Hill, A. Dahlberg, R. B. Garrett, P. B. Moore, D. Schlessinger and J. R. Warner, eds.), p. 123. American Society for Microbiology, Washington DC, 1990.
1b. A. Liljas, Int. Rev. Cytol. 124, 103 (1991).
2. H. Bielka, “The Eukaryotic Ribosome,” Springer-Verlag, Berlin and New York, 1982.
3. I. G. Wool, Y. L. Chan, A. Gluck and K. Suzuki, Biochimie 73, 861 (1991).
4. B. E. Rich and J. A. Steitz, MCBiol 7, 4065 (1987).
5. A. J. van Agthoven, J. A. Maassen and W. Moller, BBRC 77, 989 (1977).
6. S. Qian, J.-Y. Zhang, M. A. Kay and M. Jacobs-Lorena, NARes 15, 987 (1987).
7. J. D. Wigboldus, NARes 15, 10064 (1987).
8. J. Prieto, E. Candel and A. Coloma, NARes 19, 1340 (1991).
9. K.-D. Scharf and L. Nover, BBA 909, 44 (1987).
10. M. Remacha, M. T. Saenz-Robles, M. D. Vilella and J. P. G. Ballesta, JBC 263, 9094 (1988).
11. K. Mitsui and K. Tsurugi, NARes 16, 3575 (1988).
12. C. H. Newton, L. C. Shimmin, J. Yee and P. P. Dennis, J. Bact. 172, 579 (1990).
13. M. Beltrame and M. E. Bianchi, MCBiol 10, 2341 (1990).
14. M. J. Levin, M. Vazquez, D. Kaplan and A. G. Schijman, Parasitol. Today 9, 381 (1993).
15. T. Kruiswijk and R. J. Planta, FEBS Lett. 58, 102 (1975).
16. S. Zinker and J. R. Warner, JBC 251, 1799 (1976).
17. F. Juan-Vidales, M. T. Saenz-Robles and J. P. G. Ballesta, Bchem 23, 390 (1984).
18. C. Santos, B. L. Ortiz-Reyes, T. Naranda, M. Remacha and J. P. G. Ballesta, Bchem 32, 4231 (1993).
19. J. M. Payo, H. Santana-Roman, M. Remacha, J. P. G. Ballesta and S. Zinker, Bchem 34, 7941 (1995).
20. K. B. Elkon, S. Skelly, A. Parnassa, W. Moller, W. Danho, H. Weissbach and N. Brot, PNAS 83, 7419 (1986).
21. E. A. Mesri, G. Levitus, M. Hontebeyrie-Joskowicz, G. Dighiero, M. H. V. van Regenmortel and M. J. Levin, J. Clin. Microbiol. 28, 1219 (1990).
22. A. G. Schijman, G. Levitus and M. J. Levin, Immunol. Lett. 33, 15 (1992).
23. M. Soto, J. M. Requena, M. Garcia, L. C. Gomez, I. Navarrete and C. Alonso, JBC 268, 21835 (1993).
24. A. van Agthoven, J. Kriek, R. Amons and W. Moller, EJB 91, 553 (1978).
25. N. Brot, R. Marcel, E. Yamasaki and H. Weissbach, JBC 248, 6952 (1973).
26. S. Isono and K. Isono, MGG 183, 473 (1981).
27. A. Lin, B. Wittmann-Liebold, T. McNelly and I. G. Wool, JBC 257, 9189 (1982).
28. F. Juan-Vidales, F. Sanchez-Madrid, M. Saenz-Robles and J. P. G. Ballesta, EJB 136, 275 (1983).
29. A. Liljas and A. T. Gudkov, Biochimie 69, 1043 (1987).
30. A. Liljas, S. Eriksson, D. Donner and C. G. Kurland, FEBS Lett. 88, 300 (1978).
31. M. Leijonmarck, S. Eriksson and A. Liljas, Nature 286, 824 (1980).
32. G. Towbin, M. Siegmann and J. Gordon, JBC 257, 12709 (1982).
33. K. Mitsui and K. Tsurugi, NARes 16, 3573 (1988).
34. K. Mitsui, T. Nakagawa and K. Tsurugi, J. Biochem. 106, 223 (1989).
35. J. Prieto, E. Candel and A. Coloma, NARes 19, 1342 (1991).
36. A. G. Schijman and M. J. Levin, NARes 20, 2894 (1992).
37. Y. A. Skeiky, D. R. Benson, M. Parsons, K. B. Elkon and S. G. Reed, J. Exp. Med. 176, 201 (1992).
38. Y. Hihara, M. Umeda, C. Hara, K. Toriyama and H. Uchimiya, Plant Physiol. 105, 753 (1994).
39. L. C. Shimmin, C. Ramirez, A. T. Matheson and P. P. Dennis, J. Mol. Evol. 29, 448 (1989).
40. T. Uchiumi, A. J. Wahba and R. R. Traut, PNAS 84, 5580 (1987).
41. T. Uchiumi and R. Kominami, JBC 267, 19179 (1992).
42. J. H. Highland and G. A. Howard, JBC 250, 831 (1975).
43. G. N. Zecherle, A. Oleinikov and R. R. Traut, Bchem 31, 9526 (1992).
44. A. D. Beauclerk, E. Cundliffe and J. Dijk, JBC 259, 6559 (1984).
45. J. Egebjerg, S. D. Douthwaite, A. Liljas and R. A. Garrett, JMB 213, 275 (1990).
46. M. J. Dognin and B. Wittmann-Liebold, FEBS Lett. 84, 342 (1977).
47. M. Cannon, D. Schindler and J. Davies, FEBS Lett. 75, 187 (1977).
48. F. Sanchez-Madrid, R. Reyes, P. Conde and J. P. G. Ballesta, EJB 98, 409 (1979).
49. M. T. Saenz-Robles, M. D. Vilella, G. Pucciarelli, F. Polo, M. Remacha, B. L. Ortiz, F. Vidales and J. P. G. Ballesta, EJB 177, 531 (1988).
50. T. T. A. L. El-Baradi, V. C. H. F. de Regt, S. W. C. Einerhand, J. Teixido, R. J. Planta, J. P. G. Ballesta and H. A. Raué, JMB 195, 909 (1987).
51. J. R. Warner, Microbiol. Rev. 53, 256 (1989).
52. S. Ramagopal, EJB 69, 289 (1976).
53. S. Zinker, BBA 606, 76 (1980).
54. F. Sanchez-Madrid, F. Juan-Vidales and J. P. G. Ballesta, EJB 114, 609 (1981).
55. K. Mitsui, T. Nakagawa and K. Tsurugi, J. Biochem. 104, 908 (1988).
56. K. Tsurugi and K. Ogata, J. Biochem. 98, 1427 (1985).
57. M. D. Vilella, M. Remacha, B. L. Ortiz, E. Mendez and J. P. G. Ballesta, EJB 196, 407 (1991).
58. K. Mitsui, M. Motzuki, Y. Endo, S. Yokota and K. Tsurugi, J. Biochem. 102, 1565 (1987).
59. C. Santos and J. P. G. Ballesta, JBC 269, 15689 (1994).
60. B. Bermejo, M. Remacha, B. L. Ortiz-Reyes, C. Santos and J. P. G. Ballesta, JBC 269, 3968 (1994).
61. E. Koren, M. W. Reichlin, M. Koscec, R. D. Fugate and M. Reichlin, J. Clin. Invest. 89, 1236 (1992).
62. I. Pettersson and A. Liljas, FEBS Lett. 98, 139 (1979).
63. C. Casiano, A. T. Matheson and R. R. Traut, JBC 265, 18757 (1990).
64. M. T. Saenz-Robles, M. Remacha, M. D. Vilella, S. Zinker and J. P. G. Ballesta, BBA 1050, 51 (1990).
65. A. J. van Agthoven, J. A. Maassen, P. J. Schrier and W. Moller, BBRC 64, 1184 (1975).
66. A. T. Gudkov and J. Behlke, EJB 90, 309 (1978).
67. A. K. E. Kopke, P. A. Leggatt and A. T. Matheson, JBC 267, 1382 (1992).
68. S. Ramagopal and A. R. Subramanian, PNAS 71, 2136 (1974).
69. T. Kruiswijk, R. J. Planta and W. H. Mager, EJB 83, 245 (1978).
70. I. Pettersson, S. J. Hardy and A. Liljas, FEBS Lett. 64, 135 (1976).
71. W. A. Strycharz, M. Nomura and J. A. Lake, JMB 126, 123 (1978).
72. L. Montesano and D. G. Glitz, JBC 263, 4932 (1988).
73. A. Verschoor and J. Frank, JMB 214, 737 (1990).
74. R. R. Traut, J. M. Lambert and J. W. Kenny, JBC 258, 14592 (1983).
75. A. R. Subramanian, BBA 374, 400 (1974).
76. M. Remacha, A. Jimenez-Diaz, C. Santos, R. Zambrano, E. Briones, M. A. Rodriguez Gabriel, E. Guarinos and J. P. G. Ballesta, Biochem. Cell Biol. 73, 959 (1995).
77. F. Campos, M. Corona-Reyes and S. Zinker, BBA 1087, 142 (1990).
78. J. Sandermann, A. Kruger and K. Kristiansen, FEBS Lett. 107, 343 (1979).
79. O. Issinger and R. R. Traut, BBRC 59, 829 (1974).
80. R. Reyes, D. Vazquez and J. P. G. Ballesta, EJB 73, 25 (1977).
81. P. Hasler, N. Brot, H. Weissbach, A. P. Parnassa and K. B. Elkon, JBC 266, 13815 (1991).
82. T. Naranda, M. Remacha and J. P. G. Ballesta, JBC 268, 2451 (1993).
83. R. Zambrano, Ph.D. Thesis, Madrid Autonomous University (1995).
84. T. S. Hansen, P. H. Andreasen, H. Dreisig, P. Højrup, H. Nielsen, J. Engberg and K. Kristiansen, Gene 105, 143 (1991).
85. W. Kudlicki, N. Grankowski and E. Gasior, Mol. Biol. Rep. 3, 121 (1976).
86. W. Kudlicki, N. Grankowski and E. Gasior, EJB 84, 493 (1978).
87. W. Kudlicki, R. Szyszka, E. Palen and E. Gasior, BBA 633, 376 (1980).
88. C. Thoen, E. D. Herdt and H. Slegers, BBRC 135, 347 (1986).
89. M. Pilecki, N. Grankowski, J. Jacobs and E. Gasior, EJB 206, 259 (1992).
90. R. Szyszka, G. Bou and J. P. G. Ballesta, BBA 1293, 213 (1996).
91. G. Sepúlveda, R. Aguilar and E. Sanchez de Jimenez, Physiol. Plant. 94, 715 (1995).
92. R. Szyszka, A. Boguszewska, N. Grankowski and J. P. G. Ballesta, Acta Biochim. Pol. 42, 357 (1995).
93. E. A. Kuenzel, J. A. Mulligan, J. Sommercorn and E. G. Krebs, JBC 262, 9136 (1987).
94. W. P. MacConnell and N. O. Kaplan, JBC 257, 5359 (1982).
95. T. Naranda and J. P. G. Ballesta, PNAS 88, 10563 (1991).
96. M. Remacha, C. Santos and J. P. G. Ballesta, MCBiol 10, 2182 (1990).
97. M. Remacha, C. Santos, B. Bermejo, T. Naranda and J. P. G. Ballesta, JBC 267, 12061 (1992).
98. M. Remacha, A. Jimenez-Diaz, B. Bermejo, M. A. Rodriguez-Gabriel, E. Guarinos and J. P. G. Ballesta, MCBiol 15, 4754 (1995).
99. W. P. MacConnell and N. O. Kaplan, BBRC 92, 46 (1980).
100. W. Moller, Biochimie 73, 1093 (1991).
101. J. P. G. Ballesta and D. Vazquez, FEBS Lett. 28, 337 (1972).
102. V. E. Kotelansky, S. P. Domagatsky, A. T. Gudkov and A. S. Spirin, FEBS Lett. 73, 6 (1977).
103. B. R. Glick, FEBS Lett. 73, 1 (1977).
104. E. R. Dabbs, Biochimie 73, 639 (1991).
105. E. K. Fuge, E. L. Braun and M. Werner-Washburne, J. Bact. 176, 5802 (1994).
106. C. Santos and J. P. G. Ballesta, JBC 270, 20608 (1995).
107. B. Bermejo, J. Prieto, M. Remacha, A. Coloma and J. P. G. Ballesta, BBA 1263, 45 (1995).
108. T. Uchiumi and R. Kominami, EMBO J. 13, 3389 (1994).
109. F. A. Sagliocco, L. M. Vega, D. Zhu, M. F. Tuite, J. E. McCarthy and A. J. Brown, JBC 268, 26522 (1993).
110. M. Kozak, J. Cell Biol. 108, 229 (1989).
111. A. B. Sachs and R. W. Davis, Cell 58, 857 (1989).
112. A. Perez-Mendez, R. Aguilar, E. Briones and E. Sanchez de Jimenez, Plant Sci. 94, 71 (1993).
113. L. S. Kraakman, W. H. Mager, J. J. Grootjans and R. J. Planta, BBA 1090, 204 (1991).
114. K. Mizuta and J. R. Warner, MCBiol 14, 2493 (1994).
115. J. M. Payo, Ph.D. Thesis, Madrid Autonomous University (1993).
116. T. Itoh, EJB 176, 297 (1988).
117. A. K. E. Kopke and B. Wittmann-Liebold, Can. J. Microbiol. 35, 11 (1989).
118. L. C. Shimmin, C. H. Newton, C. Ramirez, J. Yee, W. L. Downing, A. Louie, A. T. Matheson and P. Dennis, Can. J. Microbiol. 35, 164 (1989).
119. C. Ramirez, L. C. Shimmin, P. Leggatt and A. T. Matheson, JMB 244, 242 (1994).
Regulation and Function of Adenosine Deaminase in Mice¹

MICHAEL R. BLACKBURN AND RODNEY E. KELLEMS²

Verna and Marrs McLean Department of Biochemistry
Baylor College of Medicine
Houston, Texas 77030
I. Developmental and Tissue-specific Expression of Ada
II. Regulation of Ada Gene Expression
   A. Gene and Promoter Structure
   B. Tissue-specific Regulation
   C. Model for Ada Gene Expression
III. Physiological Role of ADA during Development
IV. Role of ADA in the Murine Immune System
V. Role of ADA in the Secondary Deciduum
VI. Role of ADA in the Gastrointestinal Tract
VII. Summary
References
¹ Abbreviations: ADA, adenosine deaminase; Ada, adenosine deaminase gene; 5'-NT, 5'-nucleotidase; PNP, purine nucleoside phosphorylase; XO, xanthine oxidase; GDA, guanine deaminase; dpc, days postcoitum; CAT, chloramphenicol acetyltransferase; AdoHcy, S-adenosylhomocysteine; AdoMet, S-adenosylmethionine.
² To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 55. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

Adenosine deaminase (ADA; EC 3.5.4.4) is an essential and widely distributed enzyme of purine catabolism that catalyzes the hydrolytic deamination of adenosine and deoxyadenosine to inosine and deoxyinosine (1, 2) (Figs. 1 and 2). The enzyme is present throughout the evolutionary phyla and the amino-acid sequence of the protein has been highly conserved from bacteria to humans (3). The catalytic moiety of ADA exists as a single polypeptide chain with approximately 83.1% amino-acid sequence identity between humans and mice (4, 5). The human ADA open reading frame consists of 1089 nucleotides with a deduced sequence corresponding to a 40,762-Da peptide containing 363 amino acids (4, 6). The murine ADA open reading frame is 1056 nucleotides long and encodes a deduced 39,900-Da protein of 352 amino acids (5). Murine ADA lacks the 11 carboxy-terminal residues of the human protein; however, the importance of this region is not known. Recombinant murine ADA has been subjected to crystallographic analysis, and the architecture of the enzyme is a parallel α/β barrel with eight central β strands and eight peripheral α helices (7). One zinc atom for each ADA molecule is tightly coordinated with three histidines, one aspartate, and a catalytic H₂O molecule within the active-site pocket (8). Mutations that alter these amino acids abolish ADA enzymatic activity, suggesting that zinc is an essential cofactor (9a). Functional studies indicate that ADAs from different species are similar with regard to substrate specificity and kinetic constants. The conservation of structure and function suggests that ADA plays an essential role throughout evolution.

FIG. 1. Schematic of the ADA-catalyzed reaction showing the hydrolytic deamination of adenosine and deoxyadenosine to inosine and deoxyinosine.

The importance of ADA for vertebrate systems stems primarily from the physiological impact of its substrates, adenosine and deoxyadenosine. Adenosine functions as an extracellular signal transducer that mediates a large variety of physiological effects by binding to adenosine receptors present on the surface of target cells (10, 11). Deoxyadenosine behaves as a cytotoxic metabolite that can kill cells by interfering with deoxynucleotide metabolism and/or S-adenosylmethionine-dependent cellular transmethylation reactions (12). Deoxyadenosine cytotoxicity is believed to provide the metabolic basis for the immune deficiency that accompanies ADA deficiency in humans (13, 14).

FIG. 2. Purine catabolic pathway. 5'-Nucleotidase (5'-NT) catalyzes the dephosphorylation of adenosine 5'-monophosphate (5'-AMP) and deoxyadenosine 5'-monophosphate (5'-dAMP). Adenosine deaminase (ADA) catalyzes the hydrolytic deamination of adenosine and deoxyadenosine to inosine and deoxyinosine, which are subsequently converted to hypoxanthine by purine nucleoside phosphorylase (PNP). Hypoxanthine can then be salvaged back to inosine monophosphate (IMP) by hypoxanthine phosphoribosyl transferase (HPRT), or can enter the nonsalvage pathway, where it is converted first to xanthine and then to uric acid by xanthine oxidase (XO).

Because of the physiological importance of ADA, it is not surprising that the enzyme is ubiquitously distributed among vertebrate tissues (1, 15). In addition, the levels of ADA are developmentally regulated, and expression varies markedly among tissues (15-21), suggesting tissue-specific metabolic responsibilities for the enzyme. In mice, the tissue-specific differences in ADA enzymatic activity span a range of more than 10,000-fold (15). During prenatal development, the highest levels are present at the maternal-fetal interface throughout postimplantation stages of development (18-21). In the adult, the highest levels occur in the gastrointestinal tract (15). Within these tissues, the levels of ADA are subject to pronounced developmental regulation. Thus, the expression of the Ada gene is characterized by highly enhanced expression in a small collection of diverse tissues and a low level of expression in most other tissues. In this review, our current understanding of how this complex pattern of ADA expression is achieved is discussed. In addition, progress toward understanding the physiological importance of ADA in various tissues using transgenic mice is reviewed.
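The open-reading-frame figures quoted in the introduction can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers are those given in the text, while the function names and the assumption that the quoted nucleotide lengths exclude the stop codon are ours.

```python
# Figures quoted in the text for the human and murine ADA open reading frames.
ORFS = {
    "human": {"nucleotides": 1089, "residues": 363, "mass_da": 40762},
    "mouse": {"nucleotides": 1056, "residues": 352, "mass_da": 39900},
}


def codons(nucleotides):
    """Number of sense codons, assuming the quoted length excludes the stop codon."""
    if nucleotides % 3:
        raise ValueError("ORF length is not a multiple of three")
    return nucleotides // 3


def mean_residue_mass(mass_da, residues):
    """Average mass per amino-acid residue in daltons."""
    return mass_da / residues


for species, orf in ORFS.items():
    n = codons(orf["nucleotides"])
    per_residue = mean_residue_mass(orf["mass_da"], orf["residues"])
    print(f"{species}: {n} codons, {per_residue:.1f} Da per residue")

# Difference in length between the human and murine proteins.
extra_residues = ORFS["human"]["residues"] - ORFS["mouse"]["residues"]
print("human minus mouse:", extra_residues, "residues")
```

Under that assumption the codon counts match the stated protein lengths, and the difference between the two proteins equals the 11 carboxy-terminal residues that murine ADA is said to lack.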
I. Developmental and Tissue-specific Expression of Ada
A. Expression during Prenatal Development

Very high levels of ADA are found at the maternal-fetal interface of mice throughout prenatal development in two genetically distinct tissues, the maternal deciduum and embryo-derived trophoblasts of the chorioallantoic placenta (18-21). ADA is initially of maternal origin and begins to accumulate in the gestation site 6.5 days postcoitum (dpc), 2 days after implantation. Immunohistochemical and in situ hybridization studies demonstrate that ADA synthesis at this time is primarily by secondary decidual cells (19, 20), which are specialized uterine stromal cells that surround the embryo on the antimesometrial half of the uterus (Fig. 3A). Examination of Ada expression during the development of experimentally induced deciduoma in rats and mice indicates that secondary decidual cells can synthesize high levels of ADA protein and mRNA in the absence of an embryo (22). Thus this initial phase of ADA synthesis at the maternal-fetal interface is almost exclusively maternal in origin.

ADA expression is developmentally regulated and increases over 200-fold to reach a maximum in secondary decidual cells by 9.5 dpc (Fig. 3A and C). Subsequently, this maternal source of ADA is lost to tissue regression, which is largely complete by 11.5 dpc (19). During this time the responsibility for synthesizing ADA shifts to embryo-derived trophoblast cells as they differentiate at the ectoplacental cone starting at 7.5 dpc, and in the giant cells that surround the implantation site. By 13.5 dpc, the junctional zone of the placenta is the major site of ADA synthesis in the gestation site. At this stage high levels of ADA are produced by all three trophoblast populations (labyrinthine trophoblasts, spongiotrophoblasts, and secondary giant trophoblasts), with the highest levels being found in the spongiotrophoblasts of the junctional zone (19, 20) (Fig. 3B and C). This pattern of expression persists through term (23).
Transfer of murine blastocysts carrying an electrophoretic variant of ADA into wild-type recipients was initially utilized to demonstrate a major shift from maternal to embryo-derived ADA during midgestation (19). More recently, this has been shown by the use of ADA-deficient mice (Section V, B). Thus, the second phase of ADA synthesis at the maternal-fetal interface is provided predominantly by embryo-derived trophoblasts of the placenta.

FIG. 3. Spatial and temporal pattern of ADA expression at the maternal-fetal interface. Paraffin cross-sections were reacted with a monospecific sheep antiserum to mouse ADA followed by immunoperoxidase detection (16). (A) By 9.5 dpc, ADA immunoreactivity is intense in the secondary deciduum, which by this stage is referred to as the decidua capsularis (dc). The mesometrial deciduum, also known as decidua basalis (db), is negative. Immunoreactivity is seen in trophoblasts of the labyrinthine (la) and junctional (basal) zone (bz) of the placenta. The embryo (located centrally) is negative. (B) By 13.5 dpc, the decidua capsularis has undergone massive regression (arrows), and ADA immunoreactivity is intense in the basal zone (bz) of the mature placenta. Reactivity is also seen in giant trophoblasts and the labyrinthine zone. The olfactory neuroepithelium (on) is indicated. (C) The temporal pattern of ADA enzymatic activity in the secondary deciduum and placenta is shown, plotted against days post coitum. Reproduced with permission from Knudsen et al. (19).

ADA enzymatic activity in the developing embryo and fetus is low relative to that found in the deciduum and placenta (19). However, immunoreactivity is detected in subsets of cell populations in the fetal central nervous system, including dorsal root ganglion (24) and the olfactory neuroepithelium (19). Activity has also been found in regions of the developing rat brain (25), and the developing gastrointestinal tract in the mouse also expresses ADA at birth (15). Taking into account total ADA enzymatic activity levels measured in the fetus and placenta, greater than 95% of ADA activity in the fetal gestation site is found in trophoblasts of the placenta (23). These high levels of ADA suggest there is an important role for ADA in purine metabolism during prenatal development in mice.

Other enzymes of purine catabolism are found in the deciduum and placenta (20, 21) (Fig. 2), including 5'-nucleotidase (5'-NT), which produces ADA substrates from the dephosphorylation of AMP and dAMP (21). 5'-NT is in the primary deciduum during implantation stages and in giant trophoblast cells of the developing and mature placenta. The levels of 5'-NT, however, are much lower than those measured for ADA. This pattern of 5'-NT expression suggests that there may be a need for adenosine synthesis during implantation and placentation. Indeed, endogenous adenosine levels surge in the pregnant uterus between 4.5 and 7.5 dpc (implantation stages) corresponding to elevations in 5'-NT enzymatic activity and low ADA enzymatic activity.
Purine nucleoside phosphorylase (PNP) and xanthine oxidase (XO), enzymes of purine catabolism distal to ADA, are detected in the gestation site, but at low levels that do not change during development. Therefore, enzymes of the purine catabolic pathway do not appear to be coordinately expressed at the maternal-fetal interface in mice. ADA is by far the most expressed enzyme in the pathway, suggesting a physiological need for the control of ADA substrates during postimplantation development.
B. Postnatal Expression

Postnatally, the highest levels of ADA in the mouse occur in the gastrointestinal epithelium (Table I), where the enzyme accounts for more than 20% of the total soluble protein (15, 26). ADA is in the keratinized squamous epithelium that lines the tongue, esophagus, and forestomach, and the simple columnar epithelium of the proximal small intestine (Fig. 4). In these tissues the level of ADA is subject to pronounced developmental control, being low at birth and achieving enormous levels within the first 5 weeks of postnatal life (15) (Fig. 4G). Intermediate levels of ADA enzymatic activity are found in the thymus, spleen, and liver, whereas brain and heart are tissues that exhibit low levels of ADA enzymatic activity (15, 20) (Table I). In humans, as with the mouse, high levels of ADA are found in the proximal small intestine (20, 27). The human thymus also exhibits high levels of ADA enzymatic activity in cortical thymocytes (28). ADA is also expressed in the human deciduum and placenta, but at levels much lower than those measured in the mouse (29).

TABLE I
ADA ENZYMATIC ACTIVITY IN VARIOUS MOUSE TISSUESᵃ

Tissue                                   ADA activityᵇ
Gastrointestinal tract
  Tongue                                 5200
  Esophagus                              2200
  Forestomach (nonglandular)             4200
  Glandular stomach                      150
  Duodenum                               4200
  Jejunum                                1900
  Ileum                                  360
  Large intestine                        50
Female reproductive tract
  Deciduum (experimentally induced)      1400
  Embryo-deciduum unit (9.5 dpc)ᶜ        1100
  Embryo (13.5 dpc)                      23
  Placenta (13.5 dpc)                    590
  Uterus                                 24
  Vagina                                 31
Miscellaneous
  Thymus                                 130
  Spleen                                 44
  Liver                                  63
  Kidney                                 63
  Lung                                   7
  Bladder                                37
  Ear pinna (skin)                       26
  Foot pad (skin)                        19
  Back (skin)                            7
  Cardiac muscle                         4
  Skeletal muscle                        4

ᵃ Data from Chinsky et al. (15).
ᵇ Values are given as nmol/min/mg protein.
ᶜ dpc, Days postcoitum.

FIG. 4. ADA enzymatic activity in the postnatal gastrointestinal tract of mice. Immunofluorescent localization of ADA in the murine forestomach (a), esophagus (b), and proximal small intestine (c), using a monospecific antiserum to murine ADA. The arrows indicate the specific immunofluorescent staining in the mucosal epithelial layer (Mu); L, lumen; M, mucosal layer. ADA-formazan staining for catalytic activity is denoted by arrows in the forestomach (d) and small intestine (f). Ontogeny of ADA-specific activity in tissues from BALB/c mice at the postnatal ages (in weeks) indicated (g). Samples are given as means ± standard deviations. Reproduced with permission from Differentiation, Developmental expression of adenosine deaminase in the upper alimentary tract of mice, Chinsky et al., 42, 172-183, Figures 1 and 3, 1990, © Springer-Verlag.

Unlike the uterus in pregnancy, Ada expression in the gastrointestinal tract is accompanied by high levels of expression of other enzymes of the purine catabolic pathway (Fig. 2), including PNP, guanine deaminase (GDA), and XO (20, 30). The proximal small intestine of the adult is the richest source of these enzymes. Because they serve collectively to produce uric acid, it is possible that production of this antioxidant is important in gastrointestinal physiology. The diverse array of Ada expression during both prenatal and postnatal development suggests that this enzyme may serve different functions in different tissues, and demonstrates the complexity of Ada gene regulation. These issues are addressed below.
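The tissue-to-tissue comparisons drawn from Table I can be made explicit with a short calculation. The following is a minimal sketch: the activity values are transcribed from Table I (nmol/min/mg protein), whereas the dictionary keys and helper names are illustrative choices of ours and not part of the original data set.

```python
# ADA specific activity (nmol/min/mg protein), transcribed from Table I.
ADA_ACTIVITY = {
    "tongue": 5200, "esophagus": 2200, "forestomach": 4200,
    "glandular stomach": 150, "duodenum": 4200, "jejunum": 1900,
    "ileum": 360, "large intestine": 50,
    "deciduum (induced)": 1400, "embryo-deciduum unit (9.5 dpc)": 1100,
    "embryo (13.5 dpc)": 23, "placenta (13.5 dpc)": 590,
    "uterus": 24, "vagina": 31,
    "thymus": 130, "spleen": 44, "liver": 63, "kidney": 63, "lung": 7,
    "bladder": 37, "ear pinna (skin)": 26, "foot pad (skin)": 19,
    "back (skin)": 7, "cardiac muscle": 4, "skeletal muscle": 4,
}


def fold_difference(tissue_a, tissue_b):
    """Ratio of ADA specific activity in tissue_a to that in tissue_b."""
    return ADA_ACTIVITY[tissue_a] / ADA_ACTIVITY[tissue_b]


def ranked(n):
    """The n tissues with the highest tabulated activity, highest first."""
    ordered = sorted(ADA_ACTIVITY.items(), key=lambda kv: kv[1], reverse=True)
    return ordered[:n]


print(ranked(3))
print(fold_difference("tongue", "skeletal muscle"))
print(round(fold_difference("placenta (13.5 dpc)", "embryo (13.5 dpc)"), 1))
```

Only the tissues listed in Table I are covered, so ratios computed this way describe the tabulated values and nothing beyond them.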
II. Regulation of Ada Gene Expression

The mammalian genome consists of approximately three billion base pairs of DNA sequence information organized into an estimated 100,000 genes. The evidence suggests that approximately 25,000 genes are expressed in all tissues and that the remaining 75,000 genes are expressed in a tissue-restricted manner. The Ada gene has features of each category, i.e., ubiquitous expression and tissue-enhanced expression, making it an ideal model for studying developmental and tissue-specific gene regulation. Considerable progress toward defining key regulatory features of the human and murine Ada gene has been made.
A. Gene and Promoter Structure

The complete nucleic acid sequence for the ADA gene in humans has been determined. The locus resides on chromosome 20 at q13.11 (31-33). The murine Ada gene is in the H2 region of chromosome 2 (34). The structures of the human and murine gene are similar (Fig. 5A and B), both containing 12 exons that span 32 kb (human) and 23 kb (murine) of nucleotide sequence. Intron 1 is large, spanning over 15 and 12 kb in the human and mouse, respectively (32, 35). Initial studies in the mouse determined that Ada promoter activity resides within a 240-bp fragment that contains the transcriptional initiation sites (36). Subsequent studies identified the position of core promoter and modulating elements within this region (37). Sequence analysis of this promoter region showed it to be extremely rich in G + C (77%) and devoid of TATA and CAAT box consensus sequences (Fig. 5C) (37). This region contains multiple Sp1 binding sites, which bind purified Sp1 protein. The sequence TAAAAAA, found 27 bp upstream of the major transcriptional initiation site, together with at least one Sp1 site, is necessary for basal transcription in vitro. This A-rich sequence binds transcription factor IID (TFIID), which can promote transcription in vitro. Similar features have been shown for the human ADA promoter (38).
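The two sequence features emphasized for the promoter, its G + C content and the position of the TAAAAAA element relative to the initiation site, are simple to compute for any candidate sequence. The sketch below uses a made-up toy sequence, not the murine Ada promoter; the helper names, the toy sequence, and the convention of measuring distance from the first base of the motif to the start site are all assumptions for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif):
    """0-based start positions of every (possibly overlapping) motif match."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


# Hypothetical toy promoter: GC-rich flank, the A-rich element, a GC-rich
# spacer, then the first transcribed bases. NOT the real Ada sequence.
MOTIF = "TAAAAAA"
UPSTREAM = "GGGCGG" * 3
SPACER = "GC" * 10
DOWNSTREAM = "GCCATGGCC"
TOY_PROMOTER = UPSTREAM + MOTIF + SPACER + DOWNSTREAM
TSS_INDEX = len(UPSTREAM) + len(MOTIF) + len(SPACER)  # first transcribed base

motif_start = find_motif(TOY_PROMOTER, MOTIF)[0]
print("G + C fraction:", round(gc_fraction(TOY_PROMOTER), 2))
print("motif begins", TSS_INDEX - motif_start, "bp upstream of the start site")
```

Applied to a real promoter sequence, the same two functions would report the G + C percentage and the offset of the A-rich element, provided the start-site index is known.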
FIG. 5. Structure of the human (A) and murine (B) Ada gene. Exons, denoted by black boxes, are numbered. Data for the human and murine gene are from Wiginton et al. (32) and Al-Ubaidi et al. (35), respectively. (C) Enlargement of the murine ADA promoter region, showing the G-stretch and Sp1 sites. The arrow denotes the transcriptional start site and the black box represents exon 1. The TAAAAAA element is essential for transcription and is believed to bind basic transcriptional complexes. Other modulating elements are shown (37).
ADA enzymatic activity and the steady-state abundance of messenger RNA correlate in all tissues and cell lines examined. This suggests that transcriptional regulation plays an important role in the expression of this gene. The 240-bp promoter region, alone or with the addition of 2.7 kb of 5’ flanking sequences, does not deliver enhanced expression of reporter genes in cell transfection studies or transgenic mice (36, 39, 40). Therefore, elements necessary for enhanced and tissue-specific expression reside elsewhere in the Ada gene.
B. Tissue-specific Regulation

1. REGULATION OF PLACENTAL EXPRESSION

Prenatally, Ada is expressed at high levels in trophoblast cells of the developing and mature chorioallantoic placenta (19, 20). Ada expression appears in trophoblast cells as they differentiate from the ectoplacental cone, starting 7.5 dpc. In the mature placenta, Ada is highly expressed in the spongiotrophoblasts that comprise the junctional zone, in trophoblasts of the labyrinthine zone, and in giant trophoblasts that surround the implantation site. Studies in transgenic mice have been critical in capturing regulatory elements responsible for developmental and tissue-specific expression in the placenta. A 6.4-kb fragment of DNA from the 5' flanking region of the murine Ada gene can direct high-level reporter gene expression to the placenta in a correct developmental manner, and to the forestomach postnatally, but not to any other tissue (23). Deletion analysis of this region revealed that placental and forestomach enhancer sequences are separable (D. Shi, unpublished data). The placental enhancer was delineated to a 756-bp fragment located between 5 and 6 kb upstream of the Ada transcription start site (Fig. 6). This fragment directed high levels of placental-specific chloramphenicol acetyltransferase (CAT) expression in transgenic mice. Furthermore, in situ hybridization studies monitoring the cellular localization of CAT transcripts showed that this placenta enhancer directed expression to all three trophoblast lineages at the appropriate levels.

FIG. 6. Schematic of tissue-specific regulatory domains in the Ada gene. Depicted is a region of the Ada gene encompassing 5' flanking sequences and the first three exons (black boxes) and introns. The locations of various regulatory domains are shown. The bent arrow represents the transcriptional start site. This schematic is not to scale. Key: promoter region; placenta enhancer; forestomach enhancer; thymus enhancer; facilitators (human), basal activators; decidual-specific hypersensitive site.

Insight into the mechanism by which the ADA placenta enhancer functions was provided by DNase-I footprinting assays and sequence analysis. DNase-I footprinting experiments revealed both general and placenta-specific protein-binding sites within this region (D. Shi, unpublished data). These regions may bind trans-acting factors necessary for placental enhancement. In addition, sequence analysis of this ADA placenta enhancer revealed that it contains sequence elements homologous to transcription-factor binding sites found in other placental-specific genes. These include potential sites for the trophoblast-specific-element binding protein (TSEB) (42), zinc-finger transcription factors such as GATA-2 (43) and GATA-3 (44), cAMP response-element binding proteins (45), and basic helix-loop-helix factors (46). This array of potential transcription-factor binding sites is consistent with either the use of a single combination of sites capable of functioning in all three trophoblast cell types, or a unique combination of sites, one for each type. Studies are under way to test these possibilities by creating deletion and site-directed mutants of each site for expression in transgenic mice, using in situ hybridization to monitor transgene cellular localization. In addition, efforts to identify the proteins that bind these regulatory sequences are under way. It is possible that both novel and previously characterized trans-acting factors may interact with these elements. The identification of these factors will be an important step toward understanding Ada expression in trophoblast cells of the placenta, as well as defining the genetic regulatory pathways governing trophoblast differentiation.
2.
REGULATION OF
THYMUS EXPRESSION
DNA sequences within intron 1 of the human ADA gene confer highlevel expression of the CAT reporter gene to the thymus of transgenic mice and in permanent lines of human T cells (27, 47, 48). Analysis of DNase-I hypersensitive sites within intron 1 revealed a thymus-specific enhancer domain responsible for high-level thymus expression. The enhancer domain is 200 bp long and contains at least two critical core regions that bind ADANF1 (c-myb) and ADA-NF2 (47a). The enhancer also contains consensus matches for other potential transcription factors. Closer analysis revealed that sequences flanking the enhancer are required for positional independent and copy-number proportional expression of transgenes in the thymus of transgenic mice (Fig. 6), but not in permanent lines of human T cells (47, 48). These elements, which are required for enhancement in transgenic animals, are referred to as facilitators, and are believed to participate in establishing a chromatin configuration necessary for enhancer function (48). The comparable region of DNA from the murine Ada gene has been cloned and shows considerable regions of conservation with the human gene (J. H. Winston, unpublished data). A 3.6-kb intron-1 fragment encompassing this region was tested in transgenic mice using a CAT reporter gene. High-level CAT expression was seen in the thymus, suggesting that a thymus-specific enhancer for the murine Ada gene resides in a region of intron 1 similar to that of the human thymus enhancer (Fig. 6). Apart from increased expression in the thymus, the 3.6-kb fragment activated basal levels of CAT expression in all tissues examined, a feature also associated with the human Ada T-cell enhancer (27). Associated with this basal activation was the formation of a DNase-I hypersensitive site at the promoter of the transgenes, suggesting that sequences within intron 1 may promote basal activation by altering chromatin configuration at the promoter. It is speculated
that sequences within intron 1 may function as a locus-control region ensuring that the ADA promoter is accessible to the basal transcription apparatus in all cells. It is not known if the facilitator sequences defined within intron 1 of the human gene are the same as the basal activator sequences found within intron 1 in the mouse. Efforts to delineate these regulatory regions and determine what trans-acting factors are involved will contribute to our understanding of how the Ada gene is activated.
3. REGULATION IN OTHER TISSUES
Other tissues that exhibit high levels of Ada expression are tissues of the proximal gastrointestinal tract, including the squamous epithelium of the tongue, esophagus, and forestomach, and the absorptive columnar epithelium of the proximal small intestine (15). During reproduction, the secondary deciduum is highly enriched in ADA as well (19). Evidence suggests that, as in the thymus and placenta, Ada expression in these tissues is regulated by specific enhancer sequences. DNA sequences in the 5’ flanking region, proximal to the placenta enhancer, are capable of directing expression of the CAT reporter gene to the forestomach of transgenic mice, but not to any other tissue (23). The forestomach enhancer lies within -2.7 to -5.0 kb relative to the Ada transcriptional start site (Fig. 6). From these findings, it is likely that regulatory sequences responsible for enhanced Ada expression in the tongue, esophagus, and proximal small intestine lie elsewhere within the gene. Studies are under way to analyze regions of the murine Ada locus for such activities.
Digestion of intact chromatin with DNase-I endonuclease is a means of identifying regions of altered chromatin organization within a locus that may represent areas containing cis-regulatory elements involved in tissue-specific gene regulation. Analysis of DNase-I hypersensitive sites in the murine Ada gene revealed a strong decidual-specific hypersensitive site within intron 2 (L. Long, unpublished data) (Fig. 6). This region has a high degree of sequence homology with a similar region of intron 2 of the human Ada gene. This suggests a conserved function for this region of DNA, which may include enhanced expression of Ada in the secondary deciduum of the pregnant uterus. Studies are in progress to test this region of DNA functionally for its ability to drive expression of CAT reporter genes in a decidual-specific manner in transgenic mice.
These transgenic approaches should allow us eventually to define functionally regions of the Ada gene necessary to accomplish its unique pattern of expression.
C. Model for Ada Gene Expression
The association between increased levels of ADA enzymatic activity and steady-state mRNA levels suggests that Ada expression is regulated at the
level of transcription; however, the mechanisms involved are less clear. Nuclear run-on transcription experiments show an abundance of transcription complexes at the 5’ end of the Ada gene, in all tissues and cells tested. In contrast, probes representing downstream portions of the gene reveal high levels of transcription only in tissues and cells that are enriched for ADA (49-52). These findings suggest that the Ada promoter region is available to initiate transcription in all cell types, and that transcription continues only in those cells that are programmed for enhanced expression. Transcriptional studies in Xenopus laevis oocytes and in vitro have defined two transcriptional arrest sites in the 5’ end of the murine Ada gene, one within the 3’ end of exon 1 and the other in the 5’ end of intron 1 (53-57). It is not clear what role, if any, these sites might play in tissue-specific regulation of ADA. Additional studies show that the Ada promoter is readily accessible to DNase-I digestion in nuclei isolated from a variety of tissues, regardless of ADA levels (L. Hong, unpublished data). It is speculated that sequences within intron 1 may function as a locus control region, ensuring that the Ada promoter is accessible to the basal transcription apparatus in all cells (Section II,B,2). Collectively, these data suggest that paused transcriptional complexes are likely present at the Ada promoter in all cells. The presence of 5’-paused transcriptional complexes has been proposed for other genes (58, 59), and the transcription of such genes is thought to be regulated by modifications in the paused polymerase complexes that allow for transcript elongation. A potential model for Ada tissue-specific enhancement would involve the regulation of elongation of these paused transcriptional complexes, allowing them to elongate in tissues in which Ada is highly expressed.
This could involve modifications in the paused complexes regulated by interactions with complexes associated with tissue-specific enhancer domains such as those defined for the thymus (27, 47, 48), placenta, and forestomach (23). Continued efforts to define tissue-specific regulatory elements and the proteins that interact with them, and to test their interaction with transcriptional complexes, will be paramount to understanding the genetic mechanisms that regulate the complex pattern of Ada expression.
III. Physiological Role of ADA during Development
A. Genetic Disruption of Ada by Homologous Recombination
The use of homologous recombination techniques together with embryonic stem-cell manipulation has allowed for the targeted disruption of genes
to assess their physiological function. A number of possible phenotypes can be predicted, based on the tissue distribution of ADA. The first tissue to express ADA abundantly during development is the placenta (19, 20); therefore, an embryonic or fetal phenotype might be expected in the absence of placental ADA. During postnatal development, ADA is enriched in the proximal gastrointestinal tract (15), implying a possible gastrointestinal tract phenotype in ADA-deficient mice. By analogy to humans, the absence of ADA in the immune system, particularly the thymus, might result in lymphopenia or immunodeficiency. Finally, during reproduction, ADA is abundant in the secondary deciduum (19), suggesting a possible reproductive phenotype in female ADA-deficient mice. Thus, the generation of ADA-deficient mice would provide extensive information on the physiological importance of ADA in a variety of tissues, prenatally and postnatally.
ADA-deficient mice have recently been generated by two independent groups, producing animals with independent sites of gene disruption (60, 61). In both cases, similar phenotypes were observed. Heterozygous matings produced Ada-null fetuses that died perinatally in association with severe hepatocellular impairment, incomplete expansion of the lungs, and small intestinal cell death. Liver damage was evident by 16.5 dpc and worsened through 18.5 dpc preceding death of the fetuses (60). This phenotype was accompanied by profound disturbances in purine metabolism. The lymphoid organs of ADA-deficient fetuses and newborn pups were not largely affected, although there were minor reductions in the number of CD4-positive and CD8-positive lymphoid cells in livers of ADA-deficient fetuses by 16.5 dpc (60). These observations suggest that ADA is essential for fetal survival in mice.
B. Metabolic Disturbances in ADA-deficient Mice
In attempting to understand the metabolic basis for the liver damage and perinatal lethality seen in ADA-deficient fetuses, the levels of ADA substrates and products were assayed (60). Profound disturbances in purine metabolism were observed, with adenosine and deoxyadenosine increasing in ADA-deficient fetuses and levels of inosine decreasing (Fig. 7A). Substrates began to increase 12.5 dpc, and by 17.5 dpc adenosine and deoxyadenosine had increased 3-fold and 1000-fold (Fig. 8), respectively. These observations suggest that disturbances in purine metabolism are likely involved in the mechanisms that lead to the phenotype observed in ADA-deficient fetuses.
In attempting to understand the physiological importance of ADA, attention is invariably focused on the metabolic impact of its substrates, adenosine and deoxyadenosine (12). It cannot be ruled out at this point that the products of the ADA-catalyzed reaction are themselves necessary; however, it is difficult to envisage a need for inosine and/or deoxyinosine. The
only known fate of each is to be phosphorolyzed to produce hypoxanthine and the respective sugar phosphate (Fig. 2). These compounds do not serve as precursors to critical metabolites that cannot be produced by other means. Thus, we do not believe it is likely that the liver pathology that accompanies ADA deficiency results from the inability of hepatocytes to produce inosine or deoxyinosine.
FIG. 7. HPLC chromatographic profiles showing purine metabolic disturbances in ADA-deficient fetuses. (A) Nucleoside profiles from wild-type (+/+) and homozygous mutant (-/-) fetuses 16.5 dpc. Peaks of interest include inosine (Ino), adenosine (Ado), and deoxyadenosine (dAdo). Notice that Ino is the major peak in the profile from the +/+ fetus, whereas Ado is low and dAdo is not detected. This pattern is reversed in -/- fetuses, in which Ino is reduced and Ado and dAdo are the major peaks found. (B) Nucleotide profile from whole blood collected from a heterozygous (+/-) and homozygous mutant (-/-) fetus 17.5 dpc. Whereas ATP is the major peak in blood from the +/- fetus, it is slightly reduced in blood from the -/- fetus. dATP levels are increased in -/- blood samples. The peak at 9.8 minutes in the -/- sample has tentatively been identified as dADP.
FIG. 8. Schematic of phenotypes and temporal disturbances in purine nucleoside and nucleotide metabolism in ADA-deficient fetuses. Depicted are the phases of ADA expression at the maternal-fetal interface. Decidual expression is during early postimplantation stages and placental expression is during fetal stages of development. Slight increases in fetal ADA expression are also shown. The shaded box represents consequences observed in ADA-deficient fetuses, including the absence of placental and fetal ADA; the accumulation of adenosine, deoxyadenosine, and dATP; the appearance of liver damage 16.5 dpc; and perinatal lethality.
Adenosine is an extracellular signaling molecule that elicits a vast array of physiological responses by engaging cell surface receptors (10, 11) (Fig. 9). Adenosine signaling is most often associated with cellular events that collectively serve to protect cells under metabolic stress, but can under some conditions be cytotoxic (62). Accumulations of adenosine seen in ADA-deficient fetuses (60) may disrupt adenosine signaling; however, little is known with regard to adenosine receptor types, levels, and localization in the murine embryo and fetus.
FIG. 9. Schematic of pathways influenced by levels of adenosine and deoxyadenosine, which accumulate in the absence of ADA enzymatic activity. Adenosine (Ado) and deoxyadenosine (dAdo), generated by nucleic acid breakdown during apoptosis, are taken up by cells via a ubiquitously expressed nucleoside transporter. Extracellular (EC) Ado influences intracellular signaling by binding subsets of adenosine receptors (AR). Intracellular (IC) accumulations in dAdo can interfere with deoxynucleotide synthesis via its phosphorylation to dATP and subsequent inhibition of ribonucleotide reductase. dAdo can also inhibit S-adenosylhomocysteine (AdoHcy) hydrolase, leading to disturbances in transmethylation reactions involving S-adenosylmethionine (AdoMet). Accumulations of Ado can also influence this pathway by conversion to AdoHcy. Ino, inosine; dIno, deoxyinosine; NDPs, nucleotide diphosphates; dNDPs, deoxynucleotide diphosphates.
Deoxyadenosine is a cytotoxic metabolite that kills target cells by interfering with deoxynucleotide metabolism and/or disrupting cellular transmethylation reactions (12) (Fig. 9). Interference with deoxynucleotide synthesis is mediated by the phosphorylation of deoxyadenosine to dATP via nucleoside and nucleotide kinases (63). Accumulation of dATP leads to the inhibition of ribonucleotide reductase and the disruption of deoxynucleotide synthesis needed for DNA replication and repair (13, 64, 65). Another route of deoxyadenosine cytotoxicity involves the inhibition of methylation reactions involving S-adenosylmethionine (AdoMet). The product of such methylation reactions is S-adenosylhomocysteine (AdoHcy), which is hydrolyzed to adenosine and homocysteine by AdoHcy hydrolase. This enzyme is inhibited by deoxyadenosine, leading to the accumulation of AdoHcy, which can function as a competitive inhibitor of many transmethylation reactions critical to cellular function (66, 67). AdoHcy hydrolysis is reversible; therefore,
accumulations of intracellular adenosine combining with free homocysteine can also cause increases in AdoHcy and subsequent inhibition of transmethylation reactions (67). These mechanisms of deoxyadenosine cytotoxicity are hypothesized to provide the metabolic basis for the immunodeficiency associated with ADA-deficient humans.
In addition to marked accumulations in deoxyadenosine, concentrations of dATP are greatly elevated in the blood of ADA-deficient fetuses 17.5 dpc (60) (Fig. 7B). AdoHcy hydrolase is inhibited in livers and other tissues of ADA-deficient fetuses (M. Wakamiya, unpublished data) and pups (61), and this inhibition is accompanied by disturbances in the levels of AdoHcy and AdoMet, suggesting an interference in AdoMet-related cellular transmethylation reactions. These metabolic disturbances suggest that deoxyadenosine cytotoxicity (Fig. 9) is likely to provide the metabolic basis for liver damage and subsequent perinatal death of ADA-deficient fetuses. This hypothesis is further supported by the observation that the murine liver is a source of both deoxyadenosine-phosphorylating enzymes and AdoHcy hydrolase (14, 68).
C. Genetic Reconstitution of ADA in the Placenta
Over 95% of the ADA enzymatic activity in the fetal gestation site resides in trophoblasts of the placenta (19, 23), suggesting an important role for ADA in this organ. The physiological importance of ADA in the placenta is not known; however, given that ADA-deficient fetuses, which also lack ADA in their adjoining placentas, die perinatally, it is reasonable to suggest that placental ADA plays an essential role during fetal stages of development. This hypothesis is supported by the observation that the metabolic disturbances seen in ADA-deficient fetuses are not evident until 12.5 dpc (60). This coincides with the removal of maternal ADA from the gestation site with regression of the secondary deciduum, and the failure of ADA to be expressed in ADA-deficient placentas (60) (Fig. 8).
We assessed the importance of placental ADA by genetically restoring the enzyme to placentas of ADA-deficient fetuses (69). This was accomplished by designing an ADA minigene capable of targeting ADA expression to placentas of transgenic mice at levels comparable to that of endogenous ADA. This minigene was equipped with Ada placental gene regulatory elements (23) and modified to include a 36-bp deletion in the 5’ untranslated region, a feature that enabled us to distinguish the minigene transcript from native ADA transcripts. Once transgenic mice carrying this minigene were generated, they were intercrossed with mice heterozygous for the null Ada allele. Among the progeny of such matings were animals hemizygous for the ADA minigene locus and heterozygous for the null Ada allele. When intercrossed, these animals served as a source of fetuses that were homozygous for the null Ada allele, some of which carried the ADA minigene locus.
Consistent with previous results (60), no ADA-deficient mice lacking the minigene were present at weaning; however, ADA-deficient mice carrying the ADA minigene locus were detected at a percentage suggesting a 100% rescue of these animals (69). Thus, the expression of the ADA minigene in the placentas of ADA-deficient fetuses was sufficient to rescue them from perinatal lethality, suggesting that placental ADA may play an important role during fetal stages of development.
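To make concrete what a complete rescue would look like in such an intercross, the expected genotype fractions can be worked out from simple Mendelian segregation. The following is only an illustrative sketch, not the analysis used in the cited study (69). It assumes, hypothetically, that the minigene locus and the null Ada allele are unlinked and assort independently, that every null homozygote lacking the minigene dies before weaning, and that every null homozygote carrying it survives. The function and variable names are invented for illustration.

```python
from fractions import Fraction
from itertools import product

# ASSUMPTION (not stated in the text): the minigene locus (Tg) and the
# null Ada allele (m1) are unlinked and segregate independently.

def gamete_freqs(genotype):
    """Return {allele: frequency} for a diploid genotype given as a 2-tuple."""
    freqs = {}
    for allele in genotype:
        freqs[allele] = freqs.get(allele, Fraction(0)) + Fraction(1, 2)
    return freqs

def cross(parent_a, parent_b):
    """Offspring genotype frequencies at one locus (unordered genotypes)."""
    out = {}
    for (a, pa), (b, pb) in product(gamete_freqs(parent_a).items(),
                                    gamete_freqs(parent_b).items()):
        key = tuple(sorted((a, b)))
        out[key] = out.get(key, Fraction(0)) + pa * pb
    return out

# Both parents are hemizygous for the minigene (Tg/0) and
# heterozygous for the null allele (m1/+).
tg_locus = cross(("Tg", "0"), ("Tg", "0"))
ada_locus = cross(("m1", "+"), ("m1", "+"))

p_null = ada_locus[("m1", "m1")]        # homozygous for the null allele
p_no_tg = tg_locus[("0", "0")]          # no copy of the minigene
p_null_with_tg = p_null * (1 - p_no_tg)
p_null_without_tg = p_null * p_no_tg

# ASSUMPTION: complete lethality without the minigene, complete rescue with it.
survivors = 1 - p_null_without_tg
expected_rescued_at_weaning = p_null_with_tg / survivors

print("m1/m1 with minigene among conceptuses:", p_null_with_tg)
print("m1/m1 without minigene among conceptuses:", p_null_without_tg)
print("m1/m1 among weanlings if rescue is complete:", expected_rescued_at_weaning)
```

Under these assumptions, an observed fraction of null homozygotes among weanlings close to the last value printed would be consistent with complete rescue, whereas a markedly lower fraction would point to partial rescue. Linkage between the two loci, or any transgene-independent loss of animals, would change the expectation.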
D. Prevention of Metabolic Disturbances by Placental ADA Expression
On closer examination of rescued fetuses, it was found that severe fetal liver damage was prevented. In addition, most of the purine metabolic disturbances found in ADA-deficient fetuses (60) were prevented by the expression of the ADA minigene in their adjoining placentas (Fig. 10). This included the prevention of deoxyadenosine accumulation in ADA-deficient fetuses and placentas (Fig. 10B and E), and the prevention of dATP accumulation in fetal blood (Fig. 10C). These findings further suggest that the high levels of ADA found in the murine placenta are important for fetal development, and that a major function of this enzyme in the placenta is likely to be the prevention of deoxyadenosine accumulation, which is potentially toxic to the developing embryo and fetus.
Interestingly, the accumulation of adenosine was not prevented in ADA-deficient fetuses expressing an ADA minigene in their placentas, although it was prevented in the placenta itself (Fig. 10A and D). Little is known with regard to the involvement of adenosine signaling during mammalian development. The failure to prevent adenosine accumulation in rescued ADA-deficient fetuses suggests that elevated adenosine levels are not overtly detrimental to these fetuses. However, the potential effects of high concentrations of adenosine on placental development and function are not clear. Adenosine receptors are known to be expressed in the murine placenta (70). The prevention of adenosine accumulation in the placentas of rescued fetuses raises the possibility that high levels of adenosine in the placenta may exert a detrimental effect on placental function in ADA-deficient mice by interaction with receptors. However, ADA-deficient and rescued placentas appear morphologically normal.
More knowledge regarding the expression of adenosine receptors in the placenta and fetus is needed before we can assign a role to adenosine signaling in the phenotypes observed.
FIG. 10. Levels of ADA substrates and dATP in fetuses and placentas 17.5 dpc. Fetuses and placentas from intercrosses between mice heterozygous for the null Ada allele and hemizygous for the ADA minigene locus (Tg) were collected 17.5 dpc, and nucleosides and nucleotides were extracted and analyzed using reversed-phase HPLC. (A) Fetal adenosine levels. (B) Fetal deoxyadenosine levels. (C) dATP levels measured in fetal blood. (D) Placental adenosine levels. (E) Placental deoxyadenosine levels. Measurements were made on samples heterozygous for the null Ada allele with the ADA minigene locus (Tg, m1/+, n = 4), and samples homozygous for the null Ada allele without (m1/m1, n = 2) or with (Tg, m1/m1, n = 2) the ADA minigene locus. Tg, m1/+ values are essentially the same as wild-type values. Values are given as means ± SE; N.D., not detected at a lower limit of detection of ≤0.001 nmol/mg protein. Reproduced by permission from Blackburn et al. (69).
Naturally occurring null alleles of the Ada gene have been observed in the human population and are associated with ADA-deficient immunodeficiency (12). In contrast to what we observed in mice, human ADA-deficient fetuses survive prenatal development and are not known to suffer severe hepatocellular damage, although mild liver findings have been noted (12). It is possible that fundamental differences in how human and murine hepatocytes respond to alterations in ADA substrates may be responsible for the difference in phenotypes. Alternatively, these differences may suggest that there is a greater need for ADA during mouse prenatal development. This
need may stem from a higher rate of ADA substrate generation resulting from tissue degeneration, which occurs at the maternal-fetal interface throughout murine development as a natural part of implantation chamber reorganization (71, 72). Genetic restoration of ADA in the placentas of ADA-deficient fetuses allows them to survive prenatal development, providing the opportunity to investigate whether postnatal mice develop phenotypes similar to those seen in humans (see Section IV).
E. Tissue-specific Rescues: A General Strategy
The rescue of ADA-deficient fetuses by genetically restoring ADA to the placenta illustrates a general strategy for the tissue-specific correction of phenotypes associated with null mutations in mice. It is not uncommon for null mutations generated by gene targeting to result in embryonic or fetal lethality (73). Whereas this demonstrates the importance of a gene during prenatal development, it prevents investigation of its function postnatally. For example, ADA is expressed at high levels at three different places and times during the murine life cycle: first in the trophoblasts of the placenta during prenatal development (19, 20); next in the gastrointestinal epithelium of the adult (15); and then in the secondary deciduum of the pregnant uterus (19). The lack of ADA in the first of these places, the placenta, results in a phenotype that precludes our ability to investigate its importance in adult tissues. One approach to such problems is to create a tissue-specific knockout of a gene by strategies that employ the recombinase Cre (74). An alternative approach was demonstrated here, and involved tissue-specific restoration of gene expression on a knockout background. Placental expression of an ADA minigene on an ADA-deficient background allowed for the survival of ADA-deficient fetuses through the prenatal phenotypic bottleneck, and provided adult mice that were used to assess the role of ADA in adult tissues, such as the gastrointestinal tract, the deciduum, and the immune system. Similar approaches may be useful in assessing the physiological role of genes in animals carrying targeted mutations that result in placental-related embryolethalities that prevent postnatal evaluations of these genes (75-77).
IV. Role of ADA in the Murine Immune System
A. Lymphopenia and Immunodeficiency Associated with Restricted Expression of ADA
Depending on the severity of the mutation, ADA deficiency in humans is associated with mild to severe immunodeficiency that, in severe cases, can lead to infant mortality, if not treated (12). The DNA fragment used to target
expression of the ADA minigene in transgenic mice contains regulatory elements for high levels of expression in the placenta prenatally and the forestomach postnatally (23). Consistent with this, ADA enzymatic activity in rescued mice is found only in the forestomach and to a lesser extent in other regions of the gastrointestinal tract (78). On investigating the status of the immune system in these rescued mice with restricted Ada expression, the spleen and thymus were found to be smaller and lymphoid cell counts were significantly reduced in these major lymphoid organs (Fig. 11). This suggests that these mice exhibited lymphopenia.
FIG. 11. Reduction in lymphoid organ size and cell number in adult mice with restricted ADA expression. Adult mice, rescued by the expression of the ADA minigene locus (Tg) in the placenta prenatally, did not express ADA in any tissue outside the gastrointestinal tract. Spleens (A) and thymuses (B) from 12-week-old Tg mice heterozygous for the null Ada allele (Tg, m1/+) and Tg mice homozygous for the null Ada allele (Tg, m1/m1), showing the reduction in lymphoid organ size in the absence of ADA. Similar reduction in organ size has been observed in all animals examined thus far. (C) Total lymphoid cells counted in spleens and thymuses from Tg, m1/+ and Tg, m1/m1 littermates ranging from 9 to 13 weeks of age. Two females and two males were examined for each genotype. Data (78) are given as total lymphoid cells (in millions) ± SD; *, significant with P < 0.02.
Further evaluation by flow cytometry of leukocyte subpopulations in spleens and thymuses from these mice showed that there were no significant differences in the distribution of leukocyte subpopulations in the thymus. However, leukocyte subpopulations were significantly altered in the spleens. The most striking change was a decrease in spleen T-cell populations, with a significant reduction in CD4-positive cells. A slight reduction in CD8-positive T cells and moderate increases in B cells (CD45R positive) were also observed. Consistent with this lymphopenia, lymphoid cells from rescued mice showed reduced responsiveness to mitogen stimulation, suggesting a partial immunodeficiency.
In humans, ADA deficiency is associated with a mild to severe immunodeficiency, depending on the severity of the mutation in the Ada gene. T-cell lymphopenia often precedes overt immune deficiency in patients with milder mutations (79), and may account for some cases of “HIV-negative CD4 lymphocytopenia” (12). Mice with restricted Ada expression resemble these patients in that they have lymphopenia and partial immunodeficiency (78). Similar results have been observed in mice chronically treated with the ADA inhibitor 2’-deoxycoformycin (80). The usefulness of rescued partially ADA-deficient animals as models for immune development and diseases awaits more thorough characterization of their immune system. However, it appears that these mice will be useful as animal models for studying the metabolic basis of ADA-related immunodeficiencies in humans (Section IV,B). In addition, these animals will be useful as tools for the advancement of enzyme and gene therapy approaches to treating ADA deficiency in humans. Efforts are under way to rescue ADA-deficient fetuses using placental-specific DNA elements, which are expected to yield postnatal animals completely deficient in this enzyme. It is speculated that these animals will exhibit more severe lymphopenias and immunodeficiencies.
B. Metabolic Basis for Partial Immunodeficiency
Purine metabolic disturbances are readily detected in patients with partial and complete ADA deficiency (65, 79, 81). These disturbances include elevations in plasma adenosine and deoxyadenosine, deoxyadenosine in the urine, elevated dATP and decreased ATP in erythrocytes, and decreased AdoHcy hydrolase activity in erythrocytes. Deoxyadenosine cytotoxicity is thought to provide the metabolic basis for the immunodeficiency observed in ADA-deficient humans (65); however, adenosine has also been shown to be toxic to T lymphocytes (62). Severe disturbances in purine metabolism have been observed in rescued mice with restricted Ada expression (78). Adenosine levels were elevated in the thymus, spleen, liver, kidney, and serum, whereas pronounced elevations of deoxyadenosine and dATP were observed only in the thymus of rescued animals. There was also an inhibition of
AdoHcy hydrolase in the spleen, thymus, and liver of rescued mice examined. Similar metabolic disturbances have been reported in mice treated with the ADA inhibitor 2’-deoxycoformycin (82). These findings strongly suggest that mice with limited Ada expression suffer lymphopenia and partial immunodeficiency resulting from deoxyadenosine cytotoxicity.
V. Role of ADA in the Secondary Deciduum
A. Pharmacologic Inhibition during Early Postimplantation Stages
During early postimplantation stages of development, ADA is abundant in the secondary deciduum surrounding the embryo (Fig. 3A and C) (19). As with the high levels of ADA in the placenta, high-level decidual ADA expression suggests that it may serve an important role during early postimplantation stages of development. Treatment of pregnant mice with the potent ADA inhibitor 2’-deoxycoformycin, 7.5 and 8.5 dpc, resulted in a high incidence of embryo lethality, which was evident by 12.5 dpc (83, 84). The teratogenic effects of 2’-deoxycoformycin coincide with these stages when decidual ADA is elevated. Inhibition of ADA in the gestation site was accompanied by disturbances in purine metabolism, including local increases in adenosine and deoxyadenosine (85), as well as dATP (86). Massive apoptosis was seen in embryos shortly after 2’-deoxycoformycin exposure, and in vitro studies suggest that the accumulation of deoxyadenosine was responsible for the induction of apoptosis (86). The phenotypic and metabolic effects seen were dose dependent, stage specific (7.5 and 8.5 dpc only), and stereoselective for the inhibitor, and suggest that ADA may be playing an important role during early postimplantation stages of development.
B. Reproductive Status of ADA-deficient Mice
Pharmacological studies with 2’-deoxycoformycin have involved the administration of the inhibitor to the whole animal, making it difficult to determine the relative roles of placental, embryonic, or decidual ADA during early postimplantation stages of development. The creation of ADA-deficient animals, followed by the correction of the placental phenotype, produced mice lacking ADA in all tissues outside the gastrointestinal tract, including the secondary deciduum. These mice provided an opportunity to test genetically the importance of decidual ADA. Intercrosses between rescued male and female mice with restricted Ada expression were reproductively successful (M. R. Blackburn, unpublished data). The females used in these studies lacked ADA in their deciduum.
However, because of the presence of the ADA minigene locus, they expressed normal levels of ADA in trophoblasts of their placentas. Subsequent matings between females lacking decidual ADA and males heterozygous for the null Ada allele, but lacking the ADA minigene, produced litters consisting of gestation sites, all of which were devoid of decidual ADA, and only some of which contained placental ADA. These litters were smaller than those of homozygous mating pairs; histological and immunohistochemical analyses revealed that gestation sites lacking both decidual and placental ADA showed signs of embryolethality by 9.5 dpc. These data suggest that ADA at the maternal-fetal interface during early postimplantation stages is essential for normal development. However, the relative importance of decidual or placental ADA is still unclear. Though the physiologic mechanisms involved are still not known, it appears that ADA provided by the ADA minigene locus in the developing placenta is sufficient to allow ADA-deficient embryos to survive early postimplantation stages of development without decidual ADA.
Massive cell death occurs at the maternal-fetal interface in the mouse as a natural part of the implantation process and the outward growth and expansion of the embryo and placenta (71, 72). Similarly, apoptosis is part of the developmental process of most organ systems in the developing embryo and fetus. Deoxyadenosine is generated from the breakdown of DNA from dying cells (12). The sensitivity of the embryo and fetus to deoxyadenosine cytotoxicity (60, 86) suggests that there must be a means to protect the embryo and fetus from deoxyadenosine accumulation. It seems likely that the high levels of ADA at the maternal-fetal interface serve to prevent the accumulation of deoxyadenosine generated by local cell death.
Assessment of the metabolic changes that occur in the embryo and fetus in the presence or absence of decidual and/or placental ADA will strengthen this hypothesis.
VI. Role of ADA in the Gastrointestinal Tract
The highest levels of Ada expression in the adult mouse are found in the proximal gastrointestinal tract (Table I) (15, 30). Immunohistochemical analysis revealed the enzyme to be predominantly localized to the keratinized squamous epithelium of the tongue, esophagus, and forestomach, and the simple columnar epithelium lining the villus of the small intestine (Fig. 4) (15). The physiological role of ADA in these tissues is not known; however, ADA is part of a collection of purine catabolic enzymes that are highly expressed in the proximal small intestine and that function to produce uric acid (30, 27) (Fig. 2). One possible function of ADA in the small intestine is to participate in the production of this antioxidant, which may be important in
trapping peroxyl radicals formed during digestion (30). As the simple columnar epithelial cells of the small intestine differentiate and migrate to the tips of the intestinal villi, they undergo apoptosis and are then shed into the lumen. This cell death is a potential source of deoxyadenosine, which may be harmful to local tissues such as the mucosally associated lymphoid tissue (87). High levels of ADA in these cells may be present to prevent deoxyadenosine from accumulating. This is consistent with a potential role of ADA in other tissues that undergo massive apoptosis, such as the thymus (12, 28) and deciduum (71, 72). However, ADA is not found in all tissues that exhibit apoptosis. Excessive apoptosis has been observed in the proximal small intestine of newborn ADA-deficient fetuses (61). This suggests that ADA may be playing an important role in this tissue. However, the perinatal lethality of these mice due to liver damage has thus far prevented the full manifestation and characterization of this phenotype. Gross examination of ADA-deficient mice rescued by placental ADA restoration suggests that there are no severe structural problems in the gastrointestinal tract of mice that express Ada only in their forestomachs. It is possible that the lack of normal amounts of ADA in the gastrointestinal tract may cause disturbances in uric acid production, which may affect gastrointestinal physiology. More information regarding the physiological importance of ADA in the proximal gastrointestinal tract awaits the outcome of ongoing efforts to generate rescued animals that are completely ADA deficient. 
Expression of ADA in various regions of the gastrointestinal tract appears to be regulated by separate tissue-specific gene regulatory elements (Section II,B,3) (23). As with placental Ada regulatory elements, these gastrointestinal tract elements will be useful for targeting expression of regulatory molecules to various regions of the proximal gastrointestinal tract to assess their physiological roles.
VII. Summary
Much has been learned about the human and mouse Ada genes: they are similar in their primary structure, they appear to be regulated in similar fashions, and their gene products play similar and critical roles in controlling the levels of physiologically active purines. ADA is an essential enzyme of purine metabolism, which is ubiquitously expressed, but also displays enhanced expression in specific tissues during development and in the adult. In mice, Ada is highly expressed prenatally at the maternal-fetal interface, first in the maternal deciduum during early postimplantation stages, and then in embryo-derived trophoblasts of the placenta during fetal stages of development (18-21). Postnatally, ADA is found at high levels throughout
the proximal gastrointestinal tract (15, 26). There is still much to be learned with regard to how this pattern of expression is achieved; however, there is mounting evidence that enhanced expression is regulated by distinct gene regulatory elements located throughout the Ada gene. Transgenic mice have proved to be critical for deciphering these regulatory elements, due to the complexity of Ada expression patterns and the need to monitor expression during the normal developmental program (23). The most thoroughly characterized enhancer domain is that required for enhanced expression in the human thymus. Both humans and mice contain a conserved enhancer domain within intron 1, which is necessary for enhanced expression in the thymus (27, 47, 48). A unique feature of this domain in the human is that it requires sequences referred to as “facilitators,” which establish a chromatin transition state necessary for the T-cell enhancer domain to function (48). Similar regions of the murine Ada gene possess the ability to activate the Ada promoter in all tissues. Considerable progress has been made in the identification and characterization of a placental enhancer in the mouse. It is located within the 5’ flanking region of the Ada gene and possesses placental nuclear-protein binding sites as well as many potential transcription-factor binding sites that may be important in placental-specific expression of ADA (23, 41). Other gene regulatory elements that are in the process of being characterized include forestomach, decidual, and small intestine tissue-specific enhancers. Deciphering gene regulatory elements responsible for the developmental and tissue-specific expression of Ada will improve our knowledge of how this gene accomplishes its unique pattern of expression, as well as increase our understanding of gene expression in general. 
In addition, defining tissue-specific gene regulatory elements will enable the targeting of selected cDNAs to the various tissues in transgenic mice. This will provide a new and powerful means of addressing the functional roles of various regulatory molecules within these tissues. For example, the identification of regulatory molecules and their receptors in subsets of trophoblasts at various stages of development has given rise to new hypotheses regarding mechanisms of implantation and placentation. These molecules include cytokines and their receptors, and matrix-degrading enzymes and inhibitors of these enzymes. Placental regulatory elements can now be utilized to misexpress these proteins, or mutant forms of them, to assess their physiological roles in the placenta. Another example of how these elements can be utilized is seen in the rescue of ADA-deficient mice through the restoration of placental ADA (69). This illustrates a general strategy for the tissue-specific correction of phenotypes associated with null mutations in mice. Placental ADA accounts for
over 95% of the ADA found in the fetal gestation site (23). ADA-deficient fetuses that also lack ADA in their adjoining placentas die perinatally in association with profound purine metabolic disturbances and hepatocellular impairment (60, 61). The importance of placental ADA was shown by genetically restoring the enzyme to the placentas of ADA-deficient fetuses (69). Doing so prevented most of the purine metabolic disturbances as well as severe liver damage. This suggests that disturbances in purine metabolism, particularly those related to deoxyadenosine cytotoxicity, are responsible for liver damage and subsequent perinatal lethality. The resulting postnatal animals have restricted ADA expression and show signs of CD4 lymphopenia and immunodeficiency, which were associated with thymus-specific accumulations in deoxyadenosine and dATP and profound disturbances in AdoHcy hydrolase metabolism (78). This genetic strategy has now provided animals that can be used to address long-standing questions with regard to the metabolic basis for the immunodeficiency associated with ADA-deficient humans. These animals can also be utilized in the development of enzyme and gene-therapy approaches to treating immunodeficiencies as well as developing new strategies for treating leukemias susceptible to disturbances in purine metabolism. It will be of interest to decipher the mechanism of liver toxicity in ADA-deficient animals, and to assess why this phenotype is not seen in humans lacking ADA. Finally, continued examination into the role of ADA at the maternal-fetal interface should provide compelling information into the control of physiological purines during development. In conclusion, the diverse pattern of Ada expression provides an excellent model for studying gene regulation, as well as tissue-specific responsibilities of this enzyme.
ACKNOWLEDGMENTS
We thank John Winston, Vera Sidaraki and David Wilson for their critical review of this manuscript. This work was supported by NIH Grants GM42436, DK46207, and HD30302. M. R. B. was supported by an NIH postdoctoral fellowship (HD07843).
REFERENCES
1. T. G. Brady and C. I. O’Donovan, Comp. Biochem. Physiol. 14, 101 (1965).
2. S. Frederiksen, ABB 113, 383 (1966).
3. Z. Chang, P. Nygaard, A. C. Chinault and R. E. Kellems, BJ 30, 2273 (1991).
4. D. A. Wiginton, G. S. Adrian and J. J. Hutton, NARes 12, 2439 (1984).
5. C. Y. Yeung, D. E. Ingolia, D. B. Roth, C. Shoemaker, M. R. Al-Ubaidi, J. Y. Yen, C. Ching, C. Bobonis, R. J. Kaufman and R. E. Kellems, JBC 260, 10299 (1985).
6. D. Valerio, M. G. C. Duyvesteyn, P. Meera Khan, A. G. van Kessel, A. de Waard and A. van der Eb, Gene 25, 231 (1983).
7. D. K. Wilson, F. B. Rudolph and F. A. Quiocho, Science 252, 1278 (1991).
8. A. J. Sharff, D. K. Wilson, Z. Chang and F. A. Quiocho, JMB 226, 917 (1992).
9. D. Bhaumik, J. Medin, K. Gathy and M. S. Coleman, JBC 268, 5464 (1993).
9a. J. Sideraki, K. A. Mohamedali, D. K. Wilson, Z. Chang, R. E. Kellems, F. A. Quiocho and F. B. Rudolph, Bchem, in press (1996).
10. J. R. S. Arch and E. A. Newsholme, Essays in Biochem. 14, 82 (1978).
11. G. L. Stiles, JBC 267, 6451 (1992).
12. M. S. Hershfield and B. S. Mitchell, in “The Metabolic and Molecular Basis of Inherited Disease” (C. R. Scriver, A. L. Beaudet, W. S. Sly and D. Valle, eds.), Vol. 1, p. 1725. McGraw-Hill, New York, 1995.
13. B. Ullman, L. J. Gudas, A. Cohen and D. W. Martin, Jr., Cell 14, 365 (1978).
14. D. A. Carson, J. Kaye and D. B. Wasson, J. Immunol. 124, 8 (1980).
15. J. M. Chinsky, V. Ramamurthy, W. C. Fanslow, D. E. Ingolia, M. R. Blackburn, K. T. Shaffer, H. R. Highley, J. J. Trentin, F. B. Rudolph, T. B. Knudsen and R. E. Kellems, Differentiation 42, 172 (1990).
16. M. K. Sim and M. H. Maguire, Biol. Reprod. 2, 291 (1970).
17. P. C. Lee, Dev. Biol. 31, 227 (1973).
18. T. B. Knudsen, J. D. Green, M. J. Airhart, H. R. Highley, J. M. Chinskey and R. E. Kellems, Biol. Reprod. 39, 937 (1988).
19. T. B. Knudsen, M. R. Blackburn, J. M. Chinsky, M. J. Airhart and R. E. Kellems, Biol. Reprod. 44, 171 (1991).
20. D. P. Witte, D. A. Wiginton, J. J. Hutton and B. J. Aronow, J. Cell Biol. 115, 179 (1991).
21. M. R. Blackburn, X. Gao, M. J. Airhart, R. G. Skalko, L. F. Thompson and T. B. Knudsen, Dev. Dyn. 194, 155 (1992).
22. L. Hong, J. Mulholland, J. M. Chinsky, T. B. Knudsen, R. E. Kellems and S. T. Glasser, Biol. Reprod. 44, 83 (1991).
23. J. H. Winston, C. R. Hanten, P. A. Overbeek and R. E. Kellems, JBC 267, 13472 (1992).
24. M. J. Airhart, M. A. Roberts, T. B. Knudsen and R. G. Skalko, Brain Res. Bull. 25, 299 (1990).
25. J. D. Geiger and J. I. Nagy, J. Neurochem. 48, 147 (1987).
26. J. M. Chinskey, V. Ramamurthy, T. B. Knudsen, H. R. Higley, W. C. Fanslow, J. J. Trentin and R. E. Kellems, in “Gene Transfer and Gene Therapy,” p. 255. Alan R. Liss, New York, 1989.
27. B. Aronow, D. Lattier, R. Silbiger, M. Dusing, J. Hutton, G. Jones, J. Stock, J. McNeish, S. Potter, D. Witte and D. Wiginton, Genes Dev. 3, 1384 (1989).
28. B. E. Chechik, W. P. Schrader and J. Minowada, J. Immunol. 126, 1003 (1983).
29. T. Dooley, L. D. Fairbanks, H. A. Simmonds, C. H. Rodeck, K. H. Nicolaides, P. W. Soothill, P. Stewart, G. Morgan and R. J. Levinsky, Prenatal Diagn. 7, 561 (1987).
30. K. A. Mohamedali, O. M. Guicherit, R. E. Kellems and F. B. Rudolph, JBC 268, 23728 (1993).
31. D. Valerio, M. G. C. Duyvesteyn, B. M. M. Dekkler, G. Weeda, T. M. Berkvens, L. van der Voorn, H. van Ormondt and A. J. van der Eb, EMBO J. 4, 437 (1985).
32. D. A. Wiginton, D. J. Kaplan, J. C. States, A. L. Akeson, C. M. Perme, I. J. Bilyk, A. J. Vaughn, D. L. Lattier and J. J. Hutton, BJ 25, 8234 (1986).
33. T. Mohandas, R. S. Sparkes, E. J. Suh and M. S. Hershfield, Hum. Genet. 66, 292 (1984).
34. C. M. Abbott, E. P. Evans, M. Burtenshaw, S. T. Ball, C. J. Skidmore, J. Jones and J. Peters, Biochem. Genet. 29, 537 (1991).
35. M. R. Al-Ubaidi, V. Ramamurthy, M. C. Maa, D. E. Ingolia, J. M. Chinsky, B. D. Martin and R. E. Kellems, Genomics 7, 476 (1990).
36. D. E. Ingolia, M. R. Al-Ubaidi, C. Y. Yeung, H. A. Bigo, D. A. Wright and R. E. Kellems, MCBiol 6, 4458 (1986).
37. J. W. Innis, D. J. Moore, S. F. Kash, V. Ramamurthy, M. Sawadogo and R. E. Kellems, JBC 266, 21765 (1991).
38. M. R. Dusing and D. A. Wiginton, NARes 22, 669 (1994).
39. D. Valerio, H. van der Putten, F. M. Botteri and P. M. Hoogerbrugge, NARes 16, 10083 (1988).
40. S. Rauth, K. G. Yang, A. M. Seibold, D. E. Ingolia, S. R. Ross and C. Y. Yeung, Somatic Cell Mol. Genet. 16, 129 (1990).
41. Deleted in proof.
42. D. J. Steger, M. Buscher, J. H. Hecht and P. L. Mellon, Mol. Endocrinol. 7, 1579 (1993).
43. D. M. Dorfman, D. B. Wilson, G. A. P. Bruns and S. H. Orkin, JBC 267, 1279 (1992).
44. I. C. Ho, P. Vorhees, N. Marin, B. K. Oakley, S. F. Tsia, S. H. Orkin and J. M. Leiden, EMBO J. 10, 1187 (1991).
45. J. P. Hoeffler, T. E. Meyer, Y. Yum, J. L. Jameson and J. F. Habener, Science 242, 1430 (1988).
46. H. Weintraub, Cell 75, 1241 (1993).
47. B. J. Aronow, R. N. Silbiger, M. R. Dusing, J. L. Stock, K. L. Yager, S. S. Potter, J. J. Hutton and D. A. Wiginton, MCBiol 12, 4170 (1992).
47a. K. C. Ess, T. L. Whitaker, G. J. Cost, D. P. Witte, J. J. Hutton and B. J. Aronow, MCBiol 15, 5707 (1995).
48. B. J. Aronow, C. A. Ebert, M. T. Valerius, S. S. Potter, D. A. Wiginton, D. P. Witte and J. J. Hutton, MCBiol 15, 1123 (1995).
49. D. L. Lattier, J. C. States, J. J. Hutton and D. A. Wiginton, NARes 17, 1061 (1989).
50. J. M. Chinskey, M.-C. Maa, V. Ramamurthy and R. E. Kellems, JBC 264, 14561 (1989).
51. Z. Chen, M. L. Harless, D. A. Wright and R. E. Kellems, MCBiol 10, 4555 (1990).
52. Z. Chen, D. Wright and R. E. Kellems, MCBiol 11, 6238 (1991).
53. M.-C. Maa, J. M. Chinskey, V. Ramamurthy, B. D. Martin and R. E. Kellems, JBC 265, 12513 (1990).
54. V. Ramamurthy, J. M. Chinskey, M.-C. Maa, M. L. Harless, D. A. Wright and R. E. Kellems, MCBiol 10, 1484 (1990).
55. J. W. Innis and R. E. Kellems, MCBiol 11, 5398 (1991).
56. S. F. Kash, J. W. Innis, A. U. Jackson and R. E. Kellems, MCBiol 13, 2718 (1993).
57. S. F. Kash and R. E. Kellems, MCBiol 14, 6198 (1994).
58. T. O'Brien and J. T. Lis, MCBiol 11, 5285 (1991).
59. J. Lis and C. Wu, Cell 74, 1 (1993).
60. M. Wakamiya, M. R. Blackburn, R. Jurecic, M. J. McArthur, R. S. Geske, J. Cartwright, K. Mitani, S. Vaishnav, J. W. Belmont, R. E. Kellems, M. J. Finegold, C. A. Montgomery, A. Bradley and C. T. Caskey, PNAS 92, 3673 (1995).
61. A. A. J. Migchielsen, M. L. Breuer, M. A. van Roon, H. te Riele, C. Zurcher, F. Ossendorp, S. Toutain, M. S. Hershfield, A. Berns and D. Valerio, Nat. Genet. 10, 279 (1995).
62. H. Kizaki, K. Suzuki, T. Tadakuma and Y. Ishimura, JBC 265, 5280 (1990).
63. D. A. Carson, D. A. J. Kaye and J. E. Seegmiller, PNAS 74, 5677 (1977).
64. B. Ullman, B. B. Levinson, M. S. Hershfield and D. W. Martin, Jr., JBC 255, 848 (1989).
65. A. Cohen, R. Hirschhorn, S. D. Horowitz, A. Rubinstein, S. H. Polmar, R. Hong and D. W. Martin, Jr., PNAS 75, 472 (1978).
66. M. S. Hershfield, JBC 254, 22 (1979).
67. J. M. Johnston and N. M. Kredich, J. Immunol. 123, 97 (1979).
68. J. D. Finkelstein and B. Harris, ABB 159, 160 (1973).
69. M. R. Blackburn, M. Wakamiya, C. T. Caskey and R. E. Kellems, JBC 270, 23891 (1995).
70. N. F. Puffinbarger, K. R. Hansen, R. Resta, A. B. Laurent, T. B. Knudsen, J. C. Madam and L. F. Thompson, Mol. Pharmacol. 47, 1126 (1995).
71. A. O. Welsh and A. C. Enders, Am. J. Anat. 172, 1 (1985).
72. S. Katz and P. A. Abrahamsohn, Anat. Embryol. 176, 251 (1987).
73. E. Y. H. P. Lee, C. Y. Chang, N. Hu, Y. Chun, J. Wang, C. C. Lai, K. Herrup, W. H. Lee and A. Bradley, Nature 359, 288 (1992).
74. H. Gu, J. D. Marth, P. C. Orban, H. Mossmann and K. Rajewsky, Science 265, 103 (1994).
75. F. Guillemot, A. Nagy, A. Auerbach, J. Rossant and A. L. Joyner, Nature 371, 333 (1994).
76. C. C. Gurtner, V. Davis, H. Li, M. J. McCoy, A. Sharpe and M. I. Cybulsky, Genes Dev. 9, 1 (1995).
77. Y. Uehara, O. Minowa, C. Mori, K. Shiota, J. Kuno, T. Noda and N. Kitamura, Nature 373, 702 (1995).
78. M. R. Blackburn, S. K. Datta, M. Wakamiya, B. S. Vartabedian and R. E. Kellems, JBC, in press (1996).
79. I. Santisteban, F. X. Arredondo-Vega, S. Kelly, A. Mary, A. Fisher, D. S. Hummell, A. Lawton, R. U. Sorensen, E. R. Stiehm, L. Uribe, K. Weinberg and M. S. Hershfield, J. Clin. Invest. 92, 2291 (1993).
80. A. Tedde, M. E. Balis, S. Ikehara, R. Pahwa, R. A. Good and P. P. Trotta, PNAS 77, 4890 (1980).
81. M. S. Coleman, J. Donofrio, J. J. Hutton, L. Hahn, A. Daoud, B. Lampkin and J. Dyminski, JBC 253, 1619 (1978).
82. H. Ratech, G. J. Thorbecke and R. Hirschhorn, Clin. Immunol. Immunopathol. 21, 119 (1981).
83. T. B. Knudsen, M. K. Gray, M. R. Blackburn, M. J. Airhart, R. E. Kellems and R. G. Skalko, Teratology 40, 615 (1989).
84. M. J. Airhart, C. M. Robbins, T. B. Knudsen, J. K. Church and R. G. Skalko, Teratology 47, 17 (1993).
85. T. B. Knudsen, R. S. Winters, S. K. Otey, M. R. Blackburn, M. J. Airhart, J. K. Church and R. G. Skalko, Teratology 45, 91 (1992).
86. X. Gao, M. R. Blackburn and T. B. Knudsen, Teratology 49, 1 (1993).
87. I. Roitt, J. Brostoff and D. Male, in “Immunology,” p. 3.7. Gower Medical Publishing, London, 1985.
S1-Nuclease-sensitive DNA Structures Contribute to Transcriptional Regulation of the Human PDGF A-chain
ZHAO-YI WANG AND THOMAS F. DEUEL¹
Departments of Medicine and Biochemistry and Molecular Biophysics, Jewish Hospital at Washington University School of Medicine, St. Louis, Missouri 63110
F. Promoter R…
B. An S1-sensitive Site-directed Single-strand DNA Oligomer Suppresses Transcription
C. Use of the S1 Sensitivity … A-Chain Gene
D. Identification of a Novel Sin… on S1 Sensitivity
E. Cell-type-specific Regulato… PDGF A-Chain Gene
III. Summary and Perspective
References
¹ To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 55. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
Genes active in transcription characteristically are hypersensitive to digestion with DNase I (1-6). These hypersensitive sites appear as gaps in the nucleosomal arrangements of chromatin fibers; not only are they accessible to enzymatic cleavage but also are sensitive to reagents that cleave double-stranded DNA chemically (7-13). A number of investigators have refined the analysis of DNase I-hypersensitive sites and the basis of their formation (14). Two themes have consistently emerged that are relevant to formation of DNase I-hypersensitive sites and their importance: DNA structure is flexible and constantly subject to change as the result of micro- and macroconformational stress; and DNase I-hypersensitive sites often are at or near transcriptional regulatory regions. Because DNase I sensitivity and S1 nuclease sensitivity have been shown to correlate in regions active in transcriptional regulation, we mapped sites sensitive to S1 nuclease to determine whether an S1 nuclease sensitivity assay could be used to identify potential regulatory regions. In this review, we develop the background and rationale that led to this approach, the rationale for selecting the PDGF A-chain gene for analysis, and the use of S1 nuclease sensitivity to identify seven regions in the PDGF A-chain gene that are recognized by DNA-binding proteins and appear to be important in its regulation. We also outline a novel strategy to establish the importance of S1-nuclease-sensitive sites and suggest that mapping of S1-sensitive sites is a useful tool to identify potential regulatory regions within active genes.
I. Background
Eukaryotic gene expression is mainly regulated at the level of transcription initiation. Transcription requires the ordered positioning of the transcription initiation complex and associated proteins for the productive binding of RNA polymerase II at the RNA initiation site(s). Eukaryotic genes transcribed with RNA polymerase II contain short cis-acting regulatory elements that are recognized by trans-acting protein factors that influence transcription initiation and regulate the efficiency of transcription. Often two or more factors recognize the same site. Homo- and heterodimerization of related factors are of major importance in determining the regulatory signal. Increasing recognition of DNA elements and classes of trans-activating factors has suggested general schemes to begin to account for both the extraordinary diversity and the specificity of gene expression that characterize important biological responses such as development. Three regions of DNA, the promoter, enhancer, and silencer regions, interact to regulate messenger RNA synthesis in higher eukaryotes (14-18). Promoters are located immediately upstream from the start site of transcription and typically contain about 100-300 bp of DNA (14, 15). Promoters are required for accurate and efficient initiation of transcription. Enhancers increase and silencers decrease the rate of transcription and are defined as cis-acting regulatory elements that function independently of distance, position, and orientation (19). The DNA components of promoters and enhancers are often interchangeable and appear to use many of the same basic mechanisms to modulate transcription (14).
A. Conformational Heterogeneity of DNA
Early models of the interplay among trans-acting regulatory factors and the transcription initiation complexes relied on protein-protein interactions among transcription factors. The role of the DNA within the regulatory region was assigned to that of a semi-inert scaffold that correctly positioned regulatory proteins needed for efficient transcription. In part, this model arose from the modular property of the DNA elements associated with eukaryotic RNA polymerase II promoters. These elements consisted of relatively short consensus sequences that determined the regulatory functions of the promoter and that could be interchanged without seriously influencing their function (20, 21). More recently, DNA has been shown to be a dynamic structure that can adopt different conformations that are in equilibrium with each other (22, 23). Different conformations of DNA are dictated by their nucleotide sequence and by a number of other factors as well. Unusual DNA conformations that differ from the B-form DNA were found and were often detected because of their sensitivity to DNase I or S1 nucleases in vitro (22, 23), leading to the concept that microheterogeneity of DNA structure is an important component of gene regulation (24). Among the unusual DNA structures identified were “left-handed” Z-DNA, which occurs principally in stretches of alternating C and G residues, cruciform DNA, which occurs at inverted repeat sequences, and triplex DNA and the “slipped” DNA configurations, both of which are found at oligopurine-oligopyrimidine stretches. Experimental evidence now supports the existence of non-B-form DNA in living cells (25, 26). Key roles of DNA conformational heterogeneity have been proposed in the regulation of essential processes such as recombination, transcription, and replication (27). Sites of unequal sister chromatid exchange have the potential to form Z-DNA tracts on each of the donor chromosomes (28). 
S1-nuclease-hypersensitive sites appear to be important in chromosomal translocation (29, 30). These and many other reports of heterogeneous DNA structures and their apparent functional importance have suggested models that rely on “conformational information” within the secondary structure of DNA to explain regulation of promoter function. This “conformational information” within the DNA structure is considered to be a basic component of the regulatory processes that control the rate of transcription (31). An important implication of conformationally heterogeneous DNA and local perturbations that arise in DNA structure within promoter regions is that the affinity of DNA sequence-specific regulatory proteins for these elements may range widely and that different conformations of the same DNA elements may differentially recognize transcription factors targeted to similar or identical DNA sequences.
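The sequence contexts named in this section (alternating C and G residues, inverted repeats, and oligopurine-oligopyrimidine stretches) can be located in a plain sequence string before any nuclease mapping is attempted. The following Python sketch is illustrative only: the function names, the length thresholds (`min_len`, `stem_len`, `max_loop`), and the demonstration sequence are assumptions introduced here, not parameters from the work reviewed, and a sequence match indicates only the potential to adopt a non-B conformation, not that the structure forms.

```python
import re

# Watson-Crick complement table for an upper-case DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def alternating_cg_runs(seq, min_len=8):
    """Runs of strictly alternating C and G (the context linked to Z-DNA).

    min_len is an arbitrary illustrative threshold.
    """
    pattern = re.compile(r"(?:CG)+C?|(?:GC)+G?")
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(seq) if len(m.group()) >= min_len]


def purine_pyrimidine_tracts(seq, min_len=10):
    """All-purine or all-pyrimidine runs on the given strand.

    A purine run on one strand pairs with a pyrimidine run on the other,
    so scanning one strand covers oligopurine-oligopyrimidine stretches.
    """
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq)]


def inverted_repeats(seq, stem_len=6, max_loop=6):
    """Inverted repeats (candidate cruciform sites).

    Reports (start, stem_len, loop_len) wherever a stem of stem_len bases is
    followed, after a loop of 0..max_loop bases, by its reverse complement.
    """
    hits = []
    for i in range(len(seq) - 2 * stem_len + 1):
        left_rc = reverse_complement(seq[i:i + stem_len])
        for loop in range(max_loop + 1):
            j = i + stem_len + loop
            if j + stem_len > len(seq):
                break
            if seq[j:j + stem_len] == left_rc:
                hits.append((i, stem_len, loop))
    return hits


if __name__ == "__main__":
    demo = "TTGACTGCAAAAGCAGTCTT"  # hypothetical sequence with one hairpin-forming repeat
    print(inverted_repeats(demo))
```

Each function returns coordinates rather than a yes/no answer so that candidate tracts could be compared with experimentally mapped nuclease-sensitive positions.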
B. Nuclease-sensitive Sites
Nuclease-sensitive sites were first identified when investigators examined the conformation of DNA within transcriptionally “active” chromatin (1, 2). Subsequently, sites hypersensitive to cleavage by DNase I in SV-40 and Drosophila chromatin were characterized in considerable detail (8-10). The more accessible and thus “hypersensitive” regions of DNA to DNase I digestion were often localized in regulatory sequences in less compacted chromatin domains of active or potentially active genes. Their importance in vivo was suggested when they were found in isolated nuclei. The insights from these and many other experiments suggested that local unfolding of eukaryotic chromatin, with dissociation of its structural proteins to allow entry of the transcription machinery to promoters, is an important initial step in transcriptional control (32-36). Nuclease-sensitive sites are devoid of classical nucleosome cores (37). Transcriptionally active nuclear chromatin is nucleosome free and sensitive to different nucleases that recognize single-stranded DNA. The nucleosome-free regions allow access of trans-acting factors to cis-acting DNA sequences (11, 31, 38-44). These considerations and evidence of unusual DNA conformations within the regulatory regions of important growth-related genes such as the human c-myc (45), epidermal-growth-factor (EGF) receptor (46), PDGF A-chain (47), and mouse c-Ki-ras (48) genes, suggest that cis-acting regulatory elements may be identified by experimental strategies that identify these unusual structures. The operational definition of open chromatin as “DNase I-hypersensitive” has been employed for more than a decade (49, 50). It is now recognized that DNA sequences proximal to the promoter region of active genes are often sensitive to DNase I and chemical reagents that recognize single-stranded DNA (38). 
It is also recognized that rapid changes in chromatin structure often accompany gene activation events that occur in response to defined stimuli, indicating the intrinsic plasticity of local chromatin organization. “Inducible” DNase I-sensitive sites may be found in the absence of transcription and may persist following transcription. In contrast, “constitutive sites” are found in genes transcribed at low basal levels, such as housekeeping genes and the heat-shock genes. Constitutive DNase I-hypersensitive sites are independent of gene expression and are found in promoter regions of genes in which transcriptional induction precedes transcriptional activation. Constitutive sites need to be maintained or reestablished after replication and mitosis with each cell cycle (38). In the nomenclature of Walrath et al. (50), inducible promoters require “remodeling” whereas constitutively hypersensitive ones are “preset.” This classification implies that the mechanisms leading to DNase I hypersensitivity during interphase, on rapid induction of gene activity or after replication and mitosis, are different (51). Nuclease-sensitive sites on the same gene also may be tissue specific and may be differentially expressed during development (38). For example, sites within the globin genes are selectively sensitive to S1 nuclease at different stages of development (34), implying a remarkable degree of regulation of DNase I-hypersensitive sites. This diversity of regulation also is illustrated with the chicken tissue-specific lysozyme gene (52, 53) that is DNase I sensitive in response to hormones.
C. Mechanisms Leading to Hypersensitive Sites
The mechanisms that lead to the formation of DNase I and other nuclease-hypersensitive sites are poorly understood. DNA methylation, looping, sequence, alternative conformations, torsional stresses, and the interactions of DNA sequences with trans-acting factors all may contribute to formation of hypersensitive sites. In each instance, these sites are discontinuities in the nucleosomes aligned on chromatin fibers (38, 54). Recently, the importance of superhelicity has been emphasized. DNA is constrained in vivo into topological domains within which molecular stresses are regulated by imposed superhelicity. Superhelical stresses destabilize duplex DNA (55), and modulation of DNA superhelicity affects a number of essential functions, such as initiation of transcription, replication, recombination, and repair. The importance of supercoiling of DNA has been established in experiments in which S1 nuclease was used to detect altered DNA conformations in specific regions of active chick globin chromatin (31). The 5’-flanking regions of expressing but not nonexpressing globin genes contain S1-nuclease-hypersensitive sites. They were found in inserts within supercoiled plasmids but not in inserts in relaxed or linearized plasmids; their formation was ascribed to the free energy of superhelicity (56). Superhelicity is controlled in vivo by enzymatic and by other means. In prokaryotes, superhelicity is altered in response to environmental stresses, and changes in superhelicity appear to mediate specific cellular responses, including ion fluxes and initiation of transcription of specific genes. Unwinding of duplex DNA is required for initiation of transcription and of replication. The sites that undergo destabilization depend on the DNA sequences. Transition to strand separation influences the transition of DNA at other sites as well, in an apparent cooperative fashion. The ability of DNA to bend also is essential for appropriate recognition of DNA elements by their cognate binding proteins. Nucleosomes are often positioned specifically along DNA sequences that are adjacent to or in between DNase I-hypersensitive sites. Whereas DNA bendability and the sequences localized close to the dyad axis of the nucleosome influence nucleosome positioning, the basic relationship of the nucleosome “frames” to hypersensitive site formation is unclear. An intriguing twin-supercoiled domain model of transcription has recently been proposed to account for the dependency of superhelicity (57-60) and suggests striking conformational changes locally in DNA. It was suggested that when resistance to rotational motion of the transcription complex is large, the advancing RNA polymerase II generates positive supercoils ahead and negative supercoils behind within the DNA template. The positively and negatively supercoiled regions are prevented from mutual annihilation by protein complexes that bind to both of these regions. When a regulatory protein binds to cognate DNA sequence-specific recognition sites, the DNA loop generated by tracking constitutes an independent domain that is positively supercoiled when the enhancer is downstream of the polymerase and negatively supercoiled if upstream of the advancing polymerase. Supercoils of either sense accumulate, dependent on the relative rates of formation and the processes that remove them. This twin-transcriptional loop model suggests that transcription of one gene may regulate transcription of a second gene located in cis through template supercoiling. Importantly, the degree (and sign) of supercoiling also may alter recognition of regulatory elements by their cognate protein factors. Two DNA sequence motifs have been identified that are associated with altered DNA conformation and that function in protein recognition and binding. Oligo(dA)·oligo(dT) tracts cause DNA bending. 
Such tracts have been identified upstream of promoters, within the promoter domains themselves, within protein recognition sites, and in origins of replication (61-64). Methylation and ethylation interference experiments with the SV-40 T-antigen show that the sequences flanking the oligo(dA)·oligo(dT) bind the T-antigen, whereas the (dA)·(dT) motif itself does not. Oligo(dG)·(dC) motifs have been detected upstream of the 5' regions of a number of genes (65, 66). Oligo(dG) upstream sequences of the chicken adult β-globin gene alter DNA conformations with different states of supercoiling, as identified by hypersensitivity to S1 nuclease and increased reactivity with bromoacetaldehyde (65). Short oligo(dG)·oligo(dC) stretches, including the GC boxes (GGGCGG) that are binding sites for Sp1, are often present upstream of transcription initiation sites. (G + C)-rich “curved” duplexes have been described and G·C base pairs are found with comparable, if not greater, frequency than T residues at bends in oligomeric X-ray structures.
Furthermore, G·C base pairs appear to be major contributors to the perturbations found in computer simulations of DNA bending. In mutagenesis analyses to identify the critical bases involved in bends of SV-40 DNA, interruption of a short run of Gs that lack neighboring As resulted in a more rapid migration of the DNA fragment on polyacrylamide gel, presumably due to “straightening” (67). Psoralen, which preferentially photoreacts with 5' TA sequences, was also used to identify a psoralen-hypersensitive region in the SV-40 promoter region. The region identified extends from 150 bp on the late promoter side of the enhancers to the early promoter boundary, suggesting that this region contains a sequence-directed structural alteration.
Discoveries of activator proteins that distort DNA but lack obvious activation domains have also focused attention on the role of DNA structure in transcriptional regulation (68). For example, the transcription factor MerR mediates repression versus activation through stereospecific modulation of DNA structure. The repressor form of MerR binds between the -10 and -35 promoter elements of the bacterial mercury-detoxification gene, PT, allowing RNA polymerase to form an inactive complex with PT and MerR (69, 70), whereas when mercuric ion binds, Hg-MerR converts the polymerase complex into the transcriptionally active or “open” form (69-71). MerR bends the DNA toward itself at two loci defined by DNase I sensitivity, whereas the activator conformation, Hg-MerR, relaxes these bends. Thus, activator-induced unbending coupled to untwisting of the operator (72) conformationally alters the promoter favorably as a template.
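The superhelical stress discussed in this section is conventionally quantified through the linking number of a closed DNA domain. The relations below are the standard textbook definitions and are not taken from the chapter itself; the symbols $Lk$, $Tw$, $Wr$, $Lk_0$, $N$, $h$, and $\sigma$ are introduced here only for illustration, and the helical repeat of about 10.5 bp per turn is the usual approximation for relaxed B-form DNA.

```latex
\begin{align}
Lk &= Tw + Wr, \\
Lk_0 &= \frac{N}{h}, \qquad h \approx 10.5\ \text{bp per turn}, \\
\sigma &= \frac{Lk - Lk_0}{Lk_0}.
\end{align}
```

Here $N$ is the length of the domain in base pairs, $Lk_0$ is the linking number of the relaxed domain, and $\sigma$ is the superhelical density. Negative supercoiling corresponds to $\sigma < 0$. In the language of the twin-domain model described above, the region ahead of an advancing polymerase would acquire a local excess of linking number (positive supercoils) and the region behind it a local deficit (negative supercoils).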
D. Rationale for S1 Sensitivity Analysis
These examples and many others indicate the remarkable ability of DNA to function as a “dynamic” structure and that nonstatic, nontypical DNA conformations are highly and tightly regulated. These properties of DNA suggested that probes to detect structurally different forms of DNA within a given promoter may recognize regions of DNA that are actively or potentially regulatory and may be used to identify regulatory regions in genes, thereby providing evidence for the hypothesis that non-B-form DNA conformations are important and perhaps an essential feature of the regulation of gene transcription. Because each of the altered DNA structures that have been studied shares the property of partial single-stranded character, we tested sensitivity to S1 nucleases in vivo as a “conformational probe.” The PDGF A-chain gene was used for analysis because its expression is regulated mainly at the level of transcription and because expression of the PDGF A-chain is important in a number of normal and pathological situations. The choice of the PDGF A-chain gene was reinforced because its promoter is highly (G + C) rich. Genes of other regulatory proteins contain sites sensitive to S1 in highly (G
+ C)-rich regions, including the chicken β-globin gene promoter (65) and the c-Ki-ras (73) and the epidermal-growth-factor receptor (EGF-R) genes (46).
E. The PDGF A-Chain Gene
The platelet-derived growth factor (PDGF) is of central importance in cell growth. PDGF is an important mitogen for cells of mesenchymal origin, such as fibroblasts, arterial smooth muscle cells, and glial cells (74). PDGF is a cationic, ~30-kDa protein that binds to specific receptors on the surface of responsive cells and elicits complex signaling pathways that lead to DNA synthesis and mitosis (74). PDGF is also a potent chemoattractant and survival factor. PDGF mediates apoptosis in growth-arrested normal rat kidney cells (75).
Structurally, PDGF is a heterodimer of two (A and B) polypeptide chains linked by disulfide bonds. It is found almost exclusively in platelets and megakaryocytes. However, the subunit chains also associate as homodimers (AA, BB) in different tissues and these homodimeric isoforms are as active biologically as the PDGF heterodimers (76-78). The B chain of PDGF is the product of the protooncogene c-sis (79) that is located on chromosome 22 (80). The homologous PDGF A-chain gene is found on chromosome 7 (81). Different cell types respond differentially to the three isoforms of PDGF, depending on the dimeric composition of the PDGF receptor.
PDGF plays a central role in many physiological and pathological processes in vivo, such as normal cell proliferation, differentiation, wound healing, tissue remodeling, fibrosis, atherosclerosis, and neoplastic cell growth. Expression of PDGF and its receptor also is subject to strict regulation during development and in maturity (82-84), in maternal transcripts (85) and unfertilized oocytes and blastocysts (86). PDGF A-chain but not PDGF B-chain mRNA has been detected in neurons of the peripheral and central nervous systems in later embryonic and adult mice (83).
PDGF A-chain transcripts also have been observed in various transformed cell lines and PDGF AA is the major secreted product from a number of transformed cells as well (74, 87). Taken together, these results suggest that expression of both the PDGF A- and B-chains is exquisitely regulated during development and in response to a number of physiological and pathological stimuli (87-89). The results thus indicate the importance of identifying the regulatory sites and the proteins that act in trans to regulate transcription.
F. Promoter Region of the PDGF A-Chain Gene
We isolated, sequenced, and characterized the genomic DNA of the PDGF A-chain gene (90). The transcription initiation site is 845 nucleotides
upstream of the translation start site. The sequence surrounding the TATA box is sufficient to promote gene transcription both in vivo and in vitro (90). The sequence surrounding the TATA box has a (G + C) content of ~91%. Seven hexanucleotide repeats (CCGCCG or GGGCGG) that correspond to consensus binding sites for the transcription factor Sp1 (91) were found at positions -906, -552, -416, -411, -72, -66, and -60. A consensus sequence (CGCCCCCG) corresponding to the binding site of NGFI-A (nerve-growth-factor-inducible gene A) (92) [also known as EGR-1, Zif 268, and Krox 24 (93-95)] was found at position -550 and two overlapping sequences were observed at -70 and -64. A consensus sequence for the binding of AP-2 was identified at position -570 (90).
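The kind of sequence survey described above, (G + C) content plus a scan for the hexanucleotide repeats named in the text, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' procedure: the function names are hypothetical, the `example` fragment is invented for the demonstration and is not PDGF A-chain promoter sequence, and the offsets returned are 0-based positions within the input string rather than promoter coordinates.

```python
import re

def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)

def find_motifs(seq: str, motifs: tuple) -> list:
    """Return sorted (0-based offset, motif) pairs, including overlapping hits."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        # A zero-width lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), motif))
    return sorted(hits)

# The two hexanucleotides quoted in the text.
SP1_LIKE = ("GGGCGG", "CCGCCG")

# Hypothetical (G + C)-rich fragment, for illustration only.
example = "GGGGCGGGGGCGGGGGAGGCCGCCGTT"

print(round(gc_content(example), 3))
print(find_motifs(example, SP1_LIKE))
```

Converting an offset into a promoter coordinate would additionally require the coordinate of the first base of the fragment, which is why the sketch stops at string offsets.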
G. S1 Hypersensitivity Mapping Assay
To identify possible regulatory sites, we sought regions of single-stranded character with S1 nuclease. The S1 hypersensitivity assay is shown schematically in Fig. 1. Supercoiled DNA containing the putative regulatory regions was isolated from bacteria and treated with 0.5 units of S1 nuclease. After phenol extraction and ethanol precipitation, the DNA was digested with restriction enzymes recognizing the polylinker sites and end labeled either with T4 polynucleotide kinase and [γ-32P]ATP or with avian myeloblastosis virus (AMV) reverse transcriptase and [α-32P]dNTP. After release of inserts with a second restriction enzyme, the end-labeled DNA fragments were separated and analyzed on a DNA sequencing gel with a chemically sequenced DNA ladder as marker.
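The logic of reading nick positions from such a gel can be sketched as follows. Each labeled band runs from the labeled restriction end to an S1 nick, so its length in nucleotides locates the nick relative to the label. The sketch below assumes a simple linear coordinate system with no gap at zero, and the function name, the label coordinate, and the band lengths are all hypothetical values chosen for illustration; in practice the band lengths are read against the chemical sequencing ladder.

```python
def nick_coordinates(label_coord, fragment_lengths, direction=1):
    """Map labeled-fragment lengths (nt) to the last nucleotide before each nick.

    label_coord: coordinate of the labeled terminal nucleotide (hypothetical
        linear coordinates, no gap at zero).
    fragment_lengths: lengths of the labeled bands read from the gel.
    direction: +1 if the labeled strand extends toward larger coordinates,
        -1 if it extends toward smaller coordinates.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    # A fragment of n nucleotides starting at label_coord ends n - 1 steps away.
    return [label_coord + direction * (n - 1) for n in sorted(fragment_lengths)]

# Hypothetical band lengths for a label placed at coordinate 100.
bands = [46, 40, 41]
print(nick_coordinates(100, bands))
print(nick_coordinates(100, bands, direction=-1))
```

Promoter numbering that skips zero (from -1 directly to +1) would need one extra adjustment when a fragment crosses the transcription start site; that step is left out here.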
II. S1-sensitive Sites in PDGF A-Chain Gene
A. Identification of SHSI and SHSII
The first region tested for S1 nuclease sensitivity was the (G + C)-rich region within the proximal PDGF A-chain gene promoter. The supercoiled plasmid PBS-153 (containing DNA sequence -153 to +387, relative to the start site of transcription of the PDGF A-chain) was digested with S1 nuclease and then with BamHI, and the S1-nicked DNA was end labeled and separated on a sequencing gel. Two S1-nuclease-hypersensitive sites were mapped between base pairs 90 to 96 and 60 to 69 upstream of the transcription initiation site (Fig. 2). The S1 nicking was predominantly found on the C-rich strand and at both ends of the 13-bp dG·dC motif at location -70 to -82. This S1 nicking is supercoil- and pH-dependent. The upstream S1-hypersensitive site (-90 to -96) has been designated SHSI and the downstream site (-60 to -69) SHSII. SHSI contains sequences similar to the core consensus sequence recognized by the transcription factor H2TF1, a
[Figure 1: schematic of the assay. Labels recoverable from the original: S1 nuclease; cut at site A; label; cut at site B; chemical sequencing ladders; denaturing gel.]
FIG. 1. Scheme for the S1 nuclease sensitivity assay.
member of the NF-κB family (96). SHSII contains 5'-GGGCGG-3' sequences that are recognized by the transcription factor Sp1. To establish a role for (G + C)-rich regions in the transcriptional activity of the PDGF A-chain gene, 7 bp were deleted with S1. A second 34-bp deletion (-39 to -72) also was made. The 7-bp deletion in the (G + C) region resulted in a loss of ~63% of transcriptional activity; deletion of the additional 34 bp decreased activity about 84%, suggesting that optimal expression of the PDGF A-chain gene depends on the integrity of the (G + C)-rich region and S1-nuclease-sensitive sequence (SHSII) (66).
[Figure 2: panel A, autoradiogram of 5'-end-labeled and 3'-end-labeled DNA with G and C sequencing ladder lanes; panel B, promoter sequence with position markers from -127 to -27 and with SHSI and SHSII indicated.]
FIG. 2. Fine mapping of the S1-nuclease-sensitive sites in the (G + C)-rich region of the PDGF A-chain gene promoter. (A) Supercoiled PBS-153 was treated with 0.1 and 0.5 unit of S1 nuclease/µg of DNA. The DNA was purified, digested with BamHI, and end labeled either at the 5' end of the top strand or 3' end of the bottom strand. The end-labeled DNA was fractionated on a 6% sequencing gel along with the 3' end-labeled DNA chemical sequencing ladder as marker. (B) Sequence of the S1-sensitive sites in the (G + C)-rich region of PDGF A-chain gene promoter. The position of the TATA box is marked. The arrows denote the sites of S1 nicking using the coding strand (dG-rich) to represent sites on the noncoding strand (dC-rich). The length of the arrows denotes relative cleavage frequencies as measured by densitometer tracing of the autoradiograms.

B. An S1-sensitive Site-directed Single-strand DNA Oligomer Suppresses Transcription
To confirm the single-stranded character and the possible importance of this region in transcriptional regulation, we tested a 24-mer (5'-GGGGGCGGGGGCGGGGGCGGGGGA-3') complementary to the S1-nuclease-sensitive C-rich strand of the SHSII site described above to determine if it would anneal to this localized single-stranded region of DNA within the (G + C)-rich region in the PDGF A-chain promoter. The synthetic oligonucleotide was end labeled, incubated with the supercoiled pBS840 CAT plasmid at pH 4.5, and assayed for complex formation by electrophoresis in a 1%
agarose gel and autoradiography. Complex formation was established because the end-labeled synthetic dG 24-mer comigrated with the supercoiled plasmid and bound to it in nearly stoichiometric amounts. Furthermore, the annealed dG 24-mer protected the S1-sensitive site against attack by S1 nuclease (97). Neither the end-labeled synthetic dC-rich 24-mer (complementary to the dG-rich 24-mer) nor the double-stranded dG·dC fragment annealed to the same plasmid. Furthermore, the complex was not observed when the end-labeled dG 24-mer was incubated with the plasmid at pH 8.0. The binding specificity of the dG 24-mer for the S1-sensitive site SHSII established that the C-rich strand assumes a single-stranded character in the supercoiled plasmid.
The single-stranded character is not likely to be the linear intermolecular triplex structure that has been described with synthesized pyrimidine-rich oligonucleotides that bind in the major groove of homologous regions of double-stranded DNA (98). Such an intermolecular triplex structure is not likely to have formed at SHSII in the PDGF A-chain promoter because that structure predicts that both G-rich (dG 24) and C-rich (dC 24) oligomers will complex with nonsupercoiled plasmid equally.
To determine whether this hypersensitive region is functionally important, a supercoiled plasmid/reporter gene containing the PDGF A-chain promoter was incubated with increasing concentrations of the dG 24-mer under conditions favoring DNA-DNA complex formation. The plasmid was transfected into HeLa cells, and transcriptional activity of the reporter gene was assayed. The activity of the PDGF A-chain gene promoter was reduced to one-fourth when the supercoiled plasmid had been incubated with an equimolar concentration (60 nM) of the dG 24-mer. Under identical conditions, incubation of the plasmid with the C-rich oligomer had no measurable effect on promoter activity at concentrations as high as 160 nM.
It was necessary to establish the stability of annealing of the 24-mer to SHSII in vivo; over 95% of the 24-mer remained annealed to the plasmid after transfection and maintenance in cells for 72 hours (97). These data indicate that the dG 24-mer remained stably annealed specifically to the non-B-form GC box. Furthermore, the annealed dG 24-mer strongly suppressed promoter activity of the PDGF A-chain gene in vivo, indicating the importance of SHSII to the transcriptional activity of the PDGF A-chain gene.
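The strand relationships in this experiment can be checked with a short reverse-complement calculation. The sketch below takes the dG 24-mer sequence from the text, derives the dC-rich 24-mer as its reverse complement, and tests which oligomer has a fully complementary site on a C-rich strand. It models Watson-Crick complementarity only and says nothing about supercoiling, pH, or triplex formation; the function names and the flanking bases of `c_rich_strand` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def can_anneal(oligo: str, strand: str) -> bool:
    """True if strand (5'->3') contains a site fully complementary to oligo."""
    return reverse_complement(oligo) in strand.upper()

DG_24MER = "GGGGGCGGGGGCGGGGGCGGGGGA"   # sequence given in the text
DC_24MER = reverse_complement(DG_24MER)  # the complementary dC-rich 24-mer

# Hypothetical C-rich strand: the target site with arbitrary flanking bases.
c_rich_strand = "AATT" + DC_24MER + "GGAA"

print(DC_24MER)
print(can_anneal(DG_24MER, c_rich_strand))
print(can_anneal(DC_24MER, c_rich_strand))
```

Under these assumptions only the dG 24-mer finds a complementary site on the C-rich strand, which matches the strand specificity reported for the annealing experiment.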
C. Use of the S1 Sensitivity Assay in Other Regions of the PDGF A-Chain Gene
Supercoiled plasmid DNA containing the PDGF A-chain promoter (-36 to +388) was isolated from the host bacteria and treated with S1 nuclease using the protocol described above. A series of S1-sensitive sites that mapped precisely to the sequence 5'-TACTCCTCCTCCTCCTCT-3' (+50 to
+67), a homopurine/homopyrimidine motif, were observed downstream of the 5' cap site (see Table I). The sensitivity to S1 was not observed when the plasmid was linearized, suggesting that this homopurine/homopyrimidine motif is sensitive to S1 nuclease when the plasmid DNA is superhelical. This S1-nuclease-sensitive site corresponds to a similar S1-nuclease-sensitive “TCC” repeat motif found in the human epidermal growth factor (EGF) receptor promoter, which binds Sp1 and TCF (46). It was later shown to be a novel and functional binding site for WT1, the product of the Wilms' tumor suppressor gene (99). The “TCC” repeat motif is shared to varying degrees in promoter regions of certain growth-related genes that possess (G + C)-rich promoters, including the EGF-R (46), c-Ki-ras (73), TGF-β3 (100), and insulin-receptor (101) genes, and may be an important “conformationally” regulated region in these genes as well.
Another prominent series of S1-sensitive sites was observed within the junction region (-484 to -495) of two oligopurine-oligopyrimidine motifs. Additional but less intense sites of S1 cleavage were observed at the junction (-465 to -475) of a downstream oligopurine-oligopyrimidine motif, indicating that the junction region of the oligopurine-oligopyrimidine motif may undergo a structural transition under torsional stress in vitro to expose preferentially the noncoding strand of the DNA to S1 nuclease (see Table I). The upstream S1-sensitive region interacts with a nuclear factor and contains a positive regulatory element (102).
The sequence of the downstream S1-sensitive region (-465 to -475) was compared to sequences of known cis-acting regulatory elements and found to be similar to the CC(A/T)6GG or CArG motifs that occur in single or multiple copies in promoters of the c-fos, the interleukin receptor, and the sarcomeric actin and nonmuscle actin genes (103-105). Subsequently, the CArG motif of the PDGF A-chain gene was found to interact with the serum response factor (SRF) and to function as a serum response element (SRE) (106). This SRE identified by S1 sensitivity appears to represent an essential site contributing to the increase in transcription of the PDGF A-chain gene in cells treated with the PDGF A-chain homodimer (autoregulation) or with serum.

TABLE I
SUMMARY OF S1-SENSITIVE SITES IN THE PDGF A-CHAIN GENE

DNA sequence                                  Binding factor           Function
-477 CCTTTTATGG -468                          SRF, SSBF                Enhancer
-416 GCGGGGGCG -407                           GCF                      Silencer
-496 CCAAAGACTGA -486                         Unknown                  Enhancer
-96 GGAATCCGG -90                             H2TF1                    Enhancer
-71 GCGGGGGCGGG -60                           Sp1, NGFI-A, WT1         Enhancer and silencer
+50 TACTCCTCCTCCTCCTCT +67                    TCF, WT1, Sp1, NGFI-A    Enhancer and silencer
+1605 TCGGGGAGGGGGAGTGGGGCAGGCC +1630         Unknown                  Cell-type-specific silencer
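Two of the sequence features discussed in this section lend themselves to a quick computational check: the CArG consensus and the homopyrimidine character of the “TCC” repeat. The sketch below is illustrative only. The regular expression encodes the consensus CC(A/T)6GG, the -477 to -468 entry of Table I is read here as CCTTTTATGG (an assumption, since that entry is garbled in the extracted text), and the helper name `longest_run` is hypothetical.

```python
import re

# CArG box consensus: CC, six A or T residues, GG.
CARG = re.compile(r"CC[AT]{6}GG")

def longest_run(seq: str, alphabet: str) -> tuple:
    """Return (0-based start, length) of the longest stretch made only of alphabet."""
    best = (0, 0)
    for match in re.finditer(f"[{alphabet}]+", seq.upper()):
        length = match.end() - match.start()
        if length > best[1]:
            best = (match.start(), length)
    return best

SRE_SITE = "CCTTTTATGG"           # assumed reading of the Table I entry
TCC_SITE = "TACTCCTCCTCCTCCTCT"   # +50 to +67, as listed in Table I

print(bool(CARG.fullmatch(SRE_SITE)))
print(longest_run(TCC_SITE, "CT"))   # longest pyrimidine stretch
print(longest_run(TCC_SITE, "AG"))   # longest purine stretch
```

On the +50 to +67 sequence the pyrimidine stretch covers 16 of the 18 positions on this strand, interrupted only by the single A near the 5' end, which is consistent with the homopurine/homopyrimidine description in the text.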
D. Identification of a Novel Single-stranded SRE Binding Protein Based on S1 Sensitivity
Because the SRE region possesses characteristics of single-stranded DNA under supercoiling stress in vitro, we also sought a factor(s) that may recognize the altered DNA structure. Nuclear extracts of HeLa cells were fractionated by heparin-Sepharose chromatography, and individual fractions were assayed for DNA binding activities by gel-mobility shift. A factor tentatively identified as the SRF was observed to bind to the double-stranded probe containing the SRE of the PDGF A-chain (106). However, when the noncoding strand of the PDGF A-chain SRE (ASRE) was used as probe, a second protein distinct from the SRF was observed to complex with the labeled probe, but failed to interact with the double-stranded probe or with the coding strand of this region (107). The S1 sensitivity assay thus identified an element that has single-stranded character. The heterogeneity of DNA structure suggests that the same DNA sequence can bind separate nuclear proteins, each with a preference for a different DNA conformation and each of which may contribute significantly to the regulation of the PDGF A-chain gene.
E. Cell-type-specific Regulatory Element in the First Intron of the PDGF A-Chain Gene
Each of the S1-sensitive sites identified in the 5'-flanking region of the PDGF A-chain gene interacts with nuclear protein(s). However, subsequent deletion analyses failed to uncover cis-acting regulatory elements to account for the cell-type-specific expression of the PDGF A-chain gene. The S1 nuclease sensitivity assay was therefore used to search for regulatory element(s) in the first intronic DNA sequences. An S1-nuclease-sensitive region was found in a polypurine-polypyrimidine sequence within a 147-bp AluI-AluI DNA fragment of the first intron within positions +1591 to +1661 (Table I). The 147-bp DNA fragment has been identified as a cell-type-specific silencer that is not active in A172 cells but is functional in HeLa cells and recognizes nuclear factors differentially in A172 and HeLa cells. The cell-type-specific silencer activity has been further localized within a 24-bp DNA sequence that is homologous to a previously reported negative regulatory sequence in collagen-II genes (108) and may be an essential cis-acting element in controlling the cell-type-specific expression of the PDGF A-chain gene (109).
III. Summary and Perspective
The PDGF A-chain is very important for regulating both normal and abnormal cell growth (74). It is transcriptionally activated by different stimuli in vitro and in vivo. Based on a number of structural and functional analyses of DNA, summarized above, we tested the hypothesis that regulatory regions are identifiable by unusual structures that contain DNA with single-stranded character and thus S1 sensitivity. The use of an S1 sensitivity assay led to the surprising finding that each of seven S1-sensitive sites identified in the 5'- and 3'-flanking sequence of the PDGF A-chain gene interacts with nuclear proteins whose binding regulates PDGF A-chain gene expression. The conformational structure of these S1-sensitive sites is currently unknown but they are located within polypurine·polypyrimidine sequences (such as GCF binding site and novel NGFI-A binding site) or next to these sequences (such as H2TF1, SRF, and Sp1 binding sequences) (see Table I), suggesting that these S1-sensitive structures may be induced by these polypurine·polypyrimidine sequences (23). The mechanisms by which the S1-nuclease-sensitive sequences are involved in regulation of gene transcription are unclear, but a number of prevailing views suggest that their single-stranded nature allows RNA polymerase II entry and the formation of an open-transcription complex (110). Subsequent positive supercoiling ahead of, and negative supercoiling behind, the polymerase may lead to destabilization of chromatin and access of these regions to different transcriptionally active DNA binding proteins. Non-B-form DNA structures under superhelical stress release histones or other DNA binding proteins from these segments of DNA to form “open” regions of chromatin that are accessible to other regulatory transcription factors (46).
An important consequence of these sites to transcriptional regulation is suggested; local conformational changes in cis-acting elements very likely positively or negatively alter the affinity of trans-acting factors for these elements. Two or more proteins may also differentially recognize different conformations of DNA. Thus, the dynamic nature of DNA structure and the microheterogeneity of its conformations provide extraordinary flexibility to these regulatory elements. The use of the S1 sensitivity assay led to the novel use of oligonucleotides annealed to S1-sensitive regions in supercoiled plasmids to establish functional roles of S1-sensitive regions in transcription. In our experiments, transcriptional activity of supercoiled plasmids incubated with nanomolar concentrations of G-rich oligonucleotides at low pH values (which favors formation of intramolecular triplexes) was greatly reduced. We suggest that the annealed oligonucleotides bind to conformationally altered DNA and interfere with the binding of DNA-sequence-specific transcription factors to influence transcription.
In conclusion, we have used S1 sensitivity mapping to define specific regions in the 5'-flanking sequence and in the first intron of the PDGF A-chain gene that are involved in the modulation of transcriptional activity. We have also obtained evidence that the local conformations of DNA influence DNA-protein interactions and that local alterations of the DNA conformation may be detected by sensitivity to S1 nuclease in negatively supercoiled DNA in vitro. Thus, complex regulatory systems that include B-form and non-B-form cis-acting elements and double- and single-stranded DNA binding proteins all may contribute to the complex regulation of the transcription of the PDGF A-chain gene in vivo. Microheterogeneity of DNA conformation may be a very important factor in transcriptional regulation.
REFERENCES
1. H. Weintraub and M. Groudine, Science 193, 848 (1976).
2. A. Garel and R. Axel, PNAS 73, 3966 (1976).
3. C. Bonifer, A. Hecht, H. Saueressig, D. M. Winter and A. E. Sippel, J. Cell Biol. 47, 99 (1991).
4. M. C. Alevy, M. J. Tsai and B. W. O'Malley, Bchem 23, 2309 (1984).
5. K. Jantzen, H. P. Fritton and T. Igo-Kemenes, NARes 14, 6085 (1986).
6. B. Levy-Wilson and C. Fortier, JBC 264, 21196 (1989).
7. C. Wu, R. Holmgren, K. Livak, Y.-C. Wong and S. C. R. Elgin, J. Cell Biol. 79, 113a (1978).
8. A. J. Varshavsky, O. H. Sundin and M. J. Bohn, NARes 5, 3469 (1978).
9. W. A. Scott and D. J. Wigmore, Cell 15, 1511 (1978).
10. C. Wu, P. M. Bingham, K. J. Livak, R. Holmgren and S. C. Elgin, Cell 16, 797 (1979).
11. S. C. Elgin, Cell 27, 413 (1981).
12. J. D. McGhee, W. I. Wood, M. Dolan, J. D. Engel and G. Felsenfeld, Cell 27, 45 (1981).
13. I. L. Cartwright and S. C. Elgin, MCBiol 6, 779 (1986).
14. T. Maniatis, S. Goodbourn and J. A. Fischer, Science 236, 1237 (1987).
15. S. McKnight and R. Tjian, Cell 46, 795 (1986).
16. N. Webster, J. R. Jin, S. Green, M. Hollis and P. Chambon, Cell 52, 169 (1988).
17. E. Serfling, M. Jasin and W. Schaffner, Trends Genet. 1, 224 (1985).
18. B. Wasylyk, CRC Crit. Rev. Biochem. 23, 77 (1988).
19. B. Ondek, L. Gloss and W. Herr, Nature 333, 40 (1988).
20. E. N. Trifonov, TIBS 16, 467 (1991).
21. D. Kitsberg, S. Selig and H. Cedar, Curr. Opin. Genet. Dev. 1, 534 (1991).
22. R. D. Wells, JBC 263, 1095 (1988).
23. R. D. Wells, D. A. Collier, J. C. Hanvey, M. Shimizu and F. Wohlrab, FASEB J. 2, 2939 (1988).
24. C. M. Gorman, L. F. Moffat and B. H. Howard, MCBiol 2, 1044 (1982).
25. Y. Kohwi, S. R. Malkhosyan and T. Kohwi-Shigematsu, JMB 223, 817 (1992).
26. A. R. Rahmouni and R. D. Wells, Science 246, 358 (1989).
27. A. Weinreb, D. R. Katzenberg, G. L. Gilmore and B. K. Birshtein, PNAS 85, 529 (1988).
28. M. Lu, N. Zhang, S. Raimondi and A. D. Ho, NARes 20, 263 (1992).
29. M. Adachi and Y. Tsujimoto, Oncogene 5, 1653 (1990).
30. E. Martin-Blanco, J. R. Valverde, F. Flores, I. Vernos and R. Marco, BBRC 194, 647 (1993).
31. H. Weintraub, Cell 42, 705 (1985).
32. G. Felsenfeld, Nature 355, 219 (1992).
33. A. P. Wolffe and S. Dimitrov, Crit. Rev. Eukaryotic Gene Expression 3, 167 (1993).
34. J. L. Workman and A. Buchman, TIBS 18, 90 (1993).
35. P. B. Becker, BioEssays 16, 541 (1994).
36. S. M. Paranjape, R. T. Kamakaka and J. T. Kadonaga, ARB 63, 265 (1994).
37. J. J. Hayes and A. P. Wolffe, BioEssays 14, 597 (1992).
38. S. C. Elgin, JBC 263, 19259 (1988).
39. R. Reeves, BBA 782, 343 (1984).
40. J. C. Eissenberg, I. L. Cartwright, G. H. Thomas and S. C. Elgin, ARGen 19, 485 (1985).
41. G. H. Thomas, E. Siegfried and S. C. R. Elgin, in “Chromosomal Proteins and Gene Expression” (G. Reeck, G. Goodwin and P. Puigdomenech, eds.), p. 77. Plenum, New York, 1985.
42. D. S. Pederson, F. Thoma and R. T. Simpson, Annu. Rev. Cell Biol. 2, 117 (1986).
43. M. Yaniv and S. Cereghini, CRC Crit. Rev. Biochem. 21, 1 (1986).
44. D. S. Gross and W. T. Garrard, TIBS 12, 293 (1987).
45. E. H. Postel, S. E. Mango and S. J. Flint, MCBiol 9, 5123 (1989).
46. A. C. Johnson, Y. Jinno and G. T. Merlino, MCBiol 8, 4174 (1988).
47. Z.-Y. Wang and T. F. Deuel, BBRC 188, 433 (1992).
48. D. G. Pestov, A. Dayn, E. Siyanova, D. L. George and S. M. Mirkin, NARes 19, 6527 (1991).
49. C. Wu, Nature 286, 854 (1980).
50. L. L. Wallrath, Q. Lu, H. Granok and S. C. R. Elgin, BioEssays 16, 165 (1994).
51. G. Wall, P. D. Varga-Weisz, R. Sandaltzopoulos and P. B. Becker, EMBO J. 14, 1727 (1995).
52. H. P. Fritton, T. Igo-Kemenes, J. Nowock, U. Strech-Jurk, M. Theisen and A. E. Sippel, Nature 311, 163 (1984).
53. H. P. Fritton, T. Igo-Kemenes, J. Nowock, U. Strech-Jurk, M. Theisen and A. E. Sippel, Biol. Chem. Hoppe-Seyler 368, 111 (1987).
54. D. S. Gross and W. T. Garrard, ARB 57, 159 (1988).
55. C. J. Benham, PNAS 90, 2999 (1993).
56. A. Larsen and H. Weintraub, Cell 29, 609 (1982).
57. L. F. Liu and J. C. Wang, PNAS 84, 7024 (1987).
58. H. Y. Wu, S. H. Shyy, J. C. Wang and L. F. Liu, Cell 53, 433 (1988).
59. G. N. Giaever and J. C. Wang, Cell 55, 849 (1988).
60. Y. P. Tsao, H. Y. Wu and L. F. Liu, Cell 56, 111 (1989).
61. L. Bossi and D. M. Smith, Cell 39, 643 (1984).
62. R. R. Plaskon and R. M. Wartell, NARes 15, 785 (1987).
63. M. Snyder, A. R. Buchman and R. W. Davis, Nature 324, 87 (1986).
64. K. Zahn and F. R. Blattner, Science 236, 416 (1987).
65. J. M. Nickol and G. Felsenfeld, Cell 35, 467 (1983).
66. Z.-Y. Wang, X.-H. Lin, Q.-Q. Qui and T. F. Deuel, JBC 267, 17022 (1992).
67. D. L. Milton and R. F. Gesteland, NARes 16, 393 (1988).
68. R. Tjian and T. Maniatis, Cell 77, 5 (1994).
69. B. Frantz and T. V. O'Halloran, Bchem 29, 4747 (1990).
70. A. Heltzel, I. W. Lee, P. A. Totis and A. O. Summers, Bchem 29, 9572 (1990).
71. T. V. O'Halloran, B. Frantz, M. K. Shin, D. M. Ralston and J. G. Wright, Cell 56, 119 (1989).
72. A. Z. Ansari, M. L. Chael and T. V. O'Halloran, Nature 355, 87 (1992).
73. E. K. Hoffman, S. P. Trusko, M. Murphy and D. L. George, PNAS 87, 2705 (1990).
74. T. F. Deuel, Annu. Rev. Cell Biol. 3, 443 (1987).
75. H.-R. C. Kim, S. Upadhyay, G. Li, K. C. Palmer and T. F. Deuel, PNAS 92, 9500 (1995).
76. T. F. Deuel, J. S. Huang, S. S. Huang, P. Stroobant and M. D. Waterfield, Science 221, 1348 (1983).
77. C.-H. Heldin, A. Johnsson, S. Wennergren, C. Wernstedt, C. Betsholtz and B. Westermark, Nature 319, 511 (1986).
78. A. Hammacher, U. Bellman, A. Johnsson, A. Ostman, K. Gunnarsson, B. Westermark, A. Wasteson and C.-H. Heldin, JBC 263, 16493 (1988).
79. M. D. Waterfield, G. T. Scrace, N. Whittle, P. Stroobant, A. Johnsson, A. Wasteson, B. Westermark, C.-H. Heldin, J. S. Huang and T. F. Deuel, Nature 304, 35 (1983).
80. R. Dalla Favera, R. C. Gallo, A. Giallongo and C. M. Croce, Science 218, 686 (1982).
81. C. Betsholtz, A. Johnsson, C. H. Heldin, B. Westermark, P. Lind, M. S. Urdea, R. Eddy, T. B. Shows, K. Philpott and A. L. Mellor, Nature 320, 695 (1986).
82. M. Mercola, D. Melton and C. D. Stiles, Dev. Biol. 138, 114 (1988).
83. H.-J. Yeh, K. G. Ruit, Y.-X. Wang, W. C. Parks, W. D. Snider and T. F. Deuel, Cell 64, 209 (1991).
84. H.-J. Yeh, I. Silos-Santiago, Y.-X. Wang, R. J. George, W. D. Snider and T. F. Deuel, PNAS 90, 1952 (1993).
85. M. Mercola, D. A. Melton and C. D. Stiles, Science 241, 1223 (1988).
86. D. A. Rappolee, C. A. Brenner, R. Schultz, D. Mark and Z. Werb, Science 241, 1823 (1988).
87. H. Ross, E. W. Raines and D. F. Bowen-Pope, Cell 46, 155 (1986).
88. M. Mercola and C. D. Stiles, Development 102, 451 (1988).
89. T. F. Deuel and R. M. Senior, N. Engl. J. Med. 317, 236 (1987).
90. Y. Takimoto, Z.-Y. Wang, K. Kobler and T. F. Deuel, PNAS 88, 1686 (1991).
91. M. R. Briggs, J. T. Kadonaga, S. P. Bell and R. Tjian, Science 234, 47 (1986).
92. J. Milbrandt, Science 238, 797 (1987).
93. B. A. Christy, L. F. Lau and D. Nathans, PNAS 85, 7857 (1988).
94. P. Lemaire, O. Revelant, R. Bravo and P. Charnay, PNAS 85, 4691 (1988).
95. V. P. Sukhatme, X. Cao, L. C. Chang, C.-H. Tsai-Morris, D. Stamenkovich, P. C. P. Ferreira, D. R. Cohen, S. A. Edwards, T. B. Shows, T. Curran, M. M. Le Beau and E. D. Adamson, Cell 53, 37 (1988).
96. M. J. Lenardo and D. Baltimore, Cell 58, 227 (1989).
97. Z.-Y. Wang, X.-H. Lin, M. Nobuyoshi, Q.-Q. Qui and T. F. Deuel, JBC 267, 13669 (1992).
98. H. E. Moser and P. B. Dervan, Science 238, 645 (1987).
99. Z. Y. Wang, Q. Q. Qiu, K. T. Enger and T. F. Deuel, PNAS 90, 8896 (1993).
100. R. Lafyatis, F. Denhez, T. Williams, M. Sporn and A. Roberts, NARes 19, 6419 (1991).
101. S. Seino, M. Seino, S. Nishi and G. I. Bell, PNAS 86, 114 (1989).
102. Z.-Y. Wang, Q.-Q. Qui and T. F. Deuel, BBRC 198, 103 (1994).
103. R. Treisman, Cell 46, 567 (1986).
104. W. A. J. Yan, J. B. R. Franza and M. Z. Gilman, EMBO J. 8, 1785 (1989).
105. A. Minty and L. Kedes, MCBiol 6, 2125 (1986).
106. X. Lin, Z. Wang, L. Gu and T. F. Deuel, JBC 267, 25614 (1992).
107. Z.-Y. Wang, X.-H. Lin, M. Nobuyoshi and T. F. Deuel, JBC 268, 10681 (1993).
108. P. Savagner, T. Miyashita and Y. Yamada, JBC 265, 6669 (1990).
109. Z.-Y. Wang, N. Masaharu, Q.-Q. Qui, Y. Takimoto and T. F. Deuel, NARes 22, 457 (1994).
110. H. Weintraub, Cell 32, 1191 (1983).
Minute Virus of Mice cis-Acting Sequences Required for Genome Replication and the Role of the trans-Acting Viral Protein, NS-1

CAROLINE R. ASTELL,*,¹ QINGQUAN LIU,* COLIN E. HARRIS,* JOHN BRUNSTEIN,* HITESH K. JINDAL† AND PAT TAM*

*Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z3
†Department of Biomedical Research, St. Elizabeth's Medical Center of Boston, Boston, Massachusetts 02135
I. cis-Acting Sequences Required for MVM DNA Replication
   A. Assay for Replication of MVM Genome Constructs
   B. Loop End and Stem Deletions of the MVM Right-hand Hairpin Prevent Replication
   C. Characterization of a Replication-competent MVM Minigenome, pPTLR
   D. Most Stem Deletions of the MVM Left-hand Hairpin Prevent Replication of the Minigenome
   E. MVM Sequences Internal to the Right-hand Hairpin Structure Are Required for Replication
   F. Replication of Minigenome ...
   ...
   C. Biochemical Functions of Recombinant NS-1
   ...
   E. In Vitro Resolution of the 5'-5' Bridge Dimer
   F. In Vitro Resolution of the 3'-3' Bridge Dimer
   G. In Vitro Nicking Assays Show That NS-1 Nicks the B Half and Not the A Half of the 3'-3' Bridge Dimer
¹ To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 55
Copyright © 1996 by Academic Press, Inc.
All rights of reproduction in any form reserved.
CAROLINE R. ASTELL ET AL.
   H. Binding of NS-1 to the Origin Region
   I. Identification of Amino-acid Residues in NS-1 Essential for Nicking MVM DNA
III. Summary and Future Directions
References
The family Parvoviridae includes small, nonenveloped viruses with single-stranded linear DNA genomes (1). The classification of these viruses recognizes two subfamilies: the Densovirinae, whose members infect insects, and the Parvovirinae, whose members infect vertebrates. The Parvovirinae are further divided into three genera: the Dependoviruses, which normally require helper viruses for replication (e.g., adeno-associated virus, AAV);² the Parvoviruses, which replicate autonomously (e.g., minute virus of mice, MVM); and the Erythroviruses, a newly designated genus of autonomously replicating viruses that replicate in erythroid progenitor cells (2). There are at present two members of this genus, human parvovirus (B19) and simian parvovirus (SPV) (3), which replicates in cynomolgus monkeys.

Members of the Parvovirus genus normally have a restricted host range, infecting a particular species, for example, rats, mice, pigs, cattle, cats, mink, dogs, raccoons, or geese (4). Infection with these viruses can cause abnormalities during fetal and newborn stages of the animal's life, often leading to death or deformity. Early studies that characterized the effects of Kilham rat virus (KRV) inoculated into neonatal hamsters (5, 6) have been reviewed (7, 8). In addition to host range restriction based on species, at least two parvoviruses, MVM and porcine parvovirus (PPV), exist as different strains that infect distinctly different cell types (e.g., MVMp and MVMi) or exhibit different pathologies (e.g., PPV) (9, 10). Also, the very closely related feline panleukopenia virus (FPV) and canine parvovirus (CPV) infect feline or canine hosts, respectively (11). The determinant of tropism and/or pathology in the case of all these viruses (MVM, PPV, and CPV/FPV) has been mapped to the capsid protein (VP) of each virus (10-13); however, it is not known how these differences in the sequence of the capsid protein exert their effect on tropism. In the case of MVMp, which replicates in fibroblast cells, and MVMi, which replicates in cells of lymphoid origin, the different tropism is not exerted at the level of receptor binding (14), and this appears to be the case with PPV as well (10).

² The parvoviruses referred to in this article are listed in alphabetical order: AAV, adeno-associated virus; B19, human parvovirus B19; BPV, bovine parvovirus; CPV, canine parvovirus; FPV, feline panleukopenia virus; H1, rat H1 parvovirus; KRV, Kilham rat virus; LuIII, LuIII parvovirus; MVM, minute virus of mice; PPV, porcine parvovirus; SPV, simian parvovirus.
REPLICATION OF THE AUTONOMOUS PARVOVIRUS MVM
Parvoviruses are spherical (icosahedral) particles approximately 25 nm in diameter. They have a nonenveloped single-layer protein capsid surrounding the DNA genome, and are very stable. Viral particles have been purified by centrifugation in CsCl, with the density of full particles being determined at 1.45-1.47 g/cm³ (heavy fulls) or 1.41 g/cm³ (light fulls) and that of empty capsids at 1.32 g/cm³ (15). The difference between heavy and light full particles has, to our knowledge, not been resolved. The 1.45 g/cm³ particles were suggested to be an infectious precursor form of the mature particle (1.41 g/cm³) (8). It is now known that the major nonstructural protein, NS-1, is covalently attached to the 5' end of virion DNA, and on newly assembled particles this protein is external to the particles (16). Although many particles lose the NS-1 and remain infectious, particles with NS-1 attached would be expected to have a slightly lower density in CsCl. This observation may account, at least in part, for the two types of full particles observed, but is unlikely to be the complete explanation.

Because of their small size, the structure of parvoviruses is amenable to X-ray crystallographic techniques. The structure of canine parvovirus has been reported at 3.25 Å resolution (17). Also, B19 particles (self-assembled from recombinant-expressed capsid proteins) are being studied (18, 19), and several other parvovirus structures, including MVM, are under investigation (20). A detailed discussion of parvovirus structure has appeared (21), but is beyond the scope of this review.

The genome of parvoviruses is a single-stranded, nonpermuted DNA molecule ~4.6 to 5.5 kb long. These viruses package a single molecule of the minus strand (complementary to transcripts) and, in some cases, the plus strand (same polarity as transcripts). Some viruses package approximately equivalent amounts of minus and plus strands [e.g., AAV (22), B19 (23), and LuIII (24)], whereas others package predominantly minus strands [e.g., MVM, which packages ~99% minus strands (8)]; others package a range of minus and plus strands [e.g., bovine parvovirus (BPV) packages up to 20-30% plus strands, depending on the host cell (25)]. No parvovirus that packages predominantly plus strands has been characterized.³ The genomes of all parvoviruses examined contain two diagnostic features. First, they are packaged as single-stranded linear DNA molecules and, second, they all have terminal palindromic sequences that have the potential to fold into double-stranded hairpin structures.
³ By convention, the genome of autonomous parvoviruses, which package predominantly minus strands, has been drawn with the 3' end of the minus strand on the left and the 5' end on the right (26). The plus or complementary strand is drawn 5' → 3', with the left end corresponding to the 5' end and the right to the 3' end of the DNA. In this article, the 3' end of the viral genome is also referred to as the left-hand hairpin, and the 5' end as the right-hand hairpin.
FIG. 1. Structure of the MVMp hairpin termini. (A) The nucleotide sequence of the viral 3' (left end) terminus is illustrated in a Y-shaped hairpin configuration. Of the 115-nt sequence, 104 nt are able to base-pair with complementary sequences. The nucleotides within the stem portion (nt 25-26 and 89-91) are mismatched and result in a bubble structure, referred to as the "bubble" at nt 25 (32). (B and C) The nucleotide sequence of the viral 5' (right end) terminus is illustrated in a linear hairpin duplex form. Both the "flip" and "flop" sequences are illustrated. The stem structures are identical from nt 4949 to nt 5023 and nt 5070 to nt 5149. The loop end region of the right-end hairpin exists in two sequence orientations (flip and flop) as a result of hairpin transfer during DNA replication (32) (see Fig. 2 for further explanation). (D) The right-hand end of the MVM genome has inverted repeat sequences within the 206-nt terminal palindrome. If these are arranged in a base-paired configuration, the hairpin consists of a stem-plus-arms structure. The central shorter arm has been referred to as a knob; hence, a deletion of the knob generated a plasmid referred to as Bk.
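The relationship between the flip and flop orientations can be illustrated with a small Python sketch. It is an illustration only: the 6- and 7-nt sequences are toy stand-ins, not MVM sequence, and the function names are hypothetical. A perfect palindrome equals its own inverted complement, so hairpin transfer leaves a single terminal sequence, whereas an imperfect palindrome yields two terminal forms.

```python
# Toy illustration only: short placeholder sequences, not the MVM termini.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def inverted_complement(seq):
    """Return the reverse (inverted) complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def is_perfect_palindrome(seq):
    """A perfect palindrome reads the same as its inverted complement."""
    return seq == inverted_complement(seq)

def terminal_forms(seq):
    """Set of sequences expected at a terminus that undergoes hairpin transfer."""
    return {seq, inverted_complement(seq)}

perfect = "GAATTC"     # EcoRI site, a perfect palindrome
imperfect = "GAAGTTC"  # hypothetical imperfect palindrome

print(is_perfect_palindrome(perfect), sorted(terminal_forms(perfect)))
print(is_perfect_palindrome(imperfect), sorted(terminal_forms(imperfect)))
```

In this sketch the perfect palindrome gives one terminal form and the imperfect one gives two, which mirrors the single orientation versus the flip/flop pair.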
These terminal hairpin structures are essential for viral DNA replication and are discussed in detail in Sections II,B-II,D. The terminal palindromes vary in size from ~115 nt (left end, MVM) (27) to 365 nt [both ends of B19 (28)]. In addition, some parvoviruses have the same sequence at both ends of their genomes [e.g., AAV-2 (29); B19 (28, 30)]; hence, the ends are described as inverted terminal repeat (ITR) sequences, whereas other parvovirus genomes have terminal palindrome sequences that are unrelated, one end from the other [e.g., MVM (31, 32), other rodent parvoviruses such as H1, and bovine parvovirus (BPV) (33)]. The nucleotide sequences of the left-hand and right-hand terminal hairpin structures of the MVM genome are illustrated in Fig. 1. Of note is the right-hand terminal structure of MVM, which exists as two closely related sequences (flip and flop) that arise due to the process of hairpin transfer.

More than 20 years ago, it was proposed that a novel solution to replicating the 5' ends of the daughter strands of linear DNA molecules could be envisioned if the ends contained palindromic sequences (34). As illustrated in Fig. 2A, the unreplicated 5' end of the daughter strand would leave a 3' single-stranded tail on the parental DNA strand. This single-stranded region could fold back on itself and be ligated to the daughter strand. A nick opposite the ligation transfers the hairpin end to the daughter strand, and strand extension of the parental strand completes synthesis of the end. A specific consequence of this process is that if the terminal palindrome is an imperfect palindrome (Fig. 2B), hairpin transfer results in two sequences at the ends of the genome. These sequences are related in that one is the inverted complement of the other. When the first structures of the ends of parvovirus genomes were obtained, most were indeed shown to contain two such sequences. This was true for the ends of AAV-2 (35) and the right end of the MVM genome (31, 32). However, the left end of the MVM genome contained a single sequence, yet it was an imperfect palindrome (27, 31); hence, this observation suggested that the left end and the right end were replicated in different ways. Further studies of monomer RF molecules isolated from infected cells confirmed that a single sequence orientation is present at the left end of the MVM genome, and explanations such as only one sequence orientation being packaged could not account for the unique left-hand terminal sequence (32). In an effort to explain how the left end had a unique sequence while the right-hand palindrome existed as two sequences, a modified rolling hairpin model of DNA replication was proposed (Fig. 3) (32).

At the time this model was proposed, it was known that at least one viral protein, the major nonstructural protein, NS-1, was required for replication (M. Merchlinsky and D. C. Ward, personal communication). Also, other studies had noted a protein covalently attached to the 5' end of viral DNAs (31). Consequently, it was suggested that the protein involved in nicking and ligating the MVM DNA during replication (and hence becoming covalently attached to the DNA) is NS-1. NS-1 was proposed to function in a manner similar to that of the φX174 cis-A protein, which nicks replication intermediates and ligates single-stranded (circular) progeny genomes (36). Although one study suggested that the terminal protein was cellular in origin (37), it is now clearly established that the protein coupled to the 5' ends of MVM DNA is NS-1 (16, 38).

FIG. 2. Hairpin transfer mechanism to explain replication of the end of a DNA molecule containing a terminal repeat sequence. (A) The parental DNA strand (narrow line) undergoes semiconservative replication, resulting in a daughter strand (wide line) that is incompletely replicated at its 5' end due to lack of a primer. The single-stranded inverted repeat sequence (for convenience represented here as a 6-nt EcoRI recognition sequence) folds back on itself, and a ligation step followed by a nicking step results in hairpin transfer [i.e., transfer of the terminal inverted repeat sequence from the parental (upper) to the daughter (lower) strand]. The hairpin transferred to the lower strand acts as a template for strand extension to fill in the upper (parental) strand and complete synthesis. (B) The same scheme is illustrated for an imperfect inverted repeat sequence, showing that hairpin transfer generates two terminal sequences (related in that one is the inverted complement of the other). For the purposes of this illustration, a short 6- or 7-nt sequence is shown. In the case of the termini of parvovirus genomes, the terminal palindromes range from 115 nt to 206 nt (left and right ends, respectively, of the MVM genome) to 365 nt (both the left and right ends of the B19 genome).

The remainder of this review focuses on two aspects. The first is a summary of studies directed toward characterizing the cis-acting sequences required for MVM genome replication. The second is a discussion of the
biochemical properties of NS-1, the major nonstructural protein of MVM, and how it is involved in replication of the genome.

FIG. 3. Modified rolling hairpin model (MRHM) for autonomous parvovirus DNA replication. The incoming single-stranded viral genome strand is extended by host cell polymerases to form a covalently linked (left end) monomer RF (parental RF) (step 2). The right-hand ends fold back and the 3'-OH primes synthesis (step 3) of a dimer RF (step 4) joined in the center by an extended left-end hairpin. This structure is referred to as the 3'-3' bridge dimer. According to the MRHM, the bridge dimer is resolved asymmetrically, beginning with a nick, as indicated (step 4). Strand extension of the V parental strand displaces the left-hand hairpin sequence until the 5' end of the nick site is opposite the second nick site (step 5). A second nick and ligation resolve the dimer into two monomer RF molecules (step 6). It was proposed that the molecule on the left is converted to an extended form by nicking and strand extension and is recycled (to step 2), whereas the fully extended monomer RF molecule (on the right, step 6) undergoes repeated hairpin transfer and strand displacement synthesis (step 7) to generate progeny single-stranded viral DNA. This mechanism conserves the sequence at the left end and generates two sequences (flip and flop, steps 8a and 8b, respectively) at the right end. In the original presentation of the model, it was proposed that the nickases that function at both the right end and the 3'-3' bridge dimer were the same and that they were likely NS-1. This has subsequently been confirmed (see Section II,D for details). In 1985, it was further proposed that the reason the bridge dimer is resolved asymmetrically is that the "bubble" at nt 25 (see Fig. 1A) generates a small sequence heterogeneity in the arms of the bridge dimer (see Fig. 10A). The filled circle indicates NS-1 covalently attached to the 5' end of the DNA and the arrowhead indicates the direction of DNA synthesis. The open triangle indicates the two sequences, "flip" and "flop," found at the right end of the genome (see Fig. 1B and C).

I. cis-Acting Sequences Required for MVM DNA Replication

Two facts suggest that the terminal palindrome sequences of MVM are required for parvovirus genome replication. First, all parvovirus genomes characterized contain these unusual terminal structures (33), and these structures had been proposed as a novel solution to replicating the ends of linear DNA (34). Second, studies characterizing defective interfering particles of MVM showed that these replication-competent linear DNA molecules contain relatively large deletions within the central portion of the genome but retain the hairpin termini (39).

In order to characterize the DNA sequences needed for replication, it is necessary to have an infectious clone. Such a clone of the MVM genome became available in 1983 (40). Construction of this clone, referred to as pMM984, required a series of steps to generate a full-length molecule with extended hairpin structures inserted (using BamHI linkers) into pBR322 (40). This plasmid was obtained by cloning the right-hand and left-hand ends of the genome separately and combining them into the same plasmid. Although it was established that pMM984 could be transfected into permissive mouse cells and would yield infectious virus (as determined by plaque formation), amplification of the plasmid in bacterial cells resulted in deletions within the right-hand terminal sequence that rendered the plasmid (viral insert) incapable of replicating in eukaryotic cells (40). These deletions were characterized and shown to occur in two steps, resulting in symmetrical
deletions at precise positions within the hairpin (33, 41). The results were consistent with the deletions occurring as a result of slipped mispairing during replication due to the presence of short direct repeat sequences within the hairpin (33). In order to create mutations within the genome and ask whether these affect replication, it was necessary to find a bacterial cell capable of propagating the genome faithfully. To this end, a search of recombination-deficient (rec-) Escherichia coli was undertaken (41), largely due to a report stating that some rec- strains of E. coli can propagate large inverted repeat sequences (42). This search was not extensive, but did identify one strain, JC8111, that replicated the infectious MVM clone with little evidence of hairpin deletion (41). Surprisingly, the strain that worked best for the MVM hairpin was not the best strain identified for growth of lambda phage containing a 3200-bp perfect palindrome (42). The JC8111 strain has been used for other parvovirus infectious clones as well as other inverted repeats; it continues to be useful, but another commercially available strain, SURE cells (Stratagene, La Jolla, CA), works almost as well in our laboratory and has advantages with respect to lower amounts of background nucleases and restriction modification systems (P. Tam and C. R. Astell, unpublished observations).
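The way slipped mispairing between short direct repeats can remove sequence may be sketched as follows. This is a minimal illustration under stated assumptions: the 20-nt sequence is invented, the function names are hypothetical, and the sketch models only the generic outcome of slippage (loss of one repeat copy plus the intervening sequence), not the symmetry of the deletions observed in the MVM hairpin.

```python
def direct_repeats(seq, k):
    """Return (i, j) start positions of identical, non-overlapping k-mers with i < j."""
    hits = []
    for i in range(len(seq) - k + 1):
        for j in range(i + k, len(seq) - k + 1):
            if seq[i:i + k] == seq[j:j + k]:
                hits.append((i, j))
    return hits

def slippage_deletion(seq, i, j):
    """Product expected if the repeat copy at j mispairs with the copy at i:
    one repeat copy and everything between the two copies is lost."""
    return seq[:i] + seq[j:]

toy = "ATGCCTGAACGTTCCTGAGG"  # hypothetical sequence with a 5-nt direct repeat (CCTGA)
for i, j in direct_repeats(toy, 5):
    product = slippage_deletion(toy, i, j)
    print(i, j, product, len(toy) - len(product))
```

For the toy sequence, the single repeat pair gives a product that retains one copy of the repeat and is 10 nt shorter than the starting sequence.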
A. Assay for Replication of MVM Genome Constructs

In order to assay replication competence of molecular clones of the MVM genome, plasmids containing the MVM DNA and a source of NS-1, the major nonstructural protein, are transfected into permissive mouse fibroblast cells (e.g., LA9 cells) or nonpermissive simian COS-7 cells. In the first studies to establish infectivity of pMM984, NS-1 was expressed from the MVM genome encoded by the plasmid (40). The viral DNA replicated whether or not it was excised from plasmid sequences (i.e., NS-1 was capable of releasing the viral sequences from plasmid sequences), although excision of the viral sequences increased transfection efficiency 5- to 10-fold. Purification of the excised viral insert yielded DNA as infectious as monomer RF DNA isolated from infected cells, and ligation of the excised insert DNA into a double-stranded (DS) circular molecule doubled the efficiency of transfection of MVM RF (40). In these assays, the efficiency of replication was estimated by the number of plaques produced.

Replication was also assayed biochemically by extraction of DNA from transfected cells, deproteinization, digestion with restriction endonuclease, and electrophoresis on agarose gels. When the DNA was transferred to a filter and probed with nick-translated viral DNA, evidence for replication (specific double-stranded viral DNA bands, including terminal extended and turnaround fragments) was observed (40). A useful modification of this assay included digestion of DNA extracted by a modified Hirt lysis of transfected cells with the restriction endonuclease DpnI (43). DpnI recognizes the sequence G-mA-T-C. Plasmid DNA grown in bacterial cells that contain the dam methylase can therefore be digested with DpnI. However, if the DNA replicates in eukaryotic cells, which lack a dam methylase, the resulting hemi- or nonmethylated G-A-T-C sequences are not digested (44).

Early studies showed that a mutant pMM984 construct deleted in the viral structural protein coding region is replication competent (40), whereas a frameshift within the nonstructural protein coding region rendered the plasmid replication incompetent (M. Merchlinsky and D. C. Ward, personal communication). Although some constructs can be tested for replication competence with the NS-1 gene expressed from the DNA clone being tested for replication, it is simpler to supply NS-1 on a second plasmid (43, 45).
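The logic of the DpnI step can be summarized in a short sketch. It assumes a linear toy sequence (not MVM or plasmid sequence), a hypothetical function name, and the usual description of DpnI as cutting within G-A-T-C (between A and T) only when the adenine is methylated on both strands; input DNA from dam+ bacteria is treated as fully methylated, and DNA replicated in eukaryotic cells as hemi- or nonmethylated.

```python
import re

def dpni_fragments(seq, top_methylated, bottom_methylated):
    """Fragment lengths after DpnI digestion of a linear molecule.

    Sites are cut only when both strands are methylated; hemi- or
    nonmethylated DNA is left intact, as described in the text.
    """
    if not (top_methylated and bottom_methylated):
        return [len(seq)]
    # Assumed cut position: between A and T of each GATC site (blunt ends).
    cuts = [m.start() + 2 for m in re.finditer("GATC", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

toy = "AAGATCTTTTGATCAA"  # hypothetical 16-nt sequence with two GATC sites

print(dpni_fragments(toy, True, True))    # input DNA grown in dam+ bacteria
print(dpni_fragments(toy, True, False))   # after one round of replication
print(dpni_fragments(toy, False, False))  # after further rounds
```

Only the fully methylated input is fragmented in this sketch, which is why DpnI resistance serves as evidence that a molecule has replicated in the transfected cells.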
B. Loop End and Stem Deletions of the MVM Right-hand Hairpin Prevent Replication

As mentioned above, the infectious clone of MVM undergoes deletions within the right hairpin when propagated in most normal cloning strains of E. coli, and hence amplification in JC8111 or SURE cells is necessary to prevent this phenomenon. We used JC8111 cells to isolate several spontaneous right-hand hairpin deletants of the MVM genome. The deleted clones were stable in JC8111, and they contained symmetrical "loop end" deletions of 40 or 97 bp (Fig. 4A). In either case, the cloned deletants failed to replicate when transfected into LA9 or COS-7 cells in the presence of NS-1 (Fig. 4C) (43). We had noted previously that many, if not all, parvovirus hairpin structures can fold into a "stem-plus-arms" structure (Fig. 4A) (33). Because the Bi (EcoRI restriction fragment B, intermediate deletion, 40 bp) and AB (EcoRI restriction fragment B, major deletion, 97 bp) right hairpin deletants, which could not replicate, can fold into an extended hairpin and not a stem-plus-arms hairpin, we predicted that a smaller deletion that would allow formation of the arms should be replication competent. The Bk (EcoRI restriction fragment B, knob deletion, 9 bp) deletant was tested for replication competence and, surprisingly, this construct also failed to replicate. Hence, we concluded from these studies that the entire right-hand hairpin of the MVM genome is required for genome replication.

A second group of deletant clones, referred to as "stem deletions," was constructed by exonuclease III digestion from the BamHI linker into the right-hand hairpin. As illustrated in Fig. 4A, as long as enough of the hairpin was retained such that one could envision repair by strand extension (i.e., dl 16), the stem deletant could replicate. More extensive stem deletions render the plasmids replication negative (Fig. 4C). In another naturally occurring "mutation," the MVMi genome has a subtle variation in sequence compared with MVMp and it is replication competent (46, 47). These results are entirely consistent with studies with the AAV-2 infectious clone, in which topology was concluded to be the important determinant in replication (48). Subsequent studies have shown that certain nucleotides involved in the resolution and replication mechanisms play a key role, and these will be discussed in Sections II,E-II,H. The published studies with "loop end" and "stem" deletion clones of the right-hand hairpin were done with essentially full-size genome clones (43). Similar results were obtained with these right-hand hairpin deletions cloned into a minigenome, pPTLR (P. Tam and C. R. Astell, unpublished results).

FIG. 4. Schematic illustration of the right and left hairpin ends of the MVM genome, including mutant hairpin termini constructs. (A) The right-hand end of the MVM genome is illustrated in the stem-plus-arms and linear-duplex-hairpin configurations. In the loop end deletions, symmetrical deletions of the loop end of the hairpin are indicated. The nucleotides between two identical symbols (AB, Bi, or Bk) are deleted. In the stem deletions, a progressively increasing number of nucleotides was deleted from the BamHI linker used to generate the infectious clone, pMM984. Deletions 16, 19, 2, and 3 delete 46, 82, 101, and 126 nt, respectively. (B) The left-hand palindrome of the MVM genome is illustrated as a Y-shaped hairpin. Stem deletions of progressively increasing size (dl 12, 26, 31, 57, and 78) are indicated. The unpaired region within the stem corresponds to the bubble at nt 25. (C) Transient DNA replication assays of loop end and stem deletants of the right-hand hairpin of MVM [data are from Salvino et al. (43)]. Mouse A9 cells were transfected with an infectious clone of MVM, pMM984, loop end deletants (AB, Bi, or Bk), or stem deletants (dl 16, dl 19). After 48 hours, DNA was isolated by a Hirt extraction and analyzed by electrophoresis on an agarose gel, blotted to nitrocellulose, and probed with MVM DNA. When pMM984 was transfected into cells, replication intermediates, including single-strand (ss) DNA, monomer RF (m) DNA, and dimer RF (d) DNA, are clearly seen (lane 1). Dilutions of 0.1X or 0.01X were run on lanes 2 and 3, respectively. Lane 4 is from a mock infection. Lanes 7, 8, and 9 correspond with the loop end deletions AB, Bi, and Bk, respectively, and lanes 5 and 6 correspond with stem deletants dl 16 and dl 19, respectively. The band labeled M is a 1000-bp fragment (³²P-labeled) added to the Hirt extracts as a means of controlling for variable recovery of DNA in the extraction processes.
C. Characterization of a Replication-competent MVM Minigenome, pPTLR

To facilitate studies with other mutations within putative cis-acting replication sequences, a recombinant minigenome of MVM, pPTLR, was constructed (45). This clone contains the left 411 nt of pMM984 (viral sequences 1-411) joined to the 807-nt XbaI to BamHI fragment (viral sequences 4342-5149; Fig. 5A). Replication of this construct is as efficient as the full-length genome when NS-1 is supplied by cotransfecting mouse LA9 cells with pCA4.0 (a derivative of pMM984) or COS-7 cells with pCMV-NS-1 (NS-1 expressed under control of the cytomegalovirus immediate-early promoter) (Fig. 5B). The advantage of using a minigenome of this size is that all four monomer RF species (mLR1 through mLR4) predicted by the modified rolling hairpin model are resolved on the agarose gels (Fig. 5B) and have been characterized by exonuclease III and S1 nuclease digestion, and boiling and fast cooling (45). The putative structures of mLR1-4 are illustrated in Fig. 5C. The steady-state amounts of each intermediate in LA9 cells indicate that mRF species with a closed left-hand end and closed left- and right-hand ends predominate, whereas in COS-7 cells, the fully extended mRF and closed left-hand end predominate. These results were observed with transfected cells under steady-state conditions (DNA isolated at 48 hours) and suggest that the left-hand hairpin is replicated (resolved) less efficiently than the right. Studies with LL (two left-end) and RR (two right-end) minigenome constructs support this conclusion (49) (see Section I,F), as do results of others in which the accumulation of replication intermediates following infection of synchronized cells showed that 65% of mRF molecules are in the fully extended form, whereas most of the rest are covalently joined at the left-hand end (50).
D. Most Stem Deletions of the MVM Left-hand Hairpin Prevent Replication of the Minigenome

"Stem deletions" were generated at the left palindrome of the pPTLR minigenome using exonuclease III (51). Deletions of 11, 25, 30, 56, and 77 bp (Fig. 4B) were identified and used in the minigenome construct to test replication in mouse LA9 cells and COS-7 cells. All left hairpin terminal deletions appeared to replicate in mouse cells cotransfected with pMM984 (source of NS-1); however, the size of the monomer RF species was that of the pPTLR minigenome, indicating that the deletion plasmids recombined with the left end of pMM984 with high efficiency, generating a replication-competent minigenome. When the same deletion plasmids were cotransfected into COS-7 cells (with pCMV-NS-1, source of NS-1), only the dl 12 (11-bp deletant) construct could replicate. From Fig. 4B, it is clear that this is the only construct with the potential to repair itself (by strand extension). The dl 26 mutant would be expected to repair the bubble at nt 25 against the opposite strand, eliminating this unpaired region of the hairpin. Hence, once again it appears that replication of the MVM genome requires the entire hairpin sequence, although mutations of specific nucleotides were not tested in these assays.
E. MVM Sequences Internal to the Right-hand Hairpin Structure Are Required for Replication

Having established that almost the entire left and right hairpins are required for replication of the MVM genome, we sought to determine the minimum size of genome containing these hairpins that could replicate. The minigenome pPTLR (~1.2 kb) described above was the smallest construct shown to replicate at that time. It is significantly smaller than previously characterized defective interfering particles (39) or an internal deletant (~1.9 kb) (43). A series of internal left-end (ILE) and internal right-end (IRE) deletions within pPTLR were constructed (Fig. 6A) and tested for replication in LA9 or COS-7 cells (45). ILE deletions to nt 259 replicated as efficiently as the minigenome, pPTLR, but a deletion to nt 140 replicated at a reduced rate in both cell types, indicating that sequences from nt 140 to nt 411 are not essential for replication. In contrast, IRE deletions show that a region internal to the right-hand hairpin is required for replication (45). Deletions from nt 4243 to nt 4489 replicated well in LA9 cells, but an IRE deletion to nt 4636 does not replicate (Fig. 6B). In COS-7 cells, replication was detected up to and including nt 4636 and at a reduced level up to nt 4695. Beyond that, replication was abolished. Because these plasmid deletants (pPTLR411-4636 and pPTLR411-4695) are, in total size, 924 and 865 nt, and significantly smaller than the minigenome (1218 nt), it was possible that the small size of these minigenomes restricted their replication. However, insertion of spacer sequences failed to restore replication, unless these sequences contained the MVM sequences that had been deleted. Interestingly, a fragment spanning this entire region, inserted in either orientation, restored replication.
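The construct sizes quoted above follow from simple arithmetic on the deletion endpoints. The sketch below assumes a genome length of 5149 nt (the highest nucleotide number used for the right end in this article) and a numbering convention, chosen here because it reproduces the quoted sizes, in which a construct "dl a-b" retains nt 1 through a and the nucleotides after b; the function name is hypothetical.

```python
GENOME_LENGTH = 5149  # assumed: highest nucleotide number used for the MVM right end

def retained_length(deletion_start, deletion_end, genome_length=GENOME_LENGTH):
    """Viral nucleotides remaining in a 'dl start-end' construct.

    Assumes nt 1..start and the (genome_length - end) nucleotides after
    'end' are retained; this convention is an inference from the quoted sizes.
    """
    return deletion_start + (genome_length - deletion_end)

constructs = {
    "pPTLR (dl 411-4342)": (411, 4342),
    "pPTLR411-4636": (411, 4636),
    "pPTLR411-4695": (411, 4695),
}
for name, (start, end) in constructs.items():
    print(name, retained_length(start, end))
```

Under these assumptions the three constructs come to 1218, 924, and 865 nt, matching the sizes given in the text.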
FIG. 5. Structure and replication of the minigenome of minute virus of mice, pPTLR. (A) The left-hand end (411 nt) and right-hand end (807 nt) of the MVM genome were cloned, generating a plasmid that was deleted from nt 411 to nt 4342. This plasmid, pPTLR, retained both hairpin termini intact, the P4 promoter, the poly(A) signal, and the two 65-bp repeats (tandem arrows) near map unit 92. (B) When pPTLR is transfected in mouse A9 cells or COS-7 cells in the presence of NS-1 (in A9 cells pPTLR is cotransfected with pCA4.0, a derivative of the infectious clone pMM984, and in COS-7 cells pPTLR is cotransfected with pCMV NS-1, a plasmid containing the NS-1 gene under control of the CMV immediate-early promoter), the MVM sequences are excised and replicated. The monomer RF molecules are ~1.2 kb in size and are resolved into four bands, mLR1, mLR2, mLR3, and mLR4. Dimer RF molecules dLR 1, 2, and 3 are also seen, as is a small amount of single-stranded DNA (ssLR). (C) When the mLR products are analyzed by digestion with exonuclease III and S1 nuclease, or heated and fast cooled, the properties of the four bands are consistent with molecules that are fully extended (mLR1), covalently linked at one end (mLR2 and mLR3), and covalently closed at both ends (mLR4).
FIG. 6. Construction of internal left-end (ILE) and internal right-end (IRE) deletions in the plasmid pPTLR and analysis of their replication in A9 cells. A series of nested deletions from nt 411 were generated in a leftward and rightward direction by using appropriate restriction endonuclease sites (leftward) or exonuclease III (rightward) digestion. The plasmids were designated by the nucleotide numbers bracketing the deletion (e.g., dl 140-4342 is equivalent to the full infectious genome clone deleted from nt 140 to nt 4342). Both leftward deletants dl 259-4342 and dl 140-4342 replicate when transfected into A9 cells with pCA4.0 [source of NS-1 (45)]. The rightward deletants also replicate when the deletion extends to nt 4489, but not if the deletion extends to nt 4636 or beyond (lanes 6-11). Note that the sizes of the mLR, ssLR, and IP bands decrease because the size of the deletant minigenome is smaller with each successive deletant. In this figure, the DNA is analyzed without DpnI digestion. Similar replication results are observed after DpnI digestion; however, this digests the input pCA4.0 and MVM and generates smaller fragments that complicate the visual interpretation of the overall result. DNA species identified are mMVM, monomer RF of MVM DNA derived from the infectious clone (pCA4.0); pCA4.0, a full-length infectious clone of the MVM genome; IP, the internal deletant clones of MVM used in the transfection; ssMVM, single-stranded MVM DNA (full-length); mLR, monomer RF molecules derived from the internal deletant clones; ssLR, single-stranded DNA derived from the internal deletant clones; pPTLR, the 1.2-kb minigenome of MVM; and pPTLR dl 411-4436, an internal deletant clone of pPTLR deleted between nt 411 and 4436, etc.
In order to further analyze the internal replication sequence (IRS) required for replication, the region was subdivided by restriction digestion into RsaA and RsaB, corresponding to fragments nt 4431-4579 and nt 4580-4662, respectively (Fig. 6A). When RsaA was inserted (in either orientation) into a replication-incompetent IRE deletant, replication was restored to ~60% of the minigenome level in the correct orientation, but only 7% in the incorrect orientation. When the RsaB fragment was inserted (in either orientation), replication was restored to ~20-30% of the minigenome level (both orientations) (49). These observations suggest that the IRS sequence plays an important role in replication of the MVM genome; however, its function is not known.

In order to investigate the function of the IRS sequence, we studied binding of proteins in nuclear extracts of A9 cells (49). It was found that both the RsaA and RsaB fragments bind host cell nuclear proteins and that at least
four specific complexes are formed with RsaA (MRF A3, A4, A5, and A6) and three specific complexes are formed with RsaB (MRF B3, B4, and B5) (Fig. 7A). Two other complexes, MRF A2 and MRF B2, are disrupted when either unlabeled RsaA or RsaB (but not an unrelated DNA fragment) is used in competition with the labeled fragment. This observation suggests that the protein(s) contributing to the MRF A2 and MRF B2 complexes are the same. It was also shown that the same complexes form when uninfected or MVM-infected LA9 nuclear extracts were used; hence, the proteins that bind to the A/B region are cellular. To further characterize the protein-DNA complexes, the LA9 nuclear extracts were fractionated by a standard biochemical procedure, and a fraction containing predominantly MRF B5 binding activity was used for DNase I footprinting. The nucleotides protected are illustrated in Fig. 7B. Other data suggest that MRF B3 and MRF B4 are components of MRF B5 (49). Further studies are under way to establish more precisely the nucleotides within the RsaA/RsaB region important for replication and/or protein binding and the identity of the cellular proteins (J. Brunstein and C. R. Astell, unpublished results). We do not yet know the function of the IRS sequence; however, one possibility is that it facilitates a folding back of the extended right-hand palindrome during synthesis of dimer RF molecules (Fig. 3, step 3) and single-stranded viral progeny strands (Fig. 3, step 7) (51).
FIG. 7. Cellular proteins in nuclear extracts bind to RsaA and RsaB fragments located inboard of the right hairpin end and required for replication of minigenomes. Radiolabeled RsaA fragment (nt 4431-4578) (A) and RsaB fragment (nt 4579-4662) (B) were incubated in the presence (lanes 2-5) or absence (lane 1) of 5 µg of uninfected A9 nuclear extract and analyzed by nondenaturing gel electrophoresis. Competitor DNAs were included at a 200-fold molar excess of unlabeled RsaA (lane 3), RsaB (lane 4), or an unrelated Rsa70 fragment (lane 5). Specific protein-DNA complexes were designated MVMp replication factors (MRFs). Nonspecific (NS) complexes are also indicated. Because the A2 and B2 bands are competed by both RsaA and RsaB, it is presumed that the same protein or proteins form complexes on both RsaA and RsaB. The nuclear extract was subjected to chromatography. A fraction that contained predominantly MRF B5 binding activity was used in DNase I footprinting studies. (C) The nucleotides protected by the MRF B5 fraction are illustrated (heavy over- or underbar). The arrowheads indicate a region with increased sensitivity to DNase I in the presence of MRF B5.

F. Replication of Minigenomes with Two Right-hand and Two Left-hand Hairpin Termini

As indicated above, all parvovirus genomes have terminal palindrome sequences that can form hairpin structures. Some genomes have inverted terminal sequences in which the palindromes are identical, whereas others have end sequences that are unrelated. The former have been termed type A genomes and the latter have been termed type B genomes (33). Teleologically, it makes more sense to have a type A genome. Studies with hairpin deletions of the AAV genome (type A) showed that the genome can replicate even if most of one hairpin end is deleted. The explanation is that because there is one intact end, the deleted end is able to repair itself using the intact end as a template (48). This is not possible with type B genomes (e.g., MVM) and we have observed that relatively minor stem deletions of either the left or right hairpin render the genome replication incompetent (see Sections I,B-I,D). It has intrigued us for many years why MVM and many other parvoviruses evolved a type B genome. To gain insight into this phenomenon, we constructed two mutant minigenomes that have two left hairpins (pPTLL) or two right hairpins (pPTRR) (49) (Fig. 8). When these genomes are transfected into COS-7 cells with pCMV-NS-1, pPTRR replicates as efficiently as the pPTLR construct, whereas pPTLL replicates very inefficiently. Our initial pPTLL construct lacked the A/B (IRS) region described
above. LL constructs that contain the IRS sequence do replicate at a reasonable level (although definitely less efficiently than the pPTLR and pPTRR constructs), generating monomer RF (mLL) DNA and dimer and higher multimer sequences. The mLL DNA occurs as two bands that were characterized as having the following structures: [structure diagrams not reproduced]

FIG. 8. Schematic representation of pPTLL and pPTRR clones. The diagrams illustrate that these clones contain two left hairpins (Y-shaped) (pPTLL, pPTLLX, pPTLLXS1F, and pPTLLXS2F) or two right hairpins (pPTRR) (drawn in the stem-plus-arms configuration). In addition, the 65-bp repeats (two small arrows), the IRS sequence (RsaA/RsaB region), and P4 promoter are indicated.
Whatever the role of the left-hand terminus is, it apparently does not function as an efficient hairpin during replication. In marked contrast, the right hairpin is replicated efficiently at either end. Because the pPTRR construct replicates as efficiently as pPTLR, one may wonder why a new "isolate" of MVM that possesses two right-hand
termini has not been characterized. MVM particles are packaged with an NS-1 molecule covalently linked to the 5' end of negative-strand DNA on the outside, and it has been suggested that this end of the progeny strand may initiate packaging of the DNA (16). However, the packaging signal for MVM DNA has not yet been characterized, and studies have shown that VP2 binds to the 3' terminal hairpin of MVM RF DNA and packages single-stranded viral DNA (52, 53). Consequently, it seems reasonable to us at this time to assume that MVM has a type B genome because, although the right-hand hairpin end replicates efficiently and the left-hand hairpin does not, the left-hand end likely contains the packaging signal for formation of progeny virions. [It should be mentioned that although VP2 can package single-stranded viral DNA (52, 53), viral particles lacking the minor capsid protein VP1 are not infectious (%).]
II. The Nonstructural Proteins of MVM

Of the 5149-nt MVM genome, the left (115 nt) and right (206 nt) hairpins represent ~6% of the genome. Much of the remainder (of the plus strand) contains two large open reading frames accounting for ~86% of the genome (55, 56). All vertebrate parvovirus genomes are arranged in a similar manner. The left half of these genomes encodes nonstructural (NS) proteins and the right half encodes structural (VP) proteins. Differences in gene expression among vertebrate parvoviruses arise due to the use of one [e.g., B19 (56, 57) and BPV (58)], two [e.g., MVM (59)], or three [e.g., AAV (60, 61)] promoters to initiate transcripts that undergo alternate complex splicing events to generate mRNAs translated into NS and VP proteins (62).
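The ~6% figure for the hairpins follows directly from the lengths quoted in this paragraph. As a quick check, here is a minimal Python sketch that uses only the numbers given in the text. The variable names are ours and purely illustrative, and the nucleotide count derived for the open reading frames is a rough estimate from the approximate 86% figure, not a value reported in the source.

```python
# Values quoted in the text.
GENOME_NT = 5149         # full MVM genome
LEFT_HAIRPIN_NT = 115    # left-hand hairpin
RIGHT_HAIRPIN_NT = 206   # right-hand hairpin
ORF_FRACTION = 0.86      # approximate share covered by the two large ORFs

# Combined hairpin length and its share of the genome.
hairpin_nt = LEFT_HAIRPIN_NT + RIGHT_HAIRPIN_NT
hairpin_fraction = hairpin_nt / GENOME_NT

# Rough ORF span implied by the approximate 86% figure (an estimate only).
approx_orf_nt = round(ORF_FRACTION * GENOME_NT)

print(f"hairpins: {hairpin_nt} nt ({hairpin_fraction:.1%} of the genome)")
print(f"ORFs: roughly {approx_orf_nt} nt")
```

The two hairpins together come to 321 nt, or about 6.2% of the genome, in line with the rounded value in the text.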
A. The Family of NS-2 Proteins

In the case of MVM, the two promoters are located at P4 and P38 (59) (Fig. 9). Primer extension assays map the initiation of transcripts to nt 201 ± 5 (64) and nt 204 or 205 (63) (P4 transcripts) and nt 2003 (46) or 2005 ± 5 (64) (P38 transcripts). All transcripts are processed and polyadenylated at the far right side of the genome using predominantly the most distal of four AATAAA signals (65). Splicing of the primary transcripts is complex. A large splice within the left half of the genome results in a transcript that encodes the NS-2 proteins. These proteins (~25 kDa) contain the same 84 N-terminal amino acids as NS-1 and then change reading frame to acquire a distinct block of amino acids as the second exon. The C terminus varies due to alternate splicing of the small intron at map unit 45 (66, 67) (Fig. 9). As a result, three different C-terminal ends are found on NS-2. The NS-2 proteins are located in the cytoplasm and nucleus of infected
cells; the phosphorylated species of NS-2 are predominantly cytoplasmic whereas the nonphosphorylated ones are distributed in both the nucleus and the cytoplasm. The half-life of the NS-2 proteins is relatively short (~1 hour) compared with NS-1 (>6.5 hours). Because all three NS-2 proteins share similar degradation, phosphorylation, and localization patterns, the significance of the carboxyl end of these proteins is unknown (68). The role of the NS-2 proteins in the life cycle of MVM is still uncertain, although they are required for replication in some cell lines and are required for capsid protein synthesis and ssDNA synthesis (69-71). They are also involved in pathogenesis of MVM-infected mice and are required (along with NS-1) for maximal cytotoxicity in human transformed cells (72, 73). More recently, evidence has been presented that the NS-2 proteins are required for correct folding or assembly of the capsids rather than translation of VP1 and VP2 (74).

FIG. 9. Transcription map for the MVM genome. Transcripts originate at one of two promoters, P4 or P38. They all terminate at the most distal poly(A) site. Splicing is complex; nine transcripts are generated. The 83-kDa NS-1 protein is encoded by three R1 transcripts. The R2 transcripts result in three 25-kDa NS-2 proteins that differ slightly at the COOH end due to alternate splicing near map unit 45. The R3 transcripts encode VP1 (the minor capsid protein) and VP2 (the major capsid protein). For MVM, the third capsid protein, VP3, is generated by proteolytic cleavage.
B. The Major Nonstructural Protein NS-1

The major nonstructural polypeptide NS-1 is a 672-amino-acid protein. The stop codon for the ORF encoding NS-1 precedes the small splice at map
unit 45 (Fig. 9); hence, the primary sequence of this protein is unique, unlike the NS-2 proteins. Somewhat surprisingly, the ORF encoding NS-1 is open for some 49 codons upstream of the ATG at nt 261. This seems quite unusual and also occurs in MVMi, a closely related virus (46). There are four amino-acid-coding differences within this region between MVMp and MVMi, but no stop codons. The strongest evidence against the use of this "pre"-coding region appears to be the lack of transcripts mapping to this region (59). Also, indirect evidence from recombinant NS-1 expressed from nt 261 indicates this protein is active in several functional assays (described in Section II,C) (75-78).

Although NS-1 is a unique sequence protein, it is phosphorylated (79) and localized within the nucleus of infected and transfected cells (51, 80, 81). The phosphorylation of NS-1 is exclusively on serine residues, and phosphotryptic peptide analysis indicates that at least 18 phosphotryptic peptides can be resolved in infected mouse cells (H. K. Jindal and C. R. Astell, unpublished results). The role of the phosphorylation is not known. As mentioned in Section II,A, the half-life of NS-1 has been estimated at >6.5 hours (68).

A number of years ago, as a start to mapping the functional domains of NS-1 and understanding its role in the MVM life cycle, computer analysis of parvovirus proteins identified a putative nucleotide-binding fold motif within the NS-1 protein (82). This region is conserved among the NS proteins of all vertebrate parvoviruses.

The major functional roles of NS-1 have been associated with replication of the genome and transcriptional activation of the viral promoters. When the infectious clone of MVM became available, it was shown that a deletion within the structural protein coding region permits replication of the DNA. However, a frameshift mutation at the MstI site (nt 1061) within the NS-1 coding region blocks replication (M. Merchlinsky and D. C. Ward, personal communication; 83). The precise role of NS-1 in replication of the MVM genome is still not fully understood; however, recent in vivo and in vitro studies have led to an improved understanding of the replication mechanism, and not surprisingly have provided data that cannot be explained by the modified rolling hairpin model (Fig. 3).

In addition, the role of NS-1 in transcriptional activation has been clarified. MVM NS-1 enhances transcription from the P38 promoter by up to 100-fold and the P4 promoter to a lesser extent (84, 85), and cis-acting regulatory sequences have been identified for the P38 promoter (86) and P4 promoter (63, 87). Unlike many transcription factors, NS-1 was not shown convincingly to bind to either the P4 or P38 promoter regions, which have been implicated in promoter regulation. However, very recent evidence shows that it binds to the P38 promoter (88). Prior to this observation, the working hypothesis suggested that NS-1 interacts with other cellular proteins more directly involved in the initiation of transcription, and this is still
likely. When NS-1 is fused to a DNA binding domain that binds upstream of a reporter gene, NS-1 up-regulates expression of that gene (89). Further constructs have narrowed this trans-activation domain to the C-terminal region of NS-1. Similar results have been obtained by others (90). In support of the idea that NS-1 interacts with a cellular protein or proteins, the two-hybrid genetic-selection system has been used to clone a cellular cDNA that binds the N-terminal half of NS-1 (C. E. Harris and C. R. Astell, unpublished results).

Because NS-1 localizes in the nucleus of cells, it presumably contains a nuclear targeting signal. This signal appears to consist of a bipartite lysine-rich motif near amino-acid residue 200 (K1"K(X)1,KKK216) (91). These studies also observed that both the wild-type and a C-terminal deletion mutant of NS-1 (amino acids 1-605) are able to effect the nuclear localization of NL- (nuclear localization negative) mutants of NS-1. These results suggest that NS-1 can oligomerize prior to transport into the nucleus. Other recent data suggest that oligomers of NS-1 must form in order for NS-1 to bind to the P38 promoter (88), and we have indirect genetic evidence that two molecules of NS-1 exhibit protein:protein interaction (C. E. Harris and C. R. Astell, unpublished results).
C. Biochemical Functions of Recombinant NS-1

In order to characterize biochemical functions of NS-1, we overexpressed this protein in insect cells using a recombinant baculovirus (75). Purification of NS-1 was achieved using an immunoaffinity column containing a monoclonal antibody to NS-1 (75, 80). The initial activities associated with this recombinant NS-1 were ATPase and helicase, which paralleled similar functions associated with the AAV REP 68 (and 78) proteins (92, 93). The recombinant NS-1 complements a mouse cell extract and effects site-specific nicking of the 3' half-dimer bridge and resolution of the dimer bridge (Q. Liu and C. R. Astell, unpublished results; see Sections II,F and II,G). The insect cell recombinant protein expression system was also used to study a series of mutations within the NTP-binding motif of NS-1 (94). These mutations targeted putative key residues within the nucleotide-binding fold (94, 95), which corresponds with conserved amino acids in a superfamily of proteins involved in replication of DNA and RNA viruses believed to be associated with a helicase function (96). Of the mutations constructed, all had greatly reduced helicase activity, whereas the ATPase activity varied from 95 to 1% of that of the wild-type NS-1 protein (94).

In addition to studies with purified NS-1, others have attempted to map functional regions within NS-1 by cotransfecting an NS-1-expressing (wild-type or mutant) plasmid into cells with either a viral genome construct containing the cis-acting sequences required for replication (94, 97) or the
MVM P38 promoter driving a suitable reporter gene (81, 89, 94, 97). In summary, most mutations within the nucleotide-binding fold are replication-negative (rep-) and trans-activation-positive (trans+). One notable exception is the lysine residue at position 405. Its conversion to serine (94) or arginine or methionine (81) blocks trans-activation of the P38 promoter. A similar observation was made for the NS-1 protein of H1 parvovirus (98). In addition, several mutant NS-1 proteins are defective in the resolution of the 5' to 5' and 3' to 3' bridge dimer structures (81). (See Section II,D for an explanation of bridge dimer resolution.)
D. On the Role of NS-1 in Replication of the MVM Genome

A major focus of this review is to summarize recent data on the role of NS-1 in replication of the MVM genome. The original model for MVM replication (99) was modified about 10 years ago (32) to take into account the fact that the right-hand hairpin of the genome exists in two sequence orientations, indicating that it may arise by a hairpin transfer mechanism. In contrast, the left-hand hairpin exists as a unique sequence and hence is replicated by a mechanism different from that on the right end. The key modification for the rolling hairpin model was the prediction that resolution of the central dimer bridge (3' to 3', tail to tail) arrangement required asymmetric nicking of this region, which arose due to a subtle asymmetry generated by the "bubble" at nt 25 (Fig. 10). In addition, the ends of the monomer RF (extended form) at the right-hand end are longer than the viral genome DNA by about 18 nt, indicating nicking in this region is inboard of the genomic end by some 18 nt. Since the modified rolling hairpin model was proposed, it has become firmly established that NS-1 is covalently bound to the 5' end of monomer RF DNA as well as viral genomic DNA (38) and in fact can be detected on the exterior of newly assembled virus particles linked to the DNA genome (16).

The next obvious step to understanding how the MVM genome is replicated was to study resolution of the 5' to 5' and 3' to 3' dimer bridge structures. The first breakthrough came with studies of Cotmore and Tattersall (100), who cloned MVM DNA spanning both the right end to right end (viral 5' to 5') and left end to left end (viral 3' to 3') fusions into plasmids. We refer to the former as a right-end-dimer bridge clone and the latter as a left-end-dimer bridge clone. The left-end-dimer bridge structure is located within the central region of the dimer RF molecule (Fig. 3), and the right-end-dimer bridge structure would be found in tetrameric and higher concatemers readily observed during replication of MVM (101) and other parvoviruses, such as AAV-2 (102).
FIG. 10. Resolution of the 3'-3' bridge dimer according to the modified rolling hairpin model for MVM DNA replication. (A) The left end of the MVM genome contains a Y-shaped hairpin structure. This sequence is an imperfect palindrome. The major asymmetries are a bubble at nt 25 within the stem of the hairpin and two "arms" of differing lengths and nucleotide sequence (see Fig. 1 for the complete sequence). When the genome undergoes replication, the left-hand hairpin is located in the center of the dimer RF in an extended form (Fig. 3, line 4). The bubble and length of the arms allow one to distinguish between the two halves of the extended left-end hairpin, and these are designated as the "A" half and "B" half. In this diagram, the arms at the loop end of the hairpin are folded out from the duplex molecule to emphasize the asymmetry. According to the modified rolling hairpin model, a nick is introduced on the A half as indicated. Strand extension synthesis displaces the lower strand. A second nick and ligation result in the B half of the extended 3'-3' bridge dimer being resolved into a covalently closed end, whereas the A half of the bridge dimer is in an extended configuration. In addition, NS-1 (●) is covalently linked to the 5' end of the A half. Both Cotmore et al. (77) and Liu et al. (78) established that the products predicted by this model are observed; however, subsequent data establish that the mechanism of resolution is incorrect in that the initial nick likely occurs on the B half at site #2 (see Section II,G for further discussion). (B) An in vitro resolution assay of the 3'-3' bridge dimer clone, pQLDB1, demonstrates that this plasmid is resolved in the presence of NS-1 into a B half turnaround form (fragment 3) and an A half extended form (fragment 2). The presence of the turnaround or extended configuration was established using 2-D neutral/alkaline agarose gel electrophoresis. In addition, low levels of turnaround end form for both halves of the dimer bridge (bands 3 and 6) occur, likely due to recombination across the dimer bridge. Also present are low levels of extended form of the B half (fragment 7), indicating that (according to the resolution mechanism proposed in the modified rolling hairpin model) the initial nick can occur (~10% of the time) on the B half. (C) A series of plasmids with mutations in the A half, designed to reduce resolution (altered bubble sequence, altered nick site sequence), were tested in the resolution assay. Surprisingly, all the mutants retained much of their activity; however, the amount of each product was altered relative to wild type, pQLDB1. The identities of the bands are as follows: 1, plasmid DNA; 5, unresolved dimer bridge fragment; 2, extended form of A half; 3, turnaround form of B half; 6, turnaround form of A half; 7, extended form of B half; 4, small plasmid DNA fragment.
When either of these circular plasmids is transfected into murine cells and superinfected with MVM (source of NS-1), the plasmids are resolved and replicated as linear molecules with two 5' (right ends) or two 3' (left ends) at the termini. In addition, the ends of the products of the right-end-dimer bridge are predominantly in the extended form (molecules with NS-1 covalently attached to the 5' ends). In contrast, the ends of the resolution products of the left-end-dimer bridge were both extended (with NS-1 covalently attached to the molecular 5' end hydroxyl) and in a turnaround form (with, of course, no NS-1 attached) as illustrated:

[Diagram: resolution products of the right-end-dimer bridge and the left-end-dimer bridge]
The head-to-head arrows indicate the dimer bridge. The black circles indicate NS-1 protein covalently attached to the 5' ends of the DNA. This observation of the presence of extended and turnaround forms of the left-end hairpin was consistent with that predicted by the modified rolling hairpin model, and was observed a number of years ago with MVM infections (103), with the transfected pPTLR plasmid (45), and more recently in a careful study of MVM DNA intermediates in infected synchronized mouse cells (54). Although in this initial publication on the resolution of bridge dimers it was not possible to distinguish between asymmetric or symmetric resolution of the left-end bridge dimer (100), it was not long before in vitro resolution provided evidence for asymmetric resolution (104, 105) (see Section II,F).
E. In Vitro Resolution of the 5'-5' Bridge Dimer

Knowing that both the left-hand end and right-hand end bridge dimer constructions could be resolved in vivo (100), the next obvious approach was to observe this phenomenon in vitro. Cotmore et al. expressed recombinant NS-1 in HeLa cells using vaccinia virus vectors (106). Nuclear extracts containing NS-1 were prepared essentially as was done for in vitro replication of the SV-40 genome (107) and incubated with clones of the right-end bridge dimer (106). The products of the resolution of the circular plasmid pREB1412 [the same 5'-5' bridge dimer clone that is resolved in vivo (106)] were characterized. In this in vitro reaction, it was apparent that resolution occurred in the presence of little net DNA synthesis; hence, the products were labeled by a combination of strand extension and nick translation. It was observed that both the extended forms and the turnaround forms of each arm are detected (i.e., resolution of the right-end-dimer bridge structure is symmetric) and that NS-1 can be shown to be attached covalently to the 5' ends of the extended form. Lower levels of the turnaround forms from both arms were also generated in the absence of NS-1 and likely arise due to recombination across the palindrome.
In summary, these studies show that there is a functional origin of DNA replication within the 5'-5' bridge dimer and in the presence of wild-type NS-1; presumably this is resolved by symmetric nicking, strand extension, and religation to generate both extended and turnaround forms. The products of the resolution reaction were characterized by analysis on two-dimensional neutral/alkaline agarose gels. Although the extent of resolution in the presence of recombinant NS-1 was relatively low (5%), it seems very likely that this reaction is a reflection of in vivo events. Evidence that sequences within or near the 5' hairpin are required for resolution was obtained when a derivative of pREB1412 lacking the central 296-bp sequence (the entire extended 5' hairpin) was shown not to be resolved in the presence of NS-1.
F. In Vitro Resolution of the 3'-3' Bridge Dimer

Two laboratories have studied in vitro resolution of the 3'-3' bridge dimer (BD).⁴ Cotmore et al. (104) observed asymmetric resolution of the BD, in vitro, again in the presence of NS-1 expressed in HeLa cells and LA9 cells. The resolution mechanism as proposed by the modified rolling hairpin model is summarized in Fig. 10A. This model predicts that the initial nick site is asymmetric, occurring on the GAA/CTT arm of the BD. This arm has been referred to as the "A" arm (104), the other as the "B" arm. Because we have already referred to the inverted repeats at the loop end of the terminal palindromes as the "arms" of the hairpin (Fig. 1A), we will refer to each half of the bridge dimer as the A half and B half, corresponding to the A arm and B arm, respectively (104). When the resolution products of the 3'-3' bridge dimer were characterized, in much the same way as for the 5'-5' bridge dimer, it was clear that the A half was predominantly in the extended form and the B half was predominantly in the turnaround form.

Essentially identical results were obtained in our laboratory (105) (Fig. 10B). We used a BD clone, pQLDB1, assembled using synthetic oligonucleotides.⁵ Our source of NS-1 was a crude extract of insect cells infected with a recombinant baculovirus. We also observed that the BD clone is converted into a linear form (with two turnaround form ends) in the absence of NS-1, presumably due to recombination across the palindrome. Using an LA9 cell extract, the level of recombination seems to be somewhat lower (105) than when HeLa extracts are used (104). Also, the extent of resolution in the presence of NS-1 is estimated to be ~10%. In later experiments, in which the amount of substrate pQLDB1 plasmid was reduced to &th, the level of resolution was much higher (30-40%) [see Fig. 8 in reference (105)]. Once again, the products of the in vitro resolution assays using the baculovirus recombinant NS-1 are consistent with the modified rolling hairpin model for MVM replication. What is also apparent from these studies is that although most of the products are consistent with the first nick occurring in the A half, extended and turnaround forms consistent with the first nick occurring in the B half are detected, although at a much lower level.

In further studies, a series of mutant DB clones were constructed (using mutant oligonucleotides to construct the DB). The mutants were designed to change sequences within the A half and initial nick sequence in order to decrease resolution. In each case, the mutant dimer bridge underwent resolution and the frequency of extended and turnaround forms varied, but none of these mutations blocked resolution completely (105).

⁴ In subsequent discussions, the 3'-3' bridge dimer is referred to as simply the bridge dimer, or BD.
⁵ We used this approach to make the BD because it was readily adaptable to combining oligonucleotides that contained mutations.
G. In Vitro Nicking Assays Show That NS-1 Nicks the B Half and Not the A Half of the 3’-3’ Bridge Dimer

In the modified rolling hairpin model the first nick is introduced within the GAA/CTT arm (A half) of the BD. Cotmore and Tattersall cloned half-dimer bridge molecules and tested these as a substrate for nicking by recombinant NS-1 (108). Surprisingly, the A half (plasmid pGAA) containing the GAA/CTT sequence was completely inactive in this nicking assay, but the B half (plasmid pTC) was active. Evidence was obtained that the pTC plasmid could be immunoprecipitated with anti-NS-1 antibody (i.e., the NS-1 protein was covalently attached to this DNA), but no DNA was precipitated when the pGAA plasmid was used. Using PCR, the regions of the dimer bridge in the pTC and pGAA plasmids were reduced to just the stem portion (i.e., the arm regions were eliminated). Again, the pTC derivative (pL1-2TC) was active in this assay but the opposite arm clone was inactive. These studies narrowed the origin activity to ~50 bp of the stem of the B half of the bridge dimer, extending from 7 bp to the left of the nick site to 43 bp to the right, including an ATF consensus sequence (Fig. 11). Although the resolution products of the bridge dimer are consistent with the modified rolling hairpin model, the results of the in vitro nicking assay indicate clearly that the mechanism proposed in this model is incorrect and the actual mechanism is likely more complicated (108).

To map the precise nicking site for NS-1, Cotmore and Tattersall used an end-labeled fragment from pL1-2TC and found that this fragment is nicked (in the presence of an extract containing NS-1). The nicked products were immunoprecipitated (anti-NS-1 antibody) and the size of the DNA fragment was determined. A series of bands were detected that corresponded to the predicted region of nicking (Fig. 11):

    CTTATCA

This sequence conforms to the sequence CTWWTCA (W = A or T), which is also located at the site of nicking at the 5’ end of the genome. NS-1 is known to be joined to the adenine residue at nt 5170 (16).

    CTATTCA
    |    |
    5172 5167

[Figure 11 diagram: labels NF-Y binding; cellular binding protein (CBP)? parvovirus initiation factor (PIF)?; origin site.]

FIG. 11. Summary of the origin region of the B half of the 3’-3’ bridge dimer of MVM. The B half of the 3’-3’ bridge dimer, which has been shown to be nicked by NS-1 (108; Q. Liu and C. R. Astell, unpublished results), is illustrated. The origin site spans ~50 nt extending from 7 nt to the left of the nick site toward the ATF consensus sequence. The sequence heterogeneity unique to the B half of the dimer bridge and originating due to the bubble at nt 25 is indicated. The region of the sequence protected by binding of NS-1 is shaded. Below the diagram the nucleotide sequence of this region is indicated. NS-1 is believed to bind to a core consensus sequence (ACCA) (108). The probable binding sites for PIF (109) and CBP (Q. Liu and C. R. Astell, unpublished results) are also indicated.
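The statement that both nick-site heptamers conform to CTWWTCA can be checked mechanically. The following is a minimal sketch, assuming the standard IUPAC reading of W as "A or T" and reading the two sites as CTTATCA (B half of the 3’-3’ bridge dimer) and CTATTCA (5’ end of the genome). The function names are illustrative only and do not come from any of the cited studies.

```python
import re

# Subset of the IUPAC nucleotide codes; W means "A or T".
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif such as CTWWTCA into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def conforms(site: str, motif: str) -> bool:
    """Return True if the whole site matches the motif, position by position."""
    return motif_to_regex(motif).fullmatch(site.upper()) is not None


NICK_MOTIF = "CTWWTCA"

# The two nick-site sequences quoted in the text.
for site in ("CTTATCA", "CTATTCA"):
    print(site, conforms(site, NICK_MOTIF))
```

Because `fullmatch` is used, a site that is too short or that carries G or C at either W position is rejected, which is the behavior one would want when screening mutant origin sequences against the consensus.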
Further studies using a series of mutations within the arm demonstrated that deletion of sequence to the left of the nick site (see Fig. 11) reduced origin activity. However, mutations to the right of the nick site did not significantly alter origin activity, provided the spacing between the bubble site and nick site is maintained. [Note: In the A half of the bridge dimer, this spacing is displaced by 1 nt due to the triplet GAA sequence, rather than the doublet TC sequence found in the B half, which arises due to the unpaired “bubble” at nt 25 (Fig. 1A).]

We have confirmed that the B half of the 3’-3’ bridge dimer is nicked by NS-1, whereas the A half is not. In addition, purified NS-1 by itself is incapable of nicking the substrate, but requires the nuclear extract from LA9 cells to carry out this reaction (Q. Liu and C. R. Astell, unpublished results). This implies that a cellular protein is required in the recognition and/or nicking process. We have also constructed a series of mutations within this region. Our data suggest that although an ATF⁶ site is included in the origin sequence, it is unlikely that the ATF protein plays a role in the nicking reaction, and we have evidence that another cellular binding protein (CBP) binds to this region of the B arm (Q. Liu and C. R. Astell, unpublished results). Similar results have been observed with a partially purified 120-kDa cellular protein (called PIF, for parvovirus initiation factor) that facilitates nicking of the B half of the bridge dimer by purified NS-1 (109).
H. Binding of NS-1 to the Origin Region

For several years, one of the mysteries of MVM replication has been that it has been difficult to demonstrate convincingly that the NS-1 protein binds to the terminal region of the genome, yet the equivalent protein from AAV-2, the REP 68 (and REP 78) protein, binds to the AAV-2 origin and nicks it. Other cellular proteins do bind to this region, which includes the upstream region of the P4 promoter (87, 108, 110). However, using a modified procedure to detect DNA-protein complexes, it has been shown that NS-1 does bind to the 3’ replication origin. In these studies, NS-1 was obtained by in vitro transcription-translation as well as by using recombinant vaccinia and baculoviruses (expressing a His-tagged NS-1 fusion protein). When the recombinant NS-1 proteins were incubated with plasmid DNA containing the origin sequence, the DNA could be immunoprecipitated. These precipitations occurred with anti-NS-1 antibody directed against the N- or C-terminal region, whereas anti-NS-1 directed against the middle region of the protein was unable to precipitate the hairpin DNA sequences (108).

The left-hand region of the MVM genome has also been shown to contain sequences that function in regulating the P4 promoter. In addition to a TATA box (nt 175) and GC-rich (nt 158) SP1 consensus sequences (63), other consensus sequences have been identified within the hairpin itself, including two NF-Y⁷ sites and an E box⁸ (USF binding sites). These consensus sequences occur in two copies in the left hairpin extended form of monomer RF DNA, presumably the major template for transcription. NF-Y binds to a modified consensus sequence [CCAAC rather than CCAAT (110)] that overlaps the NS-1 binding site (108).

⁶ The ATF site is the consensus sequence recognized by ATF transcription factors (activating transcription factors) (105).
I. Identification of Amino-acid Residues in NS-1 Essential for Nicking MVM DNA

The role of NS-1 in site-specific nicking implies that specific residues within the NS-1 protein are involved. Early studies indicated that a protein is attached to the 5’ end of MVM DNA through a linkage with the chemical stability of a tyrosine phosphodiester bond (rather than serine or threonine) (37). Subsequently, it was firmly established that the protein coupled to MVM is indeed NS-1 (38), and limited proteolysis showed that the covalent link between NS-1 and MVM DNA is localized to the N-terminal 280 residues (16). Experiments were carried out in which tyrosine residues were changed to phenylalanine and the resulting mutants were examined for their ability to support replication of a replication-competent subgenomic MVM genome (111). A number of the mutant NS-1 proteins supported replication in trans, but several did not. The replacement of tyrosine residues at aa 188, 197, or 210 with phenylalanine yielded protein inactive in replication. These residues were also of interest because others had shown that proteins involved in the rolling-circle DNA replication mechanisms contain a motif (the rolling-circle replication, or RCR, motif⁹), which is also found in the NS-1 proteins of parvoviruses, and in the case of the MVM NS-1 protein, this motif includes Tyr-188, -197, and -210 (112).

There have been two investigations of the role of the tyrosine residues at 188, 197, and 210 in MVM replication. In one, a Tyr-210 → Phe mutation blocked nicking of the B half of the bridge dimer, whereas similar mutations at 188 and 197 did not (113). However, Y188 and Y197 mutant proteins are able to nick (although less efficiently than wild-type NS-1) and become covalently attached to the DNA under low salt concentration conditions [5 mM KCl (113)]. Other studies from this group show that NS-1 binds specifically to the origin between 50 and 100 mM salt but that at low salt concentrations, the reaction is nonspecific (88). A H129R mutant [histidine within the HuHuuu consensus sequence (112)] was also inactive in nicking. Somewhat surprisingly, none of the tyrosine mutants resolved the 3’-3’ bridge dimer (113). The conclusions of these studies are that Tyr-210 and His-129 are essential for nicking and covalent attachment of NS-1 to the viral origin within the 3’-3’ bridge dimer.

In similar studies, each of the Tyr-188, -197, and -210 mutants was expressed in insect cells (using recombinant baculoviruses) (Q. Liu, M. Skiadopoulos, E. A. Faust and C. R. Astell, unpublished results). The mutant NS-1 proteins were tested for their ability to resolve the 3’-3’ bridge dimer, and we found that the Y210F mutant is inactive. We also observed that the Y188F mutant is nonfunctional; however, Y197F is active. At the same time, we have not yet observed nicking of the half-dimer bridge (B half) with any of these mutant proteins, although our wild-type NS-1 protein is active (Q. Liu and C. R. Astell, unpublished results). Hence, the tyrosine at 210 is important in resolution of the 3’-3’ dimer bridge. Currently we also believe the tyrosine at 188 may have a role to play. The mechanism for nicking and ligation of φX174 by the cisA protein involves two tyrosine residues (114). Whether this is the case for MVM and NS-1 will require further experiments.

A map of the putative functional domains of NS-1 summarizing the locations of the replication functions described in this review, as well as other regions important in the MVM replication cycle, is shown in Fig. 12.

⁷ The NF-Y site is the consensus sequence recognized by nuclear factor Y (110).
⁸ The E box is the consensus sequence recognized by USF (upstream stimulatory factor) (110).
⁹ The RCR motif consists of three amino-acid sequences, of which two are present in parvovirus NS-1 proteins. One is a YxxK consensus sequence (x = any aa) and the second conforms to HuHuuu (u = hydrophobic aa), predicted to be involved in metal binding (112).
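The two RCR motif elements described in this section, YxxK (x = any aa) and HuHuuu (u = hydrophobic aa), can be located in a protein sequence with a simple pattern search. The sketch below makes two loud assumptions: the set of residues counted as hydrophobic is one common choice and is not the definition used in reference (112), and the peptide `toy` is invented for illustration and is not the MVM NS-1 sequence.

```python
import re

# Assumption: the text only says "u = hydrophobic aa"; this residue set is
# one common choice, not the definition used in reference (112).
HYDROPHOBIC = "AVLIMFWC"

# YxxK: tyrosine, any two residues, lysine.
YXXK = re.compile(r"Y..K")
# HuHuuu: His, hydrophobic, His, then three hydrophobic residues.
HUHUUU = re.compile(rf"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{3}}")


def find_motifs(protein: str):
    """Return (motif name, 1-based start position, matched residues) tuples."""
    seq = protein.upper()
    hits = []
    for name, pattern in (("YxxK", YXXK), ("HuHuuu", HUHUUU)):
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical peptide, invented for this example only.
toy = "MSGHVHLLIAGSDYFLKEE"
print(find_motifs(toy))
# → [('HuHuuu', 4, 'HVHLLI'), ('YxxK', 14, 'YFLK')]
```

Positions are reported 1-based so that they can be compared directly with residue numbers such as His-129 or Tyr-210. A search of this kind only flags candidate residues; which tyrosine actually forms the covalent link has to be settled by mutagenesis of the kind described above.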
III. Summary and Future Directions

The modified rolling hairpin model for MVM replication was proposed some years ago (32). This model introduced the idea that resolution of the central bridge dimer involves an asymmetric procedure that could explain how a unique sequence was retained at the left-hand end of the MVM genome. It was predicted that the sequence asymmetry resulting from the “bubble” at nt 25 in the left hairpin is the basis for this asymmetric resolution. In addition, it was predicted that NS-1 plays an important role in the resolution mechanism.

Where are we now? Clearly, NS-1 does play an important role in resolution of both the 3’-3’ and 5’-5’ bridge dimer structures. It acts as a site-specific “nickase” on both the 5’-5’ BD and the 3’-3’ BD and resolves these structures. For the 3’-3’ BD, initially, NS-1 is covalently attached to sequences from the B half but, in a complete resolution reaction, ends up being transferred to the 5’ end of the extended A half. Although the final products are consistent with the modified rolling hairpin model, the mechanism for resolution remains elusive.
[Figure 12 map: functional domains drawn along the 672-residue NS-1 sequence (scale marked at 100, 200, 300, 400, 500, 600, and 672). Legible labels: (a) oligomerization and protein:protein interaction; oligomerization for co-nuclear translocation; (e) NTP binding, ATPase, helicase; (f) transactivation.]

FIG. 12. Preliminary map of putative functional domains of NS-1. Map data (a-f) were derived from the following sources: (a) C. E. Harris and C. R. Astell, unpublished observations; (b) Nüesch and Tattersall (91); (c) Ilyina and Koonin (112), Nüesch et al. (113), and Q. Liu and C. R. Astell, unpublished observations; (d) Nüesch and Tattersall (91); (e) Wilson et al. (75), Astell et al. (82), and Jindal et al. (94); (f) Doerig et al. (85) and C. E. Harris and C. R. Astell, unpublished observations.
Where do we go from here? It seems likely that NS-1 will interact with cellular proteins to effect replication (resolution) of the ends of the MVM genome. It is important to identify these proteins and understand their function(s). One group has begun to purify cellular proteins that interact with the terminal sequences and has identified a 120-kDa protein (parvovirus initiation factor) (109). Also, NF-Y has been shown to bind to the same region, and USF to an adjacent sequence (110). In addition, we have cloned two cellular cDNAs encoding proteins that interact with the N-terminal half of NS-1 (C. E. Harris and C. R. Astell, unpublished results). Similar studies by another group have isolated a cDNA for a protein that interacts with NS-1 from the closely related H1 parvovirus (115). The functions of all three proteins are currently unknown. Within the next year, it should be possible to determine the roles of these cellular proteins in replication. Eventually, it should be possible to achieve in vitro replication of MVM using purified cellular polymerase(s) and accessory proteins plus NS-1. There are further questions to be asked: What residues on NS-1 are phosphorylated, and do they modulate the function of this protein? Finally, it would be interesting to know how NS-1 interacts with cellular components to exert its remarkable cytotoxic effect (116). Clearly, elucidation of replication of this small virus has proved to be a major challenge and many important questions remain to be answered.

ACKNOWLEDGMENTS

The work described in this review was supported by grants from the British Columbia Health Research Foundation and the Medical Research Council of Canada. PT was the recipient of an MRC studentship and JB is the recipient of an NSERC postgraduate scholarship. The authors are indebted to J. Rommelaere and Sue Cotmore for sending preprints of their work prior to publication.

The senior author (CRA) acknowledges past collaborations with David Ward, Peter Tattersall and Sue Cotmore and their many students and postdocs. The early days of MVM molecular biology will always be fondly remembered. We also acknowledge more recent collaborations with E. A. Faust and students in his laboratory. Last but not least, we thank Sharon Krowchuk for typing this manuscript and preparing many of the figures.
REFERENCES

1. K. I. Berns, in “Fields Virology” (B. N. Fields and D. M. Knipe, eds.), p. 1743. Raven Press, New York, 1990.
2. C. R. Pringle, Arch. Virol. 133, 491 (1993).
3. M. G. O’Sullivan, D. C. Anderson, J. D. Fikes, F. T. Bain, C. S. Carlson, S. W. Green, N. S. Young and K. E. Brown, J. Clin. Invest. 93, 1571 (1994).
4. G. Siegl, R. C. Bates, K. I. Berns, B. J. Carter, D. C. Kelly, E. Kurstak and P. Tattersall, Intervirology 23, 61 (1985).
5. H. W. Toolan, Science 131, 1446 (1960).
6. L. Kilham and G. Margolis, Virology 13, 141 (1961).
7. L. Kilham, Prog. Med. Virol. 20, 113 (1975).
8. S. F. Cotmore and P. Tattersall, Adv. Virus Res. 33, 91 (1987).
9. J. Bergeron, J. Menezes and P. Tijssen, Virology 197, 86 (1993).
10. J. Bergeron, B. Hébert and P. Tijssen, J. Virol. 70, 2508 (1996).
11. U. Truyen, A. F. Chang, B. Obermaier, P. Vajalainen and C. Parrish, J. Virol. 69, 4702 (1995).
12. E. M. Gardiner and P. Tattersall, J. Virol. 62, 1713 (1988).
13. L. J. Ball-Goodrich and P. Tattersall, J. Virol. 66, 3415 (1992).
14. E. M. Gardiner and P. Tattersall, J. Virol. 62, 2605 (1988).
15. P. Tattersall, in “Replication of Mammalian Parvoviruses” (D. C. Ward and P. Tattersall, eds.), p. 53. CSHLab, CSH, NY, 1978.
16. S. F. Cotmore and P. Tattersall, J. Virol. 63, 3902 (1989).
17. J. Tsao, M. Chapman, M. Agbandje, W. Keller, K. Smith, H. Wu, M. Luo, T. Smith, M. Rossmann, R. Compans and C. Parrish, Science 251, 1456 (1991).
18. M. Agbandje, S. Kajigaya, R. McKenna, M. G. Rossmann and N. S. Young, Virology 203, 106 (1994).
19. M. Agbandje, R. McKenna, M. Rossmann, S. Kajigaya and N. S. Young, Virology 192, 121 (1991).
20. M. Agbandje, A. L. Llamas-Saiz, W. R. Wikoff, M. Rossmann, J. Bratton and P. Tattersall, Abstr. Parvovirus Workshop, 6th, Montpellier, France, S2#2, p. 6 (1995).
21. M. S. Chapman and M. G. Rossmann, Virology 194, 491 (1993).
22. K. I. Berns and S. Adler, J. Virol. 9, 394 (1972).
23. S. F. Cotmore and P. Tattersall, Science 226, 1161 (1984).
24. R. C. Bates, C. E. Snyder, P. T. Banerjee and M. Sankar, J. Virol. 49, 319 (1984).
25. A. K. Saemundsen, cited in R. C. Bates, C. E. Snyder, P. T. Banerjee and S. Mitra, J. Virol. 49, 319 (1984).
26. R. Armentrout, R. Bates, K. Berns, B. Carter, M. Chow, D. Dressler, K. Fife, W. Hauswirth, G. Hayward, G. Lavelle, S. Rhode, S. Straus, P. Tattersall and D. Ward, in “Replication of Mammalian Parvoviruses” (D. Ward and P. Tattersall, eds.), p. 523. CSHLab, CSH, NY, 1979.
27. C. R. Astell, M. Smith, M. B. Chow and D. C. Ward, Cell 17, 91 (1979).
28. V. Deiss, J. D. Tratschin, M. Weitz and G. Siegl, Virology 175, 247 (1990).
29. E. Lusby, K. H. Fife and K. I. Berns, J. Virol. 34, 402 (1980).
30. R. O. Shade, M. C. Blundell, S. F. Cotmore, P. Tattersall and C. R. Astell, J. Virol. 58, 921 (1986).
31. C. R. Astell, M. B. Chow and D. C. Ward, CSHSQB 47, 751 (1983).
32. C. R. Astell, M. B. Chow and D. C. Ward, J. Virol. 54, 171 (1985).
33. C. R. Astell, in “Handbook of Parvoviruses” (P. Tijssen, ed.), p. 59. CRC Press, Boca Raton, FL, 1990.
34. T. Cavalier-Smith, Nature 250, 647 (1974).
35. E. Lusby, R. Bohenzky and K. I. Berns, J. Virol. 37, 1083 (1981).
36. S. Eisenberg, J. Griffiths and A. Kornberg, PNAS 74, 3198 (1977).
37. M. Chow, J. W. Bodnar, M. Polvino-Bodnar and D. C. Ward, J. Virol. 57, 1094 (1986).
38. S. F. Cotmore and P. Tattersall, J. Virol. 62, 851 (1988).
39. E. A. Faust and D. C. Ward, J. Virol. 32, 276 (1979).
40. M. J. Merchlinsky, P. J. Tattersall, J. J. Leary, S. F. Cotmore, E. M. Gardiner and D. C. Ward, J. Virol. 47, 227 (1983).
41. R. Boissy and C. R. Astell, Gene 35, 179 (1985).
42. D. R. F. Leach and F. Stahl, Nature 305, 448 (1983).
43. R. Salvino, M. Skiadopoulos, E. A. Faust, P. Tam, R. O. Shade and C. R. Astell, J. Virol. 65, 1353 (1991).
44. S. Lacks and B. Greenberg, JMB 114, 153 (1977).
45. P. Tam and C. R. Astell, Virology 193, 812 (1993).
46. C. R. Astell, E. M. Gardiner and P. Tattersall, J. Virol. 57, 656 (1986).
47. R. Sahli, G. K. McMaster and B. Hirt, NARes 13, 3617 (1985).
48. R. B. Lefebvre, S. Riva and K. I. Berns, MCBiol 4, 1416 (1984).
49. P. Tam and C. R. Astell, J. Virol. 68, 2840 (1994).
50. G. Tullis, R. V. Schonberg and D. J. Pintel, J. Gen. Virol. 75, 1633 (1994).
51. P. Tam, Ph.D. Dissertation, University of British Columbia, Vancouver, B.C. (1994).
52. K. Willwand and B. Hirt, J. Virol. 67, 5660 (1993).
53. K. Willwand and B. Hirt, J. Virol. 65, 4629 (1991).
54. G. E. Tullis, L. R. Burger and D. J. Pintel, J. Virol. 67, 131 (1993).
55. C. R. Astell, M. Thomson, M. Merchlinsky and D. C. Ward, NARes 11, 999 (1983).
56. K. Ozawa, J. Ayub, Y. S. Hao, G. Kurtzman, T. Shimada and N. Young, J. Virol. 61, 2395 (1987).
57. M. C. Blundell, C. Beard and C. R. Astell, Virology 157, 534 (1987).
58. L. E. Via and M. Lederman, Abstr. Parvovirus Workshop, 6th, Montpellier, France, P5#13, p. 17 (1995).
59. D. Pintel, D. Dadachanji, C. R. Astell and D. C. Ward, NARes 11, 1019 (1983).
60. C. A. Laughlin, H. Westphal and B. J. Carter, PNAS 76, 5566 (1979).
61. M. R. Green and R. G. Roeder, J. Virol. 36, 79 (1980).
62. K. E. Brown, N. S. Young and J. M. Liu, CRC Crit. Rev. Oncol./Hematol. 16, 1 (1994).
63. J. K. Ahn, B. J. Gavin, G. Kumar and D. C. Ward, J. Virol. 63, 5425 (1989).
64. E. Ben Asher and Y. Aloni, J. Virol. 52, 266 (1984).
65. K. E. Clemens and D. Pintel, Virology 160, 511 (1987).
66. C. V. Jongeneel, R. Sahli, G. K. McMaster and B. Hirt, J. Virol. 59, 564 (1986).
67. W. R. Morgan and D. C. Ward, J. Virol. 60, 1170 (1986).
68. S. F. Cotmore and P. Tattersall, Virology 177, 477 (1990).
69. L. K. Naeger, J. Cater and D. J. Pintel, J. Virol. 64, 6166 (1990).
70. L. K. Naeger, N. Salome and D. J. Pintel, J. Virol. 67, 1034 (1993).
71. J. E. Cater and D. J. Pintel, J. Gen. Virol. 73, 1839 (1992).
72. D. G. Brownstein, A. L. Smith, E. A. Johnson, D. J. Pintel, L. K. Naeger and P. Tattersall, J. Virol. 66, 3118 (1992).
73. C. Legrand, J. Rommelaere and P. Caillet-Fauquet, Virology 195, 149 (1993).
74. S. Cotmore, R. Gottlieb, A. D’Ambramo, J. Bratten and P. Tattersall, Abstr. Parvovirus Workshop, 6th, Montpellier, France, S2#4, p. 8 (1995).
75. G. M. Wilson, H. K. Jindal, D. E. Yeung, W. Chen and C. R. Astell, Virology 185, 90 (1991).
76. S. F. Cotmore, J. P. F. Nüesch and P. Tattersall, Virology 190, 365 (1992).
77. S. F. Cotmore, J. P. F. Nüesch and P. Tattersall, J. Virol. 67, 1579 (1993).
78. Q. Liu, C. B. Yong and C. R. Astell, Virology 201, 251 (1993).
79. S. F. Cotmore and P. Tattersall, Virus Res. 4, 243 (1986).
80. D. E. Yeung, G. W. Brown, P. Tam, R. H. Russnak, G. Wilson, I. Clark-Lewis and C. R. Astell, Virology 181, 35 (1991).
81. J. P. F. Nüesch, S. F. Cotmore and P. Tattersall, Virology 191, 406 (1992).
82. C. R. Astell, C. D. Mol and W. F. Anderson, J. Gen. Virol. 68, 885 (1987).
83. M. Merchlinsky, Ph.D. Dissertation, Yale University, New Haven, CT (1994).
84. C. Doerig, B. Hirt, P. Beard and J.-P. Antonietti, J. Gen. Virol. 69, 2563 (1988).
85. C. Doerig, B. Hirt and J.-P. Antonietti, J. Virol. 64, 387 (1990).
86. C. Lorson, L. Burger and D. Pintel, Abstr. Parvovirus Workshop, 6th, Montpellier, France, P5#15, p. 119 (1995).
87. S. Faisst, M. Perros, L. Deleu, N. Spruyt and J. Rommelaere, Virology 202, 466 (1994).
88. J. Christensen, S. F. Cotmore and P. Tattersall, J. Virol. 69, 5422 (1995).
89. D. Legendre and J. Rommelaere, J. Virol. 68, 7974 (1994).
90. C. Harris and C. R. Astell, Abstr. Parvovirus Workshop, 5th, Crystal River, FL, P1#16 (1993).
91. J. P. F. Nüesch and P. Tattersall, Virology 196, 637 (1993).
92. D.-S. Im and N. Muzyczka, Cell 61, 447 (1990).
93. D.-S. Im and N. Muzyczka, J. Virol. 66, 1119 (1992).
94. H. K. Jindal, C. B. Yong, G. M. Wilson, P. Tam and C. R. Astell, JBC 269, 3283 (1994).
95. M. K. Bradley, T. F. Smith, R. H. Lathrop, D. M. Livingston and T. A. Webster, PNAS 84, 4026 (1987).
96. A. E. Gorbalenya, E. V. Koonin and Y. I. Wolf, FEBS Lett. 262, 145 (1990).
97. M. Skiadopoulos, R. Salvino, W. L. Leary and E. A. Faust, Virology 188, 122 (1992).
98. X. Li and S. L. Rhode, III, J. Virol. 64, 4654 (1990).
99. P. Tattersall and D. C. Ward, Nature 263, 106 (1976).
100. S. F. Cotmore and P. Tattersall, J. Virol. 66, 420 (1992).
101. D. C. Ward and D. K. Dadachanji, in “Replication of Mammalian Parvoviruses” (D. C. Ward and P. Tattersall, eds.), p. 297. CSHLab, CSH, NY, 1978.
102. S. E. Straus, E. Sebring and J. Rose, PNAS 73, 742 (1976).
103. M. B. Chow, Ph.D. Dissertation, Yale University, New Haven, CT (1981).
104. S. F. Cotmore, J. P. F. Nüesch and P. Tattersall, J. Virol. 67, 1579 (1993).
105. Q. Liu, C. B. Yong and C. R. Astell, Virology 201, 251 (1994).
106. S. F. Cotmore, J. P. F. Nüesch and P. Tattersall, Virology 190, 365 (1992).
107. B. W. Stillman and Y. Gluzman, MCBiol 5, 2051 (1985).
108. S. F. Cotmore and P. Tattersall, EMBO J. 13, 4145 (1994).
109. J. Christensen, S. F. Cotmore and P. Tattersall, Abstr. Parvovirus Workshop, 6th, Montpellier, France, S6#5, p. 41 (1995).
110. Z. Gu, S. Plaza, M. Perros, C. Cziepluch, J. Rommelaere and J. J. Cornelis, J. Virol. 69, 239 (1995).
111. M. H. Skiadopoulos and E. A. Faust, Virology 194, 509 (1993).
112. T. V. Ilyina and E. V. Koonin, NARes 20, 3279 (1992).
113. J. P. F. Nüesch, S. F. Cotmore and P. Tattersall, Virology 209, 122 (1995).
114. R. Hanai and J. C. Wang, JBC 268, 23830 (1993).
115. C. Cziepluch, E. Kordes, A. Pujol, J.-C. Jauniaux and J. Rommelaere, Abstr. Parvovirus Workshop, 6th, Montpellier, France, P4#10 (1995).
116. A. Op de Beeck, F. Anouja, S. Mousset, J. Rommelaere and P. Caillet-Fauquet, Cell Growth Differ. 6, 781 (1995).
Index
A Adenosine deaminase, murine biological roles, 196-197 gastrointestinal tract, 220-221 immune system lymphopenia prevention, 217-218 metabolic disturbances and immunodeficiency, 216, 218-219 secondary deciduum postimplantation development, 219 reproductive status, 219-220 deficient mice generation by homologous recombination, 208-209 knockout in specific tissues, 216 metabolic disturbances, 209, 211-213 expression placenta regulation, 204-206 postnatal expression, 200-201, 203 prenatal development, 198, 200 thymus regulation, 206-207, 222 tissue-specific activity, 197-198, 200, 221-222 gene model for expression, 207-208 promoter, 203-204 structure, 203-204 reaction catalyzed, 195 reconstitution in placenta effect on survival, 214, 222-223 minigene, 213 prevention of metabolic disturbances, 214-216 sequence analysis, 195-196 structure, 196 Antisense RNA, nuclear poly(ADP-ribose) polymerase effects apoptosis, 151-154 cell differentiation, 148 cell survival after DNA damage, 147148
chromatin organization, 144-145 DNA repair, 147-148 DNA replication, 148, 150-151 DNA strand-break rejoining, 146-147 endogenous mRNA transcripts, 142 genomic stability, 146 nuclear activity depletion, 138-139, 143-144 protein expression, 142-143 expression in transfected cells, 139, 141142 vectors, 139 Apoptosis, nuclear poly(ADP-ribose) polymerase role, 151-154
D DNase I, hypersensitivity of actively tran scribed genes, 227-228, 230-232
F Ferritin cytokine-responsive RNA in 5’-untranslated region, 126-127 iron-responsive element augmentation by iron-responsive mRNA open reading frame sequences, 127-129, 131 role in iron induction, 122-124, 131 iron-responsive proteins hydrogen peroxide effects, 125-126 iron response, 124 nitric oxide effects, 125 phosphorylation of IRP-1, 126 role in iron induction, 122-123 role in disease, 121
287
288
INDEX
G Global regulator, see Regulon Glutamine synthetase, see Leucine/Lrp regdon; Nitrogen regulon
H Heat-shock regulon identification of members genes, 50 proteins, 50-51 polypeptide induction by heat shock identification, 47, 49 kinetics, 47-49 sigma factor control 032, 44-47, 51, 53 a54, 53-54 temperature-sensitive mutant, 44
I Iron-responsive element augmentation by iron-responsive mRNA open reading frame sequences in ferritin, 127-129, 131 role in iron induction of ferritin, 122-124, 131 Iron-responsive proteins hydrogen peroxide effects, 125-126 iron response, 124 nitric oxide effects, 125 phosphorylation of IRP-1, 126 role in iron induction of ferritin, 122-123
L Leucine/Lrp regulon direct regulation of genes, 65-66 footprinting assays for Lrp dissociation constants, 71 indirect regulation of glutamine syntbetase, 66-67 Lrp affinity for leucine, physiological significance, 67-69 target gene expression enhancement, 73
identification, 63-65 transcriptional activation by Lrp, 69, 71-73
M Messenger RNA, half-life in bacteria, 5-6 Minute virus of mice DNA replication assay, 253-254 cis-acting sequences required for replication characterization of pPTLR minigenome, 258 internal replication sequence, 263264 internal right-end sequences, 259, 263 left-hand hairpin deletion and replication prevention, 258-259 replication of minigenomes with multiple hairpin termini, 264-266 right-hand hairpin deletion and replication prevention, 254-255, 257-258 hairpin transfer mechanism, 250-251, 280 genome structure, 247, 267 NS-1 activities, 270-271 binding to origin region, 278-279 bridge dimers, in vitro resolution 3'-3' bridge dimer, 275-276 5'-5' bridge dimer, 274-275 domains, 269 essential amino acid residues for nicking activity, 279-280 half-life, 268 nicking B half of 3'-3' bridge dimer, 276-278 nuclear targeting signal, 270 open reading frame, 268-269 phosphorylation, 282 role in replication, 250-251, 271, 273280, 282 transcriptional activation, 269-270 NS-2 carboxy termini, 267 functions, 268 half-life, 268
289
INDEX strains, 246 structure, 247
N Nitrogen regulon glutamine synthetase adenylation, 56-57 reaction catalyzed, 55 transcriptional regulation, 57 NtrA mapping of binding sites, 59 phosphorylation. 59-60 phosphorylated protein intermediates, detection, 57-58 response rate, 61-62 signal transduction pathway, 54-55, 6061, 74 two-component response regulator, 54 NS-1, see Minute virus of mice, NS-1 NS-2, see Minute virus of mice, NS-2
0
chromatin organization, 144-145 DNA repair, 147-148 DNA replication, 148, 150-151 DNA strand-break rejoining, 146-147 endogenous mRNA transcripts, 142 genoniic stability, 146 nuclear activity depletion, 138-139, 143-144 protein expression, 142-143 expression in transfected cells, 139, 141-142 vectors, 139 biological roles, 135-137, 151-154 domains, 136 inhibition of activity competitive inhibitors, 136-137 deletion mutation, 137 knockout mice, 137-138 overexpression of DNA-binding domain and trans-dominant inhibition, 138 P protein, see Ribosome stalk
R
Operon, see Regulon
P Platelet-derived growth factor biological activity, 234, 241 S 1 nuclease analysis of A-chain gene binding protein identification, 240 hypersensitivity mapping assay, 235 identification of hypersensitive sites, 235-236, 238-240 promoter, 234-235 rationale, 233-234 transcription suppression by complementary oligonucleotide, 236-238, 241 structure, 234 Poly(ADP-ribose) polymerase antisense RNA effects apoptosis, 151-154 cell differentiation, 148 cell survival after DNA damage, 147148
Regulon, see also Heat-shock regulon; Leucinellrp regulon; Nitrogen regulon continuum between regulatory proteins, 73 control of initiation in bacteria DNA masking, 8-10 RNA polymerase concentration, 7, 10-12 initiation complex formation and promoter clearance, 15-17 isomerization of closed-to-open RNA polymerase-promoter complex, 14-15 promoter binding, 12-14 defined, 2-3 global regulator advantages for cell coordinated response by large number of genes, 24 cross-regulation and regulatory integration, 25 economic model comparison, 27 improved genetic flexibility, 26
290 comparison with local regulator abundance, 17 DNA sequence specificity, 17-18 recruitment, 18 target operons, 3, 17 design features and control, 22-B regulator protein control coregulator binding, 20 covalent modification, 20 expression, 19 multimerization, 21 sequestration, 21 identification of members gene expression analysis by nucleic acid hybridization, 35-36 isolation of operon fusions to reporter genes, 28-32 polypeptide synthesis, differential rate detection by gel electrophoresis, 32-35 precautions, 36-37 integration of responses, 74-75 in vivo studies bacterial growth conditions, 41-42 DNA footprinting, 40 regulatory protein concentration determination, 37-40 stimulus-response pathway, 3-5 Ribonuclease P catalytic mechanism, 93-95 structures in various species, 88-91 substrate recognition, 91-93 tRNA precursor processing, 87-88, 91 yeast nuclear enzyme catalytic subdomain identification, 109, 111, 113 mutation affecting rRNA processing, 113-115 phylogenetic analysis, 98-99 RpRl gene expression, 97 mature domain replacements, 108 randomization mutagenesis, 108-109 structure, 95, 97 structure analysis by footprinting, 99, 101, 105 Ribosome stalk bacterial components, 157-158 cytoplasmic pool L11-like protein, 167
    P0, 166
    P1, 165-166
    P2, 165-166
  essentiality of acidic P proteins in yeast, 177, 179
  eukaryotic components and structure
    L11-like protein, 165
    P0, 164-165
    P1, 159-160, 163
    P2, 159-160, 163
  expression of acidic P proteins, regulation, 187-189
  functional exchange of heterologous acidic P proteins between species, 183-184
  P0 function
    carboxyl-terminal domain, 181-183
    essentiality for cell viability, 181
  P1/P2-P0 protein complex
    ribosomal binding, 177, 179
    stability, 168-169
    stoichiometry, 168
    structure, 167-168, 182-183
  phosphorylation of proteins
    effect on P protein function, 175-176
    kinases, 173-175, 190
    P0, 172-173
    P1, 171-173
    P2, 171-173
  P protein exchange in ribosome
    P0, 170-171
    P1, 169-170
    P2, 169-170
  protein expression pattern, role of acidic P proteins, 180-181
  ribosome activity regulation, 184, 186-187, 189
Ribozyme, see Ribonuclease P
RNA polymerase
  alternate sigma factors, 43-44
  control of initiation in bacteria
    concentration, 7, 10-12
    initiation complex formation and promoter clearance, 15-17
    isomerization of closed-to-open RNA polymerase-promoter complex, 14-15
    promoter binding, 12-14

S

S1 nuclease
  analysis of platelet-derived growth factor A-chain gene
    binding protein identification, 240
    hypersensitivity mapping assay, 235
    identification of hypersensitive sites, 235-236, 238-240
    promoter, 234-235
    rationale, 233-234
    transcription suppression by complementary oligonucleotide, 236-238, 241
  conformational heterogeneity of substrate DNA, 229-233, 241
  hypersensitivity of actively transcribed genes, 228, 231
Stimulon, defined, 3

T

Transcription
  conformational heterogeneity of DNA, 229-233, 241-242
  control of initiation in bacteria
    DNA masking, 8-10
    RNA polymerase
      concentration, 7, 10-12
      initiation complex formation and promoter clearance, 15-17
      isomerization of closed-to-open RNA polymerase-promoter complex, 14-15
      promoter binding, 12-14
  control of initiation in eukaryotes, 228-229
  nuclease hypersensitivity of actively transcribed genes, 227-228, 230-233