Progress in Nucleic Acid Research and Molecular Biology, Volume 46

Author: Waldo E. Cohn

PROGRESS IN

Nucleic Acid Research and Molecular Biology

edited by

WALDO E. COHN
Biology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee

KIVIE MOLDAVE
Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California

Volume 46

ACADEMIC PRESS, INC.
A Division of Harcourt Brace & Company
San Diego New York Boston London Sydney Tokyo Toronto

This book is printed on acid-free paper.

Copyright © 1993 by ACADEMIC PRESS, INC. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Academic Press, Inc.
1250 Sixth Avenue, San Diego, California 92101-4311

United Kingdom Edition published by
Academic Press Limited
24-28 Oval Road, London NW1 7DX

International Standard Serial Number: 0079-6603
International Standard Book Number: 0-12-540046-2

PRINTED IN THE UNITED STATES OF AMERICA
93 94 95 96 97 98 BB 9 8 7 6 5 4 3 2 1

Contents

ABBREVIATIONS AND SYMBOLS

SOME ARTICLES PLANNED FOR FUTURE VOLUMES

Adenoviral DNA Integration and Changes in DNA Methylation Patterns: A Different View of Insertional Mutagenesis
Walter Doerfler
I. Scope of Review
II. Survey of Basic Findings on Adenovirus DNA Integration
III. Uptake of Foreign (Adenoviral) DNA by Mammalian Cells
IV. On the Mechanism of Adenovirus DNA Integration
V. De Novo Methylation of Integrated Foreign DNA
VI. Alterations in Cellular Gene Expression in Adenovirus-Infected and -Transformed Cells
VII. A Different View of Insertional Mutagenesis
References

Posttranscriptional Control of the Lysogenic Pathway in Bacteriophage Lambda
Amos B. Oppenheim, Daniel Kornitzer, Shoshy Altuvia and Donald L. Court
I. λ Genes Involved in the Lysis/Lysogeny Decision
II. RNase III in Posttranscriptional Regulation of λ Genes
III. Stimulation of cII Translation by IHF
IV. Metabolic Instability of Phage Regulatory Proteins
V. Concluding Remarks
References

Global Regulation of Mitochondrial Biogenesis in Saccharomyces cerevisiae
J. H. de Winde and L. A. Grivell
I. Transcriptional Regulation and Signal Transduction in Yeast
II. Transcriptional Regulation by Oxygen
III. Differential Regulation of Gene Pairs
IV. Transcriptional Regulation by Carbon Source
V. Transcriptional Regulation under Stress Conditions
VI. A Path from Mitochondrion to Nucleus?
VII. Mitochondrial Biogenesis and the Yeast Cell Cycle
VIII. Regulation of Mitochondrial Biogenesis in Relation to Cell Growth
IX. Mitochondrial Biogenesis in Evolutionary Perspective
X. Conclusions and Prospects
References

DNA Polymerase II, the Epsilon Polymerase of Saccharomyces cerevisiae
Alan Morrison and Akio Sugino
I. Categorization and Nomenclature
II. DNA Polymerase II Structure and Activities
III. Cell Cycle Regulation
IV. Domain Structure of Catalytic Subunit
V. Genetics of 3'→5' Exonuclease
VI. The DNA Repair Polymerase
VII. Role of DNA Polymerase II in DNA Replication
References

Regulation of Bacillus subtilis Gene Expression during the Transition from Exponential Growth to Stationary Phase
Mark A. Strauch
I. The Bacillus subtilis Transition State
II. Keeping Stationary Phase Gene Expression Off: Transition-State Regulators
III. Activators and Modulators of Transition-State Gene Expression
IV. Initiating Sporulation
V. "Redundant" Control of Transiti...
VI. Concluding Remarks
References

Genomic Organization of T and W, a New Family of Double-Stranded RNAs from Saccharomyces cerevisiae
Rosa Esteban, Nieves Rodriguez-Cousiño and Luis M. Esteban
I. General Features
II. T and W Genomic Organization
III. Single-Stranded RNA Counterparts of T and W dsRNAs
IV. Configuration of W and 20S RNA
V. T and W Replication Cycles
VI. T and W Are Not Encapsidated into Viral Particles
VII. Are the RNA Polymerases Associated with the RNAs?
VIII. Relationship between T and W
IX. Evolutionary Origin of T and W
X. Conclusions and Perspectives
References

Mechanism of Action and Regulation of Protein Synthesis Initiation Factor 4E: Effects on mRNA Discrimination, Cellular Growth Rate, and Oncogenesis
Robert E. Rhoads, Swati Joshi-Barve and Carrie Rinker-Schaeffer
I. Regulation of Initiation
II. ...
III. ...
IV. ...
V. Regulation of eIF-4...
VI. Alteration of Intracellular Levels of eIF-4E
VII. Summary, Conclusions, and Future Directions
References

Enzymology of Homologous Recombination in Saccharomyces cerevisiae
W.-D. Heyer and R. D. Kolodner
I. Recombination Models
II. Physical Analysis of Recombination
III. Enzymology of Homologous Genetic Recombination in S. cerevisiae
References

INDEX

Abbreviations and Symbols

All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editors endeavor to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3) and are available in reprint form from the Office of Biochemical Nomenclature (OBN); they are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form in the frontmatter of Volumes 9-33 of this series. A recent expansion of the one-letter system (5) follows.

Symbol   Meaning                  Origin of symbol

G        G                        Guanosine
A        A                        Adenosine
T(U)     T(U)                     (ribo)Thymidine (Uridine)
C        C                        Cytidine
R        G or A                   puRine
Y        T(U) or C                pYrimidine
M        A or C                   aMino
K        G or T(U)                Keto
S        G or C                   Strong interaction (3 H-bonds)
W(b)     A or T(U)                Weak interaction (2 H-bonds)
H        A or C or T(U)           not G; H follows G in the alphabet
B        G or T(U) or C           not A; B follows A
V        G or C or A              not T (not U); V follows U
D(c)     G or A or T(U)           not C; D follows C
N        G or A or T(U) or C      aNy nucleoside (i.e., unspecified)
Q        Q                        Queuosine (nucleoside of queuine)

(a) Modified from Proc. Natl. Acad. Sci. U.S.A. 83, 4 (1986).
(b) W has been used for wyosine, the nucleoside of "base Y" (wye).
(c) D has been used for dihydrouridine (hU or H2 Urd).

Enzymes

In naming enzymes, the 1984 recommendations of the IUB Commission on Biochemical Nomenclature (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).
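As an illustration of how the one-letter nucleoside symbols tabulated earlier in this section are used in practice, the following is a minimal sketch in Python. The names `IUPAC`, `matches`, and `expand` are hypothetical and are not part of the Recommendations; the sketch assumes a DNA alphabet in which T stands for T(U), and it leaves out Q because queuosine denotes a specific modified nucleoside and not a set of alternatives.

```python
from itertools import product

# Hypothetical lookup table mirroring the one-letter system:
# each symbol maps to the set of nucleosides it may stand for.
IUPAC = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA",    # puRine
    "Y": "TC",    # pYrimidine
    "M": "AC",    # aMino
    "K": "GT",    # Keto
    "S": "GC",    # Strong interaction (3 H-bonds)
    "W": "AT",    # Weak interaction (2 H-bonds)
    "H": "ACT",   # not G
    "B": "GTC",   # not A
    "V": "GCA",   # not T
    "D": "GAT",   # not C
    "N": "GATC",  # aNy nucleoside
}


def matches(pattern, sequence):
    """Return True if each base of `sequence` is allowed by the
    symbol at the same position of `pattern`."""
    if len(pattern) != len(sequence):
        return False
    return all(
        base in IUPAC[symbol]
        for symbol, base in zip(pattern.upper(), sequence.upper())
    )


def expand(pattern):
    """List every unambiguous sequence that `pattern` can stand for."""
    choices = (IUPAC[symbol] for symbol in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]


print(matches("GANTC", "GATTC"))  # True: N allows any nucleoside
print(expand("RY"))               # ['GT', 'GC', 'AT', 'AC']
```

The number of sequences returned by `expand` grows multiplicatively with each ambiguous position, so for long patterns a position-by-position check such as `matches` is usually the more practical choice.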

REFERENCES

1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere.† General.
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.†
3. "Handbook of Biochemistry" (G. Fasman, ed.), 3rd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, 1975, Nucleic Acids, Vols. I and II, pp. 3-59. Nucleic acids.
4. "Enzyme Nomenclature" [Recommendations (1984) of the Nomenclature Committee of the IUB]. Academic Press, New York, 1984.
5. EJB 150, 1 (1985). Nucleic Acids (One-letter system).†

†Reprints available from the Office of Biochemical Nomenclature (W. E. Cohn, Director).

Abbreviations of Journal Titles

Journals                                    Abbreviations used

Annu. Rev. Biochem.                         ARB
Annu. Rev. Genet.                           ARGen
Arch. Biochem. Biophys.                     ABB
Biochem. Biophys. Res. Commun.              BBRC
Biochemistry                                Bchem
Biochem. J.                                 BJ
Biochim. Biophys. Acta                      BBA
Cold Spring Harbor                          CSH
Cold Spring Harbor Lab                      CSHLab
Cold Spring Harbor Symp. Quant. Biol.       CSHSQB
Eur. J. Biochem.                            EJB
Fed. Proc.                                  FP
Hoppe-Seyler's Z. Physiol. Chem.            ZpChem
J. Amer. Chem. Soc.                         JACS
J. Bacteriol.                               J. Bact.
J. Biol. Chem.                              JBC
J. Chem. Soc.                               JCS
J. Mol. Biol.                               JMB
J. Nat. Cancer Inst.                        JNCI
Mol. Cell. Biol.                            MCBiol
Mol. Cell. Biochem.                         MCBchem
Mol. Gen. Genet.                            MGG
Nature, New Biology                         Nature NB
Nucleic Acid Research                       NARes
Proc. Natl. Acad. Sci. U.S.A.               PNAS
Proc. Soc. Exp. Biol. Med.                  PSEBM
Progr. Nucl. Acid. Res. Mol. Biol.          This Series

Some Articles Planned for Future Volumes

Collagen Genes: Mutations Affecting Collagen Structure and Expression
W. G. COLE

mRNA Binding Proteins in Eukaryotic Cells
TOM DONAHUE AND K. GULYAS

Processing of Eukaryotic Ribosomal RNA
DUANE C. EICHLER AND NESSLY CRAIG

Signal-transducing G Proteins: Basic and Clinical Implications
C. W. EMALA, W. F. SCHWINDINGER, G. S. WARD AND M. A. LEVINE

Replication Control of Iteron-Containing Chromosomes
MARCIN FILUTOWICZ

Cellular Transcriptional Factors Involved in the Regulation of HIV Gene Expression
RICHARD GAYNOR AND C. MUCHARDT

RNA Polymerase as a Molecular Machine: The Coupling between Catalytic Function and Propagation along DNA
ALEX GOLDFARB

The Role of the 5' Untranslated Region of Eukaryotic mRNA in Translation and Its Investigation Using Antisense Technologies
MATHIAS W. HENTZE, KOSTAS PANTOPOULOS AND HANS JOHANSSON

Mammalian Aminoacyl-tRNA Synthetases
LEV L. KISSELEV

Nuclear Pre-mRNA Processing in Higher Plants
K. R. LUEHRSEN, S. TAHA AND V. WALBOT

The Regulation of Ribosomal Transcription
TOM MOSS

Nonsense-mediated mRNA Decay in Yeast
S. W. PELTZ, F. HE, E. WELCH AND A. JACOBSON

New Members of the Collagen Gene Family
TAINA PIHLAJANIEMI AND MARK REHN

Molecular Biology and Regulatory Aspects of Glycogen Biosynthesis in Bacteria
J. PREISS AND T. ROMEO

Human Mutational Spectrometry: Means and Ends
WILLIAM G. THILLY AND KONSTANTIN KHRAPKO

The Balbiani Ring Gene Family: A Multigene Family Responsible for a Tissue-Specific Function
LARS WIESLANDER

Prestalk Cell Differentiation and Movement during the Morphogenesis of Dictyostelium discoideum
J. WILLIAMS AND A. MORRISON

Diverse Mechanisms for Regulating Ribosomal Protein Synthesis
J. M. ZENGEL AND LASSE LINDAHL

Adenoviral DNA Integration and Changes in DNA Methylation Patterns: A Different View of Insertional Mutagenesis

WALTER DOERFLER
Institut für Genetik, Universität zu Köln, Köln, Germany

I. Scope of Review
II. Survey of Basic Findings on Adenovirus DNA Integration
III. Uptake of Foreign (Adenoviral) DNA by Mammalian Cells
   A. Mammalian Cells in Culture Incorporate Adenovirus DNA
   B. DNA-Protamine Complexes
   C. Conclusions
IV. On the Mechanism of Adenovirus DNA Integration
   A. Studies in a Cell-Free System from Hamster Cell Nuclei
   B. Non-homologous Recombination between Baculovirus DNA and Foreign DNA in Insect Cells
   C. Conclusions and Goals
V. De Novo Methylation of Integrated Foreign DNA
   A. Basics of DNA Methylation: A Brief Synopsis
   B. De Novo DNA Methylation, a Host Defense Mechanism
   C. Role of DNA Methylation in Viral Replication Cycles
   D. Origin and Spreading of de Novo Methylation across Integrated Adenovirus DNA
   E. Insertion of Foreign (Ad12) DNA and Changes of DNA Methylation in the Adjacent Cellular DNA Sequences
VI. Alterations in Cellular Gene Expression in Adenovirus-Infected and -Transformed Cells
VII. A Different View of Insertional Mutagenesis
References

I. Scope of Review

Mammalian cells can take up foreign DNA and insert it into their genomes. The frequency of these events, the mechanism of foreign DNA insertion, and the consequences of foreign DNA integration into an established functional genome have received limited attention. One approach to this problem has been studies on the integration of viral DNA, frequently in virus-transformed cells or in virus-induced tumor cells. It is conceivable that the mechanism of foreign DNA insertion depends on the way in which external genomes or genome fragments are introduced into the cell.

My laboratory has investigated the problem of foreign DNA insertion into the mammalian genome as well as mechanisms operative in mammalian cells that enable them to enlarge their genetic repertoire and, at the same time, control the expression of the newly acquired genetic information. Human adenoviruses have been used in this work. It is likely, but has not been proven, that the integration events after adenovirus infection and after the transfection of viral DNA follow similar pathways. To what extent can virion proteins or structures, or early viral gene products, influence the transfer of the viral genome into the permanently fixed state inside the mammalian genome? Since mammalian cells have the capacity not only to insert foreign DNA into their own genomes but also to rearrange segments of their own genome, cellular functions seem to be sufficient for foreign DNA integration. On the other hand, viral functions may be able to modify or to direct insertional recombination reactions.

In recent years, we have developed a cell-free system using nuclear extracts from hamster cells and partly purified proteins from these extracts to imitate at least certain steps in the insertional recombination reaction in in vitro experiments. In this way, it may be possible to identify the cellular and, by using extracts from adenovirus-infected cells, the viral components of the enzyme complex that is presumably responsible for foreign DNA insertion.
Although recombination events between adenovirus and cellular DNAs can be catalyzed by the partly purified cell-free system, it has so far not been possible to mimic the complete integration reaction in vitro.

Adenovirus DNA integration in many respects is akin to non-homologous recombination. Nevertheless, "patchy" homologies at the sites of integration have frequently, though not invariably, been found. Perhaps the terms "homologous" and "non-homologous recombination" are still too abstract to allow an adequate description of an obviously very versatile cellular mechanism.

The question of site-specificity of adenovirus DNA integration or lack thereof still requires further investigation. Studies on established adenovirus-transformed cell lines have revealed no evidence for specific sites of viral DNA integration. Transcriptionally active regions of the cellular genome have a higher propensity to recombine with foreign DNA, and one can argue plausibly that such regions may present the right chromatin structure for the recombination with foreign DNA. Moreover, foreign genes could gain functional advantages by insertion into chromatin structures that are in the process of active transcription. Of course, it will be prudent not to confound plausibility arguments with experimentally proven facts.

Upon infection of human cells with adenovirus type 12, the viral chromosome has frequently been found associated with chromosome 1. In this context, it is interesting to recall that the genome of adeno-associated virus (AAV) can be linked to the long arm of chromosome 19 after the infection of human cells with AAV (1). Can experimental conditions be generated that would allow foreign (viral) genomes to select certain preferred sites for their integration? Perhaps virion structure and virion-associated functions are responsible for this site-selective integration mechanism. Could it be advantageous for the viral genome to use site selection under one set of conditions and rely on host directions to many different sites under other circumstances? It is the pliability of the integration mechanisms that makes them highly interesting.

Part of the fascination with these biological problems derives from their evolutionary implications and their importance for tumor biology. It is difficult to imagine how evolution could have progressed without taking advantage of blocks of pretested genetic information, even in the form of longer stretches of genes and regulatory genetic elements, in the gradual build-up of organisms of ever-increasing functional complexities. Evolution would then have been dependent on the availability of mechanisms that help insert foreign DNA into preexisting genomes. Could the digestive tract of organisms have been the most likely portal of entry for foreign genes? Pursuing problems of tumor research, it is worth considering that the insertion of viral DNA into the host genome serves its function for the permanent fixation of the viral genome and its functional units. Moreover, insertion can lead to local mutagenesis.
Researchers on gene therapy, who must be concerned with the functionally competent insertion of foreign genes into a preexisting genome, will be preoccupied with the mechanism of insertion and with problems of how to keep inserted genes active or dependent on controllable regulatory mechanisms.

In our own research over the last decade, interest in DNA methylation has evolved from the work on adenovirus DNA integration. I realize that studies on adenovirus DNA integration in the context of DNA methylation can serve as a model only, and that this model, however interesting, will not allow us to study all the necessary niches of the very complex general problem of DNA methylation. But a paradigm experimental viral system has often opened doors and permitted us to investigate mechanisms to an extent that cellular systems cannot provide due to their inherent complexity and multifactorial interdependence.

The ease with which mammalian genomes can integrate foreign DNA may be compatible with cell survival and the undisturbed expression pattern of the cellular genome only through the action of an equally pliable host defense mechanism against the activity of foreign genes whose products could be detrimental to cell survival. Foreign (viral) DNA can become de novo methylated upon integration into the mammalian genome. The gradual spreading of DNA methylation across previously unmethylated DNA and the generation of surprisingly specific patterns of de novo methylation of DNA are characteristics of this interesting mechanism. Specific segments of a foreign genome can be spared from DNA methylation, perhaps because the activity of these segments has proved advantageous to cell survival or adaptability. I have proposed the hypothesis that the de novo methylation of foreign DNA by the DNA methyltransferase system of the host is the functionally essential corollary to the integrative potential of mammalian cells for foreign DNA. De novo methylation can thus be considered a host defense mechanism against the expression of potentially damaging foreign genes and their products (2).

It is likely that the mammalian DNA methyltransferases are an essential part of this mechanism, but can hardly be considered versatile and specific enough by themselves to account for the functional complexity of the de novo imposed patterns. There seem to be additional, hitherto unknown, components of the DNA methyltransferase system that play a crucial role in the site-specific de novo methylation of foreign recently inserted DNA and its partial silencing. There is ample experimental evidence that the sequence-specific methylation of viral and eukaryotic promoters leads to the inactivation of specific genes (for reviews, see 3-8).

The insertion of foreign DNA into an established genome has yet another, unexpected, consequence that might have far-reaching functional implications. By an unknown mechanism, the patterns of DNA methylation in the flanking sequences of the host genome can undergo alterations.
Increases and decreases have been observed in these patterns, possibly as a consequence of foreign DNA insertion. Are perturbations in the preestablished chromatin structure in a given genomic region responsible for changes in the activity and specificity of the host's DNA methyltransferase system? It has not yet been investigated to what distance from the site of foreign DNA integration these alterations in patterns of DNA methylation extend. The available evidence indicates that several hundred to about a thousand base pairs (1 kbp) can be affected in the abutting cellular genome (9). However, much longer stretches of cellular DNA could be involved. In any event, loss of DNA methylation could invoke gain of originally silenced genetic activity; an increase of DNA methylation could entail the shutdown of previously active genes. Thus, the activity patterns of large segments of the mammalian genome could be altered.

In this context, the term "insertional mutagenesis" can be redefined with a novel functionally relevant meaning. After all, the insertion of foreign (viral) DNA could still be a decisive event (10) in the causation of virus-induced tumors, as insertional mutagenesis in this way could lead to the faulty regulation of several groups of genes (11) that are involved, directly or indirectly, in growth control. Much more experimental work is required to assess the validity of these concepts. Viral model systems hold distinct advantages in the design and execution of projects in tumor biology and for evolutionary considerations.

II. Survey of Basic Findings on Adenovirus DNA Integration

The insertion of foreign (viral) DNA into an established genome raises a number of intricate problems which we have tried to investigate in the adenovirus system. Does the viral genome integrate at a limited number of specific cellular sites? What are the functional consequences for the affected cell? By what enzymatic mechanism can foreign DNA become integrated and how is its transcription regulated? By a series of experiments, some of these questions have been at least partly answered.

Although evidence for viral DNA insertion has been adduced at relatively early times after infection (10, 12), detailed studies on infected cells early after the infection with adenoviruses have not yet been performed. The bulk of the experimental evidence has been adduced from studies on adenovirus-transformed cell lines or with adenovirus type-12 (Ad12)-induced tumors or tumor cell lines (for reviews, see 13, 14). We restricted the analyses to cells obtained under these conditions because of the notion that viral functions may modify the mechanism of foreign DNA integration in a specific way. Nevertheless, it is conceivable that all functional elements for foreign DNA insertion can be provided by the mammalian host cells themselves.

In the DNA of a large number of adenovirus-transformed cells or of Ad12-induced tumor cells, the integration patterns of viral DNA have been determined by restriction and Southern blot-hybridization analyses. For the evaluation of the mode of integration of adenovirus DNA, size and position of the off-size fragments have been decisive. These fragments are characterized by the presence of either of the terminal segments of adenoviral DNA and by a position on the electropherogram that does not coincide with any of the authentic viral DNA fragments generated by the same restriction endonuclease. The off-size position is due to the covalent linkage of viral to cellular DNA and the generation of restriction fragments that consist of the terminal viral and the adjacent cellular DNA sequences. With very few exceptions, the positions and sizes of these off-size fragments differ among the many cell lines we have investigated (13).

TABLE I
JUNCTIONS BETWEEN ADENOVIRUS AND CELLULAR DNAs AT WHICH NUCLEOTIDE SEQUENCES HAVE BEEN DETERMINED(a)

No.  Cell line    Species  Transforming virus  Specific comments                                                                                       Ref.

1    CLAC1        Hamster  Ad12                Tumor cell line; left terminal Ad12 DNA junction                                                        17, 38
2    HE5          Hamster  Ad2                 Transformed cell line; right and left terminal Ad2 junctions; internal junction in deleted Ad2 genome  18, 19
3    CLAC3        Hamster  Ad12                Tumor cell line; left terminal Ad12 DNA junction                                                        16
4    SYREC        Human    Ad12                Symmetric recombinant between left terminus of Ad12 DNA and human cell DNA; packaged into virions      15, 20
5    CBA12-1-T    Mouse    Ad12                Tumor cell line; left terminal Ad12 DNA junction                                                        21
6    T1111(2)     Hamster  Ad12                Tumor cell line; left terminal Ad12 DNA junction                                                        22
7    HA12/7       Hamster  Ad12                Transformed cell line; left terminal Ad12 DNA junctions                                                 23

(a) Cellular junction sequences are transcriptionally active in Nos. 1, 2, 4, 5, 6, and 7. These data were compiled from 14.

In the cell lines and recombinants listed in Table I, the sites of linkage between viral and cellular DNAs have been cloned, and the nucleotide sequences across the sites of junction have been determined (15-23). The results of these analyses can be summarized as follows.

1. Adenovirus DNA persists in transformed or tumor cells in a covalently integrated and not in an episomal free form. The copy number of integrated viral genomes per cell can vary between 1 and >30. In general, Ad12 DNA is integrated intact or nearly intact and in an orientation colinear with that of virion DNA. In contrast, adenovirus type-2 (Ad2) DNA is inserted in a fragmented form or with internal deletions. Hamster cells are non-permissive for Ad12, but permissive for Ad2. In the generation of transformed cells, the persistence of the intact fully functional Ad2 genome may therefore be selected against in hamster cells (13, 14).

2. The nucleotide sequences at the cellular sites of integration are all different. There is no evidence for a unique cellular nucleotide sequence for viral DNA integration.

ADENOVIRUS

DNA INTEGRATION AND METHYLATION PATTEHNS

7

3. Adenovirus DNA can integrate into unique or repetitive cellular DNA. 4. Often, but not invariably, patch homologies between viral and cellular DNA, or between viral/cellular sequence stretches that replace each other as a consequence of integration, are observed at the sites of linkage. The mechanism of insertional recombination may thus be aided by, but is not dependent on, sequence homologies between the reaction partners. 5. At the sites of linkage, up to 174 viral nucleotides have been found to be deleted upon integration (17). In cell-line HA12/7, however, not a single viral nucleotide was missing at the left viral DNA terminus and its linkage site to cellular DNA (23). 6. Viral DNA integration can entail the deletion of cellular nucleotides (21) or proceed without the deletion of a single cellular nucleotide at the site of recombination (19). 7. The cellular sequences used as insertion targets for the foreign (adenoviral) DNA are transcriptionally active in all instances that we have investigated for transcriptional activity (24, 25). These sequences are transcribed also in the host cells prior to adenovirus infection or transformation. It is therefore conceivable that transcriptional activity of cellular sequences with their specific chromatin structure predisposes them for insertional recombination with foreign (viral) DNA. 8. In several instances ofadenovirus-transformed cells, the loss of the previously integrated adenovirus DNA has been documented (26-28). 9. In at least one of these Adl2-induced hamster tumor cell lines, the loss of apparently all viral genomes has been consistent with the maintenance of the oncogenic phenotype of these cells (28). Obviously, adenovirus tumorigenesis is not dependent in all cases on the continued persistence of the previously integrated viral genomes. 
These findings do not support the notion that functions encoded in the E1 region of the adenovirus genome are essential to maintain the oncogenic phenotype of adenovirus-induced tumor cells. Of course, our observations may represent a special situation. Moreover, tiny segments of viral DNA not detectable by Southern blotting may have persisted in the tumor cells that had lost the Ad12 genomes. In any event, the loss of integrated adenovirus DNA from cellular genomes has been observed repeatedly upon continuous passage of Ad12-induced hamster tumor cells in culture. Although there is, at present, no evidence for specific sites of integration of adenovirus DNA into the mammalian genome, it remains to be investigated whether certain insertion sites may be selected under specific conditions of infection or transformation. In human cells infected with Ad12, preferential association of the viral DNA with human chromosome 1 has been described both late and early after infection (29, 30). The molecular analysis of selective integration events has been initiated (31).

III. Uptake of Foreign (Adenoviral) DNA by Mammalian Cells

A. Mammalian Cells in Culture Incorporate Adenovirus DNA

When adenovirus DNA is added to human cells growing in culture, about 5-10% of the foreign DNA is taken up by the cells, and at least some of this DNA is transported into the nucleus (32). No additional pretreatment of the DNA or of the cells is required to effect this uptake. The conventional method of Ca2+-phosphate precipitation of DNA (33, 34) does not enhance the uptake of foreign DNA, but somehow promotes the expression of the incorporated DNA. The electron-micrographs in Fig. 1 document uptake and nuclear transport of 3H-labeled Ad2 DNA into human HeLa cells. The location of the viral DNA has been followed by autoradiography combined with electron-microscopy (32). The results of sedimentation analyses of the incorporated DNA provide evidence for the interpretation that a part of the adenoviral DNA ingested by the cells becomes associated in an alkali-stable form with high-molecular-weight fast-sedimenting DNA (32). It is therefore likely that the foreign DNA becomes integrated into the genome of the host cell. It is not known by what mechanism the foreign (adenoviral) DNA is transported into the cell and into the nucleus. Preformed DNA-protein complexes may be a preferred substrate for uptake and transport into human cells and their nuclei.

B. DNA-Protamine Complexes

Transfection of foreign DNA into eukaryotic cells has become an important tool in molecular biology. Based on the results of previous studies on the core structure of human adenoviruses, we have developed a novel transfection method (35). The procedure involves the in vitro reconstitution of foreign DNA of viral or other origin with the major core protein VII of Ad2 or with protamine from salmon sperm. Both proteins are rich in basic amino acids and appear to share structural features (36). The DNA-protein complexes are added directly to the medium of mammalian cells growing in culture. The in vitro formation of specific DNA-protein complexes can be assessed by band-shift analyses. Bovine serum albumin does not enter into


FIG. 1. Uptake of foreign (Ad2) DNA by human cells in culture. Electron-microscopy and autoradiography of KB cells infected with 3H-labeled Ad2 DNA. KB cells growing in monolayers were infected with 2.3 µg of 3H-labeled Ad2 DNA in 0.5 ml of Eagle’s medium. Adsorption at 37°C was allowed to proceed for 30 minutes (a), 6 hours (b), and 24 hours (c and d), respectively. In the experiment using a 24-hour adsorption period, 0.5 ml of Eagle’s medium was added 2 hours after adding DNA. At the end of the incubation period, the inoculum was removed and the cells were immediately fixed with 2 ml of 1.5% glutaraldehyde. Samples were then prepared for electron-microscopy and autoradiography. N, Nucleus; C, cytoplasm; M, mitochondrion. (Reproduced with permission from 32.)

specific complexes with DNA. Transfection of DNA-protein VII or DNA-protamine complexes results in their rapid transport into the cell nuclei. About 2-4 hours after transfection, up to 40% of the DNA added in complexes to cell cultures can be found in the nucleus, as compared with <10% of the DNA when other transfection methods are applied, or when “naked” DNA is added to cell cultures (35). DNAs transfected by the new method into mammalian or insect cells retain their characteristic restriction patterns until at least 48 hours after transfection. Supercoiled circular plasmid DNAs are converted to open circular or linear DNAs. Expression has been measured both for transiently expressed genes (chloramphenicol acetyltransferase gene, Ad2 DNA in human HeLa cells) and for genes that have been integrated into the host genome and are expressed permanently, such as the gene for neomycin


phosphotransferase in hamster BHK21 cells. Ad2 DNA or plasmid DNA preparations introduced into cells by the DNA-protamine complex method are as efficiently expressed as when transfected by the Ca2+-phosphate precipitation technique. Comparable levels of expression have been found in both transient and permanent expression (35).

C. Conclusions

From the results of the aforementioned experiments, it appears likely that the amount of foreign (adenovirus) DNA taken up by mammalian cells and the intensity of transcription of viral genes are not directly correlated. Even when foreign DNA is added to the culture medium without any pretreatment, considerable amounts of foreign DNA can reach the nucleus and become associated with cellular DNA. Nevertheless, foreign DNA transcription can remain at a level reached when much smaller amounts of DNA have been taken up. Apparently, the foreign DNA must enter the cytoplasm and subsequently the nucleus in a particular conformation, possibly complexed with specific basic proteins. In our experiments, protamine has been chosen as the complexing protein (35) because it might resemble, at least in some aspects, the arginine-rich adenovirus core protein VII. The transport of previously protamine-complexed adenovirus DNA has been markedly enhanced when compared to free DNA added to cell cultures. It is worth noting that, even in the absence of special experimental conditions, adenovirus DNA (and probably any foreign DNA) can be taken up into the nucleus and eventually into the genome of mammalian and human cells. This notion is of interest beyond the experimental realms of tumor biology, gene technology, or reverse genetics. Uptake of foreign DNA could have been an important mechanism in evolution. Cells might thus have been able to broaden selectively their genetic repertoire, provided the expression of the newly acquired foreign genes can become subject to regulation in such a way that the host cell derives functional advantages from the immigrant genes, or at least does not encounter gross detrimental consequences. It is argued in Section V,B that the selective de novo methylation of recently integrated foreign genes and their promoters could function as a highly specific defense mechanism of the recipient host cells.
It will be interesting to investigate how often foreign DNA ingested with food into the digestive tract of many organisms could in part be taken up by the cells of these organs or of other cell systems. When M13 phage DNA is pipet-fed to mice, fragments of this DNA can be identified by Southern blotting or PCR in their feces about 1-7 hours after feeding (36a). Because of the very considerable general biological and evolutionary implications, I have considered research on the uptake, integration, and controlled expression of foreign genes in mammalian cells an important


topic. In recent years, our interest has been focused on the mechanism of adenovirus (foreign) DNA integration into the mammalian genome.

IV. On the Mechanism of Adenovirus DNA Integration

A. Studies in a Cell-Free System from Hamster Cell Nuclei

1. RATIONALE FOR EXPERIMENTAL APPROACH

The results of extensive analyses on the structure of integration sites of foreign DNA in the adenovirus system (14) suggest that foreign (viral) DNA is frequently inserted into the host genome by non-homologous recombination at many different cellular sites. The existence of selective sites in specialized instances is still a possibility. Frequently, “patch homologies” at the sites of integrative recombination seem to support the integration mechanism, perhaps by stabilizing a recombination complex or a mechanism akin to strand exchange (37). The frequent occurrence of patch homologies at the sites of insertional recombination raises the question of an adequate nomenclature in homologous versus non-homologous recombination. Although I adhere to the classical designations in this context, it appears more realistic to me that (eukaryotic) recombination spans a wide gamut of possibilities and mechanisms. Totally non-homologous recombination and exchanges with perfect sequence homologies over an extensive nucleotide sequence, the prototype of homologous recombination, would then lie at the extremes of a broad scale of possible mechanisms. However, before the enzymatic details of these recombination events are elucidated, it will be futile to debate mechanisms and subtleties of nomenclature. At the core of many enzymatic reactions that involve DNA lies the recognition of specific DNA sequence motifs by specific proteins. It is likely that the formation of recombination complexes is also dependent on specific DNA-protein interactions. Of course, the sequence motifs directing non-homologous recombination are not known, and they may be quite variable, since a particular DNA structure could be more important than a certain sequence in facilitating the decisive interactions with proteins and eventually with foreign DNA.
In the design of the experimental conditions for a cell-free system to study non-homologous recombination, we reasoned that a cellular DNA sequence that had once been identified as having served as the target for the insertional recombination with foreign DNA (17) might carry sequence or structure signals important in the essential DNA-protein interactions. For our experiments, we therefore chose hamster cell sequences, termed p7 or p16, which constituted hamster cell pre-insertion sequences into which the genome of Ad12 had integrated in the Ad12-induced hamster tumor CLAC1 (17, 38) or T1111(2) (22, 39), respectively. The pBR322 plasmid-cloned p7 hamster pre-insertion sequence or the p16 pre-insertion sequence was incubated with Ad12 DNA or Ad12 DNA fragments and with nuclear extracts from hamster BHK21 cells grown in suspension culture (40-42). In this system, we tried to imitate at least some of the conditions for the recombination between Ad12 DNA and hamster cell DNA. It was obvious from the time of the conceptual initiation of this project that one would have to take a step-by-step approach to mimic the real integration reaction. Initially, only elements of the integration reaction could be realized under cell-free reaction conditions.

2. CELL-FREE RECOMBINATION BETWEEN Ad12 DNA AND A PRE-INSERTION DNA SEQUENCE FROM HAMSTER CELLS (40)

A cell-free system of nuclear extracts from BHK21 cells has been developed to catalyze recombination in vitro between DNA of Ad12 and two different hamster pre-insertion sequences. The pBR322-cloned 1768-bp fragment p7 and the 3100-bp fragment p16 from BHK21 hamster DNA had previously been identified as the pre-insertion sites corresponding to the junctions between Ad12 DNA and hamster DNA in cell line CLAC1 and in the Ad12-induced tumor T1111(2), respectively. PstI-cleaved Ad12 DNA and the circular or EcoRI-linearized p7 or p16 pre-insertion sequences were incubated with nuclear extracts. Recombinants were isolated by transfecting the DNA into recA- strains of Escherichia coli and by screening for Ad12 DNA-positive colonies. Without a selectable eukaryotic marker, all Ad12 DNA-positive recombinants were registered. Of a total of over 90 p7-Ad12 DNA recombinants, 21 were studied by restriction hybridization and four, by partial nucleotide sequence analyses. Among the p16-Ad12 DNA recombinants, four were analyzed. The sites of linkage between Ad12 DNA and p7 or p16 hamster DNA were all different and distinct from the original CLAC1 or T1111(2) junction site between Ad12 and hamster DNA. The in vitro recombinants were not generated by simple end-to-end joining of the DNA fragments used in the reaction, but by genetic exchange. Thirteen of the 25 recombinants were derived from the nucleotide 20,880 to 24,049 fragment of Ad12 DNA. Recombination experiments between Ad12 DNA and eight randomly selected unique or repetitive hamster DNA sequences of 1500-6200 bp in length did not yield recombinants (40, 41). Apparently, the p7 and p16 hamster pre-insertion sequences recombined with Ad12 DNA with a certain preference. Sites in the p7

[Figure 2: map of the pBR322-cloned p7 hamster pre-insertion sequence (1768 bp; scale marks at 500, 1000, and 1500 bp). Restriction-site key: Al, AluI; Av, AvaI; E, EcoRI; R, RsaI; S, Sau3A. Sequence features marked on the map include GCCC, CCTT, and CTG repeats, the motifs CCTGCCTCGC, CCGCCCTC, CCTCTCCG, and GCCC CCTC CCACCAGC ('Chi'), and a (TGG)6TGG stem-loop. The CLAC1 integration site and the recombination sites of p7-R5 and p7-R6 are indicated.]

FIG. 2. Map of the p7 hamster pre-insertion DNA. (a) The map describes the structure of the p7 hamster pre-insertion sequence as a pBR322 plasmid clone. Some of the important restriction sites are designated. The pBR322 plasmid DNA was linearized at the EcoRI site in some of the recombination experiments. Individual nucleotide sequences or striking repeats are indicated. The double-headed arrow marks the original site of Ad12 DNA integration as found in the Ad12-induced hamster tumor cell line CLAC1. The sites of recombination in the p7 DNA sequence with Ad12 DNA as observed in the in-vitro-generated recombinants p7-R5 and p7-R6 are designated by single-headed arrows. (Reproduced from 40 by permission of Oxford University Press.)

nucleotide sequence important in cell-free recombination are shown in Fig. 2, and some of the characteristics of the recombinants generated in the cell-free system are summarized in Fig. 3.
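The preference for the nucleotide 20,880 to 24,049 fragment can be put into rough numerical terms. The following Python sketch is only an illustration, not an analysis from the original work: it assumes a null model in which the Ad12 component of a recombinant falls in this fragment with a probability equal to the fragment's share of the genome, and it uses an assumed Ad12 genome length of about 34,000 bp (`ASSUMED_AD12_GENOME_BP`, a value not given in the text). Under these assumptions it computes the binomial probability of seeing 13 or more of 25 recombinants derived from that fragment.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Values taken from the text.
fragment_bp = 24_049 - 20_880 + 1      # length of the preferred Ad12 fragment
recombinants_total = 25
recombinants_from_fragment = 13

# Assumption for illustration only: approximate Ad12 genome length in bp.
ASSUMED_AD12_GENOME_BP = 34_000

# Null model (an assumption): usage proportional to fragment length.
expected_fraction = fragment_bp / ASSUMED_AD12_GENOME_BP
observed_fraction = recombinants_from_fragment / recombinants_total
p_value = binomial_tail(recombinants_from_fragment, recombinants_total, expected_fraction)

print(f"fragment length: {fragment_bp} bp")
print(f"expected fraction under the null model: {expected_fraction:.3f}")
print(f"observed fraction: {observed_fraction:.2f}")
print(f"P(X >= {recombinants_from_fragment}) = {p_value:.2e}")
```

The null model ignores how PstI cleavage partitions the genome and how recombinants were chosen for analysis, so the number should be read as an order-of-magnitude indication that the fragment is overrepresented, not as a formal test.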

3. PATCH HOMOLOGIES BETWEEN Ad12 AND p7 HAMSTER PRE-INSERTION SEQUENCES ARE UTILIZED IN THE GENERATION OF MANY in Vitro RECOMBINANTS (41)

The cell-free recombination system derived from hamster cell nuclear extracts, in which the in vitro recombination among a hamster pre-insertion sequence, the cloned 1768-bp p7 fragment, and Ad12 DNA can be demonstrated, has been developed further. The nuclear extracts have been subfractionated by gel filtration on a Sephacryl S-300 column. The activity promoting cell-free recombination elutes from the Sephacryl S-300 matrix with the shoulder and not the peak fractions of the absorbency profile. By using these protein subfractions, in vitro recombinants have been generated between the p7 pre-insertion sequence and the nucleotide 20,880 to 24,049 fragment of Ad12 DNA, which has shown high recombination frequency. In all of the analyzed recombinants thus produced in vitro, striking patch homologies have been observed between the p7 and Ad12 junction sequences, and between Ad12 DNA or p7 DNA and pBR322 DNA. The patch homologies

[Figure 3: tabular scheme not reproducible in text form. Column headings: recombinant clone; DNA substrates; nuclear extract; Ad12 DNA component (map units); length of insert (nucleotide pairs); homology to BHK21 DNA (4.0-kbp p7). A PstI map of Ad12 DNA with a scale of 0-100 map units appears at the bottom.]

FIG. 3. In vitro recombination between Ad12 DNA and the hamster pre-insertion sequence p7. Ad12-p7 recombinants recovered as pBR322 clones are designated p7-R1 to p7-R52; Ad12-p16 recombinants are named p16-R1 to p16-R4. Recombinants of Ad12 DNA with the 4000-bp hamster DNA segment encompassing the p7 sequence are termed p7-4.0/2 or p7-4.0/3. The entry “0.3 M NaCl” indicates the salt extraction procedure of nuclei. The schematic drawings represent the Ad12 DNA molecule, and the filled-in areas designate the Ad12 DNA segments included in the recombinants. A PstI map of the Ad12 genome and a map-unit scale are given for orientation at the bottom of the figure. The bars underneath the PstI map of Ad12 DNA and the scale designate the derivation of Ad12 DNA segments in the individual recombinants (numbers inside bars). A BHK-hybridization of “+” (plus) indicates that the p7 DNA in the recombinant hybridizes to a diagnostic 4000-bp EcoRI fragment in BHK21 DNA; “-” (minus) refers to the absence of BHK21 DNA signals. n.d., Not determined. (Reproduced from 40 by permission of Oxford University Press.)

[Figure 4: sequence alignments not reproducible in text form. The panels show the left and right junction sequences of the recombinants, with p7, pBR, and CLAC1 sequences labeled.]

FIG. 4. Patch homologies between recombination partners. Nucleotide sequences at the right and left sites of the junctions between p7 hamster DNA and Ad12 or pBR322 DNA in the in-vitro-produced recombinants rec10, rec18, rec22, rec27, rec9, and rec20. Ad12 DNA sequences are shaded; p7 DNA sequences and pBR322 DNA sequences are designated p7 and pBR, respectively. The nucleotide numbers are those of the published DNA sequences for p7 (40) and pBR322 (43). The Ad12 DNA sequences have not been numbered, since the nucleotide sequence of the nucleotide 20,880 to 24,049 fragment of Ad12 DNA was only partly known (44) at that time. The Ad12 DNA sequences in the recombinants are derived in part from unpublished data (45). The left to right orientation of the Ad12 sequences is designated by an arrowhead. The junction sequence between the left terminus of Ad12 virion DNA and hamster cell DNA in the Ad12-induced tumor cell line CLAC1 is also presented (17). The nucleotides designated by asterisks represent regions of sequence identities between reaction partners. (Reproduced with permission from 41.)

are similar to those found earlier during the analyses of some of the junction sequences in integrated Ad12 genomes in Ad12-induced hamster tumor cell lines (Fig. 4). Proteins in the shoulder fractions of the gel filtration experiment can form specific complexes with double-stranded synthetic oligodeoxyribonucleotides corresponding to several p7 and Ad12 DNA sequences involved in in vitro recombination. These sequences participate in the recombination reactions catalyzed by the same column fractions in the shoulder of the absorbency profile. Such proteins have not been found in the peak fractions. Further work is required to ascertain that the cell-free recombination system can mimic certain elements of the mechanisms of integrative recombination and to purify the cellular components essential for recombination (see 42).

4. PARTIAL PURIFICATION OF THE PROTEINS INVOLVED IN THE CELL-FREE RECOMBINATION BETWEEN Ad12 DNA AND THE p7 HAMSTER PRE-INSERTION SEQUENCE (42)

In the fractionated cell-free system from nuclear extracts of hamster cells, we have purified, at least partly, hamster cell nuclear proteins that could catalyze in vitro the recombination between Ad12 DNA and hamster DNA. As shown earlier (Fig. 3), the nucleotide 20,880 to 24,049 fragment of Ad12 DNA (45) and the hamster pre-insertion sequence p7 from the Ad12-induced tumor CLAC1 have proved to recombine at higher frequencies than randomly selected adenoviral or cellular DNA sequences (40). It is thought that a pre-insertion sequence might carry nucleotide sequence elements that are essential in eliciting recombination. Frequently, patch homologies between the recombination partners seem to play a role in the selection of sites for recombination in vivo and in cell-free systems. Nuclear extracts from BHK21 cells were prepared by incubating the nuclei in 0.42 M (NH4)2SO4. These extracts were further fractionated via Sephacryl S-300 gel filtration, followed by chromatography on MonoS and MonoQ columns. Figure 5 summarizes the presently used purification scheme. The purified products that still exhibited recombinatorial activity contained a limited number of different protein bands, as determined by polyacrylamide gel electrophoresis and silver staining. We used three different methods to assess the generation of hamster DNA-Ad12 DNA recombinants upon cell-free incubation with the purified protein fractions: (i) transfection of the recombination products into recA- strains of E. coli; (ii) the PCR in combination with amplification primers unique for each of the two different recombination partners; and (iii) an assay based on the binding of the two differently labeled reaction partners to each other (46). It was striking that the nucleotide sequence (5’)-CCTCTCCG-(3’) or sequences

[Figure 5: flow scheme of the fractionation. BHK21 cells; nuclear extract (420 mM ammonium sulfate), fraction I; Sephacryl S-300 (150 mM NaCl), shoulder fraction, fraction II; MonoS, fraction III; MonoQ, 0.5 M elution, fraction IV.]

FIG. 5. Cell-free hamster system to study in vitro recombination. Fractionation of nuclear extracts from BHK21 hamster cells. The activity promoting recombination between the hamster p7 DNA and the nucleotide 20,880 to 24,049 fragment of Ad12 DNA was partly purified by a series of chromatographic procedures. The p7 DNA represented the pre-insertion DNA sequence from hamster cells into which Ad12 DNA had integrated in the Ad12-induced hamster tumor cell line CLAC1 (17, 38). The assay for the identification of recombinants based on transfecting the recombination products into the recA- strain of E. coli HB101/LM1035 was described (40). Crude nuclear extracts were prepared from BHK21 cells grown in suspension cultures to yield crude extract fraction I. A volume of 3 ml of fraction I was applied to a Sephacryl S-300 column. The active fractions from the shoulder in the OD absorbency profile (solid area) were pooled as fraction II. Fraction II was then loaded onto a MonoS column. The fractions catalyzing cell-free recombination eluted as fraction III. The proteins in fraction III were subsequently adsorbed and eluted from a MonoQ column to yield fraction IV. The elution profiles of each column are presented schematically, and the fractions active in recombination are indicated by solid areas. The NaCl concentration used for elution is also shown. (Reproduced with permission from 42.)

adjacent to it in the p7 hamster DNA repeatedly served as a preferred, though not the only, recombination target for Ad12 DNA. The sequence occurred in the tumor CLAC1, and in five independently performed cell-free recombination experiments with crude nuclear extracts, with Sephacryl S-300 or MonoQ column fractions (42).
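Locating a short target such as (5')-CCTCTCCG-(3') in a cloned sequence is a simple string search that has to cover both strands. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, and `toy_sequence` is an invented test string, not the p7 sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def find_motif(seq: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every occurrence of motif.

    "+" means the motif itself is found in seq; "-" means its reverse
    complement is found, i.e. the motif lies on the opposite strand.
    A palindromic motif would be reported once per strand.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


MOTIF = "CCTCTCCG"  # recombination target named in the text

# Invented sequence for illustration: one copy on each strand.
toy_sequence = "AAGT" + "CCTCTCCG" + "TTACG" + "CGGAGAGG" + "AT"

print(find_motif(toy_sequence, MOTIF))
```

Because the text notes that sequences adjacent to the motif also served as targets, a real analysis would report hits together with their distance to each mapped junction; that step is left out here.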

B. Non-homologous Recombination between Baculovirus DNA and Foreign DNA in Insect Cells

The baculovirus Autographa californica nuclear polyhedrosis virus (AcNPV) has been used to document the lack of persistence of AcNPV DNA in mammalian cells (47) and to study details of the transcription of the AcNPV genome in insect cells (48-53). We have now employed the AcNPV-Spodoptera frugiperda insect-cell system to investigate the mechanism of non-homologous recombination between AcNPV DNA and foreign DNA (54, 55). Homologous recombination between sequences of the AcNPV polyhedrin gene and polyhedrin vector-cloned foreign DNA is the basis of one of the successful and efficient eukaryotic expression vector systems (56; for reviews, see 57-59). Here, I do not dwell on this interesting aspect of baculovirus research. The focus in this section is on the non-homologous recombination between AcNPV DNA and foreign DNA. Several laboratories have reported that insect-cell DNA in the form of elements akin to retrotransposons can integrate into AcNPV DNA in insect-cell cultures (52, 60-65). However, we have also studied an obviously different kind of non-homologous recombination between AcNPV DNA and foreign DNA, one that lacks the terminal repetitions at the sites of insertion that characterize the retrotransposon-like cellular DNA insertions into AcNPV DNA. We have again used adenovirus DNA as the foreign model DNA for studies on non-homologous recombination with AcNPV DNA in living S. frugiperda insect cells (54) and in a cell-free system of nuclear extracts from these cells (55). I also summarize briefly our own results on AcNPV variants that carry cellular DNA in elements akin to retrotransposons (52).

1. RETROTRANSPOSON-LIKE INSERTION OF INSECT CELL DNA INTO THE AcNPV GENOME (52)

In this study, a transposon-like insertion of S. frugiperda insect-cell DNA has been analyzed in single-plaque-isolate E (66) of the insect baculovirus AcNPV.
The 634-bp cellular DNA insertion is characterized by an 18-bp terminal inverted repeat and carries an EcoRI site. This additional EcoRI site in the 81-map-unit segment of the DNA of plaque-isolate E of AcNPV explains the differences between the EcoRI restriction map of the DNA from this isolate and those of the virus stocks used in other laboratories. Except for this insertion, the nucleotide sequence at the site of insertion in the DNA


of plaque-isolate E is identical to that of AcNPV-E2 DNA (67). The cellular DNA insertion in the AcNPV genome is represented many times in the S. frugiperda cell genome but has no detectable homology with DNAs from species other than lepidopteran insects. In S. frugiperda cells, the retrotransposon-like insertion sequences are transcribed into cytoplasmic RNA. The transcription of these sequences is initiated within the cellular insertion element. In S. frugiperda cells infected with plaque-isolate-E of AcNPV, at least nine different size classes of AcNPV-specific RNAs are synthesized (49, 50); in AcNPV-E2-infected cells, similar size classes have been detected. The cellular insertion of plaque-isolate-E provides the initiation site for the synthesis of an additional RNA size class that is transcribed from the viral DNA genome.

2. NON-HOMOLOGOUS RECOMBINATION BETWEEN ADENOVIRUS DNA AND AcNPV DNA IN INSECT CELLS (54)

We used the expression-vector system of AcNPV and S. frugiperda insect cells to study mechanisms of recombination in insect cells. We concentrated on the isolation and analysis of non-homologous recombinants. The E1 region of human Ad2 was inserted into regions of the AcNPV genome lacking apparent homologies to the polyhedrin region. Of a total of 122 recombinant AcNPV plaques, which hybridized to Ad2 DNA in plaque-annealing experiments, 13 recombinants proved non-homologous, and five of these recombinants could be grown to titers that facilitated virus replication and further investigations of the recombinant DNAs. Restriction and Southern blot analyses on all of the recombinants, and nucleotide sequence determinations on one of them, permitted the mapping of the sites of foreign DNA integration into the AcNPV genome for the heterologous recombinants.
These sites were located in the EcoRI-C (map units 42.5-52.4), EcoRI-L (map units 69.5-72.5), EcoRI-O (map units 32.6-34.5), and EcoRI-Q (map units 88.2-89.7) segments of the plaque-isolate-E AcNPV genome (Fig. 6). Two of the non-homologous recombinants carried the insert in the EcoRI-L fragment. The map of the AcNPV genome in Fig. 6 lists the sites of insertion of the Ad2 DNA fragment (A-E) or of fragments of cellular DNA (Nos. 1-7) (see 52, 60-65). The nucleotide-sequence determinations across the sites of junction between the AcNPV DNA and the foreign (Ad2) DNA in one of the non-homologous recombinants, AcNPV-Ad2E1-D, revealed no sequence similarities at or close to the sites of junctions. A short sequence of six nucleotides was deleted from the original EcoRI-O sequence of AcNPV DNA at the site of Ad2E1 DNA insertion. The inserted Ad2E1 DNA fragment comprised nucleotides 183-2763. Thus, nucleotides at the termini of the Ad2 DNA fragment had been deleted in the process of recombination.
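The coordinates quoted above lend themselves to a short arithmetic check. The Python sketch below converts the map-unit intervals of the four target segments into widths and computes the length of the inserted Ad2E1 fragment from its nucleotide coordinates. Two values are assumptions made only for illustration: `ASSUMED_ACNPV_GENOME_BP`, a round figure for the AcNPV genome length that is not given in the text, and the use of the 3322-nucleotide E1 fragment length (quoted in the text for the polyhedrin-located control recombinant) as the input fragment for AcNPV-Ad2E1-D as well.

```python
def span_length(first_nt: int, last_nt: int) -> int:
    """Length of a segment given inclusive 1-based nucleotide coordinates."""
    return last_nt - first_nt + 1


def map_units_to_bp(start_mu: float, end_mu: float, genome_bp: int) -> int:
    """Convert a map-unit interval (0-100 scale) to an approximate length in bp."""
    return round((end_mu - start_mu) / 100 * genome_bp)


# Target segments and map-unit positions as listed in the text.
TARGET_SEGMENTS = {
    "EcoRI-C": (42.5, 52.4),
    "EcoRI-L": (69.5, 72.5),
    "EcoRI-O": (32.6, 34.5),
    "EcoRI-Q": (88.2, 89.7),
}

# Assumption for illustration only: approximate AcNPV genome length in bp.
ASSUMED_ACNPV_GENOME_BP = 130_000

# Assumption: the input E1 fragment was the 3322-nt fragment named for the control.
ASSUMED_E1_INPUT_NT = 3322

widths_mu = {name: round(end - start, 1) for name, (start, end) in TARGET_SEGMENTS.items()}
inserted_nt = span_length(183, 2763)             # Ad2E1 nucleotides retained in AcNPV-Ad2E1-D
deleted_left_nt = 183 - 1                        # nucleotides missing before position 183
deleted_right_nt = ASSUMED_E1_INPUT_NT - 2763    # nucleotides missing after position 2763

for name, (start, end) in TARGET_SEGMENTS.items():
    approx_bp = map_units_to_bp(start, end, ASSUMED_ACNPV_GENOME_BP)
    print(f"{name}: {widths_mu[name]} map units, roughly {approx_bp} bp")
print(f"inserted Ad2E1 segment: {inserted_nt} nucleotides")
print(f"deleted at termini (under the stated assumption): {deleted_left_nt} and {deleted_right_nt}")
```

The base-pair figures scale directly with the assumed genome length, so only the map-unit widths and the 2581-nucleotide insert length follow from the text alone.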

[Figure 6: restriction maps (EcoRI, HindIII, PstI) of AcNPV DNA with a scale of 0-100 map units. The positions of the recombinants AcNPV-Ad2E1-A to -E and AcNPV-Ad2E1-192 and of the previously described inserts 1-7 are marked.]

FIG. 6. Non-homologous recombination in insect cells. Map projections of the polyhedrin gene-located recombinant AcNPV-Ad2E1-192 and of the heterologous recombinants AcNPV-Ad2E1-A to -E on the restriction maps of AcNPV DNA. The map also presents the locations of foreign DNA insertions described previously [dotted lines and numbers: 1, cellular DNA insert (52); 2, cellular DNA insert (64); 3, two different cellular inserts (63); 4 and 5, cellular DNA inserts (60, 62); 6, Spodoptera frugiperda nuclear polyhedrosis virus DNA insert (65); and 7, cellular DNA insert (61)]. (Reproduced with permission from 54.)

In the usual polyhedrin gene-located recombinants, the foreign Ad2 DNA segment was fused to the polyhedrin promoter and recombined, presumably by homologous recombination via polyhedrin sequence segments in the vector, into the polyhedrin gene of AcNPV. In one of these homologous recombinants, AcNPV-Ad2E1-192, which was analyzed as the control, the Ad2E1 DNA segment between nucleotides 1 and 3117 (of 3322 original nucleotides) was inserted in an inverted orientation between nucleotides -115 and +753 of the polyhedrin gene of AcNPV. This particular polyhedrin sequence was deleted in the process. It was uncertain how this recombinant had been generated. The infectivities of the polyhedrin-located recombinant AcNPV-Ad2E1-192 and of the five non-homologous recombinants were compared, by single-cycle growth curves, to the infectivity of non-recombinant AcNPV. Within a factor of about 1.5, these values were identical at 72 hours postinfection, although there were differences in the timing of virus production. Some of the non-homologous recombinants proved unstable. There was no evidence that the Ad2E1 region inserted in the non-homologous recombinants was transcribed in S. frugiperda cells. The data presented documented that regions other than the polyhedrin gene in the AcNPV genome are capable of accepting foreign DNA (54).

3. STUDIES IN A CELL-FREE SYSTEM WITH NUCLEAR EXTRACTS FROM Spodoptera frugiperda CELLS (55)

In insect cells, the left terminal fragment of Ad2 DNA can insert by non-homologous recombination into the 32.6-34.5 map unit (EcoRI-O) fragment

100

ADENOVIRUS DNA INTEGRATION AND METHYLATION PATTERNS

21

and into other segments of AcNPV DNA (S), as described in the previous section. We have subsequently imitated this recombination event in vitro by incubating the E l fragment of Ad2 DNA and the EcoRI-0 fragment of AcNPV DNA, both in the plasmid-cloned circular forms, with partly purified nuclear extracts from S . frugiperdu insect cells. Proteins from these extracts were fractionated by gel filtration. After the reextraction of DNA from the incubation mixture, recombinants generated in this cell-free system were identified directly with the PCR by using Tuq polymerase and appropriate primers unique to either of the two reaction partners. The occurrence of recombinants in the cell-free system could also be demonstrated by a biological test in which potential recombinants were isolated by transfection into recA- strains of E . coli. The recombinants identified were all different. The results of control experiments argued against the possibility that unspecific reaction products might have been generated during PCR. Nucleotide-sequence determinations in some of the recombinants localized the sites of genetic exchange between the two partners and assessed the non-homologous nature of the reaction (Fig. 7). The recombinants were, however, characterized by the presence of short patch homologies at or close to the sites of linkage between the reaction partners, as described earlier in the hamster cell system (41).
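The short patch homologies noted at the sites of linkage can be searched for computationally. The following Python sketch is an illustration only and is not the procedure used in the experiments described here: it lists every maximal run of identical nucleotides shared by two sequences. The function name `shared_patches`, the `min_len` threshold, and the toy sequences are hypothetical; they are not taken from Fig. 7.

```python
def shared_patches(seq_a, seq_b, min_len=4):
    """Return (patch, start_in_a, start_in_b) for each maximal run of
    identical bases, at least min_len long, shared by the two sequences."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    hits = []
    # prev[j] holds the length of the common run ending at seq_a[i-2], seq_b[j-1]
    prev = [0] * (len(seq_b) + 1)
    for i in range(1, len(seq_a) + 1):
        curr = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            if seq_a[i - 1] == seq_b[j - 1]:
                curr[j] = prev[j - 1] + 1
                run = curr[j]
                # The run is maximal if it cannot be extended to the right.
                at_end = i == len(seq_a) or j == len(seq_b)
                if run >= min_len and (at_end or seq_a[i] != seq_b[j]):
                    hits.append((seq_a[i - run:i], i - run, j - run))
        prev = curr
    return hits


if __name__ == "__main__":
    # Toy sequences (hypothetical), standing in for the two reaction partners.
    partner_a = "GGTACCATGACT"
    partner_b = "TTCATGACAA"
    for patch, pos_a, pos_b in shared_patches(partner_a, partner_b):
        print(patch, pos_a, pos_b)
```

In practice one would restrict the comparison to short windows around each junction, since short matches occur by chance in long sequences.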

C. Conclusions and Goals

The results described in the preceding sections demonstrate that at least certain aspects of the insertional recombination reaction between Ad12 DNA and hamster DNA in hamster cells, or between AcNPV DNA and foreign DNA such as an Ad2 DNA fragment in insect cells, can be imitated in cell-free extracts. It will be the goal of further research to purify the enzymes involved in these non-homologous recombination reactions. Both in living cells and in cell-free nuclear extracts, patch homologies between the recombination partners at the sites of exchange seem to help the non-homologous recombination reaction. By applying different independent methods to identify and characterize the recombination products generated in cell-free systems, we have been able to document unequivocally that at least certain aspects of non-homologous recombination reactions can be reproduced in cell-free systems. Details of the reaction parameters and of the structure of the recombination products have been very similar between the mammalian-cell and insect-cell systems. It thus appears likely that we have begun to study a generally important mechanism that transcends species limitations. The analyses of recombinants with the two different assay systems, (i) the transfection test of potential recombination products into recA- strains of E. coli and (ii) the test based on the PCR, have yielded very similar results with respect to both the structure of recombinants and the activity of chromatographically purified fractions from nuclear extracts of mammalian or insect cells.

The non-homologous recombination of foreign DNA with the genome of eukaryotic cells may be an ancient, evolutionarily important process whose mechanisms have therefore been conserved in different species. It is also conceivable that these mechanisms can play a role in the early steps of development when genetic functions are redistributed. Our future research will be directed toward the further purification and enzymatic characterization of the cellular and/or viral proteins participating in the reaction. Moreover, we should like to imitate the actual integration reaction of an intact or nearly intact Ad12 genome into cellular DNA in a cell-free system. Lastly, it will be challenging to assess the structures of the recombination targets as they participate in the recombination reaction. It will also be necessary to investigate whether the integration reaction proceeds more effectively in adenovirus-infected cells.

[FIG. 7, sequence panels 1-6: aligned junction sequences of Ad2E1, rec1 to rec6, and EcoRI-O or pBR322 Tetr; the sequences are not recoverable from the extracted text.]

FIG. 7. Cell-free recombination in nuclear extracts from insect cells. Analyses of six different recombinants. Recombinants rec1, rec2, rec3, and rec6 were generated with the circular pAc610-Ad2E1 and pBR322-EcoRI-O plasmids in cell-free nuclear extracts from Spodoptera frugiperda cells; rec4 and rec5 were generated with the linearized HindIII-G fragment of Ad2 DNA and the circular pBR322-EcoRI-O plasmid. In all experiments, a Sephacryl S-300 fraction of nuclear extracts was used as the enzyme source. Nucleotide sequences across the sites of junction (double-headed vertical arrows) in the recombinants investigated in lanes 1-6, respectively. The origins of the sequences linked in a cell-free system by recombination have been designated: Ad2E1, E1 fragment of Ad2 DNA;

EcoRI-O, EcoRI-O fragment of AcNPV DNA; pBR322 Tetr, Tetr gene in the pBR322 plasmid. Recombinants 1-5 were identified directly from the recombination mixture by the polymerase chain reaction (PCR). Recombinant 6 was isolated after transfecting DNA from the reaction mixture into E. coli strain HB101/LM1035 and after identifying Ad2-positive colonies by hybridization to the 32P-labeled Ad2E1 DNA fragment. The Ad2 nucleotide numbers refer to the Ad2 nucleotide at the immediate site of junction and to the published Ad2 DNA sequence (68). The nucleotide sequences in the EcoRI-O fragment were obtained by using nucleotide sequences, determined in recombinants 1-6, as primers for nucleotide sequence determinations in the cloned EcoRI-O fragment. This procedure proved the locations of the "unknown" nucleotide sequences in the recombinants in the EcoRI-O fragment of AcNPV DNA. Sequences identical to Ad2 sequences in individual recombinants were designated by open letters. The Ad2E1 nucleotide sequence (boldface) and the EcoRI-O sequence or the pBR322 Tetr sequence (italic) were aligned with the corresponding nucleotide sequences in each of recombinants 1-6. (Reproduced with permission from 55.)

V. De Novo Methylation of Integrated Foreign DNA

A. Basics of DNA Methylation: A Brief Synopsis

In mammalian DNA, presumably the only modified nucleotide, in fact the fifth nucleotide in DNA, is 5-methyldeoxycytidine (5mC). This fifth nucleotide frequently occurs in the dinucleotide combination 5mC-G; the other dinucleotide combinations, 5mC-A, 5mC-C, and 5mC-T, do occur, but their frequencies in mammalian DNA have not been definitely determined. The results of an as-yet-limited number of investigations on the mode of distribution of 5mC in mammalian DNA suggest that this nucleotide is not dispersed randomly in the mammalian genome, but rather in highly specific patterns. These patterns are organ-, tissue-, and probably also species-specific. Methylation patterns in certain parts of the human genome are identical among different individuals (69-72).

The methyl group in 5mC is introduced postreplicationally by DNA methyltransferases. The gene for one of these enzymes has been cloned and sequenced (73). Two types of DNA methylation can be formally distinguished, and it is not clear whether one or several enzymes and/or modifying factors are involved in these reactions. (i) In maintenance methylation, hemimethylated DNA is the substrate for the DNA methyltransferase system that inserts methyl groups on the unmethylated strand in a specific pattern using the methylated complementary strand as template. In this way, a given pattern of DNA methylation is inherited. However, hemimethylated sequences can persist in mammalian DNA at least for several generations (74). Apparently, even the "simple" mechanism of maintenance methylation is subject to complex controls. (ii) In de novo methylation, unmethylated DNA is the substrate, and yet the DNA methyltransferase system can impose a highly specific pattern of 5mC residues on DNA, often on foreign DNA recently integrated into a preexisting mammalian genome (see below). It has been postulated that this de novo methylation of foreign DNA constitutes a cellular defense mechanism by which the activity of foreign genes potentially detrimental to the host cell can be permanently silenced (2).

The deletion of the mouse gene for DNA methyltransferase by homologous gene transfer in embryonic stem cells leads to embryonic lethality (75). It is not certain whether thus manipulated cells still exhibit DNA methyltransferase activity.
Of course, it is problematic to draw definite conclusions, because the insertion of foreign DNA into cells is difficult to control and may have affected additional regions of the genome. Moreover, it has not been investigated to what extent the transcriptional activity of regions surrounding the site of insertion has been altered by stem-cell manipulation.

For the detection of 5mC residues in specific DNA sequences, two techniques have been applied. The assessment of the exact sequence locations of 5mC residues is essential, but the determination of the percentage of 5mC in DNA is genetically non-consequential. Many methylation-sensitive restriction enzymes (76) have been used to localize 5mC residues in a given DNA sequence. In combination with the Southern blot-hybridization technique (77), this approach has been widely applied after the seminal observation that the CCGG sequence, when methylated in the 3'-located C, is not cut by HpaII, but is cleaved by MspI (78). However, restriction analyses for the presence of 5mC residues are impossible when C-G sequences that are not part of recognition sites for methylation-sensitive restriction endonucleases have to be investigated. The only technique that permits the localization of all 5mC residues in a DNA sequence is the genomic sequencing technique (79, 80). For the method to be applicable, all molecules to be analyzed in a given sequence must be equally methylated. This procedure is technically difficult and expensive, and, even with much effort, can be applied only to relatively short stretches of DNA. Caution must, therefore, be exercised in drawing general conclusions from a very limited set of reliable data. In the future, one must focus on experimental work using the genomic sequencing technique or one of its modifications (81).

The function of specific patterns of DNA methylation is essentially unknown. In general terms, 5mC can be considered a modulator of DNA-protein interactions with positive or negative effects on the binding of specific proteins to DNA motifs. Thus, practically any enzymatic or non-enzymatic reaction involving DNA could be affected by the presence of 5mC. Much effort has been invested in studying the effect of sequence-specific promoter methylation on promoter activity (for recent reviews, see 82). In our own work, we have used several viral promoters, namely the E1A promoter of Ad12 DNA (83), the major late promoter (MLP) of Ad2 DNA (84), the VAI gene of Ad2 DNA (85), the p10 promoter of AcNPV DNA (86), and, above all, the late E2A promoter of Ad2 DNA (87-92a), to demonstrate that the sequence-specific methylation of these promoters leads to their inactivation, as documented by the results obtained in several different test systems. These viral promoters are polymerase-II-dependent, except for the VAI gene of Ad2 DNA, which is controlled by host polymerase III. Details of these results are not presented here, since I have extensively reviewed different stages of our previous work on sequence-specific promoter methylation and promoter inactivation (2-8, 11).
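The HpaII/MspI comparison described earlier in this section can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a laboratory protocol: both enzymes are taken to cut CCGG between the two C residues, HpaII is treated as blocked when the internal (3'-located) C is methylated, and MspI is treated as indifferent to that methylation; any other sensitivities of the two enzymes are ignored. The function names and the toy sequence are hypothetical.

```python
import re


def ccgg_sites(seq):
    """0-based start positions of every CCGG site in seq."""
    return [m.start() for m in re.finditer(r"(?=CCGG)", seq.upper())]


def predicted_cuts(seq, methylated_c_positions):
    """Predict which CCGG sites each enzyme cleaves.

    methylated_c_positions: 0-based positions of 5mC residues in seq.
    Only methylation of the internal C of CCGG is considered (assumption).
    """
    sites = ccgg_sites(seq)
    hpa2 = [s for s in sites if (s + 1) not in methylated_c_positions]
    msp1 = list(sites)  # assumed insensitive to internal-C methylation
    return {"HpaII": hpa2, "MspI": msp1}


def fragment_lengths(seq_len, cut_sites):
    """Fragment sizes for a linear molecule cut as C^CGG at the given sites."""
    bounds = [0] + [s + 1 for s in cut_sites] + [seq_len]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "AACCGGTTTTCCGGAA"   # hypothetical sequence with two CCGG sites
    cuts = predicted_cuts(toy, {11})  # internal C of the second site is 5mC
    for enzyme, sites in cuts.items():
        print(enzyme, sites, fragment_lengths(len(toy), sites))
```

A site that yields a fragment boundary with MspI but not with HpaII is the signature of internal-C methylation that the Southern blot comparison detects.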
The effects of the sequence-specific methylation of DNA on other genetic functions, such as replication or recombination, have not yet been carefully studied. There is good evidence that 5mC can be deaminated and thus changed to a T residue (93). The presumptive frequency of this mutagenic transition has been invoked to explain the underrepresentation of C-G dinucleotides in the DNA of most higher eukaryotes, notably in mammals (94). Although such deaminations and transitions undoubtedly occur, it is unlikely that they are the sole or even the most important reason for the statistical underrepresentation of C-G dinucleotides. Functional optimization must have been a very powerful force in the development of specific nucleotide sequences in different species. It is of interest in this context that the highly repetitive Alu sequences in human DNA are very rich in C-G sequences and yet highly methylated at the same time (71). Obviously, deamination has not yet altered these sequence combinations in Alu elements.
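The statistical underrepresentation of C-G dinucleotides is commonly expressed as an observed/expected ratio, with the expectation derived from the individual frequencies of C and G. The Python sketch below is a minimal illustration of that arithmetic and of how a 5mC-to-T transition lowers the ratio; the function names and the toy sequence are hypothetical, and the formula is the conventional one, not one given in this chapter.

```python
def cpg_obs_exp(seq):
    """Observed/expected C-G dinucleotide ratio: (#CG * N) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def deaminate(seq, positions):
    """Model 5mC deamination: replace the C at each given position by T."""
    bases = list(seq.upper())
    for pos in positions:
        if bases[pos] != "C":
            raise ValueError(f"position {pos} is not a C")
        bases[pos] = "T"
    return "".join(bases)


if __name__ == "__main__":
    toy = "CCACGTCCACGT"            # hypothetical sequence with two C-G sites
    mutated = deaminate(toy, [3])   # the first C-G becomes T-G
    print(toy, cpg_obs_exp(toy))
    print(mutated, cpg_obs_exp(mutated))
```

A ratio well below 1 indicates depletion of C-G dinucleotides relative to base composition; the text above cautions that deamination need not be the only cause of such depletion.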


For the human p53 gene (for a review, see 95), there is evidence that endogenous mutations are introduced at high frequency at C-G sequences in colorectal cancers (96). In naturally occurring human malignancies, changes in patterns of methylation can be observed in a number of specific human genes (97). As one might expect, more extensive analyses in human malignancies or in human tumor cell lines reveal that such changes in patterns of methylation in specific genes are sometimes, but not always, found (70, 98). In a functional sense, patterns of DNA methylation could be considered indicators of genetic activity or inactivity, with the latter terms not being restricted to the transcription of genes or DNA segments. Since genetic activity patterns are likely to be altered in tumor cells and in mammalian cells growing in culture under somewhat artificial conditions, it is hardly surprising to find changes of methylation patterns under these circumstances. In the early steps of mouse or human development, very specific demethylation and remethylation events take place in specific sequences of mammalian genomes (99). In early stages of development, specific CCGG sequences in several genes can be demethylated in both DNA complements and, later on, remethylated in exactly the same patterns that existed originally. It is unknown what signal(s) direct the de novo methylation machinery during early mouse or human development.

B. De Novo DNA Methylation, a Host Defense Mechanism (2)

This proposal is based on experimental work on the integration of foreign DNA in mammalian cells, on the establishment of specific de novo patterns of DNA methylation, and on the inhibition of transcription by the sequence-specific methylation of promoter sequences. It is suggested that eukaryotic cells have developed several mechanisms of defense against the uptake, integration, and continued expression of foreign DNA. In the course of evolution and continuing at present, cells have been exposed to foreign DNA, entire genomes, or fragments of them. A particularly problematic organ system in that respect must be the digestive tract of higher organisms. The defense mechanisms are thought to be the following: (i) degradation and/or excretion of foreign DNA; (ii) excision and loss of previously integrated DNA from the host genome; and (iii) targeted inactivation of foreign genes by sequence-specific methylation. Genes whose products could be advantageous to the transformed cells might be selectively excluded from this silencing mechanism. In part, the specificity of de novo methylation must reside in the DNA methyltransferase systems of the host cell. However, nucleotide sequence, structure, and chromatin arrangement in the foreign DNA could also play an important role.


Since defense processes must have been activated many times in evolution, present patterns of DNA methylation may reflect vestiges of evolution, i.e., the sum total of selective de novo methylations, possibly demethylations, and mutations. Could existing patterns of DNA methylation be altered during embryogenesis? One also has to consider the possibility that the insertion and progressive methylation of foreign DNA can lead to alterations in the methylation of flanking host-cell DNA sequences abutting the site of integration. It will be interesting to investigate to what extent these changes can contribute to the oncogenic transformation of cells, particularly after the insertion of foreign (viral) genomes in cells transformed by oncogenic viruses.

C. Role of DNA Methylation in Viral Replication Cycles

From studies in the adenovirus and AcNPV systems, it is unlikely that DNA methylation plays a role in the productive replication cycles of these viruses or in the abortive infection of hamster cells with Ad12. In contrast, the double-stranded DNA of the iridovirus, frog virus 3 (FV3), is highly methylated; over 20% of the C residues in the virion DNA are methylated (100). It is likely that all C-G sequences in FV3 virion DNA are methylated, and that 5mC does not occur in any other nucleotide sequence combination (101). There is no evidence that Ad2 or Ad12 virion DNA (102, 103) or AcNPV DNA (104) contains detectable amounts of 5mC.

We have investigated whether adenovirus DNA might be methylated in specific sequences, e.g., early in infection in the major late promoter region. Extensive analyses by restriction enzyme cleavage using several different methylation-sensitive restriction enzymes provide no indication for the presence of 5mC in Ad2 DNA early or late after the productive infection of human cells (103) or after the abortive infection of hamster cells with Ad12 (105). We have initiated work on the major late promoter of Ad2 DNA early and late in productively infected human cells using the genomic sequencing method. Preliminary data reveal no 5mC residues in this Ad2 promoter at any time after infection (106). These results must be extended before final conclusions can be drawn. There is, at present, no reason to suggest that DNA methylation plays any role in adenovirus or baculovirus genomes in the productive or abortive infection cycles. These findings contrast with the extensive methylation in specific patterns that the adenovirus genomes are subject to after they become part of the host's genome, as a consequence of their integration into cellular DNA.

Another DNA virus genome, FV3 DNA, is methylated probably to saturation at all C-G sequences in the virion genome (100). DNA methylation is imposed upon newly replicated viral genomes, possibly by a virus-encoded DNA methyltransferase. Details of this very interesting system have been reviewed (107).

D. Origin and Spreading of de Novo Methylation across Integrated Adenovirus DNA

If the de novo methylation of integrated foreign DNA is indeed a cellular defense mechanism, it is important to understand its details. In gene therapy, the fixation and regulated expression of extraneously introduced DNA are prime goals. Inactivation of newly introduced genes would be an undesirable consequence of this manipulatory technique. At present, it cannot be predicted what parameters determine the generation of de novo introduced patterns of DNA methylation. Several factors seem to play a role: the nature of the nucleotide sequence that has been inserted; the importance of its gene products, if any, for the function and survival of the transformed cell; the site of foreign DNA insertion; the genetics of the host cell; and perhaps the stage in the replication cycle of the cell or its developmental stage at the time the foreign DNA has been integrated. None of these factors has been studied sufficiently. Moreover, the results adduced from studies with cells in culture may not be applicable to intact organisms. In our own work, we have used cells in culture in whose genomes adenovirus DNA, in particular Ad12 DNA, has been fixed as a model foreign DNA. The integrated Ad12 or Ad2 genomes become extensively methylated in very specific patterns.

1. SPREADING OF de Novo DNA METHYLATION IN AN INTEGRATED Ad2 DNA SEGMENT CARRYING A FEW 5mC RESIDUES (74, 108)

The establishment of de novo patterns of DNA methylation in mammalian genomes is characterized by the gradual spreading of methylation, which occurs across an entire integrated adenovirus genome (39, 109), as well as at the nucleotide level in the integrated late E2A promoter of Ad2 DNA (74, 108).
By applying the techniques of genomic sequencing and dimethyl sulfate or DNase I in vivo genomic footprinting, we have demonstrated that the spreading of methylation in cell lines that carry the late E2A promoter with three in vitro premethylated CCGG sequences initially involves a DNA domain of this promoter that is devoid of bound proteins. Subsequently, methylation spreads to neighboring regions, and the patterns of complexed transcription factors are altered. Evidence has been adduced that DNA methylation at sequences homologous to the AP-1 and octamer binding-factor sites interferes with protein binding. In contrast, the methylation of sequences in the vicinity of, but not involving, sequences homologous to an AP-2 site still permits the binding of proteins to these sites. It is significant that, during the spreading of methylation, a few C-G sequences can remain hemimethylated for several cell generations before they also become methylated in both complements. Moreover, in the Ad2-transformed cell line HE2, the integrated heavily methylated late E2A promoter has been shown, by the genomic sequencing technique, to contain 5mC residues, not only in all C-G dinucleotides but also in one C-A and one C-T dinucleotide sequence. Hence, 5mC can occur in a silenced mammalian DNA sequence in dinucleotides other than C-G. This finding raises the question of whether 5mC in non-C-G dinucleotides is maintained in the methylated state during continuous cell propagation (74, 108).

2. INITIATION AND SPREADING OF de Novo DNA METHYLATION IN INTEGRATED Ad12 DNA IN HAMSTER CELLS (109)

The establishment of de novo generated patterns of DNA methylation is characterized by the gradual spreading of DNA methylation in newly integrated foreign DNA. We have used integrated Ad12 genomes in hamster tumor cells as a model system to study the mechanism of de novo DNA methylation. Ad12 induces tumors in neonate hamsters, and the viral DNA is integrated into the hamster genome, usually nearly intact and in an orientation that is colinear with that of the virion genome (14). The integrated Ad12 DNA in the tumor cells is initially weakly methylated at the CCGG sequences. Upon explantation of the tumor cells into culture medium, DNA methylation at CCGG sequences gradually spreads across the integrated viral genomes with increasing passages of cells in culture. Contrary to expectations, the de novo methylation of integrated Ad12 DNA in Ad12-induced hamster tumor cells does not spread into the viral genome from the abutting cellular DNA sequences, but is reproducibly initiated in paracentral regions of the integrated viral genome and progresses from there in either direction across major internal parts of the Ad12 genome.
Eventually, the genome is strongly methylated, except for 15 map units on the left terminus and 10 map units on the right terminus of the integrated Ad12 DNA, which remain hypomethylated. Very similar patterns have been observed in tumor cell lines with different sites of Ad12 DNA integration. In contrast, the levels of DNA methylation do not seem to change after tumor-cell explantation in several segments of hamster cell DNA of the unique or repetitive type. Restriction (HpaII) and Southern-blot experiments have been performed using selected cloned hamster cellular-DNA probes. The data suggest that, in the integrated foreign DNA, nucleotide sequences or structures or chromatin arrangements exist that can be preferentially recognized by the system responsible for de novo DNA methylation in mammalian cells (109). A combination of these parameters may be decisive in determining pattern formation.
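Map units express positions as percentages of the genome length, so the hypomethylated termini described above can be translated into approximate nucleotide coordinates once a genome length is chosen. The following Python sketch is illustrative only: the function name is hypothetical, and the genome length of 34,000 bp is a round placeholder assumed for Ad12 DNA, not a value given in this text.

```python
def map_units_to_bp(start_mu, end_mu, genome_length_bp):
    """Convert a map-unit interval (0-100) into base-pair coordinates."""
    if not 0 <= start_mu <= end_mu <= 100:
        raise ValueError("map units must satisfy 0 <= start <= end <= 100")
    start_bp = round(start_mu * genome_length_bp / 100)
    end_bp = round(end_mu * genome_length_bp / 100)
    return start_bp, end_bp


if __name__ == "__main__":
    ASSUMED_AD12_LENGTH_BP = 34_000  # placeholder assumption, not from the text
    left_terminus = map_units_to_bp(0, 15, ASSUMED_AD12_LENGTH_BP)
    right_terminus = map_units_to_bp(90, 100, ASSUMED_AD12_LENGTH_BP)
    print("left hypomethylated region (bp):", left_terminus)
    print("right hypomethylated region (bp):", right_terminus)
```

Under this assumption each terminal region spans a few kilobases, which gives a sense of the scale over which the integrated genome stays hypomethylated.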


E. Insertion of Foreign (Ad12) DNA and Changes of DNA Methylation in the Adjacent Cellular DNA Sequences (9)

The methylation patterns in the genome of mammalian cells are remarkably stable, although occasional changes are observed. In mammalian cells, the unmethylated DNA of human adenovirions (102) undergoes de novo methylation after integration into the host hamster genome (110). The site of linkage between the left terminus of Ad12 DNA and unique hamster DNA in the Ad12-induced tumor T1111(2) has been analyzed in detail (22). In what way, if any, are the methylation patterns of the adjacent cellular DNA segments affected by the insertion of unmethylated foreign (Ad12) DNA? In normal hamster kidney and spleen DNA and in several Ad12-transformed hamster cell lines, this pre-insertion sequence is completely methylated at the CCGG (HpaII) and GCGC (HhaI) sequences. The same pre-insertion sequences in the DNA of cell line BHK21 and on the non-occupied chromosome in the tumor cell line H1111(2) in passage 9 are almost completely methylated. In contrast, the same sequence on the chromosome that carries the integrated Ad12 DNA sequence in the tumor T1111(2) is unmethylated at the CCGG and GCGC sequences, as are the abutting Ad12 DNA sequences. Thus, the insertion of unmethylated foreign DNA can lead to the hypomethylation of the flanking cellular DNA in the target sequences.

The data presented from this limited, though exemplary, analysis do not yet allow far-reaching conclusions. However, similar results on alterations of patterns of cellular DNA methylation at the sites of retroviral DNA insertion have been reported (111). Extensive stretches of cellular DNA abutting and extending from the sites of foreign DNA integration must be investigated for possible changes in patterns of methylation. At present, it is not known whether these changes involve only a few hundred nucleotides or affect regions of many thousands of base-pairs of cellular DNA.
This problem is intriguing to investigate, because changes in cellular DNA methylation may cause changes in the expression patterns of cellular DNA as a consequence of foreign DNA integration. These changes in turn could decisively contribute to the generation of the oncogenic phenotype.

VI. Alterations in Cellular Gene Expression in Adenovirus-Infected and -Transformed Cells (112)

The interaction of adenoviruses with mammalian cells is characterized by the regulated expression of the free viral genome in infected cells or of the integrated viral genomes in transformed cells that are probably subject to a different type of control. In both adenovirus-infected and -transformed cells, alterations in the transcription levels of cellular genes appear to play a decisive role. In adenovirus-infected human cells, a paradigm productive system, many of the cellular genes are turned off, particularly at late times after infection (113). In adenovirus-transformed cells, detailed analyses on a large number of cellular genes and on alterations of their transcriptional activities in comparison to non-transformed cells have been performed (114).

We have searched for possible changes in the transcriptional program of 40 specific cellular genes in adenovirus-infected and -transformed cells. This necessarily limited approach is only a beginning. The selection of identified cellular genes or gene segments as hybridization probes offers obvious advantages, although the logistic difficulties become considerable when the activities of hundreds or even thousands of cellular genes are to be determined. On the other hand, we suspect that a true understanding of the transformed state of a cell will depend on the availability of a catalog of alterations in its transcriptional activity patterns.

The present study was initiated because of our earlier observation that the integration of Ad12 DNA into the hamster genome in Ad12-induced tumor cells can lead to changes in the patterns of cellular DNA methylation in the immediate vicinity of the site of foreign DNA insertion (9). While these investigations will be expanded to more extensive stretches of cellular DNA sequences in the environment of the integration site(s), the data available render it conceivable that alterations of methylation patterns in cellular DNA could have a profound effect on cellular transcription patterns and might thus contribute to the mechanism of viral transformation (11). Patterns of DNA methylation are coupled in a complex way to the activity levels of eukaryotic genes (2-8, 11).
Forty different cellular genes or gene segments were used as hybridization probes to analyze the cytoplasmic RNA from Ad2-infected KB cells, from Ad5-transformed human cells (293), and from several Ad2- or Ad12-transformed hamster cell lines. Many of the genes probed were not expressed in human or hamster cells. Transcription of the adenosylphosphoribosyl-transferase (ADPRT) gene and the heat-shock-protein-70 gene was increased in Ad2-infected KB cells and in 293 cells. In Ad2-infected KB cells, c-myc gene transcription was decreased. In 293 cells and in three adenovirus-transformed hamster cell lines (T637, BHK21-Ad2E1A-E1B, and BHK21-Ad2HindIII-G), the transcription of the c-jun gene was increased, whereas c-myc transcription was decreased in the latter two cell lines. The data presented here demonstrate that, among 40 different mammalian gene probes, alterations in steady-state levels of RNA were detected for five of these genes. These findings imply that a rather large number of host cellular genes might be affected in their expression levels by adenovirus infection or transformation. Substantial experimental efforts are required to catalog and understand the functional meaning of these alterations in cellular transcription patterns. We feel motivated by the present data to contribute to the elucidation of these complex programs, particularly in adenovirus-transformed cells.

VII. A Different View of Insertional Mutagenesis

Viral (foreign) DNA can integrate into mammalian genomes at many different sites. Frequently, these cellular integration sites are characterized by a state of transcriptional activity. Perhaps the specific chromatin structure associated with active transcription or replication predisposes such cellular sequences for recombination with foreign (viral) or any artificially introduced DNA. In a limited number of instances, the patterns of methylation in the DNA sequences surrounding sites of foreign DNA integration can be fundamentally altered. So far, such changes have been documented only for rather short stretches of cellular DNA. Loss of methylation can be associated with the gain of transcriptional activity; an increase of the de novo methylation of previously unmethylated cellular DNA sequences can lead to the silencing of cellular genes. In this way, it is conceivable that the insertion of foreign (adenoviral) DNA into a preexisting mammalian genome could entail a very considerable change in the pattern of transcription and gene expression in these cells. I suggest the possibility that such alterations could play a decisive role in the malignant transformation of cells by viruses.

Changes of cellular gene expression in adenovirus-transformed cells have been documented. Among the few genes tested, a surprisingly high proportion is affected in their transcriptional activities in adenovirus-transformed or Ad12-induced tumor cells (112). Although we do not yet understand the reasons for these activity changes, they are consistent with the idea that insertion of foreign DNA could have activated or inactivated cellular genes. Since insertion can occur at many different cellular sequences, it would be expected that the alterations in the spectrum of transcriptional activities involve different sets of cellular genes in different adenovirus-transformed cells or Ad12-induced tumor cells.
These and other considerations make it a prime goal to investigate possible changes in patterns of methylation and transcription in long segments encompassing many kilobases of cellular DNA abutting sites of adenovirus DNA integration. As a motivation and goal of this research program, I have formulated the following working hypothesis. By the insertion of foreign DNA into an established mammalian genome with a well-regulated transcriptional program, the DNA at the site of insertion is disrupted and a gene, if located at this site, may be inactivated; this is the traditional view of the term “insertional mutagenesis.” In an extension of this well-founded consideration, I propose that a multitude of genes and functional genetic elements in the closer or even wider vicinity of this insertion site might be affected in their transcriptional activity by changes in DNA methylation as a consequence of foreign DNA insertion. We do not yet understand the mechanism by which foreign (adenoviral) DNA integration can influence, increase or decrease, de novo generate, or abolish patterns of DNA methylation. However, extensive alterations of cellular transcriptional programs linked via alterations in patterns of DNA methylation as a consequence of adenovirus DNA or other DNA integration suggest a genetically stable mechanism for virus-induced transformation and oncogenesis that could operate independently of or in conjunction with viral gene products, like proteins encoded in the E1 region of adenovirus DNA (for reviews, see 115-117).

ACKNOWLEDGMENTS

The contributions that many pre- and postdoctoral colleagues in my laboratory have made to this research are referenced in the list of original publications. I thank Petra Bohm for expert editorial work. I am indebted to the Deutsche Forschungsgemeinschaft through SFB274-TP1 and to the Bundesministerium fur Forschung und Technologie through Genzentrum Koln (TP2.03) for grant support. The Fritz Thyssen Stiftung and the Alexander von Humboldt-Stiftung provided valuable fellowships for Stefan Kochanek, Joachim Schorr, Miklos Toth, and Guangming Xiong.

REFERENCES

1. R. M. Kotin, J. C. Menninger, D. C. Ward and K. I. Berns, Genomics 10, 831 (1991).
2. W. Doerfler, Biol. Chem. Hoppe-Seyler 372, 557 (1991).
3. W. Doerfler, J. Gen. Virol. 57, 1 (1981).
4. W. Doerfler, ARB 52, 93 (1983).
5. W. Doerfler, Nucleic Acids Mol. Biol. 3, 92 (1989).
6. W. Doerfler, Philos. Trans. R. Soc. London, Ser. B B326, 253 (1990).
7. W. Doerfler, Adv. Virus Res. 39, 89 (1991).
8. W. Doerfler, in “DNA Methylation: Molecular Biology and Biological Significance” (J. P. Jost and H. P. Saluz, eds.), p. 262. Birkhauser, Basel, Boston, Berlin, 1993.
9. U. Lichtenberg, C. Zock and W. Doerfler, Virus Res. 11, 335 (1988).
10. W. Doerfler, J. Virol. 6, 652 (1970).
11. W. Doerfler, in “Malignant Transformation by DNA Viruses: Molecular Mechanisms” (W. Doerfler and P. Bohm, eds.), p. 141. Verlag Chemie, Weinheim, New York, Basel, Cambridge, 1992.


12. W. Doerfler, PNAS 60, 636 (1968).
13. W. Doerfler, Curr. Top. Microbiol. Immunol. 101, 127 (1982).
14. W. Doerfler, R. Gahlmann, S. Stabel, R. Deuring, U. Lichtenberg, M. Schulz, D. Eick and R. Leisten, Curr. Top. Microbiol. Immunol. 109, 193 (1983).
15. R. Deuring, G. Klotz and W. Doerfler, PNAS 78, 3142 (1981).
16. R. Deuring, U. Winterhoff, F. Tamanoi, S. Stabel and W. Doerfler, Nature 293, 81 (1981).
17. S. Stabel and W. Doerfler, NARes 10, 8007 (1982).
18. R. Gahlmann, R. Leisten, L. Vardimon and W. Doerfler, EMBO J. 1, 1101 (1982).
19. R. Gahlmann and W. Doerfler, NARes 11, 7347 (1983).
20. R. Deuring and W. Doerfler, Gene 26, 283 (1983).
21. M. Schulz and W. Doerfler, NARes 12, 4959 (1984).
22. U. Lichtenberg, C. Zock and W. Doerfler, J. Virol. 61, 2719 (1987).
23. R. Jessberger, B. Weisshaar, S. Stabel and W. Doerfler, Virus Res. 13, 113 (1989).
24. R. Gahlmann, M. Schulz and W. Doerfler, EMBO J. 3, 3263 (1984).
25. M. Schulz, U. Freisem-Rabien, R. Jessberger and W. Doerfler, J. Virol. 61, 344 (1987).
26. J. Groneberg, D. Sutter, H. Soboll and W. Doerfler, J. Gen. Virol. 40, 635 (1978).
27. D. Eick, S. Stabel and W. Doerfler, J. Virol. 36, 41 (1980).
28. I. Kuhlmann, S. Achten, R. Rudolph and W. Doerfler, EMBO J. 1, 79 (1982).
29. J. McDougall, A. R. Dunn and K. W. Jones, Nature 236, 346 (1972).
30. T. Rosahl and W. Doerfler, Virology 162, 494 (1988).
31. G. Orend, A. Linkwitz and W. Doerfler, submitted (1993).
32. J. Groneberg, D. T. Brown and W. Doerfler, Virology 64, 115 (1975).
33. M. Mandel and A. Higa, JMB 53, 159 (1970).
34. F. L. Graham and A. J. van der Eb, Virology 52, 456 (1973).
35. U. Wienhues, K. Hosokawa, A. Hoveler, B. Siegmann and W. Doerfler, DNA 6, 81 (1987).
36. K. Sato and K. Hosokawa, J. Biochem. 95, 1031 (1984).
36a. R. Schubbert, C. Lettmann and W. Doerfler, submitted (1993).
37. C. M. Radding, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 193. American Society for Microbiology, Washington, D.C., 1988.
38. S. Stabel, W. Doerfler and R. R. Friis, J. Virol. 36, 22 (1980).
39. I. Kuhlmann and W. Doerfler, Virology 118, 169 (1982).
40. R. Jessberger, D. Heuss and W. Doerfler, EMBO J. 8, 869 (1989).
41. J. Tatzelt, B. Scholz, K. Fechteler, R. Jessberger and W. Doerfler, JMB 226, 117 (1992).
42. J. Tatzelt, K. Fechteler, P. Langenbach and W. Doerfler, PNAS 90 (1993).
43. N. Watson, Gene 70, 399 (1988).
44. W. Kruijer, F. M. A. van Schaik, J. G. Speijer and J. S. Sussenbach, Virology 128, 140 (1983).
45. J. Sprengel, B. Schmitz, D. Heuss, C. Zock and W. Doerfler, submitted (1993).
46. R. Jessberger and P. Berg, MCBiol 11, 445 (1991).
47. S. T. Tjia, G. Meyer zu Altenschildesche and W. Doerfler, Virology 125, 107 (1983).
48. H. Lubbert and W. Doerfler, J. Virol. 50, 497 (1984).
49. H. Lubbert and W. Doerfler, J. Virol. 52, 255 (1984).
50. C. Oellig, B. Happ, T. Müller and W. Doerfler, J. Virol. 61, 3048 (1987); corrigendum: J. Virol. 63, 1494 (1989).
51. B. Happ, J. Li and W. Doerfler, J. Virol. 65, 89 (1991).
52. C. Schetter, C. Oellig and W. Doerfler, J. Virol. 64, 1844 (1990).
53. R. Krappa, A. Behn-Krappa, F. Jahnel, W. Doerfler and D. Knebel-Morsdorf, J. Virol. 66, 3494 (1992).
54. G. Xiong, J. Schorr, S. T. Tjia and W. Doerfler, Virus Res. 21, 65 (1991).
55. J. Schorr and W. Doerfler, Virus Res. 28, 153 (1993).


56. G. E. Smith, M. D. Summers and M. J. Fraser, MCBiol 3, 2156 (1983).
57. W. Doerfler, Curr. Top. Microbiol. Immunol. 131, 51 (1986).
58. V. A. Luckow and M. D. Summers, BioTechnology 6, 47 (1988).
59. L. K. Miller, Annu. Rev. Microbiol. 42, 177 (1988).
60. D. W. Miller and L. K. Miller, Nature 299, 562 (1982).
61. M. J. Fraser, G. E. Smith and M. D. Summers, J. Virol. 47, 287 (1983).
62. P. D. Friesen, W. C. Rice, D. W. Miller and L. K. Miller, MCBiol 6, 1599 (1986).
63. E. B. Carstens, Virology 161, 8 (1987).
64. B. Beames and M. D. Summers, Virology 168, 344 (1989).
65. M. A. Gonzales, G. E. Smith and M. D. Summers, Virology 170, 160 (1989).
66. S. T. Tjia, E. B. Carstens and W. Doerfler, Virology 99, 399 (1979).
67. G. E. Smith and M. D. Summers, Virology 89, 517 (1978).
68. R. J. Roberts, G. Akusjärvi, P. Alestrom, R. E. Gelinas, T. R. Gingeras, D. Sciaky and U. Pettersson, “Adenovirus DNA,” Dev. Mol. Virol. 8, 1 (1986).
69. S. Kochanek, M. Toth, A. Dehmel, D. Renz and W. Doerfler, PNAS 87, 8830 (1990).
70. S. Kochanek, A. Radbruch, H. Tesch, D. Renz and W. Doerfler, PNAS 88, 5759 (1991).
71. S. Kochanek, D. Renz and W. Doerfler, EMBO J. 12, 1141 (1993).
72. A. Behn-Krappa, I. Holker, U. Sandaradura de Silva and W. Doerfler, Genomics 11, 1 (1991).
73. T. Bestor, A. Laudano, R. Mattaliano and V. Ingram, JMB 203, 971 (1988).
74. M. Toth, U. Müller and W. Doerfler, JMB 214, 673 (1990).
75. E. Li, T. H. Bestor and R. Jaenisch, Cell 69, 915 (1992).
76. M. McClelland and M. Nelson, Gene 74, 291 (1988).
77. E. M. Southern, JMB 98, 503 (1975).
78. C. Waalwijk and R. A. Flavell, NARes 5, 3231 (1978).
79. G. M. Church and W. Gilbert, PNAS 81, 1991 (1984).
80. H. P. Saluz and J. P. Jost, “A Laboratory Guide to Genomic Sequencing.” Birkhauser, Basel, Boston, 1987.
81. G. P. Pfeifer, S. D. Steigerwald, P. R. Mueller, B. Wold and A. D. Riggs, Science 246, 810 (1989).
82. J. P. Jost and H. P. Saluz, Eds., “DNA Methylation: Molecular Biology and Biological Significance.” Birkhauser, Basel, Boston, Berlin, 1993.
83. I. Kruczek and W. Doerfler, PNAS 80, 7586 (1983).
84. P. Dobrzanski, A. Hoeveler and W. Doerfler, J. Virol. 62, 3941 (1988).
85. R. Jüttermann, K. Hosokawa, S. Kochanek and W. Doerfler, J. Virol. 65, 1735 (1991).
86. D. Knebel, H. Lubbert and W. Doerfler, EMBO J. 4, 1301 (1985).
87. L. Vardimon, A. Kressmann, H. Cedar, M. Maechler and W. Doerfler, PNAS 79, 1073 (1982).
88. K.-D. Langner, L. Vardimon, D. Renz and W. Doerfler, PNAS 81, 2950 (1984).
89. K.-D. Langner, U. Weyer and W. Doerfler, PNAS 83, 1598 (1986).
90. B. Weisshaar, K.-D. Langner, R. Jüttermann, U. Müller, C. Zock, T. Klimkait and W. Doerfler, JMB 202, 255 (1988).
91. B. Knust, U. Bruggemann and W. Doerfler, J. Virol. 63, 3519 (1989).
92. R. Hermann, A. Hoeveler and W. Doerfler, JMB 210, 411 (1989).
92a. C. Lettmann, B. Schmitz and W. Doerfler, NARes 19, 7131 (1991).
93. C. Coulondre, J. H. Miller, P. J. Farabaugh and W. Gilbert, Nature 274, 775 (1978).
94. H. Subak-Sharpe, R. R. Burk, L. V. Crawford, J. M. Morrison, J. Hay and H. M. Keir, CSHSQB 31, 737 (1966).
95. A. J. Levine, N. Engl. J. Med. 326, 1350 (1992).
96. E. R. Fearon and B. Vogelstein, Cell 61, 759 (1990).


97. A. P. Feinberg and B. Vogelstein, Nature 301, 89 (1983).
98. S. Achten, A. Behn-Krappa, M. Jücker, J. Sprengel, I. Holker, B. Schmitz, H. Tesch, V. Diehl and W. Doerfler, Cancer Res. 51, 3702 (1991).
99. R. Shemer, T. Kafri, A. O’Conell, S. Eisenberg, J. L. Breslow and A. Razin, PNAS 88, 11300 (1991).
100. D. B. Willis and A. Granoff, Virology 107, 250 (1980).
101. C. Schetter, B. Grunemann, I. Holker and W. Doerfler, submitted (1993).
102. U. Gunthert, M. Schweiger, M. Stupp and W. Doerfler, PNAS 73, 3923 (1976).
103. U. Wienhues and W. Doerfler, J. Virol. 56, 320 (1985).
104. D. Eick, H.-J. Fritz and W. Doerfler, Anal. Biochem. 135, 165 (1983).
105. L. Vardimon, R. Neumann, I. Kuhlmann, D. Sutter and W. Doerfler, NARes 8, 2461 (1980).
106. D. Kammer, S. Kochanek and W. Doerfler, unpublished results.
107. D. B. Willis, R. Goorha and V. G. Chinchar, Curr. Top. Microbiol. Immunol. 116, 77 (1985).
108. M. Toth, U. Lichtenberg and W. Doerfler, PNAS 86, 3728 (1989).
109. G. Orend, I. Kuhlmann and W. Doerfler, J. Virol. 65, 4301 (1991).
110. D. Sutter, M. Westphal and W. Doerfler, Cell 14, 569 (1978).
111. D. Jähner and R. Jaenisch, Nature 315, 594 (1985).
112. T. Rosahl and W. Doerfler, Virus Res. 6, 71 (1992).
113. A. Babich, L. T. Feldman, J. R. Nevins, J. E. Darnell, Jr. and C. Weinberger, MCBiol 3, 1212 (1983).
114. A. J. van der Eb and A. Zantema, in “Malignant Transformation by DNA Viruses: Molecular Mechanisms” (W. Doerfler and P. Bohm, eds.), p. 115. Verlag Chemie, Weinheim, New York, Basel, Cambridge, 1992.
115. A. J. Berk, ARGen 20, 45 (1986).
116. J. R. Nevins, Microbiol. Rev. 51, 419 (1987).
117. J. Flint and T. Shenk, ARGen 23, 141 (1989).

Posttranscriptional Control of the Lysogenic Pathway in Bacteriophage Lambda¹

AMOS B. OPPENHEIM, DANIEL KORNITZER² AND SHOSHY ALTUVIA³
Department of Molecular Genetics
The Hebrew University-Hadassah Medical School
Jerusalem, Israel 91910

DONALD L. COURT
Molecular Control and Genetics Section
Laboratory of Chromosome Biology
ABL-Basic Research Program
National Cancer Institute-Frederick Cancer Research and Development Center
Frederick, Maryland 21702

I. λ Genes Involved in the Lysis/Lysogeny Decision
II. RNase III in Posttranscriptional Regulation of λ Genes
III. Stimulation of cII Translation by IHF
IV. Metabolic Instability of Phage Regulatory Proteins
V. Concluding Remarks
References

¹ The U.S. Government’s right to retain a nonexclusive royalty-free license in and to the copyright covering this paper, for governmental purposes, is acknowledged.
² Current address: Whitehead Institute, Nine Cambridge Center, Cambridge, Massachusetts 02142.
³ Current address: Cell Biology and Metabolism Branch, National Institute of Child Health and Development, National Institutes of Health, Bethesda, Maryland 20892.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 46
Copyright © 1993 by Academic Press, Inc. All rights of reproduction in any form reserved.

Bacteriophage λ is a temperate phage that infects Escherichia coli. λ can propagate itself in two different ways: by lytic growth and by lysogenic growth (1-4). Infection of a bacterial cell begins by adsorption of the phage to a specific receptor on the host cell surface, the LamB protein, and injection of the phage DNA into the cytoplasm. In lytic growth, the injected DNA promotes the synthesis of the proteins required for its replication, for phage morphogenesis, and for cell lysis, releasing after about 45 minutes some 100 new phage particles. During lysogenic development, the phage functions required for lytic growth are turned off, and genes required for the lysogenic pathway are turned on. These functions include the CI repressor, which inhibits lytic functions, and Int, which allows integration of the phage genome into the bacterial chromosome by site-specific recombination. From that time, and as long as the lytic functions are repressed, the phage genome passively multiplies along with the bacterial chromosome. The repressor maintains the prophage in a dormant state. In the event that the repressor protein in the cell is inactivated, the prophage resumes lytic growth.

The ability of λ to choose between lytic and lysogenic development has evolved as a response to changes in the physiological state of the host cell. The frequency with which the phage enters the lytic or lysogenic pathway is determined in large measure by the nutritional state of the host cell. For example, starved cells lysogenize more efficiently than cells grown in a rich medium (2, 5, 6). In addition, the number of infecting phage particles per cell is also known to determine the rate of lysogenization: the higher the multiplicity of infection, the higher the rate of lysogenization (5). It is probable that the physiological state of the cell is signaled to the phage by host regulatory factors. The study of the mechanisms by which bacteriophage λ utilizes host functions has led to the discovery of E. coli genes participating in transcription, recombination, replication, and the heat-shock response.
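The dependence of lysogenization on the multiplicity of infection can be illustrated with a short calculation. The sketch below is not taken from the studies cited here: it assumes that the number of phages adsorbing to a cell follows a Poisson distribution with mean equal to the moi, and it uses a hypothetical threshold of two phages per cell purely for illustration. Under these assumptions it shows how quickly the fraction of multiply infected cells rises with the moi.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k phages.

    Assumption: phages adsorb independently, so the number per cell is
    Poisson-distributed with mean equal to the multiplicity of infection.
    """
    return math.exp(-moi) * moi**k / math.factorial(k)


def fraction_multiply_infected(moi: float, threshold: int = 2) -> float:
    """Among infected cells (at least one phage), the fraction that
    received at least `threshold` phages.

    `threshold` is a hypothetical illustration value, not a measured one.
    Requires moi > 0.
    """
    p_infected = 1.0 - poisson_pmf(0, moi)
    p_below = sum(poisson_pmf(k, moi) for k in range(threshold))
    return (1.0 - p_below) / p_infected


if __name__ == "__main__":
    for moi in (0.1, 1.0, 5.0):
        print(f"moi={moi}: {fraction_multiply_infected(moi):.3f}")
```

The calculation says nothing about the mechanism; it shows only that a gene-dosage effect can respond steeply to the moi.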
The control of transcription initiation from the early promoters of bacteriophage λ is not sufficient to provide for subtle controls involved with the switch between lysogenic and lytic pathways. In this essay, we review recent developments in the study of λ phage gene expression showing that posttranscriptional control of phage genes by host functions provides for highly sensitive regulatory circuits.

I. λ Genes Involved in the Lysis/Lysogeny Decision

FIG. 1. Early control genes and transcription. Transcription from the two major early promoters, pL and pR, is indicated by the dashed arrows. In the absence of N, transcription terminates at tL1 and tR1. In the presence of N, RNA polymerase transcribes through all the terminators shown. In the presence of CII protein, the pI, pE, and paQ transcripts are initiated. When Q is made, it causes transcription antitermination from the pR′ promoter to allow expression of R and other lytic late genes. The LexA-controlled pOOP transcript is indicated.

Upon entry of the phage DNA into the cytoplasm, two divergent promoters are transcribed: pR and pL. The first genes to be transcribed are cro from pR and N from pL (Fig. 1) (7, 8). The N protein is a transcription antiterminator: it allows transcription from pR and pL to continue beyond the terminators located after the cro and N genes, respectively, including the pR-expressed cII gene and the pL-expressed cIII gene (8-10). The CII protein is a transcription modulator that promotes the lysogenic pathway: it activates the pE promoter of the repressor gene (cI) and the pI promoter of the int gene, which is required for the site-directed recombination of the phage genome with the bacterial chromosome (2, 11-14). Furthermore, CII depresses the expression of genes required for the lytic pathway. This is accomplished by activating the paQ promoter, thereby yielding an antisense RNA to the Q mRNA (Fig. 1) (15). Thus, the three functions carried out by CII are all accomplished by the activation of “silent” promoters on λ DNA. CII is a tetrameric DNA-binding protein, and, in vitro, is both necessary and sufficient for the activation of the three CII-dependent promoters (11, 16). CII is the “master switch” in the decision between the lytic and lysogenic pathways (2, 7, 17). The CII protein is unstable, with a half-life in the cell of approximately 1 minute (18). The function of the CIII protein is to stabilize CII (19, 20) and thereby raise its intracellular concentration. Since the level of CII is critical in the lysis/lysogeny switch, the CIII function plays a major role in this decision. Phages defective in cIII lysogenize less efficiently than wild-type phages (12, 21), whereas phages overexpressing the cIII gene lysogenize more efficiently than the wild type (19, 22). In addition, the phenomenon of higher frequency of lysogenization upon infection at a higher moi appears to depend on the number of copies of the cIII gene present in the cell (12, 23). When a sufficient level of CII protein is present to activate the switch toward lysogeny, repressor will be synthesized and will shut off the early promoters, pR and pL; the integrase protein, synthesized from pI, will promote the integration of the phage genome into the bacterial chromosome.
Once repression is established, cII transcription is turned off and the continued synthesis of repressor is directed by the maintenance promoter, pM. Alternatively, in the event that the level of CII is not sufficient to activate the switch, repressor will not be made and lytic development will be continued. The O and P proteins direct phage DNA replication; Q will allow late gene expression required for DNA packaging and cell lysis (2, 7, 8).
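The link between the instability of CII and its intracellular level can be made explicit with a simple kinetic sketch. It assumes, for illustration only, constant synthesis and first-order degradation (dP/dt = s - kP, with k = ln 2 / half-life). The half-life of about 1 minute comes from the text above; the synthesis rate and the longer half-life standing in for CIII-mediated stabilization are hypothetical values. In this model the steady-state level is s/k, so any increase in half-life raises the protein level in the same proportion.

```python
import math


def decay_rate(half_life_min: float) -> float:
    """First-order decay constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def steady_state_level(synthesis_rate: float, half_life_min: float) -> float:
    """Steady-state level of a protein made at a constant rate and
    degraded with first-order kinetics: P_ss = s / k."""
    return synthesis_rate / decay_rate(half_life_min)


def level_at(t_min: float, synthesis_rate: float, half_life_min: float,
             initial: float = 0.0) -> float:
    """Protein level at time t: P(t) = P_ss + (P0 - P_ss) * exp(-k t)."""
    k = decay_rate(half_life_min)
    p_ss = synthesis_rate / k
    return p_ss + (initial - p_ss) * math.exp(-k * t_min)


if __name__ == "__main__":
    rate = 100.0  # hypothetical synthesis rate, molecules per minute
    for half_life in (1.0, 4.0):  # 1 min from the text; 4 min is hypothetical
        level = steady_state_level(rate, half_life)
        print(f"half-life {half_life} min -> steady state {level:.0f}")
```

The same expression with the synthesis rate set to zero describes how quickly an unstable regulator disappears once its synthesis is shut off.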


II. RNase III in Posttranscriptional Regulation of λ Genes

RNase III is a cellular double-strand-specific endoribonuclease, which is involved in the maturation process of rRNA, and also in the processing of various cellular and phage mRNAs (24, 25). RNase III, encoded by the rnc gene, is a 25-kDa protein that acts as a dimer. All known RNase-III processing sites have the potential to form stem-and-loop structures that are cut on both strands, with a 2-base stagger, yielding a 3' overhang (26, and references therein). In cells defective in RNase III (rnc⁻), some host transcripts, such as pnp mRNA (27), and rnc mRNA itself (28), are stabilized and gene expression is enhanced. Other genes, like the 0.3, 1.1, and 1.2 genes of phage T7, require RNase III for efficient expression (29). Here we describe the effect of RNase III on the expression of the four λ genes: N, cII, cIII, and int.

A. Stimulation of N Translation by RNase III

The N gene is the first gene transcribed from the major leftward promoter (pL). Expression of cII and cIII requires N-dependent transcription antitermination. This modulation by N allows the phage to respond to external stimuli and adjust its development accordingly. The expression of the N gene is controlled transcriptionally, translationally, and posttranslationally. Transcription of N from the pL promoter is negatively controlled by the λ CI and Cro repressors; translation of N is positively controlled by RNase III; stability of the N protein is negatively controlled by the Lon protease of E. coli. In addition, recent experiments show that N protein negatively autoregulates its own translation (30). The AUG translation-initiation signal for N is located 223 nucleotides from the start of transcription (Fig. 2). This N leader RNA forms a large secondary structure, consisting of a 21-bp stem-bulge-stem structure with a 90-nucleotide loop. This structure is processed at two sites by RNase III, after nucleotides 88 and 197 of the N mRNA (31, 32). Processing does not affect N mRNA stability (33, 34). However, N expression, as measured using an N-lacZ protein fusion construct, is enhanced three- to fivefold in wild-type cells as compared to RNase-III-deficient (rnc⁻) cells. It has been suggested that the stem structure processed by RNase III inhibits ribosome access to the N Shine-Dalgarno region (33). This model is consistent with the observation that a genetic deletion of the stem structure region results in a high constitutive expression of N, which is now RNase-III-independent.

FIG. 2. Structure of the RNase-III site in the N leader. The bases of the N leader are numbered relative to the first base of the pL transcript. The secondary structures drawn have been determined by Steege et al. [J. Biol. Chem. 262, 17651-17658 (1987)] and include the boxB stem and loop of nutL to the left of base 65 and the larger RNase-III-processed stem and loop. Sites processed by RNase III are indicated by arrows. The RA18 deletion endpoints are represented by lines drawn between bases 64 and 65 and bases 82 and 83. The Shine-Dalgarno sequence of gene N is underlined, as is the AUG initiation codon.

B. Stimulation of cIII Translation by RNase III

The cIII mRNA region that includes the ribosome binding site and the beginning of the coding sequence is predicted to be able to form two alternative secondary structures (Fig. 3) (19, 35). One of these structures has its ribosome binding site occluded (cIII-OFF structure), and is therefore expected to be inactive in translation; in the alternate structure (cIII-ON), the ribosome binding site is available for binding. RNA structure-probing experiments showed that the wild-type cIII mRNA is found in these two structures at equilibrium. Mutations that either increase or decrease the rate of translation of cIII have been isolated: mutations that increase translation favor the formation of the cIII-ON structure (i.e., tor862), whereas mutations that decrease translation cause the cIII RNA to assume the cIII-OFF structure (i.e., a C-to-A change at position 20) (Fig. 3) (36, 37). The rate of translation is reflected in the ability of these various mRNAs to bind to purified 30S ribosomal subunits in vitro, demonstrating that the effect of these mutations is at the level of translation initiation.

FIG. 3. Alternative cIII mRNA structures. The two structures for the cIII mRNA exist in equilibrium. The structure on the left contains the Shine-Dalgarno sequence and the AUG initiation codon (indicated by the bases in boldface) within the base-paired region, and cIII translation is OFF. The structure on the right has the Shine-Dalgarno sequence and the AUG codon open for ribosome binding, and cIII translation is ON. RNase III may bind and stabilize the ON configuration. Mutation tor862 shifts the equilibrium to the right and favors cIII expression, even in the absence of RNase III. Mutation 20 shifts the equilibrium to the left and blocks cIII expression.

A set of cIII-lacZ gene-fusion constructs shows that the translation of cIII is strongly dependent on RNase III. Since RNase III is capable of binding to, and distinguishing between, the ON and OFF structures (35, 38), binding of RNase III could shift the equilibrium between the alternative structures toward the translationally ON state. It is hypothesized that RNase-III binding, and not processing, is responsible for the modulation of cIII gene expression. In support of the model, the response to RNase III was lost in mutants frozen in the ON or OFF state. In vitro structure-probing experiments show that under low Mg²⁺ or high-temperature conditions, the ON-OFF equilibrium is shifted toward the OFF structure. Thus, the ON structure appears to be more sensitive to destabilizing conditions. This may be due to the existence of tertiary interactions required to stabilize the ON structure. If RNase-III binding can affect cIII gene expression without RNA cleavage, this form of regulation may be extended to other regulatory systems. It is therefore striking that among the proteins with high homology to RNase III is the human TAR-RNA-binding protein (26).
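The proposal that RNase-III binding, rather than cleavage, shifts the ON-OFF equilibrium can be phrased as a standard two-state linkage model. The sketch below is an illustration under stated assumptions, not a description of the experiments: the equilibrium constant `k_eq` = [ON]/[OFF] in the absence of RNase III, the free RNase III concentration `rnase`, and the dissociation constants `kd_on` and `kd_off` are all hypothetical parameters. If the protein binds the ON structure more tightly than the OFF structure (`kd_on` < `kd_off`), adding it raises the fraction of mRNA in the ON state. A mutant frozen in either state (very large or very small `k_eq`) no longer responds.

```python
def fraction_on(k_eq: float, rnase: float, kd_on: float, kd_off: float) -> float:
    """Fraction of mRNA in the ON structure (free plus protein-bound).

    Two-state linkage model with hypothetical parameters:
      k_eq   = [ON]/[OFF] without RNase III
      rnase  = free RNase III concentration (same units as the Kd values)
      kd_on  = dissociation constant of RNase III for the ON structure
      kd_off = dissociation constant of RNase III for the OFF structure
    """
    on_weight = k_eq * (1.0 + rnase / kd_on)   # statistical weight of ON states
    off_weight = 1.0 + rnase / kd_off          # statistical weight of OFF states
    return on_weight / (on_weight + off_weight)


if __name__ == "__main__":
    # Wild-type-like mRNA: both structures populated without RNase III.
    print(fraction_on(1.0, 0.0, 1.0, 10.0))
    print(fraction_on(1.0, 10.0, 1.0, 10.0))
    # Mutant "frozen" ON: insensitive to RNase III in this model.
    print(fraction_on(1e6, 0.0, 1.0, 10.0), fraction_on(1e6, 10.0, 1.0, 10.0))
```

The model makes no statement about how the two structures are distinguished; it only shows that preferential binding is sufficient to shift the equilibrium.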

C. OOP-RNA-Mediated Degradation of cII mRNA by RNase III

The antisense OOP (Ori-, O-, and P-dependent) RNA transcript is complementary to 55 nucleotides at the 3' end of the cII gene and 22 nucleotides of the intercistronic region between the cII and the O genes. OOP RNA, when produced from a multicopy plasmid, inhibits cII gene expression 99% in an RNase-III-dependent fashion. A cleavage site in cII mRNA has been identified within the region of complementarity with OOP RNA (Fig. 1). This cleavage, which is absent in RNase-III-deficient cells, is followed by 3'-to-5' exonucleolytic processing of the cII mRNA. RNase III is therefore responsible for the OOP-RNA-dependent degradation of cII mRNA (39, 40). The pOOP promoter contains a recognition sequence for LexA, the repressor of the UV-inducible SOS response, and transcription of OOP is increased severalfold in a LexA-deficient strain. In line with this observation, the effect of the destabilization of cII mRNA by OOP RNA is detectable only after UV induction: under these conditions, a wild-type phage gives a twofold larger burst size than a phage defective in OOP RNA synthesis (41). Presumably, the higher level of cII mRNA in an OOP-defective phage leads to partial repression of the lytic functions, and hence to a smaller burst size. Interestingly, during infection by λ, no condition tested showed any effect of OOP on the lytic/lysogenic decision. Thus, the effect of OOP is probably restricted to action during prophage induction.

D. Effect of RNase III on int mRNA

During the two alternative pathways of λ development, the int gene is transcribed from two different promoters: pL and pI. Transcription from pI results in efficient synthesis of the Int protein, whereas transcription originating at the pL promoter results in poor int expression. Transcription originating at the pL promoter is altered by the N antitermination complex, and therefore extends through various terminators, including the tI terminator that is located 3' to the int coding sequence (Fig. 4). The extended RNA structure sib that includes the tI terminator structure is efficiently processed by RNase III (Fig. 4) (42). Once processed, the 3' end of the int mRNA is subjected to degradation by a 3'-to-5' exonuclease. This effect, termed “negative retroregulation,” has been reviewed in detail (13, 43). A good candidate for the 3'-to-5' exonuclease is the host enzyme polynucleotide phosphorylase, encoded by the pnp gene: a pnp⁻ host is defective in retroregulation (43). Transcription originating at the pI promoter, on the other hand, terminates at tI; the resulting structure at the 3' end is not processed by RNase III and, presumably, the stem structure is able to prevent the attack by exonuclease (44). Consequently, int is efficiently expressed from the pI promoter. Interestingly, int expression from pI is reduced in an RNase-III-deficient strain (43). This effect may be due to direct stabilization of the 3' end of the mRNA in an rnc⁺ strain by binding of RNase III to the terminator structure; alternatively, the higher levels of polynucleotide phosphorylase found in an rnc⁻ mutant (27) could result in partial degradation of the int mRNA, in spite of the presence of the stabilizing tI structure (26).

FIG. 4. int retroregulation. Transcription of int from the pL promoter (A, B, and C) and the pI promoter (D) yields mRNAs of different stabilities (Fig. 1). The pI transcript terminates at tI. The tI structure may be bound by RNase III, but it is resistant to RNase-III processing and to exonucleolytic attack by polynucleotide phosphorylase (PNPase). The pL transcript extends through the terminator, whereupon a different structure is formed (sib) that is sensitive to RNase III and can then be degraded by PNPase from the cut 3' end. RNase III is indicated by the shaded ellipse and the scissors. PNPase is indicated by the solid half-circles, and mRNA decay is indicated by the dashed line.

III. Stimulation of cII Translation by IHF

Integration host factor (IHF) is a small heterodimeric DNA-binding protein that binds to specific sequences and induces a DNA bend. The IHF subunits are encoded by the himA and himD genes. IHF plays a role in a variety of cellular processes, such as transposition, site-specific recombination, initiation of DNA replication, and phage packaging; it also participates in the control of gene expression (see the review by Friedman, 45). IHF can either repress or stimulate transcription (46-52). In particular, IHF is involved at various levels in the life cycle of phage λ: it is required for integration of the prophage into the chromosome and for packaging of the phage genome. Furthermore, IHF affects the transcription of the λ promoters pL, pR′, and pE (47, 51; H. Giladi and A. B. Oppenheim, unpublished data). Here we describe the effect of IHF on the translation of cII. IHF affects the expression of cII posttranscriptionally (20, 53, 54). The expression of a cII-lacZ gene fusion is greatly reduced in an IHF-deficient host, while the expression of a cII-lacZ operon fusion is unaffected. Furthermore, purified IHF stimulates cII expression in a coupled transcription/translation system in vitro. Genetic experiments suggest that the regulation of cII by IHF lies upstream from the cII coding sequence. An IHF binding site located upstream from the cII ribosome binding site may participate in this regulation (55). It is not known how IHF, a DNA-binding protein, affects translation. It is possible that by binding to DNA, IHF alters the rate of transcription and thereby affects mRNA structure at the translation-initiation region.


IV. Metabolic Instability of Phage Regulatory Proteins

One way to modulate protein activity is through the control of its stability. This could be achieved by the modification of the protein or the control of the expression of specific proteases, or through the synthesis of specific proteins that inhibit these proteases. It has been recognized in recent years that a number of phage and host proteins are highly unstable. These unstable proteins, termed “timing proteins” (56), include the phage regulators N, CII, and CIII. Thus, these regulators may attain sufficient concentration in the cell for only a short time. The degradation of these proteins is carried out by specific proteases whose level and activity can be regulated by the environmental conditions. The N protein is highly unstable; its half-life is 1-2 minutes. It is degraded by the host Lon protease (18). Thus, once N synthesis is repressed, its activity is rapidly removed. CII protein is also highly unstable. Host functions encoded by the hflA and hflB loci participate in the proteolysis of CII. The degradation of CII by HflA was reproduced in vitro (57), an achievement that should allow the detailed dissection of the mechanism by which Hfl recognizes CII. Two-dimensional gel electrophoresis of cells defective in Hfl shows that a number of host proteins are overrepresented (58). Thus, HflA could affect lysogenization indirectly via one or more of the proteins regulated by Hfl. During lysogenization, CIII protein stabilizes CII (19, 20). CIII also activates the heat-shock response, probably through the stabilization of the heat-shock-specific subunit of RNA polymerase, σ³² (59). CIII may independently stabilize each protein or it may act through an intermediate step. Analysis of a set of missense mutations in the cIII gene of phage λ and of phage HK022 that yield inactive CIII proteins showed that all the mutations are located in the relatively conserved central region of the protein (60).
A comparative analysis of the CIII protein sequence in λ, HK022, and the lambdoid bacteriophage P22 indicates that this central region assumes an amphipathic α-helical structure. This part of the λ cIII gene was cloned within the α-complementing fragment of the lacZ gene, and the resulting fusion protein displayed functional CIII activity. Mutations that yield a nonfunctional fusion protein cluster within the CIII moiety, indicating that the central portion of the CIII protein is both necessary and sufficient for CIII activity (60). The understanding of the mechanism by which CIII acts awaits the identification of the cellular target through which CIII exerts its effect.

The relative concentrations of Int and Xis, which play important roles in the life cycle, are discussed above. Genetic experiments show that, while Int activity is highly stable, the activity of Xis is short-lived (61). It was suggested that, when lysogeny is favored, Int possesses the ability to carry out the integration reaction for an extended period. In contrast, the reverse excision reaction, which requires the presence of Xis, can take place only as long as Xis is present.
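To put the half-life quoted above for N (1-2 minutes) in concrete terms, the following is a minimal sketch. It assumes simple first-order (exponential) decay once synthesis has stopped, which is an assumption made here for illustration and not a kinetic model taken from the cited work; the function name and the 5-minute time point are hypothetical.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of a protein pool left after elapsed_min minutes.

    Assumes first-order decay with the given half-life and no new
    synthesis (an illustrative assumption, not a measured model).
    """
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    # The text gives a half-life of 1-2 minutes for N; try both bounds.
    for half_life in (1.0, 2.0):
        left = fraction_remaining(5.0, half_life)
        print(f"half-life {half_life:.0f} min: {left:.1%} of N left after 5 min")
```

Under this assumption, roughly 3% (1-minute half-life) to 18% (2-minute half-life) of the N present when synthesis stops would remain five minutes later, which illustrates why repression of N synthesis removes its activity quickly.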

V. Concluding Remarks

Infecting λ phages can sense the physiological state of the host cell and vary their rate of lysogenization accordingly. Presumably, this signaling is mediated by host factors that affect the expression or level of the phage proteins that control the lysis/lysogeny decision. However, our knowledge about the ways by which the environment influences host genes is rather limited. Over 20 genes are transcribed from the two major promoters during λ development. Posttranscriptional regulation is therefore required to regulate these genes independently. Furthermore, these additional levels of regulation may allow greater sensitivity of the system to variations in the environment the phage finds upon infection. Modulation of gene expression in the lysogenic pathway at the level of the initiation of translation is found to operate in the key regulatory factors N, cII, and cIII. The mechanisms by which these functions are regulated remain to be elucidated.

ACKNOWLEDGMENTS

This work was supported in part by grants from the National Council for Research and Development, Israel, and the Gesellschaft für Biotechnologische Forschung, mbH, Braunschweig, Germany, and Grant GM38694 from the U.S. National Institutes of Health. Part of this work was performed in the Irene and Davide Sala Laboratory for Molecular Genetics at the Hebrew University, Hadassah Medical School. The research was also sponsored in part by the National Cancer Institute, Department of Health and Human Services, under Contract N01-CO-74101 with ABL. The contents of this publication do not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

REFERENCES

1. D. Friedman, E. R. Olson, C. Georgopoulos, K. Tilly, I. Herskowitz and F. Banuett, Microbiol. Rev. 48, 299 (1984).
2. I. Herskowitz and D. Hagen, ARGen 14, 399 (1980).
3. F. Sanger, A. R. Coulson, G. F. Hong, D. F. Hill and G. B. Petersen, JMB 162, 729 (1982).
4. M. B. Yarmolinsky, in “The Bacteriophage Lambda” (A. D. Hershey et al., eds.), p. 97. CSHLab, Cold Spring Harbor, New York, 1981.
5. P. Kourilsky, MGG 122, 183 (1973).
6. P. Kourilsky and A. Knapp, Biochimie 56, 1517 (1974).
7. M. Ptashne, “A Genetic Switch: Gene Control and Phage λ.” Cell Press and Blackwell Scientific, Cambridge, Massachusetts, 1986.
8. D. I. Friedman and M. Gottesman, in “Lambda II” (R. W. Hendrix, J. W. Roberts, F. W. Stahl and R. A. Weisberg, eds.), p. 21. CSHLab, Cold Spring Harbor, New York, 1983.
9. J. W. Roberts, Cell 52, 5 (1988).
10. J. W. Roberts, Nature 224, 1168 (1969).
11. H. Shimatake and M. Rosenberg, Nature 292, 128 (1981).
12. L. F. Reichardt, JMB 93, 267 (1975).
13. H. Echols and G. Guarneros, in “Lambda II” (R. W. Hendrix, J. W. Roberts, F. W. Stahl and R. A. Weisberg, eds.), p. 75. CSHLab, Cold Spring Harbor, New York, 1983.
14. D. Wulff and M. Rosenberg, in “Lambda II” (R. W. Hendrix, J. W. Roberts, F. W. Stahl and R. A. Weisberg, eds.), p. 53. CSHLab, Cold Spring Harbor, New York, 1983.
15. B. C. Hoopes and W. R. McClure, PNAS 82, 3134 (1985).
16. T. S. Ho and M. Rosenberg, in “The Bacteriophages” (R. Calendar, ed.). Plenum, New York, 1988.
17. A. Rattray, S. Altuvia, J. Mahajna, A. B. Oppenheim and M. Gottesman, J. Bact. 159, 238 (1984).
18. S. Gottesman, M. E. Gottesman, J. E. Shaw and M. E. Pearson, Cell 24, 225 (1981).
19. S. Altuvia and A. B. Oppenheim, J. Bact. 167, 415 (1986).
20. M. A. Hoyt, D. M. Knight, A. Das, H. I. Miller and H. Echols, Cell 31, 565 (1982).
21. A. D. Kaiser, Virology 3, 42 (1957).
22. B. J. Knoll, Virology 92, 518 (1979).
23. P. Kourilsky, Biochimie 56, 11 (1974).
24. R. J. Bram, R. A. Young and J. A. Steitz, Cell 19, 393 (1980).
25. R. Sirdeshmukh and D. Schlessinger, NARes 13, 5041 (1985).
26. D. Court, in “Control of mRNA Stability” (G. Brawerman and J. Belasco, eds.). Academic Press. In press.
27. C. Portier, L. Dondon, M. Grunberg-Manago and P. Regnier, EMBO J. 6, 2165 (1987).
28. J. C. A. Bardwell, P. Regnier, S. M. Chen, Y. Nakamura, M. Grunberg-Manago and D. L. Court, EMBO J. 8, 3401 (1989).
29. T. C. King, R. Sirdeshmukh and D. Schlessinger, Microbiol. Rev. 50, 428 (1986).
30. L. Kameyama, L. Fernandez, G. Guarneros, and D. Court, in “NATO ASI Posttranscriptional Control of Gene Expression” (J. McCarthy and M. Tuite, eds.), Vol. H49, p. 125. Springer-Verlag, Berlin, 1990.
31. H. A. Lozeron, J. E. Dahlberg and W. Szybalski, Virology 71, 262 (1976).
32. H. A. Lozeron, P. J. Anevski and D. Apirion, JMB 109, 359 (1977).
33. L. Kameyama, L. Fernandez, D. L. Court and G. Guarneros, Mol. Microbiol. 5, 2953 (1991).
34. P. J. Anevski and H. A. Lozeron, Virology 113, 39 (1981).
35. S. Altuvia, D. Kornitzer, D. Teff and A. B. Oppenheim, JMB 210, 265 (1989).
36. D. Kornitzer, S. Altuvia and A. B. Oppenheim, J. Bact. 173, 810 (1991).
37. D. Kornitzer, “Control and Function of the cIII Gene of Bacteriophage λ,” Ph.D. thesis. Hebrew University, Jerusalem, 1990.
38. S. Altuvia, D. Kornitzer, S. Kobi and A. B. Oppenheim, JMB 218, 723 (1991).
39. L. Krinke and D. L. Wulff, Genes Dev. 1, 1005 (1987).
40. L. Krinke and D. L. Wulff, Genes Dev. 4, 2223 (1990).
41. L. Krinke, M. Mahoney and D. L. Wulff, Mol. Microbiol. 5, 1265 (1991).
42. U. Schmeissner, K. McKenney, M. Rosenberg and D. Court, JMB 176, 39 (1984).
43. G. Guarneros, in “Current Topics in Microbiology and Immunology” (A. Clarke, R. W. Compans, M. Cooper, H. Eisen, W. Goebel, H. Koprowski, F. Melchers, M. Oldstone, P. K. Vogt, H. Wagner and I. Wilson, eds.), p. 1. Springer-Verlag, Berlin, 1988.
44. U. Schmeissner, K. McKenney, M. Rosenberg and D. Court, Gene 28, 343 (1984).
45. D. I. Friedman, Cell 55, 545 (1988).
46. V. de Lorenzo, M. Herrero, M. Metzke and K. N. Timmis, EMBO J. 10, 1159 (1991).
47. H. Giladi, M. Gottesman and A. B. Oppenheim, JMB 213, 109 (1990).
48. G. Griffo, A. B. Oppenheim and M. E. Gottesman, JMB 209, 55 (1989).
49. L. Huang, P. Tsui and M. Freundlich, J. Bact. 172, 5293 (1990).
50. H. M. Krause and N. P. Higgins, JBC 261, 3744 (1986).
51. J. Kur, N. Hasan and W. Szybalski, Gene 81, 1 (1989).
52. P. A. van Rijn, N. Goosen and P. van de Putte, NARes 16, 4595 (1988).
53. J. Mahajna, A. B. Oppenheim, A. Rattray and M. Gottesman, J. Bact. 165, 167 (1986).
54. S. Peacock, H. Weissbach and H. A. Nash, PNAS 81, 6009 (1984).
55. J. F. Thompson, D. Waechter-Brulla, R. I. Gumport, J. F. Gardner, L. Moitoso de Vargas and A. Landy, J. Bact. 168, 1343 (1986).
56. M. Gottesman and Maurizi, Microbiol. Rev. 56, 592 (1992).
57. H. H. Cheng, P. J. Muhlard, A. Hoyt and H. Echols, PNAS 85, 7882 (1988).
58. H. H. Cheng and H. Echols, JMB 196, 737 (1987).
59. H. Bahl, H. Echols, D. B. Strauss, D. Court, R. Crowl and C. Georgopoulos, Genes Dev. 1, 57 (1987).
60. D. Kornitzer, S. Altuvia and A. B. Oppenheim, PNAS 88, 5217 (1991).
61. R. A. Weisberg and M. E. Gottesman, “The Bacteriophage Lambda” (A. D. Hershey, ed.), CSHLab, Cold Spring Harbor, p. 489. New York, 1971.


Global Regulation of Mitochondrial Biogenesis in Saccharomyces cerevisiae

J. H. DE WINDE¹ AND L. A. GRIVELL

Section for Molecular Biology
Department of Molecular Cell Biology
University of Amsterdam
1098 SM Amsterdam, The Netherlands

I. Transcriptional Regulation and Signal Transduction in Yeast
II. Transcriptional Regulation by Oxygen
III. Differential Regulation of Gene Pairs
IV. Transcriptional Regulation by Carbon Source
V. Transcriptional Regulation under Stress Conditions
VI. A Path from Mitochondrion to Nucleus?
VII. Mitochondrial Biogenesis and the Yeast Cell Cycle
VIII. Regulation of Mitochondrial Biogenesis in Relation to Cell Growth
IX. Mitochondrial Biogenesis in an Evolutionary Perspective
X. Conclusions and Prospects
References

Mitochondria are essential organelles, present in all eukaryotic cells that use oxygen. Often referred to as the powerhouse of the cell, they carry out the reactions necessary for capture of the energy liberated during cellular respiration. Besides fulfilling this familiar role, mitochondria contain many enzymes required for key steps in diverse degradative (1) and biosynthetic pathways (2-4; Fig. 1). They are thus essential for cell survival, even in such facultative anaerobic organisms as the yeast Saccharomyces cerevisiae, which can selectively repress respiratory activity during fermentative and anaerobic growth. Biogenesis of a functional mitochondrion requires synthesis of several hundred proteins (5). Even though mitochondria of all eukaryotic organisms contain DNA and the complete machinery capable of expressing the information stored within it, only a few proteins are encoded by the mitochondrial genome. These are located mainly within the organelle’s inner membrane, where they are involved in electron transport and oxidative phosphorylation. The remainder, including major components of the mitochondrion's own genetic system, are encoded by nuclear genes. Maintenance and expression of the mitochondrial genome are thus completely dependent on nucleus-encoded proteins. In the yeast S. cerevisiae, loss of (parts of) the mitochondrial genome most often results in respiratory incompetence, but the cell maintains a primitive mitochondrial structure (promitochondrion) to carry out the essential biosynthetic functions.

In bakers' yeast S. cerevisiae, synthesis of a respiration-competent mitochondrion is controlled mainly by environmental stimuli, such as the availability of oxygen and the type of carbon source. In the presence of oxygen and the absence of a fermentable carbon source, transcription of nuclear genes encoding components of the respiratory chain and proteins of the mitochondrial transcription/translation machinery is induced four- to tenfold (6). A fermentable carbon source blocks this induction, irrespective of whether oxygen is present. However, relatively little is known about the mechanisms through which the cell balances the expression of the nucleus-encoded mitochondrial proteins relative to each other and to the mitochondrially encoded subunits. Nor is it known how the rate of mitochondrial biogenesis is adjusted in relation to cell growth.

In this review, we summarize what is currently known about the regulation of expression of nuclear genes that encode proteins involved in mitochondrial biogenesis, with emphasis on the regulation exerted by carbon source and oxygen. We briefly review features of mitochondrial proliferation in relation to cell division, and we discuss possible mechanisms through which the yeast cell adjusts the biosynthesis of mitochondrial components relative to metabolic requirements and cellular growth rate.

FIG. 1. Mitochondrial function in Saccharomyces cerevisiae. (Left) Schematic representation of the mitochondrion as the central element in cellular metabolism. The mitochondrion is required for growth on non-fermentable carbon sources when the electron-transport chain (depicted in gray) is supplying membrane potential and energy (ATP). In addition, mitochondrial function is required under all growth conditions, as many key steps in various metabolic routes occur inside these organelles. Mitochondria supply intermediates for biosynthesis of amino acids, sterols, and fatty acids, and they synthesize heme. For a detailed description of each of the metabolic routes involving mitochondrial function, see the references indicated: nitrogen metabolism, proline degradation, and glutamate interconversion (1); carbohydrate utilization and citric acid metabolism (2); amino-acid biosynthesis and mono-carbon metabolism in nucleotide biosynthesis (3); biosynthesis of lipids and sterols (4); heme biosynthesis (23). For a detailed overview of the various metabolic routes the yeast cell uses when growing on different carbon sources, see 173. (Above) Schematic representation of the components of the respiratory chain in the inner mitochondrial membrane of Saccharomyces cerevisiae. NADH is oxidized through NADH dehydrogenase (*). Unlike higher eukaryotes, yeast lacks complex I and contains instead a single-subunit enzyme. Whether this enzyme is capable of proton translocation remains to be established. Reducing equivalents are further supplied by succinate through succinate dehydrogenase (complex II), or by lactate through L-lactate cytochrome-c reductase (cytochrome b2). The redox cascade next involves the hydrophobic carrier ubiquinone (Q) and the hydrophilic carrier cytochrome c (C), the cytochromes b- and c1-containing ubiquinol-cytochrome-c oxidoreductase (QCR; complex III), and the cytochrome aa3-containing cytochrome-c oxidase (COX; complex IV), which ultimately reduces oxygen to water. Proton gradients formed during redox-conversions in complexes I, III, and IV are used to synthesize ATP through ATP synthase (complex V). For a detailed description of structure and function of the yeast respiratory chain, see 5, 173, and 216.

¹ To whom correspondence may be addressed.

I. Transcriptional Regulation and Signal Transduction in Yeast²

A. cis-Acting Elements and trans-Acting Factors

A typical yeast promoter consists of several cis-acting elements that function as target sites for regulatory proteins (Fig. 2). Initiation of transcription is catalyzed by the RNA polymerase-II complex at the initiation site I (7). The TATA-box (T in the figure) is essential in many promoters for transcription initiation to occur. It is the target site for the basal RNA polymerase-II transcription factor, TFIID, which nucleates the assembly of the other basal transcription factors and RNA polymerase-II into a stable pre-initiation complex (8). In addition to these basal control elements, at least one upstream activation site (UAS) is required for efficient transcription initiation to occur (9). UAS elements function as DNA binding sites for regulatory proteins that are thought to interact with the basal transcriptional machinery to enhance transcription initiation in response to specific stimuli (10, 11). Several trans-acting factors that activate transcription by binding to specific UAS sequences are well-documented. The transcriptional activator GAL4 is specifically required for expression of the genes involved in galactose and melibiose metabolism in Saccharomyces (reviewed in 12). Transcription of nuclear genes encoding mitochondrial proteins is often mediated through specific binding of the HAP1 or HAP2/3/4 activator proteins (6; Sections II,B and C and IV,B).

In many instances, yeast promoters consist of several TATA and UAS elements, which together determine the rate of transcription of the adjoining gene. In addition, yeast promoters may contain operators or upstream repressor sites (URS; 13, 14) and upstream induction sites (UIS; 14). By binding specific proteins, these elements also contribute to overall transcriptional regulation.

Recent biochemical analyses reveal the presence in yeast of yet another class of regulatory proteins. In contrast to the low-abundant specific regulators like GAL4 and the HAP proteins, which exert their effects on relatively small families of genes, a small group of highly abundant sequence-specific DNA-binding proteins is involved in diverse regulatory events, such as activation and repression of transcription, initiation of DNA replication, and chromosome maintenance. Well-characterized members of this family of multifunctional regulators are ABF1, CPF1, and RAP1, but several more have been described (reviewed in 15, 15a). The roles played by these proteins are discussed in more detail in Sections I,B and VIII,A.

B. Mechanistic Models for Transcriptional Activation

As stated above, specific proteins that activate transcription are thought to exert their regulatory effects by interacting with the basal transcription machinery, the TATA-binding protein TFIID and the initiation factor TFIIB being good candidates for targets of interaction (16). Short acidic peptide segments, present in most transcriptional activators described thus far, appear to constitute surfaces of protein-protein interactions, mediating stimulation of transcription (10, 16a). However, it has become clear more recently that chromatin structure also plays an important role in transcriptional regulation. Nucleosomes can repress transcription initiation in vivo, and transcriptional activation of genes is often dependent on and accompanied by specific displacement of nucleosomes from the promoter regions involved (17). The essential basic N-terminal segment of histone H4 is required for transcriptional activation by acidic activator proteins (18), suggesting that these activators directly interact with this part of the nucleosomes. Thus, acidic activator proteins may function by unfolding repressing nucleosome structures, in addition to direct interaction with the basal RNA polymerase-II complex (Fig. 2).

Being involved in diverse regulatory events, abundant DNA-binding proteins like ABF1 and RAP1 probably act in a more general way. The effects exerted by these general regulators are clearly dependent on the context of their binding sites (15, 15a). Evidence is emerging that these proteins function by specifically positioning nucleosomes and maintaining defined nucleosome-free regions in the chromatin (19). Thus, general regulators may increase accessibility of cis-acting elements for more specific regulatory proteins. In addition, these proteins may function directly in transcriptional control as auxiliary regulators by interacting with specific activators or repressors and/or with the basal transcription machinery.

FIG. 2. Transcriptional regulation in yeast. (A) Schematic representation of an inducible yeast promoter, containing a number of regulatory elements. These include an upstream inducer sequence (UIS), an upstream activator sequence (UAS), an upstream repressor sequence (URS), a TATA-box (T), and a transcription-initiation site (I). A broken black bar indicates the adjoining coding region. (B) A specific transcriptional activator protein, while bound to its target UAS, may interact through its acidic C-terminal domain with the basic acetylated N-termini of histone H4 molecules. This presumably induces unfolding of repressing nucleosome structures, thereby ensuring accessibility of additional cis-acting promoter sequences (17, 18). (C) The acidic activator can interact with the TATA-box-binding factor TFIID (D) and allow successive binding of the initiation factor TFIIB (B), thereby generating a core of assembly for the RNA polymerase-II pre-initiation complex (8, 15a, 16). Activity of the acidic activator can be induced through a UIS-binding inducer protein (14). Alternatively, a URS-binding specific repressor protein may hamper or inhibit transcriptional activation through direct or indirect interaction with the activator protein and/or the newly formed pre-initiation complex (13, 14, 37a). (D) After stimulated binding of TFIIB (B), the pre-initiation complex is assembled, consisting of additional basal transcription factors TFIIA (A) and TFIIE/F (E/F) and the RNA polymerase-II (RP2) complex (8). Subsequently, transcription initiation will take place, possibly again stimulated by a specific activator through direct or indirect interaction with the initiation complex.

C. Transduction of Regulatory Signals

The activity of trans-acting regulatory proteins binding to cis-acting elements in diverse promoters is regulated in response to external and internal stimuli that affect growth and metabolism of the yeast cell. In many instances, DNA-binding regulatory proteins lie at the end of complex signal-transduction pathways (Fig. 3). A primary response is often triggered through interaction of small molecules with receptors in the plasma membrane. This primary signal subsequently initiates a cascade of positive and negative secondary effects, involving several protein kinases and phosphatases, and various protein-protein interactions. This signaling cascade ultimately induces alterations in various metabolic pathways. One of the best-studied examples of signal transduction in S. cerevisiae is presented by the pathway involved in the mating pheromone response (20-22). Regulation of mitochondrial biosynthesis is dependent on signal-transduction pathways involved in nutrient sensing and growth control; these are discussed below (Section IV).

FIG. 3. Signal transduction in yeast. An external signal, often a small nutrient or metabolite molecule, interacts with a receptor/G-protein complex in the plasma membrane, causing activation of a G-protein moiety. Through this activated subunit, a specific protein kinase (PK) is activated, which, by successive phosphorylation of several kinases and phosphatases (1, 2, and 3), initiates a cascade of secondary phosphorylations and dephosphorylations, involving direct and feedback regulatory events. Finally, a specific transcription activator protein (A*) is mobilized and activated through phosphorylation or dephosphorylation. This activator will stimulate transcription of a specific set of genes, thereby initiating a phenotypic response to the original primary signal.

² Abbreviations used: ARC, ANP1-RAD23-CYC7 genomic region; ARS, autonomously replicating sequence; cAPK, cAMP-dependent protein kinase; CDE, centromeric core element; CEN, centromere; COR, CYC1-OSM1-RAD7 genomic region; CS, citrate synthase; eIF, eukaryotic translation-initiation factor; hsp, heat-shock protein; mtTF, mitochondrial transcription factor; NF-Y, nuclear factor-Y; NTS, non-transcribed spacer region of the rDNA repeat; RC2, retardation-complex protein 2; TFIIB/D, RNA polymerase-II transcription factor B/D; TPR, tetratricopeptide repeat; Ty, yeast transposon; UAS, upstream activating sequence; UIS, upstream inducing sequence; URS, upstream repressing sequence. Amino acids are named according to their standard three-letter code. All gene names and abbreviations are listed in a glossary presented at the end of the text of this review article.


II. Transcriptional Regulation by Oxygen

A. Heme, an Intracellular Signaling Molecule

Biogenesis of respiration-competent mitochondria depends on the presence of molecular oxygen, the final acceptor in the respiratory electron-transport chain. Most nucleus-encoded components of the respiratory chain are regulated by oxygen at the transcriptional level. This control is exerted by the cytochrome cofactor heme, a molecule particularly appropriate for the job. Its own synthesis is regulated by oxygen, and the redox state of the molecule is directly dependent on the redox state of the yeast cell (22a). Biosynthesis of heme occurs inside mitochondria and requires molecular oxygen in two successive oxidations that lead to synthesis of the immediate heme-precursor, protoporphyrin-IX, from coproporphyrinogen-III (23, 24). The first of these two reactions, catalyzed by the HEM13 gene product, is rate-limiting, indicating that cellular heme levels directly reflect intracellular oxygen tension. Like anaerobically grown wild-type yeast cells, heme-deficient mutants require unsaturated fatty acids and ergosterol (4, 25). Heme thus regulates the expression not only of genes encoding components of the respiratory chain, but also that of genes encoding enzymes involved in sterol and fatty-acid synthesis (see Section II,B). At present it is not known how mitochondrially synthesized heme finds its way to the nucleus to exert its regulatory effects on gene transcription.

In a few cases, oxygen regulation of gene expression is not controlled by heme. The best-known example is presented by the PET494 gene, whose product activates translation of the mitochondrial mRNA for coxII (26). Synthesis of PET494 is regulated at the translational level by oxygen in a heme-independent manner (27). A similar pattern of regulation may hold for the PET111 gene, required for translation of mitochondrial coxIII mRNA (28). The molecular basis for this regulation is not known.

B. HAP1, a Heme-Dependent Transcriptional Activator

HAP1 (CYP1) is a sequence-specific DNA-binding transcriptional activator, whose activity is largely dependent on heme (29, 30). HAP1 binding sites occur in the promoter regions of CYC1 (UAS1A, UAS1B; 31, 32), CYC7 (33-35), CYT1 (36, 36a), CYB2 (37), and CTT1 (38; see Table I). The protein consists of 1483 amino acids and is organized in several clearly definable domains. The N-terminal domain (residues 1-148) is responsible for sequence-specific DNA binding. This region can be folded into several alternative zinc-finger structures (29, 39) or into a zinc-cluster (40). A second domain is responsible for heme dependence of HAP1 activity (residues 245-445). This domain is remarkable in containing seven repeats of a peptide sequence that resembles a metal or heme binding site. It has been speculated that this region may function as a physiological sensor of the redox state of the yeast cell by unmasking specific DNA-binding activity of HAP1 upon binding of one or more ligands in an oxidized or reduced form (29, 30). The C-terminus of HAP1 contains a highly acidic region, important for transcriptional activation (41).

TABLE I
OXYGEN- AND HEME-DEPENDENT TRANSCRIPTIONAL CONTROL BY REGULATORY FACTORS HAP1, HAP2/3/4, AND ROX1

Enzyme                                          Gene           References
Iso-1-cytochrome c                              CYC1           31, 49
Iso-2-cytochrome c                              CYC7           33, 34, 35, 55
Cytochrome b2                                   CYB2           37, a
Cytochrome c1                                   CYT1           36, 36a
QH2-cytochrome c oxidoreductase subunit II      QCR2           117
QH2-cytochrome c oxidoreductase subunit VIII    QCR8           118, 125
Cytochrome c oxidase subunit IV                 COX4           115
Cytochrome c oxidase subunit Va                 COX5A          54, 67
Cytochrome c oxidase subunit Vb                 COX5B          54, 59a
Cytochrome c oxidase subunit VI                 COX6           116, 117, 200
Cytochrome c oxidase subunit VIb                COX6B          b
5-Aminolevulinate synthase                      HEM1           115
Coproporphyrinogen III oxidase                  HEM13          46, 47
Lanosterol-14α-demethylase                      CYP51/ERG11    46, 48
3-Hydroxy-3-methylglutaryl-CoA reductase        HMG1           62
3-Hydroxy-3-methylglutaryl-CoA reductase        HMG2           62
Δ9 fatty acid desaturase                        OLE1           22a
α-Ketoglutarate dehydrogenase                   KGD1           203
Dihydrolipoyl transsuccinylase                  KGD2           204
Dihydrolipoyl dehydrogenase                     LPD1           202
Catalase T                                      CTT1           38
eIF5A                                           TIF51A         64
eIF5A                                           ANB1/TIF51B    59, 63

[Regulatory factors column: the individual HAP1, HAP2/3/4, and ROX1 entries cannot be aligned to rows in this copy; the assignments discussed in the text are given in Sections II,B through II,D.]

a B. Guiard, personal communication.
b B. L. Trumpower, personal communication.
c (HAP1 entry for HEM13) Transcriptional activation under heme-deficient conditions, see text.


Two intriguing properties of HAP1 are that the protein binds DNA specifically at various target sites that display little obvious sequence resemblance, and that the extent of transcriptional activation is not directly correlatable with DNA-binding affinity. In the case of CYC1 and CYC7, which, under normal growth conditions, encode the major iso-1 and minor iso-2 forms of cytochrome c, respectively (42), HAP1 activates transcription from the CYC1 UAS much more strongly than from the CYC7 UAS (43), despite the fact that the affinity of the protein for these highly dissimilar binding sites is about equal (33). A single amino-acid substitution in the dominant HAP1-18 mutant protein (CYP1-18; 29, 39), changing Ser-63 at the base of the zinc-finger region to Arg, is sufficient to change this pattern. The mutation causes loss of function at CYC1, but results in an increased transcriptional activation of CYC7, while the affinity of the protein for the CYC7 UAS remains unchanged. Internal deletions in HAP1 increase activity at CYC1 but reduce activity at CYC7 in vivo (44). These observations suggest that the activity of HAP1 at a given UAS is dependent on allosteric interactions among its DNA binding site, its binding domain, and its activation domain (6). Further support for this idea comes from a study of a double mutation in the HAP1 binding site at CYC7, which increases sequence similarity with both binding sites at CYC1 (45). This mutation causes a concomitant increase in CYC7 transcription. DNA-binding affinity of HAP1 at CYT1, encoding cytochrome c1, is significantly higher than at UAS1B of CYC1, although both binding sites are identical in 12 out of 15 base-pairs (36). In addition, chemical interference protection profiles indicate that HAP1 binds differently to each of these largely related binding sites.

Recently, additional features of heme-dependent regulation through HAP1 have been described (46, 47). HAP1 is required for activation of transcription of the HEM13 gene, encoding the heme-biosynthesis enzyme coproporphyrinogen oxidase, in the absence of heme (Table I). As postulated above, binding of heme to the sensor domain is thought to unmask the DNA-binding domain of HAP1. How HAP1 binds to a target site and exerts its effect under heme-deficient anaerobic conditions is not clear. In addition, HAP1 is required for repression of HEM13 and ERG11/CYP51, encoding lanosterol-14α-demethylase, under aerobic and anaerobic growth conditions, respectively (46-48). However, this effect may be mediated by additional factors. The interplay between HAP1 and the heme-dependent repressor ROX1 is responsible for additional layers of transcriptional control (see below). Furthermore, the putative repressor protein RC2 has been shown to compete with HAP1 for binding to UAS1 of CYC1 (31), and may contribute to the overall HAP1-dependent transcriptional regulation at various promoters. Clearly, this intriguing regulator has not yet disclosed all of its secrets.

C. Activator Complex HAP2/HAP3/HAP4; a Minor Heme Response

The promoter region of the CYC1 gene contains two UAS sequences that respond synergistically to heme and carbon source (49). As discussed above (Section II,B), transcriptional activation through UAS1 is largely heme-dependent and mediated by the HAP1 protein (31). Although UAS2 functions primarily in carbon-source control (see Section IV,B), heme is required for maintenance of basal UAS2 activity during growth on glucose (49). Transcriptional activation through UAS2 is achieved by binding of a complex consisting of HAP2, HAP3, and HAP4 (50-52) that is involved in heme- and carbon-source-dependent regulation of a large number of mitochondrial protein genes (Table I). None of these proteins binds heme directly. It has been suggested (52) that heme may modulate HAP2/3/4 activity via translational regulation of the synthesis of the activator subunit HAP4. Further properties of the HAP2/3/4 complex are discussed in Section IV,B, which reviews carbon-source regulation.

D. ROX1, a Heme-Dependent Transcriptional Repressor

During aerobic growth, a large family of genes encoding respiratory-chain constituents and oxidative damage-repair enzymes is preferentially expressed as a result of heme-dependent transcriptional activation by the HAP proteins (see Section II, B and C). However, other genes are specifically repressed by heme. This group includes genes involved in the biosynthesis of heme itself and of sterols, and genes encoding hypoxic isoforms of several heme-activated proteins (see also Section III). Heme-dependent repression is mediated by the ROX1 protein (53, 54). The ROX1 gene is transcriptionally activated by heme (55), which is probably mediated by the HAP proteins (22a). The N-terminal domain of the 368-amino-acid ROX1 protein is similar to the DNA-binding domain of the HMG class of nonhistone chromatin proteins (22a, 56, 57). Interestingly, the yeast SIN1/SPT2 protein, involved in negative regulation of the HO gene, is also similar to HMG-class proteins (58). Consensus ROX1 operator sites are present in the upstream regions of the hypoxic genes ANB1/TIF51B, CYC7, COX5B, HEM13 (59, 59a), and ERG11/CYP51 (48; Table I). Transcriptional activation of HEM13 under anaerobic conditions and repression of this gene under aerobic conditions may result at least in part from HAP1 regulation of ROX1 expression (47).


III. Differential Regulation of Gene Pairs

Heme-dependent regulation of gene expression presents an interesting phenomenon. Several genes that are preferentially transcribed under aerobic conditions have hypoxic counterparts that are oppositely regulated. These include genes encoding respiratory-chain constituents COX5 (COX5A and COX5B; 60) and cytochrome c (CYC1 and CYC7; 61; see also Section IV, C and Fig. 4). In addition, 3-hydroxy-3-methylglutaryl-CoA reductase, catalyzing the rate-limiting step in sterol biosynthesis (HMG1 and HMG2; 62), and cytosolic translation-initiation factor eIF5A (TIF51A and ANB1/TIF51B; 63, 64) are encoded by differentially regulated gene pairs. The products of COX5A and CYC1 represent predominant isoforms under aerobic growth conditions (61, 65). Accordingly, both genes are transcribed at significantly higher levels than their hypoxic counterparts (44, 66). As described in Section II, B, heme-dependent regulation of CYC1 and CYC7 is mediated by HAP1, which activates transcription from UAS1 more strongly than from UASCyc7 (43). In addition, transcription of CYC7 is repressed under heme-sufficient conditions by ROX1 (55). The antagonistic action of HAP1 and ROX1 results in a low level of CYC7 transcription under aerobic growth conditions. Differential regulation of the COX5 genes is achieved through heme-dependent activation of COX5A by the HAP2/3/4 complex and heme-dependent repression of COX5B by ROX1/REO1 (54, 67). It has been speculated (55, 67) that the hypoxic isoforms of oxygen-requiring enzymes serve important functions under low oxygen tension or during transition from respiring to non-respiring conditions.
In line with this, iso-2 cytochrome c is stable under heme-deficient conditions, whereas the iso-1 form is not (61). The COX5B subunit confers an increased catalytic turnover rate and an increased heme-A oxidation rate on the cytochrome-c oxidase holoenzyme (68), which allows it to function better under conditions of low oxygen tension. A comparable situation may hold for the HMG2 gene product. The importance of hypoxic isoforms of proteins involved in respiration is highlighted by the recent discovery in yeast of an anaerobically expressed isoform of the ADP/ATP translocator (69). This protein is essential for respiration, as translocation of ATP across the mitochondrial membrane is a key element of oxidative phosphorylation. In addition, ADP/ATP translocation is an important link between mitochondrial and cytoplasmic metabolism. The gene encoding the hypoxic isoform of the translocator, AAC3, is specifically transcribed under anaerobic conditions, in contrast to the major and minor aerobic isoforms encoded by AAC2 and AAC1, respectively (69). It is tempting to speculate that transcription of AAC3 is repressed under aerobic growth conditions by ROX1.


The reason for differential heme regulation of the gene pair encoding cytosolic translation-initiation factor eIF5A is not clear. Under aerobic conditions, TIF51A is transcribed and TIF51B is repressed by ROX1 (63, 64). The latter is induced both under anaerobic conditions and when TIF51A is disrupted (64), suggesting negative feedback regulation of the aerobically expressed protein on its hypoxic isoform. Interestingly, the reciprocally regulated gene pairs TIF51A-CYC7 and TIF51B-CYC1 are part of large duplicated genomic regions designated ARC and COR, respectively, carrying numerous related genes (69a).

IV. Transcriptional Regulation by Carbon Source

In common with many other micro-organisms, S. cerevisiae shows marked preferences for certain sources of carbon, nitrogen, and energy. One preference concerns the use of glucose above all other fermentable and nonfermentable carbon compounds. This behavior causes diauxic growth of this yeast when cultured on mixtures of carbon sources. Yeast cells growing on glucose display high growth rates, presumably related to the ease with which intermediates can be derived from glucose catabolism. Growth on glucose has radical consequences for the enzyme complement and metabolic patterns in the yeast cell. Synthesis of components of the mitochondrial respiratory chain is repressed, mostly at the level of transcription, causing a low respiratory capacity (5, 6). In addition, permeases and key enzymes involved in the utilization of other sugars are totally absent, and the activities of enzymes involved in gluconeogenesis and the glyoxylate cycle are drastically reduced (2). In the past decade, genetic approaches have partly unraveled the signal-transduction pathways mediating these responses. Selection of mutants displaying aberrant glucose utilization and defective catabolite repression of sucrose and galactose metabolism has led to the identification of cis- and trans-acting elements responsible for transcriptional activation and repression of the genes involved (70). As detailed below, many of these mutations cause highly pleiotropic phenotypes, suggesting that the factors involved participate in more than one regulatory pathway. These signaling routes are beginning to be integrated into the regulatory pathways involved in mitochondrial biosynthesis, coupling the latter to a variety of cellular functions, including carbohydrate and nitrogen metabolism, mating-type response, and cell growth and morphogenesis.


A. Glucose Repression in Yeast; a Complex Cascade

Most of our current understanding of glucose repression in yeast comes from extensive genetic and biochemical investigations on the control of the invertase gene SUC2. Derepression of SUC2 in response to glucose starvation requires the products of SNF1-SNF10 (70, 71). SNF1 encodes a protein-serine/threonine kinase (72) that is physically associated with the SNF4 gene product, a protein required for maximal SNF1 kinase activity (73, 74). The exact role of SNF1 kinase in the glucose response is still unclear. Kinase activity of SNF1 appears not to be affected by the availability of glucose (73), suggesting that SNF1 is not the primary mediator of the glucose signal. Mutations in SNF1 and SNF4 resemble mutations that stimulate the RAS-adenylate cyclase pathway (75), generally thought to be involved in nutrient sensing and growth control (76). Addition of glucose to glucose-starved cells causes a transient elevation of cAMP levels (77). However, mutations in the adenylate cyclase gene CYR1, causing low constitutive levels of cAMP, reduce invertase expression (78). As mutants defective in the regulatory subunit gene BCY1 of the cAMP-dependent protein kinase cAPK maintain wild-type levels of invertase (79), cAMP may not be directly involved in glucose repression. Over-expression of two negative regulators of the RAS-adenylate cyclase pathway, MSI1 (80) and PDE2 (81), suppresses defects in nutrient utilization and sporulation of snf1 and snf4 null mutants, but does not restore glucose regulation of SUC2 expression (82). Thus, the SNF1- and RAS-adenylate cyclase pathways function as antagonistic parallel routes regulating several common cellular functions. Their roles in the glucose response may, however, be completely distinct.
The products of SNF2, SNF5, and SNF6 are involved in the regulation not only of SUC2, but also of the acid phosphatase gene PHO5 (83), cell type-specific genes (84), and Ty elements (85). SNF2/SWI2, SNF5, and SNF6 presumably act together in a large multi-subunit complex (86), which also contains SWI1/ADR6 and SWI3, which are required for transcription of the HO gene as well as ADH1, ADH2, GAL1, and GAL10 (87). The complex may function as a general transcriptional activator by assisting dedicated DNA-binding activators. Interestingly, SNF2/SWI2 shows extensive sequence similarity to ATP-dependent DNA helicases (88, 89). The SNF3 gene encodes a high-affinity hexose transporter, and does not appear to be directly involved in the transcriptional regulation of SUC2 (90). SNF7, SNF8, SNF9, and SNF10 have only been defined genetically (71). Several genes required for glucose repression have been described. The HXK2 gene encoding hexokinase PII plays a crucial role, probably by acting as an initial sensor of glucose levels (91). The enzyme exhibits protein kinase


TABLE II
PUTATIVE MIG1 BINDING SITES IN THE PROMOTER REGIONS OF GENES INVOLVED IN MITOCHONDRIAL BIOGENESIS

Gene                           Positions(a)   Sequence             References
SUC2 A                                        ATAAAAATGCGGGGAAT    97
SUC2 B                                        GAAATTATCCGGGGGCG
GAL4                                          CTGAAAATCTGGGGAAG
GAL1                                          CCTTATTTCTGGGGTAA
Consensus MIG1 binding site                      WWWWTSYGGGG
MIG1                           -224/-208      CGAgAAAaGTGGGGAAG    96
HAP4                           -256/-272      ATATAAAaCTGGGGTTT    52
CYT1                           -121/-137      TCAcAAATGTGGGGAAA    36, 36a
COX5B(b)                       -172/-156      AAATTAcTGaGGGGTTC    59a
COX5B(b)                       -166/-182      GTAATTTTaaGGGGTCC    59a
QCR8                           -134/-150      TAAAATTTGgGGGGTGG    118

(a) Positions indicated are relative to the translational start codon of each gene.
(b) The putative MIG1 binding sites in UAS1 of COX5B are part of an inverted repeat element; see 59a.

activity, which may be modulated by intracellular cAMP levels (92). GRR1, RGR1, and HEX2/REG1 are also of central importance to the glucose repression mechanism, although their exact biological function has not yet been revealed (93-95a). The multiple growth defects of grr1 and rgr1 mutants suggest that these genes are important for general sensing of growth conditions, probably by interacting with the RAS-adenylate cyclase pathway. At the end of the negative regulatory cascade involved in glucose repression is the C2H2 zinc-finger DNA-binding protein, encoded by the MIG1 gene. The MIG1 repressor binds specifically to the promoter regions of SUC2, GAL1-10, and GAL4 (96, 97). However, deletion of MIG1 does not completely abolish glucose repression at SUC2, indicating that additional factors must be involved (96). The presence of a consensus MIG1-target site in the promoter region of its own gene suggests that MIG1 may be involved in feedback regulation of its own synthesis (Table II). A first link between general glucose repression and mitochondrial biosynthesis was found in a search for suppressors of the snf1 block on invertase expression. These investigations revealed that mutations in SSN6 cause high-level constitutive expression of SUC2 and other glucose-repressed genes (98). ssn6 is allelic to cyc8, which causes over-expression of CYC7 (99, 100). Mutations in TUP1/CYC9 were also shown to completely abolish glucose repression at SUC2 (100). Besides constitutive derepression of several glucose-repressible genes, ssn6 and tup1 mutants share complex phenotypes affecting mating-type-specific gene expression, cell wall morphology, permeability, and DNA repair (reviewed in 101), suggesting a combined involvement of SSN6 and TUP1 in multiple regulatory pathways. The SSN6/CYC8 gene encodes a protein of 107 kDa, containing long stretches of poly(Gln) and poly(Gln-Ala) (102, 103). These tracts appear to be dispensable for SSN6 function (104), but a region in the C-terminal half which is rich in Gln and Pro residues and similar to sequences found in other transcription factors may be important (105). The N-terminal half of the protein contains 10 copies of the 34-amino-acid tetratricopeptide (TPR) repeat (106). TPR units may form a secondary structure termed “snap helix,” in which individual α-helices are associated through knobs and holes (107). This structure may mediate protein-protein interactions and enable association with the nuclear scaffold. The TPR region is required for the function of the SSN6 protein, although it is not clear whether all repeats are essential (104, 104a). The TUP1 gene encodes a protein of 78 kDa and also contains stretches of poly(Gln) (108, 109). The C-terminal half of the TUP1 protein is essential for its function and contains six repeats of a 43-amino-acid sequence that are similar to the repeat region of G-protein β-subunits (110, 111). A functional role for these repeats has not been established, but they may well be involved in protein-protein interactions. Goebl and Yanagida (112) have pointed out that proteins containing TPR and β-transducin repeats may function in pairs in various regulatory pathways. Support for this hypothesis comes from the observation that SSN6 and TUP1 are associated in a protein complex (113). Recent evidence indicates that the SSN6-TUP1 complex is functioning as a general repressor of transcription (114). The complex may gain specificity for certain classes of genes by interacting with dedicated promoter-specific DNA-binding proteins.
In the case of glucose repression, a strong candidate for specific DNA binding is the MIG1 protein, which may recruit SSN6-TUP1 to the relevant promoter regions. Although genetic analysis has placed SNF1 upstream from SSN6, SNF1 kinase is not responsible for differential phosphorylation of the SSN6 protein (104). The above results suggest that the promoter-specific DNA-binding protein is the target for the signal triggered by the availability of glucose, and not the SSN6-TUP1 complex (114).

B. HAP2/HAP3/HAP4; Carbon-Source-Dependent Transcriptional Activation

Carbon-source-dependent transcriptional regulation of many genes encoding imported mitochondrial proteins is exerted through a complex consisting of the HAP2, HAP3, and HAP4 proteins (50, 52; see also Section II, C on oxygen regulation). The complex has been studied most extensively by its action at UAS2 of CYC1, which is the major control element for carbon-source regulation of this gene (49). Mutational analysis of UAS2 has identified a consensus binding sequence, 5'-ACCAATNA-3' (51), which is present in the UAS regions of many mitochondrial protein genes, including COX4 (115), COX6 (116), QCR2 (117), QCR8 (118), and CYT1 (36, 36a; see Table I). HAP2, HAP3, and HAP4 form a complex that is stable in solution (119, 120), and hence bind to their target site in an interdependent manner (50, 52). The HAP2 gene encodes a protein of 265 amino acids (121), containing, within its highly basic C-terminal half, an essential core of 65 amino acids that is entirely sufficient for assembly and DNA binding of the HAP2/3/4 complex. The remainder of HAP2, including a poly(Gln) tract, appears to be dispensable (120, 122). The HAP3 gene encodes a 144-amino-acid protein with no obvious structural characteristics (123). HAP2 and HAP3 are both required for sequence-specific DNA binding of the complex (120), while the HAP4 gene provides the primary transcriptional activation domain (52, 120). The 554-amino-acid HAP4 protein contains a highly acidic C-terminal domain that is similar to the activation domains of other transcriptional activators and essential for their function (41, 52, 120, 124).
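As an illustration of how a consensus such as 5'-ACCAATNA-3' can be located in a promoter sequence, the following minimal Python sketch scans both strands for the motif. The function names, the small IUPAC lookup table, and the toy sequence are assumptions introduced here for illustration only; they are not taken from the studies cited above, and a sequence match by itself does not demonstrate binding of the HAP2/3/4 complex.

```python
import re

# IUPAC nucleotide codes needed here, written as regular-expression fragments.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "Y": "[CT]", "R": "[AG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC_REGEX[symbol] for symbol in consensus)


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, consensus="ACCAATNA"):
    """Return sorted (start, strand, matched_text) tuples for both strands.

    `start` is always the 0-based coordinate of the leftmost base of the
    site on the forward strand, whichever strand carries the motif.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        start = len(seq) - m.start() - len(consensus)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real promoter.
    toy = "GGTACCAATCAGGTTTGATTGGTCC"
    for start, strand, text in find_sites(toy):
        print(start, strand, text)
```

Reporting forward-strand coordinates for both orientations keeps the hits directly comparable with positions quoted relative to a start codon.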

C. Carbon-Source Regulation of Mitochondrial Biogenesis

As stated in the introduction to this section, synthesis of components of the mitochondrial respiratory chain is repressed in yeast cells growing on glucose. Transcription levels of the genes involved are rapidly adapted to changes in the available carbon source (125). How is carbon-source-dependent regulation of mitochondrial biosynthesis integrated into the complex signaling pathways described above? A model scheme is depicted in Fig. 4. Carbon-source control of expression of mitochondrial protein genes is achieved in part at least through glucose repression of transcription of HAP2 (126) and HAP4 (52). HAP3 appears to be constitutively transcribed. Although the mechanism of repression of HAP2 and HAP4 has not been established, the transcriptional repressor MIG1 may be involved. A perfect match to the consensus MIG1 target sequence (97) is found in the promoter region of the HAP4 gene, coinciding with the transcriptional start site (Table II). Negative transcriptional regulation of HAP4 through MIG1 would allow glucose repression to act indirectly on many genes for mitochondrial proteins. Such regulation is analogous to that found for the GAL regulon, in which MIG1 regulates expression of the GAL1, GAL7, and GAL10 genes by regulating transcription of the GAL4 gene, encoding their common transcriptional activator (97). However, glucose repression of the GAL1 promoter is not only dependent on glucose control of GAL4 expression, but is in addition controlled through a cis-acting URS sequence (127, 128). By analogy, genes for mitochondrial proteins may be under direct negative control.
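The comparison of putative sites with the MIG1 consensus listed in Table II can be sketched in a few lines of Python. The sketch assumes that the consensus WWWWTSYGGGG is written in standard IUPAC notation (W = A or T, S = C or G, Y = C or T) and that a site is scored by sliding the consensus along it and keeping the window with the fewest mismatches; the function names and this scoring rule are illustrative assumptions, not a method described in the cited work.

```python
# Bases allowed by each IUPAC symbol used in the Table II consensus.
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "Y": "CT", "N": "ACGT",
}


def mismatches(window, consensus):
    """Count positions where a base is not allowed by the consensus symbol."""
    return sum(base not in IUPAC_SETS[symbol]
               for base, symbol in zip(window, consensus))


def best_window(site, consensus="WWWWTSYGGGG"):
    """Return (mismatch_count, offset) for the best-scoring window of `site`.

    Lower-case letters in the input are treated like upper-case ones, so the
    sequences can be entered exactly as printed in the table.
    """
    site = site.upper()
    n = len(consensus)
    scored = [(mismatches(site[i:i + n], consensus), i)
              for i in range(len(site) - n + 1)]
    return min(scored)


if __name__ == "__main__":
    # Sequences as listed in Table II.
    table_sites = {
        "SUC2 A": "ATAAAAATGCGGGGAAT",
        "CYT1": "TCAcAAATGTGGGGAAA",
        "COX5B (second site)": "GTAATTTTaaGGGGTCC",
    }
    for gene, seq in table_sites.items():
        count, offset = best_window(seq)
        print(gene, "mismatches:", count, "offset:", offset)
```

With this scoring, the SUC2 A site matches the consensus without mismatches, whereas the CYT1 and second COX5B sites deviate at one and two positions, respectively, in line with the lower-case letters printed in the table.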


FIG. 4. Regulation of mitochondrial biogenesis by carbon source and oxygen. A model for transcriptional regulation of nuclear genes encoding respiratory-chain constituents is presented, using as illustration the CYC1 and CYC7 genes that encode the two isoforms of cytochrome c (for details and references, see text). Carbon-source regulation of cytochrome-c expression is achieved through activation of CYC1 transcription and probably ROX1-dependent repression of CYC7 transcription, mediated by the HAP2/3/4 activator complex. Activity of HAP2/3/4 is hampered on glucose, due to MIG1-SSN6/TUP1-dependent repression of transcription of HAP2 and HAP4. On a non-fermentable carbon source, the activity of HAP2/3/4 is induced following release from repression, which is caused by inactivation of the MIG1-SSN6/TUP1 complex by the SNF1 protein kinase. In addition, the SNF1/SNF4 complex may interact directly with the HAP2/3/4 complex. Activity of SNF1 is controlled through several early mediators of the glucose response signal, which are described in the text. Oxygen regulation of cytochrome-c expression is achieved through activation of CYC1 transcription and ROX1-dependent repression of CYC7 transcription, mediated by the HAP1 activator protein. Activity of HAP1 is largely regulated by heme, but also by the carbon source (49). The latter may be achieved through carbon-source-regulated expression of several heme-biosynthesis genes (X), which is likely to involve the HAP2/3/4 activator complex (115). Activity of the heme-dependent repressor ROX1 is also regulated through interaction with the SSN6/TUP1 complex.


Glucose repression of COX5B is conferred through one of its two UAS sequences (59a). This UAS1,5b contains a consensus MIG1 binding site, as does the promoter region of CYT1 (Table II), suggesting that MIG1 may directly repress transcription of several genes encoding mitochondrial proteins. Negative regulation of CYB2 transcription is mediated through a cis-acting URS sequence that is functionally related to the repressor element of the arginase gene CAR1 (37, 37a). As mentioned above, mutations in CYC8 and CYC9 (now called SSN6 and TUP1, respectively) cause increased transcription of CYC7 (99) and TIF51B/ANB1 (109). These genes are not regulated by HAP2/3/4 (35, 129), but transcription is repressed by ROX1 (55; Table I). This indicates that SSN6-TUP1 is required for repression by ROX1 and stresses the general function of the complex in repression of transcription. In tup1 mutants, transcription of ROX1 itself is also increased (109). This may result from increased activation of ROX1 transcription by HAP2/3/4 (22a), or from defective autorepression in the absence of a functional SSN6-TUP1 complex. Further evidence for the involvement of the general glucose repression pathway in the control of mitochondrial protein genes is presented by the finding that derepression of CYC1 and COX6 is dependent on SNF1, and repression, on SSN6 (130). In addition, glucose repression of CYC1 is dependent on HXK2 (131) and GRR1 (93).

V. Transcriptional Regulation under Stress Conditions

Environmental stress induces a specific stress response in diverse organisms. The best-studied inducer of stress is heat-shock, causing a rapid increase of synthesis of specific heat-shock proteins (hsps) that are assumed to protect the organism from the deleterious effects of stress (132). The heat-shock response is primarily regulated at the transcriptional level through the trans-acting heat-shock transcription factor (HSF; 132). One function of hsps in yeast is to facilitate import and assembly of mitochondrial proteins (133). The requirement for hsps in mitochondrial biogenesis is indicated by the mas3 mutation, displaying temperature-sensitive defects in mitochondrial protein import and cellular growth (134). This recessive lesion maps in the heat-shock transcription factor gene HSF (see also Section VII). In yeast, the nature of the heat-shock response is influenced by the metabolic state of the cell. Differences are most obvious between fermenting and respiring cells. Moreover, respiration-deficient yeast cells do not recover from heat-shock at high temperatures (132). These observations suggest a link between the stress response and mitochondrial biosynthesis and function. The kinetics of mitochondrial biosynthesis in response to stress conditions have not been investigated in great detail. In yeast, a stress response is elicited by low levels of cAMP (135). Several reports indicate that synthesis of ATP and mitochondrially encoded respiratory components respond to changes in cytosolic cAMP levels (136, 137, and references therein). The ccs1-1 mutation is associated with an increase in cytochrome content, respiration, and ATP synthesis in addition to deregulation of the cAMP pathway. CCS1 is allelic with IRA2, encoding a negative regulator of the adenylate cyclase activators RAS1 and RAS2 (138). Thus, the RAS-adenylate cyclase pathway involved in nutrient sensing and growth control is undoubtedly involved in the regulation of mitochondrial biogenesis. Starvation, nutritional downshift, and mild heat-shock induce a general effect in yeast called stringent response, in which transcription of genes encoding cytosolic ribosomal components is selectively inhibited (139, 140). Surprisingly, this response also appears to affect the levels of all mitochondrial transcripts (141). In logarithmically growing yeast cells, most of the cellular cytochrome c is expressed from CYC1. In late-logarithmic and stationary-phase cultures CYC7 mRNA is specifically induced (42). General stress conditions, like heat-shock and stationary growth-phase, and concomitant growth-arrest are known to be accompanied by low levels of cellular cAMP (135, 142). In line with this, CYC7 transcription is strongly induced by heat-shock, stationary growth-phase, and low cAMP levels (143). This response appears to be independent of regulation by glucose and oxygen, and may be similar to stress regulation of the cytosolic catalase gene CTT1 (144; T. Pillar, personal communication). General implications for stress control of mitochondrial biogenesis and function must await further investigations.

VI. A Path from Mitochondrion to Nucleus?

Given the spatial separation of genetic information involved in mitochondrial biogenesis between the nucleus and the mitochondrion, nucleo-mitochondrial “cross-talk” must exist to ensure coordinated expression of mitochondrial components encoded in the two genomes. As described in Section II, A, heme functions as an important signaling molecule, originating from the mitochondrion. However, apart from transcriptional regulation by heme-dependent transcription factors, little is known about how and to what extent the mitochondrion can influence nuclear gene expression. Fermentative growth of S. cerevisiae is known to be dependent on the functional state of the mitochondria (145). Mutations in IMP1 (146), which is allelic to the galactose permease gene GAL2 (147), in IMP2 encoding a protein of unknown function (148), and in the regulatory gene GAL3 (146, 149) result in defective galactose fermentation in cytoplasmic petite strains. These strains are respiration-deficient as a result of deletions in (ρ-) or a total absence (ρ0) of the mitochondrial DNA. However, the molecular basis for a connection between mitochondrial function and galactose utilization remains rather obscure. Drug-induced respiration deficiency of mitochondria in S. cerevisiae increases the expression of cytochrome c (150). Respiration-deficient cytoplasmic petite mutants display increased transcript levels of several nuclear genes (151, 152). Interestingly, one group of transcripts that is abundant in ρ0 petites is derived from the non-transcribed spacer region of the rDNA repeat. These NTS transcripts are produced by RNA polymerase II and are glucose repressible, and their accumulation depends on a co-dominant nuclear locus (153). Based on these and other findings, it has been speculated that cytoplasmic petites might enhance expression of several nuclear genes in an attempt to compensate for their respiration deficiency (154). However, this cannot be the whole story, as transcription of the CYC1 gene encoding iso-1 cytochrome c and genes encoding ATPase subunits are not affected by the mitochondrial genotype (151). In addition, steady-state levels of mRNAs derived from QCR genes encoding subunits of the respiratory-chain complex ubiquinol-cytochrome-c oxidoreductase (155; see below), and COX5A and COX6 encoding subunits of the cytochrome-c oxidase complex (156) are reduced in cytoplasmic petites. The mechanism responsible for petite-specific induction of certain classes of nuclear-encoded genes may well involve regulation through metabolic changes that result from impaired mitochondrial functions.
Evidence for this hypothesis is provided by the differential expression of the two isoforms of citrate synthase (CS) in ρ0 mutants of S. cerevisiae (157). In such petites, transcription of CIT1 encoding mitochondrial CS1 is slightly depressed, whereas transcription of CIT2 encoding the peroxisomal enzyme CS2 is drastically increased. In normally respiring yeast cells, CIT2 expression is increased by chemical inhibition of respiration and by disruption of CIT1. As CS2 can partially compensate for absence of CS1, increased expression of CIT2 may reflect an attempt to find a way to synthesize important metabolic intermediates when mitochondrial functions are disturbed. The molecular basis for this regulatory mechanism is at present unclear. Recently, an abundant DNA-binding protein has been identified in yeast mitochondria, playing a key role in the structure, maintenance, and expression of mitochondrial DNA (57). This protein, ABF2, is highly similar to the human mitochondrial transcription factor mtTF1, and both factors are closely related to the vertebrate non-histone high-mobility-group proteins HMG1 and HMG2 (57, 158, 159). ABF2 interacts with DNA both non-specifically and specifically at regulatory sequences involved in mitochondrial transcription and replication (57). Like all HMG-like proteins, this protein can condense and unwind DNA by introducing negative superhelical turns (159, 160), and thus appears to be an important structural element of the yeast mitochondrial nucleoid. Interestingly, a significant amount of ABF2 is reported to be localized in the nucleus. While this observation awaits further investigation, the authors raise the intriguing, albeit hypothetical, possibility that ABF2 might be involved in informing the nucleus about the state and quantity of the mitochondrial genome (J. F. X. Diffley, personal communication).

VII. Mitochondrial Biogenesis and the Yeast Cell Cycle

Mitochondria proliferate by growth and division of preexisting organelles (161, 162). During the yeast cell cycle, the accumulation of mitochondrial mass in mother and daughter cells parallels the accumulation of other cellular constituents (163, 164), and the daughter cell receives a fraction of the total mitochondrial mass corresponding to its relative size (C. W. Woldringh, unpublished results). Accordingly, transcription of nucleus-encoded mitochondrial protein genes does not show periodic fluctuations throughout the cell cycle (165, 166). A more general genetic relationship between mitochondrial proliferation and cell division is indicated by the recessive mas3 mutation, which maps in the yeast heat-shock transcription factor gene HSF (134; see Section V). This mutation causes temperature-sensitive defects in posttranslational mitochondrial import and in progression through the G2-phase of the cell cycle. Apparently, a certain class of hsps whose synthesis is mediated by HSF is required for both completion of G2 and efficient mitochondrial import and assembly under stress conditions. Mitochondrial inheritance is an essential feature of the yeast cell cycle. One of the early events of cell division is the movement of mitochondria into the growing bud (164), providing the daughter cell with all basic metabolic functions early in the cell cycle. Organellar migration in yeast is not likely to be dependent on microtubules, as migration still takes place in the presence of inhibitors of microtubule function (167). Moreover, the pattern of mitochondrial movement is independent of the type of carbon source or availability of oxygen (164, 168). The recent isolation of mutants with defects in the mitochondrial distribution during mitosis may identify the mechanisms operating in yeast mitochondrial inheritance (168). The cell-cycle stage-specific mdm1 and mdm2 mutations cause temperature-sensitive growth, aberrant mitochondrial morphology, and a block in mitochondrial migration into the daughter cell. In addition, the mdm1 mutation inhibits nuclear division and transfer of nuclei into the growing bud. Respiration competence of the mitochondria is not affected by the mutations. The MDM2 gene is allelic to the OLE1 gene, encoding Δ9-fatty-acid desaturase (169). This indicates an essential role for unsaturated fatty acids in mitochondrial inheritance, which is likely to occur via their involvement in membrane biosynthesis. The OLE1 gene is transcriptionally regulated by Δ9-unsaturated fatty acids (170) and by the heme-dependent repressor ROX1 (22a; Table I), coupling mitochondrial inheritance to oxygen availability. The MDM1 gene encodes an essential protein, displaying sequence similarity to mammalian vimentin and cytokeratin, which are components of the intermediate filament network (171). MDM1-specific antibodies recognize novel cytoplasmic structures in yeast and specifically recognize the intermediate filament network in animal cells. These results suggest that the MDM1 protein forms part of a novel cytoplasmic structure in yeast, involved in organelle inheritance. Further investigation of the mechanisms governing mitochondrial inheritance may reveal novel aspects of the overall regulation of mitochondrial biogenesis.

VIII. Regulation of Mitochondrial Biogenesis in Relation to Cell Growth

The complex networks of oxygen- and carbon-source-dependent transcriptional regulation do not comprise the whole story of mitochondrial biogenesis. Certainly environmental stimuli, such as the availability of oxygen and the type of carbon source, account for an important part of transcriptional control on the genes involved. However, they also elicit variations in cellular growth rate (166, 172). Little is known about the mechanisms operating in the yeast cell to adjust synthesis of mitochondrial components relative to each other and in relation to cellular growth.

During budding, the daughter cell receives only a relatively small amount of mitochondrial DNA from the mother cell, probably reflecting the distribution of mitochondrial mass (5; C. W. Woldringh, unpublished results). This implies that the yeast cell can monitor and respond efficiently to mitochondrial apportionment during growth and differentiation. On non-fermentable carbon sources, Saccharomyces divides at slow growth rates, while respiration and oxidative phosphorylation require an increased mitochondrial inner-membrane surface and, hence, an increased mitochondrial mass. Under these conditions, the bud is much smaller than the mother cell and must increase more in size after separation. This results in a longer G1 period before the onset of a new cell division cycle (163). On glucose, the growth rate is high, mitochondrial mass is reduced, and separation occurs when mother and daughter have attained comparable sizes. Hence, a mechanism must exist that allows the selective accumulation of mitochondrial mass in the daughter cell while it is attached to the mother. Such a mechanism must involve controlled expression of both mitochondrial and nuclear genes in the mother cell, as mitochondrial migration occurs before the onset of nuclear division (163, 164).

In our laboratory, questions relating to growth control of mitochondrial biogenesis are being addressed by studying the factors that control the biosynthesis of the yeast mitochondrial ubiquinol-cytochrome-c oxidoreductase (QCR). This respiratory enzyme complex III is located in the inner mitochondrial membrane and consists of nine different subunits (Figs. 1 and 5). Only one of these, cytochrome b, is encoded on the mitochondrial DNA. The remainder are encoded by nuclear genes, and their expression is controlled mainly at the level of transcription (5, 173). For a number of the genes involved, cis-acting sequences important for transcriptional regulation have been identified (see Table III). All genes appear to be controlled by the HAP2/3/4 activator complex, whereas only a few genes are responsive to HAP1. Of particular interest was the finding that two abundant protein factors bind specifically to the 5' flanks of most QCR genes and genes encoding other mitochondrial proteins (118, 174). Subsequent analysis revealed

FIG. 5. Ubiquinol-cytochrome-c oxidoreductase of Saccharomyces cerevisiae. Schematic representation of the structure, topology, and stoichiometry of ubiquinol-cytochrome-c oxidoreductase, or respiratory-chain complex III (see also Fig. 1). The complex is present as a dimer in the inner mitochondrial membrane of S. cerevisiae. Details on structure, function, and assembly are given in 5, 173, and 216. Cyt.b, cytochrome b; Cyt.c1, cytochrome c1; FeS, Rieske FeS-protein. Numbering of structural subunits is the same as in Table III.


TABLE III. GLOBAL REGULATION OF MITOCHONDRIAL BIOGENESIS; A MULTIPLICITY OF TRANSCRIPTIONAL REGULATORY PROTEINS

Gene    Encoding    Regulatory factors: HAP1 (a), HAP2/3/4, MIG1, ABF1, CPF1, RAP1 (b)    References

+

++ +

+

+ + ++ +

-0

+ ++ ++ + +

QCR1    44 kDa subunit I
QCR2    40 kDa subunit II
CYT1    Cytochrome c1
RIP1    Rieske FeS-protein
QCR6    17 kDa subunit VI
QCR7    14 kDa subunit VII
QCR8    11 kDa subunit VIII
QCR9    7.2 kDa subunit IX

ND

ND

+

COX4    Subunit IV
COX6    Subunit VI
COX6B    Subunit VIb

ND

++ ++ +

CYC1    Iso-1-cytochrome c
LPD1    Dihydrolipoyl-DH

++ ++

ND ND

++ -

+

ND

++ ND

++

++

++

+

+

+ ++ +

+

+

+ ++ ++ ++ ++

+

C

117, 200; 36, 36a; 125, 174; 174; c, d; 118, 125

+

c, e

-

115 200 c, e

+

+

-

31,49,174

+

+

201 c

a In many instances, HAP1 binding sites cannot be determined from the available sequence data, because a clear consensus target sequence has not been determined.
ND, Not determined; -, no consensus binding site present in the promoter region; +, consensus binding site present in the promoter region; ++, functional role established by Northern analysis of mRNA levels.
b RAP1 is a multifunctional regulatory factor, distantly related to ABF1. See text and 15 for more details.
c J. H. de Winde, unpublished observations.
d Consensus ABF1 binding site present within the first 30 bp of the N-terminal coding region.
e B. L. Trumpower, personal communication.

these factors to be identical to the previously identified ARS-binding factor ABF1 and the centromere- and promoter-binding factor CPF1 (175; Table III). Both proteins are involved in the transcriptional regulation of a large number of genes important for cell growth and division.
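The evidence symbols defined in the legend of Table III (ND, -, +, ++) form a small ordered vocabulary, which is convenient when the table is handled computationally. The Python sketch below is illustrative only: the names SiteEvidence, parse_evidence, and has_consensus_site are assumptions introduced here rather than part of the original work, and no values from the table itself are encoded.

```python
from enum import Enum


class SiteEvidence(Enum):
    """Evidence levels as defined in the Table III legend."""
    NOT_DETERMINED = "ND"  # ND: not determined
    ABSENT = "-"           # no consensus binding site in the promoter region
    CONSENSUS = "+"        # consensus binding site present in the promoter region
    FUNCTIONAL = "++"      # functional role established by Northern analysis


def parse_evidence(symbol: str) -> SiteEvidence:
    """Map a legend symbol to its evidence level (hypothetical helper).

    Whitespace is dropped so that a spaced "+ +" is read as "++".
    An unknown symbol raises ValueError.
    """
    return SiteEvidence("".join(symbol.split()))


def has_consensus_site(evidence: SiteEvidence) -> bool:
    """True when a site is reported, whether or not its function was tested.

    Treating "++" as implying a site is an assumption of this sketch.
    """
    return evidence in (SiteEvidence.CONSENSUS, SiteEvidence.FUNCTIONAL)


if __name__ == "__main__":
    # Symbols only; these are not entries taken from the table.
    for raw in ["ND", "-", "+", "+ +"]:
        level = parse_evidence(raw)
        print(raw, level.name, has_consensus_site(level))
```

Keeping "not determined" distinct from "absent" matters here, because the legend separates promoters that lack a consensus site from promoters for which no determination was possible.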

A. Multifunctional Regulators ABF1 and CPF1

The multifunctional sequence-specific DNA-binding protein ABF1 was originally identified as binding to the ARS1 B-element and to the silencer elements of the non-transcribed mating-type loci HMR and HML (176-178; reviewed in 15, 15a). The gene encoding this essential protein has been cloned and sequenced independently by four different groups (179-182). ABF1 is involved in the initiation of DNA replication (178, 183, 184), and in transcriptional activation (185-188) and repression (189, 190), and is therefore likely to interact with various (more specific) regulatory proteins at different chromosomal loci. The protein is unusually rich in Asp, Asn, and Ser residues. The DNA-binding region is composed of an N-terminal metal-binding domain (179, 180) and a centrally located specificity domain (15). ABF1 can itself function as a weak transcriptional activator, or cause strong transcriptional activation in combination with other weak activators (191). Structure/function studies with temperature-sensitive abf1 mutants should provide more insight into the roles played by this protein in initiation of DNA replication and transcriptional activation (192).

CPF1/CP1 binds to the centromeric core element CDE1 and to CDE1-like recognition sequences in several promoters (193; reviewed in 15, 15a). The protein is not essential and its gene has been characterized independently by three different groups (194-196). The C-terminal part of CPF1 contains a helix-loop-helix DNA-binding domain (196), followed by a leucine-zipper-like amphipathic α-helix (J. H. de Winde, unpublished results). cpf1 null mutants exhibit decreased minichromosome stability and methionine auxotrophy. Despite the presence of CDE1 sequence elements in the promoter regions of various methionine/S-adenosylmethionine biosynthesis genes, including the MET25 UAS (196, 197), CPF1 does not appear to be required for transcriptional activation (198). Recent evidence suggests that binding of CPF1 to the promoter regions of several of the MET genes enables the specific activator MET4 to stimulate transcription of these genes efficiently (199). As CPF1 cannot function as a transcriptional activator (199; J. H. de Winde and M. van Berkel, unpublished results), the protein is likely to act as an auxiliary regulator at various genomic loci (see Section VIII,D).

B. Roles of ABF1 and CPF1 in Regulation of the QCR8 Gene, Encoding a Respiratory-chain Component

The promoter region of the QCR8 gene, encoding the 11-kDa subunit VIII of the QCR complex, contains several cis-acting regulatory sequence elements (Table III; Fig. 6). The UAS contains a consensus target sequence for the HAP2/3/4 activator complex (118). Accordingly, carbon-source-regulated transcription of QCR8 depends on a functional HAP2/3/4 heteromer; transcription is not dependent on HAP1 (125). About 20 base-pairs upstream from the HAP2/3/4 site, overlapping target sequences for ABF1 and CPF1 are present, to which both factors bind in a mutually exclusive manner (118, 174).


FIG. 6. Transcription regulation of the QCR8 gene of Saccharomyces cerevisiae. Schematic representation of the promoter region of the QCR8 gene, encoding the 11-kDa subunit VIII of the yeast mitochondrial ubiquinol-cytochrome-c oxidoreductase (see also 125). Distances relative to the translational start codon are indicated. The HAP2/3/4 complex, binding to a consensus target sequence in the QCR8 UAS, specifically stimulates transcription on a nonfermentable carbon source (C). ABF1 is required for efficient transcriptional activation to occur under both repressing and derepressing growth conditions, by interacting directly or indirectly with the HAP2/3/4 complex (A). CPF1 acts, either directly or indirectly, as a repressive modulator of the induction response during escape from glucose repression (B). ABF1 and CPF1 may themselves be targets for regulatory interactions from signaling routes (?). The poly(dA-dT) stretch, which is also involved in transcriptional regulation, is indicated by a broken line, together with the alternating substitutions used in transcriptional analysis (see Fig. 7).

We have investigated what roles ABF1 and CPF1 play in transcriptional regulation of the QCR8 gene by analyzing the effects of binding-site mutations in the chromosomal context of the QCR8 promoter under repressing and derepressing growth conditions (125). ABF1 is required for optimal QCR8 transcription during anaerobic growth (J. H. de Winde, unpublished results) and during repressed and derepressed growth under aerobic conditions. In addition, ABF1 ensures a rapid and efficient induction of QCR8 transcription during escape from glucose repression, even when cell cycle progression is blocked in S-phase. CPF1 is not important during steady-state growth, but modulates the overall induction response by functioning as a negative regulator. When ABF1 cannot bind to the QCR8 promoter, the repressive effect of CPF1 can be overcome only by progression through the S-phase of the cell cycle, stressing the importance of ABF1 for efficient transcriptional induction (Fig. 6).

C. ABF1 Is Involved in General Control of Mitochondrial Biosynthesis

As basal and induced transcription of QCR8 are dependent on both ABF1 and HAP2/3/4 (125), it is tempting to speculate that these activators function synergistically, as previously observed for HAP1 and the multifunctional regulator RAP1 (32). Synergistic activation may involve other protein factors through additional DNA-protein and protein-protein interactions (Fig. 6). Support for this hypothesis comes from mutations in a poly(dA-dT) stretch, including eight consecutive dT-dA base-pairs, that separates the ABF1/CPF1 binding region from the HAP2/3/4 site (Fig. 6). Mutations in this region have only a minor effect on basal QCR8 transcription on glucose (Fig. 7). Disruption of the dT-dA stretch by substitution of dG for dT slightly lowers derepressed QCR8 mRNA levels, whereas deletion of the eight consecutive dT-dA base-pairs drastically decreases derepressed QCR8 transcription. Thus, the distance between the ABF1/CPF1-binding region and the HAP2/3/4 site appears to be important for proper transcriptional regulation of QCR8, whereas the integrity of the poly(dA-dT) stretch is apparently less important. Synergistic combination of ABF1 and other weak transcriptional activators can cause relatively strong activation (191), but the mechanism and participants of the interactions are unknown.

[Figure: Northern blot; lanes 1-5, YPD; lanes 6-10, YPE; probes: actin and QCR8]

FIG. 7. Mutations in the poly(dA-dT) stretch of the QCR8 promoter region cause a decrease in derepressed mRNA levels. Wild-type DL1 (MATa his3-11,15 leu2-3,112 ura3-251,328,372) (lanes 1 and 6) and isogenic QCR8 promoter mutants, containing a deletion of eight consecutive dT-dA base-pairs (lanes 2 and 7, 3 and 8) or specific substitutions of alternating dG for dT residues (lanes 4 and 9, 5 and 10) (see Fig. 6), were grown on a complete medium containing 4% glucose (YPD) or 2% (w/v) ethanol (YPE). Total RNA was isolated and transferred to Hybond-N membranes (Amersham), and hybridized with radioactively labeled probe fragments specific for actin and QCR8 mRNA as described previously (125). Basal QCR8 mRNA levels in mutants grown on glucose (lanes 2-4) are slightly lowered compared to the isogenic wild type (lane 1). Derepressed QCR8 mRNA levels in mutants lacking the eight consecutive dT-dA base-pairs (lanes 7 and 8) are severely reduced compared to wild type (lane 6). A disruption of the dT-dA stretch by two alternating dG residues causes a less severe reduction of derepressed QCR8 mRNA levels (lanes 8 and 9).

In contrast to the situation at QCR8, the UAS of the QCR2 gene encoding the 40-kDa subunit II of the QCR complex consists of an ABF1 binding site and a HAP2/3/4 target site immediately adjacent to each other (117). The promoter region of QCR2 is remarkably similar to the COX6 promoter region, and ABF1 is required for transcriptional regulation of COX6 under repressed and derepressed growth conditions, probably acting cooperatively with the HAP2/3/4 complex (200). Together with the presence of ABF1-binding sites in promoter regions of genes encoding iso-1 cytochrome c and all but one of the QCR subunits (Table III), this sets the stage for ABF1 to function as a global regulator of respiratory competence in S. cerevisiae.

The promoter region of the LPD1 gene, encoding dihydrolipoyl dehydrogenase (201), contains, besides a functional HAP2/3/4 target site (202), binding sites for ABF1 and the related multifunctional regulator RAP1 (15, 179), for CPF1, for GCN4, which is involved in general control of amino-acid biosynthesis (41), and for the heat-shock transcription factor HSF (132; see also Table III). Dihydrolipoyl dehydrogenase is the common component of the cytoplasmic pyruvate dehydrogenase and the mitochondrial α-ketoglutarate dehydrogenase complexes (201, 203, 204; see Table I). By formation of acetyl-CoA and succinyl-CoA, both complexes supply and maintain the metabolic turnover of the citric acid cycle (see Fig. 1), and hence are essential for the respiratory competence of mitochondria. The LPD1 gene therefore presents an intriguing example of global regulation of mitochondrial biogenesis, as various transcription regulatory pathways converge on its promoter region, including control by several general regulators.

D. General Transcriptional Regulation by Multifunctional Regulators

Transcriptional regulation of nucleus-encoded mitochondrial protein genes by multifunctional regulators like ABF1 may enable the yeast cell to adjust the rate of mitochondrial biogenesis in relation to overall cellular metabolism and growth rate. How is transcriptional regulation through ABF1 achieved? As stated above, ABF1 may function synergistically with more specific regulatory proteins like the HAP2/3/4 heteromer (Fig. 6). Carbon-source regulation of mitochondrial protein genes is assumed to be dependent on HAP2/3/4, but in addition ABF1 may itself be a target for signal-transduction pathways involved in nutrient sensing and growth control (166). This idea is supported by the findings that the binding activity of ABF1 varies with the growth rate (166) and that the phosphorylation state of the protein varies accordingly in response to the available carbon source (192, 205). ABF1 remains bound to its target sites at various promoters in vivo under different growth conditions (W. Mulder and J. H. de Winde, unpublished results). Through additional protein-protein interactions, other proteins may contribute to the overall transcriptional regulation.

Binding of ABF1 to the QCR8 5' flank creates a nuclease-hypersensitive nucleosome-free window, spanning the promoter region (206). Thus, ABF1 is involved in maintaining a well-defined chromatin structure at QCR8, ensuring accessibility of the promoter and, hence, efficient transcriptional activation under various growth conditions. Likewise, the multifunctional regulator RAP1 is involved in the formation of a nuclease-sensitive chromatin structure at the HIS4 promoter, probably ensuring accessibility for the specific activators GCN4 and BAS1/BAS2 (207). These general regulators may therefore play comparable roles in different promoter contexts. A role for CPF1 in chromatin structure determination at QCR8 is not clearly established (206), although in other promoters CPF1-binding sites appear to be associated with nuclease-hypersensitive sites (196). At the MET16 and MET25 promoters, binding of CPF1 ensures efficient transcriptional activation by MET4, probably by increasing accessibility of the activator for its target sites (199). For other genes, including QCR8, transcriptional regulation through CPF1 is still largely presumptive.

IX. Mitochondrial Biogenesis in an Evolutionary Perspective

Cytoplasmic petite strains of the facultative anaerobic yeast S. cerevisiae, having lost part (ρ⁻) or all (ρ⁰) of their mitochondrial DNA, are perfectly viable when grown on a fermentable carbon source. This demonstrates that, in this petite-positive yeast, synthesis of mitochondrially encoded proteins is required only for respiratory functions. In petite-negative yeasts like Kluyveromyces lactis and Schizosaccharomyces pombe, ρ⁻ or ρ⁰ mutations are lethal. Since several nuclear and mitochondrial mutations causing respiratory deficiency have been isolated from these yeasts, loss of viability is not likely to be due to loss of respiration. Apparently, dependence of cellular metabolism on biosynthesis of a functional mitochondrion is altered in these yeasts. A clue is given by the recent observation that one single nuclear mutation can alleviate lethality of the ρ⁰ mutation in Schizosaccharomyces pombe (207a).

Investigation of transcriptional regulation of nuclear genes encoding mitochondrial proteins in Kluyveromyces lactis and Schizosaccharomyces pombe will help clarify the regulatory mechanisms governing mitochondrial biogenesis in S. cerevisiae. cis- and trans-acting elements controlling transcription of genes encoding respiratory-chain components in Kluyveromyces lactis, which is distantly related to S. cerevisiae, appear to be functionally and structurally related to their Saccharomyces counterparts (W. Mulder and J. H. de Winde, unpublished results). The recent characterization of counterparts of ABF1 (208) and CPF1 (W. Mulder, J. H. de Winde and L. A. Grivell, unpublished results) in Kluyveromyces lactis should provide insight into the coupling among mitochondrial biogenesis, metabolic requirements, and cell growth.

In Schizosaccharomyces pombe, which is extensively diverged from S. cerevisiae, the gene encoding a functional counterpart of Saccharomyces HAP2 has been identified (122). Strikingly, only the essential core domain of the protein, involved in subunit association and specific DNA binding of the HAP2/3/4 complex, is highly conserved throughout evolution. The remainder of the protein is completely diverged, although also in Schizosaccharomyces the HAP2 protein is specifically required for respiratory competence. The function of the yeast HAP2/3/4 activator complex contrasts with that of the homologous CCAAT-binding complex NF-Y of higher eukaryotes, which acts as a global transcriptional activator. Again, the functional counterparts of HAP2 and HAP3, termed NF-YA and NF-YB, show high amino-acid conservation in the essential core domains, and can substitute for the respective yeast proteins (209, 210).

In higher eukaryotes, features of transcriptional regulation in relation to mitochondrial biogenesis are only beginning to be unraveled. Nevertheless, interesting observations have already been made. In mammals, expression of nucleus-encoded respiratory-chain components is growth-regulated (211, 211a). Common protein-binding sites, probably involved in coordinate transcriptional regulation, have been identified in the promoter regions of several mammalian nuclear genes encoding respiratory-chain proteins (212). A candidate for a common regulatory protein in mammalian cells is the recently identified nuclear respiratory factor NRF1 (213, 213a). Biologically active target sites for NRF1 are present in the promoter regions of various genes encoding proteins involved in respiration and mitochondrial DNA replication. In addition, NRF1 binds to the promoter regions of genes involved in protein synthesis and amino-acid catabolism (213a). Like yeast ABF1, mammalian NRF1 may be involved in coordinating mitochondrial biosynthesis in relation to cellular growth and metabolism.

X. Conclusions and Prospects

The yeast S. cerevisiae is a valuable organism for the study of mitochondrial biogenesis. In most eukaryotes, mutations affecting mitochondrial function are ultimately lethal, because of a strict requirement for respiration and energy production. Although several key metabolic functions of mitochondria are indispensable, S. cerevisiae can temporarily dispense with its respiratory chain and fall back on fermentative growth for its energy production. Oxygen and the type of carbon source are not only major regulatory elements for the synthesis of mitochondrial constituents, but in addition cause responses mediating important changes in cellular metabolism and growth rate.

Expression of most nuclear genes encoding mitochondrial proteins is regulated at the transcriptional level. Research so far has concentrated mainly on unraveling molecular mechanisms involved in transcriptional control. During the last decade, a hierarchy of trans-acting regulatory factors has been uncovered. These include dedicated transcriptional activators displaying specificity for the family of nucleus-encoded mitochondrial protein genes, as well as more general factors that also affect other gene families. Regulatory factors are themselves controlled by intricate signal-transduction routes involved in nutrient sensing and growth control. In many instances, transcriptional control is exerted by positive and negative events, acting through parallel pathways. Overall, control of biosynthesis and function of mitochondria clearly are integrated into the complex regulatory networks concerning cellular growth and metabolic requirements.

Despite the enormous amount of information concerning mitochondrial biogenesis in yeast, many details still have to be worked out. Knowledge of the signal-transduction pathways involved in nutrient sensing and growth control is still fragmentary. Transcription factors involved act on various promoters in different combinations, indicating that the context of cis-acting control sequences plays an important role in the regulation process. Additional proteins may be involved in transcriptional regulation through protein-protein interactions with the identified DNA-binding regulators (214, 215). Furthermore, the way in which the metabolic state of the mitochondrion can influence nuclear gene expression may well reveal interesting aspects of transcriptional regulation.

Biosynthesis of the respiratory-chain complexes presents important aspects of mitochondrial biogenesis. Coordinated assembly of the complexes requires that the individual components be synthesized in stoichiometric amounts. As described above, regulation is exerted primarily at the transcriptional level. In addition, fine-tuning and control are likely to exist at the level of assembly and turnover of the different subunits. Several of the noncatalytic subunits of ubiquinol-cytochrome-c oxidoreductase (QCR) have been shown to play important roles in assembly and stability of the complex (173, 216). Each of the nuclear genes involved in mitochondrial biosynthesis possesses different configurations and combinations of binding sites for shared regulatory proteins (e.g., Table III). Hence, further study is required to understand how coordinate synthesis results from overall transcriptional regulation.


GLOSSARY OF GENE NAMES AND ABBREVIATIONS

Gene name    Encoding/meaning of abbreviation    References

AAC    ADP/ATP carrier or translocator protein    69
ABF    ARS-binding factor    15, 15a, 57
ADH    Alcohol dehydrogenase    217, 218
ADR    Regulator of alcohol dehydrogenase expression    217, 218
ANB    Anaerobically expressed gene    22a, 63
BAS    Regulator of basal level transcription    207, 219
BCY1    Regulatory subunit of cAMP-dependent protein kinase    79
CAR1    Arginase    37a
CIT    Citrate synthase    157
COX    Cytochrome oxidase (subunit)    22a, 115, 116
CPF    Centromere- and promoter-binding factor    15, 15a, 196
CTT1    Cytosolic catalase T    38
CYB2    Cytochrome b2 (L-lactate cytochrome-c reductase)    37
CYC    Gene involved in cytochrome biosynthesis    99
CYC1    Iso-1 cytochrome c    49, 99
CYC7    Iso-2 cytochrome c    33, 35, 99
CYP    Regulator of heme and cytochrome biosynthesis    30, 39, 129
CYR1    Adenylate cyclase (CDC35)    76, 78
CYT1    Cytochrome c1    36, 36a
ERG    Gene involved in sterol biosynthesis    4, 48
GAL    Gene involved in galactose metabolism    12
GCN    Gene involved in positive regulation of general amino-acid biosynthesis    10, 41
GRR1    General regulator of glucose repression    93
HAP    Heme activator protein    49
HEM    Gene involved in heme biosynthesis    24, 115
HEX    Regulator of glucose/hexose repression    95
HIS    Gene involved in histidine biosynthesis    41, 207
HMG    3-Hydroxy-3-methylglutaryl-CoA reductase    62
HMG    High-mobility-group protein    56, 57
HML    Silent locus for mating-type α    20, 177
HMR    Silent locus for mating-type a    20, 177
HO    Homothallism locus; endonuclease responsible for initiation of mating-type switch    20
HSF    Heat-shock transcription factor    132
HXK    Hexokinase    91
IMP    Gene involved in galactose metabolism, independent of mitochondrial phenotype    146, 148
IRA    Inhibitor of RAS activity    76
KGD    α-Ketoglutarate dehydrogenase    203, 204
LPD1    Lipoamide dehydrogenase    201
MDM    Gene involved in mitochondrial distribution and morphology    168
MET    Gene involved in methionine biosynthesis    197, 199
MIG1    Multi-copy inhibitor of GAL-gene expression; mediator of glucose repression    96, 97
MSI    Multi-copy suppressor of ira1 deficiency    80
NRF1    Mammalian nuclear respiratory transcription factor    213
OLE    Gene involved in fatty-acid and sterol biosynthesis    4, 170
PDE    Phosphodiesterase    81
PET    Important for mitochondrial respiratory competence; deficiency causes petite phenotype    26, 28
PHO    Gene involved in phosphate metabolism    220
QCR    Ubiquinol-cytochrome-c oxidoreductase (subunit)    5, 173
RAP1    Repressor/activator protein    15, 15a, 221
RAS    Regulator of activating nutrient signal; formerly “inducer of rat sarcoma”    76, 222
REG    Regulator of general glucose repression    95
REO1    Repressor dependent on oxygen    54
RGR1    Regulator of general glucose repression    94
ROX    Repressor dependent on oxygen    53, 55
SIN    Mating-type switch inhibitor; repressor of HO expression    20, 58
SNF    Gene involved in sucrose metabolism; deficiency causes sucrose non-fermenting phenotype    70
SPT    Suppressor of Ty-insertion mutation    58, 223
SSN    Suppressor of snf phenotype    70, 98
SUC    Gene involved in sucrose metabolism    70
SUC2    Invertase    70, 98
SWI    Mating-type switch inducer; activator of HO expression    20, 87
TIF    Translation-initiation factor    64
TUP    Gene involved in regulation of thymidine uptake    100

ACKNOWLEDGMENTS

We gratefully acknowledge John Diffley, Bernard Guiard, Tim Pillar, Richard Trumbly, and Bernard Trumpower for communicating results and ideas prior to publication. We thank Herman Pel and Wietse Mulder for critical reading of the manuscript and other members of our laboratory for interactive and stimulating discussions. The original research described in this review was supported in part by The Netherlands Foundation for Chemical Research (SON), with financial support from The Netherlands Organization for the Advancement of Pure Research (NWO).

REFERENCES

1. T. G. Cooper, in “The Molecular Biology of the Yeast Saccharomyces; Metabolism and Gene Expression” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 39. CSHLab, Cold Spring Harbor, New York, 1982.
2. D. G. Fraenkel, in “The Molecular Biology of the Yeast Saccharomyces; Metabolism and Gene Expression” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 1. CSHLab, Cold Spring Harbor, New York, 1982.
3. E. W. Jones and G. R. Fink, in “The Molecular Biology of the Yeast Saccharomyces; Metabolism and Gene Expression” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 181. CSHLab, Cold Spring Harbor, New York, 1982.
4. S. A. Henry, in “The Molecular Biology of the Yeast Saccharomyces; Metabolism and Gene Expression” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 101. CSHLab, Cold Spring Harbor, New York, 1982.
5. L. A. Grivell, EJB 182, 477 (1989).
6. S. L. Forsburg and L. Guarente, ARCell Biol 5, 153 (1989).
7. W. Chen and K. Struhl, EMBO J. 4, 3273 (1985).
8. S. Buratowski, S. Hahn, L. Guarente and P. A. Sharp, Cell 56, 549 (1989).
9. L. Guarente, Cell 36, 799 (1984).
10. K. Struhl, Cell 49, 295 (1987).
11. L. Guarente, ARGen 21, 425 (1987).
12. C. A. Stanway, BioEssays 13, 241 (1991).
13. A. D. Johnson and I. Herskowitz, Cell 42, 237 (1985).
14. H. S. Yoo and T. G. Cooper, MCBiol 9, 3231 (1989).
15. J. F. X. Diffley, Antonie van Leeuwenhoek, Int. J. Gen. Mol. Microbiol. 61, 25 (1992).
15a. T. Doorenbosch, W. H. Mager and R. J. Planta, Gene Expression 2, 193 (1992).
16. P. A. Sharp, Nature 351, 16 (1991).
16a. K. F. Stringer, C. J. Ingles and J. Greenblatt, Nature 345, 783 (1990).
17. M. Grunstein, ARCell Biol 6, 643 (1990).
18. L. K. Durrin, R. K. Mann, P. S. Kayne and M. Grunstein, Cell 65, 1023 (1991).
19. D. I. Chasman, N. F. Lue, A. R. Buchman, J. W. LaPointe, Y. Lorch and R. D. Kornberg, Genes Dev. 4, 503 (1990).
20. K. Nasmyth and D. Shore, Science 237, 1162 (1987).
21. G. F. Sprague, Jr., Trends Genet. 7, 393 (1991).
22. L. Marsh, A. M. Neiman and I. Herskowitz, ARCell Biol 7, 699 (1991).
22a. R. S. Zitomer and C. V. Lowry, Microbiol. Rev. 56, 1 (1992).
23. R. Labbe-Bois and P. Labbe, in “Biosynthesis of Heme and Chlorophylls” (H. A. Daily, ed.), p. 235. McGraw-Hill, New York, 1990.
24. D. Urban-Grimal and R. Labbe-Bois, MGG 183, 85 (1981).
25. L. Guarente and T. Mason, Cell 32, 1279 (1983).
26. M. C. Costanzo and T. D. Fox, MCBiol 6, 3694 (1986).
27. D. L. Marykwas and T. D. Fox, MCBiol 9, 484 (1989).
28. C. A. Strick and T. D. Fox, MCBiol 7, 2728 (1987).
29. K. Pfeifer, K. S. Kim, S. Kogan and L. Guarente, Cell 56, 291 (1989).
30. F. Creusot, J. Verdière, M. Gaisne and P. P. Slonimski, JMB 204, 263 (1988).
31. K. Pfeifer, B. Arcangioli and L. Guarente, Cell 49, 9 (1987).
32. R. Sousa and B. Arcangioli, EMBO J. 8, 1801 (1989).
33. K. Pfeifer, T. Prezant and L. Guarente, Cell 49, 19 (1987).
34. T. Prezant, K. Pfeifer and L. Guarente, MCBiol 7, 3252 (1987).
35. R. S. Zitomer, J. W. Sellers, D. W. McCarter, G. A. Hastings, P. Wick and C. V. Lowry, MCBiol 7, 2212 (1987).
36. J. C. Schneider and L. Guarente, MCBiol 11, 4934 (1991).
36a. U. Oechsner, H. Hermann, A. Zollner, A. Haid and W. Bandlow, MGG 232, 447 (1992).
37. T. Lodi and B. Guiard, MCBiol 11, 3762 (1991).
37a. R. M. Luche, R. Sumrada and T. G. Cooper, MCBiol 10, 3884 (1990).
38. H. Winkler, G. Adam, E. Mattes, M. Schanz, A. Hartig and H. Ruis, EMBO J. 7, 1799 (1988).
39. J. Verdière, M. Gaisne, B. Guiard, N. Defranoux and P. P. Slonimski, JMB 204, 277 (1988).
40. R. Marmorstein, M. Carey, M. Ptashne and S. C. Harrison, Nature 356, 408 (1992).
41. I. A. Hope, S. Mahadevan and K. Struhl, Nature 333, 635 (1988).
42. T. M. Laz, D. F. Pietras and F. Sherman, PNAS 81, 4475 (1984).
43. K. S. Kim and L. Guarente, Nature 342, 200 (1989).
44. K. S. Kim, K. Pfeifer, L. Powel and L. Guarente, PNAS 87, 4524 (1991).
45. M. E. Cerdán and R. S. Zitomer, MCBiol 8, 2275 (1988).
46. J. Verdière, M. Gaisne and R. Labbe-Bois, MGG 228, 300 (1991).
47. T. Keng, MCBiol 12, 2616 (1992).
48. T. G. Turi and J. C. Loper, Yeast 6, S234 (1990).
49. L. Guarente, B. Lalonde, P. Gifford and E. Alani, Cell 36, 503 (1984).
50. J. Olesen, S. Hahn and L. Guarente, Cell 51, 953 (1987).
51. S. L. Forsburg and L. Guarente, MCBiol 8, 647 (1988).
52. S. L. Forsburg and L. Guarente, Genes Dev. 3, 1166 (1989).
53. C. V. Lowry and R. S. Zitomer, PNAS 81, 6129 (1984).
54. C. E. Trueblood, R. M. Wright and R. O. Poyton, MCBiol 8, 4537 (1988).
55. C. V. Lowry and R. S. Zitomer, MCBiol 8, 4651 (1988).
56. H.-M. Jantzen, A. Admon, S. P. Bell and R. Tjian, Nature 344, 830 (1990).
57. J. F. X. Diffley and B. Stillman, PNAS 88, 7864 (1991).
58. W. Kruger and I. Herskowitz, MCBiol 11, 4135 (1991).
59. C. V. Lowry, M. E. Cerdán and R. S. Zitomer, MCBiol 10, 5921 (1990).
59a. M. R. Hodge, K. Singh and M. G. Cumsky, MCBiol 10, 5510 (1990).
60. M. G. Cumsky, C. E. Trueblood, C. Ko and R. O. Poyton, MCBiol 7, 3511 (1987).
61. R. R. Matner and F. Sherman, JBC 257, 9811 (1982).
62. M. Thorsness, W. Schafer, L. D'Ari and J. Rine, MCBiol 9, 5702 (1989).
63. K. D. Mehta, D. Leung, L. Lefebvre and M. Smith, JBC 265, 8802 (1990).
64. J. Schnier, H. G. Schwelberger, Z. Smit-McBride, H. A. Kang and J. W. B. Hershey, MCBiol 11, 3105 (1991).
65. M. G. Cumsky, C. Ko, C. E. Trueblood and R. O. Poyton, PNAS 82, 2235 (1985).
66. C. E. Trueblood and R. O. Poyton, MCBiol 7, 3520 (1987).
67. M.
R. Hodge, G. Kim, K. Singh and M. 6. Cumsky, MCBil9, 1958 (1989). 68, R. A. Waterland, A. Basu, B. Chance and R. 0. Poyton, JBC 266, 4180 (1991). 69. J. Kolarov, N. Kolarova and N. Nelson, JBC 265, 12711 (1990). 69a. H. A. Kang, H. G. Schwelberger and J. W.B. Hershey, MGG 233, 487 (1992). 70. M. Carlson, J . Bact. 169, 4873 (1987). 71. L. G . Vallier and M. Carlson, Genetics 129, 675 (1991). 72. J. L. Celenza and M. Carlson, Science 233, 1175 (1986). 73. J. L. Celenza and M. Carlson, MCBioZ 9, 5034 (1989). 74. J. L. Celenza, F. J. Eng and M. Carlson, MCBiol 9, 5045 (1989). 75. S . Thompson-Jaeger, J. FranGois, J. P. Gaughran and K. Tatchell, Genetics 129,697 (1991). 76. 1. M. Thevelein, Mol. Microbiol. 5, 1301 (1991). 77. K. Mbonyi, L. van Aelst, J. C. Arguelles, A. W.H . Jans and J. M. Thevelein, MCBiol 10, 4518 (1990). 78. K. Matsumoto, I. Uno and T. Ishikawa, Genetics 108, 53 (1984). 79. K. Matsumoto, I. Uno, T. Ishikawa and Y. Oshima, /. Bact. 156, 898 (1983). 80. R. Ruggieri, K. Tanaka, M. Nakafuku, Y. Kaziro, A. Toh-e and K. Matsumoto, PNAS 86, 8778 (1989). 81. P. Sass, J. Field, J. Nikawa, T. Toda and M . Wigler, PNAS 83, 9303 (1986). 82. E. J. A. Hubbard, X. Yang and M. Carlson, Genetics 130, 71 (1992). 83. E . Abrams, L. Neigeborn and M. Carlson, MCBiol 6, 3643 (1986).

88

J . H. DE WINDE AND L. A . GHIVELL

B. C . Laurent, M. A. Treitel and M. Carlson, MCBiol 10, 5616 (1990). A . M. Happel, M. S. Swanson and F. Winston, Genetics 128, 69 (1991). B. C. Laurent, M . A. Treitel and M. Carlson, PNAS 88, 2687 (1991). C. L. Peterson and I. Herskowitz, Cell 68, 573 (1992). 88. J. L. Davis, R. Kunisawa and J. Thorner, MCHiol 12, 1879 (1992). 89. B. C . Laurent, X. Yang and M. Carlson, MCBiol 12, 1893 (1992). 90. L. Marshall-Carlson, J. L. Celenza, B. C. Laurent and M . Carlson, MCBiol 10, 1105 (1990). 91. H . Ma, L. M. Bloom, C. T. Walsh and D. Botstein, MCBiol 9, 5643 (1989). 92. P. Herrero, R. Ferniindez and F. Moreno, J. Gerr. Micr-obiol. 135, 1209 (1989). 93. J. S. Flick and M. Johnston, MCBiol 11, 5101 (1991). 94. A. Sakai, Y. Shimizu, S. Kondou, T. Chihazakura and F. Hishinuma, MCBiol 10, 4130 (1990). 95. D. Niederacher and K.-D. Entian, MGG 206, 505 (1987). 95a. D . Niederacher and K.-D. Entian, EJR 200, 311 (19911. 96. J. 0. Nehlin and H. Ronne, E M B O J . 9, 2891 (1990). 97. J. 0. Nehlin, M. Carlberg and H. Ronne, E M B O J. 10, 3373 (1991). 98. M . Carlson, B. C. Osmond, L. Neigeborn and D. Botstein, Genetics 107, 19 (1984). 99. R. J. Rothstein and F. Sherman, Genetics 94, 871 (1980). 100. R. J. Trumhly, J. Bact. 166, 1123 (1986). 101. R. J. Trumbly, Mol. Microbiol. 6, 15 (1992). 102. J. Schultz and M. Carlson, MCBiol 7, 3637 (1987). 103. R. J. Trurnbly, Gene 73, 97 (1988). 104. J. Schultz, L. Marshall-Carlson and M . Carlson, MCBiol 10, 4744 (1990). 104a. J. Schultz, L. Marshall-Carlson and M. Carlson, MCRiol 12, 2909 (1992). 105. N . Mermod, E. A. O’Neill, T. J. Kelly and R. Tjian, Cell 58, 741 (1989). 106. R. S. Sikorski, M. S. Boguski, M. Goebl and P. Hieter, Cell 60, 307 (1990). 107. T Hirano, N. Kinoshita, K. Morikawa and M. Yanagida, Cell 60, 319 (1990). 108. F. E. Williams and R. J. Trumbly, MCBiol 10, 6500 (1990). 109. M. Zhang, L. S. Rosenbluin-Vos, C. V. Lowry, K. A. Boakye and R. S. Zitomer, Gene 97, 153 (1991). 110. H. K. W. Fong, T. T. Amatruda 111, B. 
W. Birren and M. I. Simon, PNAS 84, 3792 (1987). 111. M .Whiteway, L. Hougan, D. Dignard, D. Y. Thomas, L. Bell, 6. C. Saari, F. J. Grant, P. O’Hara and i7. L. MacKay, Cell 56, 467 (1989). 112. M. Goebl and M . Yanagida, TZBS 16, 173 (1991). 113. F. E. Williams, U . Varanasi and R. J. Tnunbly, MCBiol 11, 3307 (1991). 114. C . A . Keleher, M. J. Redd, J. Schultz, M. Carlson and A. D. Johnson, Cel/ 68, 709 (1992). 115. T. Keng and L. Cuarente, PNAS 84, 9113 (1987). 116. J. D. Trawick, C . Rogness and R. 0. Poyton, MCHiol 9, 5350 11989). 117. J. C. Dorsman and L. A. Grivell, Curr. Genet. 17, 459 (1990). 118. A. C. Maarse, M. d e Haan, A. Bout and L. A . Grivell, NARes 16, 5797 (1988). 119. S. Hahn and L. Guarente, Science 240, 317 (1988). 120. J. T. Olesen and L. Guarente, Genes Dew. 4, I714 (1990). 121. J. L. Pinkhain, J. T. Olesen and L. Guarente, MCHiol 7 , 578 (1987). 122. J. T Olesen, J. 11. Fikes and L. Guarente, MCBiol 11, 611 (1991). 123. S. Hahn, J. Pinkham, R. Wei, R. Miller and L. Guarente, MCBiol 8, 65.5 (1988). 124. E. Giniger and M . Ptashne, Nature 330, 670 (1987). 125. J. H. d e W i d e and L. A. Grivell, MCRiol 12, 2872 (1992). 126. J. L. Pinkhain and L. Guarente, MCBiol5, 3410 (1985). 127. J. S. Flick and M . Johnston, MCBioZ 10, 4757 (1990). 84. 85. 86. 87.

REGULATION OF MITOCHONDRIAL BIOGENESIS

89

128. J. S. Flick and M. Johnston, Genetics 130, 295 (1992). 129. J. Verdiere, F. Creusot and M. Guerineau, MGG 199, 524 (1985). 130. R. W. Wright and R. 0. Poyton, MCBiol 10, 1297 (1990). 131. H. Ma and D. Botstein, MCBiol 6, 4046 (1986). 132. S. Lindquist, ARB 55, 1151 (1986). 133. N . Pfanner and W. Neupert, ARB 59, 331 (1990). 134. B. J. Smith and M. P. Yde, MCBiol 11, 2647 (1991). 135. D.-Y. Shin, K. Matsumoto, H. Iida, I. Uno and T. Ishikawa, MCBiol 7, 244 (1987). 136. C. H . Dupont, M. Rigoulet, M. Aigle and 8 . Guerin, Curr. Genet. 17, 465 (1990). 137. C. H . Dupont, M. Rigoulet, B. Beauvoit and B. GuBrin, Curr. Genet. 17, 507 (1990). 138. F. Busserean, C.-H. Dupont, E. Boy-Marcotte, L. Mallet and M. Jacquet, Curr. Genet. 21, 325 (1992). 139. J. R. Warner and C. Gorenstein, Nature 275, 338 (1978). 140. C . M. Moehle and A. G. Hinnebusch, MCBiol 11, 2723 (1991). 141. R. Cantwell, C. M. McEntee and A. P. Hudson, Curr. Genet. 21, 241 (1992). 142. K. Matsumoto, I. Uno and T. Ishikawa, Yeust 1, (1985). 143. T. M. Pillar and R. E. Bradshaw, Curr. Genet. 20, 185 (1991). 144. T. Belazzi, A. Wagner, M. Schanz, G. Adam, R. Wieser and H. Ruis, Yeast 6, S266 (1990). 145. H. R. Mahler and D. Wilkie, Plasmid 1, 125 (1978). 146. A. A. Algeri, L. Bianchi, A. M. Viola, P. P. Puglisi and N. Marmiroli, Genetics 97, 27 (1981). 147. T. L. Ulery, D. A. Mangus and J. A. Jaehning, MGG 230, 129 (1991). 148. C. Donnini, T. Lodi, I. Ferrero and P. P. Puglisi, Yeast 8 , 83 (1992). 149. W. Bajwa, T. E. Torchia and J. E. Hopper, MCBiol 8, 3439 (1988). 150. T. V. Siemens, D. L. Nichols and R. S. Zitomer, J . Baet. 142, 499 (1980). 151. V. S. Parikh, M. M. Morgan, R. Scott, L. S. Clements and R. A. Butow, Science 235, 576 (1987). 152. J. Parteledis and T. Mason, MCBiol 8, 3647 (1988). 153. V. S. Parikh, H. Conrad-Webb, R. Docherty and R. A. Butow, MCBiol 9, 1897 (1989). 154. R. A. Butow and H. Conrad-Webb, in “Structure, Function and Biogenesis of Energy Transfer Systems” (E. 
Quagliariello, S. Papa, F. Palmieri and C. Saccone, eds.), p. 175. Elsevier, Amsterdam, 1990. 1.55. A. P. 6 . M. van Loon, R. J. de Groot, E. van Eyk, G . T. J. van der Horst and L. A. Grivell, Gene 20, 323 (1982). 156. L. E. Farrell, J. D. Trawick and R. 0. Poyton, in “Structure, Function and Biogenesis of Energy Transfer Systems” (E. Quagliariello, S. Papa, F. Palmieri and C. Saccone, eds.), p. 131. Elsevier, Amsterdam, 1990. 157. X. Liao, W. C. Small, P. A. Srere and R. A. Rutow, M C B i o l 11, 38 (1991). 158. M. A. Parisi and D. A. Clayton, Science 266, 965 (1991). 159. R. P. Fisher, T. Lisowsky, M. A. Parisi and D. A. Clayton, J B C 267, 3358 (1992). 160. J. F. X. DifRey and B. Stillman, JBC 267, 3368 (1992). 161. G. Palade, in “Methods in Enzymology” (S. Fleischer and 8. Fleischer, eds.), Vol. 96, p. 29. Academic Press, New York, 1983. 162. G . Attardi and G. Schatz, ARCell Biol 4, 289 (1988). 163. J. R. Pringle and L. H. Hartwell, in “The Molecular Biology ofthe Yeast Saccharomyces; Life Cycle and Inheritance” (J. N . Strathern, E. W. Jones and J. R. Broach, eds.), p. 97. CSHLab, Cold Spring Harbor, New York, 1981. 1 6 4 . B. Stevens, in “The Molecular Biology ofthe Yeast Saccharomyces; Life Cycle and Inheritance” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 471. CSHLab, Cold Spring Harbor, New York, 1981.

90

J. H. DE WINDE AND L . A. GRIVELL

165. E. M . McIntosh, R. W. Ord and R. K. Storms. MCBioZ 8, 4616 (1988). 166. J. H . de Winde, P1i.D. Thesis, University of Amsterdam, 1992. 167. C. W. Jacobs, A. E. M . Adams, P. J. Szaniszlo and J. R. Pringle, J B C 107, 1409 (1988). 168. S. J. McConnell, L. C. Stewart, A. Talin and M . P. Yaffe, J . Cell B i d . 111, 967 (1990). 169. L. C. Stewart and M . P. Yaffe, J. Cell B i d . 115, 1249 (1991). 170. V. M. McDonough, J. E . Stukey and C. E. Martin, JBC 267, 5931 (1992). 171. S. J. McConnell and M . P. Y&e, J. Cell B i d . 118, 385 (1992). 172. M . H. Herruer, W. H. Mager, T. M. Doorenbosch, P. L. M . Wessels, T Wassenaar and R. J. Planta, NARes 17, 7427 (1989). 173. S. d e Vries and C. A. M. Marres, BBA 895, 205 (1987). 174. J. C. Dorsman, W. C. van Heeswijk and L. A. Grivell, NARes 16, 7287 (1988). 175. J. C. Dorsman, A. Gozdzicka-Jozefiak, W. C. van Heeswijk and L. A. Grivell, Yeast 7,401 (1991). 176. M . Snyder, A. R. Buchman and R. W. Davis, Nature 324, 87 (1986). 177. A. R. Buchman, W. J. Kimmerly, J. Rine and R. D. Kornberg, MCBiol 8, 210 (1988). 178. J. F. X. Diffley and B. Stillman, PNAS 85, 2120 (1988). 179. J. F. X. Diffley and B. Stillman, Science 246, 1034 (1989). 180. H. Halfter, B. Kavety, J. Vandekerckhove, F. Kiefer a i d D. Gallwitz, E M B O J . 8, 4265 (1989). 181. P. R. Rhode, K. S. Sweder, K. F. Oegemaand J. L. Campbell, GenesDec. 3, 1926(1989). 182. S. C. Francesconi and S. Eisenberg, PNAS 88, 4089 (1991). 183. S. S. Walker, S. C. Francesconi, B.-K. Tye and S. Eisenberg, MCBioZ 9, 2914 (1989). 184. Y. Marahrens and B. Stillman, Science 255, 817 (1992). 185. A. Goel and R. E. Pearlman, MCBiol 8, 2572 (1988). 186. J. C. Dorsman, M . M . Doorenbosch, C. 1: C. Maurer, J. H. d e Winde, W. H. Mager, R. J. Planta and L. A. Grivell, NARes 17, 4917 (1989). 187. H. Halfter, U. Miiller, E. L. Winnacker and I>. Gallwitz, E M B O J . 8, 3029 (1989). 188. F. Della Seta, S:A. CiafrC., C. Marck, B. Santoro, C. Presutti, A. Sentenac and I. 
Bozzoni, MCBiol 10, 2437 (1990). 189. A. H. Brand, 6 . Micklem and K. Nasmith, Cell 51, 709 (1987). 190. W. Kimrnerley, A. Buchman, R. Kornberg and J. Rine, E M B O J . 7, 2241 (1988). 191. A. R. Buchman and R. D. Kornberg, MCBiol 10, 887 (1990). 192. P. R. Rhode, S. Elsasser and J. L. Campbell, MCBiol 12, 1064 (1992). 193. R. J. Bran1 and R. D. Kornherg, MCBiol 7, 403 (1987). 194. R. E. Baker and D. C . Masison, MCBio! 10, 2458 (1990). 195. M. Cai and R. W. Davis, Cell 61, 437 (1990). 196. J. Mellor, W. Jiang, M . Funk, J. Rathjen, C. A. Barnes, T. Hini, J. H. Hegeniann and P. Philippsen, E M B O J . 12, 4017 (1990). 197. D. Thoinas, H. Cherest and Y. Surdin-Kerjan, MCBiol9, 3292 (1989). 198. J. Mellor, J. Rathjen, W. Jiang and S. J. Dowell, NARes 19, 2961 (1991). 199. D. Thomas, 1. Jaqueniin and Y. Surdin-Kerjan, MCBioZ 12, 1719 (1992). 200. J. D. Trawick, N. Kraut, F. R. Simon and R. 0. Poyton, MCBiol 12, 2302 (1992). 201. J. Ross, 6. A. Reid and I. W. Dawes, I . Gen. Microbid. 134, 11.31 (1988). 202. S. B. Bowman, 2. Zaman, L. P. Collinson, A. J. P. BrownandI. W. Dawes, MGG231,296 (1992). 203. B. Repetto and A. Tzagoloff, MCBiol9, 2695 (1989). 204. B. Repetto and A. Tzagoloff, MCBiol 10, 4221 (1990). 205. S. Silve, J. D. Trawick and R. 0. Poyton, Yeast 6, S246 (1990). 206. J. H. d e Winde, H. van Leeuwen and L. A. Grivell, Yeast in press (1993).

REGULATION OF MITOCHONDRIAL BIOGENESIS

91

207. C . Devlin, K. Tice-Baldwin, D. Shore and K. T. Arndt, MCBiol 11, 3642 (1991). 207a. P. H a t e r and T. I>. Fox, Genetics 131, 255 (1992). 208. P. M. Gonplves, K. Maurer, W. H. Mager and R. J. Planta, NARes 20, 2211 (1992). 209. L. A. Chodosh, J. Olesen, S. Hahn, A . S. Baldwin, L. Guarente and P. A. Sharp, Cell 53, 11 (1988). 210. X.-Y. Li, R. Mantovani, R. Hooft van Huysduynen, I. Andre, C. Benoist and D. Mathis, h’ARes 20, 1087 (1992). 221. R. Battini, S. Ferrari, L. Kaczmarek, B. Calabretta, S. Chen and R. Baserga, JBC 262, 4355 (1987). 21la. K. Luciakova, R. Li and B. D. Nelson, EJB 207, 253 (1992). 222. H. Suzuki, Y. Hosokawa, H. Toda, M. Nishikirni and T. Ozawa, JBC 265, 8159 (1990). 223. M. J. Evans and R. C. Scarpulla, Genes Deu. 4, 1023 (1990). 213a. C. A. Chau, M. J. Evans and R. C. Scarpulla, JBC 267, 6999 (1992). 214. H. J. Hirnmelfarb, J. Pearlberg, D. H. Last and M. Ptashne, Cell 63, 1299 (1990). 215. M. Nishizawa, Y. Suzuki, Y. Nogi, K. Matsumoto and T. Fukasawa, PNAS 87,5373 (1990). 216. M. D. Crivellone, M. Wu and A. Tzagoloff, JBC 263, 14323 (1988). 217. C. L. Denis, M. Ciriacy and E. T. Young, JMB 148, 355 (1981). 218. C. L. Denis and E. T. Young, MCBiol 3, 360 (1983). 219. K. T. Arndt, C. Styles and 6. R. Fink, Science 237, 874 (1987). 220. Y. Oshirna, in “The Molecular Biology of the Yeast Saccharornyces; Metabolism and Gene Expression” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 159. CSHLab, Cold Spring Harbor, New York, 1982. 221. D. Shore and K. Nasmyth, Cell 51, 721 (1987). 222. M. Barbacid, ARB 56, 779 (1987). 223. F. Winston, D. Chaleff, B. Valent and G. R. Fink, Genetics 107, 179 (1984).

This Page Intentionally Left Blank

DNA Polymerase II, the Epsilon Polymerase of Saccharomyces cerevisiae¹

ALAN MORRISON AND AKIO SUGINO²

Laboratory of Molecular Genetics
National Institute of Environmental Health Sciences
Research Triangle Park, North Carolina 27709

I. Categorization and Nomenclature
II. DNA Polymerase II Structure and Activities
III. Cell Cycle Regulation
IV. Domain Structure of Catalytic Subunit
V. Genetics of 3'→5' Exonuclease
VI. The DNA Repair Polymerase
VII. Role of DNA Polymerase II in DNA Replication
References

To date, five nuclear DNA polymerases have been observed or posited to exist in Saccharomyces cerevisiae, prompting questions of why there are so many, and what their roles are. Progress toward answering some of these questions has come from the SV40 (simian virus 40) in vitro DNA replication system. Seven cellular factors required for this system (1), including DNA polymerases α and δ (2-4), have also been detected in S. cerevisiae, enabling genetic tests of whether the in vitro system truly reflects in vivo chromosomal replication. The purely genetic approach identified a series of cell-division-cycle mutants, among which were alleles of the genes now known to encode the α and δ DNA polymerases (5-8). DNA polymerase II was detected enzymatically in yeast-cell extracts, but its gene, POL2,³ was not found in any genetic screen, nor was its mammalian counterpart identified in the SV40 in vitro system. POL2 was eventually cloned by reverse genetics.

In this article, we review data on DNA polymerase II in the context of current knowledge of eukaryotic DNA polymerases, DNA replication and its fidelity, and DNA repair. Information on related subjects is covered in reviews of eukaryotic DNA polymerases (9-13), eukaryotic DNA replication (14), yeast DNA replication (15), the SV40 in vitro system (16), and exonucleolytic proofreading (17, 18).

¹ Abbreviations used: SV40, simian virus 40; PCNA, proliferating cell nuclear antigen; UV, ultraviolet; SSB, single-stranded DNA-binding protein; RF-C, replication factor C; PCR, polymerase chain reaction; SDS, sodium dodecyl sulfate.
² To whom correspondence may be addressed. Present address: Department of Molecular Immunology, Research Institute for Microbial Diseases, Osaka University, Suita, Osaka 565, Japan.
³ A list of yeast gene symbols and an explanation of the conventions used in this essay are given in the Glossary at the end of this review article.

I. Categorization and Nomenclature

A. Eukaryotic DNA Polymerases

Six genes encoding observed or predicted DNA polymerase activities have been mapped and sequenced in S. cerevisiae (Table I). Activities encoded by POL1, POL2, POL3, and MIP1 have been characterized biochemically, while the putative REV3 protein remains a “computer polymerase” (19). On the basis of conserved primary structure, biochemical activities, and subunit structure, yeast DNA polymerases I and III are the homologs of metazoan DNA polymerases α and δ, respectively (6, 7, 20-25). POLX, a putative homolog of the mammalian DNA-polymerase-β gene, has been discovered in the nucleotide sequence of yeast chromosome III (26), suggesting that this polymerase is present in yeast. POLX could encode a protein of about 68 kDa. K. Shimizu and A. Sugino (unpublished observations) have partially purified from yeast extracts a DNA polymerase, tentatively called DNA polymerase IV, with properties characteristic of DNA polymerase β: it is a basic protein inhibited by 2',3'-dideoxythymidine 5'-triphosphate, but not by N2-(4-butylphenyl)-2'-deoxyguanosine 5'-triphosphate or aphidicolin, and requires a high Mg2+ concentration. The activity is intrinsic to a polypeptide of approximately 68 kDa. It does not react with antibodies against DNA polymerase I, II, or III; whether it is encoded by the POLX gene is under investigation.

TABLE I
EUKARYOTIC DNA POLYMERASES

| Yeast DNA polymerase gene | Genetic map positionᵃ | Yeast DNA polymerase activity | Corresponding higher eukaryotic activity |
|---|---|---|---|
| POL1 (CDC17) | 14L | I | α |
| POL2 | 14L | II | ε |
| POL3 (CDC2) | 4L | III | δ |
| POLX | 3R | IV (?)ᵇ | β (?)ᵇ |
| REV3 | 16L | Unknown | Unknown |
| MIP1 | - | Mitochondrial | γ |

ᵃ From 26 and 116; mapping of POL2 is described in Table II.
ᵇ Identification of yeast DNA polymerase IV as the product of POLX and the homolog of higher eukaryotic DNA polymerase β is provisional (see text).

A mammalian PCNA-independent (PCNA: proliferating cell nuclear antigen) DNA polymerase formerly conflated with δ has been recognized as being structurally distinct, and has been named ε (11, 27, 28). DNA polymerase ε has been proposed (25, 27) to be the counterpart of yeast DNA polymerase II, and resembles it by these criteria: the catalytic subunits of both yeast DNA polymerase II and human ε polymerase are of very high molecular weight (>200 kDa) and, in contrast to the δ polymerases, are active on a poly(dA)·oligo(dT) template-primer in the absence of PCNA. Definitive evidence that the catalytic subunits of DNA polymerases II and ε are in fact structural homologs has now been obtained from the primary structures of cDNA clones representing the human DNA polymerase ε catalytic polypeptide (T. Kesti, H. Frantti and J. Syvaoja, personal communication).

A unified nomenclature in which the yeast DNA polymerases were renamed according to their proposed metazoan counterparts was proposed in 1990 (25) even though it had not then been proven that DNA polymerases II and ε are structural homologs. A standard nomenclature might be desirable, but the one promulgated disrupts the logic of the yeast system where the polymerases and their genes are given the equivalent Roman and Arabic numerals, respectively.
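The numbering logic of the yeast system, in which a gene's Arabic numeral matches the Roman numeral of its polymerase, can be made explicit with a small lookup. The sketch below is illustrative only: the function and dictionary names are hypothetical helpers, and the pairings are simply those listed in Table I for POL1, POL2, and POL3.

```python
import re

# Pairings taken from Table I; all names below are hypothetical helpers.
ROMAN = {1: "I", 2: "II", 3: "III"}
METAZOAN_COUNTERPART = {"I": "alpha", "II": "epsilon", "III": "delta"}


def polymerase_for_gene(gene):
    """Return the yeast polymerase name implied by a POLn gene symbol."""
    match = re.fullmatch(r"POL([1-3])", gene)
    if match is None:
        # POLX, REV3, and MIP1 do not follow the numeral convention.
        raise ValueError(f"no numeral-based name for {gene!r}")
    return "DNA polymerase " + ROMAN[int(match.group(1))]


for gene in ("POL1", "POL2", "POL3"):
    name = polymerase_for_gene(gene)
    roman = name.split()[-1]
    print(gene, "->", name, "/ metazoan", METAZOAN_COUNTERPART[roman])
```

Genes such as POLX, REV3, and MIP1 fall outside the pattern, which is why the sketch rejects them rather than guessing a name.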

B. Class-B DNA Polymerases

While all DNA polymerases may ultimately be related, they have been grouped into several classes on the basis of amino-acid sequence similarity (29). Class B contains the structurally related aphidicolin-sensitive DNA polymerases, and includes yeast DNA polymerase II as well as the metazoan α polymerases. These related polymerases have been referred to as “α-like” (30), an imprecise term to convey primary structural homology, and confusing because class B also includes the metazoan δ and ε polymerases [it excludes the β and γ enzymes; DNA polymerase β is related to deoxynucleotidyltransferase (31) and yeast DNA polymerase γ is related to Escherichia coli DNA polymerase I, the archetypal class-A DNA polymerase (32, 33)]. Furthermore, the α polymerases are atypical of class-B DNA polymerases in that they lack an intrinsic 3'→5' exonuclease (34, 35). Instead of “α-like,” we advocate use of the term “class B” (29), or the descriptive term “aphidicolin-sensitive.”

The primary structures of class-B polymerases are characterized by a series of conserved regions that occur in the same linear order in all members (20, 36). No common system of labeling these regions has been universally accepted. We advocate using the numbering system of Wang et al. (36), also adopted by others (37, 38), in which the regions are given Roman numerals I-VI, in decreasing order of degree of conservation. This system must be modified by the addition of three conserved motifs of the 3'→5' exonuclease active site, called Exo I, Exo II, and Exo III (see Section IV, C). The reference to the left part of Region IV as Exo I' by some authors (39-41) reflects the now-discredited theory that Exo I is located there (see Section IV, C), and is no longer useful.

II. DNA Polymerase II Structure and Activities

A. Purification of Yeast DNA Polymerase II

1. EARLY DETECTION

Yeast DNA polymerase II was first reported in the early 1970's as a polymerase activity in mitochondria-free cell extracts chromatographically distinct from DNA polymerase I (42) and associated with an exonuclease activity (43). In more detailed studies, partially purified DNA polymerase II was shown to be antigenically and enzymatically distinct from DNA polymerase I: it required a deoxynucleotide template and primer, was most active with poly(dA)·oligo(dT), and copurified with a 3'→5' exonuclease of similar heat stability, capable of excising a 3'-terminal mismatched nucleotide (44-46). In these and in a more recent corroborating report (47), DNA polymerase II appeared to be a polypeptide of 150-170 kDa. Following the discovery of the chromatographically and antigenically distinct DNA polymerase III (48, 49), genetic evidence strongly implied that there are at least three DNA polymerases (in addition to the mitochondrial enzyme) in yeast, since DNA polymerase II was still detected in mutants either of CDC2 (POL3) or CDC17 (POL1) (8, 47).

2. DNA POLYMERASE II HOLOENZYME

Until the work of Hamatake, DNA polymerase II had been purified as a proteolyzed single polypeptide of about 150 kDa. Hamatake et al. purified to near-homogeneity a catalytically active single polypeptide of about 145 kDa but, in addition, observed a much larger catalytically active DNA polymerase II polypeptide of >200 kDa that copurified through five chromatographic steps and during glycerol gradient centrifugation with proteins of 80, 34, 30, and 29 kDa (50). This complex, termed DNA polymerase II*, apparently constitutes the holoenzyme. Essentially no difference in enzyme activity was observed between the two forms of purified DNA polymerase II: both were catalytically active in polymerase and 3'→5' exonuclease assays and displayed highly processive polymerization utilizing poly(dA)·oligo(dT) as template-primer. The 145-kDa and >200-kDa species were coantigenic and generated common products on partial digestion by Staphylococcus aureus V8 or Arg-C proteases. The 145-kDa species was evidently derived from the >200-kDa species by proteolysis, concomitant with dissociation of the other polypeptides. Dissociation, or lack of formation, of the holoenzyme form was also observed in an extract from the pol2-1 mutant, which yielded a genetically truncated polypeptide of about 131 kDa (51).

The subunit structure of the holoenzyme is summarized in Table II. We next describe cloning of the genes encoding the >200-, 80-, 34- and 30-kDa species. The as-yet-unnamed gene encoding the 29-kDa putative subunit D has not been cloned and there is little information regarding it. Data on the pol2-1 mutant and other mutants of DNA polymerase II and its subunits are summarized in Table III.
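The size of the truncated pol2-1 polypeptide can be roughly cross-checked against the tabulated figures. The sketch below is a back-of-the-envelope estimate, not the authors' calculation: it assumes every residue contributes the same mean mass, takes the open reading frame length and predicted Mr of the POL2 protein from Table II and the pol2-1 disruption point from Table III, and uses hypothetical function names.

```python
# Values as listed in Tables II and III of this article.
POL2_CODONS = 2222            # open reading frame length (codons)
POL2_PREDICTED_MR = 255_649   # predicted Mr of the full-length protein
POL2_1_DISRUPTION_AA = 1134   # amino acid at which pol2-1 is disrupted


def mean_residue_mass(predicted_mr, codons):
    """Average mass per residue of the full-length protein (Da)."""
    return predicted_mr / codons


def truncated_mass_kda(disruption_aa, predicted_mr, codons):
    """Estimate the mass of a polypeptide ending at disruption_aa.

    ASSUMPTION: uniform mean residue mass; any residues contributed by
    the inserted marker DNA are ignored.
    """
    return disruption_aa * mean_residue_mass(predicted_mr, codons) / 1000.0


estimate = truncated_mass_kda(POL2_1_DISRUPTION_AA, POL2_PREDICTED_MR, POL2_CODONS)
print(f"estimated truncated polypeptide: {estimate:.1f} kDa")
```

Under these assumptions the estimate falls within about 1 kDa of the roughly 131 kDa quoted in the text, which is as close as a uniform-residue approximation can be expected to come.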

B. Subunits of DNA Polymerase II

1. CATALYTIC SUBUNIT A ENCODED BY POL2

Definitive proof that DNA polymerase II is a unique activity came with the cloning of the POL2 gene encoding the catalytic subunit A of DNA polymerase II: POL2 has a unique primary structure and genetic map position (51). The predicted POL2 protein is about 256 kDa, and displays sequence similarity to class-B DNA polymerases in its N-terminal half. Evidence that POL2 encodes DNA polymerase II rests on three results: antibody to purified DNA polymerase II also reacts with fragments of POL2 protein expressed in E. coli; a hexapeptide terminal sequence of a Lys-C proteolytic fragment of DNA polymerase II catalytic polypeptide is exactly matched by a sequence encoded by the POL2 open reading frame; and a truncated DNA polymerase II polypeptide of the expected 131 kDa was partially purified from the pol2-1 mutant disrupted in the middle of the POL2 coding region (but downstream from the polymerase domain).

2. SUBUNIT B

The 80-kDa polypeptide was designated subunit B. It copurified with the catalytic subunit A, and the two polypeptides appeared to maintain a 1:1 stoichiometry in purified preparations (50). The gene encoding the 80-kDa polypeptide was termed DPB2 (DNA polymerase subunit 2). Its sequence predicted a polypeptide of about 79.5 kDa and showed no significant similarity to any protein sequence in the database (52). The 80-kDa polypeptide was absent from DNA polymerase II holoenzyme partially purified from cells expressing the dpb2-1 mutation, while subunits A and C were still detected (subunit D did not react with the mouse antiserum against DNA polymerase II used in the Western blots). The POL2 and DPB2 genes appeared to share a common cell cycle regulation (see below), and their temperature-sensitive and/or disruption mutants showed the same phenotypic consequences: cessation of genomic DNA synthesis; arrest of the cell cycle at the dumbbell morphological stage; and cell death (51-53).

Our (unpublished) experiments provide genetic evidence for an interaction between subunits A and B in vivo. The pol2-1 mutation confers an approximate halving of cell growth rate, resulting in small colony size (51). Thus, for cells grown in rich medium at 30°C, strain BJ3501 (54) and its pol2-1 derivative displayed doubling times of 103 ± 1 and 208 ± 8 minutes, respectively (mean and range of two determinations each). Since the pol2-1 mutation destabilizes the interaction between the catalytic subunit A and the essential subunit B in vitro (51), we tested whether overexpression of subunit B by increasing the DPB2 gene copy number could suppress the pol2-1 slow-growth phenotype. Instead of pol2-1, we used the pol2-3991 allele, which is identical to pol2-1 except that LEU2 instead of URA3 disrupts the POL2 coding region at nucleotide 3991 (Table III). The inclusion of DPB2 on a multicopy plasmid substantially, but not entirely, suppressed the growth defect of pol2-3991 (Table IV). The DPB2 plasmid had no effect on the growth of the parent strain, as judged by colony size.

TABLE II
SUBUNIT STRUCTURE OF DNA POLYMERASE IIᵃ

| Proposed subunit | Polypeptide Mr × 10⁻³ | Stoichiometry | Gene | Map position | Open reading frame (codons) | Predicted protein (Mr) | Transcript (kb) | Activities |
|---|---|---|---|---|---|---|---|---|
| A | >200 | 1 | POL2 | 14L, within 3.9 cM of rad50ᵇ | 2222 | 255,649 | 7.5 | DNA polymerase, 3'→5' exonuclease |
| B | 80 | 1 | DPB2 | 16ᶜ | 698 | 79,461 | 2.5 | ? |
| C, C' | 34, 30 | ≈4 | DPB3 | 2R, 30 cM distal to his7 | 201 | 23,005 | 0.9 | ATP binding |
| D | 29 | ≈4 | - | - | - | - | - | - |

ᵃ Data are from 50-54.
ᵇ To map POL2, a diploid isolated by crossing strain CG954 (MATa ura3 lys2 ho::LYS2 rad50Δ) with the pol2-1 derivative of AMY50-1B (MATα lys1-1 his1-7 ade2-1 hom3-10 leu2-1 ura3-52) was sporulated. rad50Δ was monitored by sensitivity to methyl methanesulfonate, and pol2-1 by the Ura+ phenotype. Twelve tetrads giving four viable spores all were of the parental ditype combination (unpublished observations).
ᶜ H. Araki and A. Sugino (unpublished observations).

TABLE III
MUTANTS OF GENES ENCODING DNA POLYMERASE II SUBUNITS A, B, AND Cᵃ

| Gene | Allele | Nucleotide sequence change | Amino-acid sequence change | Biochemical effect | Phenotype |
|---|---|---|---|---|---|
| POL2 | pol2-1 | URA3 insertion at base 3991 (BglII site) | Disruption at a.a. 1134 | Truncated subunit A polypeptide; dissociation of subunits B and C | Slow cell growth, increased spontaneous mutation rate |
| | pol2-2 | URA3 insertion at base 1111 (BglII site) | Disruption at a.a. 175 | n.d.ᵇ | Lethal |
| | pol2-3 | LEU2 transplacement of bases 1315-7051 (EcoRV sites) | Deletion of a.a.'s 242-2154 | n.d. | Lethal |
| | pol2-4 | A1460→C, A1466→C | Asp-290→Ala, Glu-292→Ala | 3'→5' exonuclease reduced at least 100-fold | Spontaneous mutation rate increased about 10- to 20-fold |
| | pol2-3991 | LEU2 insertion at base 3991 (BglII site) | Disruption at a.a. 1134 | n.d. | As for pol2-1 |
| | pol2-9ᶜ | G2523→A | Met-644→Ile (within Region II) | ts DNA polymerase II activity | ts cell growth |
| | pol2-18ᶜ | C2719→T | Pro-710→Ser | ts DNA polymerase II activity | ts cell growth |
| DPB2 | dpb2Δ | LEU2 insertion at base 526 | Disruption at a.a. 86 | n.d. | Lethal |
| | dpb2-1 | n.d. | n.d. | Disappearance of subunit B from DNA polymerase II complex | ts cell growth |
| DPB3 | dpb3Δ | URA3 transplacement of 0.54-kb EcoRI-HindIII region starting at base 346 | Deletion of a.a.'s 62-201 | Altered chromatographic elution position and disappearance of subunits C and C' from DNA polymerase complex | Cell growth normal, increased spontaneous mutation rate |

ᵃ Data are from 35 and 51-54.
ᵇ n.d., Not determined.
ᶜ Note that nucleotide numbers for the pol2-9 and pol2-18 mutations use the system of 51 and differ from those given in 53.
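The nucleotide and amino-acid coordinates listed for the POL2 alleles in Table III can be checked against each other with simple codon arithmetic. The sketch below is a consistency check only. It assumes, as an inference from the tabulated pairs and not as a value stated in the text, that the first base of the POL2 initiation codon is base 592 in the numbering system of reference 51; the function names are hypothetical.

```python
# ASSUMPTION: first base of the POL2 start codon, inferred from the
# base/residue pairs in Table III; it is not stated in the text.
ORF_START = 592


def codon_number(base, orf_start=ORF_START):
    """Return the 1-based codon (amino-acid) number containing a base."""
    if base < orf_start:
        raise ValueError("base lies upstream of the assumed ORF start")
    return (base - orf_start) // 3 + 1


def position_in_codon(base, orf_start=ORF_START):
    """Return 1, 2, or 3: the position of a base within its codon."""
    return (base - orf_start) % 3 + 1


# Base -> amino-acid number pairs as tabulated for the POL2 alleles.
TABLE_III_PAIRS = {
    1460: 290,   # pol2-4, Asp-290
    1466: 292,   # pol2-4, Glu-292
    2719: 710,   # pol2-18, Pro-710
    1315: 242,   # pol2-3, start of deletion
    7051: 2154,  # pol2-3, end of deletion
    3991: 1134,  # pol2-1 and pol2-3991, disruption point
}

for base, tabulated in TABLE_III_PAIRS.items():
    print(base, codon_number(base), tabulated, position_in_codon(base))
```

With this assumed offset the six pairs above agree with the table, and the pol2-4 changes fall at the second codon position, as expected for Asp and Glu codons converted to Ala. The pol2-2 insertion at base 1111 computes to codon 174 and not the tabulated 175, so the offset should be read as a plausible reconstruction, not a confirmed coordinate.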

3. SUBUNIT C

The 34- and 30-kDa polypeptides are encoded by a single gene, DPB3. As with the POL2 and DPB2 genes, mouse antiserum against DNA polymerase II holoenzyme was used to clone DPB3 from a λgt11 expression library (54). Antibody affinity-purified by reaction with the cloned DPB3 protein reacted with both the 34- and 30-kDa polypeptides by Western blotting, but not with the >200- or 80-kDa proteins. Furthermore, the dpb3Δ deletion mutant, which grew normally, yielded a DNA polymerase II, partially purified through Mono-S column chromatography, consisting only of subunits A and B, lacking both the 34- and 30-kDa polypeptides. The 29-kDa polypeptide was present in the Mono-S column fractions but did not clearly co-chromatograph with subunits A and B, presumably because it had dissociated. (For this Western blot experiment, a rabbit antiserum that recognized all four subunits, A, B, C, and D, was used.) Since the DPB3 protein predicted from the sequence is only 23 kDa, the 34- and 30-kDa polypeptides are presumably either posttranslationally modified forms, or migrate anomalously in SDS-polyacrylamide gels. The DPB3 protein has been expressed in insect cells using the baculovirus system (P. Ropp and A. Sugino, unpublished observations) where a 34-kDa, sometimes with a minor amount of 30-kDa, protein was produced. The 34- and 30-kDa polypeptides are denoted as subunits C and C'.

The predicted DPB3 protein sequence contains two noticeable features: a region of 35 residues (residues 120-154) containing 63% acidic residues (primarily glutamate), including a run of 14 residues that contains 11 glutamates and one aspartate; and a possible nucleoside triphosphate-binding consensus sequence (55) in the N-terminal half of the predicted sequence (54). DPB3 protein expressed using the baculovirus system and purified to about 50% homogeneity bound ATP but not, to any significant extent, GTP, CTP, or TTP, as determined by SDS-polyacrylamide gel electrophoresis following challenge with 32P-labeled triphosphate and fixation using UV light (P. Ropp and A. Sugino, unpublished observations).

TABLE IV
EFFECT OF INCREASED DPB2 GENE COPY NUMBER ON THE pol2-3991 SLOW-GROWTH PHENOTYPEᵃ

| Relevant genotype | Doubling time (min)ᵇ | Relative doubling time |
|---|---|---|
| POL2+ [YCp50]ᶜ | 126 ± 1 (2) | 1 |
| pol2-3991 [YCp50] | 253 ± 13 (2) | 2.0 |
| pol2-3991 [YEpDPB2]ᵈ | 170 ± 8 (4) | 1.3 |

ᵃ Derivatives of parent strain XSR03-2C (MATa his1-7 leu2-3,-112 hom3-10 ura3-52 can1) were grown at 30°C in synthetic medium lacking uracil to maintain the presence of plasmids.
ᵇ Doubling times are expressed as mean ± range or standard deviation (number of determinations); a different isolate was used for each determination.
ᶜ Plasmid YCp50 was used as the control URA3 plasmid.
ᵈ Plasmid YEpDPB2 contains a KpnI-NsiI DNA fragment containing the DPB2 gene (52) inserted into the KpnI-PstI sites of the URA3 vector YEplac195 (117).
Neither the partially purified DPB3 protein nor DNA polymerase II holoenzyme had detectable ATPase activity. Deletion of the DPB3 gene had no appreciable effect on cell growth (54). However, a modest (approximately 2- to 20-fold) increase in spontaneous mutation rate was measured by reversion assays in dpb3Δ strains, indicating that chromosomal DNA sequences are maintained less accurately when DPB3 protein is absent.

C. Stimulatory Factors

1. SF I

A factor called SF I that stimulates the activity of purified DNA polymerase II was partially purified (47). The stimulation occurred with poly(dA)·oligo(dT), but not with activated DNA as template-primer and not with DNA polymerase I. SF I was subsequently purified as an apparent complex of polypeptides of 66, 37, and 13.5 kDa, with single-strand DNA binding activity intrinsic to the 66-kDa polypeptide (56). Although strikingly similar in subunit composition to yeast and human replication-factor A (57), the 66-kDa SF I polypeptide was antigenically distinct from the 69-kDa DNA-binding polypeptide of yeast replication-factor A (56-58). Further investigation showed that the 66- and 37-kDa polypeptides are identical to the mitochondrial heat-shock protein encoded by HSP60 and the yeast translation-initiation factor 4A, respectively (59). It is unclear whether SF I has any relevance to in vivo DNA replication. A second stimulatory factor, SF II, was also found (47), but further reports on its identity or mechanism have not appeared.

2. PCNA AND RF-C

Although independence from PCNA is one of the criteria distinguishing the ε from the δ polymerases, there are several reports of an effect of PCNA on DNA polymerase II. Hamatake et al. (50) found that PCNA increased the processivity of both the DNA polymerase II holoenzyme and the 145-kDa form with poly(dA)·oligo(dT) as template-primer, while DNA polymerase I was unaffected. Lee et al. (60) found that, using singly primed coliphage M13 single-stranded DNA as primer-template in the presence of 0.13 M NaCl, synthesis by yeast DNA polymerase II (ε) required both human PCNA and activator I (i.e., RF-C, replication-factor C), and was stimulated by E. coli SSB (single-stranded DNA-binding protein); in the absence of NaCl, synthesis required only SSB. With a (dA)·(dT) homopolymer template-primer in the presence of salt, but not in its absence, yeast DNA polymerase II required PCNA, human SSB, and human RF-C (60). Similar data were reported using yeast PCNA and RF-C (61, 62). These results do not necessarily imply any direct interaction with DNA polymerase II. Perhaps SSB stimulates DNA polymerase II non-specifically by smoothing out regions of secondary structure and reducing nonproductive binding of the polymerase to single-stranded DNA, while RF-C and PCNA may bind the primer-template and present the primer terminus to any of a variety of DNA polymerases. On the other hand, it is possible that some of these stimulations do reflect protein-protein interactions involving DNA polymerase II. We speculate that PCNA and RF-C, while specific for DNA polymerase δ, might have cellular homologs specific for DNA polymerase ε, and that some functional cross-reactivity can occur.

D. Mammalian DNA Polymerase ε

Purified HeLa cell DNA polymerase ε appears to have a subunit structure different from that of yeast DNA polymerase II. It has been purified as a complex between a >200-kDa catalytic subunit and a 55-kDa subunit (27, 60, 63, 64). The >200-kDa subunit was cleaved by trypsin into 136- and 122-kDa polypeptides, the smaller polypeptide containing the polymerase and, probably, 3'→5' exonuclease activities (64). This is reminiscent of the approximately 150-kDa polypeptide of yeast DNA polymerase II, which is presumably derived by endogenous proteolysis from the >200-kDa polypeptide. Purified calf-thymus DNA polymerase ε appears to be composed of either a single 140-kDa catalytically active polypeptide (64), or polypeptides of 140, 125, and 40 kDa (28), the 140- and 125-kDa polypeptides being catalytically active (65). The apparently different subunit compositions of the yeast and mammalian ε polymerases might reflect different physiological states of the enzymes rather than species differences (S. Linn, personal communication). If the mammalian and yeast ε DNA polymerase subunit structures have diverged, it might suggest a divergence in roles of the polymerases. Alternatively, it is possible that the 80-kDa subunit of the yeast enzyme and the 55-kDa subunit of the human enzyme are homologs, though differing in size; other polypeptides corresponding to subunits B and C of yeast DNA polymerase II might simply have dissociated from the mammalian polymerase during purification.

Siegal et al. (66) observed a functional interaction between calf-thymus DNA polymerase ε and a 5'→3' exonuclease similar to enzymes found in HeLa cells (67) and mouse cells (68). The exonuclease degraded a single-stranded DNA or RNA primer annealed to a template. When two primers, separated by a three-nucleotide gap, were annealed to an M13mp18 DNA template, degradation of the downstream primer in the presence of DNA polymerase ε was dependent on provision of dNTPs and synthesis from the upstream primer; conversely, DNA polymerase α or δ did not prevent degradation of the downstream primer. Siegal et al. (66) suggested a model in which the ε polymerase is involved in replicating the lagging strand.

III. Cell Cycle Regulation

In S. cerevisiae, at least 15 genes encoding proteins involved in DNA replication are coordinately expressed at the G1/S-phase boundary of the cell cycle (69). This expression requires an upstream cis-acting nucleotide sequence motif exemplified by the MluI restriction enzyme recognition sequence. The POL2, DPB2, and DPB3 genes appear to belong to this set of genes: all contain at least one occurrence of the MluI motif, or a variant, in their 5'-untranslated regions, and their transcripts were expressed at the G1/S-phase boundary (51-54). Periodic expression of the POL2, DPB2, and DPB3 genes was demonstrated with yeast cells synchronized using the α mating pheromone, a feed-starve procedure, or elutriation. This coregulation argues against the possibility that the DPB2 and DPB3 proteins copurify fortuitously with DNA polymerase II, and provides circumstantial evidence that DNA polymerase II acts during S-phase.

IV. Domain Structure of Catalytic Subunit

A. Polymerase Domain

Class B polymerases are characterized by a series of conserved regions that occur in the same linear order in all members (20, 36). The extent of these conserved regions presumably defines the enzyme activity domain(s). Mutations conferring altered sensitivity to nucleotide analogs have been mapped to positions in and around Regions II-V (38, 70, 71), and mutation of specific residues in Regions I and II eliminated DNA polymerase activity (72-77). Inspection of the predicted amino-acid sequence of the aphidicolin-sensitive DNA polymerase II showed that it belongs to class B, and contains the conserved regions I-VI in the same linear order identified by Wang and others (51, 36). Figure 1 shows a comparison of the domain structures of DNA polymerase II and the other aphidicolin-sensitive DNA polymerases of yeast (including the predicted REV3 polymerase). Two additional conserved regions are apparent, numbered VII and VIII (Fig. 1). Region VII contains several basic residues and a tyrosine, and is conserved in many class-B DNA polymerases, including herpes-simplex-virus DNA polymerase (38). Region VIII is characterized by the Asp-X-X-Tyr-Tyr motif (X is a non-conserved amino acid), and is not clearly conserved in class-B polymerases other than those shown in Fig. 1 and their homologs. The N-terminal half of DNA polymerase II, excluding Region VIII, contains the polymerase and 3'→5' exonuclease activities, since a truncated DNA polymerase II polypeptide with these activities was partially purified from pol2-1 mutant cells (51; A. Sugino, unpublished observations).

FIG. 1. Conserved regions of yeast DNA polymerases. Yeast DNA polymerases I, II, and III, and REV3 protein, identified by their gene symbols (POL1, POL2, POL3, and REV3), are schematized as horizontal lines with their N-termini to the left and C-termini to the right. Boxes represent conserved regions. Solid boxes labeled I-VI represent conserved regions defined by Wang et al. (36). Solid boxes VII and VIII are two additional conserved regions. Hatched boxes labeled Cys represent cysteine-rich regions. Open boxes (below lines) labeled Exo I, Exo II, and Exo III represent conserved motifs containing 3'→5' exonuclease active-site residues (35); Exo II is coincident with the right part of Region IV. Region VII was described for herpes-simplex-virus DNA polymerase (38), and is the same as Region 9 of 81, region IV of 118, and Region H of 119. It is characterized by the Lys-Lys-Arg-Tyr motif (residues 966-969 of yeast DNA polymerase II). Region VIII is characterized by the Asp-X-X-Tyr-Tyr motif (residues 1142-1146 of yeast DNA polymerase II; X is a non-conserved amino acid).
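The short motifs that define Regions VII and VIII lend themselves to a simple pattern search. The following Python sketch, offered only as an illustration, scans a protein sequence in the one-letter code for Lys-Lys-Arg-Tyr and Asp-X-X-Tyr-Tyr. The function name, the dictionary of patterns, and the toy sequence are assumptions made for this example; the toy sequence is not a real polymerase sequence, and a match to so short a motif is not by itself evidence of a conserved region.

```python
import re

# Motifs named in the text, written as regular expressions over the
# one-letter amino-acid code ("." stands for the non-conserved residue X).
MOTIFS = {
    "Region VII (Lys-Lys-Arg-Tyr)": "KKRY",
    "Region VIII (Asp-X-X-Tyr-Tyr)": "D..YY",
}


def find_motifs(protein, motifs=MOTIFS):
    """Return {motif name: [(start, end, matched text), ...]}.

    Coordinates are 1-based and inclusive, as is usual for residue numbers.
    """
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping occurrences are reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
            for m in regex.finditer(protein)
        ]
    return hits


if __name__ == "__main__":
    toy_sequence = "MAKKRYGSDLQYYTA"  # hypothetical, for illustration only
    for name, found in find_motifs(toy_sequence).items():
        print(name, found)
```

Applied to a real sequence, the reported coordinates could be compared with the residue numbers given in the legend to Fig. 1.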

B. C-Terminal Half

The amino-acid sequence of DNA polymerase II contains a C-terminal cysteine-rich region, thought to comprise a zinc-finger DNA-binding domain. Similar regions occur at the C-termini of metazoan α and δ polymerases and their yeast counterparts, and in the predicted REV3 polymerase, but are absent from other aphidicolin-sensitive polymerases. DNA polymerase II differs strikingly from other class-B enzymes in possessing a large unique C-proximal region. Perhaps even more strikingly, neither this unique region nor the Cys-rich region nor Region VIII is required for DNA polymerase II function either in vitro or in vivo, as determined from analysis of the pol2-1 mutant (51). Since the partially purified pol2-1 mutant protein lacked associated subunits, the C-terminal half of the catalytic subunit is apparently required for maintenance of the holoenzyme form. The partial suppression of the pol2-1 slow-growth phenotype by increased DPB2 gene copy-number (Table IV) argues that a significant function of the C-terminal half is to hold subunit B at the site of action of DNA polymerase II, where it has an essential role.

C. 3'→5' Exonuclease Active Site

1. CONSERVED ACTIVE SITE

Five 3'→5' exonuclease active-site residues have been assigned in DNA polymerase II on the basis of amino-acid sequence alignments with other class-B DNA polymerases and with E. coli DNA polymerase I (35). The 3'→5' exonuclease active site of the Klenow fragment of E. coli DNA polymerase I has been exquisitely detailed by crystallographic and mutational studies (78-80). Several groups proposed that amino-acid sequence motifs corresponding to sequences in the Klenow 3'→5' exonuclease active site could be recognized in aphidicolin-sensitive DNA polymerases (71, 81, 82). The most comprehensive statement of this hypothesis was made in a seminal paper by Salas and co-workers, who proposed that the 3'→5' exonuclease active site residues are embedded in three amino-proximal sequence motifs, Exo I, Exo II, and Exo III (30). While fundamentally correct, this report contained some sequence alignments that were subsequently challenged (9,

[FIG. 2, alignment panel. One-letter amino-acid alignments of the Exo I, Exo II, and Exo III motifs, with flanking residue numbers and asterisks over the invariant residues, for the following DNA polymerases. Above the horizontal line: S. cerevisiae Pol II, S. pombe Pol II, S. cerevisiae Pol III, S. pombe Pol δ, human Pol δ, bovine Pol δ, P. falciparum Pol δ, herpes simplex virus, varicella-zoster virus, Epstein-Barr virus, cytomegalovirus, baculovirus, vaccinia virus, fowlpox virus, adenovirus-5, bacteriophage T4, E. coli Pol II, Chlorella virus, C. biennis virus, maize mitochondria S1, N. intermedia plasmid, K. lactis pGLK-2, bacteriophage PRD1, bacteriophage M2, and bacteriophage φ29. Below the line: S. cerevisiae Pol I, S. pombe Pol α, human Pol α, D. melanogaster Pol α, and S. cerevisiae Rev3. The residue columns are not legible in this copy.]

35, 83, 84) and that apparently led to two misconceptions: first, that Exo I lies in the left part of Region IV rather than proximal to it (35, 39, 40); and second, that these active-site residues occur in all aphidicolin-sensitive DNA polymerases (35). A modified version of the basic hypothesis of Salas and co-workers is given in Figs. 1 and 2. Exo I is located upstream from Region IV, Exo II is coincident with the right part of Region IV, and Exo III is between Regions IV and II.

2. MUTATION OF ACTIVE-SITE RESIDUES

Identification of the Exo motifs has been tested by site-directed mutagenesis of specific residues in several polymerases. The effects of these mutations have been measured by assay of the 3'→5' exonuclease and polymerase activities of the partially or completely purified protein, and/or by genetic assay of a spontaneous mutator phenotype, which is the expected phenotypic effect of a loss of exonucleolytic proofreading. The identification of the conserved Asp-Ile-Glu motif (D-I-E in the single-letter code) upstream from Region IV as Exo I is supported by the results of site-directed mutagenesis of the proposed Asp and Glu active-site residues in yeast DNA polymerases III and II (35, 39). In the case of DNA polymerase II, mutation of both the Asp-290 and Glu-292 residues to alanine reduced the ratio of 3'→5' exonuclease to polymerase activity at least 200-fold. With DNA polymerase III, mutation of the corresponding residues increased the spontaneous mutation rate more than 100-fold. Exo II, containing the conserved Asp residue corresponding to Asp-424 of the E. coli DNA polymerase I Klenow fragment, lies in the right part of Region IV. Mutation of the corresponding Asp residue in yeast DNA polymerase III and in bacteriophage φ29 DNA polymerase specifically reduced 3'→5' exonuclease activity (30, 39). Experimental support for the identity of the Exo III motif came from the 3'→5' exonuclease-deficient Asp-324→Gly mutation of coliphage T4 DNA polymerase, isolated using a genetic screen for isolates with high mutator activity (83). The interpretation is somewhat clouded because the Glu-191→Ala mutant rather than wild-type T4 DNA polymerase was used for the screen; however, the Glu-191→Ala mutation alone had no significant effect on 3'→5' exonuclease activity, whereas the Glu-191→Ala, Asp-324→Gly double mutation reduced it about 100-fold. With φ29 DNA polymerase, mutation of the conserved Exo III Tyr and Asp residues reduced the exonuclease:polymerase activity ratio: the Asp-169→Ala mutation reduced the ratio 10³-fold, and the Tyr-165→Phe and Tyr-165→Cys mutations reduced it 13- and 24-fold, respectively, without diminishing polymerase activity as determined by filling-in of extended 5'-termini or by using a (dC)·(dG) homopolymer primer-template (41).

FIG. 2. Alignments of aphidicolin-sensitive DNA polymerase amino-acid sequences proposed to contain conserved 3'→5' exonuclease active-site residues. (The one-letter code is used.) Numbers refer to amino-acid residues. The three conserved regions Exo I, Exo II, and Exo III (see Fig. 1) are shown. Asterisks mark invariant residues proposed to correspond to the following 3'→5' exonuclease active-site residues of the Klenow fragment: in order (from left to right) D355, E357, D424, Y497, and D501. Boxed are residues whose mutation led to a decreased ratio of 3'→5' exonuclease to DNA polymerase activity. DNA polymerases known to possess 3'→5' exonuclease activity appear above the horizontal line; below the horizontal line are α DNA polymerases and the predicted REV3 protein. Abbreviation: Pol, DNA polymerase. Sequences are from the GenBank/EMBL database. The S. pombe DNA polymerase II sequence is the unpublished work of J. Sebastian and A. Sugino. S. cerevisiae DNA polymerase III is numbered according to 120.

3. REGION IV IN α POLYMERASES

The Exo I and III motifs, and the conserved Asp residue in Exo II, can be located in all class-B polymerases except the α polymerases and the predicted REV3 protein (Fig. 2; 35). The α polymerases and the putative REV3 protein are thus predicted, on the basis of sequence alignments, to lack an intrinsic 3'→5' exonuclease activity. Although several different alignments have been presented to suggest conservation of the Exo motifs in α polymerases and the predicted REV3 protein (21, 30, 40, 41), in no case are the five predicted active-site residues absolutely conserved within a given α polymerase, nor is any of these residues invariant among the α-polymerase/REV3 subset. This contrasts with the other more-than-20 class-B polymerases in which these five residues are invariant. Consistent with this, the human α-polymerase catalytic subunit is devoid of 3'→5' exonuclease activity (34). Earlier reports of 3'→5' exonuclease activity associated with the α-polymerase catalytic subunit can be attributed to contamination and the then-reasonable belief that these essential replicases ought to possess a proofreading function (85, 86). We conclude that the 3'→5' exonuclease is neither cryptic nor inactive, but is simply not present in the α-polymerase catalytic subunit.

The above conclusion presents a paradox, since Region IV, which contains the Exo II motif, is conserved in α polymerases. It appears that they possess at least a part of the 3'→5' exonuclease activity domain, but lack the actual active-site residues. The Region-IV domain presumably has some function in addition to 3'→5' exonuclease activity. We can speculate that this function might be related to binding single-stranded DNA. The crystal structure of an editing complex showed that the Klenow fragment bound four nucleotides of single-stranded DNA to the 3'→5' exonuclease active site (87). This complex is stabilized by hydrophilic interactions between the last three bases at the 3'-terminus and amino acids other than the four invariant acidic residues and tyrosine in the Exo I, II, and III motifs (87, 88). Consistent with the existence of a function other than 3'→5' exonuclease activity in Region IV, mutation of residues of herpes-simplex-virus DNA polymerase Region IV, mapping upstream from the conserved Asp residue in Exo II, inactivated the polymerase (89). Similarly, the cdc17-2 mutation, a Gly-637→Ala change in Region IV of S. cerevisiae DNA polymerase I, caused a temperature-sensitive phenotype indicating conditional inactivation of the protein (90). With T4 DNA polymerase, the Glu-191→Ala, Asp-324→Gly double mutation appeared to reduce polymerase activity on a gapped template-primer about 30-fold, but to a much smaller extent with a "nicked" template-primer (83). [Mutation of conserved acidic residues in the Exo I or II motifs did not detectably reduce DNA polymerase activity of the φ29 enzyme, or of yeast DNA polymerase II or III (30, 35, 39).] Mutant φ29 polymerases altered in the Exo III motif had a considerably reduced ability to replicate φ29 DNA, though they were not defective in DNA synthesis assays in which strand displacement was not required, suggesting that the conserved Region IV domain might function in strand displacement (41). Conservation of such a function in Region IV can explain why α polymerases retain this region.

4. ONLY δ AND ε DNA POLYMERASES ARE PREDICTED TO POSSESS AN INTRINSIC 3'→5' EXONUCLEASE

The amino-acid sequence alignments of the conserved 3'→5' exonuclease domain discussed above lead to the interesting proposition that, of the known nuclear DNA polymerases (i.e., excluding the mitochondrial polymerase), only the δ and ε DNA polymerases possess an intrinsic 3'→5' exonuclease activity. The catalytic subunits of α polymerases, and the structurally related REV3 protein, lack the 3'→5' exonuclease active-site residues. Yeast DNA polymerase I holoenzyme is devoid of 3'→5' exonucleolytic proofreading activity (91). DNA polymerase β lacks both the conserved 3'→5' exonuclease domain (31) and an intrinsic 3'→5' exonuclease activity. Deoxyribonuclease V is sometimes associated with DNA polymerase β, but does not have the properties of a proofreading exonuclease (10). The δ and ε DNA polymerases therefore appear to be the only candidates with the intrinsic ability to perform exonucleolytic editing during DNA replication. While it is conceivable that an autonomous proofreading 3'→5' exonuclease exists, there is no precedent for this: the rule is that polymerase and 3'→5' exonuclease coexist in the same polypeptide chain or, as with E. coli DNA polymerase III holoenzyme, as tightly complexed subunits, reflecting the mechanistic coupling of the two activities (17).

D. POL2 Homologs in Fission Yeast and Fruit Fly

Parts of genes homologous to POL2 have been amplified from genomic DNA of Schizosaccharomyces pombe and cDNA of Drosophila melanogaster using PCR and primers similar in nucleotide sequence to the DNA encoding Regions IV and III of S. cerevisiae DNA polymerase II. A DNA fragment of approximately 800 base-pairs was amplified from D. melanogaster larval cDNA; its nucleotide sequence contained an open reading frame encoding a 272-amino-acid sequence that was 57% identical to the amino-acid sequence between Regions IV and II of S. cerevisiae DNA polymerase II (A. B. Clark and A. Sugino, unpublished observations). A similar DNA fragment was amplified from S. pombe DNA (J. Sebastian and A. Sugino, unpublished observations). A DNA clone containing part of the S. pombe POL2 homolog was isolated using the PCR DNA fragment to probe a λ library of S. pombe DNA. The DNA sequence of this clone showed part of an open reading frame corresponding to the N-terminal half of DNA polymerase II, with all of the conserved regions shown in Fig. 1 present. Residues 192-1188 of S. cerevisiae DNA polymerase II showed 62% identity to the translated sequence of the S. pombe clone.
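Percent identities such as the 57% and 62% quoted above are computed from a pairwise alignment. The Python sketch below shows one common convention, in which columns containing a gap are excluded from the comparison. The text does not state which convention was used, so this choice, the function name, and the toy aligned strings are all assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns carrying identical residues.

    Both arguments must be rows of the same alignment and hence of equal
    length. Columns with a gap in either row are skipped (an assumed
    convention; other definitions use a different denominator).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from the figure).
    print(percent_identity("FDIETTK-A", "FDIECAGLA"))
    # A 272-codon open reading frame spans 272 * 3 = 816 nucleotides,
    # in line with a PCR fragment of approximately 800 base-pairs.
    print(272 * 3)
```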

V. Genetics of 3'→5' Exonuclease

A. Spontaneous Mutator Phenotype of 3'→5' Exonuclease-Deficient Mutants

Since the role of the 3'→5' exonuclease is to remove incorrectly inserted nucleotides during polymerization, its inactivation is expected to lead to an increase in spontaneous single-base mutations in vivo, as is observed with mutants of dnaQ, which encodes the 3'→5' exonuclease ε subunit of E. coli DNA polymerase III (92). We created the exonuclease-deficient pol2-4 and pol3-01 strains by altering the Phe-Asp-Ile-Glu Exo I motifs to Phe-Ala-Ile-Ala (35, 93). Similar mutants of POL3 were constructed by Simon et al. (39). We measured mutation rates either by reversion assays or by forward mutation to 5-fluoroorotic acid resistance (94, 95) of a URA3 reporter gene inserted near a defined replication origin (ARS306) on chromosome III. The relative URA3 mutation rate for pol3-01 (i.e., the mutations per cell division of the pol3-01 strain divided by the mutations per cell division of the POL3+ strain) was 130 (94), which compares to about 400 for mutation to canavanine resistance reported for similar pol3 mutants (39). The relative URA3 mutation rate for pol2-4 was 12 (unpublished observations); this compares with 20, the average of the relative reversion rates for six different markers: his7-2 and ade2-1 (unpublished observations), and ade5-1, his1-7, leu2-1, and hom3-10 (35).

A limited number of the mutations produced by each mutant polymerase were sequenced. Both pol2-4 and pol3-01 generated exclusively single-base changes within the URA3-coding region, but gave different mutational spectra. Twenty sequenced ura3 mutations from pol3-01 occurred at 15 locations, and were predominantly transitions and single-base additions or deletions, with only one (G·C→T·A) transversion. Nineteen sequenced ura3 mutations from pol2-4 occurred at nine locations and, except for a single-base deletion, consisted of G·C→A·T, C·G→A·T, and T·A→A·T changes. Between both polymerase mutants, all classes of single-base mutation were represented in these spectra except G·C→C·G transversions.
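The distinction between transitions and transversions used in describing these spectra follows from the chemical class of the bases exchanged: a transition replaces a purine with a purine or a pyrimidine with a pyrimidine, whereas a transversion exchanges a purine for a pyrimidine or vice versa. A minimal Python sketch of this rule is given below. The input notation ("G.C->A.T", read from the first-named base of each pair) and the function name are assumptions made for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(change):
    """Classify a base-pair substitution written like 'G.C->A.T'.

    Only the first-named base of each pair is inspected; its partner is
    implied by Watson-Crick pairing.
    """
    before, after = change.split("->")
    old, new = before.strip()[0], after.strip()[0]
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unknown base: {base!r}")
    if old == new:
        raise ValueError("not a substitution")
    both_purines = old in PURINES and new in PURINES
    both_pyrimidines = old in PYRIMIDINES and new in PYRIMIDINES
    return "transition" if (both_purines or both_pyrimidines) else "transversion"


if __name__ == "__main__":
    for change in ("G.C->A.T", "C.G->A.T", "T.A->A.T", "G.C->T.A", "G.C->C.G"):
        print(change, classify_substitution(change))
```

By this rule, only the first of the three substitution classes listed for pol2-4 is a transition.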

B. Epistatic Relationships

1. RELATIONSHIP BETWEEN 3'→5' EXONUCLEASES OF DNA POLYMERASES II AND III

While the pol2-4 and pol3-01 exonuclease-deficient mutants grew normally, the double-mutant pol3-01 pol2-4 haploid was inviable, as shown by tetrad analysis of spores from heterozygous diploids. The inferred pol3-01 pol2-4 spores formed microcolonies of inviable cells having no unique terminal morphology, consistent with death from a catastrophically high rate of unedited errors. We reasoned that diploidy might protect against such an error rate, and successfully constructed a viable double homoallelic pol3-01/pol3-01 pol2-4/pol2-4 diploid. This diploid displayed a relative his7-2 reversion rate of about 2 × 10³, compared to 240 and 9 for the pol3-01 and pol2-4 single-mutant haploids, respectively (unpublished observations). As with similar mutants constructed by Simon et al. (39), pol3-01 was partially dominant, still displaying a demonstrable mutator effect in the presence of POL3 (93). We transformed a plasmid carrying the pol3-01 allele into a haploid strain containing the wild-type POL3 gene, a genetic configuration denoted as POL3 [pol3-01]. We then compared the relative mutation rates of haploids of genotypes pol2-4, POL3 [pol3-01], and pol2-4 POL3 [pol3-01]. We observed simple additivity between the relative URA3 mutation rates of POL3 [pol3-01] and pol2-4 mutants, indicating that the epistatic relationship between the two exonucleases is that of competition rather than action in series (unpublished observations). The synergy observed in the pol3-01 pol2-4 double mutant indicates that one 3'→5' exonuclease can compensate for the absence of the other. These results suggest that there is no autonomous 3'→5' exonuclease that can effectively substitute for those of DNA polymerases II and III. Interestingly, the idea that the 3'→5' exonuclease of one DNA polymerase could compensate for the lack thereof in another came from a search for an exonuclease activity that would allow DNA polymerase α to elongate an annealed primer with a mismatched 3'-terminus (96). This search found an activity identified as DNA polymerase δ.

2. RELATIONSHIP WITH MISMATCH REPAIR

In yeast, what appears to be a generally conserved mismatch correction system (97-99) requires PMS1, which is structurally related to the prokaryotic mutL and hexB (100). The PMS1 system probably corrects mismatches arising during DNA replication. Thus, the occurrence of the MluI sequence motif in the 5'-untranslated region of PMS1, and the periodic expression of its transcript at the G1/S-phase boundary, suggest it is under the same cell cycle control as many DNA replication genes (93). Furthermore, a pms1 deletion mutant displayed a spontaneous mutator phenotype of about 31-fold (101). We observed that a pol2-4 pms1 double mutant was viable, but grew poorly, and had a relative URA3 mutation rate that was approximately the product of the relative mutation rates of the pol2-4 and pms1 single mutants (unpublished observations). This multiplicative relationship indicates that the POL2 3'→5' exonuclease and PMS1 act in series. We also observed a multiplicative relationship between the relative mutation rates of pol3-01 and pms1 mutants, indicating that the POL3 3'→5' exonuclease and PMS1 act in series (93). The results, then, are consistent with the idea that mismatches produced by either DNA polymerase II or III are normally corrected by the PMS1 mismatch correction system (102).
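The additive and multiplicative relationships invoked in this section can be made explicit with a small calculation. The Python sketch below states two simple expectations for a double mutant from the relative mutation rates of the single mutants (wild type = 1). The formulas are common working assumptions rather than the analysis actually performed; in particular, subtracting 1 in the additive case, so that the wild-type background is not counted twice, is an assumption of this example, as are the function names.

```python
import math


def expected_additive(r1, r2):
    """Expectation if the two defects contribute errors independently
    (competition): excess rates add over the shared wild-type background."""
    return r1 + r2 - 1.0


def expected_multiplicative(r1, r2):
    """Expectation if the two functions act in series on the same errors:
    relative rates multiply."""
    return r1 * r2


def closer_model(r1, r2, observed):
    """Name the expectation nearer to the observed rate on a log scale."""
    d_add = abs(math.log(observed / expected_additive(r1, r2)))
    d_mul = abs(math.log(observed / expected_multiplicative(r1, r2)))
    return "additive" if d_add < d_mul else "multiplicative"


if __name__ == "__main__":
    # his7-2 reversion: 240 (pol3-01) and 9 (pol2-4) versus about 2 x 10^3
    # in the double homoallelic diploid.
    print(expected_additive(240, 9), expected_multiplicative(240, 9))
    print(closer_model(240, 9, 2e3))
    # Product of the relative rates quoted for pol2-4 (12) and pms1 (31).
    print(expected_multiplicative(12, 31))
```

The his7-2 comparison is only indicative, since it sets a diploid against haploids, and the rates of 12 and 31 come from different studies and may not be directly comparable.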

VI. The DNA Repair Polymerase

A. In Vitro Systems

Human DNA polymerase ε was identified as a factor required to reconstitute an undefined pathway of DNA repair synthesis in UV-irradiated permeabilized human diploid fibroblasts, thus implicating the ε polymerase in this process (103). PCNA (and by implication, DNA polymerase δ) was required for nucleotide excision repair in a HeLa cell-free system (104). In nuclear extracts of human cells, DNA polymerase β was identified as the polymerase responsible for short-patch DNA repair of processed apyrimidinic sites (105). In yeast, base excision repair synthesis of DNA either containing uracil or treated with UV or osmium tetroxide was examined in cell-free nuclear extracts (106; Z. Wang, X. Wu and E. C. Friedberg, personal communication). Repair synthesis did not occur in pol2ts mutant extracts, and could be restored by addition of purified DNA polymerase II. Repair synthesis was stimulated in extracts, while addition of purified DNA polymerase III was inhibitory. This suggests that DNA polymerase III competes with DNA polymerase II to sequester the primer-template, but that DNA polymerase III performs little DNA synthesis, perhaps because it lacks a cofactor such as PCNA.

B. Genetic Evidence for Polymerases Involved in DNA Repair

1. DNA POLYMERASE II

DNA polymerase II has been proffered as the “repair polymerase” (8, 47). Contrary to the implicit assumption, DNA repair is not a single process but an array of different pathways that have been dissected at least partially by genetic studies (107). It cannot be presumed that a particular polymerase acts in repair but not in replication, or that a single polymerase performs all DNA repair, or that each repair pathway has a dedicated polymerase. If DNA polymerase II is involved in the repair of damaged DNA, it might be possible in principle to obtain radiation-sensitive mutants either in POL2, or in DPB2 or DPB3. However, none of the mutants (Table III) confers increased sensitivity to UV, γ-rays, or methyl methanesulfonate. Nor has any pol2 allele been picked up in genetic screens for mutants defective in DNA repair. However, alleles of both POL1 and POL3, but not of POL2, conferring sensitivity to methyl methanesulfonate were found in a genetic screen for mutants with a hyper-recombination phenotype (108). While not excluding the possibility that DNA polymerase II acts in the repair of damaged DNA, the fact that POL2 is an essential gene means that it cannot be required solely for any known DNA repair pathway, since all known DNA repair pathways are dispensable for cell growth (107, 109-111).

2. REV3, THE “COMPUTER POLYMERASE”

One minor repair pathway, the REV3 pathway, probably does have a dedicated polymerase: the REV3 “computer polymerase” (19). This pathway, one of at least two repair pathways in the RAD6 epistasis group (the other is the RAD18 postreplication repair pathway), is minor in terms of overall repair, but is responsible for virtually all mutations induced by environmental DNA-damaging agents. Hypothetically, the pathway acts by translesion synthesis, i.e., by insertion of a base opposite a position in the template that contains a damaged or aberrant base; the predicted lack of an intrinsic REV3 3'→5' exonuclease (Fig. 2) is in accord with this.

3. DNA POLYMERASES I AND III

The POL1 transcript accumulates following UV treatment of cells, suggesting a role for DNA polymerase I in the repair of UV damage (112). Yeast DNA polymerase I, however, appears not to be required for the repair of X-ray-induced single-strand breaks, as determined by alkaline sucrose density-gradient sedimentation of radiolabeled DNA from pol1ts cells (113). Conversely, DNA polymerase III was implicated in recombinational repair induced by UV or γ-ray treatment of cells: in an ingeniously designed experiment, stationary-phase diploids heteroallelic for two different pol3ts (cdc2) mutations were given sublethal irradiation doses and incubated in fresh medium at the non-permissive temperature (114). In a control experiment with a diploid heteroallelic for cdc4, a cell-cycle gene not required for recombinational repair, DNA damage-induced recombination created wild-type CDC4 genes in some cells, which permitted growth and colony formation at a dose-dependent frequency. The pol3 heteroallelic diploid, however, failed to form colonies.

VII. Role of DNA Polymerase II in DNA Replication

The involvement of DNA polymerases I and III or their higher eukaryotic α and δ counterparts in replicating DNA is well documented (9, 10, 14, 15). Conversely, the REV3, MIP1, or POLX genes can be inactivated without causing cell inviability (or any growth defect in the cases of REV3 and POLX) and are thus not essential for genomic DNA replication (19, 115; P. Ropp and A. Sugino, unpublished observations). After the cloning of POL2, three lines of argument led to the suggestion that DNA polymerase II also participated in DNA replication. First, POL2 was essential for normal cell growth (51), whereas all known DNA repair systems are dispensable. Second, POL2, DPB2, and DPB3 appeared to be coregulated with DNA replication genes during the cell cycle (52-54). Third, the terminal arrest morphology of yeast cells lacking POL2 or DPB2, or cells carrying temperature-sensitive pol2 or dpb2 alleles held at the nonpermissive temperature, was typical of arrest during S-phase (51-53). The phenotypic characteristics of pol2 conditional mutants are indistinguishable from those of conditional mutants of pol1 or pol3. The finding that the nonessential UNG1 (encoding uracil DNA glycosylase) and PMS1 genes are also under the same cell-cycle control as DNA replication genes (93, 110) leaves open the possibility that POL2 (or, indeed, POL1 or POL3) is required to repair a hypothetical form of lethal spontaneous damage that occurs during replication: this, by definition, would be an essential part of DNA replication and the argument thus becomes a tautology.

Direct evidence for a replicative role of DNA polymerase II came when it was shown that incorporation of labeled precursor into DNA of pol2 or dpb2 temperature-sensitive mutant cells ceases after shift-up to the nonpermissive temperature (52, 53). Analysis by FACS (fluorescence-activated cell sorting) also showed that DNA synthesis in pol2ts cells held at the nonpermissive temperature ceased with the same kinetics shown by pol1ts cells (53). More recently, Budd and Campbell (115a) used alkaline sucrose density-gradient analysis of DNA synthesis products to show that no chromosomal-sized DNA is synthesized after shift of an asynchronous pol2ts culture to the non-permissive temperature. The DNA profiles of replication intermediates from the pol2ts mutants were similar to those observed with DNA synthesized in mutants deficient in DNA polymerase I under the same conditions.

Further evidence implicating DNA polymerase II in DNA replication comes from the spontaneous mutator phenotype of the exonuclease-deficient pol2-4 mutant. While a spontaneous mutator phenotype may also arise in DNA repair mutants, the pol2-4 mutation does not confer sensitivity to DNA-damaging agents, and the epistatic relationships discussed above link the DNA polymerase II 3'→5' exonuclease with DNA replication. We have defended a model of the DNA replication/error correction cycle in which DNA polymerases α, δ, and ε copy the template DNA, the δ and ε 3'→5' exonucleases excise misincorporated nucleotides, and remaining errors are corrected by the serial action of the PMS1 mismatch correction system (102).

Several models have been proposed to explain the roles of the α, δ, and ε DNA polymerases in eukaryotic DNA replication (10), and more may have to be devised. It has become dogma that DNA polymerase α synthesizes only short RNA-DNA primers, thus making only a small quantitative contribution to total DNA synthesis. Nucleotides misincorporated by DNA polymerase α might be corrected by the δ and ε 3'→5' exonucleases, or might simply result in abortive priming. From the spontaneous mutation rates of the pol3-01 and pol2-4 mutants, the 3'→5' exonuclease of the δ polymerase appears to make an approximately 10-fold greater contribution to reducing spontaneous mutations than does that of the ε polymerase. Whether this reflects the relative contributions of the two polymerases to DNA synthesis depends on their ratios of exonuclease and polymerase activities in the cell, which are unknown. The locations of the spontaneous mutations generated in the pol2-4 and pol3-01 mutants suggest that both polymerases act at sites scattered widely throughout the genome, and both act within the URA3 gene inserted near the ARS306 replication origin. These results are consistent with models in which both ε and δ DNA polymerases participate in DNA replication, but further investigation is needed to define their precise roles.


GLOSSARY OF YEAST GENESᵃ

ADE2      Encodes phosphoribosylaminoimidazole carboxylase; its mutants are adenine auxotrophs
ADE5      Encodes phosphoribosylglycinamide synthetase; its mutants are adenine auxotrophs
ARS306    Locus of DNA replication origin 306 on chromosome III
CAN1      Encodes arginine permease; confers sensitivity to canavanine
CDC2      Required for progression of the cell-division cycle; encodes DNA polymerase III (Table I) and is identical to POL3
CDC4      Required for progression of the cell-division cycle
CDC17     Required for progression of the cell-division cycle; encodes DNA polymerase I (Table I) and is identical to POL1
DPB2      Encodes subunit B of DNA polymerase II (Table II); mutants are listed in Table III
DPB3      Encodes subunit C of DNA polymerase II (Table II); mutants are listed in Table III
HIS1      Encodes ATP phosphoribosyltransferase; mutants are histidine auxotrophs
HIS7      Encodes glutamine amidotransferase; mutants are histidine auxotrophs
HO        Directs efficient interconversion of mating type in homothallic strains
HOM3      Encodes aspartate kinase; mutants are homoserine auxotrophs
HSP60     Encodes a mitochondrial heat-shock protein
LEU2      Encodes β-isopropylmalate dehydrogenase; its mutants are leucine auxotrophs
LYS1      Encodes saccharopine dehydrogenase; its mutants are lysine auxotrophs
LYS2      Encodes 2-aminoadipate reductase; its mutants are lysine auxotrophs
MAT       Locus determining the mating type of a haploid cell; present as either a or α
MIP1      Encodes mitochondrial DNA polymerase (see Table I)
PMS1      Required for mismatch repair; gene product shows amino-acid sequence similarity to the prokaryotic mutL and hexB mismatch-repair genes; pms1 strains display a spontaneous mutator phenotype in mitotic cells and, in meiosis, show increased post-meiotic segregation of mutations
POL1      Encodes catalytic subunit of DNA polymerase I (Table I); identical to CDC17
POL2      Encodes catalytic subunit A of DNA polymerase II (Tables I and II); mutants are listed in Table III
POL3      Encodes catalytic subunit of DNA polymerase III (Table I); identical to CDC2
POLX      Encodes a “computer DNA polymerase” with amino-acid sequence similarity to higher eukaryotic DNA polymerase β (Table I)
RAD6      Required for the repair of radiation-damaged DNA; RAD6 is the eponymous member of a group of genes that includes REV3 and RAD18, and its mutants have a pleiotropic phenotype; encodes a ubiquitin conjugase and is identical to UBC2
RAD18     Required for the repair of radiation-damaged DNA
RAD50     Required for the repair of radiation-damaged DNA
REV3      Required for the appearance of genomic DNA mutations following treatment of cells with genotoxic agents, a phenotype typically measured by reversion of auxotrophic marker genes; encodes a “computer DNA polymerase” with amino-acid sequence similarity to class-B DNA polymerases (see Table I)
UNG1      Encodes uracil DNA glycosylase
URA3      Encodes orotidine-5'-phosphate decarboxylase; its mutants confer uracil auxotrophy and resistance to 5-fluoroorotic acid

ᵃ By convention, S. cerevisiae gene symbols consist of three letters followed by a number, all italicized. Upper-case letters, as in “POL2,” signify dominance, while lower-case letters are used for recessive genes. Mutants of a gene are given allele numbers, as in “pol2-18,” or the symbol “Δ,” which indicates “deletion,” or the letters “ts,” indicating “temperature-sensitive.” Information on the genes listed is from 121-123 or from references quoted in the text.
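The naming convention in this footnote is regular enough to be checked mechanically. The following is a minimal sketch, assuming symbols are written as plain text with the allele number after a hyphen and with “ts” or “Δ” appended directly to the gene name (for example, pol2ts or pms1Δ). The names `GeneSymbol` and `parse_symbol` are hypothetical. Glossary entries that do not follow the three-letters-plus-number pattern, such as HO, MAT, and ARS306, are rejected by this simple pattern.

```python
import re
from typing import NamedTuple, Optional

# Three letters (all upper- or all lower-case), a number, then an optional
# allele number after a hyphen, or an optional "ts" / "Δ" suffix (assumed layout).
_SYMBOL = re.compile(
    r"(?P<letters>[A-Z]{3}|[a-z]{3})(?P<number>\d+)"
    r"(?:-(?P<allele>\d+)|(?P<tag>ts|Δ))?"
)

_TAGS = {"ts": "temperature-sensitive", "Δ": "deletion"}


class GeneSymbol(NamedTuple):
    gene: str              # gene name normalized to upper case, e.g. "POL2"
    dominant: bool         # True for upper-case symbols, False for lower-case
    allele: Optional[str]  # allele number as written, e.g. "4" or "01"
    note: Optional[str]    # "deletion" or "temperature-sensitive", if tagged


def parse_symbol(symbol: str) -> GeneSymbol:
    """Parse a gene symbol written according to the footnote's convention."""
    match = _SYMBOL.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not a three-letter-plus-number gene symbol: {symbol!r}")
    letters = match.group("letters")
    return GeneSymbol(
        gene=(letters + match.group("number")).upper(),
        dominant=letters.isupper(),
        allele=match.group("allele"),
        note=_TAGS.get(match.group("tag")),
    )


if __name__ == "__main__":
    for text in ("POL2", "pol2-4", "pol3-01", "pms1Δ", "pol2ts"):
        print(text, parse_symbol(text))
```

Keeping the allele number as a string preserves leading zeros, so pol3-01 is not silently turned into allele 1.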

ACKNOWLEDGMENTS

The authors are grateful to D. Thomas, P. Ropp, and C. Bennet for their many comments on this manuscript, and to J. E. Syväoja and Z. Wang for communicating unpublished data.

REFERENCES

1. K. Fien and B. Stillman, MCBiol 12, 155 (1992).
2. Y. Murakami, C. R. Wobbe, L. Weissbach, F. B. Dean and J. Hurwitz, PNAS 83, 2869 (1986).
3. D. H. Weinberg and T. J. Kelly, PNAS 86, 9742 (1989).
4. S.-H. Lee, T. Eki and J. Hurwitz, PNAS 86, 7361 (1989).
5. L. H. Hartwell, R. K. Mortimer, J. Culotti and M. Culotti, Genetics 74, 267 (1973).
6. A. Pizzagalli, P. Valsanini, P. Plevani and G. Lucchini, PNAS 85, 3772 (1988).
7. A. Boulet, M. Simon, G. Faye, G. A. Bauer and P. M. J. Burgers, EMBO J. 8, 1849 (1989).
8. K. C. Sitney, M. E. Budd and J. L. Campbell, Cell 56, 599 (1989).
9. T. S.-F. Wang, ARB 60, 513 (1991).
10. S. Linn, Cell 66, 185 (1991).
11. R. A. Bambara and C. B. Jessee, BBA 1088, 11 (1991).
12. J. E. Syväoja, BioEssays 12, 533 (1990).
13. I. R. Lehman and L. S. Kaguni, JBC 264, 4265 (1989).


14. A. G. So and K. M. Downey, CRC Crit. Rev. Biochem. Mol. Biol. 27, 129 (1992).
15. J. L. Campbell and C. S. Newlon, in “The Molecular and Cellular Biology of the Yeast Saccharomyces” (J. R. Broach, J. R. Pringle and E. W. Jones, eds.), pp. 41-141. CSHLab, Cold Spring Harbor, New York, 1991.
16. J. Hurwitz, F. B. Dean, A. D. Kwong and S.-H. Lee, JBC 265, 18043 (1990).
17. H. Echols and M. F. Goodman, ARB 60, 477 (1991).
18. T. A. Kunkel, Cell 53, 837 (1988).
19. A. Morrison, R. B. Christensen, J. Alley, A. K. Beck, E. G. Bernstine, J. F. Lemontt and C. W. Lawrence, J. Bact. 171, 5659 (1989).
20. S. W. Wong, A. F. Wahl, P.-M. Yuan, N. Arai, B. E. Pearson, K.-I. Arai, D. Korn, M. W. Hunkapiller and T. S.-F. Wang, EMBO J. 7, 37 (1988).
21. F. Hirose, Y. Masamitsu, Y. Nishida, M. Masutani, H. Miyazawa, F. Hanaoka and A. Matsukage, NARes 19, 4991 (1991).
22. D. W. Chung, J. Zhang, C.-K. Tan, E. W. Davie, A. G. So and K. M. Downey, PNAS 88, 11197 (1991).
23. J. Zhang, D. W. Chung, C.-K. Tan, K. M. Downey, A. G. So and E. W. Davie, Bchem 30, 11742 (1991).
24. C. L. Yang, L. S. Chang, P. Zhang, H. Hao, L. Zhu, N. L. Toomey and M. Y. Lee, NARes 20, 735 (1992).
25. P. M. J. Burgers, R. A. Bambara, J. L. Campbell, L. M. S. Chang, K. M. Downey, U. Hübscher, M. Y. W. T. Lee, S. M. Linn, A. G. So and S. Spadari, EJB 191, 617 (1990).
26. P. Bork, C. Ouzounis, C. Sander, M. Scharf, R. Schneider and E. Sonnhammer, Nature 358, 287 (1992).
27. J. Syväoja, S. Suomensaari, C. Nishida, J. S. Goldsmith, G. S. J. Chui, S. Jain and S. Linn, PNAS 87, 6664 (1990).
28. T. Weiser, M. Gassmann, P. Thömmes, E. Ferrari, P. Hafkemeyer and U. Hübscher, JBC 266, 10420 (1991).
29. J. Ito and D. K. Braithwaite, NARes 19, 4045 (1991).
30. A. Bernad, L. Blanco, J. M. Lázaro, G. Martin and M. Salas, Cell 59, 219 (1989).
31. A. Matsukage, K. Nishikawa, T. Ooi, Y. Seto and M. Yamaguchi, JBC 262, 8960 (1987).
32. J. Ito and D. K. Braithwaite, NARes 18, 6716 (1990).
33. L. Blanco, A. Bernad and M. Salas, NARes 19, 955 (1991).
34. W. C. Copeland and T. S.-F. Wang, JBC 266, 22739 (1991).
35. A. Morrison, J. B. Bell, T. A. Kunkel and A. Sugino, PNAS 88, 9473 (1991).
36. T. S.-F. Wang, S. W. Wong and D. Korn, FASEB J. 3, 14 (1989).
37. V. Damagnez, J. Tillit, A.-M. de Recondo and G. Baldacci, MGG 226, 182 (1991).
38. C. B. C. Hwang, K. L. Ruffner and D. M. Coen, J. Virol. 66, 1774 (1992).
39. M. Simon, L. Giot and G. Faye, EMBO J. 10, 2165 (1991).
40. L. Blanco, A. Bernad and M. Salas, Gene 112, 139 (1992).
41. M. S. Soengas, J. A. Esteban, J. M. Lázaro, A. Bernad, M. A. Blasco, M. Salas and L. Blanco, EMBO J. 11, 4227 (1992).
42. U. Wintersberger and E. Wintersberger, EJB 13, 11 (1970).
43. W. B. Helfman, EJB 32, 42 (1973).
44. E. Wintersberger, EJB 50, 41 (1974).
45. L. M. S. Chang, JBC 252, 1873 (1977).
46. E. Wintersberger, EJB 84, 167 (1978).
47. M. E. Budd, K. C. Sitney and J. L. Campbell, JBC 264, 6557 (1989).
48. G. A. Bauer, H. M. Heller and P. M. J. Burgers, JBC 263, 917 (1988).
49. P. M. J. Burgers and G. A. Bauer, JBC 263, 925 (1988).
50. R. K. Hamatake, H. Hasegawa, A. B. Clark, K. Bebenek, T. A. Kunkel and A. Sugino, JBC 265, 4072 (1990).


51. A. Morrison, H. Araki, A. B. Clark, R. K. Hamatake and A. Sugino, Cell 62, 1143 (1990).
52. H. Araki, R. K. Hamatake, L. H. Johnston and A. Sugino, PNAS 88, 4601 (1991).
53. H. Araki, P. A. Ropp, A. L. Johnson, L. H. Johnston, A. Morrison and A. Sugino, EMBO J. 11, 733 (1992).
54. H. Araki, R. K. Hamatake, A. Morrison, A. L. Johnson, L. H. Johnston and A. Sugino, NARes 19, 4867 (1991).
55. J. E. Walker, M. Saraste, M. J. Runswick and N. J. Gay, EMBO J. 1, 945 (1982).
56. W. C. Brown, J. K. Smiley and J. L. Campbell, PNAS 87, 677 (1990).
57. S. J. Brill and B. Stillman, Nature 342, 92 (1989).
58. W.-D. Heyer, M. R. S. Rao, L. F. Erdile, T. J. Kelly and R. D. Kolodner, EMBO J. 9, 2321 (1990).
59. J. K. Smiley, W. C. Brown and J. L. Campbell, NARes 20, 4913 (1992).
60. S.-H. Lee, Z.-Q. Pan, A. D. Kwong, P. M. J. Burgers and J. Hurwitz, JBC 266, 22707 (1991).
61. P. M. J. Burgers, JBC 266, 22698 (1991).
62. B. L. Yoda and P. M. J. Burgers, JBC 266, 22689 (1991).
63. J. E. Syväoja and S. Linn, JBC 264, 2489 (1989).
64. T. Kesti and J. E. Syväoja, JBC 266, 6336 (1991).
65. F. Focher, M. Gassmann, P. Hafkemeyer, E. Ferrari, S. Spadari and U. Hübscher, NARes 17, 1805 (1989).
66. G. Siegal, J. J. Turchi, T. W. Myers and R. A. Bambara, PNAS 89, 9377 (1992).
67. Y. Ishimi, A. Claude, P. Bullock and J. Hurwitz, JBC 263, 19723 (1988).
68. M. Goulian, S. H. Richards, C. J. Heard and B. M. Bigsby, JBC 265, 18461 (1990).
69. L. H. Johnston and N. F. Lowndes, NARes 20, 2403 (1992).
70. J. S. Gibbs, H. C. Chiou, K. F. Bastow, Y.-C. Cheng and D. M. Coen, PNAS 85, 6672 (1988).
71. J. D. Hall, Y. Wang, J. Pierpont, M. S. Berlin, S. E. Rundlett and S. Wong, NARes 17, 9231 (1989).
72. D. I. Dorsky and C. S. Crumpacker, J. Virol. 64, 1394 (1990).
73. A. Bernad, J. M. Lázaro, M. Salas and L. Blanco, PNAS 87, 4610 (1990).
74. A. Bernad, L. Blanco and M. Salas, Gene 94, 45 (1990).
75. G. Jung, M. C. Leavitt, M. Schultz and J. Ito, BBRC 170, 1294 (1990).
76. A. I. Marcy, C. B. C. Hwang, K. L. Ruffner and D. M. Coen, J. Virol. 64, 5883 (1990).
77. I. Joung, M. S. Horwitz and J. A. Engler, Virology 184, 235 (1991).
78. D. L. Ollis, P. Brick, R. Hamlin, N. G. Xuong and T. A. Steitz, Nature 313, 762 (1985).
79. V. Derbyshire, P. S. Freemont, M. R. Sanderson, L. Beese, J. M. Friedman, C. M. Joyce and T. A. Steitz, Science 240, 199 (1988).
80. V. Derbyshire, N. D. F. Grindley and C. M. Joyce, EMBO J. 10, 17 (1991).
81. E. K. Spicer, J. Rush, C. Fung, L. J. Reha-Krantz, J. D. Karam and W. H. Konigsberg, JBC 263, 7478 (1988).
82. M. C. Leavitt and J. Ito, PNAS 86, 4465 (1989).
83. L. J. Reha-Krantz, S. Stocki, R. L. Nonay, E. Dimayuga, L. D. Goodrich, W. H. Konigsberg and E. K. Spicer, PNAS 88, 2417 (1991).
84. L. J. Reha-Krantz, Gene 112, 133 (1992).
85. S. M. Cotterill, M. E. Reyland, L. A. Loeb and I. R. Lehman, PNAS 84, 5653 (1987).
86. R. G. Brooke, R. Singhal, D. C. Hinkle and L. B. Dumas, JBC 266, 3005 (1991).
87. P. S. Freemont, J. M. Friedman, L. S. Beese, M. R. Sanderson and T. A. Steitz, PNAS 85, 8924 (1988).
88. L. S. Beese and T. A. Steitz, EMBO J. 10, 25 (1991).
89. J. S. Gibbs, K. M. Weisshart, P. Digard, A. deBruynkops, D. M. Knipe and D. Coen, MCBiol 11, 4786 (1991).


90. G. Lucchini, M. M. Falconi, A. Pizzagalli, A. Aguilera, H. L. Klein and P. Plevani, Gene 90, 99 (1990).
91. T. A. Kunkel, R. K. Hamatake, J. Motto-Fox, M. P. Fitzgerald and A. Sugino, MCBiol 9, 4447 (1989).
92. R. M. Schaaper, PNAS 85, 8126 (1988).
93. A. Morrison, A. L. Johnson, L. H. Johnston and A. Sugino, EMBO J. 12, 1467 (1993).
94. J. D. Boeke, F. LaCroute and G. R. Fink, MGG 197, 345 (1984).
95. G. S.-F. Lee, E. A. Savage, R. G. Ritzel and R. C. von Borstel, MGG 214, 396 (1988).
96. F. W. Perrino and L. A. Loeb, Bchem 29, 5226 (1990).
97. J. Holmes, S. Clark and P. Modrich, PNAS 87, 5837 (1990).
98. I. Varlet, M. Radman and P. Brooks, PNAS 87, 7883 (1990).
99. D. C. Thomas, J. D. Roberts and T. A. Kunkel, JBC 266, 3744 (1991).
100. W. Kramer, B. Kramer, M. S. Williamson and S. Fogel, J. Bact. 171, 5339 (1989).
101. B. Kramer, W. Kramer, M. S. Williamson and S. Fogel, MCBiol 9, 4432 (1989).
102. A. Morrison and A. Sugino, Chromosoma 102, S146 (1992).
103. C. Nishida, P. Reinhard and S. Linn, JBC 263, 501 (1988).
104. M. K. K. Shivji, M. K. Kenny and R. D. Wood, Cell 69, 367 (1992).
105. K. Wiebauer and J. Jiricny, PNAS 87, 5842 (1990).
106. Z. Wang, X. Wu and E. C. Friedberg, Bchem 31, 3964 (1992).
107. E. C. Friedberg, Microbiol. Rev. 52, 70 (1988).
108. A. Aguilera and H. L. Klein, Genetics 119, 779 (1988).
109. K. J. Percival, M. B. Klein and P. M. J. Burgers, JBC 264, 2593 (1989).
110. K. J. Impellizzeri, B. Anderson and P. M. J. Burgers, J. Bact. 173, 6807 (1991).
111. D. Ramotar, S. C. Popoff, E. B. Gralla and B. Demple, MCBiol 11, 4537 (1991).
112. L. H. Johnston, J. H. M. White, A. L. Johnson, G. Lucchini and P. Plevani, NARes 15, 5017 (1987).
113. M. E. Budd, K. D. Wittrup, J. E. Bailey and J. L. Campbell, MCBiol 9, 365 (1989).
114. F. Fabre, A. Boulet and G. Faye, MGG 229, 353 (1991).
115. F. Foury, JBC 264, 20552 (1989).
115a. M. E. Budd and J. L. Campbell, MCBiol 13, 496 (1993).
116. R. K. Mortimer, C. R. Contopoulou and J. S. King, in “The Molecular and Cellular Biology of the Yeast Saccharomyces” (J. R. Broach, J. R. Pringle and E. W. Jones, eds.), pp. 737-812. CSHLab, Cold Spring Harbor, New York, 1991.
117. R. D. Gietz and A. Sugino, Gene 74, 527 (1988).
118. M. D. Tomalski, J. Wu and L. K. Miller, Virology 167, 591 (1988).
119. H. Iwasaki, Y. Ishino, H. Toh, A. Nakata and H. Shinagawa, MGG 226, 24 (1991).
120. A. Morrison and A. Sugino, NARes 20, 375 (1992).
121. E. W. Jones and G. R. Fink, in “The Molecular and Cellular Biology of the Yeast Saccharomyces: Metabolism and Gene Expression” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 181. CSHLab, Cold Spring Harbor, New York, 1982.
122. J. R. Pringle and L. H. Hartwell, in “The Molecular and Cellular Biology of the Yeast Saccharomyces: Life Cycle and Inheritance” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 97. CSHLab, Cold Spring Harbor, New York, 1981.
123. J. R. Broach, in “The Molecular and Cellular Biology of the Yeast Saccharomyces: Life Cycle and Inheritance” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 653. CSHLab, Cold Spring Harbor, New York, 1981.

Regulation of Bacillus subtilis Gene Expression during the Transition from Exponential Growth to Stationary Phase

MARK A. STRAUCH

Division of Cellular Biology
Department of Molecular and Experimental Medicine
The Scripps Research Institute
La Jolla, California

I. The Bacillus subtilis Transition State
II. Keeping Stationary Phase Gene Expression Off: Transition-State Regulators
   A. AbrB Protein
   B. Hpr Protein
   C. Sin Protein
   D. Pai Protein
III. Activators and Modulators of Transition-State Gene Expression
   A. Deg Proteins
   B. SenS Protein
   C. TenA and TenI Proteins
   D. ComP and ComA Proteins
IV. Initiating Sporulation
   A. spo0 Genes and the “Phosphorelay” Signal-Transduction System
   B. Transition-State Regulators and Sporulation
V. “Redundant” Control of Transition-State Genes
VI. Concluding Remarks
References

When the environment can no longer support exponential increases in a bacterial population, the constituent cells induce expression of many genes normally repressed during conditions conducive to rapid growth. In general, there are two broad classes of genes that become expressed as the bacterium enters stationary phase: those encoding components of alternative pathways for carbon, nitrogen, and phosphorus metabolism that maximize utilization of available resources; and those encoding functions that lead to a more environmentally resistant cellular state. In some bacteria (e.g., Bacillus or myxobacteria), the response can lead to production of dormant spores. The proper expression of these classes of genes is of paramount importance to
viability in natural habitats where nutrients are often limited. However, when abundant resources are available, such functions are detrimental to the organism’s goal of producing as many progeny as possible. Obviously, the cell must possess regulatory mechanisms that silence these functions during replicative growth, but that allow their rapid and coordinated expression in response to nutrient deprivation. If molecular biologists in 1993 were given a chance to play “Mother Nature” and design a regulatory network responsive to nutrient deprivation that could effectively coordinate expression of genes in a large number of divergent metabolic pathways, they would probably incorporate certain essential features. Among these would be sensing mechanisms receiving input from various sources, each indicating the level of certain key metabolites. Ideally, the number of different mechanisms would be kept to a minimum, with perhaps one being a central control point integrating much of the essential information. Overall, the system would have to be able to discriminate between messages signaling that growth was still possible, albeit at a slower rate, from those warning of the imminent cessation of replicative growth. This would be especially critical if one of the cell’s options was differentiation into a dormant form, such as an endospore. The intelligence received through the sensors would then be relayed to a few key regulatory “generals” able to coordinate expression of large numbers of genes either directly or through their “orders” to other regulatory subordinates. 
While it would probably be desirable to keep the total number of regulators to a minimum, this would have to be balanced by considerations related to: (1) differential requirements in terms of timing and levels of individual gene expression; (2) the ability to optimize the energy flow for different types and magnitudes of limited nutrients; and (3) incorporation of a “fail-safe” redundancy into the system, insuring that minor perturbations would not prematurely trigger the complete response. Finally, some of the regulators would function as focal points connecting the various aspects (nutrient utilization, environmental resistance) of the different possible responses. Nature, being wiser than (most) biologists and having had more time to solve such problems, no doubt has added elaborate levels of complexity and interconnections in designing regulatory networks that maximize the chances of survival in hostile environments.

Recently there has been renewed interest in studying stationary phase events in the model Gram-negative bacterium Escherichia coli (1, 2), but the most studied microorganism in this regard is the Gram-positive Bacillus subtilis. Genetic and biochemical aspects of B. subtilis post-exponential gene expression have been studied intensively for some time because of their association with the developmental program of endospore formation. Numerous regulatory proteins have been identified and their biochemical functions are rapidly being elucidated. The emerging picture of the regulatory network controlling gene expression during the early stages of post-exponential growth shows intricate complexities and interconnections, as expected. Yet most of the essential features are not unlike those our hypothetical molecular biologists would incorporate. Of particular interest are those regulatory “generals” that control expression of genes in many different pathways and serve as focal points for the integration of the entire response. The primary focus of my essay is to outline our current understanding of these B. subtilis regulatory proteins.

1. The Bacillus subtilis Transition State One of the most frequently encountered stresses that a bacterium faces is depletion of one or more critical nutrients, which leads to the end of balanced exponential growth and the subsequent entry into some type of stationary phase. But it is probably improper to think of this transition as abrupt. Rather, the cell undergoes a “transition state” of unbalanced growth during which it is still expressing many growth-related functions and beginning to express gene products necessary for survival in the increasingly hostile environment. The transition state is when B . subtilis assesses the environmental and metabolic conditions and decides which of two pathways is the most advantageous: a non-dividing semiquiescent period during which it maintains a low metabolic activity, or the developmental program of endospore formation. Additionally, if the depleted nutrients are replenished before a Commitment has been made, the cell can usually resume vegetative growth without the long lag phase necessary for cells that have entered stationary phase. Therefore, the transition state of B. subtilis can be viewed as a crossroads, where functions necessary for smooth entry into any of the alternate paths are present, but before any final commitment has been made. The proper regulation of gene expression during the transition state is obviously critical for optimizing survival. In fact, in its natural soil habitat, where nutrients are usually limiting, the transition state is probably the predominant metabolically active state of a Bacillus cell. Many gene products first expressed during the transition state are not necessary for the sporulation response under laboratory conditions (Table I). However, in the wild, these functions probably play crucial roles in increasing the efficiency of spore formation, and thus provide a selective advantage. 
Numerous regulatory genes showing pleiotropic effects on expression of many of these post-exponential functions have been discovered (Table 11). Some of these regulators are also strictly required for sporulation initiation (sp00 and other genes; Section IV); others are not. While none of the known


MARK A. STRAUCH

TABLE I
FUNCTIONS EXPRESSED DURING THE Bacillus TRANSITION STATE

Proteases (intra- and extracellular)
Starch-degrading enzymes and unspecified esterases
Transport of metabolites
Development of competence for DNA uptake
Motility (synthesis of flagella)
Complete respiratory tricarboxylate cycle
Nitrogen utilization enzymes
Alkaline phosphatase
Catalase
Membrane and cell envelope changes
Antibiotics
Alterations of ribosomal proteins
Sporulation regulators

mutations in the latter class of regulators lead to a significant sporulation defect in the laboratory, many are controlled by essential sporulation genes and some have been shown (or implicated) to regulate genes necessary for sporulation. Functionally, there are two types of regulators affecting transition-state gene expression. One type actively prevents inappropriate gene expression during vegetative growth. At the onset of the transition state, the effects of these “transition-state regulators” are negated, and the appropriate functions can be expressed. The second type of regulator does not directly prevent expression during exponential growth but rather activates or modulates expression of post-exponential genes previously unexpressed.

II. Keeping Stationary Phase Gene Expression Off: Transition-State Regulators

Four B. subtilis regulatory proteins that act directly to silence post-exponential gene expression during vegetative growth have been identified. They are the products of the abrB, hpr, sin, and pai genes.

TABLE II
PLEIOTROPIC REGULATORS OF Bacillus subtilis POST-EXPONENTIAL GENE EXPRESSION

Gene symbol   Mnemonic [a]                   Function of encoded protein [b]

A. Non-essential for sporulation [c]
abrB          antibiotic resistance          Transcriptional regulator
hpr           hyperprotease                  Transcriptional regulator
sin [d]       sporulation inhibition         Transcriptional regulator
isin [d]      inhibitor of sin               Inhibitor of Sin activity
pai (ORF1)    ?                              Transcriptional regulator
degS          degradative                    Protein kinase acting on DegU
degU          degradative                    Transcriptional regulator
degQ          degradative                    Transcriptional regulator
degR          degradative                    Transcriptional regulator
tenA          transcriptional enhancement    Transcriptional regulator
tenI          transcriptional enhancement    Inhibitor of TenA
comP          competence                     Protein kinase acting on ComA
comA          competence                     Transcriptional regulator
senS          secretion enhancing            Transcriptional regulator

B. Essential for sporulation initiation
spo0A         sporulation zero               Transcriptional regulator
spo0B         sporulation zero               Phosphoprotein phosphotransferase
spo0E         sporulation zero               Negative regulator
spo0F         sporulation zero               Response regulator, secondary messenger
spo0H         sporulation zero               RNA polymerase sigma factor
kinA          kinase                         Protein kinase acting on Spo0F
kinB          kinase                         Protein kinase acting on Spo0F

[a] In many cases, the listed mnemonic is based on only one phenotype caused by mutations in the gene. Therefore, this alone does not provide an adequate representation of the pleiotropic nature of these regulators.
[b] A listing of genes and functions affected by the particular regulators is found in the text.
[c] Deletion or null mutations have no noticeable effects on sporulation under laboratory conditions.
[d] In order to conform with more standard rules, a change in the nomenclature of these genes has been made (J. A. Hoch and I. Smith, personal communication, 1992). The new designations are sinR and sinI.

A. AbrB Protein

In addition to blocking the initiation of sporulation, spo0 mutations (Section IV) have pleiotropic effects on the expression of many functions associated with the onset of the transition state. Early attempts to dissect the pleiotropy of spo0A mutants led to the identification of loci that, when mutated, could revert many of the spo0A mutant phenotypes, but could not restore the ability to sporulate (3). The overwhelming majority of these suppressors mapped at a single locus named cpsX (4). By using different selection procedures, other groups obtained similar suppressors, termed abs, tol, and abr (5-7). Genetic analysis revealed that the cpsX, abs, tol, and most of the abr mutations mapped near the origin of chromosome replication at a locus termed abrB (8). These results suggested that the abrB gene


TABLE III
GENES AND PROCESSES CONTROLLED BY AbrB

Gene          Function

A. Direct regulation via AbrB-DNA interactions [a]
abrB          AbrB
aprE          Subtilisin
tycA          Tyrocidine synthetase I
dciA          Dipeptide transport
spo0E         Sporulation regulator
spo0H         RNA polymerase sigma factor
spoVG         Required for Spo stage V
isin-sin      Regulatory proteins

B. Indirect or uncertain mode of AbrB regulation
hpr           Regulatory protein
nprE          Neutral protease
hut           Histidase
katA          Catalase
com?          Competence genes
?             Antibiotic production
?             Antibiotic resistance
?             Phage tolerance
?             Ribosome proteins
?             Esterases
?             Motility
?             Nitrate reductase

Notes: Autoregulation; B. brevis; σH; Activation of hpr; Indirect through Hpr; Indirect (?); Exact targets unknown; Membrane/envelope components?; Alterations or changes in expression; Unspecified activities; Inducibility of enzyme

[a] As evidenced by DNase-I or other footprinting procedures

product was an important factor involved in the pleiotropic effects of spo0 mutations. Table III lists some of the post-exponentially expressed genes and functions that have been shown to be under abrB control. The abrB gene is monocistronic and codes for a protein of approximately 10,500 Da (9). It is transcribed by two differentially regulated promoters, whose start sites are 14 base-pairs apart (9). Mutations conferring an AbrB phenotype have been located in either the promoter region or within the coding sequence. One promoter region mutation (abrB15) eliminates transcription from the downstream promoter, but not from the upstream one (9). As of this writing, two mutations in the coding region have been sequenced. One changes a cysteine residue to a tyrosine (9); the other, abrB703 (10), is a frame-shift mutation that fuses the first 57 codons of abrB to as many as 53 downstream codons before the RNA transcript is terminated (P. Zuber, personal communication, 1992). These results indicate that the AbrB phenotype arises because of either a lowering of the expression of the wild-type protein or the production of an altered non-functional protein.


Purified AbrB is a hexamer of identical subunits that binds to promoter regions of genes it is known to control (11-13). It contains a region with slight similarity to the helix-turn-helix DNA binding motif, but this region has been shown (via site-directed mutagenesis) to be dispensable for normal binding to DNA in vitro (14). It does not contain any other known DNA binding motif (15). Nevertheless, AbrB binding to DNA is specific and cooperative. The cooperativity is believed to be an important factor in the cell’s ability to quickly negate AbrB control at the onset of the transition state. Comparison of gel retardation and DNase-I footprinting experiments presents a bit of a conundrum with respect to one aspect of AbrB binding. In the mobility shift assays, AbrB binding appears to be quantized in discrete steps: at any given subsaturating AbrB concentration, only a single retarded DNA band is observed, and the degree of retardation increases as the protein concentration is increased (11, 13, 16). This seems to indicate that binding is occurring progressively from an initial nucleation site, with each step representing a partially occupied binding region. However, the DNase-I protection experiments do not show a progressive increase in the protected region over the same span of AbrB concentrations. Instead, the binding is “all-or-none” in appearance over a relatively large, but defined, contiguous region (30-120 nucleotides, depending on the promoter). A possible explanation (17) for this discrepancy is that AbrB can initially bind with similar affinity to any one of several independent sites within the target region, and that subsequent binding of additional AbrB molecules is cooperative. Each step in the mobility shift experiments could represent a given number of AbrB molecules bound per DNA, although the locations would vary from one DNA molecule to the next.
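The proposed explanation can be made concrete with a small counting sketch. It assumes, purely for illustration, a target region with four equivalent sites (the number is hypothetical) and treats every placement of a given number of bound AbrB molecules as equally likely, which ignores any preference that cooperative binding might have for neighboring sites. The function name `per_site_protection` is an illustrative choice and is not taken from the cited work.

```python
from itertools import combinations


def per_site_protection(n_sites: int, n_bound: int) -> list[float]:
    """Population-averaged occupancy of each site when n_bound proteins
    occupy a subset of n_sites equivalent sites, all subsets equally likely."""
    subsets = list(combinations(range(n_sites), n_bound))
    counts = [0] * n_sites
    for subset in subsets:
        for site in subset:
            counts[site] += 1
    return [count / len(subsets) for count in counts]


# Hypothetical region with four sites; k is the number of AbrB molecules
# bound per DNA molecule (one mobility-shift step per value of k).
for k in range(0, 5):
    print(k, per_site_protection(4, k))
```

Under these assumptions, all DNA molecules carrying the same number k of bound proteins would migrate as a single band, while the averaged protection would be the same fraction k/n at every site, so the protected span would not grow stepwise with concentration.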
If not all of the DNA molecules have AbrB bound at the same site at subsaturating concentrations, then only a partial protection at any given site will be observed in the DNase-I footprinting assays. If there are multiple identical sites, then the partial protection would extend over the same length as the fully protected regions. Examination of some AbrB footprint results indicates that this may indeed be occurring, although these results are by no means conclusive. The different lengths of protected regions seen for the different target promoters circumstantially argue that they contain different numbers of AbrB recognition sites; however, the existence of multiple identical binding sites within the footprint regions has not yet been proven. Methylation protection (11) and hydroxyl-radical footprinting experiments (16) indicate that the closest points of AbrB-DNA contact are situated on only one side of the DNA. All of the AbrB-protected guanines are on one face of the helix, implying that binding involves stacking of the protein along that side, with the recognition determinants being repeated with a 10-bp (or other integral number of helical turns) periodicity. The hydroxyl-radical


studies revealed short stretches of preferentially protected A+T-rich sequences and, where they occurred, these were also spaced about one helical turn apart. However, within some of the binding regions (as determined by DNase-I footprints), there are short A+T-rich sequences that are not protected from hydroxyl-radical attack even though they appear to be properly spaced in relation to the protected A+T’s. Binding of AbrB along only one face of the helix may be an important feature in the regulation of certain promoters controlled by multiple regulatory proteins (see Sections II,B and V). What are the DNA determinants that AbrB recognizes and binds to? Binding of AbrB to a number of promoters has been observed by DNase-I footprinting (11, 16, 18, 18a). As mentioned above, there is wide variability in the total extent of the protected regions from target to target. While each of these regions is relatively A+T-rich, closer examination does not reveal any obvious candidates for a sequence motif that is recognized by AbrB. It has been hypothesized (12, 17) that AbrB recognizes a specific DNA structure that can be formed by a finite variety of base sequences. Recognition of a three-dimensional DNA structure can account for the DNA-binding properties of other proteins, in particular those with relatively little or no binding specificity, such as type-II DNA-binding proteins TF1 of bacteriophage SPO1 (19) and bacterial Hu-like proteins (20). But because it binds with high specificity, AbrB must recognize a relatively specific and unusual structure that is neither found randomly in the genome nor a general feature of promoter regions. The A+T-richness in AbrB binding regions could be an important factor in that runs of A and T can induce DNA bending and produce a structure that differs from usual B-DNA (21-23). In fact, some, but not all, promoters to which AbrB binds are in (or near) intrinsically bent regions (16, 18a; E. Ferrari, personal communication).
But A+T-richness alone does not appear to be the sole determinant of AbrB binding. On the aprE promoter, there are a number of guanines that AbrB binding protects from methylation (11). The sequences around these guanines show a high conformity to the degenerate 8-bp sequence TGNRWNNA (R = a purine, W = A or T, N = any base). Statistical analysis showed that the similarity around the protected guanines is not due to chance when compared to sequences around non-protected guanines, or around guanines in promoter regions not bound by AbrB (11). Sequences with conformity to TGNRWNNA can be found in many other, but not all, regions to which AbrB binds. Thus, this sequence could be a member of the subset of those that can form the structure recognized by AbrB. Circumstantial support for this view comes from an examination of the AbrB binding region on the spoVG promoter (10). This region contains the sequence TGAAAAAA, which shows complete conformity to the TGNRWNNA sequence. A point


mutation of the G to an A (resulting in TAAAAAAA) overcomes AbrB repression in vivo (10) and shows impaired AbrB binding in vitro (13). Given that guanine tracts can affect DNA bends normally directed by tracts of A’s (22), it is reasonable to postulate that the aforementioned G in the AbrB binding region on spoVG is an important determinant in the formation of a three-dimensional structure recognized by AbrB. A mutant form of AbrB (AbrB4) has been purified and its binding properties have been investigated (12). The abrB4 mutation changes the protein’s lone cysteine residue to a tyrosine, yet purified AbrB4 is still a hexamer in solution, establishing that oligomerization does not require disulfide bonds. AbrB4 binds poorly to DNA and does not exhibit the cooperative binding characteristics of the wild-type protein. In subunit mixing experiments, the presence of only one or two mutant subunits per hexamer effectively abolished the heterogeneous protein’s ability to bind DNA. The single amino-acid change in only one or two subunits apparently alters the overall structure of the hexamer, or the interaction between subunits required for DNA binding. Perhaps this means that the DNA binding site of AbrB is formed by two or more of the subunits (recall that the AbrB monomer contains no region with significant homology to currently identified DNA binding motifs). The overall regulatory function of AbrB is to prevent the expression of post-exponential phase genes at inappropriate times, such as active growth on good nutrient sources. Once the cells can no longer maintain balanced exponential growth, due to depletion of available nutrients, and thus enter the transition state, AbrB’s “down-regulation” is lifted and a battery of functions concerned with survival and developmental options become expressed. AbrB appears to use three different mechanisms (repression, prevention, and activation) to fulfill its regulatory purposes.
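The conformity of the spoVG sequence to the degenerate motif TGNRWNNA, and the loss of conformity in the point mutant, can be checked mechanically. The following is a minimal sketch that translates the ambiguity codes defined in the text (R, W, N) into a regular expression; the helper name `find_motif` is an illustrative assumption, and a sequence match says nothing by itself about whether AbrB actually binds.

```python
import re

# Ambiguity codes as defined in the text:
# R = a purine (A or G), W = A or T, N = any base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "W": "[AT]", "N": "[ACGT]",
}


def find_motif(motif: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all matches of a degenerate motif,
    including overlapping ones (a lookahead keeps matches zero-width)."""
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in motif) + ")")
    return [match.start() for match in pattern.finditer(sequence.upper())]


print(find_motif("TGNRWNNA", "TGAAAAAA"))  # wild-type spoVG sequence from the text
print(find_motif("TGNRWNNA", "TAAAAAAA"))  # G-to-A point mutant from the text
```

The wild-type 8-mer matches at its first position and the mutant does not match at all, because the second position of the motif is fixed as G.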
For some genes, AbrB appears to be the sole repressor: expression of these genes is constitutive during vegetative growth in abrB mutants. A classic example of this type of AbrB regulation is the tyrocidine synthetase I gene (tycA) of Bacillus brevis (24). Two essential sporulation genes, spo0E (25) and spo0H (26), are also apparently controlled solely by AbrB repression during vegetative growth. (Spo0E is a regulator of the signal-transduction mechanism responsible for sporulation initiation, and the spo0H gene encodes an alternate RNA polymerase sigma factor, σH. The initiation of sporulation is discussed more fully in Section IV.) The most prominent role that AbrB plays is that of “preventer.” This role is perhaps best illustrated by an example. In spo0 mutants, a number of genes, including aprE (which encodes the alkaline protease subtilisin), do not become expressed during the transition state, primarily due to overexpression of AbrB. Mutations of abrB restore subtilisin production in a spo0 background, but do not render it constitutive during vegetative growth;


additional temporal controls are still intact. Thus, AbrB acts in concert with other transition-state regulators, such as Hpr and Sin, to prevent inappropriate expression (the appearance of activators during the transition state can also play a role). The physiological significance of this apparent redundancy in controlling “prevented” genes is discussed in Section V. AbrB also appears to be an activator of other regulators or components in alternative metabolic pathways, but it is not currently known whether this effect on the activated genes is direct or indirect. Genetic evidence (27; M. Perego, personal communication) indicates that AbrB is required for the normal expression of the Hpr transition-state regulator (Section II,B). Although gel-mobility assays indicate that AbrB binds to a DNA fragment containing the hpr promoter region, attempts to localize the site of binding via footprinting procedures have been unsuccessful (11, 18a). Perhaps AbrB binding to the promoters it activates is somehow different from its binding to negatively regulated promoters, the former being recalcitrant to footprinting analysis. Alternatively, the gel-mobility experiments may be misleading and not reflect specific AbrB binding. In this case, AbrB activation of hpr may reflect a situation in which AbrB is a negative regulator of an unidentified repressor of hpr. AbrB has also been implicated in activating a component of the competence pathway (28) and the enzymes involved in histidine utilization (S. Fisher, personal communication), but the specific target genes have not yet been identified. For multistep transition-state processes (e.g., competence development), AbrB can play multiple roles, activating some genes while repressing (preventing) others. Any apparent AbrB regulation observed in vivo need not be the result of direct AbrB binding at the promoter in question.
For example, the production of neutral protease (nprE) is prevented by AbrB (29), but AbrB does not bind to the nprE gene. Instead, AbrB-mediated regulation of nprE is indirect through its activation of Hpr, a protein that does bind to the nprE promoter. AbrB also participates in the regulation of the transition-state regulator Sin (Section II,C), thus indirectly controlling Sin-mediated regulation.

In the cases of direct AbrB binding to promoters, how does this affect RNA polymerase’s ability to transcribe? Each AbrB binding region discovered so far overlaps with at least one essential promoter element (-35, -10, or +1). The simplest explanation for negative control, therefore, is that bound AbrB sterically interferes with RNA polymerase’s interaction with its determinants on the DNA.

[Footnote: The tycA (tyrocidine synthetase I) operon from B. brevis (16) not only has an AbrB binding site in the promoter region (-35 to -60) but also two adjacent protected sites well downstream from the transcription start-site (+169 to +199 and +207 to +231). The promoter-associated site appears to have less affinity for AbrB and apparently depends on the integrity of the +169 to +231 binding region. Deletion of this region can abolish AbrB repression, but it is not known whether binding at the downstream sites alone is sufficient to cause repression. It is possible that binding at the downstream sites serves in some way to facilitate the binding at the promoter site, and it is the AbrB bound at the latter which causes repression.]

But how might AbrB activate transcription (assuming that this effect can occur in cis)? In these instances, AbrB binding might enhance a particular step in the RNA polymerase-DNA interaction, such as formation of the open or closed complex. This could involve an AbrB-polymerase contact analogous to the NtrC-polymerase contact, which enhances open complex formation at the glnA promoter (30). If so, AbrB binding at some distance from the promoter, and possibly in a relatively transient manner, might be able to achieve this effect on bound, but inactive, RNA polymerase. (Could this be why the apparent AbrB-hpr interaction has been recalcitrant to footprint analysis?) Alternatively, AbrB binding could alter or stabilize a localized DNA structure that is necessary for proper polymerase recognition. Binding at negatively controlled promoters would result in a DNA conformation not recognized by the transcriptional apparatus. Binding at positively controlled promoters could “present” a DNA conformation recognized by RNA polymerase. Considering that AbrB binding occurs via stacking along one face of the helix, it seems conceivable that RNA polymerase could “read” the other side of the helix and, in the case of positive control, displace (or ignore) the bound AbrB if the promoter were presented in the proper conformation. Another possibility is that AbrB binding facilitates the binding of an additional regulatory protein that can activate transcription. In this latter case, the role of AbrB would be analogous to those instances when binding of Hu (the major chromosomal-associated protein of Escherichia coli) affects the binding of regulatory proteins through its effect on DNA structure (31). In B. subtilis, DNA curvature upstream from -35 can have profound effects on promoter utilization (32, 33). The A+T-rich region upstream from the spoVG promoter is known to be curved or bent (34). When this region is placed upstream from the E. coli gal gene, in place of the normal -35 region and CRP binding site, it can effectively substitute for the activating effect of CRP-cAMP (35). The spoVG A+T-rich region is necessary for AbrB control in vivo and AbrB binding in vitro (10, 16). Taken together, these facts imply that curvature of the A+T region is required for promoter utilization and that AbrB binding acts to change the curvature such that it can no longer function to activate transcription. Therefore, AbrB’s negative effect on spoVG could be due to masking (or altering) a DNA structure necessary for promoter usage. Altering DNA conformation may, in fact, be a general mechanism by which AbrB regulates genes; there is evidence that AbrB binding induces a bend in many promoter regions (18a).

What negates AbrB-mediated control as the cell enters stationary phase?


Transcription of the abrB gene is controlled by negative autoregulation and repression by the Spo0A protein (9). Control during vegetative growth is primarily by autoregulation (12). It is believed that autoregulation is used to maintain the intracellular AbrB concentration slightly above a threshold level of effectiveness, such that sudden drops due to changes in transcription or binding activity of the protein can be easily sensed. The cooperativity of AbrB binding to its targets would magnify the effect that slight decreases in AbrB levels have on the expression of these targets. It is thus envisioned that, during exponential growth, AbrB is maintained at a relatively constant level (slight adjustments being continually made due to autoregulation) that maintains effective down-regulation of post-exponentially expressed genes, but that is poised to respond rapidly to signals indicating the onset of the transition state. The Spo0A protein appears to be the critical factor responsible for decreasing AbrB levels during the transition state (9, 36). The Spo0A protein is subject to phosphorylation by a regulatory cascade that becomes activated in response to nutrient deprivation (37, 38; Section IV). Phosphorylated Spo0A has a much greater binding affinity for regulatory sites on the abrB promoter (36) than does the unphosphorylated form, and it represses transcription of abrB (regardless of AbrB levels), causing the AbrB concentration to drop below its threshold of effectiveness, thereby lifting the repressive effects of AbrB. There is no evidence that the binding activity of preformed AbrB is controlled by covalent modification or by binding of an effector molecule or ion (18a). Thus, Spo0A-P repression of abrB is the only known mechanism accounting for the abolishment of AbrB-mediated control during the transition state.
Given the complexity of the regulatory circuits in which AbrB is involved, additional controls at the level of DNA-binding activity may be discovered, but the postulation of such controls is not necessarily required to account for our present observations regarding AbrB.
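The argument that cooperativity magnifies small decreases in AbrB concentration can be illustrated numerically. The sketch below assumes a Hill-type occupancy function, occupancy = c^n / (K^n + c^n), which is a common simplification and not a model taken from the cited work; the Hill coefficients, the half-saturation constant K = 1, and the concentrations 1.2 and 0.8 are all arbitrary illustrative values.

```python
def occupancy(conc: float, k_half: float, n: float) -> float:
    """Hill-type fractional occupancy of a target: conc^n / (k_half^n + conc^n)."""
    return conc**n / (k_half**n + conc**n)


# Assumed scenario: the regulator level falls from slightly above the
# half-saturation point (1.2) to slightly below it (0.8), in units of k_half.
for n in (1, 2, 4):
    before = occupancy(1.2, 1.0, n)
    after = occupancy(0.8, 1.0, n)
    print(f"Hill coefficient {n}: occupancy {before:.2f} -> {after:.2f}")
```

With these assumed numbers, the same drop in concentration produces a larger loss of target occupancy as the Hill coefficient increases, which is the qualitative behavior the text attributes to cooperative binding near a threshold.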

B. Hpr Protein

The hpr gene was originally defined by mutations causing overproduction of extracellular proteases and alkaline phosphatase (hpr mutants, 39; scoC mutants, 40, 41), and catabolite repression-resistant sporulation (catA mutants, 42). The gene codes for a protein of 23,700 Da, and sequencing of mutant alleles shows that the hpr phenotype is the result of losing the protein. Hpr has been purified; it is a DNA-binding protein (43). Unlike the case for AbrB, a consensus sequence for Hpr binding has been deduced on the basis of DNase-I protection experiments (12 protected regions from five genes: 43, 18a). The sequence is (R)ATANTAT(Y), the 5’ half being more highly conserved than the 3’ half. Four of the genes examined each have two separate


and distinct Hpr binding regions separated by at least 30 bp. The aprE gene has four separate Hpr binding sites arranged as two sets of pairs, with the sets separated by over 130 bp. The in vivo overexpression of subtilisin and neutral protease from mutant genes deleted for their upstream Hpr binding sites supports the conclusion that the negative effects of Hpr result from a direct binding interaction with the DNA (44, 45). These results also suggest that the full negative effect of Hpr requires both the upstream and downstream sites. This could indicate that Hpr bound at the upstream sites interacts with that bound at the downstream sites to create a “repression loop,” as has been observed for other prokaryotic repressor proteins (46). Cooperativity between the upstream and downstream sites for binding of Hpr has not been rigorously examined; however, in vitro binding of Hpr to the upstream pair of sites on the aprE gene does not require the presence of the downstream pair, and vice versa (43). One of the downstream Hpr binding regions of the aprE gene (-35 to -14 relative to the start-point of transcription) occurs entirely within the region (-59 to +15) to which AbrB binds. Additionally, the other downstream Hpr site (-79 to -59) abuts the AbrB site. It has not yet been possible to determine conclusively whether AbrB and Hpr binding in this area are mutually exclusive or whether they can occur simultaneously. Since AbrB binding appears to involve stacking along one face of the helix, it is conceivable that Hpr may be able to bind along the opposite face, and the two regulators may act synergistically. Even if AbrB precludes Hpr from binding at the downstream sites, that would not seem to preclude Hpr from binding to the upstream sites (see results cited above). Perhaps the effects of Hpr at the upstream site alone are qualitatively different than if Hpr were present at both the upstream and downstream sites.
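The spatial relationships among the aprE binding regions quoted above can be verified with simple interval arithmetic. This is a minimal sketch using only the coordinates given in the text; the `Site` type and the helper names are illustrative assumptions, and the position count treats the coordinates as inclusive integers, which is adequate here because neither shared span crosses the transcription start point (promoter numbering has no position 0).

```python
from typing import NamedTuple


class Site(NamedTuple):
    name: str
    start: int  # upstream boundary, relative to the transcription start point
    end: int    # downstream boundary, inclusive


def contains(outer: Site, inner: Site) -> bool:
    """True if inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def shared_positions(a: Site, b: Site) -> int:
    """Number of coordinates covered by both sites (inclusive ends)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# Coordinates as reported in the text for the aprE promoter region.
abrb = Site("AbrB region", -59, 15)
hpr_inner = Site("Hpr downstream site (-35 to -14)", -35, -14)
hpr_edge = Site("Hpr downstream site (-79 to -59)", -79, -59)

for hpr in (hpr_inner, hpr_edge):
    print(hpr.name, "| inside AbrB region:", contains(abrb, hpr),
          "| shared positions:", shared_positions(abrb, hpr))
```

The first Hpr site falls wholly inside the AbrB-protected region, whereas the second shares only the boundary coordinate -59 with it, consistent with the description that it abuts the AbrB site.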
In fact, one can imagine different regulatory levels being exerted by each of the three possible combinations: AbrB alone, Hpr alone, and AbrB downstream with Hpr only at the upstream sites (Section V). It is assumed that Hpr binding exerts its negative effects through preclusion of RNA polymerase activity. None of the genes known to be controlled by Hpr become constitutively expressed during vegetative growth in hpr mutants. Additionally, there are no examples of genes activated by Hpr. Therefore, the role of Hpr in controlling transition-state gene expression seems to be primarily (or solely) as a “preventer.” Hpr has been implicated in controlling many transition-state processes (besides the production of proteases and alkaline phosphatase), and it may also function in the cells’ response to oxidative stress during vegetative growth (47). Based on the existence of Hpr binding sites, as determined by in vitro footprinting analysis (18a), Hpr may participate in regulating an


enzyme involved in nitrogen assimilation. However, in vivo evidence for this role has not been obtained (H. Schreier, personal communication); it is possible that in this case Hpr functions only under very defined conditions, which have yet to be mimicked in the laboratory. Circumstantial evidence also points to a role for Hpr in the production of at least one antibiotic (M. Perego and M. A. Strauch, unpublished) and in the process of motility (J. A. Hoch, personal communication). Direct interaction with target promoters need not be the only means by which Hpr regulates gene expression. It may also do so by affecting other regulators. Hpr has been shown to be a regulator of the operon containing the gene encoding the transition-state regulator Sin (43, 48a; Section II,C), and it is possible that it controls other regulatory proteins as well. Some evidence suggests that Hpr is involved in the regulatory circuits governing the sporulation response. As mentioned above, hpr mutations relieve the catabolite repression of sporulation caused by excess amounts of glucose in the growth medium (42). However, the exact mechanism of Hpr’s role in this phenomenon has not been elucidated. Overproduction of Hpr from a multicopy plasmid inhibits sporulation (27), implying that Hpr is a negative regulator of at least one essential sporulation gene. The identity of this spo gene is unknown. Since Hpr negatively regulates an inactivator of the Sin protein, and Sin is a preventer of certain stage II spo genes (Section II,C), it is possible that the multicopy effect of Hpr occurs through this regulatory circuit. AbrB activates hpr transcription (Section II,A), meaning that during exponential growth, when AbrB levels are at their highest, so too are the Hpr levels. Upon entry into the transition state, Spo0A-P repression of abrB also leads to a lowering of Hpr levels, thereby releasing Hpr-dependent repressive effects.
It is not known whether the pre-existing Hpr is also inactivated at this time, or whether the activity of Hpr is modulated by effector molecules.

C. Sin Protein

The sin gene was first identified as a clone of B. subtilis DNA which inhibited sporulation and protease production when present in multiple copies (48). The sin gene is part of a dicistronic operon and is preceded by a gene encoding a small polypeptide (57 amino acids), tentatively called isin (48, 48a). The protein (Isin) is a regulator of Sin activity. The regulation of this operon is discussed in more depth later in this section. Inactivation of the sin gene leads to a number of detectable phenotypes: increased expression of aprE (subtilisin), amyE (α-amylase), sacB (levansucrase), the sin operon itself, and certain stage II (spoII) sporulation genes (48, 48a, 49); also a decreased ability to develop competence for the uptake


of DNA molecules (48); and the cells are non-motile and autolysin-negative, growing in filamentous (but septate) chains with a rough colonial morphology on solid media (48). Many of these phenotypes have also been noted for mutations in a locus called flaD (50, 51), and it has now been shown that the flaD and sin genes are identical (52, 53). The Sin protein has been purified and shown to have DNA-binding properties (43, 54). DNase-I footprinting analysis has been reported for only one Sin-DNA interaction, although other interactions have been investigated by such procedures as gel retardations and methylation-interference (48a). Due to the paucity of information at this time, only a tentative consensus sequence (GNCNCGAAATACA) for the Sin binding determinant has been assigned (55). On the subtilisin gene (aprE), sequences from -263 to -216 (relative to the start-point of transcription) are protected from DNase-I cleavage due to Sin binding (43, 54). This location of binding is immediately adjacent to one of the Hpr binding sites (Section II,B), but the two proteins can bind independently (43). This location indicates that Sin binding can exert its negative effects on aprE at a considerable distance from the promoter. This “repression from a distance” gains further support from studies examining aprE expression from genes that have been deleted for the Sin binding region (54). Two possible mechanisms have been suggested (43). Sin binding may preclude the binding of a transcriptional activator to the region. In fact, the existence of an unknown activator binding to this approximate region was first suggested by results examining expression from defined deletion derivatives of aprE in an hpr mutant background (44). It could be that both Hpr and Sin prevent access of this hypothetical activator. Alternatively, although Sin and Hpr bind independently, Sin functioning may depend on the presence of bound Hpr.
Recall that one possibility is that Hpr molecules bound at the upstream and downstream sites on aprE come together to form a “repression loop” (Section II,B). The Sin binding site is within this putative loop, and it could be that loop formation brings the Sin protein into immediate proximity to the transcriptional apparatus, where it can exert its negative effects. Of course, other mechanisms can be envisioned; which is correct awaits further experimentation.

A major function of Sin is to regulate negatively the expression of at least four operons necessary for stage II of sporulation: spoIIA, spoIID, spoIIE, and spoIIG (56). Since phosphorylated Spo0A is required for activation of at least three of these operons (spoIIA, -E, and -G; 38, 57, 58), it may be that Sin’s role is to ensure that they are silent until sufficient Spo0A-P has accumulated in response to nutrient limitation (Section IV,A). Once the Spo0A-P level is high enough, it might then override Sin control. Additionally, Spo0A-P acts to reduce the amount of active Sin in the cell.
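As an illustration of how the tentative Sin consensus (GNCNCGAAATACA) quoted earlier in this section might be used, the following minimal sketch scans one strand of a DNA string for windows that match it, treating N as "any base" and optionally tolerating a few mismatches. The function names, the mismatch parameter, and the test sequence are all hypothetical and chosen only for illustration; the test sequence is made up and is not the aprE upstream region.

```python
# Tentative Sin consensus quoted in the text; "N" stands for any base.
SIN_CONSENSUS = "GNCNCGAAATACA"


def count_mismatches(site, consensus=SIN_CONSENSUS):
    """Count positions where `site` differs from the consensus (N is a wildcard)."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(
        1
        for base, expected in zip(site.upper(), consensus)
        if expected != "N" and base != expected
    )


def find_consensus_sites(sequence, consensus=SIN_CONSENSUS, max_mismatches=0):
    """Return (start_index, site, mismatches) for every window of the given
    strand that differs from the consensus at no more than `max_mismatches`
    positions. Only the strand supplied is scanned."""
    sequence = sequence.upper()
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        mismatches = count_mismatches(site, consensus)
        if mismatches <= max_mismatches:
            hits.append((start, site, mismatches))
    return hits


# Made-up sequence (an assumption, NOT real aprE DNA) with one exact match embedded.
example = "TTTT" + "GACTCGAAATACA" + "CCCC"
print(find_consensus_sites(example))
```

Because the consensus is described as tentative and rests on very few characterized sites, hits from a scan of this kind would be candidates for experimental testing at most, not evidence of Sin binding.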

The sin operon is transcribed from three different promoters, each with a different pattern and level of expression (49). Promoters P1 and P2 are upstream of isin and transcribe both cistrons; P3 is between isin and sin and transcribes only the latter. P3 mRNA is abundant during growth and early sporulation, but the amount of detectable Sin protein in the cell suggests that this transcript is poorly translated. Transcription from P2 becomes detectable only 2 hours (t2) after the initiation of sporulation (t0). It is not known whether P2 mRNA is the product of a sporulation-specific RNA polymerase (perhaps E-σE, σE being encoded in the spoIIG operon) or a processed form of transcripts initiating at P1. Transcription from P1 is low during vegetative growth but rises dramatically at the onset of the transition state and sporulation (t0). This rise in P1 expression is dependent on the spo0A and spo0H (σH) genes and is catabolite-repressed by glucose.

At least three regulatory proteins (AbrB, Hpr, and Spo0A) bind in the vicinity of the P1 promoter (18a, 43). The Spo0A binding site is just upstream from -35, a location reminiscent of cases in which Spo0A-P is known to activate transcription (38, 57, 58). The Hpr and AbrB binding sites imply that these proteins act to prevent P1 expression during vegetative growth. Support for this conclusion comes from experiments showing that sin operon expression is increased in hpr and abrB mutants (48a). Remembering the Spo0A-P-AbrB-Hpr circuit described above, it appears that Spo0A-P production leads to P1 expression by two means: repression of negatively acting transition-state regulators and direct activation of the promoter itself. Since P1 also appears to be dependent on the sigma factor σH (spo0H), there exists yet another apparent level of control: Spo0A-P repression of abrB leads to an increase in the level of σH, because AbrB is a negative effector of spo0H (26). Figure 1 illustrates the regulation of Sin.
The increase in P1 expression probably leads to an increase in the intracellular Isin:Sin ratio (recall that the sin cistron appears to be translated poorly, presumably due to inefficient translation-initiation signals; 48, 49). A number of observations indicate that Isin antagonizes Sin activity but does not affect Sin synthesis or degradation (48a). The Isin protein does so via a direct interaction with the Sin protein, resulting in Isin-Sin complexes that have lost the DNA-binding activity associated with Sin (48a). Overall, the ability of Sin to regulate transition-state processes is controlled by a complex circuit that includes other transition-state regulators and the mechanisms leading to Spo0A-P formation.

FIG. 1. Spo0A-AbrB-Hpr-Sin circuit of transition-state control. During balanced exponential growth, the proteins AbrB, Hpr, and Sin function to prevent stationary phase-associated gene expression. At the onset of the transition state (t0), a signal-transduction mechanism (Section IV) results in the phosphorylation of Spo0A. Spo0A-P action alleviates the negative effects exerted by the transition-state regulators. Abolishment of Sin control involves inactivation through a protein-protein interaction with Isin. The Isin:Sin ratio rises at t0 due to transcription from a promoter (P1) for both genes and the fact that the sin gene is less efficiently translated than isin. With the exception of the Isin-Sin interaction, all effects appear to be at the transcriptional level. +, Positive effects; -, negative effects.

D. Pai Protein

Compared to the three previously discussed transition-state regulators, relatively little is currently known about Pai. The pai operon was initially cloned as a fragment of the B. subtilis chromosome that, when on a high-copy-number plasmid, caused decreased levels of extracellular proteases (59). Further analysis revealed that multiple copies of pai led to lower levels of subtilisin, neutral protease, α-amylase, levansucrase, and alkaline phosphatase and a greatly reduced frequency of sporulation. mRNA levels of the neutral protease gene (nprE) were examined and found to be reduced, implying that pai acted at the transcriptional level (but effects on mRNA stability were not ruled out).

The pai operon has been sequenced (59) and consists of two open reading frames, ORF1 and ORF2, encoding proteins of 21,000 and 24,000 Da, respectively. Strains bearing the multicopy pai plasmid overproduce two proteins of these sizes, and N-terminal analysis of partially purified preparations indicates that the proteins correspond to the two pai genes. The presence of both genes on the multicopy vector is required for the aforementioned phenotypic effects. The 21,000-Da protein contains a region homologous to the helix-turn-helix motif of DNA-binding proteins, but an examination of its DNA-binding properties has not been reported. Deletion of the ORF1 gene had no effect on protease production and sporulation in a medium that does not cause glucose catabolite repression. However, the ORF1 deletion did render protease secretion and sporulation resistant to normally repressive levels of glucose. Disruptions of ORF2 could not be obtained, suggesting that it encodes a function essential for cell growth.

Based on these preliminary studies, at least one of the pai operon genes appears to encode a transition-state regulator belonging in a class with AbrB, Hpr, and Sin. Confirmation of this assessment, and elucidation of where Pai fits into the scheme of preventing inappropriate transition-state gene expression, await further experimentation, in particular a biochemical analysis of the purified Pai proteins.

III. Activators and Modulators of Transition-State Gene Expression

A number of regulatory proteins function to activate or modulate expression levels of many transition-state genes. The main teleological difference between these regulators and the transition-state regulators just discussed is that, while the latter are active during vegetative growth to silence postexponential-specific genes, the former primarily make their presence felt only after the onset of the transition state (although, as with almost any rule, there are exceptions). In other words, these activators/modulators do not act (or are prevented from acting) on transition-state-specific genes during exponential growth.

A. Deg Proteins

Genes encoding these proteins were first identified either by mutations or on cloned fragments of DNA that affected the production of enzymes responsible for degrading macromolecules such as proteins or polysaccharides. [A comprehensive discussion of these regulators can be found in recent reviews (60, 60a).]

1. DegS AND DegU PROTEINS

Mutations in either degS or degU pleiotropically affect production of many degradative enzymes and transition-state processes. Two different phenotypes (hyperproduction or deficiency) can result, depending on the nature of the specific mutation (61). Strains with degU(Hy) or degS(Hy) missense mutations overproduce subtilisin, levansucrase, α-amylase, neutral protease, β-glucanase, and intracellular serine protease. They also exhibit defects in the synthesis of flagella, the development of competence, and the normal glucose repression of sporulation. In contrast, the degU- and degS- class of mutations (either missense or deletions) generally shows the opposite effects: a deficiency in degradative enzyme synthesis, but no defect in motility or competence. An exception to this pattern occurs with deletions of degU, which, like degU(Hy), are competence-deficient.

DegS and DegU are members of the conserved prokaryotic family of two-component signaling systems (62, 63). DegS is a protein kinase that autophosphorylates at a histidine residue in response to some unknown environmental signal (64, 65), and then transfers the phosphate group to DegU (presumably onto an aspartate residue). DegU is a regulatory molecule that, when phosphorylated, brings about the activation of certain specific genes. DegU might be a DNA-binding protein whose affinity is changed by phosphorylation, but so far there is no evidence for this (T. Msadek, personal communication).

A detailed analysis of DegS and DegU mutants has led to a number of conclusions (61, 64). DegU-P, like AbrB, is ambiactive: it has both positive and negative effects on gene expression. It activates degradative enzyme expression, but negatively regulates flagella production. The degU(Hy) and degS(Hy) mutations either lead to greater production (or stability) of DegU-P or cause the unphosphorylated DegU(Hy) protein to mimic the active structure normally assumed only when phosphorylated. The degS- and degU- mutations result in the absence of functional DegU-P, either because DegU is not phosphorylated or because the mutant DegU-P is unable to interact with its regulatory targets. The same effect on competence of degU(Hy) and degU deletions is explained by postulating that it is the unphosphorylated form of DegU that activates certain competence genes: absence of DegU or having it predominantly in the phosphorylated form would thus produce the same results.

2. DegR PROTEIN

Overexpression of the DegR protein from multicopy plasmids carrying degR (originally called prtR) results in dramatic increases in subtilisin, neutral protease, and levansucrase during the transition state, but does not alter the temporal regulation of these enzymes (66, 67). Furthermore, expression of these enzymes does not require DegR, since null mutations in degR have no apparent phenotype. Based on lacZ fusion studies and measurement of mRNA levels, DegR appears to act at the level of transcription initiation and not by affecting mRNA stability (67, 68). The DegR protein is small (60 amino acids) and apparently very hydrophilic. Purification and an examination of its biochemical properties have not been reported, so it is impossible to say definitely how it affects transcription.

Transcription of the degR gene itself is relatively constant throughout growth and early stationary phase (67) and is directed by RNA polymerase containing the minor vegetative sigma factor, σD (69). These latter observations might indicate that DegR is present in an inactive form during vegetative growth and can in some way become activated (if appropriate) at the onset of the transition state. This activation could involve other regulators and, in fact, there is some evidence that the enhancing effect of DegR requires a functional DegU-DegS system (70).

3. DegQ PROTEIN

Like DegR, the DegQ protein is small (46 amino acids) and, when overexpressed from a multicopy plasmid in B. subtilis, has a pleiotropic effect on the production of many degradative enzymes (71, 72). Multicopy degQ (originally named sacQ; 73, 74) causes elevated production of subtilisin, neutral protease, α-amylase, xylanase, levansucrase, and β-glucanase. A chromosomal mutation, degQ36, exhibits a similar phenotype and has been shown to be a “promoter-up” mutation (71). Additionally, a variant DegQ protein (DegQ*), which is more stable than the wild type, causes even further enhancement of protease and levansucrase production (75). However, overproduction of DegQ has no effect on the temporal expression pattern of the target genes: they remain unexpressed until the onset of the transition state. Deletion of degQ from the genome has no discernible phenotype during either vegetative growth or stationary phase (71).

Overproduction of DegQ causes increased levels of subtilisin mRNA (76), and deletion-mapping studies implicated the region from -164 to -141 on the aprE gene as the site of action of DegQ (44). A similar study of the levansucrase gene (sacB) identified a putative DegQ site of action just upstream from the promoter, which shows 67% nucleotide sequence identity to the -164 to -141 region of aprE (44, 77). Since DegQ has not yet been purified, it is not known whether the regions are sites of direct DegQ binding. These regions have also been identified as the sites for DegU/DegS regulation (44, 77), although, as mentioned above, DegU has also not been shown to possess DNA-binding properties. How DegU and DegQ act, or interact, to regulate transcription from these sites remains a mystery.

Much is known about the regulation of the degQ gene itself (61, 78). Its expression increases at the beginning of the transition state, but is subject to catabolite repression by glucose. The glucose repression can be overridden (78) by the addition of decoyinine (a specific inhibitor of GMP synthetase), which induces sporulation (79). Amino-acid deprivation, nitrogen starvation, and phosphate starvation (conditions leading to the onset of the transition state and sporulation) all stimulate degQ expression (61, 78). Expression of degQ requires functional components of two different two-component regulatory systems: DegS-DegU (see above) and ComP-ComA (Section III,D). A recently discovered gene, comQ, mapping just downstream from degQ and apparently required for the development of competence, also plays a role in degQ expression (78). A deletion analysis of the degQ upstream region revealed that the sites of DegS-DegU and ComP-ComA regulation are distinct (78). While stimulation of degQ by amino-acid deprivation required ComA, the phosphate starvation, decoyinine induction, and glucose repression phenomena are independent of both DegS-DegU and ComP-ComA. There is some evidence that degQ transcription may be directed by two different RNA polymerase sigma factors, σA (71) and σD (69).

Obviously, the control circuits regulating degQ expression are as complex as the circuits in which DegQ appears to operate, and they promise to be an exciting area of future experimentation. Evolution seems to have invested quite heavily in the 46-amino-acid DegQ protein, proof once again that good things can come in small packages.

4. DegT PROTEIN

degT is a gene isolated from Bacillus stearothermophilus; its B. subtilis counterpart has not yet been identified. However, when degT is cloned on a multicopy plasmid and introduced into B. subtilis, it enhances production of subtilisin, xylanase, cellulase, and levansucrase, while decreasing autolysin, competence, and motility (80). Overproduced DegT also results in filamentous cell growth and glucose-resistant sporulation. Although in this case these effects are due to a “foreign” protein, it seems likely that a degT gene will be found in B. subtilis, just as homologs of degR, degQ, and senS (Section III,B) have been identified in other species of Bacillus (66, 71, 72, 81). The DegT protein has not been purified, but nucleotide sequence analysis indicates that it has 372 amino-acid residues with several potential membrane-spanning regions and a putative DNA-binding domain (80). It is thus possible that DegT is a membrane-bound protein sensing environmental signals and transducing this information either directly to its target genes or to other regulatory proteins.

B. SenS Protein

SenS is a small (65 amino acids), highly charged basic protein with partial homology to RNA polymerase sigma factors (82). It seems unlikely that SenS is an actual sigma factor; rather, it may interact with RNA polymerase in some other fashion. Alternatively, it may be a DNA-binding protein, as it does possess a region with homology to the helix-turn-helix motif. Multicopy clones of senS show a limited (2- to 4-fold) increase in subtilisin, neutral protease, amylase, and alkaline phosphatase during the transition state, but, like many of the cases discussed so far, temporal regulation of these enzymes is unaffected (82). Interestingly, preliminary studies (cited in 82) indicate that concentrations of SenS above a certain level are lethal to the cell. Deletion of senS has no discernible phenotype.

Expression of senS may involve an antitermination mechanism at an attenuation site located between the start-point of transcription and the ribosome binding site (82). The presence of a nusA-box in the presumed attenuator region (83) could indicate the involvement of a protein analogous to NusA of E. coli (84, 85).

C. TenA and TenI Proteins

When cloned on a multicopy plasmid, the dicistronic tenA-tenI operon enhances the production of subtilisin, neutral protease, and levansucrase about 10-fold during the onset of stationary phase (86). Temporal regulation of these extracellular enzymes is not affected. Null mutations in tenA and tenI have no noticeable effects on production of the enzymes or on growth, although they do produce a delay in sporulation. However, the sporulation process is not blocked: additional incubation results in normal levels of spores. When tenA alone was present in multiple copies, an additional 5-fold enhancement of target gene expression was observed. This suggests that the TenI protein (205 amino acids) in some way negatively affects the expression or enhancing activity of TenA (236 amino acids). Assays using an aprE-lacZ fusion indicated that TenA exerts this enhancing effect at the transcriptional level, and epistatic studies showed that the stimulation requires a functional DegS-DegU system. Neither TenA nor TenI has been purified.

Little is known regarding tenA-tenI expression and its regulation. The tenA and tenI open reading frames overlap by eight amino acids (the reading frames are offset by 1 bp), suggesting some form of translational coupling. Transcription of the operon may, like that of senS, involve an antitermination/attenuation process at a site located between the promoter and the tenA gene (86).

D. ComP and ComA Proteins

ComP and ComA form a signal-transducing two-component regulatory system that is required for the development of competence in B. subtilis (for a review of competence, see 28). ComP is a protein kinase that can phosphorylate the ComA response regulator. Mutations in comP or comA not only dramatically reduce competence but also reduce expression of degQ (Section III,A), the srfA operon (encoding functions necessary for the synthesis of the extracellular peptide antibiotic surfactin; 87, 88), and gsiA, a gene induced by glucose starvation that participates in regulating some sporulation genes (89, 90). Overproduction of ComA inhibits sporulation about 10- to 20-fold, but this inhibition can be effectively overcome by the concomitant overproduction of ComP (91). This seems to suggest that the unphosphorylated form of ComA can repress some step required for spore formation, but the significance of this effect under physiological conditions is unknown.

Sequence homologies indicate that ComA may be a DNA-binding protein that functions to activate transcription of its targets, but a biochemical demonstration of these activities has not yet been presented. ComP possesses several possible membrane-spanning regions (92), which raises the possibility that it transduces extracellular or membrane-associated signals to the (presumably) soluble ComA regulator. The nature of the signals that trigger ComP phosphorylation of ComA is unknown.

IV. Initiating Sporulation

No description of the Bacillus transition state and early stationary phase is complete without a discussion concerning the regulation of sporulation initiation. In fact, due to the interconnected regulatory circuits, a full understanding of transition-state regulators and modulators requires knowing how the cell makes the decision whether or not to sporulate. What follows is a brief overview of this topic; more details concerning the genes and mechanisms involved can be found in a recent review (93).

A. spo0 Genes and the “Phosphorelay” Signal-Transduction System

The initiation of sporulation requires that the Spo0A transcriptional regulator become phosphorylated (37, 38, 94-98). Therefore, one of the first steps in the initiation process must be the activation of one or more protein kinases that can lead to the formation of Spo0A-P. At least two major kinases, KinA (99, 100) and KinB (K. A. Trach and J. A. Hoch, personal communication), are required for this process. However, direct action of these kinases upon Spo0A does not seem to be appreciable under normal physiological conditions. Rather, a phosphotransfer cascade mechanism, termed the phosphorelay, is responsible for forming Spo0A-P (37). The activated kinases undergo autophosphorylation and transfer the phosphate group to the Spo0F protein. Spo0F serves as a secondary messenger capable of receiving signals (i.e., phosphate groups) from various sensor kinases, each of which responds to different metabolic and environmental stimuli. This aspect of sporulation initiation has been termed a “cumulative environsensory mechanism” (101) and ensures that a response will be based on an integrated picture of the total environmental conditions. The phosphate groups from Spo0F-P are then transferred to Spo0A through the action of the phosphoprotein phosphotransferase encoded by the spo0B gene (102). Spo0A-P, a transcriptional activator (38, 57, 58) and repressor (36), is the critical factor responsible for reprogramming gene expression during the transition state and early sporulation.

Obviously, the phosphorelay contains multiple points of possible control over the initiation process. Each of these may be responsive to different subsets of signals, thereby serving to provide an integrated picture of the specific cellular condition and assuring that sporulation will take place only when it becomes the best possible survival strategy. The discovery of the phosphorelay is relatively recent and we are just beginning to uncover some of the regulatory features controlling the component proteins and their activities. We do know that Spo0A-P operates in positive feedback loops that activate transcription of spo0F (102a) and of its own gene (98). Additionally, the product of the spo0E gene (103), which is regulated by AbrB, appears to control the flow of phosphate through the relay by some as yet undetermined mechanism (104). There seems to be little doubt that other controls affecting either the synthesis or activity of the phosphorelay components exist and await future experimentation.

What are the exact metabolic signals that induce sporulation (presumably by activating the phosphorelay)? There have been numerous attempts to identify one, or a few, key “triggering” metabolites, but no definitive answers have yet been found (cf. 105-107). Sporulation initiation is accompanied by a transient decrease in GDP and GTP pools and, in fact, artificially lowering these pools, for example by addition of decoyinine (an inhibitor of GMP synthetase), can rapidly induce the sporulation response (79, 108). Unfortunately, specific sporulation targets responsive to guanine nucleotide pools have not been identified, so the nature of this effect remains mysterious.
There is, however, one intriguing possibility. The spo0B gene is part of a dicistronic operon (102, 109). The downstream gene, obg, codes for a ras-like GTP-binding protein that is absolutely essential for cellular growth (109), and it has been postulated that Obg connects the phosphorelay to the cell cycle, perhaps by controlling the activity of Spo0B (38).

An oligopeptide permease operon (opp), defined by spo0K mutations, is required for normal initiation of sporulation (110). Perhaps this system transports small peptides that activate one or more of the kinases (or other steps). The nature of the peptides transported is unknown, but it has been suggested that they might be recycled cell-wall peptides that communicate the state of cellular growth (110). The opp system might also be the transporter of an extracellular hormone-like molecule (“sporemone”), released by sporulating cells, that signals other cells in the population to begin the process (111).

An extensive search for effectors of in vitro KinA activity has been performed (18a). Surprisingly, although no conclusive activators were discovered, it was observed that cis-unsaturated fatty acids are potent allosteric inhibitors of KinA autophosphorylation (112). The trans isomers, saturated forms, and iso-branched species had little or no inhibitory effect. B. subtilis phospholipids were found to contain at least one as yet unidentified type of fatty acid that inhibited KinA. These results suggest that the intracellular concentration of a specific unsaturated fatty acid could be a signal that links the activity of a phosphorelay kinase to some membrane-associated event, such as septation. The role of lipids in controlling the initiation of sporulation is not clear, but other studies have shown that some fatty acids and phospholipids can affect sporulation (at least under certain conditions) when added to cultures of Bacillus (113-115).
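The order of phosphate transfer in the phosphorelay described in this section (sensor kinase, then Spo0F, then the Spo0B phosphotransferase, then Spo0A) can be sketched as a simple ordered chain. The following is a toy illustration only, not a quantitative or mechanistic model: the function names are hypothetical, the two kinases are collapsed into a single "sensor kinase" step as a simplifying assumption, and the feedback loops, Spo0E, and Spo0K inputs mentioned in the text are deliberately left out.

```python
# Order of phosphate transfer as described in the text. Collapsing KinA and
# KinB into one step is a simplifying assumption made for this sketch.
RELAY = ("sensor kinase (KinA/KinB)", "Spo0F", "Spo0B", "Spo0A")


def trace_phosphate(missing=()):
    """Return the relay components the phosphate reaches, in order,
    stopping before the first component listed in `missing`."""
    missing = set(missing)
    reached = []
    for component in RELAY:
        if component in missing:
            break
        reached.append(component)
    return reached


def forms_spo0a_p(missing=()):
    """True only if the phosphate can travel the whole relay to Spo0A."""
    return trace_phosphate(missing) == list(RELAY)


print(forms_spo0a_p())             # intact relay
print(trace_phosphate(["Spo0B"]))  # relay interrupted at the Spo0B step
```

The sketch captures only the point that loss of any single relay component interrupts the path to Spo0A-P; it says nothing about rates, signal integration, or the regulatory inputs at each step.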

B. Transition-State Regulators and Sporulation

Not surprisingly, many transition-state regulators function in the circuits regulating the sporulation response. However, since no mutations in them leading to a Spo- phenotype have yet been isolated, they appear to be nonessential for spore formation. But some essential sporulation genes are targets for direct control by AbrB (spo0E, spo0H, and spoVG) and Sin (spoIIA, spoIID, spoIIE, and spoIIG). Hpr and Pai may also regulate spo genes, as shown by the Spo- phenotype of strains overexpressing these proteins. The Spo+ nature of mutants in the regulators is at least partially explained by the fact that in each case the wild-type protein is a negative effector of the target spo genes.

Figure 2 illustrates the known roles of AbrB, Hpr, and Sin in the early stages of sporulation. During vegetative growth, they function to prevent the inappropriate expression of various spo genes. Once sporulation initiation signals have been received, it becomes necessary to lift the repressive effects. Through the phosphorelay, the initiation signals are transduced into the form of Spo0A-P. The first result is repression of abrB [lower Spo0A-P levels are needed for abrB repression than are needed for Spo0A-P-mediated activation of spoII genes (38)]. Lowering AbrB deactivates hpr expression and also contributes (through derepression of spo0H) to the positive feedback loops that increase Spo0A-P production.² Due to lowered AbrB and Hpr levels, the isin gene is derepressed and the accumulating Spo0A-P pool probably activates isin expression even further. Isin interacts with the Sin protein to deactivate the latter and thus removes an impediment to Spo0A-P activation of various spoII genes. Spo0A-P also represses expression of gsiA, a recently discovered gene whose product appears to be a negative regulator of certain spoII genes (89, 90). Expression of the spoII operons (two of which encode sporulation-specific sigma factors) is absolutely required for further endospore development. Except for AbrB regulation of spoVG, control of sporulation events beyond stage II has not been observed for any known transition-state regulator.

² By leading to the release of an extracellular differentiation factor (111), the lowered AbrB levels might also promote spore formation in other cells of the population.

FIG. 2. Phosphorelay signal-transduction mechanism controlling the initiation of sporulation. Formation of the transcription regulator Spo0A-P is the crucial factor leading to subsequent steps in the sporulation process. Phosphorylation of Spo0A involves a cascade of phosphate transfers, beginning with autophosphorylation of protein kinases KinA and KinB. These phosphorylate Spo0F to Spo0F-P, which serves as the donor of P in the Spo0B-catalyzed phosphorylation of Spo0A. This sequence of reactions has been termed a phosphorelay (37, 38). Spo0A-P functions in positive feedback loops leading to increased production of itself and of the Spo0F secondary messenger. Spo0E regulates the flow of phosphate through the relay by some unknown means. Spo0K is an oligopeptide permease system that seems to transport a signal affecting the relay; the nature of the signal and its point of effect are not known. Mutations in spo0 genes block the initiation process; mutations in, or non-expression of, spoII genes block sporulation at stage II (formation of an asymmetric division septum). +, Positive effects; -, negative effects.

V. "Redundant" Control of Transition-State Genes

Many functions that become expressed when the cell enters the transition state are controlled by more than one regulator. This is perhaps not too surprising for multistep processes, such as competence and sporulation, in which final commitment requires an integrated view of the environment. But what of single genes, such as those encoding degradative enzymes, that are subject to multiple controls? Perhaps the most extreme example of this is the regulation of the alkaline protease subtilisin (Fig. 3). At least four transition-state regulators (AbrB, Hpr, Sin, and Pai) prevent subtilisin expression during vegetative growth. Additionally, there are at least 10 activators or modulators that affect subtilisin expression once the negative effects have been lifted at the onset of the transition state. Why are some genes redundantly regulated while others are controlled by only one transition-state regulator (e.g., spo0E and tycA repression by AbrB)?

In the case of subtilisin, the above question seems even more enigmatic since the presence of the protease is entirely dispensable for growth, stationary phase survival, and sporulation under laboratory conditions (29). But given the considerable effort the cell expends to regulate its expression, it appears probable that subtilisin confers a selective advantage or performs an essential function during nutrient-limiting conditions found in the wild. Conversely, the expression of subtilisin during vegetative growth may have the opposite effect: lethality or selective disadvantage (in fact, no one has ever observed subtilisin expression during active growth, even under laboratory conditions and despite numerous attempts to do so). Thus, in some cases, redundancy of control could be a means to ensure that a potentially deleterious gene will remain silent despite minor environmental perturbations and be expressed only when it is advantageous.

Redundant control using multiple regulators may also reflect the fact that there are many nutrient-limiting environments in which a cell might find itself. Redundancy could provide a rheostat mechanism that adjusts the level of gene expression in response to the exact metabolic conditions encountered, with each regulator sensitive to a different metabolic cue (or set of cues). This would provide a means for the cell to fine-tune transition-state gene expression to its precise requirements and thereby avoid unnecessary expenditure of energy and metabolites, a primary consideration when nutrients are in short supply.

FIG. 3. Multiple controls of subtilisin expression. The upstream region of the structural gene (aprE) for subtilisin is drawn to scale. Known binding regions for the transition-state regulators AbrB, Hpr, and Sin are indicated by bracketed lines. The putative site of DegS/DegU and DegQ action is shown as a wavy line. Sites of action for the regulators enclosed in boxes are not known. The start-point of transcription is designated +1. -, Negative effects; +, positive effects.

VI. Concluding Remarks

For B. subtilis, under laboratory conditions, the transition from exponential growth into stationary phase is of short duration. However, in its natural soil environment, where nutrients are usually limited, the transition state is probably the predominant metabolically active stage. Transition-state regulators, activators, and modulators should therefore be considered among the major factors that optimize survival under the most commonly encountered conditions. They are critical elements that integrate metabolic and environmental information and channel the cell along the proper paths while ensuring that metabolites and energy sources necessary for the task are readily available. They may be dispensable in the laboratory, but it seems likely that they provide key advantages in the wild.

Elucidation of the regulatory mechanisms controlling transition-state gene expression in Bacillus is not just a matter of basic research. Many compounds produced during this period, such as degradative enzymes and antibiotics, are commercially and humanly valuable commodities (116). Additionally, transition-state regulators are likely to exist in other bacterial species that undergo a differentiation process and may function in any bacterial transition from active growth to stationary phase. Beyond gaining a deeper understanding of general bacterial physiology, studies examining changes in gene expression due to nutrient limitation may have specific impacts on public health questions, since nutrient or metabolite deprivation can induce synthesis of virulence factors in many pathogens (117-120). For instance, starvation conditions lead to the expression of genes that allow Salmonella typhimurium to survive inside macrophages; low iron concentrations derepress production of diphtheria toxin; and Yersinia virulence genes are expressed only in the absence of calcium.

Given their role in coordinating a differentiation process (sporulation), analogs of B. subtilis transition-state regulators (such as AbrB) may also exist in eukaryotic cells poised to respond in various developmental ways depending on the stimulus. Interestingly, although they share no significant amino-acid homology, functional similarities can be drawn between AbrB and homeodomain proteins (121, 122). Like AbrB, homeodomain proteins are transcriptional regulators, some apparently capable of both positive and negative effects; they are DNA-binding proteins, some with unclear binding specificities; some are autoregulated; some regulate other regulatory genes; and the activation of their targets represents a developmental transition. Perhaps transition-state regulators and modulators are prokaryotic equivalents or prototypes of the eukaryotic homeodomain proteins, each group having evolved convergently to meet the needs of transitional periods in growth and development. Further studies aimed at dissecting and integrating the regulatory complexities surrounding Bacillus subtilis post-exponential gene expression, and the identification in other organisms of analogs to transition-state regulators, should result in a deeper understanding of the mechanisms used to ensure biological survival.

ACKNOWLEDGMENTS

I thank Kari Ohlsen and Katherine Welsh for critically reading the manuscript and Janet Hightower for preparation of the figures. Preparation of this article was supported in part by Grant GM46700 from the National Institute of General Medical Sciences, National Institutes of Health, United States Public Health Service. This is paper No. 7662-MEM from the Department of Molecular and Experimental Medicine.

REFERENCES

1. A. Matin, Mol. Microbiol. 5, 3 (1991).
2. D. A. Siegele and R. Kolter, J. Bact. 174, 345 (1992).
3. J. F. Guespin-Michel, MGG 112, 243 (1971).
4. J. F. Guespin-Michel, J. Bact. 109, 241 (1971).
5. J. Ito, MGG 124, 97 (1973).
6. J. Ito, G. Mildner and J. Spizizen, MGG 112, 104 (1971).
7. J. Trowsdale, S. M. H. Chen and J. A. Hoch, in “Spores VII” (G. Chambliss and J. C. Vary, eds.), p. 131. American Society for Microbiology, Washington, D.C., 1978.
8. J. Trowsdale, S. M. H. Chen and J. A. Hoch, MGG 173, 61 (1979).
9. M. Perego, G. B. Spiegelman and J. A. Hoch, Mol. Microbiol. 2, 689 (1988).


10. P. Zuber and R. Losick, J. Bact. 169, 2222 (1987).
11. M. A. Strauch, G. B. Spiegelman, M. Perego, W. C. Johnson, D. Burbulys and J. A. Hoch, EMBO J. 8, 1615 (1989).
12. M. A. Strauch, M. Perego, D. Burbulys and J. A. Hoch, Mol. Microbiol. 3, 1203 (1989).
13. J. B. Robertson, M. Gocht, M. A. Marahiel and P. Zuber, PNAS 86, 8457 (1989).
14. R. Furbass and M. A. Marahiel, FEBS Lett. 287, 153 (1991).
15. C. O. Pabo and R. T. Sauer, ARB 61, 1053 (1992).
16. R. Furbass, M. Gocht, P. Zuber and M. A. Marahiel, MGG 225, 347 (1991).
17. M. A. Strauch, in “Bacillus subtilis and Other Gram-Positive Bacteria” (A. L. Sonenshein, R. Losick and J. A. Hoch, eds.), p. 757. American Society for Microbiology, Washington, D.C., 1993.
18. F. J. Slack, J. P. Mueller, M. A. Strauch, C. Mathiopoulos and A. L. Sonenshein, Mol. Microbiol. 5, 1915 (1991).
18a. M. A. Strauch, unpublished (1992).
19. J. R. Greene, S. M. Brennan, D. J. Andrew, C. C. Thompson, S. H. Richards, R. L. Heinrickson and E. P. Geiduschek, PNAS 81, 7031 (1984).
20. K. Drlica and J. Rouviere-Yaniv, Microbiol. Rev. 51, 301 (1987).
21. H.-S. Koo, H.-M. Wu and D. M. Crothers, Nature 310, 501 (1986).
22. D. L. Milton, M. L. Casper, N. M. Wells and R. F. Gesteland, NAres 18, 817 (1990).
23. H. C. M. Nelson, J. T. Finch, B. F. Luisi and A. Klug, Nature 330, 221 (1987).
24. M. A. Marahiel, P. Zuber, C. Czekay and R. Losick, J. Bact. 169, 2215 (1987).
25. M. Perego and J. A. Hoch, J. Bact. 173, 2514 (1991).
26. J. Weir, M. Predich, E. Dubnau, G. Nair and I. Smith, J. Bact. 173, 521 (1991).
27. M. Perego and J. A. Hoch, J. Bact. 170, 2560 (1988).
28. D. Dubnau, Microbiol. Rev. 55, 395 (1991).
29. F. Valle and E. Ferrari, in “Regulation of Procaryotic Development” (I. Smith, R. A. Slepecky and P. Setlow, eds.), p. 131. American Society for Microbiology, Washington, D.C., 1989.
30. W. Su, S. Porter, S. Kustu and H. Echols, PNAS 87, 5504 (1990).
31. Y. Flashner and J. D. Gralla, Cell 54, 713 (1988).
32. C. F. McAllister and E. C. Achberger, JBC 263, 11743 (1988).
33. C. F. McAllister and E. C. Achberger, JBC 264, 10451 (1989).
34. C. D. B. Banner, C. P. Moran, Jr., and R. Losick, JMB 168, 351 (1983).
35. L. Bracco, D. Kotlarz, A. Kolb, S. Diekmann and H. Buc, EMBO J. 8, 4289 (1989).
36. M. A. Strauch, V. Webb, G. B. Spiegelman and J. A. Hoch, PNAS 87, 1801 (1990).
37. D. Burbulys, K. A. Trach and J. A. Hoch, Cell 64, 545 (1991).
38. K. A. Trach, D. Burbulys, M. A. Strauch, J.-J. Wu, R. Jonas, N. Dhillon, C. Hanstein, P. Kallio, M. Perego, T. Bird, G. Spiegelman, C. Fogher and J. A. Hoch, Res. Microbiol. 142, 815 (1991).
39. T. B. Higerd, J. A. Hoch and J. Spizizen, J. Bact. 112, 1026 (1972).
40. B. Dod and G. Balassa, MGG 163, 57 (1978).
41. V. Jeannoda and G. Balassa, MGG 163, 65 (1978).
42. J. Ito and J. Spizizen, in “Spores V” (H. O. Halvorson, R. Hanson and L. L. Campbell, eds.), p. 107. American Society for Microbiology, Washington, D.C., 1973.
43. P. T. Kallio, J. E. Fagelson, J. A. Hoch and M. A. Strauch, JBC 266, 13411 (1991).
44. D. J. Henner, E. Ferrari, M. Perego and J. A. Hoch, J. Bact. 170, 296 (1988).
45. S. Toma, M. Del Bue, A. Pirola and G. Grandi, J. Bact. 167, 740 (1986).
46. R. Schleif, ARB 61, 199 (1992).
47. B. C. A. Dowds and J. A. Hoch, J. Gen. Microbiol. 137, 1121 (1991).
48. N. K. Gaur, E. Dubnau and I. Smith, J. Bact. 168, 860 (1986).


48a. I. Smith, personal communication, 1992.
49. N. K. Gaur, K. Cabane and I. Smith, J. Bact. 170, 1046 (1988).
50. T. Akamatsu and J. Sekiguchi, Agric. Biol. Chem. 51, 2901 (1987).
51. H. Pooley and D. Karamata, J. Bact. 160, 1123 (1984).
52. J. Sekiguchi, B. Ezaki, K. Kodama and T. Akamatsu, J. Gen. Microbiol. 134, 1611 (1988).
53. J. Sekiguchi, H. Ohsu, A. Kuroda, H. Moriyama and T. Akamatsu, J. Gen. Microbiol. 136, 1223 (1990).
54. N. K. Gaur, J. Oppenheim and I. Smith, J. Bact. 173, 678 (1991).
55. I. Smith, I. Mandic-Mulec and N. Gaur, Res. Microbiol. 142, 831 (1991).
56. I. Mandic-Mulec, N. Gaur, U. Bai and I. Smith, J. Bact. 174, 3561 (1992).
57. S. Satola, P. A. Kirchman and C. P. Moran, Jr., PNAS 88, 4533 (1991).
58. K. York, T. J. Kenney, S. Satola, C. P. Moran, Jr., H. Poth and P. Youngman, J. Bact. 174, 2648 (1992).
59. M. Honjo, A. Nakayama, K. Fukazawa, K. Kawamura, K. Ando, M. Hori and Y. Furutani, J. Bact. 172, 1783 (1990).
60. T. Msadek, F. Kunst and G. Rapoport, in “Bacillus subtilis and Other Gram-Positive Bacteria” (A. L. Sonenshein, R. Losick and J. A. Hoch, eds.), p. 729. American Society for Microbiology, Washington, D.C., 1993.
60a. I. Smith, in “Bacillus subtilis and Other Gram-Positive Bacteria” (A. L. Sonenshein, R. Losick and J. A. Hoch, eds.), p. 785. American Society for Microbiology, Washington, D.C., 1993.
61. T. Msadek, F. Kunst, D. Henner, A. Klier, G. Rapoport and R. Dedonder, J. Bact. 172, 824 (1990).
62. D. J. Henner, M. Yang and E. Ferrari, J. Bact. 170, 5102 (1988).
63. F. Kunst, M. Debarbouille, T. Msadek, M. Young, C. Mauel, D. Karamata, A. Klier, G. Rapoport and R. Dedonder, J. Bact. 170, 5093 (1988).
64. M. K. Dahl, T. Msadek, F. Kunst and G. Rapoport, J. Bact. 173, 2539 (1991).
65. K. Mukai, M. Kawata and T. Tanaka, JBC 265, 20000 (1990).
66. Y. Nagami and T. Tanaka, J. Bact. 166, 20 (1986).
67. M. Yang, H. Shimotsu, E. Ferrari and D. J. Henner, J. Bact. 169, 434 (1987).
68. T. Tanaka, M. Kawata, M. Saitoh and Y. Nagami, in “Genetics and Biotechnology of Bacilli” (A. T. Ganesan and J. A. Hoch, eds.), Vol. 2, p. 33. Academic Press, San Diego, 1988.
69. V. Singer, Ph.D. thesis. University of California-Berkeley, 1987.
70. T. Tanaka and M. Kawata, J. Bact. 170, 3593 (1988).
71. M. Yang, E. Ferrari, E. Chen and D. J. Henner, J. Bact. 166, 113 (1986).
72. A. Amory, F. Kunst, E. Aubert, A. Klier and G. Rapoport, J. Bact. 169, 324 (1987).
73. J.-A. Lepesant, F. Kunst, J. Lepesant-Kejzlarova and R. Dedonder, MGG 118, 135 (1972).
74. J. A. Lepesant, F. Kunst, M. Pascal, J. Lepesant-Kejzlarova, M. Steinmetz and R. Dedonder, in “Microbiology-1976” (D. Schlessinger, ed.), p. 58. American Society for Microbiology, Washington, D.C., 1976.
75. A. Sloma, D. Pawlyk and J. Pero, in “Genetics and Biotechnology of Bacilli” (A. T. Ganesan and J. A. Hoch, eds.), Vol. 2, p. 23. Academic Press, San Diego, 1988.
76. H. Shimotsu and D. J. Henner, J. Bact. 168, 380 (1986).
77. A. Klier, A. Fouet, M. Debarbouille, F. Kunst and G. Rapoport, Mol. Microbiol. 1, 233 (1987).
78. T. Msadek, F. Kunst, A. Klier and G. Rapoport, J. Bact. 173, 2366 (1991).
79. T. Mitani, J. E. Heinze and E. Freese, BBRC 77, 1118 (1977).
80. M. Takagi, H. Takada and T. Imanaka, J. Bact. 172, 411 (1990).


81. S. L. Wong, L. F. Wang and R. H. Doi, J. Gen. Microbiol. 134, 3264 (1988).
82. L. Wang and R. H. Doi, J. Bact. 172, 1939 (1990).
83. P. McCready and R. H. Doi, in “Genetics and Biotechnology of Bacilli” (M. M. Zukowski, A. T. Ganesan and J. A. Hoch, eds.), Vol. 3, p. 393. Academic Press, San Diego, 1990.
84. J. Greenblatt, J. Li, S. Adhya, D. I. Friedman, L. S. Baron, B. Redfield, H. Kung and H. Weisbach, PNAS 77, 1991 (1980).
85. J. Greenblatt, M. McLimont and S. Hanly, Nature 292, 215 (1981).
86. A. S. Pang, S. Nathoo and S. Wong, J. Bact. 173, 46 (1991).
87. M. M. Nakano, L. Xia and P. Zuber, J. Bact. 173, 5487 (1991).
88. M. M. Nakano, R. Magnuson, A. Myers, J. Curry, A. D. Grossman and P. Zuber, J. Bact. 173, 1770 (1991).
89. J. P. Mueller, G. Bukusoglu and A. L. Sonenshein, J. Bact. 174, 4361 (1992).
90. J. P. Mueller and A. L. Sonenshein, J. Bact. 174, 4374 (1992).
91. I. Smith, E. Dubnau, M. Predich, U. Bai and R. Rudner, Biochimie 74, 669 (1992).
92. Y. Weinrauch, R. Penchev, E. Dubnau, I. Smith and D. Dubnau, Genes Dev. 4, 860 (1990).
93. J. A. Hoch, in “Bacillus subtilis and Other Gram-Positive Bacteria” (A. L. Sonenshein, R. Losick and J. A. Hoch, eds.), p. 747. American Society for Microbiology, Washington, D.C., 1993.
94. J. A. Hoch, K. Trach, F. Kawamura and H. Saito, J. Bact. 161, 552 (1985).
95. F. A. Ferrari, K. Trach, D. LeCoq, J. Spence, E. Ferrari and J. A. Hoch, PNAS 82, 2647 (1985).
96. G. Spiegelman, B. Van Hoy, M. Perego, J. Day, K. Trach and J. A. Hoch, J. Bact. 172, 5011 (1990).
97. M. Perego, J.-J. Wu, G. B. Spiegelman and J. A. Hoch, Gene 100, 207 (1991).
98. M. A. Strauch, K. A. Trach, J. Day and J. A. Hoch, Biochimie 74, 619 (1992).
99. C. Antoniewski, B. Savelli and P. Stragier, J. Bact. 172, 86 (1990).
100. M. Perego, S. P. Cole, D. Burbulys, K. Trach and J. A. Hoch, J. Bact. 171, 6187 (1989).
101. K. Trach, D. Burbulys, G. Spiegelman, M. Perego, B. Van Hoy, M. A. Strauch, J. Day and J. A. Hoch, in “Genetics and Biotechnology of Bacilli” (M. M. Zukowski, A. T. Ganesan and J. A. Hoch, eds.), Vol. 3, p. 357. Academic Press, San Diego, 1990.
102. F. A. Ferrari, K. Trach and J. A. Hoch, J. Bact. 161, 556 (1985).
102a. M. A. Strauch, J.-J. Wu, R. H. Jonas and J. A. Hoch, Mol. Microbiol. 7, 967 (1993).
103. M. Perego and J. A. Hoch, Mol. Microbiol. 1, 125 (1987).
104. M. Perego and J. A. Hoch, J. Bact. 173, 2514 (1991).
105. A. L. Sonenshein, in “Molecular Biology of Microbial Differentiation” (J. A. Hoch and P. Setlow, eds.), p. 185. American Society for Microbiology, Washington, D.C., 1985.
106. A. L. Sonenshein, in “Regulation of Procaryotic Development” (I. Smith, R. A. Slepecky and P. Setlow, eds.), p. 109. American Society for Microbiology, Washington, D.C., 1989.
107. E. Freese, E. B. Freese, E. R. Allen, Z. Olempska-Beer, C. Orrego, A. Varma and H. Wabiko, in “Molecular Biology of Microbial Differentiation” (J. A. Hoch and P. Setlow, eds.), p. 194. American Society for Microbiology, Washington, D.C., 1985.
108. J. M. Lopez, A. Dromerick and E. Freese, J. Bact. 146, 605 (1981).
109. K. Trach and J. A. Hoch, J. Bact. 171, 1362 (1989).
110. M. Perego, C. F. Higgins, S. R. Pearce, M. P. Gallagher and J. A. Hoch, Mol. Microbiol. 5, 173 (1991).
111. A. D. Grossman and R. Losick, PNAS 85, 4369 (1988).
112. M. A. Strauch, D. de Mendoza and J. A. Hoch, Mol. Microbiol. 6, 2909 (1992).
113. J. W. Foster, W. A. Hardwick and B. Guirard, J. Bact. 59, 463 (1950).
114. W. A. Hardwick, B. Guirard and J. W. Foster, J. Bact. 61, 145 (1951).


115. S. Petridou and R. A. Slepecky, Biochimie 74, 749 (1992).
116. M. M. Zukowski, in “Biology of Bacilli: Applications to Industry” (R. H. Doi and M. McCloughlin, eds.), p. 311. Butterworth-Heinemann, Stoneham, Massachusetts, 1992.
117. M. R. W. Brown and P. Williams, Annu. Rev. Microbiol. 39, 527 (1985).
118. D. B. Roszak and R. R. Colwell, Microbiol. Rev. 51, 365 (1987).
119. B. B. Finlay and S. Falkow, Microbiol. Rev. 53, 210 (1989).
120. J. J. Mekalanos, J. Bact. 174, 1 (1992).
121. S. Hayashi and M. P. Scott, Cell 63, 883 (1990).
122. M. P. Scott, J. W. Tamkun and G. W. Hartzell III, BBA 989, 25 (1989).


Genomic Organization of T and W, a New Family of Double-Stranded RNAs from Saccharomyces cerevisiae¹

ROSA ESTEBAN,² NIEVES RODRÍGUEZ-COUSIÑO AND LUIS M. ESTEBAN

Instituto de Microbiología Bioquímica
Departamento de Microbiología y Genética
Consejo Superior de Investigaciones Científicas
Universidad de Salamanca
Salamanca 37071, Spain

I. General Features
II. T and W Genomic Organization
III. Single-Stranded RNA Counterparts of T and W dsRNAs
IV. Configuration of W and 20S RNA
V. T and W Replication Cycles
VI. T and W Are Not Encapsidated into Viral Particles
VII. Are the RNA Polymerases Associated with the RNAs?
VIII. Relationship between T and W and Other Mycoviruses
IX. Evolutionary Origin of T and W
X. Conclusions and Perspectives
References

Double-stranded RNAs (dsRNAs) have been found in many fungi. With recent progress in the molecular biology of RNA, their unique nature as viruses has begun to emerge. Some of these RNAs have been cloned and sequenced. Most of them encode RNA polymerases that have the consensus motifs for RNA-dependent RNA polymerases of (+) strand and dsRNA viruses. Unlike animal dsRNA viruses, however, many of them are unsegmented and they are not transmitted extracellularly; transmission is vertical, from mother to daughter cells during mitosis, or by hyphal fusion or anastomosis. Among the most widely studied are the dsRNAs responsible for the so-called “killer phenomena” in Saccharomyces cerevisiae, namely, L-A and M1.³ Both RNAs are separately encapsidated in isometric viral particles. L-A codes for the viral coat proteins, including its own RNA polymerase, and M1, a satellite virus of L-A, encodes the K1 killer toxin. This toxin is efficiently secreted into the medium and its secretory mechanism has been studied in detail as a model for secretion of proteins in eukaryotes (for reviews, see 1 and 2 and references therein).

In 1984, two new dsRNAs were described in S. cerevisiae strains, T and W⁴ (3). Unlike L-A and M1 dsRNA, T and W are not encapsidated into viral particles. This review focuses on the characterization of these two dsRNAs. Most of the information comes from cloning and sequencing of their cDNAs. Their nucleotide sequences reveal that they are of viral origin, but their nonencapsidated nature distinguishes them from other dsRNA viruses.

¹ Abbreviations used: RDRP, RNA-dependent RNA polymerase (EC 2.7.7.48); CBI, codon bias index; PIR, Protein Identification Resource data base from the National Biomedical Research Foundation; VS, VS-RNA, a single-stranded circular molecule isolated from the Varkud-1c natural isolate of Neurospora; IPNV, infectious pancreatic necrosis virus.
² To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 46. Copyright © 1993 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. General Features

T and W were first described as dsRNAs present in a number of different laboratory strains of S. cerevisiae; the sizes of T and W were estimated to be 2.7 and 2.25 kb, respectively, in agarose gels (3). Hybridization experiments showed no homology to each other, to other dsRNAs present in S. cerevisiae, or to mitochondrial or genomic DNA. Both T and W are cytoplasmically inherited and are not encapsidated into viral particles (3). Both are inducible in certain yeast strains when grown at 37°C. W is present in most laboratory yeast strains. In a search carried out with different laboratory strains, more than 90% of those tested carried W, whereas a very limited number of strains carry both T and W. Strains lacking T (T-o strains), however, have no genetic impairment in maintaining T dsRNA, because after transfer of T from a T-carrying strain by cytoduction, the cytoductants maintained T dsRNA stably (M. P. Garcia-Cuellar and R. Esteban, unpublished). The fact that T has always been found in W-carrying strains may suggest that T is dependent on W for its maintenance. In laboratory yeast strains, T and W seem to be highly stable: conditions such as the ones used for the curing of “killer” viruses (growth at 37-39°C for several generations) do not eliminate these RNAs, but instead increase their copy number, as mentioned above.

³ The symbols and abbreviations are listed in Table I. See also Table II. [Eds.]
⁴ T and W are trivial, ad hoc, designations, without intrinsic meaning (3). [Eds.]


II. T and W Genomic Organization

A. Nucleotide Sequences

Most of the nucleotide sequences of T and W dsRNAs have been determined recently (4, 5). In both cases, T and W cDNA clones were obtained by random priming with denatured T or W dsRNA as template, followed by reverse transcription. The sequence of W, so far as it has been determined, includes 2505 base-pairs. That of T is, at present, 2871 base-pairs. Our estimation of their sizes, by agarose gel electrophoresis, is 2.4 kilobases for W and 2.9 for T; only a few nucleotides at both ends remain to be determined. There is no homology between the nucleotide sequences of T and W, with the exception of one stretch of 17 nucleotides between positions 1362 and 1378 in the T sequence, and 1248 and 1264 in the W sequence. In fact, they show no hybridization to each other in Northern blots (3-5). This stretch of 17 nucleotides is located in the central part of the sequences that have coding capacity (see Section II,B). Both T and W are slightly GC-rich (58% in the case of W and 59% in the case of T). Southern blot hybridization with T- or W-specific probes was used to test whether any copies of these RNAs are present in the yeast genome (nuclear or mitochondrial). The probes showed no hybridization even after a prolonged exposure, indicating that T and W nucleotide sequences are not present in the yeast genome (4, 5).

Most laboratory yeast strains carry other dsRNAs, namely, L-A, L-BC, and M1 dsRNAs (Table I), encapsidated in viral particles of 39-nm diameter. L-A and M1 cDNAs have been synthesized and cloned and their nucleotide sequences have been determined (6-9). Computer analysis showed that these dsRNAs have no homology with T or W dsRNAs; this lack of homology was also confirmed by Northern analysis (4, 5). Another dsRNA similar in size to L-A is L-BC. L-BC dsRNA (4.6 kb) is present in most laboratory yeast strains and is encapsidated into viral particles as well. L-BC particles are similar in size to L-A particles but have a different protein composition (10). Only part of the L-BC genome (737 nucleotides) has been sequenced (11). This 737-nucleotide fragment shows no homology with T or W nucleotide sequences. Northern analysis again confirmed that the rest of the L-BC sequence has no homology with T or W. Therefore, T and W dsRNAs are different from the other dsRNAs present in S. cerevisiae and they are maintained autonomously in the yeast cells, probably by an RNA-to-RNA replication pathway.

TABLE I
dsRNAs FROM S. cerevisiae

Name | Size (kb) | Encoded proteins: mass | Encoded proteins: function | Location | Presence in viral coats | Single-stranded RNA counterparts
L-A | 4.6 | 76 kDa; 170 kDa | Major coat protein; RNA polymerase | Cytoplasmic | 39-nm spherical particle | Message strands
M1 | 1.8 | 32 kDa | Killer toxin precursor | Cytoplasmic | 39-nm spherical particle | Message strands
L-BC | 4.6 | ? | ? | Cytoplasmic | 39-nm spherical particle | Message strands
T | 2.9 | 104 kDa | RNA polymerase | Cytoplasmic | Non-encapsidated | 23S RNA
W | 2.5 | 91 kDa | RNA polymerase | Cytoplasmic | Non-encapsidated | 20S RNA
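The GC percentages quoted above for T and W are simple base-composition figures. As a minimal sketch of how such a value is obtained from a sequence, the Python function below counts G and C among the recognized bases. The function name and the toy sequence are hypothetical and chosen only for illustration; the toy sequence is not taken from T or W.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the A/C/G/T/U bases of a sequence."""
    # Ignore anything that is not a standard base (spaces, N, digits, ...).
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, T or U")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical toy sequence, used only to show the call.
toy_sequence = "GGCCGCATGCGCAT"
print(f"GC content: {gc_content(toy_sequence):.1f}%")
```

Applied to the full (+) strand sequences, a calculation of this kind would be expected to give values near the 58% and 59% reported in the text.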

B. T and W Encode Their Own RNA Polymerases

Both RNAs have coding capacity. In each case, only one strand [the (+) strand] has a long open reading frame (ORF), which spans almost the entire length of the RNA (Fig. 1). The complementary strands, the (-) strands, have no coding capacity for proteins larger than 100 amino acids. The T-encoded protein, predicted from the known nucleotide sequence, is 940 amino acids in length (estimated molecular mass, 104 kDa). The reading frame starts with C(AT) at the third base of the known T (+) strand 5' end preceded by TG, and ends with T(GA) at position 2823 (48 nucleotides from the known 3' end) (5). Since this estimated molecular mass is in good agreement with that obtained from SDS-polyacrylamide-gel electrophoresis (see Section VII), the 5' end nucleotides (TG) seem to be part of the initiation codon of the T-encoded protein. The (+) strand of W contains information for the synthesis of a protein of 829 amino acids (molecular mass, 91 kDa) from an A(TG) at base 6 to a T(GA) at position 2492 (only 10 nucleotides from the 3' end). In both proteins, there are sequences diagnostic of RNA-dependent RNA polymerases (RDRPs) from (+) strand or dsRNA viruses (11-14). The existence of these putative polymerases, together with the fact that there are no DNA copies of T or W, indicates that these RNAs are replicated through an RNA-to-RNA replication pathway and that the proteins encoded by them are implicated in this process.

FIG. 1. Genomic organization of T and W dsRNAs and the proteins encoded in the (+) strands. Sequence motifs (A to D) conserved in RNA-dependent RNA polymerases (RDRP) (14) are boxed and shaded. Motifs 1, 2, and 3 are only present in these two proteins (see Section II,B and Fig. 2). The tandem repeats of proline in the C-terminus of T protein are indicated.

Although T and W show no homology to each other at the nucleotide level, the proteins encoded by them show a high degree of conservation. This conservation extends beyond the region that contains the consensus sequences for RDRPs, as shown in Fig. 2. On average, only 22.5% of the amino acids of these two proteins are identical, but there are several regions with highly conserved sequences, particularly the central parts of both molecules, between amino acids 352 (glycine) and 502 (glutamic acid) in the W protein, and between amino acids 388 (glycine) and 541 (glutamic acid) in the T protein. This interval of about 150 amino acids includes the four motifs (A to D) described as characteristic of RDRPs (14). These four motifs are boxed and indicated on top of the sequences in Fig. 2. Region B in particular shows a high degree of conservation, with 26 amino acids identical in an interval of 35 amino acids (74% identity), including the ...S/T/MG...T... consensus motif (12-14). Region C, which contains the GDD motif, initially identified as characteristic of RDRP (12), is conserved as well. Outside this 150-amino-acid interval, there are also some conserved sequences, with more than 50% of the amino acids identical; those are indicated as regions 1, 2, and 3 on top of the respective sequences (Fig. 2). No similar sequences have been found in other viral RDRP. Recently, however, a number of reports have appeared concerning the classification of RNA viruses and retroelements, based on the degree of conservation among their RDRP (11, 15, 16), and it was suggested that, in addition to the central motifs described by Poch et al. (14), there could be other divergent domains that have functional importance in these proteins. Therefore, it is not yet known whether domains 1, 2, and 3 in the T and W sequences are necessary for the polymerase activity as additional regions or are conserved only in these particular polymerases due to special functions other than RNA polymerization. The strongest sequence divergence between the T and W proteins is seen in the N- and C-terminal regions (Fig. 2).

Comparison of regions A to D at the nucleotide level shows a lower degree of conservation between the T and W nucleotide sequences. As mentioned above, only one stretch of 17 nucleotides appears in both RNAs. Most of the conserved amino acids in the interval A to D are encoded by different codons.
However, more than 50%of these triplets maintain the first

160 1

u

1

ROSA ESTEBAN ET AL.

HHKVNVKTPREVHFPMDLLPACGASAPRPVARVSRATDLDRRYRCVLSLPEERARSVGCKUSSTRAALRRGLEELGSREFRRRLRLADDC

* . ..* * * ** * * . . *.* . ___---1 _ _ _ - _ - _ _ - - _ _ _ _ _ _ _MKEPVDCRLSTPAGFSGTVPPPGRTKAARPGTIPVRRSRGSASALPGKIYGUSRRQRDRFAMLLSS

90

66

T

91

URAICAA---VCTGRKFPSFSVTDRPARARLAKVYRMGRRLLVGVVCRGESVVSDLKQECADLRRVIFEGSTRIPSSSLUGLVGVLGUTS 177

U

67

FDAALAAYSGWVSRGTRSLPPSLRLFRAMTRKU------LSVTARGNGVEFAIASAKEFSAACRAGUISGTVPDHFFMKULPEPVRRKS

T

* * * * * * .

.*t

170 PERA

U 151

T

265

U

240

T

355

U

325

GLUA

**

* * * *

*i

. * ..*

*

LTFIGRALPYGSPDVERRALASHAATL lPAECHPNYLVAAEPFAKSUADDNL---PRKFRlYPlAVPESSCMEYSRAPGGLLP

**** ** *

*

* *** * *

.

.*

** .*

* * -**. *

*

.

*

2 . .**

VAWPERGFKARIVTTHSASRVTFGHPFRRYLLP

354

SlTDLVSSSPTDNLPPLESMPFGPTPGPALPVHVLEVSLSRYHNGSDPKG----

VSVVRERGHKVRVVSAMETHELVLGHAARRRLFK

324

*

* ** *** * *.*

* ***

..

** *.

444

*

405 B

534

*****

U 406

495

m 535

U

496

T

625

u

538

T

713

264

SFRKGFVGYDPAAPSADPDDLELAKERGFSRlRASUYSTFRYRGELKSTNaSLE

445

T

150

SFEVPADVLTSLRNYSEDUARRHLAADPDPSLLCEPCTGNSATFERTRREGGFAQ 239

LSFIGRSLPEGGDRHEIEALANHKAAL

m

T

*

m

I

~CFTVKPVKMKVRVRSEPSLKGYVFSRSSAFSCRMGGKG

I TG I RAARLYT I GAMPRUSRRI RDVYPGSLEHRTASPRYGEPVT

.

*

. *

. **

624 537

.*

**** ***

**

*

712

**. *

621

--KSTGLTTKIGAVASRRUMSRIGHDLYRSRERKSTLGRVTLSTSPAYAASLHEVEKFMDR---------PDIILTRKCRNPMLKHARE

*. * * *

*. *

*.

***

**

* *

791

t

U

622

SDRVDAPAFHRKAISSLVVGSDATAAYSFlRM~GFEGHPUKT---~--AASQETDTUFADYKVTRPGKMYPDRYGFLDGESLRTKSTML

705

LGLFEEVFESRVGGGILUASLNGKALVESHSPSILPVSRNLRRSLACPSGGFLRPSAPIGKLVPRHTLPRGTVUFLESSATDSARPGGMG

881

T

792

U

706 NSAVYETFLGPDPOATHYPSLRIVASRLAKVRKDLVNRUPSVKPVGKDLGTILEAFEESKLCTLMTPYDASGYFDDSLLLMDESVYPRR

T

882

LPPPPPPPLGGGGMAGPPPPPFMGLRPESSVPTSVPFTPSMFSERLAALESLFGRPPPS

U

796

FRPLVIAGLMREGRMGDLL--FPNULPPSTWSGFP---------~-------~-----

**

*

**

*

*

*

* * *

. ..

* *

*

* .*

*

FIG. 2. Comparison between the T- and W-encoded proteins. Each amino acid is represented by a single letter. Asterisks indicate identical amino acids. Dots indicate conservative amino-acid changes. The consensus sequences conserved in RNA-dependent RNA polymerases from (+) strand and double-stranded RNA viruses are indicated with a closed square under the amino-acid sequences. Boxed are the A-D motifs (14) that include the most-conserved regions among these polyrnerases. The boxed regions indicated as 1, 2, and 3 are only conserved between the T- and W-encoded proteins. (Reproduced with permission from 5)

795

YEAST T A N D W RNAS

161

and second nucleotides unchanged and the third nucleotide changed. These conservative changes at the nucleotide level suggest that T and W are evolutionarily related and that there exists a selective pressure to maintain functionally similar proteins through evolution. Outside this region, there is almost no homology at the nucleotide level. Analysis of T and W proteins codon usage reveals some interesting features. It has been suggested that codon usage is an important factor directly influencing the expression level of a gene (17). In S. cerevisiae, highly expressed genes have an extremely high codon bias, as first noted with the gene encoding the glyceraldehyde-3-phosphate dehydrogenase (18). Highly expressed genes use, almost exclusively, just 25 of 61 sense codons. In contrast, in genes expressed at lower levels, all 61 sense codons are utilized, and codon usage is generally less biased (17). T and W proteins use all 61 sense codons and their codon bias indexes are close to 0, consistent with the evidence that these proteins are expressed at a low level. T and W protein codon usages are significantly different from that of the host (either highly expressed or lowly expressed genes). This is due, in part, to their different nucleotide compositions (the GC content of yeast genes is about 38-40% on average, whereas T and W have almost a GO% GC content). This high GC content may be a remnant of old days, when yeast cells acquired T and W as intracellular parasites. Alternatively, T and W may require a high GC content in their genomes to encode a protein rich in particular amino acids, such as arginine or proline. T and W proteins are very rich in arginine (in fact, arginine is the residue most frequently found in both proteins). 
Arginine accounts for more than 10% (W) and 11% (T) of the residues; on average, in a total of 575 yeast protein sequences analyzed (19), arginine accounted for 4.4% of the residues, and if the entire PIR sequence database is considered, the figure is about 5.2% (20). Arginine-rich regions have been implicated in the binding of proteins to RNA (21). They are present in a subfamily of RNA-binding proteins that comprises the anti-terminator N proteins of bacteriophage λ, the E. coli phage 21, and Salmonella phage P22 (22), in addition to some ribosomal proteins and RNA-virus proteins. In the case of W, one arginine-rich stretch is found between amino acids 327 and 333 (RRERRLR) (4). In T, there are also several stretches of amino acids rich in arginine (5). One of these stretches is present in the C-terminal third of the protein (739-RSRERK-745), and the C-terminal 319-amino-acid fragment (from glutamine 617 to glycine 935), when expressed in E. coli, has single-stranded RNA-binding activity (L. M. Esteban, T. Fujimura and R. Esteban, unpublished results). This binding activity may be involved in the recognition of the RNA by the RNA polymerase (see Section VIII,B). Both proteins are also rich in proline (more than 7% of the residues). In T


protein, part of these residues forms tandem repeats close to the C-terminus (Fig. 1). In W, they are scattered throughout the sequence. The biological significance of this high proline content is not known (yeast proteins average about 4.4% proline, and other RDRPs from RNA viruses are about 5% proline). A high content of proline appears in a number of proteins with no apparent common function.
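The composition figures discussed above (GC content, arginine or proline content, and the number of distinct sense codons used) are simple counts over a sequence. A minimal Python sketch of such counts follows. The function names and the short toy sequences are hypothetical, invented for illustration only; they are not T or W sequence data.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (DNA or RNA)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def residue_fraction(protein: str, residue: str) -> float:
    """Fraction of positions in a one-letter protein sequence held by `residue`."""
    return protein.upper().count(residue.upper()) / len(protein)


def codon_counts(orf: str) -> Counter:
    """Count in-frame codons of an ORF; a trailing partial codon is ignored."""
    orf = orf.upper().replace("U", "T")
    usable = len(orf) - len(orf) % 3
    return Counter(orf[i:i + 3] for i in range(0, usable, 3))


# Toy inputs, invented for illustration only (not T or W sequence).
toy_orf = "ATGCGCCGGCCCAGGTAA"   # Met-Arg-Arg-Pro-Arg-stop
toy_protein = "MRRPR"

print(f"GC content: {gc_content(toy_orf):.2f}")
print(f"Arg fraction: {residue_fraction(toy_protein, 'R'):.2f}")
print(f"Distinct codons used: {len(codon_counts(toy_orf))}")
```

Applied to a complete ORF and its translation, the same counts would give figures of the kind quoted in the text. The codon bias index itself (17) additionally requires a reference set of preferred codons, which is not sketched here.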

C. Relationship to Other Viral RNA Polymerases

Conserved amino-acid-sequence motifs have been identified in the RNA polymerase genes of many RNA viruses. Indeed, the presence of such motifs is used to identify putative polymerase genes in newly sequenced viral genomes. When the interval of about 150 amino acids conserved between the T- and W-encoded proteins, including the four motifs (A to D), was compared with similar intervals from other RDRPs from (+) strand and double-stranded RNA viruses, some interesting features emerged. The computer program CLUSTAL (23) was used to align the sequences; the dendrogram obtained with the branching order is shown in Fig. 3. It contains 38 protein fragments, whose sizes vary from 121 amino acids for the MS2 and GA bacteriophage replicase β subunits to 290 amino acids for the Hav c putative RNA polymerase (Table II). Polymerases included in the comparison were from (+) strand RNA viruses (25 members), dsRNA viruses (7), and RNA coliphages (4), in addition to the T and W ORFs. In the (+) strand RNA viruses, there are members belonging to three different superfamilies: (i) the poliovirus-like group; (ii) the sindbisvirus-like group; and (iii) the luteovirus-like group (24). The branching order in the dendrogram shown in Fig. 3 indicates that T and W polymerases form one of the outgroups, clustered together with the RNA coliphage polymerases and separated from the rest of the (+) strand and double-stranded RNA viruses. The similarity of the residues in the conserved motifs among the T and W proteins and the coliphage RNA polymerases has been observed (4, 5) (Fig. 4). Recently, several investigators (11, 16, 25) have proposed different classifications of RNA viruses, taking into account not only the central domains (A to D) of Poch et al. (14), but including an additional region from the N-terminal (16) or C-terminal (11) portion of the putative RNA polymerases.
The difference in the methods used for the alignment generation has given cluster dendrograms with a different branching order; both authors, however, concluded that bacteriophages were the outermost group in the tentative phylogenetic trees. In the dendrogram shown in Fig. 3, the seven members of dsRNA viruses that include viruses from fungi (such as the L-A virus from S. cerevisiae and Hav c virus from the chestnut blight fungus Cryphonectria


FIG. 3. Dendrogram of dsRNA and (+) strand RNA viruses based on RDRP similarity (the sizes of the fragments and their symbols are indicated in Table II). [Dendrogram not reproduced; labeled groups: T and W RNAs, RNA coliphages, plus-strand poliovirus-like group, double-stranded RNA viruses, plus-strand Sindbisvirus-like group, plus-strand luteovirus-like group.]

parasitica), the bacteriophage Phi6, and animal viruses, are grouped together and separated from the T- and W-encoded polymerases, even though T and W are dsRNAs. As will be mentioned below (Section III), T and W have single-stranded RNA counterparts, 23-S and 20-S RNAs, respectively. These RNAs have (+) strand polarity. It is not known, at present, whether the double-stranded forms or the single-stranded ones are the genomic RNAs. Thus, the close relationship of T and W RNA polymerases with those of the (+) strand single-stranded linear coliphages suggests that the (+) strands (23-S and 20-S RNAs) could be the genomic strands and that T and W dsRNAs


TABLE II
VIRUSES WHOSE RNA POLYMERASES ARE ANALYZED IN THE DENDROGRAM

Virus                          Abbreviation  Host      Genome      Interval   Reference
S. cerevisiae T                T             Fungi     dsRNA       386-531    5
S. cerevisiae W                W             Fungi     dsRNA       350-492    4
Bacteriophage Qβ               Qβ            Bacteria  (+) strand  269-391    24
Bacteriophage SP               SP            Bacteria  (+) strand  266-388    46
Bacteriophage MS2              MS2           Bacteria  (+) strand  251-371    62
Bacteriophage GA               GA            Bacteria  (+) strand  249-369    45
Poliovirus type 1              Polio         Mammal    (+) strand  1973-2109  63
Coxsackievirus                 Coxv          Mammal    (+) strand  1948-2085  64
Human rhinovirus 14            Hrv 14        Mammal    (+) strand  1944-2079  65
Foot-and-mouth disease         Fmdv          Mammal    (+) strand  2095-2234  66
Encephalomyocarditis           Emc           Mammal    (+) strand  2057-2196  67
Theiler's murine encephalitis  Tmev          Mammal    (+) strand  2069-2208  68
Hepatitis A                    Hav           Mammal    (+) strand  1974-2123  69
Cowpea mosaic                  Cpmv          Plant     (+) strand  1427-1583  70
Tobacco etch                   Tev           Plant     (+) strand  2519-2666  71
Tobacco vein mottle            Tvmv          Plant     (+) strand  2464-2615  72
Infectious bursal disease      Ibdv          Bird      dsRNA       407-580    73
Bacteriophage Phi 6            Phi6          Bacteria  dsRNA       330-557    74
Reovirus                       Reo           Mammal    dsRNA       582-843    75
C. parasitica virus            Hav c         Fungi     dsRNA       1905-2194  59
S. cerevisiae L-A              L-A           Fungi     dsRNA       1108-1338  8
Bovine rotavirus               Rot           Mammal    dsRNA       512-690    76
Bluetongue                     Btv           Mammal    dsRNA       630-804    77
Turnip yellow mosaic           Tymv          Plant     (+) strand  1570-1692  78
West Nile                      Wnv           Mammal    (+) strand  3053-3218  79
Yellow fever                   Yfv           Mammal    (+) strand  3033-3198  80
Tobacco mosaic                 Tmv           Plant     (+) strand  1377-1508  81
Tobacco rattle                 Trv           Plant     (+) strand  1447-1578  82
Brome mosaic                   Bmv           Plant     (+) strand  457-586    83
Cucumber mosaic                Cucmv         Plant     (+) strand  506-635    84
Alfalfa mosaic                 Almv          Plant     (+) strand  522-654    85
Sindbis                        Sinv          Mammal    (+) strand  746-877    86
Semliki Forest                 Sfv           Mammal    (+) strand  2180-2311  87
Beet necrotic yellow vein      Bnyvv         Plant     (+) strand  1833-1962  88
Barley yellow dwarf            Bydv          Plant     (+) strand  559-703    89
Carnation mottle               Carmv         Plant     (+) strand  464-600    90
Cucumber necrosis              Cnv           Plant     (+) strand  521-656    91
Cymbidium ringspot             Cyrv          Plant     (+) strand  521-656    92

FIG. 4. Conserved residues between the T- and W-encoded polymerases (segments A-D) and coliphage replicase β subunits. Coliphages belong to the following groups: MS2 (group I), Ga (group II), Qβ (group III), and SP (group IV). Sequences are from references 45 and 46. ■, Consensus residues of RNA polymerases from (+) strand RNA viruses (14). Asterisks (*) represent identical residues, and dots represent conservative amino-acid changes in the included sequences. Identical residues between T- and W-encoded proteins are indicated in boldface. [Sequence alignment of segments A-D for T, W, Qβ, SP, Ga, and MS2 not reproduced.]

are either replicative forms or dead-end products of 23-S and 20-S RNA replication cycles (see Section V). Until the replication cycles for T and W are established, this question remains open.

III. Single-Stranded RNA Counterparts of T and W dsRNAs

A. The (+) Strand of W dsRNA Is Identical to 20-S RNA

As mentioned above (Section II,A), W dsRNA is not related to either yeast DNA or other dsRNAs in S. cerevisiae strains. However, W (+) strand-specific probes hybridize with a single-stranded RNA species previously known as 20-S RNA (26, 27). This 20-S RNA was first described as an RNA species whose copy-number was greatly induced under the nitrogen starvation conditions necessary for sporulation of diploid yeast cells. This molecule was later shown (28) not to be related to the sporulation process, because its copy-number also increases in haploid cells that do not undergo sporulation. The 20-S RNA is also cytoplasmically inherited (29). Mating strains that carry 20-S RNA with those that are 20-S RNA- produces diploid strains whose progenies, after meiosis, are all 20-S RNA+. This non-Mendelian segregation pattern suggested that 20-S RNA is an independent replicon. In 1990, Matsumoto et al. (30) obtained random cDNA clones that covered part of the 20-S RNA sequence and showed that 20-S RNA is not related either to L-A and M1 dsRNAs or to the yeast DNA genome. In Northern hybridization experiments, W (+) strand-specific probes recognized 20-S RNA (Fig. 6A). Yeast strains that did not carry 20-S RNA did not show any hybridization signal. W (-) strand-specific probes showed a slight hybridization signal, but only after a prolonged exposure. From the Northern experiments, it was estimated that less than 1% of the 20-S RNA species have the same polarity as W (-) strands and the rest have W (+) strand polarity (4). The relationship between W and 20-S RNA was studied further by analyzing their sizes in a glyoxal-denaturing agarose gel. In the gel, denatured W strands and 20-S RNA had the same mobility, which corresponded to that of a single-stranded RNA molecule about 2.5 kb in size, in good agreement with W sequencing data (see Section II,A) (Fig. 1). W (+) strands and 20-S RNA also had the same mobility in an acrylamide strand-separation gel, where W (+) and (-) strands can be separated. Finally, in a 7-M urea acrylamide denaturing gel, denatured W and 20-S RNA also comigrated (N. Rodriguez-Cousiño and R. Esteban, unpublished data). These results, taken together, suggest that 20-S RNA is identical to the (+) strands of W. This was confirmed, at the nucleotide level, with the 20-S RNA nucleotide sequence determined independently (31). The authors sequenced 2479 nucleotides from 20-S RNA using random cDNA clones [the known W (+) strand sequence is, so far, 2505 nucleotides]. The 20-S RNA sequence matches the W (+) strand sequence except for a single mismatch: a C at 20-S RNA position 2082, where there is a T in the W cDNA sequence. However, changes of nucleotides also occur even in cDNA clones synthesized from the same RNA preparation.
For example, in the W cDNA sequencing analysis of independently isolated clones, three different nucleotide substitutions have been observed. These changes at the nucleotide level, however, do not modify the amino-acid sequence of the protein encoded in W (4).

B. T Also Has a Single-Stranded RNA Counterpart, 23-S RNA

The similarities between the T- and W-encoded RNA polymerases (Fig. 2) that suggest a common origin for these RNAs led to an investigation of the existence of a molecule similar to 20-S RNA as the T single-stranded RNA counterpart. With a T (+) strand-specific probe, such a molecule was found. It was present only in T-carrying strains and its mobility in native agarose gels was between 20-S RNA and 26-S ribosomal RNA (52) (Figs. 5 and 6B). This molecule was characterized further by electrophoretic analysis (glyoxal


FIG. 5. Ethidium-bromide-stained agarose gel of total nucleic acids prepared from two yeast strains grown in conditions that induced accumulation of 23-S and 20-S RNAs (shift to 1% potassium acetate after growth in complete medium for 2 days). Lane 1 is a strain containing T and W. Lane 2 is a strain containing neither T nor W. The T and W dsRNAs present in the strain analyzed in lane 1 are not detected under these conditions due to their low copy-number. [Gel image not reproduced; labeled bands: L dsRNA, DNA, 23-S RNA, 20-S RNA, 26-S rRNA, 18-S rRNA.]

agarose gels and acrylamide strand-separation gels). In both gel systems, this new molecule comigrated with T (+) strands. Due to its sedimentation coefficient of 23 S in sucrose density gradients, it was named "23-S RNA." The nucleotide sequence of 23-S RNA has not been determined directly. However, from the existing information about the relationship between W and 20-S RNA (4), and the similarity between T and W (5), 23-S RNA was proposed to be the same as the (+) strand of T dsRNA. In fact, three different T (+) strand-specific probes, covering different regions of the T sequence, hybridize with 23-S RNA, suggesting that all the sequences of the T (+) strand are present in 23-S RNA (5). Since 20-S RNA is cytoplasmically transmitted (29), it was examined whether 23-S RNA is inherited in a similar way. Two standard criteria for cytoplasmic inheritance in yeast cells were followed: (i) in a 4:0 meiotic


FIG. 6. Northern blots of an agarose gel. Total nucleic acids were prepared from different yeast strains. The same blotted sheet was probed with a W (+) strand-specific probe (A) or a T (+) strand-specific probe (B). Lanes indicated as W or T contain a mixture of purified T and W dsRNAs. Lane 1 is a strain with only W, lanes 2 and 3 are strains that contain both T and W dsRNAs, and lane 4 is a strain with neither T nor W. [Blot images not reproduced; labeled bands: T, W, 20-S RNA, and 23-S RNA.]

segregation, all meiotic segregants receive the same genetic information when it is cytoplasmically located, in contrast to a 2:2 segregation for a chromosomal location, and (ii) the ability to be transferred from one strain to another by cytoplasmic mixing (cytoduction) using kar1-1 mutants defective in nuclear fusion (32). The 23-S RNA fulfills both criteria. The four meiotic segregants of diploid strains formed by mating two haploid strains, one of which was 23-S RNA+ (and T+) and the other 23-S RNA- (and T-), all carried 23-S RNA (and T dsRNA) (5), indicating that it is cytoplasmically inherited. Furthermore, when the same parental strains were used for a cytoplasmic mixing experiment, all the cytoductants analyzed received both 23-S RNA and T dsRNA (5). Synthesis of 20-S RNA has been reported to be subject to two types of regulation: (i) it is induced under such stress conditions as heat shock and starvation (26-28, 31), and (ii) its copy-number is negatively regulated by the chromosomal SKI2 and SKI8 gene-products (30). Analogously, the 23-S RNA copy-number also increases at least 10-fold when the cells are grown at high temperature (37°C) as compared to growth at 30°C; the effect of high temperature on 23-S RNA copy-number parallels the increase of T dsRNA copy-number under the same growth conditions (3). Growth of cells on 1% potassium acetate (the nitrogen starvation conditions usually used to induce sporulation) also increases the 23-S RNA copy-number 10- to 100-fold as compared to vegetative growth in rich media. In both cases, the best estimate of the highest 23-S RNA copy-number is between 10^4 and 10^5 copies per cell, with small variations depending on the strains used. In general, 23-S RNA and 20-S RNA copy-numbers are similar in the strains that carry both molecules. The effect of SKI genes on 23-S RNA copy-number has not yet been examined.

IV. Configuration of W and 20-S RNA

A. Circular RNA Genomes

Circular single-stranded RNA genomes have been identified as pathogens of plants and, at least in one case, of animals. They belong, mainly, to four groups: (i) viroids; (ii) virusoids; (iii) encapsidated linear satellite RNAs, which also exist as non-encapsidated circular RNA forms; and (iv) hepatitis delta virus (33). All of them seem to replicate by a rolling-circle mechanism that involves, with some variations, a self-cleavage and self-ligation step to produce unit-length monomer circles from concatemeric (-) or (+) strands. None of them encodes an RDRP. Other variants of circular RNAs are a Neurospora mitochondrial plasmid transcript called VS (34, 35), and the transcript of a nuclear satellite DNA of a newt (36). Recently, it was proposed that 20-S RNA is also a circular replicon (30). As discussed in Section III,A, however, the electrophoretic behavior of 20-S RNA and W (+) strands is identical and there is some evidence supporting linearity in W dsRNA, so the proposal of 20-S RNA circularity is not consistent with these experimental data. There are also some differences between 20-S RNA and the circular genomes mentioned above. None of those small plant pathogens or even the hepatitis delta virus that has a larger genome (1.7 kb) encodes its own RDRP, whereas 20-S RNA (and 23-S RNA) does. Furthermore, unlike hepatitis delta virus (37) and the viroids (38), 20-S RNA and 23-S RNA apparently do not form rod-like structures. Rather, they have highly branched strong secondary structures. Since the structures of 20-S RNA and W dsRNA not only are important from a phylogenetic point of view, but are also essential to an understanding of their replication mechanism, this discrepancy is discussed in more detail with the data so far available.

B. Evidence for Linearity of W dsRNA

W dsRNA can be labeled at its 3' ends with [32P]pCp and T4 RNA ligase, which indicates that it has free OH groups at its 3' ends as acceptors (39). The labeling with 32P is stoichiometric, that is, more than 90% of the 3' ends of the molecules become labeled and both (+) and (-) strands receive roughly the same amount of labeling (40). These results eliminate the possibility that W dsRNA might have a covalently closed circular structure. The analysis of the terminal nucleotides by thin-layer chromatography after digesting the labeled strands with T2 nuclease showed that more than 70% of the (+) strands have C residues at the 3' ends, and about 30% have an A. The (-) strands had a similar terminal nucleotide composition at the 3' ends (40). This result may reflect the existence of heterogeneity in W 3' ends, or alternatively it may be due to a non-encoded posttranscriptional addition of an extra adenine to the newly synthesized RNA, which is seen in many dsRNA and (+) single-stranded RNA viral genomes (41, 42). L-A dsRNA, another dsRNA present in most yeast strains, has either a paired C or an unpaired A at its 3' ends (43, 44). The sequence of the W (+) strand 3' end thus was suggested to be . . . CC-OH or . . . CCA-OH (40), quite similar to the 3' end terminal sequences found in other (+) strand or double-stranded RNA viruses, such as RNA coliphages (. . . CCCA-OH) (45, 46), or to tRNA 3' ends (. . . CCA-OH). The same terminal composition was also found in T dsRNA; about 85-90% of both T 3' ends are C and only a minor 10-15% are A (L. M. Esteban and R. Esteban, unpublished results), consistent with the close relationship between these two RNAs. When the 3' end nucleotide sequence of W (+) strands was directly analyzed by reverse transcription, the synthesis of the DNA complementary to W ended at a unique point, only 10 nucleotides beyond the sequence determined previously in several independently isolated W cDNA clones synthesized by random priming (4). This may reflect some strong secondary structure on the template that blocked the advance of the reverse transcriptase. However, since only a few nucleotides at the (+) strand 3' end appear to remain undetermined, this ending point of reverse transcription is more likely to be the actual 3' end of W.

FIG. 7. RNase-H treatment of 20-S RNA. (1) Diagram of two possible 20-S RNA configurations (linear and circular), and the results expected from RNase-H digestion. The 20-S RNA was annealed with a synthetic oligodeoxynucleotide (indicated as olig. V) and digested with RNase H. If 20-S RNA was linear, two RNA fragments of about 0.9 and 1.6 kb would be formed; if 20-S RNA was circular, only one 2.5-kb RNA molecule would be obtained. (2) Diagram of the W cDNA, the site where the oligodeoxynucleotide hybridizes, and the regions that probes a, b, and c recognize. (3) Proof that 20-S RNA is linear. RNase-H-digested 20-S RNA samples were separated either in a glyoxal-denatured agarose gel (A-C) or in a 7-M urea denatured 6% acrylamide gel (D). Then RNA was blotted onto nylon membranes and hybridized with the probes a, b, or c shown in 2. Lane 2, RNase-H-treated sample; lane 1, same as lane 2 except that RNase H was omitted from the reaction (mock experiment); lane C, original untreated sample. Note that in A, B, and C the same nylon membrane was used to hybridize with probes a, b, and c, respectively. In A and D we used the same probe (a). Molecular markers were L-A (4.6 kb) and M1 (1.8 kb) dsRNAs. (Reproduced from 40 by permission of Oxford University Press.) [Diagrams and blots not reproduced; labels include W cDNA (2505 bp) and olig. V at nucleotides 915-937.]

C. 20-S RNA Is Linear

The linearity of 20-S RNA was directly demonstrated by the site-directed cleavage of 20-S RNA with RNase H, as shown in Fig. 7. This enzyme cuts only the H-bonded RNA in a DNA·RNA base-paired region (47, 48). If 20-S RNA is linear, cleavage under such conditions will generate two RNA fragments. If 20-S RNA were circular, the cleavage at a single site should create a unique RNA product (40). The experiment designed to analyze whether 20-S RNA is circular or linear is outlined in Fig. 7-1. The 20-S RNA was annealed with a synthetic oligonucleotide (olig. V in Fig. 7), complementary to W (+) strand sequence from nucleotides 915 to 937, and then digested with RNase H. Samples were then separated in a denaturing agarose gel or in a denaturing acrylamide gel. After the RNA was transferred onto nylon membranes, 20-S RNA fragments were identified with different RNA probes that were complementary to various parts of 20-S RNA (Fig. 7-2). Probe a, which recognizes W (+) strand sequences upstream and downstream from the cleavage site, detected two discrete RNA fragments produced by the RNase H treatment. The untreated original sample and a mock experiment without RNase H gave only one band, uncut 20-S RNA. The lengths of the two fragments were estimated to be 1.6 and 0.9 kb with denatured RNA markers. The other RNA probes (b and c) each detected only one of the two fragments. The mobility of 20-S RNA in the mock experiment was the same as that of the untreated sample, thus ruling out the possibility that linear 20-S RNA molecules were created by self-cleavage during the experiment due to the conditions used for RNase H digestion. From these results, it was concluded that 20-S RNA is linear (40). As mentioned above, however, 20-S RNA had been proposed to be circular (30). The proposal was based on three lines of evidence. (i) It was claimed that 50% of 20-S RNA molecules were circular when examined by electron microscopy. However, the authors had crosslinked 20-S RNA with the T4 gene-32 protein to spread the RNA. Therefore, the circular molecules observed may have been an artifact caused by the procedure used.
Alternatively, in the case of other viruses, like the infectious pancreatic necrosis virus, proteins covalently attached to the 5' ends of the dsRNA genome can circularize the linear dsRNA, and this circular form might have disappeared after proteinase K treatment (49). (ii) It was claimed that 20-S RNA showed abnormal mobility in a two-dimensional gel, specifically in an acrylamide gel under denaturing conditions. As shown in Fig. 7-3, however, 20-S RNA showed no abnormal mobility in a denaturing agarose gel or in a 7-M urea acrylamide gel. Therefore, it is likely that the 20-S RNA or RNA markers used were not properly denatured in the two-dimensional gel electrophoresis (30). (iii) The third line of evidence was the inability to label 20-S RNA 5' and 3' ends with T4 polynucleotide kinase and T4 RNA ligase, respectively. However, the difficulty of labeling free 3' end -OH groups of single-stranded RNA with T4 RNA ligase is well known (50). In this context, it should be noted that the last 24 nucleotides of the W (+) strand 3' end can be folded, forming a stem-loop structure with a free energy of ΔG = -25 kcal/mol (51). This strong secondary structure thus may affect the labeling reaction with T4 RNA ligase. With respect to the 5' end, there still remains the possibility of a small protein moiety attached to it, which could explain the inability to label 20-S RNA with T4 polynucleotide kinase. Because the electrophoretic behavior of 23-S RNA is identical to that of T (+) strands, and because of the existing parallelism between T and W dsRNAs and their single-stranded RNA counterparts, it is proposed that 23-S RNA is also linear (5).
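The logic of the RNase H experiment can be restated as a small calculation: a single site-directed cut yields two fragments from a linear molecule but one unit-length fragment from a circle. The following Python sketch is illustrative only. The function name is hypothetical; the length of 2505 nucleotides is the known W (+) strand sequence, and placing the cut at nucleotide 926 (the midpoint of the 915-937 oligonucleotide site) is an assumption, since RNase H may cut anywhere within the hybrid region.

```python
from typing import List


def rnase_h_fragments(length: int, cut_sites: List[int], circular: bool) -> List[int]:
    """Fragment lengths (nt) after cleavage at each position in cut_sites.

    A cut site p means the chain is broken after nucleotide p (1-based).
    """
    sites = sorted(set(cut_sites))
    if not sites:
        return [length]
    if circular:
        # n cuts in a circle give n fragments; the last one wraps past the origin.
        gaps = [b - a for a, b in zip(sites, sites[1:])]
        gaps.append(length - sites[-1] + sites[0])
        return gaps
    # n cuts in a linear molecule give n + 1 fragments.
    bounds = [0] + sites + [length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


W_LENGTH = 2505   # known W (+) strand sequence length (nt)
CUT_SITE = 926    # assumed cut position, midpoint of nucleotides 915-937

print(rnase_h_fragments(W_LENGTH, [CUT_SITE], circular=False))  # [926, 1579]
print(rnase_h_fragments(W_LENGTH, [CUT_SITE], circular=True))   # [2505]
```

Under these assumptions the linear case gives fragments of about 0.9 and 1.6 kb, the sizes observed with probe a, whereas the circular case gives a single 2.5-kb product.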

V. T and W Replication Cycles

Two possible replication pathways for W dsRNA and 20-S RNA (4) or for T and 23-S RNA (5) are summarized in Fig. 8. One resembles the model of RNA coliphages, such as Qβ (Fig. 8B); that is, 20-S or 23-S RNAs replicate to produce the (-) strand single-stranded RNAs, which are transcribed to make 20-S or 23-S RNA genomic strands. In this context, 20-S or 23-S RNAs are the genomic RNAs, and W and T dsRNAs are by-products or dead-end products originated by the annealing of 20-S or 23-S RNAs and their complementary strands. These reactions could be accelerated by growing the cells at 37°C, the temperature that favors accumulation of T and W dsRNAs (3). Furthermore, it should also be noted that the putative RNA polymerase consensus sequence of T or W is related most closely to those of the coliphage RNA polymerases (Fig. 4). The other model proposes that both W dsRNA and 20-S RNA, or T and 23-S RNA, are constituents of the same replication cycle; that is, W and T dsRNAs are transcribed to make 20-S or 23-S RNAs, and then the (+) single-stranded RNAs replicate to produce new T or W dsRNA (Fig. 8A). This second pathway is consistent with the evidence that the (-) strands of T or W have no coding capacity for proteins and that the (-) strands in 23-S or 20-S RNA preparations have been estimated to be less than 1% of the (+) strands. Growth of cells in the sporulation medium may deregulate the transcription (or repress the replication), resulting in the accumulation of the single-stranded forms, 23-S and 20-S RNAs. In either case, their replication mechanism should be quite different from those of other encapsidated dsRNA viruses, such as L-A or L-BC found in the same host, S. cerevisiae, since neither the double-stranded nor the single-stranded form seems to be encapsidated into viral particles (see Section VI). A rolling-circle mechanism for 20-S RNA replication was suggested by Matsumoto and Wickner (31).
They claimed that there are minor dimer-length molecules with (+) or (-) strand polarity. Based on the assumption that 20-S RNA was circular, and in analogy with viroids, they suggested that such dimer molecules self-cleave and ligate to form monomer circles.


FIG. 8. (A and B) Two hypothetical replication pathways for W and 20-S RNA (see text for explanations). The same replication pathways can be applied to T and 23-S RNA. [Diagram not reproduced; labels: W dsRNA, transcription, W (+) strand = 20-S RNA, replication, W dsRNA (by-product), side reaction favored at high temperature.]

VI. T and W Are Not Encapsidated into Viral Particles

The first report about T and W dsRNAs concluded that these RNAs are not encapsidated in viral particles (3). This conclusion was based on the sedimentation rates of the RNAs in sucrose gradients. T and W sediment more slowly than expected for RNAs associated with a protein capsid, since most of the mass of the small RNA viruses is provided by the proteins in the


capsid (52). Furthermore, their sedimentation rates in the gradient do not change when the RNAs are treated with phenol to denature proteins prior to centrifugation. With respect to the single-stranded counterparts, it was reported in 1978 (28) that 20-S RNA forms a 32-S ribonucleoprotein particle consisting of one 20-S RNA molecule and 18-20 identical 23-kDa protein subunits. Recently, however, it has been shown (53) that the sedimentation pattern of 20-S RNA in sucrose gradients does not change even after treatment with phenol. It was also observed that the 23-kDa protein is, in fact, the heat-shock protein hsp26 induced in the same starvation conditions under which 20-S RNA accumulates. Subsequently, it was found (54) that hsp26, when expressed in high amounts, forms aggregates of about 5 x 10^5 Da that cosediment in a sucrose density gradient with 20-S RNA. Since hsp26 shows some homology with the VP2 poliovirus coat protein, which also has the ability to self-assemble (55), it was speculated that hsp26 could have been the coat protein of 20-S RNA, but that subsequently, in the course of evolution, this RNA adapted its mode of inheritance, not requiring packaging into an infectious virion since transfer could be simply and effectively mediated via cell-to-cell contact and plasmogamy as part of the mating process (54). This idea, however, does not explain why this putative "coat protein" for 20-S RNA is now encoded in the yeast genome and not in the virus genome, and why it has been maintained in the cell. The acquisition of some function important for the host could have preserved it through evolution. At any rate, 20-S RNA is well-maintained in hsp26-negative cells, and the presence or absence of hsp26 does not affect the mobility of 20-S RNA in sucrose gradients (53). Therefore, there is no positive evidence supporting physical interactions between 20-S RNA and hsp26.
The sedimentation behavior of 23-S RNA in sucrose gradients does not change even after phenol treatment (L. M. Esteban and R. Esteban, unpublished). Therefore, like 20-S RNA, 23-S RNA seems not to be encapsidated. The only other “naked” single-stranded RNAs that have so far been described are the viroids. However, viroids have small circular genomes (about 350 nucleotides in size) that form rod-like structures due to extensive self-complementarity in their nucleotide sequences (56), quite different from T and W or their single-stranded counterparts.
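The masses quoted in this section can be compared directly. The short Python sketch below only multiplies the reported subunit number (18-20) by the reported subunit mass (23 kDa) and compares the product with the reported size of hsp26 aggregates (about 5 x 10^5 Da). The function and constant names are hypothetical, and no sedimentation behavior is modeled.

```python
def shell_mass_kda(n_subunits: int, subunit_kda: float) -> float:
    """Total protein mass (kDa) of n identical subunits."""
    return n_subunits * subunit_kda


SUBUNIT_KDA = 23.0      # subunit mass reported for the 32-S particle (28)
AGGREGATE_KDA = 500.0   # about 5 x 10^5 Da, reported for hsp26 aggregates (54)

for n in (18, 20):
    mass = shell_mass_kda(n, SUBUNIT_KDA)
    ratio = mass / AGGREGATE_KDA
    print(f"{n} subunits: {mass:.0f} kDa, {ratio:.2f} of aggregate mass")
```

Both products (414 and 460 kDa) are of the same order as the reported aggregate size. The comparison is arithmetic only and says nothing about whether hsp26 and 20-S RNA interact.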

VII. Are the RNA Polymerases Associated with the RNAs?

The nucleotide sequences of T and W dsRNAs indicate that each RNA encodes only one large protein. However, it is not known whether these


proteins undergo any posttranslational modification to produce smaller proteins with distinct functions. It is well known that, in picornaviruses, the original translational polyprotein undergoes proteolytic processing to generate different products, such as a protease, coat proteins, and RNA replicase. In order to study this possibility, we prepared polyclonal antibodies against different portions of the T- and W-encoded proteins (expressed in E. coli) and used them to analyze the proteins present in yeast extracts. Antibodies raised against the N-terminal 255 amino acids (from arginine 10 to glutamine 264) and the C-terminal 319 amino acids (from glutamine 617 to glycine 935) of the T protein both cross-reacted with a protein of about 104 kDa (p104) in crude extracts prepared from yeast cells. This protein was present only in T-carrying strains, and the ability to synthesize p104 was cotransmitted with T dsRNA from a T+ to a T-o strain by cytoduction. The N-terminus- and C-terminus-specific antisera recognized only p104 and no smaller proteins, suggesting that there was no proteolytic processing of this protein after translation. Consistently, no protease-like sequence has been assigned to the T-protein sequence. In the case of W, a protein of about 90 kDa is present in W-carrying strains and cross-reacts with W-specific antisera, in good agreement with W coding capacity (N. Rodriguez-Cousiño and R. Esteban, unpublished data). A further analysis of p104 showed that this protein is present in yeast cells in different growth phases and in cells grown at different temperatures (30 or 37°C). During vegetative growth, the amount of p104 is roughly the same as that in the cells induced for 23-S RNA accumulation (12-16 hours in 1% potassium acetate with no nitrogen source), even though there is an increase of at least 100-fold in 23-S RNA copy-number under the latter conditions (see above) (L. M. Esteban and R. Esteban, unpublished data).
Although T and 23-S RNA seem not to be encapsidated into viral particles, it does not necessarily mean that they are not associated with proteins. One appealing possibility is the association with their own RNA polymerase (p104) and it will be the subject of future analysis.

VIII. Relationship between T and W and Other Mycoviruses

The mycoviruses are intracellular viruses found in both unicellular and filamentous fungi. Most mycoviruses contain dsRNAs encapsidated in an icosahedral viral coat. They are transmitted by the cytoplasmic mixing that occurs during mating or during hyphal fusion (anastomosis). So far no natural extracellular route of transmission has been reported for these viruses. Although T and W dsRNAs (and their single-stranded RNAs) are not encapsidated into viral particles, these dsRNAs reside peacefully in yeast cells like other mycoviruses in their hosts. Several questions arise if we compare T and W with other mycoviruses: (i) is there any advantage or disadvantage for T and W in not being encapsidated? (ii) why, if they are intracellular parasites, do mycoviruses need coat proteins but T and W do not? and (iii) how are T and W RNAs protected against nucleolytic attack inside the cell? As discussed below, however, some of the features that distinguish these RNAs from other mycoviruses may be irrelevant; rather, the discussion suggests that T and W occupy a unique position among mycoviruses.

A. Advantages and Disadvantages of Not Being Encapsidated

Maintenance of a capsid structure that is not needed is costly for the cell. So, from the point of view of the host, energy and cellular precursors for protein synthesis could be saved by its elimination. In turn, an extracellular route of transmission would require the virus to evolve methods of leaving and re-entering cells. The viral RNAs need less genetic information to survive inside the cell: all the information that remains in these RNAs is for their own replication. Thus, the system seems to be very economical. This may explain why such high copy-numbers of 23-S and 20-S RNAs are found in yeast cells without any apparent deleterious effect. But the vulnerability of naked RNAs to RNases must have forced the viral RNA to adopt a highly organized spatial configuration (see Section VIII, B). This adaptation costs the virus its freedom in the choice of its nucleotide sequence for protein coding. This could be the reason that most mycoviral RNAs are indeed found encapsidated. Furthermore, this limitation may explain why viroids, which are naked RNAs with very organized structures, have no coding capacity for proteins.

B. Role of Coat Proteins in Mycoviruses

Capsids in mycoviruses may have several functions. They can protect the genomic RNAs from the surrounding environment, perhaps especially from the RNases present in the cytoplasm. Even though dsRNAs are relatively resistant to RNases under certain conditions, single-stranded RNAs are quite sensitive. The 20-S and 23-S RNAs are single-stranded RNAs, but have very strong secondary structures. The FOLD program (51) was used to predict the secondary structure of 20-S RNA. The result was a highly branched structure with ΔG = -1033 kcal/mol (31). The 23-S RNA also has a highly branched structure, with ΔG = -1085 kcal/mol (L. M. Esteban and R. Esteban, unpublished). In both cases, the free energies found were smaller than predicted for “shuffled” RNAs with the same GC content (58-59%); probably, these RNAs are tightly packed, with few regions exposed to nucleases; in this way they could be more resistant to RNases. This highly organized spatial configuration thus could substitute for a protein coat. However, the existence of some protein subunits helping to hold or protect the RNAs cannot be ruled out.

Mycoviruses have their RDRPs present in viral particles, which is essential for their replication cycles. Thus, the viral coat provides the space where the RNA polymerase and its template are highly concentrated. This high concentration could be critical for the polymerase to function. For example, in an in vitro system, the transcriptase of yeast L-A virus shows a low affinity (but high specificity) toward its template dsRNA (57). Therefore, it requires a high concentration of template RNA for the reaction. However, in normal circumstances, the enzyme sees only the template encapsidated in the same particle, so it does not require a high affinity toward the template.

The RNA polymerases of T and W are not encapsidated with their templates into viral particles. In the cytoplasm, a milieu crowded with other RNAs and macromolecules, these RNA polymerases should be able to find their templates. Our preliminary Northwestern experiments show that T’s putative polymerase (p104) has single-stranded RNA-binding activity, specific for T (+) and (-) strands (L. M. Esteban, T. Fujimura and R. Esteban, unpublished). So, if T and W RNA polymerases have sufficient affinities to find their templates in the cytoplasm, this property would fulfill part of the functions provided by protein coats. Furthermore, like other mycoviruses, T and W have no extracellular route of transmission. This also makes it unnecessary for them to have the viral coat or envelope, which most viruses need to leave or enter the host. Therefore, T and W may be such highly evolved mycoviruses that they do not require the viral coat but only their RNA polymerase for replication.
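The comparison with “shuffled” RNAs described in this section can be sketched as follows. The snippet builds composition-preserving shuffled controls (which therefore keep the same GC content) and expresses an observed folding energy as a z-score against the control energies. The example sequence, the energy values, and all function names are hypothetical; the folding energies themselves would have to come from a structure-prediction program such as the one cited as (51), which is not called here.

```python
import random
import statistics


def gc_content(seq):
    """Fraction of G and C residues in an RNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def shuffled_controls(seq, n, seed=0):
    """Return n random permutations of seq.

    Permuting the residues keeps the base composition, and hence the
    GC content, identical to that of the original sequence.
    """
    rng = random.Random(seed)
    controls = []
    for _ in range(n):
        bases = list(seq)
        rng.shuffle(bases)
        controls.append("".join(bases))
    return controls


def z_score(observed, control_values):
    """Distance of the observed value from the control mean, in
    units of the sample standard deviation of the controls."""
    mean = statistics.mean(control_values)
    spread = statistics.stdev(control_values)
    return (observed - mean) / spread


if __name__ == "__main__":
    seq = "GGCAUCGGAUCCGAUGCCAUUA"  # hypothetical sequence, not T or W
    for control in shuffled_controls(seq, 3, seed=1):
        print(control, round(gc_content(control), 3))
    # Hypothetical energies (kcal/mol) standing in for folding results.
    print(z_score(-10.0, [-4.0, -5.0, -6.0]))
```

A strongly negative z-score would correspond to the situation described in the text, read here as the native RNA folding more stably than shuffled sequences of the same composition.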

C. dsRNA Viral Genomes in Vesicles

There are some dsRNAs that are not encapsidated into viral particles; rather, they are associated with intracellular vesicles. The dsRNAs associated with hypovirulence in the chestnut blight fungus Cryphonectria parasitica (58) are one such example. The larger RNA (about 12.7 kb in size) has recently been cloned and sequenced (59). It encodes two polypeptides, one of which, after autocatalytic cleavage, could give rise to an RDRP. In fact, RNA polymerase activity appears to be associated with those vesicles (58). A similar case is the 16.7-kb dsRNA associated with the “447” male sterility trait in the plant Vicia faba. The dsRNA has been reported to be associated with membranous vesicles, which have a specific RDRP activity (60). Thus, vesicles could substitute for a viral coat in the sense that they can protect the sequestered viral RNA from RNases. And if the RNA polymerase and other proteins essential for viral replication are also sequestered in the same vesicles, the vesicles would provide the place where the RNA template and its replication enzymes are highly enriched. However, the physical properties of the vesicles would be quite different from those of a viral coat. This gives rise to questions concerning the traffic of materials through the vesicular membranes. For example, how are the viral sense strands transported to the cytoplasm to be translated by the ribosomes? How, in turn, are the translated proteins, including the RNA polymerase, incorporated into vesicles? These questions, in addition to the standard questions about their replication mechanism, remain to be answered.

IX. Evolutionary Origin of T and W

Figure 1 shows the genomic organization of T and W (+) strands and the regions conserved in their proteins. The regions boxed in both proteins are organized in the same linear order with similar spacing among them; in only one gap (between motifs D and 3) does T protein have a larger spacing segment than W. This raises the possibility that there was a common ancestor from which both T and W evolved. During the process, T might have gained extra nucleotide sequences by insertions, or alternatively W might have lost sequences by deletions. As mentioned above, the RNA polymerases of T and W are more closely related to those of RNA bacteriophages than to those of any other RNA viruses. Furthermore, the putative replication pathway shown in Fig. 8B is quite analogous to Qβ’s replication cycle (61). These lines of evidence may suggest that T and W arose from one original ancestor (+) strand virus that lost its infectious capacity and concomitantly its coat proteins; later on, it evolved into the two species that we find, at present, in yeast strains.

X. Conclusions and Perspectives

Cloning and sequencing of W dsRNA and 20-S RNA, performed independently by us and others (31), led to the unexpected finding that these RNAs were the dsRNA and (+) strand RNA forms of the same extrachromosomal genetic entity. W encodes its own RDRP, which is responsible for the RNA-to-RNA replication pathway. Cloning and sequencing of another yeast dsRNA, T, indicated that T’s putative polymerase has extensive homology to W’s polymerase. The further finding of 23-S RNA as T’s single-stranded RNA counterpart led us to propose a new RNA family in yeast. These RNAs are quite different from other dsRNA mycoviruses of the same host, since T and W are not encapsidated into viral particles.


The non-encapsidated nature of T and W raises our interest, particularly about their replication mechanisms and about how these RNAs are protected from host RNases within the cell. Since induced cells can easily accumulate more than 10,000 copies of 20-S and 23-S RNAs per cell, it is also interesting to investigate the mechanism and the metabolism which allow the accumulation of these RNAs in such large quantities. We hope that knowledge about these interesting RNA viruses will add another chapter to the versatile and fertile RNA world.

REFERENCES

1. R. B. Wickner, in “The Molecular Biology of the Yeast Saccharomyces” (J. R. Broach, E. W. Jones and J. R. Pringle, eds.), Vol. 1, pp. 263-296. CSHLab, Cold Spring Harbor, New York, 1991.
2. H. Bussey, Yeast 4, 17 (1988).
3. M. Wesolowski and R. B. Wickner, MCBiol 4, 181 (1984).
4. N. Rodriguez-Cousiño, L. M. Esteban and R. Esteban, JBC 266, 12772 (1991).
5. L. M. Esteban, N. Rodriguez-Cousiño and R. Esteban, JBC 267, 10874 (1992).
6. N. Skipper, D. Y. Thomas and P. C. K. Lau, EMBO J. 3, 107 (1984).
7. D. E. Georgopoulos, E. M. Hannig and M. J. Leibowitz, in “Extrachromosomal Genetic Elements in Lower Eukaryotes” (R. B. Wickner, A. Hinnebusch, A. M. Lambowitz, I. C. Gunsalus and A. Hollaender, eds.), pp. 203-213. Plenum, New York, 1986.
8. T. Icho and R. B. Wickner, JBC 264, 6716 (1989).
9. M. E. Diamond, J. J. Dowhanick, M. E. Nemeroff, D. F. Pietras, C. Tu and J. A. Bruenn, J. Virol. 63, 3983 (1989).
10. S. S. Sommer and R. B. Wickner, Cell 31, 429 (1982).
11. J. A. Bruenn, NARes 19, 217 (1991).
12. G. Kamer and P. Argos, NARes 12, 7269 (1984).
13. P. Argos, NARes 16, 9909 (1988).
14. O. Poch, I. Sauvaget, M. Delarue and N. Tordo, EMBO J. 8, 3867 (1989).
15. Y. Xiong and T. H. Eickbush, EMBO J. 9, 3353 (1990).
16. E. V. Koonin, J. Gen. Virol. 72, 2197 (1991).
17. J. L. Bennetzen and B. D. Hall, JBC 257, 3026 (1982).
18. J. P. Holland and M. J. Holland, JBC 254, 9839 (1979).
19. P. M. Sharp and E. Cowe, Yeast 7, 657 (1991).
20. J. F. Collins and A. F. W. Coulson, in “Molecular Evolution: Computer Analysis of Protein and Nucleic Acid Sequences” (R. F. Doolittle, ed.), pp. 474-487. Academic Press, San Diego, 1990.
21. D. Lazinski, E. Grzadzielska and A. Das, Cell 59, 207 (1989).
22. N. C. Franklin, JMB 181, 85 (1985).
23. D. G. Higgins and P. M. Sharp, Gene 73, 237 (1988).
24. N. Habili and R. H. Symons, NARes 17, 9543 (1989).
25. E. V. Koonin, G. H. Choi, D. L. Nuss, R. Shapira and J. C. Carrington, PNAS 88, 10647 (1991).
26. K. Kadowaki and H. O. Halvorson, J. Bact. 105, 826 (1971).
27. K. Kadowaki and H. O. Halvorson, J. Bact. 105, 831 (1971).
28. P. J. Wejksnora and J. E. Haber, J. Bact. 134, 246 (1978).


29. B. Garvik and J. E. Haber, J. Bact. 134, 261 (1978).
30. Y. Matsumoto, R. Fishel and R. B. Wickner, PNAS 87, 7628 (1990).
31. Y. Matsumoto and R. B. Wickner, JBC 266, 12779 (1991).
32. J. Conde and G. R. Fink, PNAS 73, 3651 (1976).
33. R. H. Symons, TIBS 14, 445 (1989).
34. B. J. Saville and R. A. Collins, Cell 61, 685 (1990).
35. B. J. Saville and R. A. Collins, PNAS 88, 8826 (1991).
36. L. M. Epstein and J. G. Gall, Cell 48, 535 (1990).
37. K. S. Wang, Q. L. Choo, A. J. Weiner, H.-J. Ou, R. C. Najarian, R. M. Thayer, G. T. Mullenbach, K. J. Denniston, J. L. Gerin and M. Houghton, Nature 323, 508 (1986).
38. R. A. Owens and R. W. Hammond, in “RNA Genetics: II. Retroviruses, Viroids and RNA Recombination” (E. Domingo, J. J. Holland and P. Ahlquist, eds.), pp. 107-125. CRC Press, Boca Raton, Florida, 1988.
39. N. P. Higgins and N. R. Cozzarelli, in “Methods in Enzymology” (R. Wu, ed.), Vol. 68, p. 50. Academic Press, New York, 1979.
40. N. Rodriguez-Cousiño and R. Esteban, NARes 20, 2761 (1992).
41. J. N. Bausch, F. R. Kramer, E. A. Miele, C. Dobkin and D. R. Mills, JBC 258, 1978 (1983).
42. W. A. Miller, J. J. Bujarski, T. W. Dreher and T. C. Hall, JMB 187, 537 (1986).
43. J. A. Bruenn and V. E. Brennan, Cell 19, 923 (1980).
44. D. J. Thiele, E. M. Hannig and M. J. Leibowitz, MCBiol 4, 92 (1984).
45. Y. Inokuchi, A. B. Jacobson, T. Hirose, S. Inayama and A. Hirashima, NARes 16, 6205 (1988).
46. Y. Inokuchi, R. Takahashi, T. Hirose, S. Inayama, A. B. Jacobson and A. Hirashima, J. Biochem. 99, 1169 (1986).
47. T. Kogoma, J. Bact. 166, 361 (1986).
48. R. Karwan and W. Wintersberger, JBC 263, 14970 (1988).
49. R. H. Persson and R. D. Macdonald, J. Virol. 44, 437 (1982).
50. T. E. England, A. G. Bruce and O. C. Uhlenbeck, in “Methods in Enzymology” (L. Grossman and K. Moldave, eds.), Vol. 65, p. 65. Academic Press, New York, 1980.
51. M. Zuker and P. Stiegler, NARes 9, 133 (1981).
52. R. E. F. Matthews, in “Plant Virology,” 3rd Ed. Academic Press, San Diego, 1991.
53. W. R. Widner, Y. Matsumoto and R. B. Wickner, MCBiol 11, 2905 (1991).
54. N. J. Bentley, I. T. Fitch and M. F. Tuite, Yeast 8, 95 (1992).
55. R. E. Susek and S. L. Lindquist, MCBiol 9, 5265 (1989).
56. T. O. Diener, Virology 45, 411 (1971).
57. T. Fujimura and R. B. Wickner, JBC 264, 10872 (1989).
58. D. R. Hansen, N. K. Van Alfen, K. Gillies and W. A. Powell, J. Gen. Virol. 66, 2605 (1985).
59. R. Shapira, G. H. Choi and D. L. Nuss, EMBO J. 10, 731 (1991).
60. A. Lefebvre, R. Scalla and P. Pfeiffer, Plant Mol. Biol. 14, 447 (1990).
61. C. Priano, F. R. Kramer and D. R. Mills, CSHSQB 52, 321 (1987).
62. W. Fiers, R. Contreras, F. Duerinck, G. Haegeman, D. Iserentant, J. Merregaert, W. Min Jou, F. Molemans, A. Raeymaekers, A. Van den Berghe, G. Volckaert and M. Ysebaert, Nature 260, 500 (1976).
63. V. R. Racaniello and D. Baltimore, PNAS 78, 4887 (1981).
64. A. M. Lindberg, P. O. K. Staalhandske and U. Pettersson, Virology 156, 50 (1987).
65. G. Stanway, P. J. Hughes, R. C. Mountford, P. D. Minor and J. W. Almond, NARes 12, 7859 (1984).
66. A. R. Carroll, D. J. Rowlands and B. E. Clarke, NARes 12, 2461 (1984).
67. A. C. Palmenberg, E. M. Kirby, M. R. Janda, N. L. Drake, G. M. Duke, K. F. Potratz and M. S. Collett, NARes 12, 2969 (1984).


68. D. C. Pevear, M. Calenoff, E. Rozhon and H. L. Lipton, J. Virol. 61, 1507 (1987).
69. R. Najarian, D. Caput, W. W. Gee, S. J. Potter, A. Renard, J. Merryweather, G. Van Nest and D. Dina, PNAS 82, 2627 (1985).
70. G. P. Lomonossoff and M. Shanks, EMBO J. 2, 2253 (1983).
71. R. Allison, R. E. Johnson and W. G. Dougherty, Virology 154, 9 (1986).
72. L. L. Domier, K. M. Franklin, M. Shahabuddin, G. M. Hellman, J. H. Overmeyer, S. T. Hiremath, M. F. E. Siaw, G. P. Lomonossoff, J. G. Shaw and R. E. Rhoads, NARes 14, 5417 (1986).
73. M. M. Morgan, I. G. Macreadie, V. R. Harley, P. J. Hudson and A. A. Azad, Virology 163, 240 (1988).
74. L. Mindich, I. Nemhauser, P. Gottlieb, M. Romantschuk, J. Carton, S. Frucht, J. Strassman, D. H. Bamford and N. Kalkkinen, J. Virol. 62, 1180 (1988).
75. J. R. Wiener and W. K. Joklik, Virology 169, 194 (1989).
76. J. Cohen, A. Charpilienne, S. Chilmonczyk and M. K. Estes, Virology 171, 131 (1989).
77. P. Roy, A. Fukusho, G. D. Ritter and D. Lyon, NARes 16, 11759 (1988).
78. M. D. Morch, J. C. Boyer and A. L. Haenni, NARes 16, 6157 (1988).
79. E. Castle, U. Leidner, T. Nowak, G. Wengler and G. Wengler, Virology 149, 10 (1986).
80. C. M. Rice, E. M. Lenches, S. K. Eddy, S. J. Shin, R. L. Sheets and J. H. Strauss, Science 229, 726 (1985).
81. P. Goelet, G. P. Lomonossoff, P. J. G. Butler, M. E. Akam, M. J. Gait and J. Karn, PNAS 79, 5818 (1982).
82. W. D. O. Hamilton, M. Boccara, D. J. Robinson and D. C. Baulcombe, J. Gen. Virol. 68, 2563 (1987).
83. P. Ahlquist, R. Dasgupta and P. Kaesberg, JMB 172, 369 (1984).
84. M. A. Rezaian, R. H. V. Williams, K. H. J. Gordon, A. R. Gould and R. H. Symons, EJB 143, 277 (1984).
85. B. J. C. Cornelissen, F. T. Brederode, G. H. Veeneman, J. H. van Boom and J. F. Bol, NARes 11, 3019 (1983).
86. C. M. Rice and J. H. Strauss, PNAS 78, 2062 (1981).
87. H. Garoff, A. M. Frischauf, K. Simons, H. Lehrach and H. Delius, PNAS 77, 6376 (1980).
88. S. Bouzoubaa, L. Quillet, H. Guilley, G. Jonard and K. Richards, J. Gen. Virol. 68, 615 (1987).
89. W. A. Miller, P. M. Waterhouse and W. L. Gerlach, NARes 16, 6097 (1988).
90. H. Guilley, J. C. Carrington, E. Balazs, G. Jonard, K. Richards and T. J. Morris, NARes 13, 6663 (1985).
91. D. M. Rochon and J. H. Tremaine, Virology 169, 251 (1989).
92. F. Grieco, J. Burgyan and M. Russo, NARes 17, 6383 (1989).

Mechanism of Action and Regulation of Protein Synthesis Initiation Factor 4E: Effects on mRNA Discrimination, Cellular Growth Rate, and Oncogenesis

ROBERT E. RHOADS,¹ SWATI JOSHI-BARVE² AND CARRIE RINKER-SCHAEFFER³

Department of Biochemistry and Molecular Biology
Louisiana State University Medical Center
Shreveport, Louisiana 71130

I. Regulation of Initiation
II. mRNA Binding to Ribosomes
III. Isolation and Structural Characterization of eIF-4E
IV. Activities of eIF-4E
    A. Binding of eIF-4E to Caps
    B. Binding of eIF-4E to the 40-S Ribosome
    C. Binding of eIF-4E to eIF-4A
V. Regulation of eIF-4E Activity by Phosphorylation
VI. Alteration of Intracellular Levels of eIF-4E
    A. Overexpression of eIF-4E
    B. Underexpression of eIF-4E
VII. Summary, Conclusions, and Future Directions
References

Protein synthesis is one of the principal metabolic processes of the eukaryotic cell. A recent study estimated that 80% of the total oxygen consumption of isolated hepatocytes is attributable to protein synthesis (1). It is also a major point for the control of gene expression (2). Cellular mRNAs vary over a 100-fold range in their translational efficiencies (3), and the relative translation of different mRNAs varies with the overall rate of protein synthesis (4). Thus, because nearly all catalytic activities are performed by proteins, the rate of protein synthesis impacts upon every aspect of metabolism, growth, development, and differentiation. It is therefore reasonable to expect this rate to be carefully controlled.

The purpose of this article is to review the progress in our own laboratory and those of others in understanding one point at which regulation of the rate of protein synthesis occurs, the mRNA-binding step. Our studies have concentrated on the structure, function, and regulation of protein synthesis initiation factor 4E (eIF-4E).⁴ The results suggest that external stimuli such as mitogens, growth factors, and heat shock can alter the protein synthesis rate through a direct effect on eIF-4E activity, mediated by its phosphorylation. Changing the protein synthesis rate can, in turn, alter the spectrum of mRNAs translated, the cellular growth rate, and the phenotype of the cell.

¹ To whom correspondence may be addressed.
² Current address: Department of Surgery, University of Kentucky Medical Center, Lexington, Kentucky 40536.
³ Current address: Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21231.

I. Regulation of Initiation

Most regulatory mechanisms thus far discovered operate at the initiation stage rather than at the elongation or termination stages. Within the initiation phase, two sites of regulation have been the most extensively investigated. The first is at the level of initiator tRNA binding to the 40-S ribosome (Step 1 in Fig. 1) and involves the inhibition of eIF-2 by phosphorylation of the α subunit. Regulation at this step occurs in such “catastrophic” situations as heat shock, virus infection, altered pH, chemical poisoning, or the deprivation of amino acids, glucose, heme, or iron (5). Regulation also occurs at the point of mRNA binding to the 40-S ribosome (Step 2 in Fig. 1), which is catalyzed by initiation factors of the eIF-4 group. This appears to be the form of regulation that predominates during “non-catastrophic” conditions, e.g., the stimulation of cell growth by mitogens, growth factors, and fertilization, and the reduction of protein synthesis during mitosis (6).

FIG. 1. Schematic model for initiation of protein synthesis. The designation of eIF-2, eIF-4, and eIF-5 as the catalysts for each step indicates, in some cases, classes of initiation factors. Factors involved in several steps, or whose stages of involvement are not known, are not listed. (For a more complete description, see 2 and 13.)

⁴ Abbreviations used: eIF-, eukaryotic initiation factor; UTR, untranslated region; ORF, open reading frame; HSP, heat-shock protein; SDS-PAGE, polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulfate; TCDD, tetrachloro-p-dioxin; AS, antisense.

It is logical that “normal” regulation would occur at Step 2 of Fig. 1 because mRNA binding is normally rate-limiting for protein synthesis (7, 8). The limiting factor is not mRNA but rather some other component of the protein-synthesis machinery. For this reason, there is competition among mRNAs for binding to the 40-S ribosome. Those mRNAs with higher rate-constants of initiation are translated preferentially. Lodish (4) derived a kinetic model for this phenomenon and obtained the surprising result, verified experimentally, that changing the overall rate of protein synthesis changes the ratio of translation of individual mRNAs. He proposed that the rate-limiting component of protein synthesis, as yet unidentified but termed R*, controls the rates of translation of mRNAs. At a given concentration of R*, certain “strong” messengers, which have high rate-constants of initiation, are optimally translated, while others with lower rate-constants (“weak” messengers) are suboptimally utilized. As the concentration of R* is increased, proteins encoded by weak mRNAs account for an increasingly greater fraction of the total protein synthesized. Conversely, as R* decreases, one reaches a point at which only a few cellular mRNAs account for most of the proteins synthesized.

The determinants of mRNA efficiency are not well understood, but factors such as secondary structure in the 5’-UTR,⁴ interactions between the 5’-UTR and either the coding region or the 3’-UTR, sequence context of the initiation codon, upstream ORFs, and the use of non-AUG initiation codons modulate translational efficiency (reviewed in 9). Examples of the ever-growing list of such translationally regulated mRNAs are given in Table I. Collectively, they code for an interesting group of proteins: components of growth-regulatory pathways such as proto-oncoproteins and growth factors, components of the translational machinery itself, and enzymes involved in cell division. All of these proteins share the property of being “growth-regulated,” that is, synthesized in response to growth stimuli (10). Thus, although many mRNAs undoubtedly have specialized mechanisms for their translational control (11), it is evident that a global increase in protein-synthesis rate will result in a proportionately greater translation of these growth-regulated weak mRNAs merely by virtue of their position in the kinetic hierarchy.
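The kinetic argument above can be made concrete with a small numerical sketch. The saturable rate law used here, the parameter names (`k`, `r_star`, `clearance_time`), and the rate-constant values are illustrative assumptions, not the equations or data of the model cited as (4). The sketch only shows that, when initiation on each mRNA saturates, raising the limiting component R* increases the share of translation going to a weak mRNA.

```python
def initiation_rate(k, r_star, clearance_time=1.0):
    """Assumed saturable initiation rate for a single mRNA.

    k * r_star is the binding rate while the initiation site is free;
    clearance_time is the time the site stays occupied after each
    initiation event, which caps the rate at 1 / clearance_time.
    """
    binding = k * r_star
    return binding / (1.0 + binding * clearance_time)


def weak_fraction(r_star, k_strong=10.0, k_weak=0.1):
    """Share of all initiation events that occur on the weak mRNA."""
    strong = initiation_rate(k_strong, r_star)
    weak = initiation_rate(k_weak, r_star)
    return weak / (strong + weak)


if __name__ == "__main__":
    # Hypothetical concentrations of the limiting component R*.
    for r_star in (0.01, 0.1, 1.0, 10.0):
        share = weak_fraction(r_star)
        print(f"R* = {r_star:5.2f}   weak-mRNA share = {share:.3f}")
```

With these assumed values the weak-mRNA share rises from about 1% at R* = 0.01 to about 34% at R* = 10, because the strong mRNA is already close to its maximal rate while the weak one still responds almost linearly to R*.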


TABLE I
SOME TRANSLATIONALLY REPRESSED mRNAs INVOLVED IN CELLULAR GROWTH

Protein encoded                                      Species          Reference

Protooncogene products
  c-myc                                              Human            99
  c-myc                                              Mouse            100
  c-myc                                              Xenopus          101
  c-lck                                              Mouse            102
  c-sis                                              Human            103
  c-fos                                              Mouse            104
Growth factors, cytokines
  Transforming growth factor-β3                      Human            105
  Interferon-β                                       Human            106
  Tumor necrosis factor                              Human            107
  GM-CSF                                             Human            104
Proteins involved in DNA replication, cell division
  Ribonucleotide reductase                           Clam oocyte      108
  Ornithine decarboxylase                            Rat              109
  Ornithine decarboxylase                            Bovine           110
  Cyclin A                                           Clam oocyte      108
  Protamine 1                                        Mouse            111
  S-Adenosylmethionine decarboxylase                 Bovine           112
Ribosomal proteins
  L32                                                Mouse            86
  rp21, rp49                                         Drosophila       113
  rp1024                                             Dictyostelium    114
  L32, L30, S16                                      Bovine           115

II. mRNA Binding to Ribosomes

The mechanism by which mRNA is recruited into actively translating polysomes has not yet been fully elucidated. However, a number of critical observations are briefly summarized below. (Refs. 12 and 13 have a more complete description.) Unlike prokaryotic initiation, eukaryotic initiation requires ATP (14). A "cap" residue, consisting of 7-methylguanosine linked 5'-to-5' to the first encoded nucleotide of mRNA by a triphosphate bridge, is found on all cellular (i.e., nonviral) mRNAs and stimulates their translation (15). In most instances, the 40-S ribosome binds at or near the 5' terminus of the mRNA and scans in the 3' direction until the first AUG is encountered, at which point the 60-S ribosome joins and the first peptide bond is formed (Step 3 in Fig. 1; reviewed in 9). Most eukaryotic mRNAs are thought to follow the scanning mode of initiation, but translation of picornaviral RNAs follows a cap-independent pathway (reviewed in 16). Such RNAs undergo internal initiation, which requires structural elements in the 5’-UTR (called internal ribosome entry sites). Certain cellular mRNAs, e.g., those of heat-shock proteins, may follow this pathway as well (17, 18).

A number of initiation factors are involved in the mRNA-binding step, the individual roles of which have been only partially elucidated. eIF-3 is a complex of at least seven polypeptides totaling 520 kDa (19), is present on the 43-S initiation complex in a 1:1 stoichiometry with initiator tRNA prior to mRNA binding, and is required for mRNA binding (20, 22). Polypeptides of the eIF-4 group are also required for mRNA binding, but it is not yet clear which, if any, are present on the 43-S initiation complex. eIF-4A is a 46-kDa polypeptide that possesses RNA-dependent ATPase activity and ATP-dependent RNA-binding activity and appears to function as an RNA helicase (22). eIF-4B is an 80-kDa polypeptide that has a consensus RNA-binding domain (23); it greatly stimulates the helicase activity of eIF-4A (24), and may participate in recycling of eIF-4E between rounds of initiation (25). eIF-4E is a 25-kDa protein that specifically binds the mRNA cap (26, 27) and is transferred to the 43-S complex simultaneously with mRNA (28). eIF-4E exists both free and in high-molecular-weight complexes (29). Purification of the latter yields a complex containing eIF-4A, eIF-4B, eIF-4E, and a new polypeptide of 220 kDa, initially termed p220 (30-32) but subsequently renamed eIF-4γ (33, 34). Further purification eliminates eIF-4B, and the remaining complex of eIF-4A, eIF-4E, and eIF-4γ was named eIF-4F (subsequently renamed eIF-4; 33). All four polypeptides are required for maximal unwinding of mRNA secondary structure in a cap- and ATP-dependent fashion (35). Translation of capped mRNAs is severely restricted when the eIF-4γ polypeptide is proteolytically cleaved during infection by some picornaviruses (30, 36), although the correlation between cleavage and inhibition of initiation has been questioned (37, 38).

How is the mRNA-binding step regulated? A number of the initiation factors are reversibly phosphorylated (reviewed in 13), and in the case of 43-S complex formation (Fig. 1, Step 1), the phosphorylation of eIF-2α is clearly established as the regulatory switch. In the case of 48-S complex formation (Fig. 1, Step 2), although eIF-4γ and eIF-4B are also known to be phosphorylated, it is the phosphorylation of eIF-4E that is the best understood. Several lines of evidence, reviewed in Section V, link eIF-4E phosphorylation with the rate-determining step of protein synthesis. In all of these studies, however, it is not yet known whether eIF-4E per se is the target for phosphorylation, as opposed to eIF-4E which is in the eIF-4 complex. One study indicates, in fact, that the latter is more readily phosphorylated (39).


111. Isolation and Structural Characterization of el F-4E Mammalian eIF-4E can be isolated from the ribosomal salt wash (27, 40) but is present in larger quantities in the postribosomal supernatant (41, 42). The protein from the ribosomal salt wash is partially complexed with various other initiation-factor polypeptides. Although a variety of complexes have been isolated, it is not yet clear whether these are stable species or transient intermediates in the initiation process. Nor are the interactions among individual polypeptides well defined. However, from the various purification schemes and the isolation of particular complexes, a relative order of affinities of factors within the cap recognition machinery can be approximated. eIF-3 is the first removed (27, 43, 44), followed by eIF-4B (31),eIF-4A (35, 45, 46), and finally eIF-47. The eIF-4E from the postribosomal supernatant appears to be free ofassociation with other proteins. In wheat germ, there is no factor eIF-4E per se, since the polypeptides having cap-binding activity are always isolated in complexes with other proteins and are named as subunits of those complexes (47, 48). Also, wheat germ contains two distinct but functionally interchangeable complexes: eIF-4F, with polypeptides estimated by SDS-PAGE4 to be 220 and 26 kDa, and eIF-(iso)4F, with polypeptides of 80 and 28 kDa (49). Cap-binding proteins are readily purified by affinity chromatography using cap analogs bound to a solid support, e.g., m7GDP-Sepharose (40) and m7GTP-Sepharose (42). For structural work, more purified preparations can be obtained by reversed-phase liquid chromatography (50). The primary sequence of eIF-4E from a number of species has been determined (Fig. 2). All three mammalian eIF-4E’s are 217-amino-acid residues in length, or 25,117 Da in the case of the human protein, and have nearly identical sequences. 
Yeast and wheat proteins are less similar to the mammalian forms, but surprisingly, mouse eIF-4E can substitute for yeast eIF-4E in vivo (51). Figure 2 reveals that many amino-acid residues are conserved in all six sequences. Particularly notable are the Trp residues, in which eIF-4E is unusually rich (52);all eight Trp residues found in mammalian eIF-4E are conserved in all six sequences (marked with asterisks). The fact that the most highly conserved regions are centrally located whereas the N- and C-termini are more variable suggests that a central core region is responsible for those functions of eIF-4E that are common across species (i. e., cap binding and ribosome binding, but not necessarily regulation of activity). Another conserved structural feature within this putative central core region of the three mammalian proteins is the presence of four Cys residues (Fig. 2). Two techniques, modification with [ 1-14CIiodoacetic acid and SDS-

FIG. 2. Amino-acid sequences of eIF-4E from various sources. Amino-acid residues identical in all six sequences are shown in a black box. Those identical in five sequences are shown in gray. Conserved Trp residues are marked with an asterisk. "% identity" refers to comparisons with the human sequence. The sequences are described in the following publications: human (52), mouse (51), rabbit (126), yeast (123), wheat 28 [the 28-kDa subunit of eIF-(iso)4F] (124), and wheat 26 (the 26-kDa subunit of eIF-4F) (125).


ROBERT E. RHOADS ET AL.

FIG. 3. Analysis of eIF-4E isoforms. Affinity-purified human eIF-4E was separated on a two-dimensional gel in denaturing conditions. The first dimension was isoelectric focusing in Pharmalyte 5-8; the second, a 12% sodium dodecyl sulfate-polyacrylamide gel. Staining was with silver. [Reprinted with permission from the Journal of Biological Chemistry (50).]

PAGE after oxidation or reduction, indicate that the native protein is intermediate between fully oxidized and fully reduced states (50). This suggests that one pair of Cys residues forms a disulfide bridge, whereas the other exists as two sulfhydryls. In fact, the protein migrates as a doublet on SDS-PAGE under standard conditions; reduction with 100-mM dithiothreitol is required to obtain a single band (50). Human erythrocyte eIF-4E exists in at least five forms differing in isoelectric point (Fig. 3; 50). Two major forms are found in rabbit reticulocyte (50), HeLa (53, 54), and other tissues. One of the major determinants of this charge heterogeneity is phosphorylation; the two major forms, with pI's of 5.9 and 6.3, apparently differ from each other by the presence or absence of one phosphate group (50, 54) at Ser-53 (55). The origin of the other isoforms of eIF-4E is less clear. Metabolic labeling with 32P indicates that, although 85% of the radioactivity in eIF-4E is associated with the pI 5.9 species, the remaining 15% is in the pI 6.1 and 5.7 species (50). This could mean either multiple phosphorylations or other sources of charge heterogeneity in a monophosphorylated protein. Deamidation of Gln or Asn would be an attractive possibility for introducing additional negative charges into the molecule, but in vitro experiments make

MECHANISM AND REGULATION OF eIF-4E

that unlikely. When eIF-4E is synthesized in a reticulocyte translation system using a synthetic transcript as mRNA, five forms are produced with isoelectric points similar to those observed in vivo (28). All five forms are present by 30 minutes, and over several hours, the more acidic forms increase at the expense of the more basic ones. These kinetics are more consistent with phosphorylation than with deamidation. Further experiments indicate that the charge heterogeneity is not due to partial N-terminal acetylation. Depletion of acetyl-CoA during translation produces two new basic forms, suggesting that all of the in vitro-synthesized eIF-4E is normally N-acetylated (56). Additional evidence for multiple phosphorylations comes from two sources. When Ser-53 is changed to Ala-53 by site-directed mutagenesis of the cDNA, the protein continues to be phosphorylated, albeit more slowly, suggesting other sites of phosphorylation (56). Also, the use of the phosphatase inhibitor okadaic acid with intact cells causes hyperphosphorylation of eIF-4E (57, 58), suggesting phosphorylation at sites not normally detected due to rapid dephosphorylation.

Little is presently known about the secondary and tertiary structures of eIF-4E. Circular dichroism measurements estimate that 14% of the protein is α-helix, 47% is β-sheet, and 13% is β-turns (59). Consistent with the idea that there is a central core protein that retains cap-binding activity, a lower-molecular-weight form of eIF-4E has been identified (53) and shown to correspond to approximately residues 47-182 (i.e., the central two-thirds of the molecule; 60). This protein binds to m7GTP-Sepharose and can be labeled with cap analog photoaffinity probes (60), suggesting that the cap binding site is intact.
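The assignment of the pI 5.9 and 6.3 species to forms differing by one phosphate group can be made plausible with a rough charge estimate. The Python sketch below computes the mean charge carried by a single phosphomonoester group near pH 6 from its two ionization equilibria. The function name and both pKa values (1.2 and 5.6) are assumptions chosen for illustration; they are generic textbook-style figures for a phosphoserine side chain, not measurements made on eIF-4E.

```python
def phosphate_charge(ph, pka1=1.2, pka2=5.6):
    """Mean net charge of one phosphomonoester group at the given pH.

    pka1 and pka2 are ASSUMED illustrative values for phosphoserine,
    not values reported in the studies cited in this section.
    """
    h = 10.0 ** (-ph)      # proton concentration
    k1 = 10.0 ** (-pka1)   # first dissociation constant
    k2 = 10.0 ** (-pka2)   # second dissociation constant
    # Relative populations of the 0, -1, and -2 charge states.
    neutral = h * h
    mono = k1 * h
    di = k1 * k2
    return -(mono + 2.0 * di) / (neutral + mono + di)


# pH values correspond to the isoelectric points quoted for the isoforms.
for ph in (5.7, 5.9, 6.1, 6.3):
    print(f"pH {ph:.1f}: mean phosphate charge = {phosphate_charge(ph):+.2f}")
```

Under these assumptions the group carries between one and two negative charges over the pH range of the observed isoforms, which is the direction of shift expected when a phosphate is added to the more basic species.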

IV. Activities of eIF-4E

The eIF-4E protein does not catalyze the making or breaking of bonds, so it does not have, per se, an enzymatic activity. Nonetheless, a number of biochemical and cell biological actions can be demonstrated for eIF-4E, apart from those that are properties of complexes containing eIF-4E. These are (i) cap binding, (ii) ribosome binding, (iii) eIF-4A binding, (iv) the ability to induce quiescent cells to enter S-phase, and (v) the ability to cause certain cell types to lose growth control, grow in soft agar, and form tumors in nude mice. The binding activities are treated in this section; the cell biological activities, in subsequent sections.

A. Binding of eIF-4E to Caps

eIF-4E is the only initiation factor thus far demonstrated to interact with the cap as an isolated polypeptide, and it does so in the absence of ATP (27, 72). Binding to caps has been demonstrated by cross-linking of eIF-4E to


FIG. 4. Determination of cap-binding affinity of eIF-4E by fluorescence quenching. The fluorescence emission spectra of eIF-4E excited at 258 nm were determined at various concentrations of m7GTP (increasing top to bottom from 0 to 22 µM). (Inset) Eadie-Hofstee plot of fluorescence data. $\Delta F$ was calculated at 330 nm, where $\Delta F = F_{\text{eIF-4E}} - F_{\text{eIF-4E}+\text{m}^7\text{GTP}}$. [Reprinted with permission from Biochemistry (67). Copyright © 1989 American Chemical Society.]

periodate-oxidized cap-labeled reovirus mRNA (26), reversible binding to capped oligonucleotides (27, 42, 61-64), photochemical cross-linking to unmodified mRNA (65), affinity labeling with a photoactive cap analog (60, 66), and binding to immobilized cap analogs (40, 42). The most quantitative results are obtained by measuring the quenching of tryptophan fluorescence in eIF-4E by cap binding (59, 67; Fig. 4), which shows that the affinity of protein for ligand ranges from $10^5$ to $10^6$ M$^{-1}$, depending on the nature of the capped species, the binding conditions, and the nature of the protein. Concerning the capped ligand, binding requires the presence of the 7-methyl group (59). Additional methyl groups at position 2 of the guanine reduce the affinity (61, 68), and changing the substituent at position 7 to an aromatic R group increases it (68). Binding increases as nucleotide residues are added to the cap (61), and both eIF-4E and eIF-4F (mammalian) bind globin mRNA 5-fold more strongly than m7GpppG (69).


The nature of the adjacent oligonucleotide is important as well; the presence of “hairpins” and their positions relative to the cap can alter the affinity constant over a 5-fold range (62-64). Concerning binding conditions, the pH profile of binding varies with the pKa of the N-1 proton of guanine in various analogs (67, 68), supporting the hypothesis that only the enolate form of m7G binds to the protein (70). Binding decreases as the K+ concentration is raised, indicating varying contributions of electrostatic interactions for eIF-4E, eIF-4F, and eIF-(iso)4F (61, 67). Concerning the protein moiety, mammalian eIF-4E and eIF-4F bind capped ligands with the same affinity (69). Wheat germ eIF-4F binds cap analogs approximately twice as strongly as wheat germ eIF-(iso)4F, and the same is true for binding of these two factors to globin mRNA (62). Cap binding leads to a small but reproducible change in overall secondary structure of the protein (59). Amino-acid substitutions for the various conserved Trp residues affect both secondary structure and cap binding (59, 71).
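The Eadie-Hofstee treatment of fluorescence-quenching data mentioned for Fig. 4 can be sketched as follows. For a single class of binding sites, and assuming that free ligand is approximately equal to total ligand, ΔF = ΔFmax − (1/Ka)(ΔF/[L]), so a plot of ΔF against ΔF/[L] is a straight line with slope −1/Ka and intercept ΔFmax. The function name, the titration values, and the association constant used to generate them are hypothetical; they are not the data of Fig. 4.

```python
def eadie_hofstee_fit(ligand_molar, delta_f):
    """Return (Ka in M^-1, dF_max) from a least-squares line of dF vs dF/[L].

    The zero-ligand point must be left out, since dF/[L] is undefined there.
    """
    xs = [df / conc for conc, df in zip(ligand_molar, delta_f)]
    ys = list(delta_f)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals -1/Ka
    intercept = mean_y - slope * mean_x  # equals dF_max
    return -1.0 / slope, intercept


# Synthetic, noise-free titration generated from ASSUMED parameters.
ASSUMED_KA = 5.0e5        # M^-1, hypothetical
ASSUMED_DF_MAX = 40.0     # arbitrary fluorescence units, hypothetical
concs = [2e-6, 5e-6, 10e-6, 15e-6, 22e-6]  # molar m7GTP, hypothetical
dfs = [ASSUMED_DF_MAX * ASSUMED_KA * c / (1.0 + ASSUMED_KA * c) for c in concs]

ka, df_max = eadie_hofstee_fit(concs, dfs)
print(f"Ka = {ka:.3g} M^-1, dF_max = {df_max:.3g}")
```

With real spectra, ΔF would be read at a fixed emission wavelength for each ligand concentration, and scatter in the points would make the fitted Ka only an estimate.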

B. Binding of eIF-4E to the 40-S Ribosome

A second activity of eIF-4E is binding to the 40-S ribosome, although it is not clear at this time what this means mechanistically. The observation that eIF-4E can be detected on the 48-S initiation complex (28, 56) may reflect its association with mRNA, eIF-4γ, eIF-3, the ribosome itself, or some other component. In order to demonstrate this activity, the eIF-4E cDNA was cloned into a plasmid that allows in vitro transcription of a synthetic mRNA. The predominant in vitro translation product was 25 kDa and behaved identically to authentic eIF-4E with regard to cap-binding activity. The in vitro-synthesized eIF-4E was phosphorylated in reticulocyte lysate, and both major isoforms (pI's 5.9 and 6.3) appeared to have equal affinity for the mRNA cap structure, suggesting that cap binding is not affected by phosphorylation (28). Since the in vitro-synthesized eIF-4E appeared, by these criteria, to be indistinguishable from native eIF-4E, it was used as a tracer to determine the subcellular localization of eIF-4E. 35S-labeled eIF-4E was purified by affinity chromatography and added back to a translation reaction programmed with globin mRNA. The initiation complexes were separated by sucrose-gradient centrifugation and the radioactive eIF-4E in each fraction was determined. Since 48-S initiation complexes are normally undetectable, GMPPNP, a protein-synthesis inhibitor that causes accumulation of 48-S complexes, was added to the reactions. In the absence of GMPPNP, eIF-4E remained at the top of the gradient, trailing into the heavier fractions (Fig. 5A, squares). In the presence of GMPPNP, eIF-4E was detected in the 48-S region, as seen by the peak of radioactivity at fraction 16 (Fig. 5A, crosses). This assay is termed the ribosome-binding assay.
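The readout of the ribosome-binding assay is a gradient profile, and the comparison between reactions reduces to locating a peak and asking what share of the label lies in the 48-S region. A minimal Python sketch of that bookkeeping is given here. The counts are invented for illustration, and the window of fractions taken as the "48-S region" (14 to 18) is an assumption; only the peak position at fraction 16 is taken from the text.

```python
def fraction_in_window(cpm, first, last):
    """Share of total radioactivity in fractions first..last (1-based, inclusive)."""
    return sum(cpm[first - 1:last]) / sum(cpm)


def peak_fraction(cpm, first, last):
    """1-based number of the fraction with the highest counts inside the window."""
    window = cpm[first - 1:last]
    return first + window.index(max(window))


# HYPOTHETICAL counts per fraction, top of gradient first (25 fractions).
minus_gmppnp = [200, 150, 100, 70, 50, 35, 25, 18, 14, 11, 9, 8, 7,
                6, 6, 5, 5, 4, 4, 4, 3, 3, 3, 3, 3]

# Same profile with an invented bump centered on fraction 16.
bump = {14: 15, 15: 40, 16: 80, 17: 40, 18: 15}
plus_gmppnp = list(minus_gmppnp)
for fraction_number, extra in bump.items():
    plus_gmppnp[fraction_number - 1] += extra

REGION = (14, 18)  # assumed extent of the 48-S region, in fraction numbers
print("peak with GMPPNP at fraction", peak_fraction(plus_gmppnp, 10, 22))
print("share in region, -GMPPNP:", round(fraction_in_window(minus_gmppnp, *REGION), 3))
print("share in region, +GMPPNP:", round(fraction_in_window(plus_gmppnp, *REGION), 3))
```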


To determine whether mRNA is required for the binding of eIF-4E to ribosomes, the assay was carried out in the same message-dependent lysate system with and without adding exogenous globin mRNA (Fig. 6). GMPPNP was maintained in all reactions to allow the detection of 48-S complexes. Essentially no eIF-4E was transferred to ribosomes in the absence of mRNA (Fig. 6B), in contrast to the mRNA-containing control (Fig. 6A). It was possible that occupation of the cap-binding site by m7GTP triggered a conformational change in eIF-4E that caused it to bind to the ribosome. Thus, we repeated the assay, except that m7GTP was added in place of globin mRNA (Fig. 6C). A small amount of binding was detected, suggesting that there may be some effect of m7GTP itself. The finding that eIF-4E is not present on the 40-S ribosome in the absence of mRNA suggests that it does not pre-exist on the 43-S initiation complex prior to mRNA binding (see Fig. 1). It was also of interest to determine whether ATP is required for the binding of eIF-4E to ribosomes. The cap-binding activity of eIF-4E does not require ATP (27, 72). However, the binding of the 40-S ribosomal subunit to the 5' end of the mRNA does require ATP (73). Since the binding of eIF-4E to ribosomes may actually be a measure of the binding of mRNA to ribosomes, it was possible that ATP would be required for this step as well. To investigate this possibility, the ribosome-binding assay was carried out in the presence and the absence of ATP. ATP was depleted by one of two methods. In the first, the ribosome-binding assay was carried out using a translation reaction containing the non-hydrolyzable ATP analog AMPPNP instead of ATP. In the second, the rabbit reticulocyte lysate was incubated with glucose and hexokinase to deplete ATP. This lysate was then used in a ribosome-binding assay containing no added ATP. GMPPNP was always present to prevent the 48-S initiation complex from continuing to the 80-S complex (Fig. 1).
In the presence of ATP, the 35S-labeled eIF-4E was detectable at the 48-S region (Fig. 7A). When AMPPNP was substituted in the reaction, the signal at 48-S was reduced significantly (Fig. 7B). When ATP-depleted lysate was used, the peak of radioactivity in the 48-S region was completely eliminated (Fig. 7C). These results indicate that the transfer of eIF-4E to the ribosomes is dependent on ATP. There are several possible explanations for the ATP-dependence of eIF-4E transfer to the small ribosomal subunit (Fig. 8). As described in Section V, transfer of eIF-4E to the ribosome requires phosphorylation at Ser-53 (56). It is possible that the eIF-4E synthesized in vitro is dephosphorylated in the rabbit reticulocyte lysate and unable to be rephosphorylated in the absence of ATP (Model 1). The requirement for ATP could also stem from an interaction with eIF-4A (Model 2). In this scenario, eIF-4E binds the mRNA in the absence of ATP, but the eIF-4E:mRNA complex is unable to associate with the 43-S subunit unless it first interacts with eIF-4A (either alone or as part of a factor complex). This step would depend on ATP, since ATP is essential for maximal binding of eIF-4A to mRNA (22, 74, 75). The third model is based on the hypothesis that eIF-4E contains an ATP binding site. ATP would bind eIF-4E at an allosteric site and cause a conformational change, converting the protein into a form capable of interaction with the ribosome or factors bound to it. Further studies are necessary to determine the role of ATP in eIF-4E binding to ribosomes.

FIG. 5. Binding of eIF-4E to the 40-S ribosome and effect of altering the amino-acid residue at position 53. eIF-4E cDNA was modified by site-directed mutagenesis to produce variant forms of eIF-4E upon cell-free transcription and translation. 35S-labeled eIF-4ESer-53, eIF-4EAla-53, and eIF-4EGlu-53 were synthesized, purified by affinity chromatography, and added to translation reactions programmed with globin mRNA, carried out either in the presence (crosses) or absence (squares) of GMPPNP. The samples were then analyzed by sucrose-density gradients and the radioactivity in the fractions determined. The positions of the ribosomal subunits and monosomes are indicated by arrows. (A) Control reactions containing eIF-4ESer-53 (the wild type); (B) reactions containing eIF-4EAla-53; (C) reactions containing eIF-4EGlu-53. [Panels (A) and (B) are reprinted with permission from the Journal of Biological Chemistry (56).]

C. Binding of eIF-4E to eIF-4A

It is likely, based on the detection of initiation-factor complexes (Section III), that eIF-4E binds specifically to several of the eIF-3 or -4 group polypeptides. However, the only such interaction studied to date is that between eIF-4E and eIF-4A. A binding constant of $7.1 \times 10^5$ M$^{-1}$ was found; this is similar to the binding constant of eIF-4E for mRNA (69).
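For orientation, an association constant of this size can be restated as a dissociation constant and as a standard free energy of binding, using Kd = 1/Ka and ΔG° = −RT ln Ka. The Python sketch below does the arithmetic. The temperature of 298.15 K is an assumption, since the conditions of the measurement are not given here, and the function names are illustrative.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def dissociation_constant(ka_per_molar):
    """Kd in molar units from an association constant in M^-1."""
    return 1.0 / ka_per_molar


def standard_free_energy(ka_per_molar, temperature_k=298.15):
    """Standard free energy of binding in kJ/mol (temperature is ASSUMED)."""
    return -GAS_CONSTANT * temperature_k * math.log(ka_per_molar) / 1000.0


KA_4E_4A = 7.1e5  # M^-1, value quoted in the text for eIF-4E binding to eIF-4A

kd = dissociation_constant(KA_4E_4A)
dg = standard_free_energy(KA_4E_4A)
print(f"Kd = {kd * 1e6:.2f} uM, dG = {dg:.1f} kJ/mol")
```

At the assumed temperature this corresponds to a dissociation constant of roughly 1.4 µM and a binding free energy of roughly −33 kJ/mol.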

V. Regulation of eIF-4E Activity by Phosphorylation

FIG. 6. Effect of mRNA and m7GTP on the binding of eIF-4E to the 40-S ribosome. 35S-labeled eIF-4E was synthesized and analyzed in the ribosome-binding assay as in Fig. 5, except that all reactions contained 3-mM GMPPNP. (A) Complete reaction; (B) no globin mRNA was added to the translation reaction; (C) same as (B), except that 400-µM m7GTP was added to the translation reaction.

Covalent modification of proteins, primarily by phosphorylation, is one of the most common ways of rapidly regulating metabolic processes in the cell. In several instances of translational control, e.g., fertilization, mitosis, and treatment with hormones such as insulin and growth factors such as epidermal growth factor and platelet-derived growth factor, the rate of protein synthesis changes within minutes and is readily reversed, suggesting


that the regulatory mechanism involves changes in the specific activities of proteins rather than their cellular levels. In all the conditions mentioned above, the change affects the initiation step, yet 43-S complex formation appears unaffected, suggesting that the control point may be at the mRNA-binding step. eIF-4E is the least abundant of the initiation factors (54, 76) and as such may represent the rate-limiting component in initiation. A priori, its covalent modification would be an effective way to control the rate of protein synthesis. Evidence from a variety of systems, summarized in Table II, demonstrates a close correlation between phosphorylation of eIF-4E and the rate of protein synthesis. Dephosphorylation of the protein coincides with a reduction in protein synthesis after heat shock or adenovirus infection and during mitosis. In contrast, an increase in protein synthesis is accompanied by an increase in eIF-4E phosphorylation upon stimulation of cells with hormones, growth factors, and other mitogens. Since these are only correlations, it was important to examine, at the molecular level, how the phosphorylation might affect eIF-4E function. We constructed modified forms of eIF-4E in which Ser-53, the major phosphorylation site, was changed to either a neutral (Ala-53) or a negatively charged (Glu-53) residue. The amino acids were changed by site-directed mutagenesis of the cDNA, and the altered forms of the protein (eIF-4EAla and eIF-4EGlu) were examined for cap-binding activity by retention on m7GTP-Sepharose. Both eIF-4EAla and eIF-4EGlu were retained and specifically eluted with m7GTP, showing that cap binding is not grossly affected by the phosphorylation state of the protein (56). The effect of altering the major phosphorylation site of eIF-4E was also tested by the ribosome-binding assay. The distribution of eIF-4ESer, eIF-4EAla, and eIF-4EGlu in translation reactions programmed with globin mRNA is shown in Fig. 5.
As described above, GMPPNP was used to increase the concentration of 48-S complexes. Whereas eIF-4ESer was observed in the 48-S region (Fig. 5A), a corresponding peak did not appear with eIF-4EAla (Fig. 5B) or eIF-4EGlu (Fig. 5C). The most straightforward interpretation of these results is that phosphorylation of eIF-4E is obligatory for its action in the transfer of mRNA to the 43-S initiation complex. In support of this conclusion, eIF-4ESer from isolated 48-S initiation complexes consisted predominantly of the phosphorylated form (56). The results with eIF-4EGlu also showed that the phosphate residue per se was important and that a negatively charged moiety alone would not substitute for it.

FIG. 7. Effect of ATP on the binding of eIF-4E to the 40-S ribosome. Binding of 35S-labeled eIF-4E was carried out as in Fig. 6. (A) Complete reaction; (B) 1.5-mM AMPPNP was added to the reaction instead of ATP; (C) the reticulocyte lysate was depleted of ATP by incubation with 2-mM glucose, 3.7-mM Mg acetate, and 57 U/ml yeast hexokinase at 30°C for 15 minutes. No ATP was added during the incubation with 35S-labeled eIF-4E.

FIG. 8. Models to explain the ATP requirement for ribosome binding of eIF-4E. (Model 1) The absence of ATP prevents eIF-4E from being phosphorylated by the hypothetical eIF-4E kinase, which is in some way required for binding to the ribosome. (Model 2) The hydrolysis of ATP by eIF-4A is somehow required before eIF-4E can form a stable complex with the ribosome. (Model 3) ATP binds to eIF-4E allosterically to change its conformation and allow binding to the ribosome.

VI. Alteration of Intracellular Levels of eIF-4E

The in vitro studies described in the previous section provide a possible mechanism by which phosphorylated eIF-4E could accelerate the rate-limiting mRNA-joining step. The studies demonstrate that unphosphorylated eIF-4E does not become part of the 48-S initiation complex, suggesting that


TABLE II
AGENTS AND CONDITIONS WHICH ALTER eIF-4E PHOSPHORYLATION

Condition                         Tissue                        Reference

A. Inhibition of eIF-4E phosphorylation
   Mitosis                        HeLa cells                    77
   Heat shock                     HeLa cells                    54
   Heat shock                     Ehrlich ascites cells         116
   Adenovirus                     Human 493 and RD cells        117

B. Stimulation of eIF-4E phosphorylation
   Mitogens
      PMA                         Rabbit reticulocytes          118
      PMA                         Mouse B lymphocytes           57
      PMA                         Rat 3T3 cells                 119
      LPS                         Mouse B lymphocytes           57
   Polypeptide growth factors
      Insulin                     Mouse 3T3-L1 cells            120
      Insulin                     Human HIR 3.5 cells           121
      Serum                       Mouse 3T3 fibroblasts         86
      EGF                         Human mammary epithelium      58
      PDGF                        Rat 3T3 cells                 119
      PDGF                        Human lung fibroblasts        122
      NGF                         Rat PC12 cells                84
   Oncogene products
      pp60src                     Rat 3T3 cells                 119
      p21ras                      Rat embryo fibroblasts        83

mRNA may not be recruited to the ribosomes in the absence of phosphorylated eIF-4E. The idea that phosphorylation is important for eIF-4E function, drawn from in vitro studies, has now been verified by in vivo studies in which the cellular concentration of eIF-4E is altered by a variety of techniques.

A. Overexpression of eIF-4E

An episomally replicating shuttle vector (RDB) was used to overexpress eIF-4E in HeLa cells (78). Two plasmids, RDB-wt and RDB-ala, containing the cDNAs for the unmodified form eIF-4ESer and the phosphorylation site variant eIF-4EAla, respectively, under the control of a TCDD-inducible promoter, were constructed and transfected into HeLa cells. Cells containing the vector were selected on the basis of resistance to the antibiotic G418. After 4 days, most of the cells transfected with RDB-wt were refractile, suggesting they were rounded-up and undergoing cell division, and grew in foci many cells thick. Figure 9 follows the development of a single focus over time, in this case in the absence of G418 selection. The contrast between

FIG. 9. HeLa cells were transformed with RDB-wt, an episomal vector expressing eIF-4E (wild type). The figure shows growth progression of a single focus of transformed cells without G418 selection. The same area of the culture was photographed 7, 10, 15, and 23 days after transfection. G418 was added 20 days after transfection. HeLa cells transfected with a vector expressing eIF-4EAla were indistinguishable from untransfected HeLa cells (data not shown). [Reprinted from (78).]


refractile rounded-up cells in the focus and the “lawn” of normal cells in monolayer is apparent. When the antibiotic G418 was added at 20 days, the lawn of cells died, but those in the focus were completely resistant (Fig. 9, Day 23). By counting cells in colonies, it could be estimated that the growth rate was increased 20% by RDB-wt transfection. By contrast, cells transfected with RDB-ala that survived G418 selection grew to small colonies with morphology indistinguishable from that of untransfected HeLa cells, viz. flat, spindle-shaped, and with only a small proportion of round dividing cells. After 1 month, they formed a confluent monolayer, with growth and morphology characteristics identical to those of the parental HeLa cells. After 1 month, the cells transformed with RDB-wt were large and strangely shaped, and most contained multiple nuclei. In some cells, as many as six nuclei could be distinguished. Ultimately, all of these cells lysed, suggesting that excessive protein synthesis results in unscheduled nuclear division (without cytokinesis) and eventually cell death. Thus, cells overexpressing eIF-4E (approximately 3- to $-fold) divided more frequently, lost contact inhibition, formed syncytia, and ultimately died. None of these results were seen with eIF-4EAla, adding further support for the idea that phosphorylation at Ser-53 is essential for eIF-4E function. Similar results were obtained upon overexpression of eIF-4E in HeLa and HBL100 cells using an integrating mammalian vector, pMAMneo, that constitutively expresses eIF-4E. Transfectants were selected for neomycin resistance with 400 µg/ml G418 and, in the case of the HeLa cells, expanded into mass cultures and subsequently clonal lines. Approximately 6-8 weeks after transfection, cells were examined by light microscopy. Cells containing the eIF-4ESer cDNA integrated into their genome exhibited an abnormal morphology, that is, an enlarged, striated, granular appearance (Fig.
10B and 10C). The cells were also multinucleated, suggesting loss of coordination between nuclear and cellular division. Cells containing the pMAMneo-eIF-4EAla construct appeared normal (Fig. 10A). To confirm the multinucleated phenotype, the cells containing the pMAMneo-eIF-4ESer and pMAMneo-eIF-4EAla constructs were stained with propidium iodide and examined by fluorescence microscopy. Most of the cells expressing eIF-4EAla contained one nucleus but a small percentage contained two nuclei, as did the untransfected HeLa cells. By contrast, most or all cells expressing eIF-4ESer contained four or more nuclei. Figure 11 shows one such cell harboring eight nuclei. This appears to be a result of uneven nuclear divisions in the absence of mitosis. An explanation might be that an increase in eIF-4E results in an increase in protein synthesis, which in turn results in a faster accumulation of cell mass. This allows cells to progress into S-phase, since entry into S-phase is closely correlated with cell

FIG. 10. Appearance of HeLa cells transfected with pMAMneo-eIF-4ESer and pMAMneo-eIF-4EAla. (A) Cells transfected with pMAMneo-eIF-4EAla had the same appearance as control (untransfected) HeLa cells. (B and C) Cells transfected with pMAMneo-eIF-4ESer, expressing the wild-type eIF-4E protein at approximately three times the endogenous level, were enlarged, had a striated appearance, and exhibited multiple nuclei.

FIG. 11. Effect of eIF-4E overexpression on nuclear and cellular division. HeLa cells transformed with pMAMneo-eIF-4ESer were propagated and fixed for 20 minutes with 2% paraformaldehyde in phosphate-buffered saline, permeabilized with 0.2% Triton X-100, stained with propidium iodide, and examined by confocal fluorescence microscopy. A single cell containing eight nuclei is shown. (Photo courtesy of Paul Andreassen and Robert Margolis, University of Washington, Seattle.)

size (79). The eIF-4E-induced increase in protein synthesis and DNA synthesis may push the cells faster than normal through the cell cycle until they undergo nuclear division. At this stage, some other factor may become rate-limiting so that the rate of cytokinesis is unable to match the increased rate of nuclear division, causing the cells to become multinucleated. Overexpression of eIF-4E was also studied in NIH 3T3 and Rat-2 cells (80). In addition to exhibiting a transformed phenotype in monolayer culture, the cells grew in soft agar and formed tumors in nude mice. Again, eIF-4EAla did not produce these effects, confirming the importance of eIF-4E phosphorylation in cell growth. Using a slightly different approach, Smith et al. (81) microinjected purified initiation factors into quiescent NIH 3T3 cells and found that either eIF-4E or eIF-4 (formerly eIF-4F) caused rapid entry of cells into the S-phase and also caused morphological transformation. The microinjection of other factors, including EF-1α, EF-1H, eIF-4A, and eIF-4EAla, showed none of these effects. Sonenberg (80) has suggested that eIF-4E is a member of a new class of cytoplasmic oncogenes associated with protein synthesis, distinct from oncogenes associated with the cell membrane or nucleus. However, it should be noted that the cell lines (HeLa, HBL100, NIH 3T3, and Rat-2) used in all of the studies described above were immortalized. Transformation of primary non-established cells by eIF-4E alone has never been demonstrated. In fact, tumorigenic transformation of primary rat embryo fibroblasts in culture by eIF-4E requires the collaboration of oncogenes from a different class, such as c-myc or E1A (82). Also, preliminary evidence suggests that overexpression of eIF-4E is unable to cause transformation of murine T-lymphocytes (S. Bane, R. De Benedetti, R. E. Rhoads, D. Cohen and A. Kaplan, unpublished results). These findings are consistent with the requirement for complementation and cooperation between oncogenes to transform cells. The finding that eIF-4E overexpression caused a transformed cell phenotype raised the question of whether some known oncogenes might exert some or all of their actions through an effect on eIF-4E. As noted above, regulated phosphorylation of eIF-4E appears to play a critical role in the alteration of translational efficiency in response to growth modulators. Many oncogene products mimic growth factors or their receptors and hence may exert their influence by altering phosphorylation of eIF-4E. This hypothesis was examined in continuous rat embryo fibroblasts transformed with the activated Harvey ras oncogene (83). The expression of Ha-ras increased the rates of growth and protein synthesis about 4-fold but did not alter the levels of eIF-4E mRNA or protein. However, the rate of phosphorylation of eIF-4E was increased almost 7-fold. Despite this, the fraction of eIF-4E in the phosphorylated state was unchanged in ras-transfected cells. This suggests that there is a proportional increase in eIF-4E dephosphorylation and raises the possibility that the turnover of the Ser-53 phosphate rather than the degree of phosphorylation is more closely correlated with an enhanced initiation rate. Phosphorylation of eIF-4E was also induced in a ras-dependent manner during nerve-growth-factor-mediated differentiation of PC12 cells (84).
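The inference drawn from the Ha-ras result, that an unchanged phosphorylated fraction together with a faster phosphorylation rate implies a proportional rise in dephosphorylation, follows from a simple two-state picture. The Python sketch below assumes first-order interconversion between unphosphorylated and phosphorylated eIF-4E; the rate constants are in arbitrary units and are hypothetical, chosen only so that the kinase term rises 7-fold as in the text.

```python
def steady_state_fraction(k_kinase, k_phosphatase):
    """Phosphorylated fraction when phosphorylation and dephosphorylation balance."""
    return k_kinase / (k_kinase + k_phosphatase)


def phosphate_turnover(k_kinase, k_phosphatase):
    """Steady-state flux through the phosphorylation step (per unit total eIF-4E)."""
    fraction = steady_state_fraction(k_kinase, k_phosphatase)
    return k_kinase * (1.0 - fraction)


# HYPOTHETICAL rate constants, arbitrary units.
control = (1.0, 1.0)
kinase_only = (7.0, 1.0)   # 7-fold faster kinase, phosphatase unchanged
both_raised = (7.0, 7.0)   # kinase and phosphatase both 7-fold faster

for label, (kk, kp) in (("control", control),
                        ("kinase only", kinase_only),
                        ("both raised", both_raised)):
    print(f"{label:12s} fraction = {steady_state_fraction(kk, kp):.3f}, "
          f"turnover = {phosphate_turnover(kk, kp):.3f}")
```

In this picture the phosphorylated fraction stays constant only when both rate constants rise together, and in that case the turnover of the phosphate rises by the same factor, which is the quantity the text suggests may track the initiation rate.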
An increase in the steady-state phosphorylation of eIF-4E was seen upon treatment of PC12 cells with nerve growth factor. This effect was abrogated in PC12 cells expressing a dominant inhibitory ras mutant. These results demonstrate a link between regulation of translational initiation by phosphorylation of eIF-4E and the ras signal-transduction system. The effect of the oncogene src has also been examined. NIH 3T3 cells transformed with pp60src exhibit a 3-fold enhancement in eIF-4E phosphorylation but no change in the protein level (119). The foregoing results demonstrate that increased eIF-4E activity, whether caused by overexpression of the protein itself or overphosphorylation by the putative eIF-4E kinase, results in accelerated growth, a transformed phenotype, and, in some cases, abnormal cell division and death. The mechanisms underlying these results are undoubtedly complex. It is attractive to speculate, however, that the increased activity of eIF-4E relieves translational repression of mRNAs such as those listed in Table I. Supporting this


hypothesis are the reports that the translation of ornithine aminotransferase mRNA is enhanced by overexpression of eIF-4E (85) and that, following serum stimulation of fibroblasts, eIF-4E is phosphorylated with the same kinetics as the recruitment of the translationally repressed L32 mRNA into polysomes (86). Regardless of the mechanism, the results suggest that eIF-4E is an important component of growth factor- and mitogen-activated signal-transduction pathways and probably functions downstream from many mitogens. Most mitogens cause changes in activity of a variety of kinases and phosphatases. The kinase that phosphorylates eIF-4E, although as yet unidentified, appears to be among these. Some studies have indicated that eIF-4E is not phosphorylated on Ser-53 in vivo by known kinases such as S6, PAKI, PAKII, CKI, CKII, PKA, and PKC (57, 87-89). Also, the amino-acid sequence surrounding Ser-53 does not fit the consensus of any known kinase (90). Bearing in mind the profound effects on overall growth, ATP consumption, and expression of rare proteins that would result from changes in eIF-4E activity, it is likely that there is a unique kinase dedicated to eIF-4E. Identification of the kinase (and phosphatase) for eIF-4E may help decipher the cascade of events triggered by external stimuli.

B. Underexpression of eIF-4E

1. EXPRESSION OF ANTISENSE RNA

Additional evidence for the key role played by eIF-4E in cell growth was obtained by expressing antisense RNA (AS RNA) against eIF-4E mRNA (91). A 20-nucleotide oligomer complementary to the 5'-terminal region of eIF-4E mRNA was cloned into the RDB vector described above. Cells transformed with the antisense construct (hereafter referred to as AS cells) grew slowly, with a doubling time of about 100 hours (Fig. 12A, open squares), whereas untransfected HeLa cells doubled in approximately 25 hours (Fig. 12B, solid circles). The fact that inhibitory effects were observed in the absence of inducer is consistent with previous observations that a low level of constitutive gene expression occurs with this promoter-enhancer combination (78, 91, 92). Addition of the inducer TCDD to AS cells caused further slowing of the growth rate, followed by a decline in cell number after 2 days (Fig. 12A, solid squares). TCDD had no detectable effect on untransfected HeLa cells (Fig. 12B, open circles) or on cells transformed with RDB-0, the vector with no insert (data not shown). The 100-hour doubling time in Fig. 12A was obtained with 0.2 mg/ml G418. When cells were cultured in 0.4 or 0.6 mg/ml G418, the doubling time increased to 170 hours, reflecting the increase in vector copy number (93). Conversely, when AS cells were maintained without G418 selection, they resumed normal

FIG. 12. Effect on cell growth of expressing antisense (AS) RNA complementary to eIF-4E mRNA. Cells were plated in 25-cm2 flasks with a graduated bottom. The average cell number in four random 1-cm2 grids was taken each day, beginning 1 day after plating (x-axis: days after plating). (A) AS cells cultured in 0.2 mg/ml G418, with (solid squares) and without (open squares) the inducer TCDD. (B) Control untransfected HeLa cells grown with (open circles) and without (solid circles) TCDD. (Inset) Northern analysis of AS RNA produced in control HeLa cells (C), AS cells without inducer (AS-), and AS cells with inducer for 18 hours (AS+). [Reprinted with permission from Molecular and Cellular Biology (91).]

growth rates in about 2 weeks, presumably due to a reduction of vector copies. These results indicate that the phenotype of slow growth is due to the expression of eIF-4E AS sequences and not to the vector, G418, or TCDD per se. Both eIF-4E mRNA and protein levels were reduced in proportion to the degree of AS RNA expression. The same was true for the rates of protein synthesis both in vivo and in vitro. The polysomes in the AS cells were disaggregated, with a concomitant increase in ribosomal subunits. Surprisingly, when an in vitro translation system was made from AS cells, translation was not restored by addition of eIF-4E but was restored by eIF-4

FIG. 13. Decay rates of eIF-4A, eIF-4E, eIF-4γ (p220), and protein synthesis in AS cells (x-axis: time after TCDD addition, hours). AS cells were incubated with TCDD in multiwell plates for the times indicated. Protein-synthesis rates were measured in cells pulse-labeled for 3 hours with [3H]leucine (solid squares). eIF-4E was measured in cells labeled to equilibrium with [3H]leucine for 48 hours. The cells were lysed and eIF-4E was isolated by affinity chromatography and subjected to SDS-PAGE, and the radioactivity was quantitated by fluorography (open squares). eIF-4γ (solid triangles) and eIF-4A (hourglasses) were measured by Western blotting. All the values in this figure are expressed relative to AS cells not treated with TCDD. [Reprinted with permission from Molecular and Cellular Biology (91).]

(formerly eIF-4F). This suggests that the extracts were deficient in another component of eIF-4 in addition to eIF-4E. The levels of the two other components of eIF-4, viz. eIF-4A and eIF-4γ, were determined immunologically along with the overall rate of in vivo protein synthesis after addition of the inducer TCDD⁵ (Fig. 13). The results indicated that, except for a slight initial lag, eIF-4E and eIF-4γ decayed with nearly the same kinetics. eIF-4A remained unchanged over the period in which eIF-4E and eIF-4γ decreased to undetectable levels. In a separate experiment, the level of eIF-4A in control HeLa cells was the same as in AS cells (data not shown). Protein synthesis in the AS cells decreased the most between 6 and 18 hours, in parallel with eIF-4E and eIF-4γ, and thereafter more slowly, until a residual level of translation was reached. Synthesis of most proteins was reduced in AS cells, but that of some proteins was more resistant to the general inhibition. The identity of many of these proteins was established in experiments described in the following section.

⁵ Note that in the experiment described in Fig. 13, the AS cells without addition of inducer were already synthesizing protein at only 20-25% of the rate of control (untransfected) HeLa cells. Addition of inducer further depressed synthesis to 8%.
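The doubling times quoted for Fig. 12 (about 25 hours for untransfected HeLa cells, about 100 hours for AS cells in 0.2 mg/ml G418, and 170 hours at higher G418 concentrations) can be turned into expected fold increases in cell number. The following is only a minimal sketch: it assumes simple exponential growth with a constant doubling time, and the 4-day window is an illustrative choice, not a value taken from the figure.

```python
import math


def fold_increase(hours: float, doubling_time_h: float) -> float:
    """Fold increase in cell number after `hours` of exponential growth."""
    return 2.0 ** (hours / doubling_time_h)


def growth_rate_per_hour(doubling_time_h: float) -> float:
    """Specific growth rate (per hour) corresponding to a doubling time."""
    return math.log(2.0) / doubling_time_h


# Doubling times quoted in the text (hours); the labels are ours.
DOUBLING_TIMES_H = {
    "HeLa, untransfected": 25.0,
    "AS cells, 0.2 mg/ml G418": 100.0,
    "AS cells, 0.4-0.6 mg/ml G418": 170.0,
}

WINDOW_H = 96.0  # assumed observation window of 4 days (illustrative only)

for label, td in DOUBLING_TIMES_H.items():
    print(f"{label}: {fold_increase(WINDOW_H, td):.2f}-fold in {WINDOW_H:.0f} h, "
          f"rate {growth_rate_per_hour(td):.4f} per h")
```

Under these assumptions a fourfold longer doubling time does not mean fourfold fewer cells; the gap widens exponentially with time, which is why the slow-growth phenotype is so conspicuous on a logarithmic plot.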

2. AS CELLS MAKE ONLY HSPs

The foregoing experiments demonstrated that expression of AS RNA against eIF-4E mRNA caused a decrease in the levels of eIF-4E and eIF-4γ, the initiation factors required for cap-dependent translation, which resulted in a drastic reduction in the rate of protein synthesis as well as a change in the spectrum of polypeptides synthesized. Nevertheless, protein synthesis continued at approximately 8% of the control rate. Heat shock is another condition in which protein synthesis is both quantitatively and qualitatively altered (94). Therefore, the pattern of 35S-labeled polypeptides synthesized after heat shock was compared to that of AS cells (18; Fig. 14). When compared to control HeLa cells (Lane C), the translation in heat-shocked HeLa cells was reduced slightly and the synthesis of HSPs was induced (Lane HS). Since the cells were allowed to recover for 3 hours after heat shock, the translation of normal mRNAs was almost fully recovered. Protein synthesis in the AS cells (Lane AS0) was drastically reduced (note the longer time of autoradiography). When AS RNA expression was further induced by treatment with TCDD, protein synthesis decreased even further (Lane AS48; again, note the difference in autoradiographic exposure), although some proteins continued to be synthesized. In most cases, these displayed mobilities characteristic of HSPs. To obtain positive identification of the proteins resistant to eIF-4E depletion, Western blotting experiments were performed with monoclonal antibodies to known HSPs. The results of Western blotting with four antibodies generated against HSPs 90, 70, 65, and 27 provided positive identification of major polypeptides surviving in AS cells and indicated that, by both electrophoretic mobility and immunoreactivity, they were the same as known HSPs. Furthermore, the levels of these proteins as a percentage of total cellular protein were increased in AS cells in most cases.
The Western blotting experiments provided information on the total levels of HSPs but not on their rates of synthesis. In order to measure the latter, newly synthesized 35S-labeled proteins were analyzed by immunoprecipitation. Pulse labeling followed by immunoprecipitation indicated that HSPs 90 and 70 were synthesized more rapidly in AS cells than in control cells. The accelerated synthesis of HSPs in the AS cells was not due, however, to increased mRNA levels; the levels of HSP-90 and -70 mRNAs either remained the same or decreased, respectively, after induction of AS RNA expression as judged by Northern analysis.

FIG. 14. Pattern of 35S-labeled proteins in heat-shocked and AS cells (lanes C, HS, AS0, and AS48; size markers at 200, 92.5, 69, 46, 30, and 21.5). HeLa cells were either untreated (C) or were subjected to heat shock and recovery for 3 hours (HS). Cells transfected with a vector expressing AS RNA against eIF-4E under control of a TCDD-inducible promoter (AS cells) were either untreated (AS0) or were induced with TCDD for 48 hours (AS48). Cells were labeled with [35S]methionine for the last 3 hours of treatment, cell extracts were prepared, and 50-µg aliquots of protein from each sample were resolved by SDS-PAGE. The autoradiographic exposure times for C, HS, AS0, and AS48 were 6, 6, 18, and 76 hours, respectively. [Reprinted with permission from the Journal of Biological Chemistry (18).]

The foregoing results indicate that HSP mRNAs are more efficiently translated in AS cells, since more protein was synthesized from either the same (HSP-90) or less (HSP-70) mRNA. Since initiation of protein synthesis is rate-limiting under normal conditions, the more efficient utilization of mRNAs should be reflected in a higher rate of initiation and a concomitant shift of HSP mRNAs to higher polysomes, assuming that there is not a simultaneous and proportionate increase in elongation rate. To test this prediction directly, RNA from polysomal fractions was analyzed using 32P-labeled cDNA probes to HSP-90, HSP-70, and β-actin mRNA (which should represent the polysomal behavior of a normal cellular message). As reported previously (91), polysomes were clearly visible in the control cells (Fig. 15A, HeLa), but in AS cells treated with TCDD for 48 hours (AS), they were almost completely disaggregated, with accompanying increases in monosomes and ribosomal subunits. The β-actin mRNA was associated mostly with large polysomes in control cells (midpoint, 8-9 ribosomes per mRNA) but was shifted into the nonpolysomal and small polysomal fractions

FIG. 15. Analysis of polysomal mRNA distribution (panels: A, polysome profiles; B, actin; C, HSP 90; each shown for HeLa and AS cells). Control HeLa cells and AS cells treated with TCDD for 48 hours were harvested and analyzed for polysomes. (A) Optical density profiles of polysomes. The direction of sedimentation is left to right. (B) The polysomal distribution of β-actin mRNA as determined by hybridization to actin cDNA using a “slot-blot” apparatus. The ribosomal subunits are indicated by 40 and 60; monosomes, by 80; and polysomes, by 2, 3 (disome, trisome), etc. (C) Same as (B), except cDNA to HSP 90 was used for hybridization. [Reprinted with permission from the Journal of Biological Chemistry (18).]


in AS cells (Fig. 15B). The HSP-90 mRNA showed just the opposite behavior; it was associated with small polysomes in control cells but shifted to larger polysomes in AS cells (Fig. 15C). A similar shift from lighter to heavier polysomes was observed in the case of HSP-70 mRNA (not shown). This confirms that HSP-90 and -70 mRNAs are more efficiently translated in AS cells, despite the almost complete loss of polysomes. The reason that HSP mRNAs continue to be translated in AS cells is not clear. The previous observation that they are preferentially translated under conditions of reduced polypeptide chain initiation caused by either heat shock or hypertonic conditions (95) led to the suggestion that they are “strong” mRNAs. This is not likely to be the case since (i) after heat shock, a return to normal translation is accompanied by a decrease in HSP synthesis, without an appreciable decrease in HSP mRNA levels (96), (ii) HSP-70 mRNA is associated with large polysomes during heat shock but with small polysomes after 90 minutes of recovery, and (iii) HSP mRNAs are associated with small polysomes in control cells (Fig. 15). If they were strong mRNAs, as hypothesized, they would be associated with large polysomes, as was actin mRNA. Thus, it appears that HSP mRNAs are actually outcompeted for translation under normal conditions. A more likely explanation for the preferential translation of HSP mRNAs is that their mechanism of initiation is qualitatively different from that of normal mRNAs. One possibility for the low requirement of HSP mRNAs for eIF-4E and/or eIF-4y is that they contain very little secondary structure and hence do not require ATP-dependent unwinding by the eIF-4 machinery. Another possibility is that they use a non-cap-dependent pathway of initiation (e.g., internal initiation). 
HSP-78 mRNA can support internal initiation (17); when expressed in a bicistronic construct, the 5’-UTR of HSP-78 mRNA allows internal ribosome binding and translation of the distal cistron. The only other cellular mRNA currently thought to be internally initiated is that of the Drosophila antennapedia gene (127). Similar experiments with bicistronic constructs will be necessary to establish whether all HSP mRNAs are, in fact, internally initiated. It appears logical that stress proteins would be initiated by a non-cap-dependent mechanism since heat shock inactivates the eIF-4 complex (97, 98).
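The polysome argument used above (a higher initiation rate shifts an mRNA to larger polysomes unless elongation speeds up proportionately) follows from a simple steady-state relation: the mean number of ribosomes on an mRNA is the initiation rate multiplied by the time a ribosome takes to traverse the coding region. The sketch below illustrates that relation only. All numerical values are hypothetical, and the relation assumes that initiation is rate-limiting and that ribosomes do not queue.

```python
def ribosomes_per_mrna(init_per_s: float,
                       coding_len_codons: float,
                       elong_codons_per_s: float) -> float:
    """Steady-state ribosome load: initiation rate x ribosome transit time.

    Assumes initiation is rate-limiting and ribosomes do not interfere
    with one another (no queuing), so the load scales linearly.
    """
    transit_time_s = coding_len_codons / elong_codons_per_s
    return init_per_s * transit_time_s


# Hypothetical illustrative values, not measurements from the text.
BASE_INIT = 0.05      # initiations per second per mRNA
LENGTH = 700.0        # coding-region length in codons
BASE_ELONG = 5.0      # codons per second

baseline = ribosomes_per_mrna(BASE_INIT, LENGTH, BASE_ELONG)
faster_init = ribosomes_per_mrna(2 * BASE_INIT, LENGTH, BASE_ELONG)
both_faster = ribosomes_per_mrna(2 * BASE_INIT, LENGTH, 2 * BASE_ELONG)

print(f"baseline load:                {baseline:.1f} ribosomes per mRNA")
print(f"initiation doubled:           {faster_init:.1f} ribosomes per mRNA")
print(f"initiation and elongation x2: {both_faster:.1f} ribosomes per mRNA")
```

The third case shows why the text adds the caveat about elongation: if both rates rise together, protein output increases but the polysome size does not change, so a shift to heavier polysomes is evidence specifically for faster initiation.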

VII. Summary, Conclusions, and Future Directions

Initiation of protein synthesis is a complex and highly regulated process, and there are numerous mechanisms by which regulation occurs, some global in nature and others targeted to individual mRNAs. The results


presented in this review focus on only one factor and one step in protein synthesis, eIF-4E and the binding of mRNA to ribosomes. However, both in vivo and in vitro evidence is accumulating that this is a point of major regulation of overall protein synthesis rate. Phosphorylation of eIF-4E at Ser-53 profoundly affects its activity, but it should be kept in mind that no effects of eIF-4E phosphorylation have yet been demonstrated in purified systems. Hence, the effects may be indirect. It is also clear that multiple phosphorylations take place on eIF-4E, yet neither their sites nor effects on activity are known. A variety of external stimuli affect both eIF-4E phosphorylation and the rate of protein synthesis. It is tempting to formulate hypothetical schemes of phosphorylation cascades beginning with receptor-linked protein-tyrosine kinases and ending with activation of eIF-4E. Yet eIF-4E itself is a very poor substrate for any of the purified kinases tested, and one must recognize the possibility that phosphorylation takes place at a unique stage of initiation, i.e., when eIF-4E is in a particular complex. Furthermore, despite considerable effort by a number of researchers, the putative eIF-4E kinase remains unidentified. Perhaps fundamental to the problem of how eIF-4E activity is regulated is a knowledge of how the eIF-4 factors function in the initiation process. Until the order of binding, the relative affinity constants, the alterations on covalent modification, the intermediate complexes, and the individual activities of the various eIF-3 and -4 polypeptides are delineated, proposals for their regulation remain premature. The loss of growth control upon overexpression of eIF-4E, by increasing either protein level or phosphorylation, will undoubtedly be an even more complex process to understand. The original model of Lodish (4), however, is appealing in its simplicity (Fig. 16).
If R* is considered to be the phosphorylated form of eIF-4E, overriding the normal cellular controls of eIF-4E activity may cause the excessive translation of weaker mRNAs, e.g., those in Table I, and cause the cells to undergo more rapid growth and cell division (Fig. 9). Thus, one may think of eIF-4E as a proto-oncogene and consider translational routes to oncogenesis, to complement and broaden current transcriptional models. In fact, some of the poorly translated proto-oncogene mRNAs listed in Table I encode transcription factors, so that a general stimulation of translation would be expected to produce a disproportionate increase in transcription factor levels. At even higher levels of eIF-4E, or if sustained over a longer period of time, nuclear division may exceed cell division (Figs. 10 and 11) and lead to cell death. Reduction of eIF-4E levels slows cell growth (Fig. 12) and produces a block in translation (Fig. 14) which resembles heat shock and picornavirus infection. Under these conditions, only strong mRNAs, or alternatively, mRNAs that utilize non-cap-dependent mechanisms of initiation, are translated. These proteins may have special survival value for the cell or aid in recovery from the stress condition. At the extreme, however, cell death results from complete depletion of eIF-4E (Fig. 12). The ability to deplete the intact cell of eIF-4E and eIF-4γ should provide a new tool for identifying mRNAs that utilize this non-cap-dependent pathway of initiation and for elucidating the mechanism by which this pathway occurs.

FIG. 16. Model for the interrelation of cell growth, cell phenotype, and protein-synthesis rate (curves for a “strong” and a “weak” mRNA plotted against R* (eIF-4E-P) concentration; regions labeled Cell Death, Quiescence, Slow Growth, Normal Growth, Transformation/Tumorigenesis, Abnormal Cell Growth, and Cell Death). The rate-limiting component for protein synthesis under normal (non-stressed) conditions, originally termed R* (4), may, in fact, be the phosphorylated form of eIF-4E, based on the evidence presented in this article. According to the model, increases in the levels of eIF-4E protein by overexpression (Figs. 9-11) or phosphorylation of eIF-4E by mitogens or oncogene products (Table II) would lead to a relatively greater expression of “weak” mRNAs (Table I), causing cells to escape normal growth controls and, in the extreme, lose the coordination between mitosis and cytokinesis (Fig. 11). Decreases in eIF-4E protein levels by underexpression (Figs. 12-15) or decreases in eIF-4E phosphorylation (Table II) would cause cells to slow, enter quiescence, or, in the extreme, lose viability.
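The qualitative prediction of the competition model in Fig. 16, that raising the concentration of a limiting component R* favors “weak” mRNAs disproportionately, can be illustrated numerically. The sketch below is not the formulation of the original model (4); it assumes simple saturable binding of R* to each mRNA class, treats R* as a free concentration that is not depleted by binding, and uses arbitrary, hypothetical affinity constants.

```python
def fraction_initiating(r_star: float, affinity: float) -> float:
    """Fraction of an mRNA species engaged by R* under saturable binding.

    `r_star` and `affinity` are in arbitrary, mutually consistent units.
    """
    return affinity * r_star / (1.0 + affinity * r_star)


def fold_change(affinity: float, r_low: float, r_high: float) -> float:
    """Relative change in initiation when R* rises from r_low to r_high."""
    return fraction_initiating(r_high, affinity) / fraction_initiating(r_low, affinity)


# Hypothetical affinities chosen only to separate the two classes.
K_STRONG = 10.0
K_WEAK = 0.1

for r in (0.01, 0.1, 1.0, 10.0):
    print(f"R* = {r:>5}: strong {fraction_initiating(r, K_STRONG):.3f}, "
          f"weak {fraction_initiating(r, K_WEAK):.3f}")

print(f"10-fold rise in R* (0.1 to 1.0): strong x{fold_change(K_STRONG, 0.1, 1.0):.2f}, "
      f"weak x{fold_change(K_WEAK, 0.1, 1.0):.2f}")
```

In this toy model the strong mRNA is already close to saturation, so additional R* changes it little, whereas the weak mRNA responds almost linearly. The converse also holds: when R* falls, weak mRNAs lose translation first, which matches the direction of the effects summarized in the figure legend.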

REFERENCES

1. M. C. Pannevis and D. F. Houlihan, J. Comp. Physiol. Biol. 162, 393 (1992).
2. H. Trachsel, Ed., “Translation in Eukaryotes,” CRC Press, Boca Raton, Florida, 1991.
3. G. Koch, J. A. Bilello, J. Kruppa and F. Koch, Ann. N.Y. Acad. Sci. 339, 280 (1980).
4. H. F. Lodish, Nature 251, 385 (1974).
5. R. J. Jackson, in “Translation in Eukaryotes” (H. Trachsel, ed.), pp. 193-230. CRC Press, Boca Raton, Florida, 1991.
6. R. E. Rhoads, Curr. Opin. Cell Biol. 3, 1019 (1991).
7. A. Marcus, JBC 245, 955 (1970).
8. C. Darnbrough, S. Legon, T. Hunt and R. J. Jackson, JMB 76, 379 (1973).
9. M. Kozak, JBC 266, 19867 (1991).
10. R. Baserga, Cancer Res. 50, 6769 (1990).
11. R. E. Thach, Ed., “Translationally Regulated Genes in Higher Eukaryotes,” Enzyme 44. Karger, Basel, 1991.
12. R. E. Rhoads, in “Translation in Eukaryotes” (H. Trachsel, ed.), pp. 109-148. CRC Press, Boca Raton, Florida, 1991.
13. J. W. B. Hershey, ARB 60, 717 (1991).
14. A. Marcus, JBC 245, 962 (1970).
15. A. J. Shatkin, Cell 40, 223 (1985).
16. R. J. Jackson, Nature 353, 14 (1991).
17. D. G. Macejak and P. Sarnow, Nature 353, 90 (1991).
18. S. Joshi-Barve, A. De Benedetti and R. E. Rhoads, JBC 267, 21038 (1992).
19. S. C. Milburn, R. F. Duncan and J. W. B. Hershey, ABB 276, 6 (1990).
20. H. Trachsel, B. Erni, M. H. Schrier and T. Staehelin, JMB 116, 755 (1977).
21. R. Benne and J. W. B. Hershey, JBC 253, 3078 (1978).
22. R. D. Abramson, T. E. Dever, T. G. Lawson, B. K. Ray, R. E. Thach and W. C. Merrick, JBC 262, 3826 (1987).
23. S. C. Milburn, K. Kelleher, M. V. Davies, R. J. Kaufman and J. W. B. Hershey, EMBO J. 9, 2783 (1990).
24. F. Rozen, I. Edery, K. Meerovitch, T. E. Dever, W. C. Merrick and N. Sonenberg, MCBiol 10, 1134 (1990).
25. B. K. Ray, T. G. Lawson, R. D. Abramson, W. C. Merrick and R. E. Thach, JBC 261, 11466 (1986).
26. N. Sonenberg, M. A. Morgan, W. C. Merrick and A. J. Shatkin, PNAS 75, 4843 (1978).
27. G. M. Hellmann, L.-Y. Chu and R. E. Rhoads, JBC 257, 4056 (1982).
28. L. S. Hiremath, S. T. Hiremath, W. Rychlik, S. Joshi, L. L. Domier and R. E. Rhoads, JBC 264, 1132 (1989).
29. S. M. Tahara, M. A. Morgan and A. J. Shatkin, JBC 256, 7691 (1981).
30. D. Etchison, S. C. Milburn, I. Edery, N. Sonenberg and J. W. B. Hershey, JBC 257, 14806 (1982).
31. J. A. Grifo, S. M. Tahara, M. A. Morgan, A. J. Shatkin and W. C. Merrick, JBC 258, 5804 (1983).
32. I. Edery, M. Humbelin, A. Darveau, K. A. W. Lee, S. Milburn, J. W. B. Hershey, H. Trachsel and N. Sonenberg, JBC 258, 11398 (1983).
33. B. Safer, EJB 186, 1 (1989).
34. R. W. Yan, W. Rychlik, D. Etchison and R. E. Rhoads, JBC 267, 23226 (1992).
35. B. K. Ray, T. G. Lawson, J. C. Kramer, M. H. Cladaras, J. A. Grifo, R. D. Abramson, W. C. Merrick and R. E. Thach, JBC 260, 7651 (1985).
36. K. A. W. Lee and N. Sonenberg, PNAS 79, 3447 (1982).
37. A. M. Bonneau and N. Sonenberg, J. Virol. 61, 986 (1987).
38. L. Perez and L. Carrasco, J. Virol. 189, 178 (1992).
39. P. T. Tuazon, S. J. Morley, T. E. Dever, W. C. Merrick, R. E. Rhoads and J. A. Traugh, JBC 265, 10617 (1990).
40. N. Sonenberg, K. M. Rupprecht, S. M. Hecht and A. J. Shatkin, PNAS 76, 4345 (1979).

41. J. L. Hansen, D. O. Etchison, J. W. B. Hershey and E. Ehrenfeld, MCBiol 2, 1639 (1982).
42. N. R. Webb, R. V. J. Chari, G. DePillis, J. W. Kozarich and R. E. Rhoads, Bchem 23, 177 (1984).
43. H. Trachsel, N. Sonenberg, A. J. Shatkin, J. K. Rose, K. Leong, J. E. Bergmann, J. Gordon and D. Baltimore, PNAS 77, 770 (1980).
44. D. Etchison and K. Smith, JBC 265, 7492 (1990).
45. B. Buckley and E. Ehrenfeld, JBC 262, 13599 (1987).
46. D. Etchison and S. Milburn, MCBchem 76, 15 (1987).
47. S. R. Lax, K. S. Browning, D. M. Maia and J. M. Ravel, JBC 261, 15632 (1986).
48. S. N. Seal, A. Schmidt, A. Marcus, I. Edery and N. Sonenberg, ABB 246, 710 (1986).
49. K. S. Browning, S. R. Lax and J. M. Ravel, JBC 262, 11228 (1987).
50. W. Rychlik, P. R. Gardner, T. C. Vanaman and R. E. Rhoads, JBC 261, 71 (1986).
51. M. Altmann, P. P. Muller, J. Pelletier, N. Sonenberg and H. Trachsel, JBC 264, 12145 (1989).
52. W. Rychlik, L. L. Domier, P. R. Gardner, G. M. Hellmann and R. E. Rhoads, PNAS 84, 945 (1987).
53. B. Buckley and E. Ehrenfeld, Virology 152, 497 (1986).
54. R. Duncan, S. C. Milburn and J. W. B. Hershey, JBC 262, 380 (1987).
55. W. Rychlik, M. A. Russ and R. E. Rhoads, JBC 262, 10434 (1987).
56. S. Joshi-Barve, W. Rychlik and R. E. Rhoads, JBC 265, 2979 (1990).
57. W. Rychlik, J. S. Rush, R. E. Rhoads and C. J. Waechter, JBC 265, 19467 (1990).
58. R. W. Donaldson, C. H. Hagedorn and S. Cohen, JBC 266, 3162 (1991).
59. W. D. McCubbin, I. Edery, M. Altmann, N. Sonenberg and C. M. Kay, JBC 263, 17663 (1988).
60. A. J. Chavan, W. Rychlik, D. Blaas, E. Kuechler, D. S. Watt and R. E. Rhoads, Bchem 29, 5521 (1990).
61. S. E. Carberry, E. Darzynkiewicz and D. J. Goss, Bchem 30, 1624 (1991).
62. S. E. Carberry and D. J. Goss, Bchem 30, 4542 (1991).
63. S. E. Carberry and D. J. Goss, Bchem 30, 6977 (1991).
64. S. E. Carberry, D. E. Friedland, R. E. Rhoads and D. J. Goss, Bchem 31, 1427 (1992).
65. J. Pelletier and N. Sonenberg, MCBiol 5, 3222 (1985).
66. E. Patzelt, D. Blaas and E. Kuechler, NARes 11, 5821 (1983).
67. S. E. Carberry, R. E. Rhoads and D. J. Goss, Bchem 28, 8078 (1989).
68. S. E. Carberry, E. Darzynkiewicz, J. Stepinski, S. M. Tahara, R. E. Rhoads and D. J. Goss, Bchem 29, 3337 (1990).
69. D. J. Goss, S. E. Carberry, T. E. Dever, W. C. Merrick and R. E. Rhoads, Bchem 29, 5008 (1990).
70. R. E. Rhoads, G. M. Hellmann, P. Remy and J. P. Ebel, Bchem 22, 6084 (1983).
71. W. D. McCubbin, I. Edery, M. Altmann, N. Sonenberg and C. M. Kay, FEBS Lett. 245, 261 (1989).
72. N. Sonenberg, NARes 9, 1643 (1981).
73. M. Kozak, Cell 22, 459 (1980).
74. J. A. Grifo, S. M. Tahara, J. P. Leis, M. A. Morgan, A. J. Shatkin and W. C. Merrick, JBC 257, 5246 (1982).
75. D. J. Goss, C. L. Woodley and A. J. Wahba, Bchem 26, 1551 (1987).
76. L. S. Hiremath, N. R. Webb and R. E. Rhoads, JBC 260, 7843 (1985).
77. A. M. Bonneau and N. Sonenberg, JBC 262, 11134 (1987).
78. A. De Benedetti and R. E. Rhoads, PNAS 87, 8212 (1990).
79. R. Baserga, Exp. Cell Res. 151, 1 (1984).

80. A. Lazaris-Karatzas, K. S. Montine and N. Sonenberg, Nature 345, 544 (1990).
81. M. R. Smith, M. Jaramillo, Y. Liu, T. E. Dever, W. C. Merrick, H. Kung and N. Sonenberg, New Biol. 2, 648 (1990).
82. A. Lazaris-Karatzas and N. Sonenberg, MCBiol 12, 1234 (1992).
83. C. W. Rinker-Schaeffer, V. Austin, S. Zimmer and R. E. Rhoads, JBC 267, 10659 (1992).
84. R. M. Frederickson, W. E. Mushynski and N. Sonenberg, MCBiol 12, 1239 (1992).
85. R. J. Fagan, A. Lazaris-Karatzas, N. Sonenberg and R. Rozen, JBC 266, 16518 (1991).
86. R. L. Kaspar, W. Rychlik, M. W. White, R. E. Rhoads and D. R. Morris, JBC 265, 3619 (1990).
87. P. T. Tuazon, W. C. Merrick and J. A. Traugh, JBC 264, 2773 (1989).
88. E. L. McMullin, D. W. Haas, R. D. Abramson, R. E. Thach, W. C. Merrick and C. H. Hagedorn, BBRC 153, 340 (1988).
89. D. W. Haas and C. H. Hagedorn, ABB 284, 84 (1991).
90. B. E. Kemp and R. B. Pearson, TIBS 15, 342 (1990).
91. A. De Benedetti, S. Joshi-Barve, C. Rinker-Schaeffer and R. E. Rhoads, MCBiol 11, 5435 (1991).
92. P. B. C. Jones, L. K. Durrin, D. R. Galeazzi and J. P. Whitlock, Jr., PNAS 83, 2802 (1986).
93. A. De Benedetti and R. E. Rhoads, NARes 19, 1925 (1991).
94. S. Lindquist, ARB 55, 1151 (1986).
95. E. D. Hickey and L. A. Weber, Bchem 21, 1513 (1982).
96. A. De Benedetti and C. Baglioni, JBC 261, 15800 (1986).
97. B. J. Lamphear and R. Panniers, JBC 266, 2789 (1991).
98. J. M. Zapata, F. G. Maroto and J. M. Sierra, JBC 266, 16007 (1991).
99. H. Saito, A. C. Hayday, K. Wiman, W. S. Hayward and S. Tonegawa, PNAS 80, 7476 (1983).
100. A. Darveau, J. Pelletier and N. Sonenberg, PNAS 82, 2315 (1985).
101. F. Godeau, H. Persson, H. E. Gray and A. B. Pardee, EMBO J. 5, 3571 (1986).
102. J. D. Marth, R. W. Overell, K. E. Meier, E. G. Krebs and R. M. Perlmutter, Nature 332, 171 (1988).
103. C. D. Rao, M. Pech, K. C. Robbins and S. A. Aaronson, MCBiol 8, 284 (1988).
104. V. Kruys, O. Marinx, G. Shaw, J. Deschamps and G. Huez, Science 245, 852 (1989).
105. B. A. Arrick, A. L. Lee, R. L. Grendell and R. Derynck, MCBiol 9, 4306 (1991).
106. V. Kruys, B. Beutler and G. Huez, Enzyme 44, 193 (1990).
107. J. Han, T. Brown and B. Beutler, J. Exp. Med. 171, 465 (1990).
108. N. Standart and T. Hunt, Enzyme 44, 106 (1990).
109. P. J. Blackshear, R. A. Nemenoff, J. G. Hovis, D. L. Halsey, D. J. Stumpo and J. K. Huang, Mol. Endocrinol. 1, 44 (1987).
110. M. W. White, T. Kameji, A. E. Pegg and D. R. Morris, EJB 170, 87 (1987).
111. R. E. Braun, Enzyme 44, 120 (1990).
112. M. Mach, M. W. White, M. Neubauer, J. L. Degen and D. R. Morris, JBC 261, 11697 (1986).
113. S. Hongo and M. Jacobs-Lorena, Dev. Biol. 145, 338 (1991).
114. L. F. Steel and A. Jacobson, Dev. Genet. 12, 98 (1991).
115. S. Levy, D. Avni, N. Hariharan, R. P. Perry and O. Meyuhas, PNAS 88, 3319 (1991).
116. B. J. Lamphear and R. Panniers, JBC 265, 5333 (1990).
117. J. Huang and R. J. Schneider, Cell 65, 271 (1991).
118. S. J. Morley and J. A. Traugh, JBC 264, 2401 (1989).
119. R. M. Frederickson, K. S. Montine and N. Sonenberg, MCBiol 11, 2896 (1991).
120. S. J. Morley and J. A. Traugh, JBC 265, 10611 (1990).

121. J. M. Manzella, W. Rychlik, R. E. Rhoads, J. W. B. Hershey and P. J. Blackshear, JBC 266, 2383 (1991).
122. X. Bu and C. H. Hagedorn, FEBS Lett. 283, 219 (1991).
123. M. Altmann, C. Handschin and H. Trachsel, MCBiol 7, 998 (1987).
124. M. L. Allen, A. M. Metz, R. T. Timmer, R. E. Rhoads and K. S. Browning, JBC 267, 23232 (1992).
125. A. M. Metz, R. T. Timmer and K. S. Browning, NARes 20, 4096 (1992).
126. W. Rychlik and R. E. Rhoads, NARes 20, 6415 (1992).
127. S.-K. Oh, M. P. Scott and P. Sarnow, Genes Develop. 6, 1643 (1992).


Enzymology of Homologous Recombination in Saccharomyces cerevisiae

W.-D. HEYER
Institute of General Microbiology
CH-3012 Bern, Switzerland

R. D. KOLODNER¹
Division of Cellular and Molecular Biology
Dana-Farber Cancer Institute
Boston, Massachusetts 02115
Department of Biological Chemistry and Molecular Pharmacology
Harvard Medical School
Boston, Massachusetts 02115

I. Recombination Models
II. Physical Analysis of Recombination
III. Enzymology of Homologous Genetic Recombination in S. cerevisiae
   A. In Vitro Recombination Systems Based on Crude Cell Extracts
   B. Individual Enzymatic Activities
      1. Proteins Involved in Hybrid DNA Formation
      2. Nucleases
      3. Single-Stranded DNA-Binding Proteins (SSBs)
      4. DNA Helicases
      5. DNA Topoisomerases
      6. DNA Polymerases and DNA Ligase
   C. Toward a Reconstitution of a Complete in Vitro System
   D. Concluding Comments
References

¹ To whom correspondence should be addressed at the Dana-Farber Cancer Institute.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 46. Copyright © 1993 by Academic Press, Inc. All rights of reproduction in any form reserved.

Genetic recombination is a process in which DNA molecules pair at regions of homology and undergo rearrangements leading to a change in the linkage arrangement along each DNA molecule. Our knowledge about the molecular and enzymatic mechanisms of genetic recombination has come from four types of experiments. (1) Genetic experiments that predict DNA structures involved in recombination have led to the proposal of molecular models for recombination. (2) Physical characterization of recombination intermediates that occur in vivo has provided support for proposed recombination models. (3) Mutational analysis has identified gene products required for recombination and has defined recombination pathways and mechanisms; the availability of such mutants has provided a means of identifying, overproducing, and purifying recombination proteins. (4) Enzymatic studies have provided in vitro recombination systems for identifying and characterizing recombination proteins. While the exact mechanism(s) of genetic recombination is not clear at present, a combination of these approaches has led to the elucidation of some of the general features of genetic recombination. The goal of this article is to review the current state of knowledge of the proteins that function in genetic recombination in the simple eukaryote Saccharomyces cerevisiae. To provide a framework, a brief discussion of current views of the molecular mechanism(s) of genetic recombination, particularly recombination models, is presented below. More detailed discussions of recombination mechanisms, mutational analysis of recombination, and the biochemistry of recombination in higher eukaryotes are in other reviews (1-4).

I. Recombination Models

Exchanges between nearby markers in fungi and bacteriophage are often non-reciprocal (5-9); this is called gene conversion (5). There appears to be an association between gene conversion and crossing-over (10). To explain this association, a model was proposed by Holliday (11) and extended by others (12-17). The model proposes a crossed strand-exchange junction at the region of cross-over called a “Holliday junction” (Fig. 1). Such junctions can move by branch migration to produce regions of heteroduplex DNA. Subsequent processing of the Holliday junction can result in production of either crossover or non-crossover products. If the formation of heteroduplex DNA leads to the pairing of mutant and wild-type alleles, then the heteroduplex region will contain mispaired bases, the exact nature of which will depend on the exact DNA sequences of the mutant and wild-type alleles. The repair, or failure to repair, of regions of heteroduplex DNA containing a mismatch (pairing of a wild-type and mutant marker) leads to gene conversion or post-meiotic segregation, respectively. To explain several genetic studies suggesting the apparent absence of symmetric regions of heteroduplex DNA in S. cerevisiae, different models were subsequently proposed (Fig. 2). The Meselson and Radding model


HOMOLOGOUS RECOMBINATION OF S. cerevisiae


FIG. 1. Adaptation of the original Holliday recombination model. (1) Recombination initiates at breaks in the same sequence of the same single strand of both parental molecules. (2) The different parental single strands are rejoined to each other to create a cross-strand-exchange junction, popularly known as a Holliday junction. (3) The Holliday junction can branch-migrate to yield regions of heteroduplex DNA that are symmetrically located in both parental molecules. If the heteroduplex covers a region in which one of the parental molecules contains a mutation, both heteroduplex regions will contain mispaired bases, the exact nature of which will depend on the sequence of the mutant and wild-type regions. (4) Resolution of the Holliday junction occurs by endonucleolytic cleavage of two of the crossed strands. If strands 1 and 2 are cleaved, the resulting product molecules (4a) will each contain a heteroduplex patch and there will not have been crossing-over of outside markers with respect to each other. Alternatively, if strands 3 and 4 are cleaved, the resulting product molecules (4b) will each contain a heteroduplex patch and crossing-over of outside markers with respect to each other will have occurred. Key: Thin and thick lines distinguish DNA derived from the two different parental DNA molecules; half arrows indicate the 5' ends of DNA strands.
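The two resolution outcomes described in the Fig. 1 legend can be written out as a small sketch. The marker names (A/B on one parent, a/b on the other), the `Product` type, and the function names are illustrative assumptions, not part of the original figure or of any published software; only the rule "strands 1 and 2 give non-crossover, strands 3 and 4 give crossover" comes from the legend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """One resolved duplex, described by its two outside (flanking) markers."""
    left_marker: str
    right_marker: str
    heteroduplex_patch: bool


def resolve_holliday_junction(cleaved_strands):
    """Return the two product molecules for a given pair of cleaved strands.

    Strand numbering follows the Fig. 1 legend: cleaving strands 1 and 2
    leaves the outside markers in parental configuration, cleaving strands
    3 and 4 exchanges them. Both outcomes retain a heteroduplex patch.
    """
    parent_one = ("A", "B")  # hypothetical markers on the first parent
    parent_two = ("a", "b")  # hypothetical markers on the second parent
    pair = frozenset(cleaved_strands)
    if pair == frozenset({1, 2}):
        flanks = (parent_one, parent_two)
    elif pair == frozenset({3, 4}):
        flanks = ((parent_one[0], parent_two[1]), (parent_two[0], parent_one[1]))
    else:
        raise ValueError("resolution cleaves strands 1 and 2, or strands 3 and 4")
    return [Product(left, right, True) for left, right in flanks]


def is_crossover(products):
    """True if any product carries outside markers from different parents."""
    return any(p.left_marker.isupper() != p.right_marker.isupper() for p in products)


non_crossover = resolve_holliday_junction((1, 2))
crossover = resolve_holliday_junction((3, 4))
```

In this sketch both calls return products with `heteroduplex_patch=True`; only the linkage of the outside markers differs, which is the point the legend makes.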

postulated a mechanism for formation of asymmetric regions of heteroduplex DNA that could subsequently lead to the formation of a true Holliday junction (12). The double-strand-break-repair model postulated gene conversion resulting from repair of double-strand gaps rather than postulating the formation and subsequent repair of regions of heteroduplex DNA (16). The double-strand-break-repair model also postulated the involvement of two Holliday junctions, one on either side of the recombination initiation site (2, 5, 16). This feature provided an explanation for how crossing-over could occur on either side of a recombination initiation site. The single-strand-gap-


W.-D. HEYER AND R. D. KOLODNER


FIG. 2. Alternative recombination models. (A) Adaptation of the Meselson-Radding model, which is sometimes called the Aviemore model. (1) Recombination initiates at a break in one strand of a parental molecule. (2) DNA synthesis initiates at the break, resulting in the displacement of a single strand. This single strand invades the homologous region of the other parental molecule, resulting in a D-loop. (3) The displaced strand of the D-loop is ultimately processed (degraded) by endo- and exonucleases. This results in the formation of a region of asymmetric heteroduplex DNA in which one parental molecule has become heteroduplex while the corresponding region of the other parent remains homoduplex due to the round of DNA synthesis that drives strand displacement. (4) The single strand-exchange is converted to a Holliday junction by a second cross strand-exchange. (5) The Holliday junction can then branch-migrate to yield a region of symmetric heteroduplex DNA adjacent to the region of asymmetric heteroduplex produced by the first single strand-exchange. Resolution of the Holliday junction can then occur as illustrated in Fig. 1 to yield both cross-over and non-cross-over configurations of product molecules. (B) Adaptation of the double-strand-break-repair model. (1) Recombination initiates at a double-strand break in one parental molecule. (2) This break is acted on by 5'-to-3' exonucleases to yield 3'-terminated single-strand tails at each end of the break. In this version of the model, the 3' tails are only subject to limited degradation so that only a small, if any, double-strand gap is produced. (3) The 3' single-strand tails invade the other parental molecule to yield a joint molecule containing two single-strand exchanges. (4) The ends of the 3' tails serve as primers for DNA synthesis, which copies the displaced strands of the parental molecule and can also result in the extension of the single strand exchange junctions similar to that postulated by the Meselson and Radding model (Fig. 2A). (5) The single strand exchange


junctions are converted to two Holliday junctions by subsequent single strand exchanges similar to that postulated by the Meselson-Radding model (Fig. 2A). This results in one Holliday junction on each side of the initiation site. By appropriate resolution of these two Holliday junctions, as illustrated in Fig. 1, both cross-over and non-cross-over configurations of product molecules can be generated, such that the crossover point can be on either side of the initiation site. The final product molecules potentially contain three distinct regions of altered DNA sequence. First, in the central region produced by gap repair, both molecules have the same parental sequence because all of the information originated from one parent and was copied by DNA synthesis. Second, regions of asymmetric heteroduplex DNA flank both sides of the gap repair region. They were produced essentially as postulated by the Meselson and Radding model (Fig. 2A) and their length is determined by the extent of the original 5' to 3' degradation and the extent of the subsequent DNA synthesis that drives the extension of the single strand exchanges. Third, regions of symmetric heteroduplex DNA flank both sides of the asymmetric heteroduplex DNA region. Their length is determined by the extent of branch migration of the Holliday junctions. (C) Adaptation of the single-strand-gap model proposed by Radding (18). (1) Recombination initiates by the formation of a single strand gap in one parental molecule. (2) The single strand of the gap pairs with the intact parental molecule to yield two molecules joined by a D-loop. This intermediate can then be processed in three different ways. First, (3a) the right single strand exchange junction is cleaved, leaving the left single strand exchange junction intact. (4a) The single strand gap in the parental molecule is repaired by DNA synthesis yielding essentially the intermediate postulated by the Meselson-Radding


FIG. 2. (continued) model (Fig. 2A, intermediate 2) and is further processed as described by the Meselson-Radding model. Second, (3b) the left single strand exchange junction is cleaved, leaving the right single strand exchange junction intact. (4b) The single strand gap in the parental molecule is repaired by discontinuous DNA synthesis yielding essentially the intermediate postulated by the Meselson-Radding model (Fig. 2A, intermediate 2) and is further processed as described by the Meselson-Radding model. The difference between these two outcomes is the side of the initiation site on which the exchange junction lies and hence where the region of symmetric heteroduplex DNA and the ultimate site of crossing over will be relative to both the initiation site and the associated region of asymmetric heteroduplex DNA. Third, (3c) the two single strand junctions are extended outwards by branch migration and each is converted to a Holliday junction by a second single strand exchange. (4c) The remaining gap is filled in by repair synthesis and the flanking Holliday junctions are free to branch migrate to produce more extensive regions of heteroduplex DNA. This resulting intermediate is then processed essentially as postulated for the double-strand-break-repair model (Fig. 2B). This third outcome of the single-strand-gap model is similar to the double-strand-break-repair model in that it yields products containing regions of symmetric heteroduplex DNA that flank both sides of a central asymmetric heteroduplex DNA region. However, it lacks the central region where double strand gap repair occurred. Key: Thin and thick lines distinguish DNA derived from the two different parental DNA molecules. Dashed lines indicate DNA derived by repair DNA synthesis using template DNA whose source is indicated by the thickness of the dashed line. Half arrows indicate the 5' ends of DNA strands. Full arrows indicate the 3' ends of DNA strands that are used as primers for DNA synthesis.


repair model was proposed as a modification of the Meselson and Radding model to explain crossing-over on both sides of a recombination initiation site, in addition to the asymmetric information transfer seen in gene conversion in S. cerevisiae (3, 18). Another central feature of the double-strand-break-repair model and the single-strand-gap-repair model is that the initiation event (double-strand gap and single-strand gap, respectively) defines an acceptor of genetic information, whereas in the original Meselson-Radding proposal the initiating single-strand nick defines the donor. It is now generally accepted that the formation of heteroduplex regions of DNA and subsequent repair or failure to repair mismatches play a major role in recombination and represent the major underlying mechanism of gene conversion (3, 19, 20). Because of this, two models are generally debated as explaining recombination in S. cerevisiae. One is the single-strand-gap-repair model. The other is a version of the double-strand-break-repair model in which the break is not expanded to a gap. It is instead processed by exonucleases to yield single-strand tails which then pair with the homolog to yield long stretches of heteroduplex DNA flanking the site of the break (3, 21, 22). The attractiveness of the double-strand-break-repair model is the high efficiency with which double-strand breaks are repaired in S. cerevisiae (2, 16, 23, 24). However, these two models are similar and there are few available data that distinguish them. In addition to the recombination models discussed above, numerous variations of these models as well as different models have been considered; it is beyond the scope of this review to discuss them in detail. It is unclear to us whether any individual recombination model can explain all recombination events observed, and there is considerable evidence that many different recombination mechanisms can function in the same organism. However, virtually all recombination models incorporate many similar features, including: homologous pairing resulting in the formation of joints, like Holliday junctions, that allow the exchange of single strands between DNA molecules to produce heteroduplex DNA; the resolution of these joints to produce crossover and non-crossover configurations; and the repair or failure to repair mispaired bases contained within regions of heteroduplex DNA. Because of these similarities, most enzymatic studies have focused on the discovery and analysis of enzymes and enzyme systems that promote these recombination subreactions. Where recombination models differ, and where there is considerable uncertainty, is at the initiation stage of the reaction. [Petes et al. (3) have written a more detailed and recent discussion of most presently accepted recombination models, including a discussion of how different models account for different patterns of marker segregation observed during recombination.]
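The link drawn above between mismatch repair in heteroduplex DNA and the patterns of marker segregation can be illustrated by counting strands in a single meiosis. The sketch below is a simplification under stated assumptions: it uses the conventional eight-strand (octad) bookkeeping, places one asymmetric heteroduplex on one mutant chromatid, and uses a hypothetical function name and outcome labels that do not come from the text.

```python
def octad_ratio(outcome):
    """Count wild-type (+) and mutant (m) strands after one meiosis.

    Four chromatids, two strands each, start as 2 wild-type : 2 mutant.
    Assumption for illustration: a single asymmetric heteroduplex forms on
    one mutant chromatid, so that chromatid carries one '+' and one 'm' strand.
    """
    chromatids = [["+", "+"], ["+", "+"], ["m", "m"], ["m", "m"]]
    chromatids[2] = ["+", "m"]  # heteroduplex containing a mismatch

    if outcome == "repair_to_wild_type":
        chromatids[2] = ["+", "+"]  # gene conversion
    elif outcome == "repair_to_mutant":
        chromatids[2] = ["m", "m"]  # restoration of the parental ratio
    elif outcome != "no_repair":
        raise ValueError(f"unknown outcome: {outcome}")
    # With no repair the two strands separate at the first post-meiotic
    # division, which is scored as post-meiotic segregation.

    strands = [strand for chromatid in chromatids for strand in chromatid]
    return strands.count("+"), strands.count("m")


conversion = octad_ratio("repair_to_wild_type")   # (6, 2)
restoration = octad_ratio("repair_to_mutant")     # (4, 4)
pms = octad_ratio("no_repair")                    # (5, 3)
```

Repair toward the invading strand gives the non-reciprocal 6:2 pattern, while an unrepaired mismatch gives the 5:3 pattern; a symmetric heteroduplex, as in the Holliday model, would need a second mismatched chromatid in the same bookkeeping.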


II. Physical Analysis of Recombination

Physical analysis of DNA molecules isolated from cells during the period when recombination is occurring has provided important insights into recombination. Electron microscopy and two-dimensional gel electrophoresis have demonstrated that branched DNA molecules containing structures resembling Holliday junctions are formed at the time of meiotic recombination in S. cerevisiae (25). Physical analysis has defined the kinetics of crossing-over for both meiotic recombination and double-strand-break-induced mitotic recombination (26-28). Physical analysis of double-strand-break-induced mitotic recombination demonstrated that the double-strand break is initially processed by a 5'-to-3' double-stranded DNA-specific exonuclease (23, 27), and similar conclusions have been reached about meiotic recombination events that appear to be associated with the meiosis-specific induction and recombination-associated repair of a double-strand break (22, 28). Formation of extensive regions of heteroduplex DNA during recombination and the subsequent repair or failure to repair mismatches contained within this region of heteroduplex DNA at the time of recombination have been demonstrated (20). Analysis of the repair of heteroduplex DNA molecules after transformation into the cell has provided support for the existence of mismatch repair systems, and has provided considerable information about the properties of mismatch repair (19, 29-31). These types of studies have provided evidence for many of the general features of recombination postulated by the recombination models discussed above. As the physical methods used to analyze recombining DNA molecules are expanded and utilized in conjunction with the analysis of mutations that affect the frequency of recombination, we expect that this approach will yield many new insights into the mechanism of recombination over the next few years.

III. Enzymology of Homologous Genetic Recombination in S. cerevisiae

In this review, we discuss recent progress in the biochemical analysis of the mechanism(s) of homologous recombination in the yeast S. cerevisiae. The sections are (A) in vitro recombination systems utilizing crude cell extracts to catalyze recombination; (B) analysis of individual enzymatic activities implicated in recombination; and (C) progress in reconstitution of a complete system of recombination in vitro, using purified components.


A. In Vitro Recombination Systems Based on Crude Cell Extracts

A goal for a biochemist is to duplicate an observed natural reaction in vitro. The success of this approach for the analysis of complex reactions of DNA metabolism was first demonstrated by the dissection and reconstitution of DNA replication using Escherichia coli and its phages as model systems (32, 33). A similar approach has been used successfully in studies of SV40 DNA replication in vitro (34), and mismatch repair in prokaryotes (35). All of these studies have followed the same approach pioneered by A. Kornberg and co-workers, which includes: (1) development of in vitro systems using crude cell extracts as a source of relevant enzymatic activities; (2) fractionation of these systems to obtain individual required components; and (3) reconstitution of fully defined systems from these purified components. We assume that genetic recombination will be mechanistically as complex as DNA replication but still amenable to reconstitution in vitro. The genetic complexity, as documented by the number of mutants in different genes affecting recombination (3, 36), and the emerging biochemical complexity justify this opinion. In vitro systems that catalyze an entire process or major steps in it are invaluable for the study of the biochemistry of any reaction, particularly a process like recombination. In vitro reactions not only serve as an assay for purification of individual components but can also allow a mechanistic analysis that need not be constrained by the assumptions of a particular recombination model. Crude extract systems also offer the possibility of readily analyzing many different substrates and reaction conditions. However, the specificity of the in vitro reaction must be correlated with the reactions that occur in vivo, preferably through the use of mutations that affect recombination. This is difficult to achieve in higher eukaryotic organisms, but is more straightforward in S. cerevisiae, which offers both traditional genetic analysis and genetic analysis of genes obtained by first purifying a protein of interest, cloning the gene encoding the protein using molecular biology techniques, and producing mutations in the gene by directed mutagenesis. The development of the first eukaryotic cell-free in vitro recombination system catalyzed by crude extracts of S. cerevisiae was an important step in the study of recombination enzymology in this organism (37). Figure 3 shows the experimental scheme in which a pair of plasmid DNA substrates is incubated in a reaction with a crude extract of S. cerevisiae, and then the formation of recombinants is followed by either transformation into a recombination-deficient E. coli host or by physical assays (37-39).

FIG. 3. In vitro recombination system in a crude extract of S. cerevisiae and the analytical methods used. The plasmid substrates pRDK39 and pRDK35 are incubated with crude extract, and the reaction is analyzed by (1) direct transformation of E. coli and selection of tetR recombinants; (2) physical analysis of recombinant (tetR) plasmids; (3) physical analysis of intermediates by EM and gel electrophoresis; and (4) genetic analysis of isolated reaction intermediates. The assay scheme illustrated in this figure is based on the results of Symington et al. (37-39).

The reaction showed the kinetics and cofactor requirement typical for enzymes acting on DNA (see 32). Heat-treated extracts showed no significant activity. The contribution of recombination in the E. coli host during the transformation assay was assessed by incubating the substrates in the crude extract separately followed by cotransformation into an E. coli recA strain. This gave a frequency two orders of magnitude lower for wild-type recombinant formation compared to experiments in which the substrates were coincubated in the extract. The level of wild-type recombinant formation was also three orders of magnitude higher than could be accounted for by revertants in the substrate population. Thus, the reaction clearly produced new recombinants rather than simply enhancing the recovery of background recombinants. Structural analysis of the reaction products obtained after transformation into E. coli excluded trivial non-specific mechanisms of recombinant formation. Physical demonstration of recombinant formation after incubation of the substrates with the crude cell extract, but prior to transformation into E. coli, provided additional evidence that recombination had occurred in vitro. Genetic analysis of isolated physical intermediates and products of the in vitro reaction demonstrated that they were enriched in wild-type recombinants. The specificity of the reaction was ascertained by the use of a strain containing a mutation in the RAD52 gene. The RAD52 gene product is essential for the recombinational repair of double-stranded DNA breaks (36), and extracts from a rad52 mutant strain show reduced activity when one of the plasmid DNA substrates contains a double-strand break. Environmental conditions known to affect recombination were used to provide additional


correlative evidence that the in vitro reaction resembled recombination in vivo. Gamma-ray irradiation induces recombination, and extracts from gamma-ray-irradiated cells showed a clear increase in recombination activity. Furthermore, meiotic recombination in yeast is induced to 10^2 to 10^3 times the mitotic recombination rate. Extracts from meiotic cells have a 10- to 20-fold increase in the formation of recombinants, reflecting the meiotic induction seen in vivo (38). Using similar substrates (37, 38), Hotta et al. (40) also described an in vitro system that used extracts of S. cerevisiae cells to catalyze the formation of wild-type recombinants. Only a limited analysis of the system was presented, and the work does not appear to have been continued. An in vitro system for double-strand break repair catalyzed by nuclear extracts of S. cerevisiae cells has been described (41) that used pairs of plasmid substrates, one that was circular and the other containing a double-strand gap in a region of homology between the two plasmids. The recombinants formed in vitro were detected by physical methods, including Southern blotting, polymerase chain reaction (PCR), and one- and two-dimensional gel electrophoresis. A low level of double-strand-gap-repair products was observed. The major portion of these products appeared to be formed by a partial reaction in which only one end of the linearized substrate recombined with the circular substrate. Surprisingly, the major product molecules observed consisted of two classes of structures in which two double-strand linear molecules had been associated either head-to-head or tail-to-tail in a reaction similar to that previously observed in vivo (42). No information is available about the biochemical reaction requirements or the genetic requirements of this system. Mismatch repair is an integral part of the recombination process (43) and an in vitro system catalyzing mismatch correction based on the above cell-free extract was developed (44). This system used specialized substrate molecules that had single defined mismatches in a phage M13-based double-stranded circular DNA molecule. The mismatch was placed in the center of two overlapping restriction sites so that the substrate is resistant to digestion at both sites. Upon mismatch repair, one restriction site (depending on the direction of repair) is restored and can now be cleaved by the respective enzyme. This can easily be monitored and quantified by restriction analysis, gel electrophoresis, and densitometry. This system showed typical general biochemical reaction parameters. The specificity of repair observed in this in vitro system was similar to the in vivo repair specificity. Small insertions or A·C and G·T mismatches were repaired with high efficiency. All other mismatches, particularly C·C and G·G mismatches, were repaired relatively inefficiently. This correlates with the in vivo observation of high PMS (post-meiotic segregation) values (indicating "no repair" in yeast meiotic recombination; see Fig. 2) involving these two mispairs (45, 46). However, genetic studies measuring mismatch repair of individual mismatches have concluded that only C·C mismatches are consistently poorly repaired, whereas all other mispairs are generally repaired with almost equal efficiency (30, 31, 47). The repair track in the in vitro system (44) was 10-20 nucleotides, whereas estimates of repair-track length from conversion events in mitosis or meiosis or during transformation experiments with a mismatch-containing DNA are at least 30-50 times longer (reviewed in 3). Thus, the in vitro mismatch repair system described clearly shows some differences from what has been observed in vivo. pms1 mutations cause a significant decrease in the frequency of mismatch repair in S. cerevisiae (19, 30, 31, 48). However, the occurrence of residual mismatch repair in strains carrying the pms1 mutation suggests the existence of more than one mismatch repair system in S. cerevisiae (19, 30, 31). It is unclear at present which system(s) is operating in the in vitro reaction. The results discussed above indicate that it is possible to develop cell-free systems that can promote reactions implicated in recombination. A major weakness of these studies is that it has not yet been possible to use these in vitro reactions as assays for use in direct fractionation and reconstitution studies. We believe that achieving this will be an important step in unraveling the enzymatic mechanism of genetic recombination in organisms like S. cerevisiae. Nonetheless, these cell-free systems have been very useful in providing the basis for some of the enzymology studies discussed below (Section III, B).
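The logic of the mismatch-repair assay described above (a mismatch at the center of two overlapping restriction sites, with repair restoring one site or the other) can be sketched as follows. The sequences and the choice of HindIII and XhoI sites are hypothetical and chosen only so that the two recognition sequences overlap; they are not the substrates used in (44). Both strands are written in top-strand sense, and the sketch assumes an enzyme cuts only when its full site is correctly paired.

```python
# Recognition sequences of two real enzymes, used here only as an
# illustrative pair of overlapping sites (assumption, not from the text).
SITES = {"HindIII": "AAGCTT", "XhoI": "CTCGAG"}


def cleavable_sites(top, bottom):
    """List enzymes whose site is present and fully paired in the duplex.

    `bottom` is given as the top-strand-sense sequence it would template,
    so a mismatch shows up as a position where `top` and `bottom` differ.
    """
    found = []
    for name, site in SITES.items():
        start = top.find(site)
        if start != -1 and bottom[start:start + len(site)] == site:
            found.append(name)
    return found


def repair(top, bottom, template):
    """Correct the duplex using one strand as template; return the new duplex."""
    if template == "top":
        return top, top
    if template == "bottom":
        return bottom, bottom
    raise ValueError("template must be 'top' or 'bottom'")


# Hypothetical heteroduplex: the strands differ at one position (T versus C),
# which lies inside both overlapping sites.
top_strand = "AAGCTTGAG"
bottom_strand = "AAGCTCGAG"

before = cleavable_sites(top_strand, bottom_strand)
after_top = cleavable_sites(*repair(top_strand, bottom_strand, "top"))
after_bottom = cleavable_sites(*repair(top_strand, bottom_strand, "bottom"))
```

In this toy substrate the unrepaired heteroduplex is resistant to both enzymes, and each direction of repair restores exactly one site, so the fraction of molecules cut by each enzyme reports both the extent and the direction of repair.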

B. Individual Enzymatic Activities

In this section, we summarize the progress that has been made in identifying biochemical activities suggested to be involved in genetic recombination. In addition, we discuss gene products first identified through mutations that affect recombination, where the biochemical properties of these proteins suggest that they act on DNA during recombination.

1. PROTEINS INVOLVED IN HYBRID DNA FORMATION

As illustrated in Figs. 1 and 2, hybrid DNA is a central intermediate in genetic recombination. Clearly, proteins that promote homologous pairing and subsequent strand exchange to produce the types of joint molecules postulated by these types of recombination models must play a critical role in recombination. This view has received biochemical support from the characterization of the in vitro properties of the E. coli RecA protein. RecA is an important recombination protein in E. coli and is essential for many, but not all, homologous recombination processes in this organism. [For a review


that concentrates on recombination genetics, see 49. For a review on homologous pairing and DNA strand-exchange proteins, see 50.] The discovery that RecA protein can form hybrid DNA from model DNA substrates directly linked the genetically proven role of the recA gene in recombination and repair to a defined biochemical intermediate. The pioneering work with RecA (reviewed in 18, 51-54) has provided many biochemical assays for enzymatic activities that can promote the formation of hybrid DNA. A direct biochemical approach to the identification of homologous pairing activities was attractive, since no known mutation affecting recombination in yeast has as dramatic an effect on recombination (reviewed in 3, 36) as recA mutations have in E. coli. The hybrid DNA assay shown in Fig. 4 was developed in the work with RecA and has been termed the strand-


FIG. 4. Hybrid DNA assay: the strand-exchange reaction. Illustration of a strand-exchange reaction utilizing homologous double-stranded linear and single-stranded circular DNA substrates as an assay for the formation of hybrid DNA. Incubation of the two substrates (double-stranded linear and single-stranded circular DNA) with a strand-exchange activity results in the formation of joint molecules as illustrated. The migration pattern of these DNA species on an agarose gel is drawn schematically beneath the two substrates. Among the joint molecules, three classes can be readily distinguished by electron-microscope analysis: the σ-form with very little hybrid DNA formed and no observable strand exchange; the α-form with a variable length of hybrid DNA (the double-stranded circular region) and an attached displaced single strand indicative of strand exchange; the open circle, which is the end product of the reaction when the entire single strand has been transferred to the circle.
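The three classes of joint molecule named in the Fig. 4 legend can be expressed as a simple decision rule, which may help when tabulating electron-microscope observations. This is only a sketch: the function name, its inputs, and the labels (written here as sigma-form and alpha-form) are assumptions for illustration, and real scoring depends on how reliably a displaced tail can be seen.

```python
def classify_joint_molecule(hybrid_bp, substrate_bp, displaced_tail_visible):
    """Assign a joint molecule to one of the three classes in the Fig. 4 legend.

    hybrid_bp: measured length of hybrid (double-stranded circular) DNA
    substrate_bp: full length of the substrate
    displaced_tail_visible: whether a displaced single strand is observed
    """
    if not 0 <= hybrid_bp <= substrate_bp:
        raise ValueError("hybrid length must lie between 0 and the substrate length")
    if hybrid_bp == substrate_bp:
        # The entire strand has been transferred to the circle.
        return "open circle"
    if displaced_tail_visible:
        # Variable hybrid length plus a displaced strand: strand exchange occurred.
        return "alpha-form"
    # Very little hybrid DNA and no observable strand exchange.
    return "sigma-form"


observations = [(0, 7200, False), (4100, 7200, True), (7200, 7200, False)]
classes = [classify_joint_molecule(*obs) for obs in observations]
```

Only the middle class carries a displaced tail, which is why the text that follows treats its direct observation as the evidence that joint molecules arise by strand exchange rather than by degradation and reannealing.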


exchange assay (55). This assay and variants thereof (56) have been widely used to assay for RecA-like activities in different organisms. Because of the central importance of this assay, we discuss its properties and possible artifacts. The assay makes use of two DNA substrates, a double-stranded linear DNA and a homologous single-stranded circular DNA. The formation of joint molecules between the substrates can be monitored by agarose gel electrophoresis of deproteinized samples. The joint molecules migrate slower than the substrates, making detection and quantitation simple (see Fig. 4). Alternatively, filter-binding assays can be used to detect joint molecule formation (reviewed in 57). The structural predictions for joint molecules are precise, and three classes can easily be distinguished by electron microscopy. The direct observation by electron microscopy of α-forms (see Fig. 4) among the reaction products is crucial, because their existence demonstrates that joint molecules are formed by a strand-exchange mechanism. The presence of the displaced single-strand tail in the joint molecule directly excludes the possibility that the joint molecule was formed by first degrading the linear duplex with an exonuclease followed by simple reannealing of the digested duplex with the single-stranded circle. An alternate approach to the detection of the displaced single strand produced by strand displacement is to measure the production of DNA that is sensitive to degradation by S1 nuclease (reviewed in 57). The scope of this section is a detailed discussion of the proteins from S. cerevisiae that form hybrid DNA in vitro. Yeasts, particularly S. cerevisiae, are an obvious choice for a eukaryotic model system for use in identifying proteins involved in hybrid DNA formation. Not only is the level of homologous recombination high in this organism, but most gene conversion is generated through a hybrid DNA intermediate (3), suggesting that high levels of homologous pairing proteins might be present. One approach to the identification of RecA-like proteins in S. cerevisiae that has been tried in several laboratories makes use of anti-RecA antiserum. This immunological approach has identified a single cross-reacting polypeptide in S. cerevisiae of size similar to RecA (58-60). However, cloning the corresponding gene demonstrated that this cross-reaction was fortuitous. The identified gene encoded the small subunit of ribonucleotide reductase. The epitope common to the two proteins recognized by the antiserum apparently consists of the four most C-terminal amino acids of both proteins, which are identical (59, 60; and 58 as cited in 59). A second approach to the identification of RecA-like proteins in S. cerevisiae has been to analyze the proteins encoded by genes that have been implicated in genetic recombination and repair through genetic analysis. Recently the RAD10 gene has been shown to encode a protein that binds to single-stranded DNA and promotes renaturation of complementary single-


stranded DNA (61). While these properties suggest that RADlO could be similar to other homologous pairing proteins (see below), it has not yet been demonstrated that RADIO will promote strand-exchange reactions like other homologous pairing proteins. Recently, the RAD51 and DMCl genes have been sequenced and shown to encode related proteins that have significant homology with RecA (62-64). In the case of RAD51 protein, it was shown that it would bind to both double-stranded and single-stranded DNA, but the protein did not appear to promote renaturation of complementary singlestranded DNA or strand-exchange reactions (64).There has not yet been any reported biochemical analysis of the DMCl protein. Thus, it is not clear whether the DMCl and RAD51 proteins promote homologous pairing reactions, nor is it clear what the significance of the homology between these proteins and RecA is for recombination. To date, the most successful route to identifying such proteins has been the biochemical approach of purifying activities that promote hybrid DNA formation in vitro. This approach has resulted in the identification of several different proteins (see Table I). In the following section, we discuss these proteins in detail. At the end, we compare them to each other and to the RecA protein. Starting with crude extracts from S. cerevisiae that promote recombination in witro (37), Kolodner et al. (65) identified and purified a protein, called SEPl (for strand exchange protein l), from mitotic cells that catalyzes strand exchange in the assay shown in Fig. 4. The existence of such an activity in crude extracts (39) had been predicted from the analysis of reaction intermediates and products formed in an in witro recombination reaction. Initially, SEPl was purified as a polypeptide of M , 132,000 (65). Subsequent analysis demonstrated that this form is the product of partial proteolysis, and that the authentic SEPl is a polypeptide of M , 175,000 (66, 69). 
The Mr 132,000 form of SEP1 contains an intact NH2-terminus (69) and is missing approximately 40 kDa of the C-terminus. SEP1 protein catalyzes strand exchange between linear duplex M13 replicative form and circular single-stranded M13 viral DNA at an optimal stoichiometry of one SEP1 monomer per 12 nucleotides of single-stranded DNA for the Mr 132,000 form (65), and one monomer per 35-40 nucleotides for the Mr 175,000 form (66). The formation of joint molecules occurs only when the substrates have a region of homology. The reaction requirements were found to be simple; essentially only Mg2+ was needed as a divalent cation. It was striking that ATP or other high-energy cofactors were not required (65, 66, 70). This did not necessarily pose an energetic paradox, as the protein did not appear to turn over. Electron-microscope analysis of the product DNA demonstrated that α-forms (see Fig. 4) were the most abundant product class and that relaxed circles, the end product of the reaction, were also detected. As discussed above, the demonstration of α-forms eliminated the possibility that the joint molecules were formed by a degradation and reannealing mechanism. The average length of the hybrid DNA formed was 4.1 kbp, using the 7.2-kbp M13mp19 substrate. In addition, significant amounts of full-length open-circle products containing 7.2 kbp of heteroduplex DNA were observed (65, 66).

SEP1 also catalyzed DNA renaturation in a stoichiometric fashion, with reaction requirements identical to the requirements of strand exchange, that is, Mg2+-dependent and ATP-independent. In addition, the protein bound to double-stranded and single-stranded DNA, but in an Mg2+-independent fashion. Binding of SEP1 to both types of DNA resulted in the aggregation of the DNA into large structures, and these aggregates were suggested to be kinetic intermediates in the strand-exchange reaction (71). The single-stranded DNA in such aggregates was protected from S1 nuclease, consistent with a tight protein·DNA complex similar to the RecA·DNA aggregates first described by Radding and co-workers (72, 73).

Preparations of the Mr 132,000 form of SEP1 consistently contained low amounts of exonuclease activity, originally ascribed to a contaminating polypeptide (65). However, extensive analysis of the overproduced protein has definitively demonstrated that SEP1 has an intrinsic 5'-to-3' exonuclease activity (66; see discussion in Section III, B, 2, a: Exonucleases). This 5'-to-3' exonuclease copurified with the Mr 175,000 polypeptide and the strand-exchange activity in all chromatographic steps designed to separate them. Both activities showed identical heat inactivation profiles. Most importantly, the nuclease activity was overexpressed to the same extent as the Mr 175,000 protein upon induction. Independent evidence further corroborates the finding that SEP1 has an intrinsic 5'-to-3' exonuclease.

TABLE I
STRAND-EXCHANGE PROTEINS IN THE YEAST Saccharomyces cerevisiae

Name      Mr        Polarity in       ATP           Stoichiometry                 Reference
                    strand exchange   requirement
SEP1(a)   132,000   3' to 5'          No            1/12 nt ssDNA                 65
          175,000   3' to 5'          No            1/35-40 nt ssDNA              66
STPα      38,000    N.D.(b)           No            1/350-1400 nt ssDNA or        67
                                                    1/734 bp dsDNA(c)
DPA       120,000   None              No            1/11-29 nt ssDNA or           68
                                                    1/7-15 bp dsDNA(d)
SF1       55,000    N.D.              No            1/10-1/20 nt ssDNA            Norris and Kolodner
                                                                                  (unpublished results)

(a) The Mr 132,000 form of SEP1 is a proteolytic fragment of the authentic Mr 175,000 form.
(b) N.D., Not determined or data not available.
(c) As calculated from the published reaction conditions in the presence of ySSBs (67).
(d) As calculated from the published reaction conditions (68).
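The stoichiometries quoted for the strand-exchange proteins (Table I) can be made more concrete by converting them into approximate numbers of protein monomers per substrate molecule. The following is a minimal sketch in Python, not part of the original analyses: the substrate length of 7200 nucleotides is a rounded value assumed from the "7.2-kbp M13mp19 substrate" mentioned in the text, and the names `M13MP19_SSDNA_NT`, `NT_PER_MONOMER`, and `monomers_per_molecule` are hypothetical helpers introduced only for illustration.

```python
import math

# Assumed substrate length: the text quotes a "7.2-kbp M13mp19 substrate";
# 7200 nt is a rounded, illustrative value, not a sequence-derived length.
M13MP19_SSDNA_NT = 7200

# Nucleotides of ssDNA per protein monomer, as (low, high) ranges
# taken from the stoichiometry column of Table I.
NT_PER_MONOMER = {
    "SEP1 (Mr 132,000)": (12, 12),
    "SEP1 (Mr 175,000)": (35, 40),
    "STPalpha": (350, 1400),
    "DPA": (11, 29),
}


def monomers_per_molecule(length_nt, nt_per_monomer):
    """Return (fewest, most) monomers per ssDNA molecule, rounded up."""
    low_nt, high_nt = nt_per_monomer
    fewest = math.ceil(length_nt / high_nt)  # sparse end of the range
    most = math.ceil(length_nt / low_nt)     # dense end of the range
    return fewest, most


if __name__ == "__main__":
    for name, ratio in NT_PER_MONOMER.items():
        fewest, most = monomers_per_molecule(M13MP19_SSDNA_NT, ratio)
        print(f"{name}: {fewest}-{most} monomers per ssDNA circle")
```

Under these assumptions, the Mr 132,000 form of SEP1 corresponds to about 600 monomers per single-stranded circle and the Mr 175,000 form to roughly 180-206, whereas STPα would require only about 6-21. This illustrates why none of these reactions can be regarded as catalytic in the usual sense, and how much less STPα is needed than SEP1.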
Stevens and coworkers (74-76) purified and characterized a 5'-to-3' exoribonuclease that degrades both RNA and the RNA strand of an RNA·DNA hybrid. Cloning of this gene, called XRN1 (77), and subsequent sequence analysis (A. Stevens, unpublished results) established that it is identical to SEP1. While the specificity of the exonuclease activity of SEP1 remains to be studied further, it is clearly present and has significance for the strand-exchange reaction. The nuclease and strand-exchange activities of SEP1 can be distinguished by their cofactor requirements. The nuclease activity depends on the presence of Mg2+, whereas the strand-exchange activity can use Mg2+ or Ca2+ (66). However, the nuclease activity is important for the strand-exchange reaction. Under conditions in which the nuclease was not active (in the presence of Ca2+), the strand-exchange activity depended on prior resection of the double-stranded DNA ends by an exogenous exonuclease. The occurrence of α-forms in these reactions provides evidence for

W.-D. HEYER AND R. D. KOLODNER

a true protein-promoted strand-exchange reaction catalyzed by SEP1 under the nuclease-minus condition (66). This mode of action is similar to that of another strand-exchange protein from S. cerevisiae, DPA (68), as discussed below. Thus, the nuclease activity is important for initiation of joint molecule formation but not for strand exchange itself. The presence of the nuclease activity appeared to determine the polarity of the hybrid DNA formation reaction. The polarity of the strand-exchange reaction was 3' to 5' when the nuclease was functional (65), but strand exchange could occur in both directions when the nuclease was inactivated and partially resected double-stranded linear substrate DNAs were used (66). In the latter experiments, the polarity of the homologous single-stranded tail determined the polarity of the branch migration reaction. Exactly what role the nuclease activity or the single-stranded tails it generates plays in the strand-exchange reaction is not yet known.

The gene encoding the SEP1 polypeptide has been cloned and sequenced (69, 78), revealing an open reading frame with a coding potential for an Mr 175,000 polypeptide. Monoclonal antibodies directed against the Mr 132,000 form of SEP1 (W.-D. Heyer and R. D. Kolodner, unpublished results) recognize a band of this apparent Mr in crude cell extracts of S. cerevisiae (66). Overexpression of the SEP1 gene product results in the overproduction of a polypeptide consistent with an Mr of 175,000 (66). This Mr form is present throughout all phases of the S. cerevisiae life-cycle (mitosis, meiosis, spores) (W.-D. Heyer and R. D. Kolodner, unpublished results). Therefore, the Mr 175,000 form appears to be the authentic SEP1 polypeptide. The recent finding of apparent double-stranded DNA breaks associated with hotspots of meiotic recombination (22, 28, 79) that are processed to contain extensive 3' overhanging single-stranded DNA ends (80) suggests a role for a multi-functional protein like SEP1.
SEP1 could provide both the required 5'-to-3' exonuclease activity and the DNA-pairing activity to initiate and promote strand exchange between the single-stranded DNA and a homologous double-stranded DNA. However, direct experiments are needed to test this hypothesis.

The gene encoding the SEP1 protein has been cloned independently, using both degenerate oligonucleotide probes derived from protein sequence information and antibody screening methods (69, 77, 78). In addition, the SEP1 gene has been independently identified in mutational analyses as KEM1 (81) and as RAR5 (82). For the sake of simplicity, we refer to the gene and protein as SEP1. Mutations in the SEP1 gene are pleiotropic. The gene is not essential; however, it is important for mitotic growth, as cells containing null alleles of SEP1 have decreased growth rates (69, 77, 78, 81, 82) and appear to be delayed in entering S phase (69). Other mitotic phenotypes include a delay in return to growth after UV or γ-ray irradiation without significantly increased killing compared to wild type (69). Spontaneous intragenic recombination was slightly decreased (69, 78), whereas intergenic recombination appeared normal (81).

The gene was first identified genetically as an enhancer of the nuclear fusion defect caused by a kar1 mutation (KEM = Kar- enhancing mutation) (81); KEM1 (SEP1) was one of three complementation groups discovered. Aside from the selected phenotype, mutations in KEM1 (SEP1) display only a modest Kar- phenotype in a Kar+ background. In addition, mutants have a slightly increased sensitivity to the microtubule-depolymerizing drug benomyl, a 10- to 20-fold increased rate of chromosome loss (nondisjunction), and a lowered viability upon nitrogen starvation. A defect in spindle-pole-body duplication and/or separation was observed cytologically. Kearsey and co-workers (82) searched for mutants that allow replication of centromere-containing plasmids with a defective ARS element; RAR5 (SEP1) (RAR = regulation of autonomous replication) was one of five complementation groups identified. rar5 (sep1) mutants strongly reduced the rate of loss of the ARS-defective plasmid, which could be detected by a colony color assay for plasmid stability. Clearly, sep1 (kem1/rar5/xrn1) mutants have a wide variety of mitotic phenotypes; a considerable amount of work is required to relate these phenotypes to each other and to the different activities of SEP1.

Aside from an array of mitotic phenotypes, mutations in SEP1 display a strong meiotic phenotype, as homozygous sep1/sep1 diploids show a drastically reduced ability to sporulate (69, 78, 81). This defect has been further defined as an accumulation of fully viable cells before meiosis I after an apparently normal round of pre-meiotic DNA synthesis (69). Mutations in SEP1 show a variety of effects on meiotic recombination. Tishkoff et al.
(69) reported an increased induction of recombination phenotype for intragenic recombination at HIS4, whereas Dykstra et al. (78) found an increased induction but a reduced final frequency of recombinants for intragenic meiotic recombination. Additional experiments have shown a significant decrease in the frequency of a number of different types of meiotic recombination events, including intragenic recombination at multiple loci, a single-site gene conversion event, and an intergenic recombination event as measured in return-to-growth experiments, and an almost complete absence of meiotic crossing-over as measured by physical methods using the system of Cao et al. (28) (D. Tishkoff and R. D. Kolodner, unpublished data). The cause of the locus-specific differences in recombination in sep1 mutants is unclear at present. In addition to a quantitative effect on meiotic recombination, there is a qualitative difference, when recombination is monitored, in the small number of spores formed during sporulation of sep1 mutants. Crossovers were observed in these spores, but positive chiasma interference in the HIS4-LEU2-MAT interval was absent in strains carrying mutations in SEP1 (69). Epistasis analysis using spo13 and rad50 mutations showed that neither spo13 alone nor the combination of spo13 and rad50 rescued the sporulation defect of sep1 mutations (69, 78). This analysis indicates that SEP1 represents a new class of genes in which mutations cause defects in meiosis, in addition to the better-known classes of genes in which defects can be rescued by spo13 directly (e.g., HOP1, MER1, MEI4, RAD50, RED1, and SPO11) and by spo13 in combination with rad50 (e.g., RAD51, 52, and 57) (reviewed in 3).

In summary, mutations in SEP1 (XRN1, KEM1, and RAR5) show pleiotropic effects. Interesting parallels to E. coli oriC-dependent replication have been pointed out (82). The rar phenotype of sep1 mutants, which could arise from a relaxed specificity for ARS-dependent replication, is analogous to the observation of specificity factors for oriC-dependent DNA replication, which suppress productive initiation at sites other than oriC (83, 84). One of these specificity factors has been identified as E. coli RNase H (83), and the SEP1 polypeptide exhibits a type of RNase-H activity (76). In addition, it should be noted that, in E. coli, recA is essential for stable DNA replication under SOS conditions (85-87) and in rnh mutants (88). Since the SEP1 protein is multi-functional (66), further studies are required to determine whether the individual biochemical activities can be correlated with specific defects, as only null mutations have been analyzed to date. In addition, a structural role for SEP1 that could explain the pleiotropic consequences of sep1 mutations cannot be ruled out. Any interpretations of the phenotype of mutations in SEP1 are limited by the fact that S.
cerevisiae cells contain at least three strand-exchange proteins (see Table I), which raises the possibility that some of the potential effects of sep1 mutations are masked or partially substituted for by the presence of other, potentially redundant, activities.

Using an extract procedure similar to one described earlier (39), Sugino and co-workers (67) identified and purified a strand-exchange activity from meiotic S. cerevisiae cells. The appearance of the activity correlated with the onset of meiotic recombination. The purified protein, named strand transfer protein α (STPα), had an Mr of 38,000, and the strand-exchange activity depended on the presence of a second protein that could be any one of a number of as yet little-described single-stranded DNA-binding proteins. The amount of STPα required for strand exchange was substantially lower (67) than in the case of SEP1 (65; see also Table I) and, unlike SEP1, it is unclear whether STPα can even promote homologous pairing reactions in the absence of stimulatory factors. The reaction requirements for STPα were simple and only Mg2+ was required. Strikingly, the reaction did not require ATP or any other energy cofactor. Electron-microscope analysis of purified reaction products revealed both α-forms and open-circle end products, indicative of a true strand-exchange reaction (67). STPα also catalyzed the formation of joint molecules between linear duplex and linear single-stranded DNA substrates (89).

The DST1 gene encoding STPα has recently been cloned and characterized, although this characterization has not yet included the demonstration that the overproduced DST1 gene product has strand-exchange activity (90). The DST1 gene is identical to PPR2 (91), a gene identified as a positive regulatory factor of URA4. PPR2 is apparently the S. cerevisiae homolog of the mouse RNA polymerase II transcription-elongation factor TFIIS (92) and interacts specifically with RNA polymerase II from S. cerevisiae (93). Disruption of the DST1 gene had no apparent effect on mitosis, meiosis, or spore viability (90). Recombination monitored in dst1 mutant strains showed no effect on intrachromosomal deletion formation, mitotic intragenic recombination, and meiotic intergenic recombination. However, meiotic intragenic recombination between one pair of his1 heteroalleles was reduced about 6-fold (90). It is interesting to note that, despite the hypo-recombination phenotype, spore viability is normal in dst1 mutant strains, whereas a meiotic hypo-recombination phenotype usually results in reduced spore viability (see 3, 69).

Finally, Halbrook and McEntee (68) discovered and purified a strand-exchange activity from mitotic S. cerevisiae cells using two different DNA reannealing assays. The purified protein had an apparent Mr of 120,000 and was termed yeast DNA-pairing activity (DPA). DPA catalyzed DNA renaturation in a stoichiometric fashion with no evidence for catalytic turnover of the protein. The reaction requirements were again simple, although the activity could use both Ca2+ and Mg2+ as a cofactor.
The protein bound tightly to single-stranded DNA, resulting in the aggregation of the available DNA in large protein·DNA networks. DPA did not catalyze the strand-exchange reaction between single-stranded circular DNA and linear double-stranded DNA containing blunt ends. The double-stranded linear DNA substrate could be activated by limited digestion with either E. coli exonuclease III or phage-T7 gene-6 exonuclease, producing 5' and 3' single-strand-tailed duplex DNAs, respectively. When these substrates had single-strand tails averaging 50 nucleotides long, DPA could catalyze the formation of joint molecules. The reaction required one Mr 120,000 monomer per 20 nucleotides of single-stranded or double-stranded DNA. This reaction required the DNA to be homologous and Mg2+ or Ca2+ as divalent cation. The structure of the joint molecules as determined by electron microscopy showed both σ-forms produced by simple renaturation of the tailed duplex DNA and the circular single-strand as well as a significant amount of α-forms. The extent of hybrid DNA was estimated to be 3-5 kbp. Apparently, no relaxed circular double-stranded DNA end product was found.

As listed in Table I, S. cerevisiae contains at least three strand-exchange proteins that have similar activities. Despite the substantial similarities among the three proteins, the available evidence seems to suggest that they are encoded by different genes and are therefore different. Recent immunological data (70) suggest that STPα and SEP1 are different proteins and demonstrate that the SEP1 polypeptide is present in strains containing a disruption of the gene encoding STPα. Subsequent cloning studies show that SEP1 and STPα are encoded by different genes (69, 78, 90). Comparisons of protein sequences lead to the conclusion that SEP1 and DPA1 are different proteins (R. D. Kolodner and K. McEntee, unpublished). The observation of at least three strand-exchange proteins raises the questions of whether these proteins can substitute partially for each other, and whether multiple mutant combinations will cause more severe phenotypes than any of the single mutations.

The three yeast proteins (65, 67, 68, 70) and strand-exchange activities from other eukaryotes so far described (56, 94-96) do not require ATP hydrolysis or any other energy cofactor to catalyze strand exchange. This does not represent an energetic paradox of a "perpetuum mobile," since there is no evidence for catalytic turnover during the reaction. Several possible explanations could account for this observation. First, the protein might be consumed by the reaction in vivo like a suicide protein resembling the proteins that function in the detoxification of methylation damage in DNA repair (93).² This would mean that hybrid DNA is formed at the price of the constant resynthesis of protein by the cell, which would require a large expenditure of energy.
Second, these proteins might have been purified as an activated intermediate, as can occur with E. coli DNA ligase (98), and might require a high-energy cofactor under different conditions. Third, the protein itself might be unable to turn over catalytically in the in vitro reaction because of the absence of additional protein factors that contribute to a functional multiprotein complex.

The absence of an energy requirement for many eukaryotic strand-exchange activities has been puzzling, and is in contrast to the concept developed from studies of the E. coli RecA protein that strand exchange is an energy-consuming reaction. Initially, it was suggested that RecA protein had to bind and hydrolyze ATP in order to form hybrid DNA (55). Experiments employing ATPγS, a non-hydrolyzable analog of ATP, demonstrated that RecA can bind to DNA and form unstable (paranemic) joints. Therefore, it was suggested that the conversion of unstable paranemic joints to stable hybrid DNA (plectonemic joints) requires ATP hydrolysis (99). When the analog was added to an ongoing strand-exchange reaction, the extension of hybrid DNA stopped immediately (55). This was interpreted to mean that ATP hydrolysis is required for the formation of extensive regions of hybrid DNA (i.e., branch migration). Upon revisiting the problem, it was found that in the presence of the non-hydrolyzable ATP analog ATPγS, joint molecules containing extensive stable regions of hybrid DNA are formed by RecA (100, 101). The reason this had not been seen before was explained by a rather curious sensitivity of the reaction to the Mg2+ concentration. The reaction is optimal at 4-5 mM Mg2+, but is almost completely suppressed at 10 mM Mg2+, the concentration commonly used.

Molecular models for the role of ATP hydrolysis in the strand-exchange reaction envision a presynaptic filament of RecA and single-stranded DNA in the presence of ATP or ATPγS (100, 101; reviewed in 54). A double-stranded DNA is then bound by the presynaptic filament, resulting in the formation of a triple-stranded intermediate in which the two DNA molecules are homologously aligned. According to these models, ATP hydrolysis is needed to dissociate the protein from the product (product release), resulting in a double-stranded product in which the strands have been exchanged between the two substrate DNAs. In the presence of ATPγS, the product release step is blocked, but can be achieved in vitro by dissociating the protein from the product with sodium dodecyl sulfate/EDTA treatment. Regardless of the molecular details, the observation of extended hybrid DNA formation by RecA in the absence of ATP hydrolysis abolishes the fundamental difference between RecA and the eukaryotic strand-exchange activities with respect to the energy dependence of the reaction.

² See the essay by S. Mitra and B. Kaina in Vol. 44 of this series. [Eds.]
As specifically pointed out (100), the RecA reaction in the presence of ATPγS compares well with the strand exchange promoted by the ATP-independent eukaryotic activities. Indeed, even the extent of hybrid DNA formation is similar (65, 100). It was suggested that the ATP-independent eukaryotic proteins are functionally equivalent to the bound ATP form of RecA (100). It is possible (as indicated above) that in eukaryotes, strand-exchange and catalytic-turnover activities are carried out by different proteins. Despite apparent similarities between the reactions carried out by the E. coli RecA protein and the S. cerevisiae strand-exchange proteins, it cannot be concluded, at present, that they act mechanistically in a similar fashion (see also 50). This implies that it will be possible to identify activities that promote the catalytic turnover of strand-exchange proteins. Approaches to the identification of such activities are discussed in Section III, C.

The results discussed above indicate that S. cerevisiae probably contains at least three proteins capable of promoting strand-exchange reactions. Experiments discussed in Section III, C suggest the existence of a fourth such protein. One important question is, how many proteins that promote homologous pairing reactions exist in S. cerevisiae? Most searches for such proteins have utilized only two simple assays, either strand exchange using single-stranded circular and homologous linear duplex DNA substrates, or renaturation of homologous single-stranded DNA. It is possible that different homologous pairing proteins could be identified if different substrates and assays were used.

2. NUCLEASES

Nucleases play a prominent role in the current models for homologous recombination (see Figs. 1 and 2). It is postulated that they are directly involved in the initiation step of the recombination process. In the Meselson-Radding model (12), a single-strand break is postulated to be the initiation event, whereas the double-strand-break-repair model (26) is based on a double-strand break being the initiating event. The processing of these postulated breaks by exonucleases is important in these models. There is also a role for nucleases at other stages of recombination, including the processing of Holliday junctions by structure-specific endonucleases and the degradation of single-stranded DNA at later stages of recombination. In E. coli, there is extensive genetic and biochemical evidence for the involvement of nucleases in recombination (102; reviewed in 49). In S. cerevisiae, the most direct evidence for the involvement of nucleases comes from the physical analysis of double-strand-break repair, where it is clear that the break is first processed by a 5'-to-3' exonuclease to yield linear DNA molecules containing 3'-terminal single-strand tails (80). A similar type of processing by a 5'-to-3' exonuclease has been observed to occur during mating-type switching (27). The involvement of nucleases that lead to the production of linear DNA molecules containing 3'-terminal single-strand tails appears to be a general feature of recombination in many systems. In this section, we describe the exonucleases, endonucleases, and structure-specific endonucleases that have been described in S. cerevisiae, and consider the evidence that they play a role in genetic recombination. For a general review on nucleases, including a discussion of the importance of 5'-to-3' exonucleases in phage and bacterial recombination, see 103 and 104.

a. Exonucleases. A number of exonucleases from S. cerevisiae have been purified and characterized (see Table II).
Exonuclease I (105, 106), a single-stranded DNA-specific exonuclease that degrades DNA in the 3'-to-5' direction, is associated with DNA polymerase II. Possibly, it is the "proofreading" function for this DNA polymerase.

TABLE II
DNA EXONUCLEASES IN THE YEAST Saccharomyces cerevisiae(a)

Name      ssDNA   dsDNA   RNA       Direction   Product   Function                                Reference
exo I     +       -       N.D.(b)   3' to 5'    5' pN     DNA polymerase II proofreading          106, 105
exo II    +       -       -         5' to 3'    5' pN     ?                                       107
exo III   +       -       N.D.      3' to 5'    5' pN     DNA polymerase III proofreading         108
exo IV    +       +       +         5' to 3'    5' pN     ?                                       108
exo V     +       -       -         5' to 3'    5' pNpN   ?                                       109
exo VI    +       +       -         5' to 3'    5' pN     ?                                       110
yNucR     +(c)    +       +         N.D.        oligo     ?                                       111
SEP1      +       +       +         5' to 3'    5' pN     RNA processing, genetic recombination   66, 75
NUC1(d)   +       +       +         5' to 3'    N.D.      ? (mitochondrial)                       112

(a) Modified after 10s.
(b) N.D., Data not available.
(c) Endonucleolytic activity only (111).
(d) Mitochondrial localization, not identical to yNucR (118).

Exonuclease II was detected and purified using degradation of 3H-labeled denatured E. coli DNA as an assay. It is the major nuclease and is apparently a tetramer of four identical Mr 120,000 subunits (107). It specifically degrades single-stranded DNA in the 5'-to-3' direction, yielding 5'-mononucleotides exclusively. Double-stranded DNA is not a substrate for exoII. Although the exoII preparations contained RNase activity, it was argued on the basis of temperature inactivation experiments and differential sensitivity to ionic strength that this RNase activity is not physically associated with exoII. There is no information regarding a possible function for this nuclease. However, the activity of exoII bears some similarity to the activity of the E. coli RecJ exonuclease, which is required for recombination (102).

Exonuclease III from S. cerevisiae, a single-stranded DNA-specific exonuclease that degrades DNA in the 3'-to-5' direction, is associated with DNA polymerase III and might be a domain of that enzyme (108). The enzymatic activity is consistent with the idea that exonuclease III is the "proofreading" function of DNA polymerase III.

There is less information about exonuclease IV, a 5'-to-3' exonuclease that copurified with the DNA polymerase III·exonuclease III complex (108). ExoIV degrades single-stranded DNA, double-stranded DNA, and RNA to 5'-mononucleotides. The function of this nuclease is unknown, but it was suggested that the extensive co-chromatography with the DNA polymerase III·exonuclease III complex might have functional significance, and that exoIV could be involved in the maturation of Okazaki fragments (108). Exonuclease IV appears to be quite similar to SEP1, and the possibility that they are the same protein has not been eliminated.
As part of a search for 3'-to-5' single-stranded DNA-specific exonucleases that have proofreading capacity during DNA replication, a processive 5'-to-3' single-stranded DNA-specific exonuclease, exonuclease V, was identified in S. cerevisiae and purified (109). Exonuclease V is a homodimer of identical Mr 57,000 subunits. It has no enzymatic activity on double-stranded DNA or RNA substrates. One enzymatic difference between the two 5'-to-3' exonucleases, exonucleases II and V, is that the reaction products generated by exonuclease V are mainly dinucleotides, whereas exonuclease II generates mononucleotides.

Recently, a 5'-to-3' deoxyexoribonuclease with novel substrate specificities has been described (110). Extending the earlier nomenclature (109), we designate this activity as exoVI (see Table II). ExoVI is specific for DNA with a 10-fold preference for double-stranded substrates. RNA was not a substrate (110).

An endo-exonuclease from S. cerevisiae has been named yNucR (111) (now renamed NUC2, 113); it crossreacts immunologically with polyclonal antibodies raised against a Neurospora crassa endo-exonuclease that was suggested (114) to play a role in DNA repair. yNucR/NUC2 was purified as an Mr 72,000 polypeptide using both a standard nuclease assay and immunoprecipitation with the polyclonal antisera as assays. It is a single-stranded DNA-specific endonuclease, but it has exonuclease activity on double-stranded DNA substrates. The amount of this nuclease present in cells appears to be controlled genetically by the RAD52 gene, which is essential for the recombinational repair of double-stranded breaks in S. cerevisiae. However, yNucR/NUC2 is not the gene product of the RAD52 gene (115, 116) and it is not yet clear whether this protein plays any role in recombination or repair. This finding is a further caution against using the absence of an observed activity in a mutant strain as an argument that the gene under study might encode the activity.

The exonuclease activity of SEP1 on DNA substrates has only recently been appreciated (66), but in fact SEP1 was discovered initially as a riboexonuclease (74). This ribonuclease degrades RNA in the 5'-to-3' direction, releasing 5'-mononucleotides (74, 75). The activity requires that the 5' end be phosphorylated, and it can use Mg2+ as a cofactor (75). In addition, the protein has an exoribonuclease-H-like activity that can degrade the RNA strand of an RNA·DNA hybrid (76). In addition to this ribonuclease activity, SEP1 has a deoxyexoribonuclease activity (66) on single- and double-stranded DNA substrates with 4-fold higher specific activity for single-stranded DNA. The enzyme exhibits a 5'-to-3' polarity, releasing 5'-mononucleotides with a turnover number for single-stranded and double-stranded DNA substrates of 70 mol of nucleotides per mole of protein per minute and 20 mol of nucleotides per mole of protein per minute, respectively (66). These turnover numbers are roughly one-half of the turnover number of E.
coli exonuclease III (117). The relative substrate specificity of the enzyme for RNA and DNA is ssRNA > ssDNA > dsDNA RNA·DNA hybrids (66, 74-76; A. W. Johnson and R. D. Kolodner, unpublished). Although SEP1 exhibits potent nuclease activity under exonuclease assay conditions, the nuclease activity under strand-exchange reaction conditions is much lower and results in degradation of only 200-400 bases from the end of the linear duplex strand-exchange substrate. It was suggested that binding of SEP1 to the substrate DNAs during the strand-exchange reaction might attenuate the nuclease activity, constraining the activity to the initiation phase of the reaction (66). A possible role for the nuclease activity in genetic recombination has been discussed above (Section III, B, 1).

The major nuclease in S. cerevisiae cells, which accounts for 50% of all activity, is an inner mitochondrial-membrane-associated nuclease called NUC1 (112). This enzyme exhibits an array of activities, including: degradation of single-stranded but not double-stranded RNA; endonuclease activity on single-stranded and double-stranded DNA; and 5'-to-3' exonuclease activity on double-stranded DNA (112). The protein is immunologically related to an analogous enzyme purified from Neurospora crassa (114). Although similar in this respect to the above-mentioned nuclease yNucR/NUC2 (111), the available evidence suggests that it is a different protein (118). The nuclease is encoded by a nuclear gene that has an open reading frame with a coding potential for an Mr 37,000 polypeptide, consistent with the Mr of the purified protein (119). So far no phenotype, including altered recombination properties, has been found to be associated with a null allele of NUC1 or with overexpression of the NUC1 protein (118).

While a number of exonucleases that degrade DNA have been identified in S. cerevisiae, in no case has the analysis of these proteins progressed to the point where one could demonstrate a direct role in genetic recombination. Because there is considerable evidence for the involvement of 5'-to-3' degradation of DNA in recombination, probably the best candidates for nucleases involved in recombination are exonucleases II, IV, V, and VI and SEP1. For example, exonucleases II and V could function in conjunction with a helicase like RAD3 to produce double-stranded DNA molecules containing 3' single-strand tails. The 3' end could then play a role in strand invasion and priming of DNA synthesis. This model has been proposed to explain how the RecJ 5'-to-3' single-stranded DNA-specific exonuclease might act in E. coli recombination (102). Exonucleases IV and VI and SEP1 could act directly on the ends of linear double-stranded DNA molecules to produce these types of 3' single-stranded ends, which could be substrates for strand exchange. The DNA-polymerase-associated exonucleases I and III have properties suggesting that they function in proofreading during DNA synthesis; however, this does not preclude a role in recombination.
Clearly, almost any of the nucleases described above could play some role in some step in recombination, as postulated b y different recombination models. However, it will require considerable genetic and biochemical analysis to determine which, if any, of these different exonucleases actually function in recombination.
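
The turnover numbers quoted earlier in this section for SEP1 (70 and 20 mol of nucleotides per mole of protein per minute on single- and double-stranded DNA) lend themselves to a short arithmetic check. The sketch below, in Python, computes their ratio and the time a single enzyme molecule would need to remove 200-400 nucleotides from duplex DNA at the full assay rate. The function names are hypothetical, and the constant-rate assumption (no dissociation, no attenuation by DNA binding) is an idealisation that is not taken from the cited work.

```python
# Turnover numbers for SEP1 as quoted in the text (ref. 66), in mol of
# nucleotides released per mol of protein per minute.
TURNOVER_SS = 70.0  # single-stranded DNA
TURNOVER_DS = 20.0  # double-stranded DNA


def fold_preference(k_ss: float = TURNOVER_SS, k_ds: float = TURNOVER_DS) -> float:
    """Ratio of the single-stranded to the double-stranded turnover number."""
    return k_ss / k_ds


def minutes_to_release(nucleotides: float, turnover: float) -> float:
    """Minutes one enzyme molecule needs to release `nucleotides`.

    Assumes a constant rate equal to `turnover` (an idealisation).
    """
    if turnover <= 0:
        raise ValueError("turnover must be positive")
    return nucleotides / turnover


if __name__ == "__main__":
    print(f"ss/ds turnover ratio: {fold_preference():.1f}")
    for n in (200, 400):
        t = minutes_to_release(n, TURNOVER_DS)
        print(f"{n} nt from duplex DNA at the assay rate: {t:.0f} min")
```

The ratio of the two turnover numbers is 3.5, in line with the roughly 4-fold preference for single-stranded DNA stated above. At the full double-stranded rate, 200-400 nucleotides correspond to only 10-20 minutes of uninterrupted digestion per enzyme molecule, which gives a sense of scale for the limited degradation seen under strand-exchange conditions.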

b. Endonucleases. Double-strand breaks are the postulated recombination initiation event in the double-strand-break-repair model (16), and double-strand break-initiated meiotic and mitotic recombination occurs with high efficiency in yeasts (120, 121). Furthermore, it was recently discovered that two meiotic recombination hotspots are associated with the appearance of double-strand breaks at the time of recombination (22, 28). Therefore, double-stranded DNA endonucleases might be a class of enzymes involved in the initiation of homologous recombination. Two general double-stranded DNA endonucleases, "Endo.SceI" and "Endo.SceII," have been identified in S. cerevisiae. Several other double-stranded DNA endonucleases have been studied in detail in this organism (122-125), but they are involved in site-specific recombination at the MAT locus or in site-specific recombination in mitochondria, with no apparent role in general chromosomal recombination. Therefore, we limit our discussion to Endo.SceI and II.

Endo.SceI was identified in a survey of 40 different yeast strains (126) and was purified to homogeneity as a heterodimer consisting of M_r 75,000 and M_r 50,000 subunits (127-129). Endo.SceI shares some properties with bacterial type II restriction endonucleases, as it requires Mg2+. In addition, it cleaves double-stranded DNA at defined sites, leaving protruding four-nucleotide-long 3'-termini with 5'-phosphate- and 3'-hydroxyl-containing ends that can be joined by DNA ligase (127, 128). However, the recognition site of Endo.SceI is a complex 26-bp consensus sequence with many undefined (any nucleotide) or half-defined (purine or pyrimidine) positions (127, 130). Despite this rather loosely defined recognition sequence, the endonucleolytic cleavage by Endo.SceI is precise and defined. The DNA from strains containing Endo.SceI activity is not protected from endonucleolytic attack by the enzyme, making it unlikely to be a eukaryotic counterpart of a bacteria-like restriction-modification system (127).

Endo.SceI is a heterodimer and the active site of the endonuclease probably resides in the smaller, M_r 50,000, subunit. The M_r 75,000 polypeptide is viewed as an essential effector (129). The ENS1 gene encoding the larger subunit has been cloned and sequenced (131); it is identical to the recently identified gene SSC1, which encodes a related heat-shock protein of the HSP70 family. ENS1/SSC1 is essential for mitosis and meiosis in S. cerevisiae (132). Under normal conditions, most, if not all, of the M_r 75,000 protein and Endo.SceI activity is found in mitochondria, suggesting that ENS1 is required for mitochondrial function. The ENS2 gene encoding the M_r 50,000 subunit has been identified as the mitochondrial RF3, which is postulated to encode a maturase-like protein showing sequence homology to the HO, ω, and aI4 gene products (132). The presence of the Endo.SceI activity correlates with the presence of an intact RF3 open reading frame in the mitochondrial genome. No phenotype has thus far been associated with mutations in the ENS2 gene (132). These data indicate that the lethal phenotype of ens1 mutations is not due to an effect on the activity of Endo.SceI. They also point to a role of Endo.SceI in mitochondrial rather than chromosomal (nuclear) recombination, possibly like that observed for ω.

A second double-stranded DNA endonuclease has been identified in S. cerevisiae and named Endo.SceII on the basis of a cleavage site that is clearly different from that of Endo.SceI (122, 130). Interestingly, the recognition sites of the two S. cerevisiae endonucleases and that of the HO endonuclease share some homology (130). Moreover, the four-base 3'-overhang structure of the cohesive ends produced is shared among all three activities. It has been suggested that these common features might result from a functional and evolutionary relationship between the three endonucleases. As HO endonuclease is clearly involved in the initiation of the site-specific recombination of the S. cerevisiae MAT locus, a cellular function for Endo.SceII in general recombination has been postulated (130).

At present, only limited information is available about endonucleases in S. cerevisiae. While it seems clear that specific double-strand breaks can act as recombination hotspots (22, 28, 79, 120, 121), the analysis of double-stranded DNA-specific endonucleases has not progressed to the point where it has been possible to identify the endonuclease(s) that make(s) these breaks. A direct role in chromosomal recombination has not been demonstrated for any of the known endonucleases, nor has there been any definitive analysis of single-stranded DNA-specific endonucleases. Given the important role that double-strand breaks appear to play in some recombination events, the continued analysis of double-stranded DNA-specific endonucleases should prove fruitful.
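
The two sequence-level features described in this section for Endo.SceI, a consensus containing undefined (any nucleotide) and half-defined (purine or pyrimidine) positions and a staggered cut leaving a four-base 3' overhang, can be made concrete with a small Python sketch. The 8-nucleotide consensus, the test sequence, and the cut position below are hypothetical placeholders chosen for illustration only; they are not the actual 26-bp Endo.SceI recognition site, and the function names are likewise assumptions.

```python
import re

# IUPAC codes for fully defined, half-defined (R = purine, Y = pyrimidine)
# and undefined (N = any nucleotide) positions.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> str:
    """Translate a degenerate consensus into a regular-expression string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus: str, sequence: str) -> list:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets the scan report overlapping occurrences.
    lookahead = re.compile(f"(?=({consensus_to_regex(consensus)}))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


def three_prime_overhang(sequence: str, top_cut: int, length: int = 4) -> str:
    """Single-stranded 3' tail left on the upstream fragment.

    The top strand is cut between positions top_cut-1 and top_cut; the
    bottom strand is cut `length` nucleotides upstream, so the last
    `length` bases of the upstream top strand are left unpaired.
    """
    if not length <= top_cut <= len(sequence):
        raise ValueError("cut position incompatible with overhang length")
    return sequence[top_cut - length:top_cut]


if __name__ == "__main__":
    consensus = "CAYNNRTG"      # hypothetical consensus, not the Endo.SceI site
    sequence = "TTCACGAATGAA"   # hypothetical substrate
    print(find_sites(consensus, sequence))
    print(three_prime_overhang(sequence, top_cut=8))
```

The point of the sketch is that a loosely defined consensus still specifies a definite set of matching positions, so a precise cut at a fixed offset within the site is compatible with a degenerate recognition sequence.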

c. Nucleases That Cleave Cruciform Structures. Considerable evidence has accumulated supporting the idea that the formation and resolution of Holliday junctions play an important role in recombination. Following the seminal studies on the gene-49 endonuclease of E. coli bacteriophage T4 (133), which cleaves Holliday junctions in addition to other substrates, three S. cerevisiae activities that cleave Holliday junctions diagonally across the junction, as required for resolution of these structures during recombination, have been identified and partially purified. It has been proposed to name these activities Endo-X1, -X2, and -X3, respectively (134).

Endo-X2 was first described (38, 135) as an activity that cleaves cruciform structures at the base and resolves the Holliday junction present in "figure-8" recombination intermediates. Although this activity has been purified over 50,000-fold (R. Reenan and R. D. Kolodner, unpublished results), it has not yet been possible to obtain a homogeneous preparation. Regardless, much is known about this activity. Unlike the T4 and T7 enzymes, Endo-X2 appears to be a Holliday-junction-specific endonuclease, although it should be pointed out that this conclusion is based on a limited set of data. It cleaves only four-arm structures such as cruciforms, figure-8 molecules, and Holliday junctions constructed from cloned bacteriophage λ att sites and does not cleave the three-armed junctions tested, single-stranded DNA, or single-stranded substrates containing double-stranded hairpins (135, 136; R. Reenan and R. D. Kolodner, unpublished results). Endo-X2 also does not cleave synthetic immobile Holliday junctions that do not contain sequence symmetry like that found in normally occurring Holliday junctions (137). Rather, these structures are competitive inhibitors of the enzyme and the enzyme forms stable complexes with them. The inability of Endo-X2 to cleave these structures could arise either because they lack the sequence symmetry of normally occurring Holliday junctions or because the arms in these structures were too short to allow some crucial interaction. The Holliday-junction-cleavage reaction appears to occur by a concerted double-strand cleavage that does not involve a nicked intermediate and that yields double-stranded DNA molecules containing nicks with 5'-phosphate and 3'-hydroxyl groups that can be joined by DNA ligase (136). This cleavage mechanism yields products that can easily be processed by DNA ligase to yield mature products. Examples of the cleavage products produced by this type of cleavage mechanism are presented in Fig. 5. As illustrated there, cleavage of Holliday junctions by Endo-X2 shows a high degree of sequence specificity. As little as a single base change in the symmetric core of a λ att site Holliday junction significantly alters both the rate and directionality of the cleavage event (136). This suggests that if Endo-X2 functions in the resolution of Holliday junctions during recombination, the resolution of Holliday junctions may not be a random event.

Endo-X1 has been described (138) as an activity present in mitotic cells that cleaves cruciform structures diagonally across the base. Like Endo-X2, this protein has not yet been purified to homogeneity. Extensive analysis of the substrate specificity of the activity shows that it cleaves a variety of cruciform structures diagonally across the junction (139-141). This cleavage appears to be due to the production of symmetrically located pairs of staggered nicks in the homologous arms of the cruciform, located 5-14 nucleotides from the base of the cruciform, depending on the exact substrate. Similar to that observed for Endo-X2, this cleavage mechanism yields products that can, in principle, be processed by DNA ligase to yield mature products. When pseudo-cruciforms containing four arms, each of which had a different sequence, were used as substrates, they were also cleaved, but the breaks were no longer symmetrically located. This observation has been taken to suggest that Endo-X1 might recognize homologous sequences in some way (140).

Endo-X3 was reported (134) as an activity in mitotic cells that has a rather different substrate specificity than have Endo-X1 and -X2. Remarkably, the specificity of Endo-X3 is indistinguishable from that of the T4 gene-49 endonuclease. It appears to be able to cleave cruciform structures, synthetic Holliday junctions that lack sequence symmetry, three-armed junctions, and heteroduplex loop structures. It is not known whether this enzyme has endonuclease activity on single-stranded DNA similar to the T4 and T7 endonucleases (142-144).

[Figure 5. Panels: 1/2 safG Holliday junction, 1st orientation; 1/2 safG Holliday junction, 2nd orientation; safG Holliday junction; 1/2 safT Holliday junction, 1st orientation; 1/2 safT Holliday junction, 2nd orientation; safT Holliday junction.]

FIG. 5. Effect of nucleotide sequence on the resolution of Holliday junctions by endo X2. This figure illustrates seven different Holliday junctions constructed using DNAs derived from different lambda site affinity mutants (saf mutants) and the cleavage patterns produced by endo X2 in these DNAs. Only the DNA sequence immediately flanking the common core is illustrated; the remainder is identical in all junctions. The thick boxes indicate regions potentially capable of branch-migration; thin boxes show the nucleotide substitutions found in saf mutants. The arrows indicate the positions of the mapped cut sites, and arrow length is proportional to the relative amount of cleavage at an individual site compared to the other sites within an individual strand. The illustrated results show that small changes in sequence at the core can have significant effects on whether a substrate can be cleaved by endo X2 and on the direction of the resulting resolution event. The symmetric location of many of the cleavage sites relative to each other is indicative of resolution events that produce linear duplex cleavage products containing ligatable nicks. Reproduced with permission from Evans and Kolodner (136).

While three different S. cerevisiae activities have been reported that cleave Holliday junctions, it is not yet clear that all of these activities are distinct from each other. Clearly, Endo-X2 and -X3 are distinct from each other on the basis of both substrate specificity and cofactor requirements. Endo-X2 will not cleave three-arm junctions or at the base of hairpins in single-stranded DNA, nor can Ca2+ be used as a cofactor (R. Reenan and R. D. Kolodner, unpublished results), whereas Endo-X3 cleaves these substrates and utilizes Ca2+ as a cofactor. Endo-X1 and -X2 appear to have some differences in substrate specificity in that Endo-X1 will cleave the cruciform present in the plasmid pIRbke8 efficiently and Endo-X2 does not cleave this substrate (R. Reenan and R. D. Kolodner, unpublished results). Endo-X1 and -X2 also appear to have different native molecular weights, although there could be considerable inaccuracy in the reported molecular weights of each activity, as only impure protein preparations have been analyzed. Endo-X1 and -X3 differ in both native molecular weight and cofactor requirement, as Endo-X1 cannot use Ca2+ as a cofactor, suggesting that these two activities are distinct. It seems clear that Endo-X3 is distinct from either Endo-X1 or -X2. The available data do not prove that Endo-X1 and -X2 are distinct activities. A resolution of this question awaits further purification of these activities.

It is presently unclear whether any of the S. cerevisiae Holliday-junction-cleaving activities function in recombination. Recently, a gene, CCE1 (cruciform-cleaving endonuclease 1), that encodes the major cruciform-cleaving endonuclease in S. cerevisiae has been identified (145). Endo-X2 appears to be absent in cce1 mutants, and the Holliday-junction-cleaving activity that is overproduced in S. cerevisiae strains containing the CCE1 gene on a high copy number plasmid or in E. coli under control of a T7 promoter cochromatographs with Endo-X2 (A. Soreng and R. D. Kolodner, unpublished results). This suggests that CCE1 encodes Endo-X2. At present, there are only limited data available on the effect of cce1 mutations on genetic recombination. The analysis of cce1 mutations could be complicated by the presence of multiple Holliday-junction-cleaving activities in S. cerevisiae if they have overlapping functions.

3. SINGLE-STRANDED DNA-BINDING PROTEINS (SSBs)

Bacteria, plasmids, and bacteriophages encode proteins, called SSBs, that bind tightly and cooperatively to single-stranded DNA and are required for DNA replication. Probably the best-characterized examples of such proteins are the T4 gene-32 protein and the E. coli SSB protein. These proteins bind to single-stranded DNA, removing secondary structure and facilitating the action of other proteins. There is direct genetic evidence that these proteins are required for recombination (as reviewed in 49), and the ability of SSBs to facilitate the action of RecA protein along with the ability of gene-32 protein to facilitate the action of the UvsX protein provide at least one possible role for these proteins in recombination (reviewed in 18, 51-54).

Since the initial reports of SSB proteins, numerous investigators have attempted to identify eukaryotic equivalents by isolating and analyzing proteins that bind to single-stranded DNA-cellulose columns. This approach has been fraught with difficulties, as the ability to bind to single-stranded DNA is a necessary, but not sufficient, criterion for identifying such proteins (as reviewed in 146). In this section, we discuss the various S. cerevisiae proteins that have been suggested to be SSBs, and examine the evidence that any of these proteins are true SSBs and function in recombination.

Two proteins, an M_r 37,000 polypeptide termed protein C (147) and an M_r 40,000 protein termed SSB1 (148), were isolated as factors that stimulate DNA polymerases. A comprehensive survey for SSBs on the basis of nucleic-acid-binding properties (149) identified, in addition to SSB1, two more SSBs, designated SSB2 (M_r 50,000) and SSB-m (M_r 20,000). All three proteins are immunologically unrelated (149).
The gene for SSB1 has been isolated, and disruption mutations in SSB1 show no discernible phenotype (150). Subsequent studies suggested a role for SSB1 in RNA metabolism, as the predicted amino-acid sequence showed homology with known RNA-binding proteins. In addition, the protein was located in the nucleolus by immunofluorescence, making a role in RNA metabolism seem likely (151). A role for SSB1 in RNA metabolism had been suggested earlier, as SSB1 was found to bind RNA and single-stranded DNA with essentially equal affinity (148, 149).

The product of the CDC8 (cell-division cycle 8) gene from S. cerevisiae was suggested to be an SSB because it could bind to single-stranded DNA (152). However, CDC8 was shown subsequently to encode thymidylate kinase (153, 154).

A number of putative SSBs have been described as factors that stimulate the S. cerevisiae STPα protein (67, 70, 89). As yet, no detailed information is available about the purification, nucleic-acid-binding properties, or relationship of these M_r 14,000, 20,000, 26,000, 35,000, 40,000, 42,000, and 55,000 polypeptides to each other and to other SSBs. These proteins stimulate the activity of STPα by aggregating the substrate DNA in the same way that agents such as spermidine and histones stimulate STPα activity (89). This mode of stimulation is different from that seen for proteins like E. coli SSB and T4 gene-32 protein (18, 51-53) and suggests that these proteins might not represent true SSBs.

A novel type of SSB was identified in higher eukaryotes as a factor that is essential for SV40 DNA replication in vitro (155-157) and that also appears to play a role in DNA repair in vitro (158). This activity, termed RP-A (replication protein A), RF-A (replication factor A), or HSSB (human single-stranded DNA-binding protein), was identified as a heterotrimer with subunits of M_r 70,000, 32,000, and 14,000. The multi-subunit complex showed single-stranded DNA-binding activity, which resides in the large, M_r 70,000, subunit (159). RP-A/RF-A/HSSB participates in the earliest steps of DNA replication, the local unwinding of the SV40 origin, and probably participates in the chain elongation phase as well (157). Other SSBs, such as E. coli SSB, can substitute for RP-A/RF-A in partial reactions (local unwinding) but cannot support the entire DNA replication reaction in the absence of RP-A/RF-A/HSSB (157). Recently, a similar trimeric complex, called yRF-A, has been purified from S. cerevisiae using an unwinding assay (160). This activity shows the same subunit composition as the human counterpart, namely, M_r 69,000, 36,000, and 13,000 subunits. Again, the large subunit contains the DNA binding site.

We identified (161) an SSB on the basis of its ability to stimulate an SEP1-catalyzed strand-exchange reaction (see Section III,C). Purified as an M_r 34,000 polypeptide, it had a high affinity for single-stranded DNA, and did not appear to bind to double-stranded DNA or RNA. Sequence analysis of the cloned gene, now called RPA1, revealed that it encodes a protein of M_r 70,000 (162, 163).
This is the same M_r as found for the large DNA-binding subunit of human RP-A/RF-A/HSSB. Comparison with peptide sequences from the human protein revealed a striking degree of conservation. Subsequently, the human gene for the large subunit was cloned and a high degree of homology (31%) between the yeast and the human genes was found throughout the entire open reading frame (164). Sequence comparison has also demonstrated a similar degree of conservation between the S. cerevisiae and human middle-sized (M_r 34,000) subunits (163). Sequence comparisons between the S. cerevisiae RPA1 gene and a peptide sequence obtained from the large subunit of the yeast trimeric complex (160) revealed 100% identity, indicating that the M_r 70,000 subunit of this protein is the product of the RPA1 gene (as cited in 162).

Further biochemical studies of the trimeric RP-A/RF-A complex of S. cerevisiae defined the binding-site size as 90-100 nucleotides of single-stranded DNA per trimeric unit. The protein·DNA complexes formed with high affinity (K = 1 × 10^9 M^-1) and the binding exhibited strong cooperativity (ω = 10^4-10^5) (165). Electron-microscope and nuclease protection analysis of the protein·DNA complexes revealed a nucleosome-like structure with 4-fold compaction of the single-stranded DNA (165). These DNA-binding properties are similar to those observed for E. coli SSB and T4 gene-32 protein. Biochemical analysis indicates that the intact trimeric S. cerevisiae yRF-A stimulates the strand-exchange protein SEP1 in vitro, suggesting a possible role in recombination (see Section III,C). Genetic analysis demonstrated that RPA1 encodes an essential function for S. cerevisiae mitotic growth (162, 163). Cells lacking RPA1 arrest as mononucleate, multiply budded structures, consistent with a defect in chromosomal DNA synthesis (162). However, there is as yet no genetic evidence that RP-A/RF-A actually functions in genetic recombination.

Another report describes the identification of a putative trimeric SSB, termed "DNA polymerase II stimulatory factor I," from S. cerevisiae (166). As discussed in this report, antisera raised against the yRF-A complex (160) did not cross-react with this stimulatory factor. Based on the sequence information discussed above, this factor is unlikely to represent the S. cerevisiae homolog of human RP-A/RF-A/HSSB.

Clearly, a large number of S. cerevisiae proteins have been described as SSBs. At present there are insufficient data about the DNA-binding properties of these proteins to substantiate the view that they are true SSBs; in the case of one protein, SSB1, it seems likely that the protein has a role in RNA metabolism. Only in the case of yRP-A/RF-A do the DNA-binding properties and the effect of the protein on both strand exchange and DNA replication in vitro support the idea that this protein is the eukaryotic equivalent of the prokaryotic SSBs.
While there are genetic data indicating that yRP-A/RF-A acts in DNA replication, there is as yet no genetic evidence that it is required for recombination or repair.

4. DNA HELICASES

The unwinding of duplex DNA is important for DNA replication, repair, and recombination. A class of enzymes, called DNA helicases, has been identified that promotes the unwinding of double-stranded DNA by disruption of hydrogen bonds to generate separated single strands in a reaction that requires ATP hydrolysis. There are at least two different roles that helicases are thought to play in recombination. These include the unwinding of DNA before synapsis in order to promote the initiation of homologous pairing, and action as an accessory factor for a homologous pairing protein to facilitate the strand-exchange process. In E. coli, there is extensive evidence that DNA helicases are involved in genetic recombination and mismatch repair (reviewed in 49; also see 35, 167). In S. cerevisiae, a number of DNA helicases have been identified using both biochemical and genetic methods. The properties of these proteins are discussed below.

The RAD3 group of genes controls the excision repair pathway in S. cerevisiae; RAD3 mutants are sensitive to killing by UV irradiation. In addition, some rad3 mutations (the rem alleles) are hyper-recombinogenic for spontaneous mitotic recombination (reviewed in 36). The DNA sequence of the RAD3 gene predicted an ATP-binding site and the purified protein is an ATPase (168). It was shown subsequently that the RAD3 protein has an ATP-dependent 5'-to-3' DNA helicase activity (169). The fact that deletions in the RAD3 gene are lethal whereas point mutants that lack helicase activity are viable and repair-deficient (170, 171) suggests that the RAD3 protein has additional functions besides its helicase activity.

PIF1 is a nuclear gene in which mutations result in defects in mitochondrial recombination and repair but do not cause a defect in overall mitochondrial maintenance (172-174). The nucleotide sequence of the PIF1 gene suggested that it encodes a DNA helicase because the translated PIF1 sequence has substantial homology to the sequence of known DNA helicases (174). Recently, it has been demonstrated directly that the purified PIF1 gene product is a DNA helicase, using as assay the displacement of a 41-mer oligonucleotide from a single-stranded circular DNA. The helicase is dependent on Mg2+ and ATP as cofactors and has a 5'-to-3' polarity. Further genetic, biochemical, and cytological studies have established the mitochondrial localization of the PIF1 protein (175).

Two laboratories have isolated suppressors of mutations in the RAD6 group of genes, RAD18 and RAD6.
They each identified the same gene, called either RADH (176) or SRS2 (177). Interestingly, the same gene was identified in a search for mitotic hyper-recombination mutations in S. cerevisiae and named HPR5 (178, 179). The DNA sequence of RADH/SRS2/HPR5 revealed significant homology to the known DNA helicases Rep and UvrD of E. coli (62). Helicase activity of the RADH/SRS2/HPR5 gene product has recently been directly demonstrated (cited in 180). Mutations in this gene cause an increased frequency of intragenic recombination but have no effect on intergenic recombination. The mutants do not have a mutator phenotype, but are sensitive to UV and γ irradiation and have reduced spore viability after meiosis (176, 178).

The DNA helicase ATPase III has been identified biochemically as an M_r 63,000 polypeptide that specifically stimulates DNA polymerase I activity in vitro (181). The direction of the unwinding reaction has not been determined. ATPase III activity is absent from strains carrying the rad3 mutation. Although RAD3 encodes a DNA helicase, it was argued, on the basis of immunological data, that the RAD3 gene does not encode ATPase III (181). The gene encoding this DNA helicase is not known, and a role in recombination remains speculative.

There is evidence for at least four helicases in S. cerevisiae. Clearly, the PIF1 helicase is important for recombination or repair of mitochondrial DNA. Mutations in RAD3 and RADH/SRS2/HPR5 cause a repair defect and in some cases a hyper-rec effect. These two phenotypes could arise from these helicases playing a direct role in repair, so that their loss leads to accumulation of spontaneous DNA damage that subsequently evokes an increase in the frequency of initiation of recombination. Alternatively, each of these proteins could act as anti-recombinases that regulate or prevent recombination (182). At present, it is unclear whether RAD3 and RADH/SRS2/HPR5 encode helicases that directly facilitate recombination. An important question about these two gene products is whether they function in mismatch repair. The hyper-rec effect on intragenic recombination caused by mutations in these genes is consistent with the results obtained with other mutations causing a defect in mismatch repair (183); however, the lack of a mutator phenotype is inconsistent with this idea (62, 178). There are no genetic data suggesting a specific role for the ATPase III helicase in any aspect of DNA metabolism. Clearly, the existence of multiple helicases suggests that genetic studies demonstrating a direct, positive role of any specific helicase in recombination, or in recombination-related repair events, could be complex.

5. DNA TOPOISOMERASES

DNA topoisomerases are enzymes that can relieve the torsional stress in double-stranded DNA generated during unwinding of the duplex. Apart from the involvement of topoisomerase activities in illegitimate or nonhomologous recombination, a role in homologous genetic recombination has recently become evident (reviewed in 184). Mutations in genes encoding the known topoisomerases I or II in S. cerevisiae and in genes encoding suspected DNA topoisomerases increase the frequency of certain recombination events.

Mitotic recombination in the rDNA cluster is suppressed in wild-type cells by the combined action of topoisomerases I and II (185). In top1 null mutants (type-I topoisomerase) and at semi-permissive temperatures in a temperature-sensitive top2 mutant (type-II topoisomerase), mitotic recombination between rDNA repeats was 50-200 times higher than in the wild type. This effect was specific for the rDNA locus, as recombination among other tandemly repeated genes remained unaffected (185). Additionally, cells with a deletion of TOP1 and a temperature-sensitive mutation in TOP2 had over half of the rDNA genes excised and present as extrachromosomal elements at the permissive temperature (186).

A mutation that increases the frequency of recombination between repeated delta sequences from the transposable element Ty1 has been isolated. This hyper-recombination mutation, initially termed edr1-1 (187) and later renamed top3-1, identified a gene with significant homology to the E. coli topA gene, which therefore encodes a putative type-I topoisomerase (188). Topoisomerase activity has not yet been directly demonstrated for the TOP3 gene product. In addition, HPR1, a gene in which mutations also result in a hyper-recombination phenotype (178), shares homology at the carboxy-terminal end with topoisomerase I from S. cerevisiae (189). Mutations in HPR1 affect intrachromosomal crossing-over (178), while recombination in the rDNA repeat is unaffected (189). As pointed out (184), the HPR1 protein shares homology with the gene product of the mammalian RAG1 gene, which controls the site-specific V(D)J recombination event in immunoglobulin gene rearrangement, although the significance of this is unknown at present (190).

The observed suppression of recombination by topoisomerases could be explained by two complementary mechanisms (184). First, topoisomerases relax supercoiled regions of DNA and thus suppress recombination stimulated by DNA supercoiling. While this mechanism could explain many observations, it cannot account for the hyper-recombination phenotype of mutations in TOP3 or HPR1, because it has been argued strongly that the TOP1 and TOP2 gene products are the only activities in S. cerevisiae that can relax supercoiled domains. The second mechanism proposed (184) envisions that a topoisomerase might act as an “anti-recombinator” after pairing of homologous or nearly homologous sequences has taken place. Thus, recombinants would arise only from structures that had escaped dissolution, similar to other ideas (182) invoking DNA helicase action.

A different role has been assigned to topoisomerase II in S. cerevisiae meiosis by analyzing cold-sensitive top2 mutant strains (191). At the nonpermissive temperature, top2 mutants are defective in meiosis and are unable to complete meiosis I, after having apparently completed premeiotic DNA synthesis and normal induction of meiotic recombination. Introduction of a rad50 mutation, which abolishes meiotic recombination, suppresses this meiotic defect of top2 mutations. Therefore, it was suggested (191) that one role of topoisomerase II is to disentangle intermediates or products of meiotic recombination and allow successful completion of meiosis.

6. DNA POLYMERASES AND DNA LIGASE

According to most recombination models, DNA-repair synthesis is required during the recombination process (12, 16). Possible roles for such repair synthesis include: driving strand-displacement reactions prior to synapsis; copy-choice DNA synthesis after strand invasion; repair of gaps during

260

W.-D. HEYER AND R. D. KOLODNER

the final steps in recombination to produce continuous DNA strands; and as part of mismatch repair (see Figs. 1 and 2). Among the three essential nuclear DNA polymerases in S. cerevisiae, DNA polymerase I or α (CDC17), II or ε, and III or δ (CDC2) (for an overview, see 192), it is uncertain which one is responsible for repair synthesis. Furthermore, repair synthesis might be provided by a fourth, as yet unidentified DNA polymerase, possibly the nonessential REV3 gene product, which has been speculated, on the basis of DNA sequence comparisons, to be a DNA polymerase (193). The REV3 gene is required for spontaneous and induced mutagenesis, and rev3 mutant strains exhibit a hyper-recombination phenotype (reviewed in 36). The mitochondrial DNA replicase, DNA polymerase M or γ, is encoded by the MIP1 gene (194). As yet there is no evidence that the MIP1 gene product is involved in DNA repair (195).

Direct evidence for the involvement of DNA polymerases in genetic recombination comes from a genetic analysis (178). Both hpr3 and hpr6 mutations cause a strong hyper-recombination phenotype for intergenic recombination and show a more modest elevation in intragenic recombination (178). HPR3 is identical to CDC17, the gene encoding DNA polymerase I, and HPR6 is identical to CDC2, the gene encoding DNA polymerase III (178, 196). Similar results have been obtained in an analysis of cdc mutations (197). Recent evidence from studies of induced mitotic recombination suggests an involvement of DNA polymerase III in induced gene conversion (198).

DNA ligases are the class of enzymes that seal single-strand breaks in duplex DNA and are thought to be important for replication, repair, and recombination. The DNA ligase encoded by the CDC9 gene of S. cerevisiae (199) is the only ligase activity known in this organism. Surprisingly, this ligase appears not yet to have been purified to homogeneity, even though its existence has been known for a number of years. Mutants defective in ligase exhibit a hyper-recombination phenotype for both gene conversion and crossing-over (200, 201).

The hyper-recombination phenotypes of DNA-ligase and DNA-polymerase mutants have been interpreted as the result of an accumulation of recombinogenic structures in DNA: gaps in the case of DNA polymerases, and single-strand breaks in the case of DNA ligase (3). As such, the genetic studies showing that mutations in genes encoding DNA polymerases increase the frequency of some recombination events have provided evidence that DNA damage can initiate recombination, but have not yet identified the DNA polymerase(s) that catalyze(s) the repair synthesis required for recombination. While a similar argument also holds for the genetic analysis of DNA ligase, the fact that there appears to be only one DNA ligase in S. cerevisiae suggests that this enzyme probably functions directly in recombination.


C. Toward a Reconstitution of a Complete in Vitro System

A long-term goal for a biochemist studying recombination is the reconstitution of a complete in vitro system from purified components. For several other complex cellular processes, most notably DNA replication (32, 33) and methyl-directed mismatch repair (35), complete reconstituted systems have been developed and have proven invaluable for the understanding of these processes. This section summarizes approaches presently being used to identify additional components required to reconstitute homologous recombination in vitro.

The development of an in vitro system from crude extracts, as shown in Fig. 3, was an important advance, as it provided both a basis for studying recombination and, potentially, assays for identifying required components (37). The initial studies involved measuring recombination between plasmid substrates carrying different mutations in the tetracycline-resistance gene. Recombinant formation was monitored genetically by transformation of the product DNA into E. coli followed by selection for tetracycline-resistant clones. This system has been useful in studying some aspects of the mechanism of recombination and has served as an initial basis for the purification of several proteins (65, 161, 202). However, the original assay proved to be too time-consuming to be used for general fractionation studies. Recent studies with such in vitro systems have provided additional information about the mechanism of recombination reactions catalyzed in vitro, but have not generally improved the speed of these assays. PCR potentially provides a more applicable, rapid assay for use in fractionation studies (41, 203). The success of this general approach, which is not without its potential problems, is likely to be crucial to the ultimate reconstitution of recombination in vitro and so remains an important area for investigation.

In the absence of a complete reaction that is practical to reconstitute, a different approach has been developed to identify proteins that might function in reactions catalyzed by the strand-exchange protein, SEP1. The formation of hybrid DNA is central to recombination; most likely, several proteins are involved in this process in vivo. SEP1 can catalyze the formation of hybrid DNA without additional factors. In order to identify new proteins potentially involved in genetic recombination in S. cerevisiae, activities that stimulate the SEP1-catalyzed strand-exchange reaction under conditions of limiting amounts of SEP1 have been purified. This approach is similar to that used to demonstrate a requirement for the E. coli Fis protein in the bacteriophage-lambda excision reaction (204). A requirement for Fis in addition to Xis, Int, and Ihf was detected only when limiting concentrations of Xis protein were present.


the reaction were stimulated. As discussed in Section III,B,3, this protein proved to be a fragment of the DNA-binding subunit of yRF-A. The interaction between the authentic heterotrimeric yRP-A/yRF-A complex and both the Mr 132,000 and 175,000 forms of SEP1 was similar to that observed with the Mr 34,000 fragment, with the exception that maximal stimulation was observed at a stoichiometry of 1 trimeric yRP-A/yRF-A complex per 95 bases of single-stranded DNA (165). The trimeric yRP-A/yRF-A complex could also stimulate RecA protein (165). The human trimeric RP-A/RF-A complex stimulated E. coli recA to the same extent as E. coli SSB, but was inert in reactions with S. cerevisiae SEP1 (as cited in 162).

A second polypeptide has been identified and purified using this approach. This protein has been termed SF1, for stimulatory factor 1, as it greatly increased the formation of joint molecules in reactions with limiting concentrations of SEP1 (202, 206). SF1 by itself is a weak single-stranded DNA-binding protein that has several unique activities. Upon binding to DNA, it forms large protein·DNA aggregates, effectively precipitating the DNA at high protein concentrations. In addition, SF1 catalyzes the simplest pairing reaction, the renaturation of complementary single-stranded DNA, although it is completely inert in the strand-exchange reaction, as shown in Fig. 4. These characteristics are similar to those of DPA, the strand-exchange protein that requires activated substrates for strand exchange (68). However, the two proteins must be different, since the gene encoding SF1 does not have the capacity to encode an Mr 120,000 polypeptide (68; D. Norris and R. D. Kolodner, unpublished results).

Saturating amounts of SF1 in SEP1-catalyzed strand-exchange reactions result in a large stimulation of activity. As illustrated in Fig. 6, in reactions catalyzed by SEP1 alone, approximately 550 mol of SEP1 are required per mole of substrate (7200-nucleotide M13mp19). In conjunction with the Mr 34,000 fragment or the intact heterotrimeric yRP-A/yRF-A complex, this amount is reduced to less than 200 mol. However, in the presence of SF1, essentially 1 mol of SEP1 suffices for strand exchange. The requirements for this reaction remain unchanged, and in particular there is no energy requirement. Extensive analysis indicates that stimulation of SEP1 by SF1 probably occurs by a specific mechanism and is unlikely to represent nonspecific stimulation by simple precipitation of the substrate DNA, as has been observed with spermidine or basic proteins such as histones (70, 206, 207). One possible mechanism by which SF1 could stimulate SEP1 is by promoting a distinct part of the strand-exchange reaction, such as branch migration, with SEP1 functioning in the initial pairing phase of the reaction (206). This is an attractive mechanism, because the ability of SF1 to renature DNA suggests it might promote other types of pairing reactions using appropriate substrates.
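The stoichiometries quoted above can be put on a per-nucleotide basis with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the input numbers are those given in the text, while the function names and the assumption that every figure refers to the same 7200-nucleotide M13mp19 substrate are illustrative choices made here.

```python
# Stoichiometry sketch for the SEP1 strand-exchange reactions described above.
# Input numbers are taken from the text; the per-nucleotide conversion is an
# illustrative assumption (it treats the 7200-nucleotide M13mp19 molecule as
# the reference substrate for every figure).

SUBSTRATE_NT = 7200  # nucleotides in the M13mp19 substrate (from the text)

# Moles of SEP1 required per mole of substrate, as quoted in the text.
SEP1_PER_SUBSTRATE = {
    "SEP1 alone": 550,
    "SEP1 + yRP-A/yRF-A": 200,  # text says "less than 200", so an upper bound
    "SEP1 + SF1": 1,            # text says "essentially 1"
}


def nucleotides_per_molecule(mol_per_substrate, substrate_nt=SUBSTRATE_NT):
    """Nucleotides of substrate available per protein molecule."""
    return substrate_nt / mol_per_substrate


def fold_reduction(condition, reference="SEP1 alone"):
    """How many-fold less SEP1 a condition needs relative to the reference."""
    return SEP1_PER_SUBSTRATE[reference] / SEP1_PER_SUBSTRATE[condition]


# Maximal stimulation was seen at one trimeric yRP-A/yRF-A complex per
# 95 bases of single-stranded DNA (165); convert that to trimers per substrate.
trimers_per_substrate = SUBSTRATE_NT / 95

if __name__ == "__main__":
    for condition, mol in SEP1_PER_SUBSTRATE.items():
        print(f"{condition}: {nucleotides_per_molecule(mol):.1f} nt per SEP1, "
              f"{fold_reduction(condition):.2f}-fold less SEP1 than SEP1 alone")
    print(f"yRP-A/yRF-A trimers per substrate: {trimers_per_substrate:.0f}")
```

Read this way, SEP1 alone is needed at roughly one molecule per 13 nucleotides of substrate, whereas in the presence of SF1 the requirement drops by a factor of about 550. The figure for yRP-A/yRF-A is a lower bound on the reduction, because the text gives only "less than 200 mol".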


The gene encoding SF1 has recently been cloned and a deletion mutant constructed. Other than the fact that the null mutant is viable, little is known about the properties conferred by sf1 mutations. The SF1 protein has been overproduced using the cloned gene and purified to homogeneity in large amounts (D. Norris and R. D. Kolodner, unpublished results). Overproduced SF1 protein promotes the formation of joint DNA molecules between linear double-stranded DNA and homologous single-stranded circular molecules. The mechanism involves extensive strand displacement, provided that the linear double-stranded DNA contains a short single-stranded tail homologous to the single-stranded circular molecule. This is reminiscent of the strand-exchange protein DPA (68), which is clearly a different protein, and of SEP1 under conditions where its intrinsic exonuclease activity is inactive. These results suggest that SF1 could be a new S. cerevisiae strand-exchange protein and that the original assay detecting SF1 was actually measuring the stimulation of SF1 by the exonuclease activity of SEP1 (D. Norris, A. Johnson, and R. Kolodner, unpublished results).

As indicated above, conditions that concentrate the substrate DNA stimulate strand-exchange activities; this type of stimulation has been studied in detail (89). The ability to aggregate DNA clearly correlates with the extent of stimulation of STPα-catalyzed reactions. It has been suggested that the aggregation mimics the high DNA concentrations found in vivo, thereby representing a specific characteristic of the in vitro reaction. A direct in vivo role of the formation of aggregated complexes has also been suggested (89).

D. Concluding Comments

Many genes have been identified through mutations that affect genetic recombination (3, 36). For an increasing number of these genes, specific predictions about the biochemical function of the respective proteins can be made. The biochemical approach has led to the purification of several proteins involved in recombination in vitro; however, genetic characterization of the genes encoding these proteins is still very preliminary. We expect that the genetic and the biochemical approaches will identify the same genes, but this overlap has not yet been achieved. The full significance of the proteins identified biochemically can be assessed only on the basis of the genetic analysis of mutations in the respective genes. The genetic redundancy of certain functions is already indicated at the protein level, and analyzing the contribution of such functions to genetic recombination will require strains containing mutations in combinations of isofunctional genes.

One present major limitation of the biochemical approach is the absence of an assay system for recombination in vitro that easily allows the fractionation and reconstitution of the reaction. The use of the PCR as in the system


described by Symington (41) and White and Haber (27) is a novel alternative to previous assays based on transformation of E. coli and could lead to a solution of this problem.

The reconstitution of more complex in vitro recombination reactions starting from an individual activity has only just been initiated. The identification and purification of stimulatory factors in the strand-exchange reaction has been the only approach used, but other avenues seem possible. As in other complex reactions, like DNA replication or transcription, multiple proteins will probably act together in catalyzing recombination and must coordinate by protein·protein interactions. Protein·protein interactions can be directly probed starting from the already identified activities, and might lead to the discovery of additional factors.

In addition to further identifying the enzymatic complexity of recombination reactions, the substrates for in vitro recombination reactions should be more similar to the substrates likely to be found inside the cell. For example, current model reactions for strand exchange involve one single-stranded and one double-stranded substrate but do not generally test duplex·duplex interactions or interactions between covalently closed DNA molecules without free ends. In addition, currently used substrates are protein-free DNA molecules, but the presence of histones and other proteins associated with the DNA in vivo, constituting a eukaryotic chromosome, must be appreciated. It has already been demonstrated that the presence of nucleosomes has profound effects on RecA-catalyzed reactions (208). The involvement of certain factors in recombination might be appreciated in vitro only when more complex substrates are used.

With the introduction of cytology in studying recombination in the yeast S. cerevisiae (see 209-211 for examples), another indicator for a possible function of proteins encoded by genes controlling recombination has been developed. The precise subcellular localization of proteins as well as the effect of mutations on chromosome structure may reveal a role in recombination. For example, mutations in the HOP1 gene of S. cerevisiae abolish recombination between homologs (interchromosomal recombination), but hop1 mutants are proficient in intrachromosomal recombination between duplicated sequences (212). This phenotype was interpreted as a pairing defect in meiosis. The precise localization of the HOP1 protein has shown that it is a meiosis-specific component of S. cerevisiae chromosomes (210). As synaptonemal complexes (meiosis-specific chromatin containing paired chromosomes) were not found in the hop1 mutants during meiosis, it was suggested that the HOP1 protein might be a constituent of this structure rather than an enzyme that acts directly on DNA during recombination (210).

We are only beginning to learn about the cellular structures associated with recombination in S. cerevisiae. The analysis of the structural components and functional constituents of these structures will be an essential third route to unraveling the mechanism of homologous recombination in eukaryotes.

ACKNOWLEDGMENTS

This work was supported by grants GM29383 and HG00305 from the National Institutes of Health to R.D.K. W.-D.H. is supported by the Swiss National Science Foundation through the START program and grant 31.30202.90. We express our gratitude to R. Fishel, S. Kearsey, H. Klein, J. Kohli, R. Ljungdahl, K. McEntee, A. Sentenac, A. Stevens, L. Symington, and P. Thuriaux, and to members of our laboratories for discussions, comments on the manuscript, and sharing unpublished data.

REFERENCES

1. S. Fogel, R. K. Mortimer and K. Lusnak, in “The Molecular Biology of the Yeast Saccharomyces: Metabolism and Gene Expression” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), p. 289. CSHLab, Cold Spring Harbor, New York, 1981.
2. T. L. Orr-Weaver and J. W. Szostak, Microbiol. Rev. 49, 33 (1985).
3. T. D. Petes, R. E. Malone and L. S. Symington, in “The Molecular and Cellular Biology of the Yeast Saccharomyces: Genome Dynamics, Protein Synthesis and Energetics” (J. R. Broach, E. Jones and J. Pringle, eds.), Vol. I, p. 407. CSHLab, Cold Spring Harbor, New York, 1991.
4. R. S. Kucherlapati and P. D. Moore, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 575. American Society for Microbiology, Washington, D.C., 1988.
5. S. Fogel, R. Mortimer, K. Lusnak and F. Tavares, CSHSQB 43, 1325 (1979).
6. M. D. Mitchell, PNAS 41, 935 (1955).
7. M. E. Case and N. H. Giles, Genetics 49, 529 (1964).
8. V. Enea and N. D. Zinder, JMB 101, 25 (1976).
9. R. M. Benbow, A. J. Zuccarelli, G. C. Davis and R. L. Sinsheimer, J. Virol. 13, 898 (1976).
10. H. L. Klein, Nature 310, 748 (1984).
11. R. Holliday, Genet. Res. 5, 282 (1964).
12. M. S. Meselson and C. M. Radding, PNAS 72, 358 (1975).
13. N. Sigal and B. Alberts, JMB 71, 789 (1972).
14. H. Sobell, PNAS 72, 279 (1975).
15. R. D. Hotchkiss, Annu. Rev. Microbiol. 27, 445 (1974).
16. J. W. Szostak, T. L. Orr-Weaver, R. J. Rothstein and F. W. Stahl, Cell 33, 25 (1983).
17. J. Wilson, PNAS 75, 3641 (1979).
18. C. M. Radding, ARGen 16, 405 (1982).
19. D. K. Bishop, M. S. Williamson, S. Fogel and R. D. Kolodner, Nature 328, 362 (1987).
20. M. Lichten, C. Goyon, N. P. Schultes, D. Treco, J. W. Szostak, J. E. Haber and A. Nicolas, PNAS 87, 7653 (1990).
21. D. S. Thaler and F. W. Stahl, ARGen 22, 169 (1988).
22. H. Sun, D. Treco, N. P. Schultes and J. W. Szostak, Nature 338, 87 (1989).


23. B. Connolly, C. I. White and J. E. Haber, MCBiol 8, 2342 (1988).
24. T. L. Orr-Weaver and J. W. Szostak, PNAS 80, 4417 (1983).
25. L. R. Bell and B. Byers, CSHSQB 47, 829 (1982).
26. R. H. Borts, M. Lichten and J. E. Haber, Genetics 113, 551 (1986).
27. C. I. White and J. E. Haber, EMBO J. 9, 663 (1990).
28. L. Cao, E. Alani and N. Kleckner, Cell 61, 1089 (1990).
29. D. K. Bishop and R. D. Kolodner, MCBiol 6, 3401 (1986).
30. D. K. Bishop, J. Anderson and R. D. Kolodner, PNAS 86, 3713 (1989).
31. W. Kramer, B. Kramer, M. S. Williamson and S. Fogel, MCBiol 9, 4432 (1989).
32. A. Kornberg, “DNA Replication.” Freeman, San Francisco, 1980.
33. A. Kornberg and T. Baker, “DNA Replication,” 2nd Ed. Freeman, San Francisco, 1991.
34. T. Tsurimoto, T. Melendy and B. Stillman, Nature 346, 534 (1990).
35. R. S. Lahue, K. G. Au and P. Modrich, Science 245, 160 (1989).
36. B. A. Kunz and R. H. Haynes, ARGen 15, 57 (1981).
37. L. S. Symington, L. M. Fogarty and R. Kolodner, Cell 35, 805 (1983).
38. L. S. Symington, P. T. Morrison and R. Kolodner, CSHSQB 49, 805 (1984).
39. L. S. Symington, P. T. Morrison and R. Kolodner, MCBiol 5, 2361 (1985).
40. Y. Hotta, S. Tabata, R. A. Bouchard, R. Pinon and H. Stern, Chromosoma 93, 140 (1985).
41. L. S. Symington, EMBO J. 10, 987 (1991).
42. S. Kunes, D. Botstein and M. S. Fox, CSHSQB 49, 617 (1984).
43. R. L. White and M. S. Fox, PNAS 71, 1544 (1974).
44. C. Muster-Nassal and R. D. Kolodner, PNAS 83, 7618 (1986).
45. P. Thuriaux, M. Minet, P. Munz, A. Ahmad, D. Zbaeren and U. Leupold, Curr. Genet. 1, 89 (1980).
46. J. H. White, K. Lusnak and S. Fogel, Nature 315, 350 (1985).
47. P. Detloff, J. Sieber and T. D. Petes, MCBiol 11, 737 (1991).
48. M. S. Williamson, J. C. Game and S. Fogel, Genetics 110, 609 (1985).
49. G. R. Smith, Microbiol. Rev. 52, 1 (1988).
50. A. K. Eggleston and S. C. Kowalczykowski, Biochimie 73, 163 (1991).
51. C. M. Radding, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 193. American Society for Microbiology, Washington, D.C., 1988.
52. C. M. Radding, JBC 266, 5355 (1991).
53. M. M. Cox and I. R. Lehman, ARB 56, 229 (1987).
54. S. C. Kowalczykowski, Annu. Rev. Biophys. Chem. 20, 539 (1991).
55. M. M. Cox and I. R. Lehman, PNAS 78, 3433 (1981).
56. J. G. McCarthy, M. Sander, K. Lowenhaupt and A. Rich, PNAS 85, 5854 (1988).
57. J. D. Griffith and L. D. Harris, CRC Crit. Rev. Biochem. 23 (Suppl. 1), S43 (1988).
58. J. F. Angulo, J. Schwencki, P. L. Moreau, E. Moustacchi and R. Devoret, MGG 201, 20 (1985).
59. S. J. Elledge and R. W. Davis, MCBiol 7, 2783 (1987).
60. H. K. Hurd, C. W. Roberts and J. W. Roberts, MCBiol 7, 3673 (1987).
61. P. Sung, L. Prakash and S. Prakash, Nature 355, 743 (1992).
62. A. Aboussekhra, R. Chanet, A. Adjiri and F. Fabre, MCBiol 12, 3224 (1992).
63. D. K. Bishop, D. Park, L. Xu and N. Kleckner, Cell 69, 439 (1992).
64. A. Shinohara, H. Ogawa and T. Ogawa, Cell 69, 457 (1992).
65. R. Kolodner, D. H. Evans and P. T. Morrison, PNAS 84, 5660 (1987).
66. A. W. Johnson and R. D. Kolodner, JBC 266, 14046 (1991).
67. A. Sugino, J. Nitiss and M. A. Resnick, PNAS 85, 3683 (1988).
68. J. Halbrook and K. McEntee, JBC 264, 21403 (1989).
69. D. Tishkoff, A. W. Johnson and R. D. Kolodner, MCBiol 11, 2593 (1991).


70. C. C. Dykstra, R. K. Hamatake and A. Sugino, JBC 265, 10968 (1990).
71. W.-D. Heyer, D. H. Evans and R. D. Kolodner, JBC 263, 15189 (1988).
72. S. S. Tsang, S. A. Chow and C. M. Radding, Bchem 24, 3226 (1985).
73. S. A. Chow and C. M. Radding, PNAS 82, 5646 (1985).
74. A. Stevens, BBRC 81, 656 (1978).
75. A. Stevens, JBC 255, 3080 (1980).
76. A. Stevens and M. K. Maupin, ABB 252, 339 (1987).
77. F. W. Larimer and A. Stevens, Gene 95, 85 (1990).
78. C. C. Dykstra, K. Kitada, A. B. Clark, R. K. Hamatake and A. Sugino, MCBiol 11, 2583 (1991).
79. A. Nicolas, D. Treco, N. P. Schultes and J. W. Szostak, Nature 338, 35 (1989).
80. H. Sun, D. Treco and J. W. Szostak, Cell 64, 1155 (1991).
81. J. Kim, P. O. Ljungdahl and G. R. Fink, Genetics 126, 799 (1990).
82. D. Kipling, C. Tambini and S. E. Kearsey, NARes 19, 1385 (1991).
83. T. Ogawa, G. G. Pickett, T. Kogoma and A. Kornberg, PNAS 81, 1040 (1984).
84. J. M. Kaguni and A. Kornberg, JBC 259, 8578 (1984).
85. K. G. Lark and C. A. Lark, CSHSQB 43, 537 (1978).
86. E. M. Witkin and T. Kogoma, PNAS 81, 7539 (1984).
87. Y.-M. Mao, Q. Shi, Q.-G. Li and Z.-J. Sheng, MGG 225, 234 (1991).
88. T. Kogoma, K. Skarstad, E. Boye, K. von Meyenburg and H. B. Steen, J. Bact. 163, 439 (1985).
89. R. K. Hamatake, C. C. Dykstra and A. Sugino, JBC 264, 13336 (1989).
90. A. B. Clark, C. C. Dykstra and A. Sugino, MCBiol 11, 2576 (1991).
91. J.-C. Hubert, A. Guyonvarch, B. Kammerer, F. Exinger, P. Liljelund and F. Lacroute, EMBO J. 2, 2071 (1983).
92. C. J. Davies, J. Trgovich and C. A. Hutchinson III, Nature 345, 298 (1990).
93. M. Sawadogo, A. Sentenac and P. Fromageot, JBC 255, 12 (1980).
94. A. Eisen and R. D. Camerini-Otero, PNAS 85, 7481 (1988).
95. P. Hsieh, M. S. Meyn and R. D. Camerini-Otero, Cell 44, 885 (1986).
96. S. P. Moore and R. Fishel, JBC 265, 11108 (1990).
97. E. C. Friedberg, “DNA Repair.” Freeman, New York, 1985.
98. S. B. Zimmermann and C. K. Oshinsky, JBC 244, 4689 (1969).
99. P. W. Riddles and I. R. Lehman, JBC 260, 170 (1985).
100. J. P. Menetski, D. G. Bear and S. C. Kowalczykowski, PNAS 87, 21 (1990).
101. W. Rosselli and A. Stasiak, JMB 216, 335 (1990).
102. S. T. Lovett and R. D. Kolodner, PNAS 86, 2627 (1989).
103. S. R. Linn, in “Nucleases” (S. M. Linn and R. J. Roberts, eds.), p. 59. CSHLab, Cold Spring Harbor, New York, 1985.
104. P. D. Sadowski, in “Nucleases” (S. M. Linn and R. J. Roberts, eds.), p. 23. CSHLab, Cold Spring Harbor, New York, 1985.
105. L. M. S. Chang, JBC 252, 1873 (1977).
106. E. Wintersberger, EJB 84, 167 (1978).
107. I. S. Villadsen, S. E. Bjørn and A. Vrang, JBC 257, 8177 (1982).
108. G. A. Bauer, H. M. Heller and P. M. J. Burgers, JBC 263, 917 (1988).
109. P. M. J. Burgers, G. A. Bauer and L. Tam, JBC 263, 8099 (1988).
110. M. Dolberg, C.-P. Baur and R. Knippers, EJB 198, 783 (1991).
111. T. Y.-K. Chow and M. A. Resnick, JBC 262, 17659 (1987).
112. E. Dake, T. J. Hofmann, S. McIntire, A. Hudson and H. P. Zassenhaus, JBC 263, 7691 (1988).
113. T. Y.-K. Chow and B. A. Kunz, Curr. Genet. 20, 39 (1991).


114. T. Y.-K. Chow and M. J. Fraser, JBC 258, 12010 (1983).
115. K. Adzuma, T. Ogawa and H. Ogawa, MCBiol 4, 2735 (1984).
116. T. Y.-K. Chow and M. A. Resnick, MGG 211, 41 (1988).
117. S. G. Rogers and B. Weiss, in “Methods in Enzymology” (L. Grossman and K. Moldave, eds.), Vol. 65, p. 201. Academic Press, New York, 1980.
118. H. P. Zassenhaus, T. J. Hofmann, R. Uthayashankar, R. D. Vincent and M. Zona, NARes 16, 3283 (1988).
119. R. D. Vincent, T. J. Hofmann and H. P. Zassenhaus, NARes 16, 3297 (1988).
120. A. L. Kolodkin, A. J. S. Klar and F. W. Stahl, Cell 46, 73 (1986).
121. A. J. S. Klar and L. M. Miglio, Cell 46, 725 (1986).
122. R. Kostriken, J. N. Strathern, A. J. S. Klar, J. B. Hicks and F. Heffron, Cell 35, 167 (1983).
123. L. Colleaux, L. d'Auxiol, M. Betermier, G. Cottarel, A. Jacquier, F. Galibert and B. Dujon, Cell 44, 521 (1986).
124. A. Dellahodde, V. Goguel, A. M. Becam, F. Cresuot, J. Perea, J. Banroques and C. Jacq, Cell 56, 431 (1989).
125. J. M. Wenzlau, R. J. Saldanha, R. A. Butow and P. S. Perlman, Cell 56, 421 (1989).
126. H. Watabe, T. Shibata and T. Ando, J. Biochem. 90, 1623 (1981).
127. H. Watabe, T. Iino, T. Kaneko, T. Shibata and T. Ando, JBC 258, 4663 (1983).
128. H. Watabe, T. Shibata, T. Iino and T. Ando, J. Biochem. 95, 1677 (1984).
129. K.-I. Nakagawa, J.-I. Hashikawa, O. Makino, T. Ando and T. Shibata, EJB 171, 23 (1988).
130. T. Shibata, H. Watabe, T. Kaneko, T. Iino and T. Ando, JBC 259, 10499 (1984).
131. N. Morishima, K.-I. Nakagawa, E. Yamamoto and T. Shibata, JBC 265, 15189 (1990).
132. K.-I. Nakagawa, N. Morishima and T. Shibata, JBC 266, 1977 (1991).
133. K. Mizuuchi, B. Kemper, J. Hays and R. A. Weisberg, Cell 29, 357 (1982).
134. F. Jensch, H. Kosak, N. C. Seeman and B. Kemper, EMBO J. 8, 4325 (1989).
135. L. S. Symington and R. Kolodner, PNAS 82, 7247 (1985).
136. D. H. Evans and R. Kolodner, JMB 201, 69 (1988).
137. D. H. Evans and R. Kolodner, JBC 262, 9160 (1987).
138. S. C. West and A. Korner, PNAS 82, 6445 (1985).
139. S. C. West, C. A. Parsons and S. M. Picksley, JBC 262, 12752 (1987).
140. C. A. Parsons and S. C. West, Cell 52, 621 (1988).
141. C. A. Parsons, A. I. H. Murchie, D. M. J. Lilley and S. C. West, EMBO J. 8, 239 (1989).
142. B. Kemper, M. Garabett and U. Courage, EJB 115, 133 (1981).
143. M. S. Center and C. C. Richardson, JBC 245, 6285 (1970).
144. M. S. Center and C. C. Richardson, JBC 245, 6292 (1970).
145. S. Kleff, S. Kemper and R. Sternglanz, EMBO J. 11, 699 (1992).
146. K. R. Williams and J. W. Chase, in “The Biology of Non-specific DNA-Protein Interactions” (A. Revzin, ed.), p. 197. CRC Press, Boca Raton, Florida, 1990.
147. L. M. S. Chang, K. Lurie and P. Plevani, CSHSQB 43, 587 (1979).
148. S. G. LaBonne and L. B. Dumas, Bchem 22, 3214 (1983).
149. A. Y.-S. Jong, R. Aebersold and J. L. Campbell, JBC 260, 16367 (1985).
150. A. Y.-S. Jong and J. L. Campbell, PNAS 83, 877 (1986).
151. A. Y.-S. Jong, M. W. Clark, M. Gilbert, A. Oehm and J. Campbell, MCBiol 7, 2947 (1987).
152. J. Arendes, K. C. Kim and A. Sugino, PNAS 80, 673 (1983).
153. R. A. Sclafani and W. L. Fangman, PNAS 81, 5821 (1984).
154. A. Y.-S. Jong, C. Kuo and J. L. Campbell, JBC 259, 11052 (1984).
155. C. R. Wobbe, L. Weissbach, J. A. Borowiec, F. B. Dean, Y. Murakami, P. Bullock and J. Hurwitz, PNAS 84, 1834 (1987).
156. M. P. Fairman and B. Stillman, EMBO J. 7, 1211 (1988).
157. M. S. Wold and T. J. Kelly, PNAS 85, 2523 (1988).


158. D. Coverly, M. K. Kenny, M. Mumm, W. D. Rupp, D. P. Lane and R. D. Wood, Nature 349, 538 (1991).
159. M. S. Wold, D. H. Weinberg, D. M. Virshup, J. J. Li and T. J. Kelly, JBC 264, 2801 (1989).
160. S. J. Brill and B. Stillman, Nature 342, 92 (1989).
161. W.-D. Heyer and R. D. Kolodner, Bchem 28, 2856 (1989).
162. W.-D. Heyer, M. R. S. Rao, L. F. Erdile, T. J. Kelly and R. D. Kolodner, EMBO J. 9, 2321 (1990).
163. S. J. Brill and B. Stillman, Genes Dev. 5, 1589 (1991).
164. L. F. Erdile, W.-D. Heyer, R. Kolodner and T. J. Kelly, JBC 266, 12090 (1991).
165. E. Alani, R. Thresher, J. D. Griffith and R. D. Kolodner, JMB 226, 54 (1992).
166. W. C. Brown, J. K. Smiley and J. L. Campbell, PNAS 87, 677 (1990).
167. K. Umezu, K. Nakayama and H. Nakayama, PNAS 87, 5363 (1990).
168. P. Sung, L. Prakash, S. Weber and S. Prakash, PNAS 84, 6045 (1987).
169. P. Sung, L. Prakash, S. W. Matson and S. Prakash, PNAS 84, 8951 (1988).
170. L. Naumovski and E. C. Friedberg, PNAS 80, 4818 (1983).
171. D. R. Higgins, S. Prakash, P. Reynolds, R. Polakowska, S. Weber and L. Prakash, PNAS 80, 5680 (1983).
172. F. Foury and J. Kolodynski, PNAS 80, 5345 (1983).
173. F. Foury and E. Van Dyck, EMBO J. 4, 3525 (1985).
174. F. Foury and A. Lahaye, EMBO J. 6, 1441 (1987).
175. A. Lahaye, H. Stahl, D. Thines-Sempoux and F. Foury, EMBO J. 10, 997 (1991).
176. A. Aboussekhra, R. Chanet, Z. Zgaga, C. Cassier-Chaurat, M. Hende and F. Fabre, NARes 17, 7211 (1989).
177. C. W. Lawrence and R. B. Christensen, J. Bact. 139, 866 (1989).
178. A. Aguilera and H. L. Klein, Genetics 119, 779 (1988).
179. L. Rong, F. Palladino, A. Aguilera and H. L. Klein, Genetics 127, 75 (1991).
180. F. Palladino and H. L. Klein, Genetics 132, 23 (1992).
181. A. Sugino, B. H. Ryn, T. Sugino, L. Naumovski and E. C. Friedberg, JBC 261, 11744 (1986).
182. D. S. Thaler, E. Sampson, I. Siddiqi, S. M. Rosenberg, L. C. Thomason, F. W. Stahl and M. M. Stahl, Genome 31, 53 (1989).
183. S. I. Feinstein and K. B. Low, Genetics 113, 13 (1986).
184. J. C. Wang, P. C. Caron and R. A. Kim, Cell 62, 403 (1990).
185. M. F. Christman, F. S. Dietrich and G. R. Fink, Cell 55, 413 (1988).
186. R. A. Kim and J. C. Wang, Cell 57, 975 (1989).
187. R. Rothstein, CSHSQB 49, 629 (1984).
188. J. W. Wallis, G. Chrebet, G. Brodsky, M. Rolfe and R. Rothstein, Cell 58, 409 (1989).
189. A. Aguilera and H. L. Klein, MCBiol 10, 1439 (1990).
190. D. G. Schatz, M. A. Oettinger and D. Baltimore, Cell 59, 1035 (1989).
191. D. Rose, W. Thomas and C. Holm, Cell 60, 1009 (1990).
192. P. M. J. Burgers, R. A. Bambara, J. L. Campbell, L. M. S. Chang, K. M. Downey, U. Hubscher, M. Y. W. T. Lee, S. M. Linn, A. G. So and S. Spadari, EJB 191, 617 (1990).
193. A. Morrison, R. B. Christensen, J. Alley, A. K. Beck, E. G. Bernstine, L. F. Lemontt and C. W. Lawrence, J. Bact. 171, 5659 (1989).
194. F. Foury, JBC 264, 20552 (1989).
195. A. Genga, L. Bianchi and F. Foury, JBC 261, 9328 (1986).
196. K. C. Sitney, M. E. Budd and J. L. Campbell, Cell 56, 599 (1989).
197. L. H. Hartwell and D. Smith, Genetics 110, 381 (1985).
198. F. Fabre, A. Boulet and G. Faye, MGG 229, 353 (1991).
199. L. H. Johnston and K. A. Nasmyth, Nature 274, 891 (1978).


200. F. Fabre and H. Roman, PNAS 76, 4586 (1979).
201. J. C. Game, L. H. Johnston and R. C. von Borstel, PNAS 76, 4589 (1979).
202. D. N. Norris and R. D. Kolodner, Bchem 29, 7903 (1990).
203. R. K. Saiki, D. H. Gelfand, S. Stoffel, S. J. Scharf, R. Higuchi, G. T. Horn, K. B. Mullis and H. A. Erlich, Science 239, 487 (1988).
204. J. F. Thompson, L. Moitoso de Vargas, C. Koch, R. Kahmann and A. Landy, Cell 50, 901 (1987).
205. C. Egner, E. Azhderian, S. S. Tsang, C. M. Radding and J. W. Chase, J. Bact. 169, 3422 (1987).
206. D. N. Norris and R. D. Kolodner, Bchem 29, 7911 (1990).
207. M. A. Krasnow and N. R. Cozzarelli, JBC 257, 2687 (1982).
208. J. Ramdas, E. Mythili and K. Muniyappa, PNAS 88, 1344 (1991).
209. M. E. Dresser and C. N. Giroux, J. Cell Biol. 106, 567 (1988).
210. N. M. Hollingsworth, L. Goetsch and B. Byers, Cell 61, 73 (1990).
211. E. Alani, R. Padmore and N. Kleckner, Cell 61, 419 (1990).
212. N. M. Hollingsworth and B. Byers, Genetics 121, 445 (1989).


Index

A
ABF1, mitochondrial biogenesis in Saccharomyces cerevisiae and, 76-80
AbrB protein, Bacillus subtilis gene expression and, 124-132
Adenoviral DNA integration, 1-5; gene expression, 30-32; insertional mutagenesis, 32-33; mechanism, 21-23; cell-free system, 11-18; recombination, 18-21; methylation, 23-26; host defense mechanism, 26-27; insertion in sequences, 30; origin, 28-29; viral replication cycles, 27-28; survey of findings, 5-8; uptake by mammalian cells, 8, 10-11; DNA-protamine complexes, 8-10

B
Bacillus subtilis transition-state gene expression, 121-125, 148-149; modulators, 138; ComA protein, 142-143; ComP protein, 142-143; Deg proteins, 138-141; SenS protein, 141-142; TenA protein, 142; TenI protein, 142; redundant control, 146-148; regulators, 124; AbrB protein, 124-132; Hpr protein, 132-134; Pai protein, 136-138; Sin protein, 134-136; sporulation, 143, 145-146; signal transduction, 143-145; spo0 genes, 143-145

Bacteriophage λ, see Lysogenic pathway in bacteriophage λ
Baculovirus DNA, integration, and methylation patterns, 18-21
Biogenesis, mitochondrial, in Saccharomyces cerevisiae, see Mitochondrial biogenesis in Saccharomyces cerevisiae

C
Carbon, mitochondrial biogenesis in Saccharomyces cerevisiae and, 64-70
cis-acting elements, mitochondrial biogenesis in Saccharomyces cerevisiae and, 54-55
Coat proteins, yeast double-stranded RNAs and, 177-178
ComA protein, Bacillus subtilis gene expression and, 142-143
ComP protein, Bacillus subtilis gene expression and, 142-143
CPF1, mitochondrial biogenesis in Saccharomyces cerevisiae and, 77-78
cII translation, lysogenic pathway in bacteriophage λ and, 45

D
Deg proteins, Bacillus subtilis gene expression and, 138-141
DNA: baculoviral, see Baculovirus DNA; homologous recombination in Saccharomyces cerevisiae, 232-244; integration, adenoviral, see Adenoviral DNA integration
DNA helicases, homologous recombination in Saccharomyces cerevisiae and, 256-258
DNA ligase, homologous recombination in Saccharomyces cerevisiae and, 259-260


DNA polymerase II, 93-94, 116-117; catalytic subunit: C-terminal half, 105; exonuclease active site, 105-109; POL2 homologs, 109-110; polymerase domain, 104-105; categorization, 94-96; cell cycle regulation, 103-104; DNA repair, 112-114; DNA replication, 114-115; genetics of exonuclease: epistatic relationships, 111-112; spontaneous mutator phenotype, 110-111; structure: mammalian DNA polymerase ε, 102-103; purification, 96-99; stimulatory factors, 101-102; subunits, 97-101
DNA polymerases, homologous recombination in Saccharomyces cerevisiae and, 259-260
DNA-protamine complexes, adenoviral DNA integration and, 8-10
DNA topoisomerases, homologous recombination in Saccharomyces cerevisiae and, 258-259
Double-stranded RNAs, yeast, see Yeast double-stranded RNAs

E
Enzymology, of homologous recombination, see Homologous recombination in Saccharomyces cerevisiae
Eukaryotic initiation factor 4E, 183-184, 213-215; activities, 191; binding to caps, 191-193; binding to eIF-4A, 197; binding to 40S ribosome, 193-198, 200; alteration of intracellular levels, 200-201; overexpression, 201-207; underexpression, 207-213; mRNA binding to ribosomes, 186-187; phosphorylation, 197, 199, 201; regulation of initiation, 184-185; structure, 188-191

Evolution: mitochondrial biogenesis in Saccharomyces cerevisiae and, 81-82; yeast double-stranded RNAs and, 179
Exonuclease, DNA polymerase II and: catalytic subunit, 105-109; genetics, 110-112

G
Gene expression: adenoviral DNA integration and, 30-32; transition-state, see Bacillus subtilis transition-state gene expression
Glucose repression, mitochondrial biogenesis in Saccharomyces cerevisiae and, 65-67

H
HAP1, mitochondrial biogenesis and, 59-62
HAP2, mitochondrial biogenesis and, 62, 67-68
HAP3, mitochondrial biogenesis and, 62, 67-68
HAP4, mitochondrial biogenesis and, 62, 67-68
Heme, mitochondrial biogenesis and, 59-62
Homologous recombination in Saccharomyces cerevisiae, 221-222, 264-266; enzymology, 228-232; DNA helicases, 256-258; DNA ligase, 259-260; DNA polymerases, 259-260; DNA topoisomerases, 258-259; nucleases, 244-253; proteins in hybrid DNA formation, 232-244; single-stranded DNA-binding proteins, 254-256; models, 222-227; physical analysis, 228; reconstitution of in vitro system, 261-264
Hpr protein, Bacillus subtilis gene expression and, 132-134
Hybrid DNA, homologous recombination in Saccharomyces cerevisiae and, 232-244


I
Initiation factor 4E, see Eukaryotic initiation factor 4E
Insertional mutagenesis, adenoviral DNA integration and, 32-33
Integration host factor, lysogenic pathway in bacteriophage λ and, 45

L
Lysogenic pathway in bacteriophage λ, posttranscriptional control, 37-38, 47; cII translation, 45; integration host factor, 45; λ genes, 38-39; phage regulatory proteins, 46-47; RNase 11, 40-45

M
Methylation, adenoviral DNA integration and, 23-26; adjacent cellular DNA sequences, 30; host defense mechanism, 26-27; origin, 28-29; viral replication cycles, 27-28
Mitochondrial biogenesis in Saccharomyces cerevisiae, 51-54, 82-85; cell growth, 74-76; ABF1, 76-80; CPF1, 77-78; multifunctional regulators, 80-81; QCR8 gene, 77; respiratory-chain component, 77-78; evolution, 81-82; gene pairs, 63-64; path to nucleus, 71-73; stress regulation, 70-71; transcriptional regulation: cis-acting elements, 54-55; mechanistic models, 55-57; signal transduction, 57-58; trans-acting factors, 55; transcriptional regulation by carbon source, 64, 68-70; glucose repression, 65-67; HAP2, 67-68; HAP3, 67-68; HAP4, 67-68; transcriptional regulation by oxygen: HAP1, 59-62; HAP2, 62; HAP3, 62; HAP4, 62; heme, 59-62; ROX1, 62; yeast cell cycle, 73-74
Mutagenesis, insertional, adenoviral DNA integration and, 32-33
Mutation, DNA polymerase II and, 110-111
Mycoviruses, yeast double-stranded RNAs and, 176-179

N
Nucleases, homologous recombination in Saccharomyces cerevisiae and, 244-253
Nucleotide sequence, yeast double-stranded RNA, 157-158

O
Oxygen, mitochondrial biogenesis and, 59-62

P
Pai protein, Bacillus subtilis gene expression and, 136-138
Phage regulatory proteins, lysogenic pathway in bacteriophage λ and, 46-47
Phenotype, DNA polymerase II and, 110-111
Phosphorylation, eukaryotic initiation factor 4E and, 197, 199, 201
Posttranscriptional control, lysogenic pathway in bacteriophage λ, see Lysogenic pathway in bacteriophage λ
Proteins, see also specific proteins; homologous recombination in Saccharomyces cerevisiae and, 232-244
Protein synthesis initiation factor 4E, see Eukaryotic initiation factor 4E


Q
QCR8 gene, mitochondrial biogenesis in Saccharomyces cerevisiae and, 77

R
Recombination: adenoviral DNA integration and, 18-21; homologous, see Homologous recombination in Saccharomyces cerevisiae
Replication: DNA polymerase II and, 114-115; yeast double-stranded RNAs, 173-174
Replication cycles, adenoviral DNA integration and, 27-28
Respiratory-chain component, mitochondrial biogenesis in Saccharomyces cerevisiae and, 77-78
Ribosomes, eukaryotic initiation factor 4E and, 186-187, 193-198, 200
RNA: messenger, eukaryotic initiation factor 4E and, 186-187; yeast double-stranded, see Yeast double-stranded RNAs
RNA polymerases, yeast double-stranded RNAs and, 158-165, 175-176
ROX1, mitochondrial biogenesis in Saccharomyces cerevisiae and, 62

S
Saccharomyces cerevisiae: DNA polymerase II and, see DNA polymerase II; double-stranded RNAs from, see Yeast double-stranded RNAs; homologous recombination in, see Homologous recombination in Saccharomyces cerevisiae; mitochondrial biogenesis in, see Mitochondrial biogenesis in Saccharomyces cerevisiae
SenS protein, Bacillus subtilis gene expression and, 141-142

Signal transduction: Bacillus subtilis gene expression and, 143-145; mitochondrial biogenesis in Saccharomyces cerevisiae and, 57-58
Single-stranded DNA-binding proteins, homologous recombination in Saccharomyces cerevisiae and, 254-256
Sin protein, Bacillus subtilis gene expression and, 134-136
spo0 genes, Bacillus subtilis gene expression and, 143-145
Sporulation, Bacillus subtilis gene expression and, 143-146
Stress regulation, mitochondrial biogenesis in Saccharomyces cerevisiae and, 70-71

T
TenA protein, Bacillus subtilis gene expression and, 142
TenI protein, Bacillus subtilis gene expression and, 142
trans-acting factors, mitochondrial biogenesis and, 55
Transcription, mitochondrial biogenesis in Saccharomyces cerevisiae and, 64-70

V
Vesicles, yeast double-stranded RNAs and, 178-179
Viral replication cycles, adenoviral DNA integration and, 27-28

Y

Yeast: DNA polymerase II and, see DNA polymerase II; homologous recombination in, see Homologous recombination in Saccharomyces cerevisiae; mitochondrial biogenesis in, see Mitochondrial biogenesis in Saccharomyces cerevisiae


Yeast double-stranded RNAs, 155-156, 179-180; configuration, 169-173; evolution, 179; mycoviruses, 176-177; coat proteins, 177-178; vesicles, 178-179; replication cycles, 173-174; RNA polymerases, 175-176; sedimentation, 174-175; single-stranded RNA counterparts, 165-169; T, genomic organization: nucleotide sequences, 157-158; RNA polymerases, 158-165; W, genomic organization: nucleotide sequences, 157-158; RNA polymerases, 158-165
