Progress in Nucleic Acid Research and Molecular Biology, Volume 51


Author: Waldo E. Cohn | Kivie Moldave


PROGRESS IN

Nucleic Acid Research and Molecular Biology

edited by

WALDO E. COHN
Biology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee

KIVIE MOLDAVE
Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California

Volume 51

ACADEMIC PRESS San Diego New York Boston London Sydney Tokyo Toronto

This book is printed on acid-free paper. Copyright © 1995 by ACADEMIC PRESS, INC. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Academic Press, Inc. A Division of Harcourt Brace & Company 525 B Street, Suite 1900, San Diego, California 92101-4495

United Kingdom Edition published by Academic Press Limited 24-28 Oval Road, London NW1 7DX

International Standard Serial Number: 0079-6603
International Standard Book Number: 0-12-540051-9
PRINTED IN THE UNITED STATES OF AMERICA
95 96 97 98 99 00 BB 9 8 7 6 5 4 3 2 1

Contents

ABBREVIATIONS AND SYMBOLS

SOME ARTICLES PLANNED FOR FUTURE VOLUMES

Molecular Regulation of Heme Biosynthesis in Higher Vertebrates
Brian K. May, Satish C. Dogra, Tim J. Sadlon, C. Ramana Bhasker, Timothy C. Cox and Sylvia S. Bottomley
I. Heme Synthesis in Higher Vertebrates and the Importance of 5-Aminolevulinate Synthase
II. Comparison of Hepatic and Erythroid 5-Aminolevulinate Synthase Isozymes and Their Genes
III. Molecular Regulation of Housekeeping 5-Aminolevulinate Synthase
IV. Drug Induction of Hepatic Cytochrome P450 and 5-Aminolevulinate Synthase Transcription
V. Heme Degradation by Heme Oxygenase
VI. Erythropoiesis and Heme Synthesis
VII. Molecular Biology of Human Hereditary Porphyrias
VIII. Molecular Biology of Hereditary Sideroblastic Anemia
IX. Final Comments
References

The Flp Recombinase of the 2-µm Plasmid of Saccharomyces cerevisiae
Paul D. Sadowski
I. Structure and Function of the 2-µm Plasmid
II. Flp Is a Conservative Site-specific Recombinase
III. Flp-mediated Recombination: The in Vitro Reaction
IV. The Mechanism of Action of the Flp Protein
V. Flp as a Reagent for Chromosome Engineering
References

Reconstitution of Mammalian DNA Replication
Robert A. Bambara and Lin Huang
I. Initiation at Replication Origins
II. Priming Reactions That Initiate DNA Replication
III. Mechanisms of Leading- and Lagging-strand DNA Synthesis
IV. Completion of Lagging-strand Synthesis
V. Regulation of Replication Reactions
References

Transcription of the Herpes Simplex Virus Genome during Productive and Latent Infection
Edward K. Wagner, John F. Guzowski and Jasbir Singh
I. Herpes Simplex Virus Type 1
II. Transcriptional Switches during HSV Infection
III. Functional Analysis of Specific HSV Promoters in Vivo
IV. Other Factors in the Early/Late Switch in HSV mRNA Expression
V. In Vitro Analysis of HSV Promoters
VI. Conclusions
References

Structure, Function, and Inhibition of O6-Alkylguanine-DNA Alkyltransferase
Anthony E. Pegg, M. Eileen Dolan and Robert C. Moschel
I. Alkyltransferase Structure and Function
II. Inhibition of Alkyltransferase Activity
III. Substrate Specificity and Metabolism o...
IV. Regulation of Alkyltransferase Expression
V. Function of Alkyltransferase
VI. Inactivation of Alkyltransferase to Enhance Chemotherapy
VII. Glossary
References

Replicable RNA Vectors: Prospects for Cell-free Gene Amplification, Expression, and Cloning
Alexander B. Chetverin and Alexander S. Spirin
I. Synthesis of RNA by Qβ Replicase
II. RQ RNAs
III. RQ RNA Vectors
IV. Cell-free Molecular Cloning
V. Conclusion
VI. Glossary
References

Examination of Mitotic Recombination by Means of Hyper-recombination Mutants in Saccharomyces cerevisiae
Hannah L. Klein
I. Review of Mitotic Recombination in Yeast
II. Isolation of Hypo-recombination and Hyper-recombination Mutants
III. Concluding Statements
IV. Glossary
References

Gene Structure at the Human UGT1 Locus Creates Diversity in Isozyme Structure, Substrate Specificity, and Regulation
Ida S. Owens and Joseph K. Ritter
I. Function and Distribution of the UDP-Glucuronosyltransferase System
II. Chemistry and Biochemistry of Bilirubin and Phenolic Substrates
III. The UGT1 Gene Complex Locus
IV. Exons 1 Determine Structural Diversity of the Transferases
V. Defects Define Microregions in Bilirubin Transferase with Important Clues
VI. Comparisons of the UGT1 Gene Complex to Rat Steroid Transferase Genes
VII. Substrate Specificity of UGT1-encoded Isoforms
VIII. Regulation of the UGT1 Genes
IX. Significance of the Arrangement of the UGT1 Locus and Future Direction of Research
References

Growth Control of Translation in Mammalian Cells
David R. Morris
I. Cellular Studies
II. Regulation of Translational Initiation Factors
III. Regulation by mRNA Binding Proteins
IV. Regulation by Open Reading Frames within the 5' Leader
V. Translation and Oncogenic Transformation
VI. Conclusions: Physiological Roles of Translational Control
References

INDEX

Abbreviations and Symbols

All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editors endeavor to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3); they are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form in the frontmatter of Volumes 9-33 of this series. A recent expansion of the one-letter system (5) follows.

SINGLE-LETTER CODE RECOMMENDATIONS(a) (5)

Symbol    Meaning                  Origin of symbol
G         G                        Guanosine
A         A                        Adenosine
T(U)      T(U)                     (ribo)Thymidine (Uridine)
C         C                        Cytidine
R         G or A                   puRine
Y         T(U) or C                pYrimidine
M         A or C                   aMino
K         G or T(U)                Keto
S         G or C                   Strong interaction (3 H-bonds)
W(b)      A or T(U)                Weak interaction (2 H-bonds)
H         A or C or T(U)           not G; H follows G in the alphabet
B         G or T(U) or C           not A; B follows A
V         G or C or A              not T (not U); V follows U
D(c)      G or A or T(U)           not C; D follows C
N         G or A or T(U) or C      aNy nucleoside (i.e., unspecified)
Q         Q                        Queuosine (nucleoside of queuine)

(a) Modified from Proc. Natl. Acad. Sci. U.S.A. 83, 4 (1986).
(b) W has been used for wyosine, the nucleoside of "base Y" (wye).
(c) D has been used for dihydrouridine (hU or H2Urd).

Enzymes

In naming enzymes, the 1984 recommendations of the IUB Commission on Biochemical Nomenclature (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).
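Returning to the single-letter nucleoside codes tabulated earlier in this section, the following is a minimal Python sketch of how the table can be used to interpret incompletely specified sequences. The names `IUPAC`, `expand`, and `not_code` are illustrative assumptions, not part of the Recommendations; T stands for T(U), and Q (queuosine) is left out because it denotes a modified nucleoside rather than a set of alternatives.

```python
# Base sets for the single-letter codes in the table (T stands for T(U)).
# The dictionary and function names are hypothetical, chosen for illustration.
IUPAC = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT",
    "N": "GATC",
}


def expand(pattern):
    """Return, sorted, every fully specified sequence that a pattern stands for."""
    sequences = [""]
    for symbol in pattern:
        sequences = [s + base for s in sequences for base in IUPAC[symbol]]
    return sorted(sequences)


def not_code(base):
    """Return the three-base symbol that excludes `base` (one of H, B, V, D)."""
    wanted = set("GATC") - {base}
    return next(sym for sym, bases in IUPAC.items() if set(bases) == wanted)


if __name__ == "__main__":
    print(expand("RY"))   # purine followed by pyrimidine
    print(not_code("G"))  # the symbol meaning "not G"
```

The `not_code` helper mirrors the mnemonic in the table: each "not X" symbol is the letter that follows X in the alphabet (with V following U for "not T").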


REFERENCES

1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere. General.
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.
3. "Handbook of Biochemistry" (G. Fasman, ed.), 3rd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, 1975, Nucleic Acids, Vols. I and II, pp. 3-59. Nucleic acids.
4. "Enzyme Nomenclature" [Recommendations (1984) of the Nomenclature Committee of the IUB]. Academic Press, New York, 1984.
5. EJB 150, 1 (1985). Nucleic Acids (One-letter system).

Abbreviations of Journal Titles

Journals                                    Abbreviations used
Annu. Rev. Biochem.                         ARB
Annu. Rev. Genet.                           ARGen
Arch. Biochem. Biophys.                     ABB
Biochem. Biophys. Res. Commun.              BBRC
Biochemistry                                Bchem
Biochem. J.                                 BJ
Biochim. Biophys. Acta                      BBA
Cold Spring Harbor                          CSH
Cold Spring Harbor Lab                      CSHLab
Cold Spring Harbor Symp. Quant. Biol.       CSHSQB
Eur. J. Biochem.                            EJB
Fed. Proc.                                  FP
Hoppe-Seyler's Z. Physiol. Chem.            ZpChem
J. Amer. Chem. Soc.                         JACS
J. Bacteriol.                               J. Bact.
J. Biol. Chem.                              JBC
J. Chem. Soc.                               JCS
J. Mol. Biol.                               JMB
J. Nat. Cancer Inst.                        JNCI
Mol. Cell. Biol.                            MCBiol
Mol. Cell. Biochem.                         MCBchem
Mol. Gen. Genet.                            MGG
Nature, New Biology                         Nature NB
Nucleic Acid Research                       NARes
Proc. Natl. Acad. Sci. U.S.A.               PNAS
Proc. Soc. Exp. Biol. Med.                  PSEBM
Progr. Nucl. Acid. Res. Mol. Biol.          This Series

Some Articles Planned for Future Volumes

Roles of Metal Ions and Modified Nucleosides in RNA Structure and Function
PAUL F. AGRIS

The Poly(ADP-ribosyl)ation System of Higher Eukaryotes
FELIX R. ALTHAUS

Replication of Autonomous Parvovirus: cis-acting Sequences and a Transcribing Factor
CAROLINE ASTELL

Structure and Function of Retroviral RNA
BEN BERKHOUT

Architectural Components That Facilitate Chromatin Structure
MICHAEL BUSTIN AND RAYMOND REEVES

The Internal Structure of the Ribosome
BARRY S. COOPERMAN

Recent Advances in the Molecular Biology of Vitamin D Action
HECTOR DELUCA AND HISHAM DARWISH

Transcriptional Regulation of Growth Related Genes
THOMAS F. DEUEL AND ZHAO-YI WANG

Poly(A) Tails, Structure, and Function
MARY EDMONDS

Accessibility of Bacteriophage T4 nrdB mRNA to 30S Ribosomes. Role in Initiation of Ribonucleotide Reductase Synthesis and T4 DNA Replication
G. ROBERT GREENBERG AND JOHN M. HILFINGER

Mechanisms for the Selectivity of the Cell's Proteolytic Machinery
ALFRED GOLDBERG, MICHAEL SHERMAN AND OLIVER COUX

Structure/Function Relationships of Phosphoribulokinase and Ribulosebisphosphate Carboxylase/Oxygenase
FRED C. HARTMAN AND HILLEL K. BRANDES

The Nature of DNA Replication Origins in Higher Eukaryotic Organisms
JOEL A. HUBERMAN AND WILLIAM C. BURHANS

Cloning and Characterization of eIF-2 Kinase and of the Subunits of Mammalian Translation Initiation Factor eIF-2B
LEONARD S. JEFFERSON, K. M. FLOWERS, H. MELLOR AND S. R. KIMBALL

Uracil Metabolism: Uridylate Synthesis from Orotic Acid or Uridine and Conversion of Uracil to β-alanine
MARY ELLEN JONES AND THOMAS W. TRAUT

Parallel-stranded DNA and RNA
THOMAS M. JOVIN, K. RIPPE, V. KURYAVYI AND A. GARCIA

Replication of Chromatin
ROLF KNIPPERS AND CLAUDIA GRUSS

Amino Acid-dependent Gene Expression in Mammalian Cells
RONEY O. LAINE, R. HUTSON AND M. S. KILBERG

A Newly Discovered Global Regulon in Escherichia coli Controlled by the Leucine-responsive Regulatory Protein
ROWENA MATHEWS, ROBERT M. BLUMENTHAL AND DEBORAH BORST

Structure and Function of Translational Elongation Factors
PETER B. MOORE AND JOHN CZWORKOWSKI

The Role of Ribosomal RNA in Translation
JAMES OFENGAND

Transcriptional Activation of Thymidine Kinase in the Cell Cycle
ARTHUR B. PARDEE AND QING-PING DOU

Bacterial and Eukaryotic DNA Methyltransferases
NORBERT O. REICH

DNA Repair
AZIZ SANCAR

Why 10% of the Human Genome Consists of a Single Family of Sequences
CARL W. SCHMID

Depletion of Nuclear Poly(ADP-ribose) Polymerase by Antisense RNA Expression: Influence on Genomic Stability, Chromatin Organization and DNA Rejoining, Replication and Repair
MARK SMULSON AND CYNTHIA SIMBULAN

Transcriptional Regulation of Small Nuclear RNA Genes
WILLIAM E. STUMPH

Bacillus subtilis as I Know It
NOBORU SUEOKA

RNA Structure: Prediction and Investigation with Chemical and Enzymatic Probes
VALENTIN V. VLASSOV, N. KOCHANOV AND I. VLASSOVA

Molecular Regulation of Heme Biosynthesis in Higher Vertebrates

BRIAN K. MAY,*,¹ SATISH C. DOGRA,* TIM J. SADLON,* C. RAMANA BHASKER,* TIMOTHY C. COX* AND SYLVIA S. BOTTOMLEY†

*Department of Biochemistry, University of Adelaide, Adelaide SA 5005, Australia
†The Department of Medicine, University of Oklahoma College of Medicine and Veterans Affairs Medical Center, Oklahoma City, Oklahoma 73104

I. Heme Synthesis in Higher Vertebrates and the Importance of 5-Aminolevulinate Synthase
II. Comparison of Hepatic and Erythroid 5-Aminolevulinate Synthase Isozymes and Their Genes
III. Molecular Regulation of Housekeeping 5-Aminolevulinate Synthase
IV. Drug Induction of Hepatic Cytochrome P450 and 5-Aminolevulinate Synthase Transcription
V. Heme Degradation by Heme Oxygenase
VI. Erythropoiesis and Heme Synthesis
VII. Molecular Biology of Human Hereditary Porphyrias
VIII. Molecular Biology of Hereditary Sideroblastic Anemia
IX. Final Comments
References

¹ To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

The regulation of heme biosynthesis in animals has been of interest for many years. All cell types need heme; some have a greater demand for heme, particularly in response to endogenous and exogenous factors, and this, coupled with the fact that heme can be toxic to cells (1, 2), means that heme synthesis must be finely tuned to the specific requirements of a cell. Early studies were directed toward measuring the intermediate products and the catalytic activities of enzymes of the heme biosynthetic pathway in isolated animal tissues or in cultured cells. With the purification of the individual enzymes, protein and immunological studies became feasible. More recently, cDNA clones and the genes for most of the heme pathway enzymes have been isolated, and this has opened the way for a detailed understanding of their regulation at the molecular level, both in normal and disease states (3-8). Because heme contains iron, and because heme is required in large amounts for liver cytochrome P450 proteins and hemoglobin, researchers have extended their interests to include studies on the regulation of cytochrome P450 proteins, iron metabolism, and erythropoiesis (9-13).

The heme biosynthetic pathway is present in all cell types except mature erythrocytes. The final step of heme biosynthesis occurs in mitochondria and the heme is then utilized for the formation of different hemoproteins located in mitochondria, microsomes, peroxisomes, the cytosol, and probably the nucleus. These hemoproteins include hemoglobin and myoglobin, involved in O2 transport and storage; respiratory cytochromes as components of the mitochondrial electron-transport chain; mitochondrial and microsomal P450s required for steroid hormone synthesis and the oxidative metabolism of lipophilic compounds; tryptophan pyrrolase, which degrades tryptophan; and peroxisomal catalase and peroxidase, which degrade hydrogen peroxide. The central iron atom of the prosthetic heme molecule is crucial for functional activity of the hemoproteins and exists in two oxidative states, the ferrous and ferric states. The ferrous state can be reversibly oxidized to the ferric state by the transfer of an electron, and the ferrous state has a high affinity for oxygen. These features of the prosthetic group enable it to function as a single-electron carrier, to serve as a catalyst for redox reactions involving oxygen, and to function as an oxygen carrier.
A preliminary overview of the biological importance of heme now follows, and aspects of this will be considered in more detail later. Heme can control the biosynthesis of some proteins. In erythroid cells, heme controls the translation of proteins, notably α- and β-globin chains, by modulating the activity of a specific kinase. Heme, acting at transcriptional and posttranscriptional steps, can regulate its own level by modulating the production of the rate-limiting enzymes involved either in its synthesis or in its degradation. A labile heme protein (acting as an O2 sensor molecule) is implicated in the transcriptional regulation of the erythropoietin gene in the kidney in response to hypoxia (14). Recent studies indicate that heme also plays an important role in signal transduction processes elicited by nitric oxide and carbon monoxide (15). These molecules activate guanylyl cyclase (EC 4.6.1.2) by binding tightly to the heme moiety of this enzyme and thus modulate the level of the intracellular messenger molecule, cyclic GMP.


Although the action of nitric oxide is on various tissues, that of carbon monoxide may be restricted to the brain.

All nucleated animal cells must synthesize heme for incorporation into respiratory cytochromes, but erythroid and liver cells have the highest rates of heme synthesis. Erythroid cells synthesize about 90% of the total heme in the body for assembly into hemoglobin. Most of the remaining body heme is made by the liver for various hemoproteins, particularly microsomal P450s, which catalyze the oxidation of hydrophobic compounds to forms that can be modified further and easily excreted. Although the bulk of heme in the liver is made in situ, the liver may also obtain some heme from serum haptoglobin-hemoglobin and heme-hemopexin complexes, following intravascular hemolysis (16). The regulation of the heme biosynthetic pathway in the liver and erythroid tissues is of particular interest and importance, because these tissues are not only the major sites of heme production but also the major sites where the build-up of the intermediates of heme synthesis leads to specific porphyria disorders. Thus, understanding the molecular basis of heme production in these tissues has both fundamental and clinical relevance. It is generally agreed that the first enzyme of the heme pathway, 5-aminolevulinate synthase (ALAS),² determines the rate of heme biosynthesis, and the regulation of this enzyme is therefore a major focus of this essay.

I. Heme Synthesis in Higher Vertebrates and the Importance of 5-Aminolevulinate Synthase

In eukaryotic cells, eight enzymes comprise the heme biosynthetic pathway, with the first step and the last three steps occurring in the mitochondria and the intervening steps in the cytosol (Fig. 1). These enzymes are nuclear encoded and are synthesized in the cytosol; four of the proteins are subsequently transported into the mitochondrion. Presumably the production of all of these proteins is coordinated in some way. It is not known why there is this compartmentalization of the enzymes between the mitochondria and cytosol. In the overall scheme, shown in Fig. 1, the simple precursors glycine and succinyl CoA, from the citric acid cycle and/or from methyl malonyl CoA, are converted in a few steps to the complex tetrapyrrole ring, which ultimately incorporates iron to form heme inside mitochondria.

² Abbreviations are as follows: ALA dehydratase, 5-aminolevulinate dehydratase; ALAS, 5-aminolevulinate synthase; ALAS-1, "housekeeping" ALAS (abbreviated in literature as ALAS-N, ALAS-h, or ALAS-H); ALAS-2, erythroid ALAS (abbreviated in literature as ALAS-e or ALAS-E); BFU, burst-forming unit; CFU, colony-forming unit; HSE, heat-shock element; HRI, heme-regulated inhibitor kinase; IRE, iron-responsive element; IRE-BP, iron-responsive element binding protein; MEL, murine erythroleukemic; Me2SO, dimethyl sulfoxide.

FIG. 1. Pathway of heme biosynthesis. ALA, 5-aminolevulinate; ALAS, ALA synthase (EC 2.3.1.37); ALA dehydratase (EC 4.2.1.24); PBG, porphobilinogen; PBGD, porphobilinogen deaminase (EC 4.3.1.8); uro'gen, uroporphyrinogen; uro'gen-III synthase (EC 4.2.1.75); uro'gen-III decarboxylase (EC 4.1.1.37); copro'gen, coproporphyrinogen; copro'gen-III oxidase (EC 1.3.3.3); proto'gen, protoporphyrinogen; proto'gen-III oxidase (EC 1.3.3.4); ferrochelatase (EC 4.99.1.1); A, acetate; V, vinyl; M, methyl.

The first enzyme, ALAS, is located on the matrix side of the inner mitochondrial membrane and catalyzes the formation of 5-aminolevulinate (ALA) from glycine and succinyl CoA. The enzyme requires pyridoxal phosphate as a cofactor and, as with other pyridoxal phosphate-dependent enzymes, a lysine residue forms a Schiff base with the cofactor prior to reaction (17). After leaving the mitochondrion, two molecules of ALA are condensed by the cytosolic enzyme, ALA dehydratase, to give the pyrrole, porphobilinogen. Four molecules of the latter are then linked by cytosolic porphobilinogen deaminase to give an unstable intermediate, hydroxymethyl bilane. This linear tetrapyrrole is converted to uroporphyrinogen III by uroporphyrinogen-III synthase. In the absence of this enzyme, the unstable intermediate cyclizes nonenzymatically to give the isomer uroporphyrinogen I, which has no known biological function. The final cytosolic enzyme, uroporphyrinogen-III decarboxylase, removes carboxyl groups from the acetyl side-chain on the pyrrole rings to give coproporphyrinogen III, which is transported to the mitochondrion for the remaining reactions. Coproporphyrinogen-III oxidase, in the intermembrane space of mitochondria, then converts coproporphyrinogen III to protoporphyrinogen III. This compound is oxidized by protoporphyrinogen-III oxidase, located in the inner mitochondrial membrane, to form protoporphyrin IX. Finally, ferrochelatase, also located in the inner mitochondrial membrane, catalyzes the insertion of ferrous iron into protoporphyrin IX to give heme, which is then utilized for mitochondrial respiratory cytochromes and mitochondrial P450s, or is transported out of the mitochondrion to other cellular locations. The mechanisms by which heme and its precursors move across membranes to other intracellular locations are unknown (2, 16). Whether their movement requires a protein-dependent active transport process remains unclear, although a possible candidate mitochondrial membrane protein has recently been identified (18).
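The stoichiometry implied by the pathway description above can be made explicit with a short Python sketch. It encodes the eight steps with the compartments stated in the text, then multiplies the condensation ratios (two ALA per porphobilinogen, four porphobilinogen per linear tetrapyrrole) to count the precursor molecules consumed per heme. The table layout and the function name `precursors_per_heme` are illustrative assumptions, and one glycine plus one succinyl CoA per ALA is taken as given.

```python
# Each entry: (enzyme, compartment, product, substrate molecules per product molecule).
# A hypothetical encoding of the eight steps described in the text.
PATHWAY = [
    ("ALA synthase", "mitochondrion", "ALA", 1),
    ("ALA dehydratase", "cytosol", "porphobilinogen", 2),
    ("porphobilinogen deaminase", "cytosol", "hydroxymethyl bilane", 4),
    ("uroporphyrinogen-III synthase", "cytosol", "uroporphyrinogen III", 1),
    ("uroporphyrinogen-III decarboxylase", "cytosol", "coproporphyrinogen III", 1),
    ("coproporphyrinogen-III oxidase", "mitochondrion", "protoporphyrinogen III", 1),
    ("protoporphyrinogen-III oxidase", "mitochondrion", "protoporphyrin IX", 1),
    ("ferrochelatase", "mitochondrion", "heme", 1),
]


def precursors_per_heme(pathway):
    """Multiply the per-step ratios to get glycine (or succinyl CoA) molecules per heme."""
    total = 1
    for _enzyme, _compartment, _product, ratio in pathway:
        total *= ratio
    return total


def enzymes_in(pathway, compartment):
    """List the enzymes that act in the given compartment."""
    return [enzyme for enzyme, where, _product, _ratio in pathway if where == compartment]


if __name__ == "__main__":
    print(precursors_per_heme(PATHWAY))           # 8 glycine and 8 succinyl CoA per heme
    print(enzymes_in(PATHWAY, "mitochondrion"))   # first enzyme plus the last three
```

Under these assumptions, each heme molecule requires eight molecules of glycine and eight of succinyl CoA, and four of the eight enzymes act in the mitochondrion, in agreement with the statement that four of the proteins are imported into that organelle.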
Interestingly, ALAS is not present in plant cells, algae, and some bacteria, such as Escherichia coli. In these, ALA is synthesized from glutamate via the C-5 pathway (19). ALA synthesis is initiated with the activation of the five-carbon skeleton of glutamate by ligation to tRNAGlu. This glutamyl-tRNA is reduced to glutamate-1-semialdehyde and subsequently converted by transamination to its isomer, ALA, which is utilized for heme and chlorophyll synthesis.

Compared with the other enzymes of the heme pathway, ALAS exhibits the lowest relative enzyme activity in both the liver and erythroid cells (4), and so represents a likely site of regulation. In support of this, hepatic ALAS is rapidly induced by phenobarbital and other drugs to supply the additional heme needed for the drug-induced synthesis of P450s. The levels of other enzymes of the heme pathway are not altered by these drugs and the enzymes are considered to be present in nonlimiting amounts. It is therefore accepted that in the liver, ALAS catalyzes the rate-limiting step of heme biosynthesis and, in keeping with this proposal, levels of ALAS are controlled by a negative-feedback mechanism involving the end product (discussed in Sections III,D-III,F).

The primary stimulus for the differentiation and proliferation of erythroid progenitor cells to form erythroblasts is the hormone, erythropoietin. In response to this hormone, there is a large increase in the production of heme and globin chains, with increased transcription of the genes encoding heme biosynthetic pathway enzymes and globin. Although levels of most heme pathway enzymes increase, here also ALAS appears to be the rate-limiting enzyme. Thus ALAS is assumed to be the key regulatory enzyme involved in heme production both in the liver and erythroid tissue and, most likely, in all other tissues.

There are two closely related isozymes for ALAS that are encoded by different genes on separate chromosomes. These genes are expressed in a tissue-specific manner and are differentially regulated. A "housekeeping" ALAS gene is expressed ubiquitously at a low level, and it is this form that is induced by drugs, chiefly in the liver. A second ALAS gene, the erythroid-specific gene, is expressed only in erythroid tissue and is induced by erythropoietin during erythroid differentiation. We will first consider the features of the ALAS isozymes and their genes before dealing with aspects of molecular regulation.

II. Comparison of Hepatic and Erythroid 5-Aminolevulinate Synthase Isozymes and Their Genes

A. Housekeeping and Erythroid ALAS Isozymes

The first studies on ALAS focused on the hepatic enzyme, because this enzyme is greatly elevated by administration of porphyrinogenic drugs, such as phenobarbital and 2-allyl-2-isopropylacetamide, to experimental animals, including 17- to 19-day chick embryos, adult rats, and guinea pigs (3, 5). This drug-induction phenomenon is called "chemical porphyria" because of its biochemical similarity to porphyria diseases. ALAS was purified from the mitochondria of drug-induced livers (20) and localized by an immunocytochemical approach to the matrix side of the inner mitochondrial membrane (21). Electron microscopy and cross-linking studies showed that the purified liver mitochondrial enzyme is a homodimer of two identical subunits associated in opposite polarities (22); it remains to be determined whether each monomer has catalytic activity. The enzyme characterized initially from liver is now referred to as the ubiquitous or housekeeping isozyme of ALAS, because it is probably expressed in all cells. The enzyme is designated in the literature as ALAS-N (N for nonspecific), ALAS-h or H (h or H for housekeeping), or ALAS-1; we use the term ALAS-1. ALAS-1 is synthesized in the cytosol as a precursor protein with an N-terminal signal sequence (or presequence) that targets the protein to the mitochondria and is removed on entry of the precursor into mitochondria to generate the mature protein. A cDNA clone for ALAS-1 precursor was first isolated in our laboratory from a cDNA library prepared with mRNA from drug-induced chick embryo livers; the identity of this clone was confirmed by comparison with the amino-acid sequence of purified protein from chick embryo liver mitochondria (23). Subsequently, cDNA clones were isolated for human (24) and rat (25) ALAS-1.

The second isozyme of ALAS has been identified only in those tissues concerned with erythropoiesis. This isozyme is designated here as ALAS-2 (and as ALAS-e, or -E, for erythropoiesis, elsewhere). A cDNA clone for an ALAS-2 precursor protein was first isolated from tissue of the chicken (26); subsequently, clones were identified from tissues of mice (27), humans (28), and rats (29). ALAS-2 has been purified to homogeneity from rat reticulocytes (29). Overexpression in E. coli of cDNA clones encoding the mitochondrial forms of mouse (30) and human ALAS-2 (our laboratory, unpublished) and examination of the expressed protein have established that the catalytically active enzyme exists as a homodimer.

The deduced amino-acid sequences for the human ALAS-1 and ALAS-2 precursor proteins are aligned in Fig. 2. The proposed site for the proteolytic cleavage of the signal sequences for the two ALAS isozymes is indicated. The location of this site is deduced from that of chicken ALAS-1 precursor, where the site is known to be between two glutamine residues and results in a signal sequence of 56 amino acids (28).
The corresponding site for the human ALAS-1 precursor is therefore predicted to be located between two glutamine residues, and, for human ALAS-2, between a serine and a glutamine residue. On this basis, the mature mitochondrial forms of ALAS-1 and ALAS-2 have monomer masses of 65 and 60 kDa, respectively.

FIG. 2. The deduced amino-acid sequences of human erythroid ALAS (ALAS-2) and “housekeeping” ALAS (ALAS-1) are aligned, and common amino acids are shaded. The highly conserved C-terminal domain is shown in bold. The single arrow indicates the proteolytic cleavage site. The active site lysine is indicated with an asterisk; the multiple arrows indicate the glycine-rich region. Sequences in large brackets represent the putative heme-responsive domains (59).

The two ALAS isozymes show extensive amino-acid identity in the C-terminal region encompassing about 75% of the mature proteins (Fig. 2). The deduced amino-acid sequences of bacterial ALAS proteins align with this C-terminal region and, indeed, all ALAS proteins show substantial identity in this region (28), which, it is concluded, contains the catalytically active domain. The active-site lysine residue involved in binding the pyridoxal phosphate cofactor resides in this domain (17). A glycine-rich sequence present in all ALAS proteins (see Fig. 2) resembles the GXGXXG sequence at the active site of other pyridoxal-phosphate-dependent enzymes; this sequence may interact with the phosphate moiety of the cofactor (32). There is limited similarity in the remaining sequence of the ALAS-1 and ALAS-2 precursor proteins. The region located at the N-terminus of the mature protein is shorter in the erythroid protein compared with its housekeeping counterpart.
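The glycine-rich GXGXXG consensus mentioned above is a simple pattern that can be located in any protein sequence written in one-letter code. The following is a minimal sketch, not a method used in the work reviewed here; the function name `find_gly_rich` and the toy sequence are hypothetical and serve only to illustrate the pattern, where X stands for any residue.

```python
import re

# Consensus G-X-G-X-X-G, where X is any amino acid (one-letter code).
# The lookahead lets overlapping occurrences be reported as well.
GLY_RICH = re.compile(r"(?=(G.G..G))")

def find_gly_rich(seq):
    """Return (1-based start position, matched hexapeptide) for each match."""
    return [(m.start() + 1, m.group(1)) for m in GLY_RICH.finditer(seq.upper())]

# HYPOTHETICAL toy sequence for illustration only; it is NOT an ALAS sequence.
toy = "MKTAGSGAAGLLVKGAGHTGQ"
print(find_gly_rich(toy))
```

A match to this consensus indicates only sequence resemblance; whether a given stretch actually contacts the phosphate moiety of the cofactor is an experimental question.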

B. Structure of Housekeeping and Erythroid ALAS Genes

The genomic exon-intron organization has been determined for the chicken (33) and rat (34) ALAS-1 genes and for the chicken (35), mouse (27), and human (36) ALAS-2 genes (Fig. 3). The exon-intron boundaries of all of the genes are strikingly similar and correlate with the putative functional domains of the ALAS protein, although the chicken ALAS-1 gene lacks an intron found in the 5'-untranslated region of the other genes. The first exon in the ALAS-2 genes is highly conserved and encodes an iron-responsive element (28, 35), as discussed in Section VI,B. Exon 2 corresponds to the mitochondrial signal sequence (exon 1 in the case of the chicken ALAS-1 gene) and exons 5-11 encompass the proposed catalytic site domain (exons 4-10 for the chicken ALAS-1 gene). The remaining two exons of the genes encode the region located at the N-terminus of the mature protein. This conserved arrangement of exons and introns indicates that the housekeeping and erythroid genes have evolved from a common ancestral gene, with perhaps loss of the intron in the 5'-untranslated region of the chicken ALAS-1 gene. The human housekeeping gene is located on chromosome 3 (37, 38) and the erythroid gene is on the X chromosome (38, 39), suggesting that gene duplication and divergence of an ancestral gene were followed by translocation of the genes to different chromosomes. It is interesting that although the mitochondrial signal sequences for housekeeping ALAS proteins are very similar, they have little identity with the erythroid signal sequences. This suggests that these signal sequences may participate in different, but as yet unknown, mechanisms that control uptake of the precursors of the housekeeping and erythroid proteins into mitochondria in the different cell types.

FIG. 3. Structural organization of the genes for ALAS-1 and ALAS-2. Exons are numbered and ATG and TAA are the initiation and termination codons. The proposed functional roles of exons are indicated. An alternative splice pattern for human ALAS-2 is shown.

A study using the polymerase chain reaction has revealed two mRNA transcripts for the human erythroid ALAS gene (36). These mRNAs are present in about equal amounts in cells at all stages of erythroid development, including human fetal liver, pre-proerythroblasts, adult bone marrow, and peripheral blood reticulocytes. The two mRNAs are identical except for the absence of exon 4 in one of them, and are therefore generated by an alternative splicing mechanism (Fig. 3). Exons 3 and 4 encode the N-terminal region of the mature protein, and so these ALAS-2 isoforms will be identical except for structural heterogeneity in this region.
The alternative splicing of exon 4 in the human erythroid ALAS mRNA is not phylogenetically conserved because it is not seen in mouse and dog erythroid cells (36). This contrasts with the alternative splicing in the erythroid protein 4.1 gene, wherein the event is conserved in humans, mice, and dogs, and occurs at different stages of erythroid development (40). The lack of phylogenetic conservation in the alternative splicing of exon 4 in the ALAS gene implies that the event in the evolution of the protein structure came after the divergence of man and mouse. Although alternative splicing of exon 4 does not occur in the mouse erythroid ALAS gene, different alternative splicing results in the formation of two mouse mRNAs that are characterized by the presence or absence of 45 nucleotides in exon 3 (27). This alternative splicing event is prevented in the human ALAS-2 gene by a single base change that alters the consensus sequence for splicing at the corresponding downstream site in the mouse. As with the removal of exon 4 in the human gene, this differential splicing process in the mouse will result in mature protein isozymes heterogeneous in their N-terminal regions. The functional implication for the differential splicing of ALAS is not clear. The domain affected is probably not involved directly in catalysis because it is not present in bacterial ALAS proteins. In contrast to the ALAS-2 mRNA, there is no evidence for alternative splicing of the ALAS-1 mRNA in liver and other nonerythroid cells.
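The statement that the splice variants differ only in their N-terminal regions rests on the reading frame being preserved when a segment is omitted. The short sketch below illustrates that arithmetic for the 45-nucleotide mouse segment, assuming it lies entirely within the coding region, and shows exon skipping on placeholder sequences. The helper names `splice` and `keeps_frame` and the exon sequences are hypothetical; no real ALAS-2 sequence is used.

```python
def splice(exons, skip=()):
    """Join exon sequences in numerical order, omitting exon numbers in `skip`."""
    return "".join(seq for number, seq in sorted(exons.items()) if number not in skip)

def keeps_frame(n_nucleotides):
    """A deleted or inserted segment preserves the reading frame only if
    its length is a multiple of three."""
    return n_nucleotides % 3 == 0

# The 45-nucleotide segment of mouse exon 3 described in the text:
# 45 is divisible by 3, so its presence or absence changes 15 codons.
print(keeps_frame(45), 45 // 3)

# HYPOTHETICAL placeholder exons (not real ALAS-2 sequence), used only to
# show exon 4 being skipped as in one of the two human transcripts.
toy_exons = {3: "ATGGCC", 4: "GGTAAA", 5: "TTTTGA"}
full_length = splice(toy_exons)
exon4_skipped = splice(toy_exons, skip={4})
print(full_length, exon4_skipped)
```

A segment whose length is not a multiple of three would instead shift the frame downstream, which would not be compatible with isoforms that are identical outside the affected region.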

III. Molecular Regulation of Housekeeping 5-Aminolevulinate Synthase

A. Basal Expression in Different Tissues

Total RNA has been isolated from various rat tissues and the levels of ALAS-1 mRNA have been quantitated using the rat liver ALAS cDNA as a specific probe (25, 34, 41). A mRNA of size 2.3 kb, corresponding to that of the liver ALAS mRNA, is detected in all tissues examined, including intestine, kidney, lung, heart, muscle, testis, and adrenal gland. These results demonstrate the housekeeping role of the ALAS protein in supplying heme for respiratory cytochromes and other heme proteins. Quantitation of the housekeeping mRNA shows that the adrenal gland contains the largest amount, with the small intestine, lung, heart, and testis containing 50%, and the liver and kidney about 25%, of that in the adrenal gland (41). These results differ somewhat from those reported earlier (25), wherein the level of this mRNA was highest in heart. Species, age, or diet variations may account for these differences. In any case, the basal levels of ALAS-1 mRNA, in the absence of external inducers, probably do not vary dramatically from one tissue to another.

The expression of the ALAS-1 gene has been studied in fetal liver, the major hematopoietic organ (in the fetus), and also in the adult liver of the rat during development (34). The level of ALAS-1 mRNA was extremely low in fetal liver at 15 to 20 days after gestation but increased to a maximum just before birth, when it decreased somewhat, and then increased later in the adult liver. These results confirmed earlier data (25) on the developmental profile of ALAS-1 mRNA, but the basis for the mRNA fluctuations is not known. The time course of ALAS-1 mRNA paralleled that of P450-2B1 mRNA. On the other hand, ALAS-2 mRNA was present at a high level in the fetal liver at 14 days after gestation and continued to increase up to birth, when it decreased to undetectable levels at 180 days. The production of globin mRNA in the fetal liver followed a path similar to that of the ALAS-2 mRNA.
These studies support the proposal that ALAS-1 supplies heme for P450s and other hemoproteins in the newborn and adult liver, whereas ALAS-2 is chiefly responsible for supplying heme for hemoglobin.


B. Identification of Regulatory Sequences for Basal Expression in the Housekeeping Gene

Housekeeping genes, in general, can be divided very broadly into two classes (42). Genes having promoters with no TATA box but with several binding sites for the transcription factor Sp1 fall into the first class. Genes in the second class have promoters with a TATA box and contain an array of different regulatory elements and exhibit both basal and tissue-specific regulation. The promoters of the housekeeping genes, for example, uroporphyrinogen-III decarboxylase (43) and porphobilinogen deaminase (44), are (G + C)-rich, lack a TATA box, contain multiple Sp1 binding sites, and are examples of the first class. By contrast, the ALAS-1 gene belongs to the second class of genes. The promoter contains a TATA box and no apparent Sp1 sites, and although the gene is expressed at a basal level in all tissues, it can be substantially induced by exogenous compounds in a tissue-specific fashion, as described in Section IV.

We have initiated studies to identify the control elements in the ALAS-1 gene required for its basal expression in various cell types (45). Different lengths of the 5’-flanking region of the rat gene (either with or without the first intron located in the 5’-untranslated region) were fused to the bacterial chloramphenicol acetyltransferase reporter gene and transient expression analysis was performed in different cell types. Sequences allowing maximum expression are located in the promoter region to -479 bp and in the first intron. Subsequent gel shift and site-directed mutagenesis experiments established an important role for two control elements located immediately upstream from the TATA box. These control elements (at -59 to -48 bp and -77 to -88 bp) are homologous to the binding site for the transcription factor, nuclear respiratory factor 1 (NRF-1) (46). The purification and molecular cloning of NRF-1 has been accomplished (47).
Recombinant NRF-1 can activate gene transcription from several NRF-1-responsive promoters, and NRF-1 most likely binds as a monomer, even though its site is palindromic. Our transient expression studies (45) demonstrate that both NRF-1 sites in the rat ALAS promoter are functional; mutagenesis of the NRF-1 site at -77 to -88 bp in the native promoter resulted in a 50% loss of expression, whereas mutagenesis of the NRF-1 site at -59 to -48 bp gave an 80% loss. When both NRF-1 sites are altered, there is almost complete loss of expression. Moreover, gel shift analysis using crude nuclear extracts from monkey kidney COS-1 cells showed that each of the NRF-1 sequences in the ALAS promoter bound a protein complex of the same mobility as that of an authentic NRF-1 sequence (from cytochrome c), and this binding was prevented by the authentic NRF-1 sequences when tested in competition assays. Together, these results establish that the NRF-1 motifs act in a cooperative fashion and are critical for the promoter activity of the rat ALAS gene. We have also identified NRF-1 binding sites in the promoters for both the human and chicken ALAS-1 genes, emphasizing the importance of these elements in basal expression.

Functional NRF-1 binding sites have been identified in the promoters of several nuclear genes encoding mitochondrial proteins concerned with oxidative phosphorylation, including somatic cytochrome c, cytochrome-c oxidase subunit VIc, ubiquinone-binding protein, and ATP synthase γ-subunit (47). These findings suggest that NRF-1 may coordinate the supply of mitochondrial heme with the synthesis of some respiratory chain subunits by regulating nuclear expression of the genes for ALAS and respiratory chain subunits. In keeping with this rate-controlling role of ALAS, no NRF-1 sites are present in the promoters of the housekeeping genes for the other heme pathway enzymes, such as ALA dehydratase (48), porphobilinogen deaminase (44), uroporphyrinogen-III decarboxylase (43), or ferrochelatase (49). In erythroid cells, NRF-1 appears not to be important for controlling heme levels, because the gene promoter for ALAS-2 does not contain an NRF-1 binding site (28).
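The mutagenesis figures quoted for the two NRF-1 sites (50% and 80% loss of expression) can be examined with simple arithmetic to see why the sites cannot be contributing in a purely additive way. The sketch below is an illustrative reading of those two numbers only; it is not an analysis taken from the cited study, and the two reference models (additive and independent multiplicative) are assumptions introduced here.

```python
# Percentage loss of reporter expression on mutating each site (from the text).
loss_distal = 50     # NRF-1 site at -77 to -88 bp
loss_proximal = 80   # NRF-1 site at -59 to -48 bp

# ASSUMED model 1: each site adds a fixed, independent share of activity.
# The single-site losses would then sum to at most 100%.
additive_loss = loss_distal + loss_proximal

# ASSUMED model 2: each site independently scales activity, so the residual
# activity of the double mutant is the product of the single-mutant residuals.
multiplicative_residual = (100 - loss_distal) * (100 - loss_proximal) / 100

print(additive_loss)            # 130, which exceeds 100%
print(multiplicative_residual)  # 10.0 (% of wild-type activity)
```

Because the single-site losses sum to more than 100%, each site must depend in part on the other for its contribution, which is consistent with the cooperative action proposed above. The text reports only "almost complete loss" for the double mutant, so the comparison with the multiplicative estimate cannot be taken further without the underlying data.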

C. Drug Induction and Heme Repression of Hepatic ALAS

In his classical studies, Granick (50) demonstrated that a wide range of foreign chemicals, including pharmaceutical drugs such as phenobarbital, can increase substantially the level of ALAS activity in chick embryo liver and cultured primary chick embryo hepatocytes. The mechanism of ALAS induction is of considerable interest because phenobarbital and other drugs that induce liver ALAS precipitate attacks of porphyria in genetically susceptible individuals. These compounds are referred to as porphyrinogenic drugs. Granick also first demonstrated that the drug-induced increase of ALAS in chick embryo hepatocytes can be prevented by heme, and that ALAS synthesis is therefore subject to a negative feedback control mechanism. Similar experiments have been performed in rats and chick embryos, wherein drug induction of hepatic ALAS activity is prevented by administration of heme (or the heme precursor 5-aminolevulinate).

The induction of ALAS-1 mRNA by drugs such as phenobarbital or 2-allyl-2-isopropylacetamide is tissue specific. In adult rats a substantial induction occurs in the liver and, to a lesser extent, in the kidney (25), whereas in adult hens, similar levels of induction are seen in the liver, kidney, and small intestine, but not in other tissues (51). Heme treatment of rats prevents drug induction of liver and kidney ALAS-1 mRNA and can lower basal levels of mRNA in various tissues (such as heart, testis, and brain) (25).


The drugs that induce ALAS activity also induce specific P450s, particularly in the liver. P450s play an important role in drug metabolism, and because they are heme-containing proteins, the induction of ALAS by these drugs would ensure an adequate supply of heme for the induced P450 apoproteins. It can be postulated that, in response to drugs, newly synthesized P450s utilize heme, so that preexisting inhibitory heme levels are lowered and ALAS repression is reduced. Alternatively, drugs may play a more direct role and induce ALAS independently of P450 induction. There are several relevant questions here. How does heme regulate ALAS-1? How do drugs induce ALAS-1 and P450s in the liver and, specifically, is the induction of P450s a prerequisite for increased ALAS formation? These questions are now considered.

D. Heme Regulates Hepatic ALAS at Multiple Sites

Heme acts at multiple steps to regulate the level of ALAS in the liver. This is envisaged to be exerted through two heme “pools,” one in the cytosol and the other in the nucleus (Fig. 4). The physiological significance of this tightly controlled end-product regulation most likely relates to the fact that high levels of heme can be toxic. There is evidence that when heme interacts with hydrogen peroxide, which arises in cells during normal oxidative metabolism, reactive oxygen species that damage membrane lipids and proteins are generated (2, 2). In addition, accumulated intermediates of the heme pathway have adverse effects in porphyria diseases (as discussed in Section VII). Porphyrinogens, when oxidized to porphyrins, absorb light and generate dangerous free radicals that mediate cutaneous injury (52), and excessive 5-aminolevulinate and porphobilinogen production may contribute to the neurological disturbances seen in some porphyrias. By regulating ALAS levels through end-product repression, the amounts of cellular heme and heme pathway intermediates can be kept to acceptable levels.

E. Heme Lowers ALAS-1 mRNA in Liver

When rats are injected with drugs, there is a substantial increase in the level of hepatic ALAS-1 mRNA. This increase can be repressed at the transcriptional level by injection of heme or the precursor 5-aminolevulinate (25). In addition, we have shown that the half-life of rat ALAS-1 mRNA in hepatoma cells is significantly affected by the addition of heme (unpublished). In these experiments, cells were treated with actinomycin D to inhibit transcription and the level of ALAS-1 mRNA was quantitated by Northern blot analysis. In the presence of heme, the half-life of the ALAS-1 mRNA was reduced from 130 to 42 minutes whereas that of another short-lived mRNA, c-myc, was not affected. The ALAS-1 mRNA does not contain (A + U)-rich sequences in its 3'-untranslated region, as found in some highly labile mRNAs (53). We are currently defining regions in the mRNA that are responsible for this heme-mediated instability. The fact that the ALAS-1 mRNA is relatively short-lived, even in the absence of additional heme, is in keeping with the role of ALAS-1 as the rate-controlling enzyme of heme biosynthesis.

FIG. 4. Regulatory roles of heme in the liver. In the liver, heme is involved in the regulation of a variety of processes. It represses transcription of the ALAS-1 gene, prevents import of ALAS-1 precursor into mitochondria, and reduces ALAS-1 mRNA stability. Heme also increases transcription of the heme oxygenase gene.

When 17-day drug-treated chick embryos are administered heme, the level of hepatic ALAS-1 mRNA is lowered, but, in contrast to rats, this response is predominantly due to an effect on mRNA stability, with transcription of the ALAS-1 gene being essentially unaffected (our unpublished data). Similar studies have been performed in cultured chick embryo hepatocytes. In these cells, heme decreases the half-life of ALAS-1 mRNA from 3.5 to 1.2 hours without affecting the gene transcription rate (54, 55). Furthermore, the heme-mediated mRNA instability was prevented by cycloheximide treatment, indicating that heme action may be mediated by a labile protein (54, 55).
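The half-life changes reported in this section can be translated into expected mRNA levels under a simple kinetic model. The sketch below assumes first-order decay and, for the steady-state comparison, an unchanged rate of transcription; both are simplifying assumptions introduced here, and the second does not hold in rat liver, where heme also represses transcription. The function names are hypothetical.

```python
import math

def decay_constant(half_life):
    """First-order decay constant k = ln(2) / half-life (same time units)."""
    return math.log(2) / half_life

def fraction_remaining(t, half_life):
    """Fraction of mRNA left at time t after transcription is blocked,
    as in an actinomycin D chase, assuming first-order decay."""
    return 0.5 ** (t / half_life)

def steady_state_ratio(half_life_with_heme, half_life_without_heme):
    """With a constant synthesis rate s, the steady-state level is s / k,
    which is proportional to the half-life; the ratio of levels therefore
    equals the ratio of half-lives."""
    return half_life_with_heme / half_life_without_heme

# Half-lives quoted in the text.
rat_hepatoma = steady_state_ratio(42, 130)       # minutes
chick_hepatocyte = steady_state_ratio(1.2, 3.5)  # hours

print(round(rat_hepatoma, 2), round(chick_hepatocyte, 2))
print(fraction_remaining(130, 130), fraction_remaining(130, 42))
```

Under these assumptions, destabilization alone would lower the steady-state ALAS-1 mRNA level to roughly one-third of its starting value in both systems (about 0.32 and 0.34, respectively).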


F. Heme Inhibits ALAS Transport into Mitochondria

Most mitochondrial proteins, including ALAS, are nuclear encoded and are synthesized as a cytoplasmic precursor prior to import into the mitochondria, where the precursor is proteolytically processed to give the mature form. The import of ALAS precursor into liver mitochondria is inhibited by heme; rats were injected with heme and the levels of ALAS were measured either immunologically or by enzyme activity (56). These experiments showed that in response to heme, ALAS protein in liver mitochondria rapidly disappears while that in the cytosol increases. Pulse-labeling studies in chick embryo livers (57) confirmed that injected heme prevents the transfer of cytosolic precursor ALAS into mitochondria, and because heme does not inhibit the import of another mitochondrial protein, prepyruvate carboxylase, it seems that the effect of heme is specific for ALAS. Because the half-life of mitochondrial ALAS protein is short, about 35 minutes (58), inhibition of ALAS transport into mitochondria by heme would lead to a rapid depletion of the enzyme in mitochondria and provide an effective control mechanism.

An examination (8, 59) of the amino-acid sequences of liver and erythroid ALAS precursor proteins reveals three Cys-Pro motifs that resemble the seven Cys-Pro motifs comprising the heme-binding domain of the yeast heme-activated protein (HAP1) (60). HAP1 controls the transcription of a variety of yeast nuclear genes encoding respiratory proteins, and coordinates their expression with aerobic growth conditions through its metabolic coeffector heme. In vitro studies (59) provide evidence that the Cys-Pro motifs in the mouse ALAS-2 precursor mediate the inhibition of precursor transport into mitochondria by heme. It seems highly probable that the similar motifs identified in the signal sequence of liver ALAS proteins are also responsible for heme responsiveness (see Fig. 2).
The heme-responsive motif is not present in the yeast ALAS precursor, the import of which into mitochondria is not affected by heme (61). The question now arises as to how heme, through these motifs, is able to prevent import of ALAS into mitochondria. The mechanistic details for the import of ALAS precursor into mitochondria have not been studied, but presumably the process is similar to that for other mitochondrial matrix proteins (62). In this model, specific binding of the precursor to the outer membrane is mediated by a surface receptor protein complex that interacts with the precursor’s signal sequence. The signal sequence then leads the transfer of the protein in an extended conformation through a protein transport pore that spans both the outer and inner mitochondrial membranes. Heme could inhibit import of precursor ALAS by one of several mechanisms. For example, heme bound to the precursor could interfere with the interaction of the precursor with a specific mitochondrial surface receptor. Alternatively, bound heme may prevent the transfer of the precursor across the mitochondrial membranes. Although it is assumed that heme binds to the precursor, there is no direct evidence for this, and the involvement of a heme-binding protein in the inhibition mechanism cannot be dismissed.

To summarize, heme inhibits ALAS synthesis in chick embryo liver by affecting mRNA stability and preventing the import of ALAS into mitochondria, whereas in rat liver, heme additionally prevents ALAS formation by inhibiting transcription of the gene (Fig. 4). Although the exact mechanism of action of heme at each step is not understood, it seems likely that heme-responsive proteins are involved in some, or all, of these mechanisms.
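The Cys-Pro motifs discussed in this section are, at their simplest, occurrences of a cysteine immediately followed by a proline. A minimal sketch for listing such dipeptides in a precursor sequence is given below. It assumes that the bare dipeptide is an adequate first-pass proxy for the motif, which ignores any flanking residues that may matter; the function name and the toy sequence are hypothetical.

```python
def cys_pro_positions(seq):
    """Return the 1-based positions of every Cys residue that is
    immediately followed by Pro (one-letter code 'CP')."""
    seq = seq.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CP"]

# HYPOTHETICAL toy presequence for illustration only; NOT an ALAS sequence.
toy_precursor = "MACPVLRRCPFLSACPK"
print(cys_pro_positions(toy_precursor))
```

Finding the dipeptide in a sequence shows only that a candidate motif is present; whether it confers heme sensitivity on mitochondrial import has to be tested directly, as in the in vitro studies cited above.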

IV. Drug Induction of Hepatic Cytochrome P450 and 5-Aminolevulinate Synthase Transcription

When adult rats or chick embryos are treated with drugs such as phenobarbital or 2-allyl-2-isopropylacetamide, there is a large increase in the hepatic level of ALAS protein, as detected by enzyme activity or immunological measurements, and administered heme prevents this increase. Transcriptional nuclear “run-on” experiments have established that drugs increase the rate of transcription of the ALAS-1 gene in chick embryo liver (51), rat liver (29), and chick embryo hepatocytes (63, 64), but the mechanism of this induction remains elusive. Heme treatment of rats (25), but not of chick embryos (our unpublished data), prevents the drug-induced increase in ALAS-1 gene transcription. It can therefore be inferred that, for chick embryo liver, drugs do not induce ALAS gene transcription by lowering repressive heme levels as a result of increased hepatic P450 synthesis. (The fact that heme prevents an increase of hepatic ALAS activity in chick embryos reflects the action of heme in inhibiting mitochondrial import and decreasing the ALAS-1 mRNA stability.) Hence, in chick embryo liver, drugs must increase transcription of the ALAS-1 gene through a mechanism independent of heme and P450 induction.

Experiments with the protein synthesis inhibitor cycloheximide agree with this conclusion (63, 64). Chick embryos were injected with cycloheximide 60 minutes prior to drug treatment and this reduced total liver protein synthesis by more than 90%. Nuclei from the liver were isolated after 3.5 hours of exposure to phenobarbital and nuclear transcription rates were quantitated. The results demonstrated that the drug-induced increase in ALAS-1 gene transcription is not inhibited by cycloheximide but, in fact, is markedly stimulated. The data strongly suggest that induction of ALAS-1 gene transcription is not dependent on the concomitant synthesis of P450 apoprotein and, moreover, that a labile repressor protein may be involved.

In contrast to chick embryos, injection of rats with heme can inhibit the drug-induced increase in transcription of the ALAS-1 gene. However, when rats are injected with succinylacetone (a specific inhibitor of heme biosynthesis) in the absence of drug, there is only a small increase in the transcription of the ALAS-1 gene (our unpublished data). This implies that lowered heme levels per se are not sufficient to induce ALAS-1 gene transcription fully, and that drugs must also play some other role, perhaps by activating a transcription factor. Therefore, the induction of ALAS-1 protein observed in rat liver after drug administration must reflect increased ALAS-1 gene transcription, together with, perhaps, enhanced mRNA stability and enhanced import of the ALAS-1 precursor into mitochondria, the latter two responses reflecting a decrease in heme levels through drug-induced P450 apoprotein synthesis.

There is no information on the identity of transcription factors or DNA control elements that may play a role in the drug-induced transcription of the hepatic ALAS-1 gene. In our laboratory, transient expression studies in chick embryo hepatocytes have been carried out with constructs containing up to 1.4 kb of 5'-flanking region of the chicken ALAS-1 gene promoter fused to a reporter gene. Although the endogenous mRNA is induced, no response to drugs has been observed with the constructs, indicating that the postulated drug-responsive enhancer domain lies further upstream (or downstream) of the gene, or within the gene. Although progress on the mechanism underlying the drug induction of the ALAS-1 gene has been slow, some progress has been made in determining the mechanism by which phenobarbital enhances transcription of hepatic P450 genes.
We are investigating the regulation of the chicken hepatic P450 genes. When chick embryos are injected with drugs, three P450 mRNAs of 3.5, 2.5, and 2.2 kb are induced in the liver (51). Basal levels of these mRNAs in liver are very low, but following drug treatment there is a large increase in the amounts of the 3.5- and 2.5-kb mRNAs, with a lesser induction of the 2.2-kb mRNA. These mRNAs are also specifically induced in the liver, kidney, and small intestine of the adult hen and are accompanied by an induction of the ALAS-1 mRNA (51). We have isolated cDNA clones for the 3.5- and 2.2-kb mRNAs and have demonstrated that they are encoded by separate genes, namely, P4502H1 and P4502H2, respectively (65). cDNA clones for the 2.5-kb mRNA are currently being analyzed. Nuclear run-on experiments demonstrate that drugs act in a transcriptional fashion to increase amounts of the 3.5- and 2.2-kb mRNAs in chick embryo liver. When chick embryos are treated with cycloheximide prior to drug injection, the rate of transcription of the P4502H1 gene is not inhibited but is, in fact, somewhat elevated (63). There are similar findings with cycloheximide (64) using chick embryo hepatocyte cultures. The data indicate that the required transcription factor(s) is preformed and de novo synthesis of protein is not required for drug induction of P450.

We have recently observed that phosphorylation may play an important role in drug induction of ALAS-1 and P4502H1. 2-Aminopurine (a known inhibitor of serine/threonine kinases) interferes with the phenobarbital induction of the ALAS-1 and P4502H1 genes in chick embryo hepatocytes. Perhaps phenobarbital activates a kinase leading to phosphorylation of a transcription factor that is then able to activate its target promoter.

To investigate drug-responsive regions in the chicken P4502H1 gene, various lengths of 5’-flanking region of the P450 gene were fused to the chloramphenicol acetyltransferase gene and these chimeric constructs were expressed in chicken embryo hepatocytes by the transient transfection method (66). In this study, a drug-responsive region between -5.9 and -1.1 kb was shown to act as an enhancer and to confer substantial phenobarbital and 2-allyl-2-isopropylacetamide responsiveness to the weak heterologous enhancerless SV40 promoter. However, when examined in the native P4502H1 promoter, the region between -5.9 and -1.1 kb was only weakly responsive to the drug. In this regard, it is noteworthy that constructs with regions between 230 bp and 8.9 kb of the P4502H1 5’-flanking sequence, when fused to the reporter gene, gave very high basal levels of expression in the absence of drug. By contrast, the endogenous level of P4502H1 mRNA in the absence of drug was very low.
Together, these results suggest that the strong basal expression of the P4502H1 gene is normally repressed in the liver and that a “silencer” protein may be present that acts further upstream than 8.9 kb or elsewhere in the P4502H1 gene. We have recently identified several liver-specific elements in the P4502H1 promoter that are most likely responsible for the tissue-specific expression of this gene and the high basal expression seen in transient analysis. The putative silencer may prevent the action of these tissue-specific transcription factors, with the drug acting through a receptor protein, reversing the action of the silencer. The fact that phenobarbital can induce both ALAS-1 and P4502H1 gene transcription implies that the promoters of these genes will contain a common, or similar, drug-responsive region. Additionally, because structurally unrelated drugs induce these genes (for example, 2-allyl-2-isopropylacetamide and phenobarbital), the drug receptor must be postulated to have a “sloppy” active site (13) that can accommodate a variety of different inducers.


The phenobarbital-induction mechanism of P450 genes in Bacillus megaterium and rat liver has been investigated (67). It was suggested that the barbiturate-mediated induction of P450BM-1 and P450BM-3 genes in bacteria and of the P4502B genes in rat liver may be controlled by a similar transcriptional factor. By comparing the 5'-flanking regions of these genes, a 17-bp region of strong sequence identity was identified. In gel-retardation assays, the sequence strongly bound a single protein present in nuclear extracts of livers from phenobarbital-treated but not untreated rats. We have found a sequence in the P4502H1 promoter with some homology to that of the 17-bp sequence. In gel-retardation assays, this sequence bound a protein present in both chick and drug-treated embryo and rat liver nuclear extracts. However, the mobility of this protein complex was considerably smaller than that observed with the 17-bp sequence (67), demonstrating that it binds a different protein. The role of this protein in the P4502H1 gene transcription is now under investigation.

The mechanism governing the phenobarbital induction of the rat P4502B2 gene has been investigated in transgenic mice (68). These studies showed that the P4502B2 transgene, with 800 bp of 5' flanking region, is not phenobarbital inducible, but gives high basal levels of expression in the liver and kidney. [This 5'-flanking region contains the 17-bp sequence reported (67) as important for drug induction.] However, a transgene with 19 kb of 5'-flanking region is induced by drug specifically in the liver. This construct gave a low basal level in liver and reflected basal expression of the endogenous gene. These results suggest that there is a region between -800 bp and -19 kb that represses basal expression and is phenobarbital inducible in a tissue-specific fashion. The proposal that this gene is normally repressed agrees with our findings on the expression of the chicken P4502H1 gene.
Recently, two DNA sequences in the first 200 bp of the rat P4502B2 gene promoter were shown by gel-shift and DNase-I footprint analyses to bind more protein from liver nuclear extracts of phenobarbital-treated rats than from extracts of untreated rats (68a). This region of the promoter does not include the reported 17-bp sequence (67). What role, if any, these proteins play in the drug response remains to be determined. Phenobarbital-induced expression of the P4502B2 gene in a rat hepatoma cell line is inhibited by RU486, the glucocorticoid-progesterone antagonist, and it was proposed that an endogenous steroid may be involved in the drug induction mechanism (69). Whether there is a drug-receptor or an endogenous steroid-receptor complex involved in the phenobarbital-induction mechanism awaits the identification of the drug-responsive DNA control sequences and the proteins that bind to them.

V. Heme Degradation by Heme Oxygenase

Excess heme can be toxic to cells, resulting in lipid and protein damage by oxidation (2); thus, probably all cells are capable of degrading heme. Heme oxygenase (EC 1.14.99.3) is the key enzyme in heme catabolism, because it catalyzes the rate-limiting step. The enzyme is located in the endoplasmic reticulum and, together with NADPH cytochrome-P450 reductase, catalyzes the cleavage of heme, with elimination of a porphyrin methene bridge as carbon monoxide and the formation of the bile pigment, biliverdin. Released iron is reutilized and carbon monoxide and bile pigment are ultimately eliminated from the body. Two different isozymes, heme oxygenase-1 and heme oxygenase-2, have been identified (70, 71); they are the products of separate genes (72, 73). Heme oxygenase-1 is expressed ubiquitously and is present at high levels in the spleen and liver, where it is responsible for degradation of hemoglobin heme. This isozyme is induced by many stimuli and agents, including heme, heavy metals, hyperthermia, and oxidative stress. The induction of heme oxygenase-1 by heat stress may serve to destroy heme released from denatured heme proteins, thus averting any toxic effects of heme (74). Although the primary role of heme oxygenase-1 is to destroy heme, the enzyme may have another important role in protecting the cell against oxidative stress through the generation of the antioxidant bilirubin (74a). Cellular glutathione levels decrease in response to heavy metals and many other stimuli that induce heme oxygenase-1 (74a). With depleted glutathione, hydrogen peroxide levels increase, which results in the formation of dangerous “active oxygen” species. The mechanism that results in reduced glutathione levels in the presence of heavy metals is not known, but the induction of heme oxygenase can be viewed as a protective measure in which bilirubin generated by the action of heme oxygenase functions to remove the active oxygen species.
Heme oxygenase-2 is not responsive to agents that induce heme oxygenase-1 and is constitutively expressed in many tissues throughout the body, with high concentrations in the brain and testis (71, 75). There is currently great interest in the possibility that the role of heme oxygenase-2 in the brain is to supply carbon monoxide, which in turn acts as a neuronal messenger by binding to the heme moiety of guanylyl cyclase to produce cyclic GMP (15, 76). In situ hybridization of brain slices has revealed discrete neuronal colocalization of mRNAs for ALAS-1, heme oxygenase-2, and guanylyl cyclase (15). Inhibition of heme oxygenase activity prevents an increase in cyclic GMP in olfactory neurons (15) and the induction of long-term potentiation, a learning-linked process, in the brain (77).

The molecular basis for the induction of heme oxygenase-1 by heme, metals, and heat stress is currently being pursued (74, 78, 78a). Heme and metals dramatically increase transcription of the heme oxygenase-1 gene. Moreover, the increased mRNA level produced by heme is inhibited by cycloheximide, suggesting that the synthesis of a labile protein is required for induction (79). This labile protein may be a repressor of transcription or a heme-activated protein, and by analogy with the yeast heme-sensor transcription factor HAP1, may bind to its target DNA element only as a heme-protein complex. Regulatory regions have been identified in the mouse (78) and human (78a) heme oxygenase-1 gene promoters. A region of 268 bp located about 4 kb upstream in the mouse promoter responds to both heavy metals and heme (78). This region was previously shown to contain AP-1 sites and to respond to phorbol ester, but it is not clear whether these sites are involved in the induction mechanisms. A region of 500 bp has also been identified in about the same location in the human heme oxygenase-1 gene promoter and responds to cadmium, although its response to heme has not been reported (78a). The 3' end of this region shows substantial homology with the heme-responsive region previously identified in the mouse. Preliminary mutagenesis data identified the sequence TGCTAGATTT located in the 500-bp region as being responsible for cadmium induction. This sequence does not, however, respond to other inducers, notably heme, and the control elements that respond to these inducers must lie elsewhere. There is a similar sequence in the 268-bp region of the mouse promoter, but this sequence has not been examined for its ability to mediate cadmium or heme induction. It is also worth noting that, in the induction of heme oxygenase-1 by heavy metals, there is no evidence for the involvement of metal-responsive elements as found in the metallothionein gene promoter.
The molecular mechanism by which thermal stress induces the rat heme oxygenase-1 gene has been investigated (74, 81). The promoter of the rat heme oxygenase-1 gene contains two potential heat-shock-like elements (HSE1 and HSE2). Transient transfection analysis demonstrated that HSE1 is sufficient to confer heat-shock responsiveness on the reporter gene, whereas HSE2 by itself could not, although the inclusion of HSE2 increased the magnitude of the response to heat shock by HSE1 (81). The proteins that bind to these elements have yet to be characterized.

VI. Erythropoiesis and Heme Synthesis

During erythropoiesis, committed erythroid progenitor cells undergo differentiation and proliferation, ultimately to yield mature circulating
erythrocytes. During erythroid cell development, large amounts of heme and globin are produced for hemoglobin. The molecular events leading to the rapid rate of synthesis of these are now considered. Mammalian globins are encoded by the α- and β-globin gene families. The human α-globin cluster on chromosome 16 includes an embryonic gene and two adult α genes; the β-globin cluster on chromosome 11 comprises an embryonic gene, fetal genes, and an adult β gene. The genes are expressed in a developmental sequence. In mammals, erythropoiesis is initiated in the yolk sac, then shifts to the fetal liver and to bone marrow in the adult, and different types of globin chains are produced during ontogeny. The production of globin and heme is intimately associated with erythroid cell differentiation, best understood in the bone marrow. Stem cells are recruited into differentiation to form early progenitors of the various hematopoietic cell lines and this process occurs via the expression of cell surface receptors for various factors called hematopoietic growth factors (82). Erythropoietin is the hematopoietic growth factor that uniquely directs the differentiation and proliferation of committed erythroid progenitor cells (83). Erythropoietin is produced mainly in the kidney and binds specifically to the erythropoietin receptor, a transmembrane protein in immature erythroid cells. The earliest identifiable erythroid progenitor cells are the burst-forming unit erythroid (BFU-E) cells and the colony-forming unit erythroid (CFU-E) cells, and both express erythropoietin receptors. In response to erythropoietin, BFU-E cells differentiate and proliferate into the more mature CFU-E cells, which produce morphologically recognizable proerythroblasts. Proerythroblasts remain erythropoietin responsive and give rise to erythroblasts, in which the rate of hemoglobin production is maximal.
In mammals, the cell nucleus is ultimately extruded, yielding reticulocytes, which continue to synthesize hemoglobin. In a few days reticulocytes are released from the bone marrow, and within a day lose the remaining ribosomes and mitochondria to become circulating erythrocytes. Erythropoietin receptor number is maximal in CFU-E cells but then decreases to negligible amounts in reticulocytes (84). That is, once the program of terminal differentiation is set in motion by erythropoietin, erythropoietin-receptor dependency progressively decreases, but the mechanism involved here is not understood. During erythroid differentiation, the transcription of erythroid-specific genes is activated. The expression of globin genes depends on the binding of multiple transcription factors to each gene promoter and to more distal enhancer regions (85, 86). These factors increase the rate of transcription of the genes in a developmental and tissue-specific fashion. Several transcription factors that bind to control regions of globin and other erythroid-specific genes have been characterized. The transcription
factor GATA-1 has attracted considerable attention and binds to a sequence that contains a GATA motif (87, 88). GATA-1 is a member of a multigene family, and another member (GATA-2) is also present in erythroid cells. In addition to GATA-1 binding sites, another DNA sequence important for erythroid gene expression is the CACCC motif. Several proteins appear to be capable of binding to this motif, including the ubiquitous factor Sp1 and an erythroid-specific factor EKLF (89). In many, if not all, erythroid-specific genes there appears to be a cooperative interaction between CACCC motif proteins and a nearby GATA-1. An AP-1-like motif that binds the erythroid-specific nuclear factor, designated NF-E2 (90), has been identified in some erythroid genes. It is of interest that GATA-1 and NF-E2 are found not only in erythroid cells, but also in mast cells and megakaryocytes. Perhaps the combined action of GATA-1, NF-E2, and EKLF results in erythroid-specific gene expression. This action may be prevented in mast cells and megakaryocytes by the absence of EKLF or the presence of inhibitors that prevent the binding of GATA-1 and NF-E2. Located far upstream of each of the α- and β-globin gene clusters, there is an enhancer region, the locus control region, that binds both ubiquitous and erythroid-specific factors (85, 86, 91). The locus control regions are required for maximal expression of the globin genes and for their expression in a developmentally regulated fashion. The mechanism by which the locus control region regulates transcription of the different globin genes is not known with certainty, but it has been proposed that there is an interaction between the locus control region and the appropriate globin promoter, so that the promoter exists in a nucleosome-free state, thus permitting transcription (91a). The molecular mechanisms by which erythropoietin mediates erythroid cell development are not clear.
Recent studies indicate that the initial signal pathway involves protein-phosphorylation events. When erythropoietin binds to the erythropoietin receptor, a JAK2 kinase associates with the cytoplasmic domain of the erythropoietin receptor and is activated (92), but the proteins that may be subsequently phosphorylated and play a role in mitogenesis and gene transcription have not been fully elucidated. One transcription factor that plays an important role in erythroid gene activation in response to erythropoietin is GATA-1. Binding sites for GATA-1 are located in the promoter of the erythropoietin receptor gene and in the promoter of the gene for GATA-1. It is assumed that GATA-1 is expressed before the erythropoietin receptor and that GATA-1 is then able to activate expression of the erythropoietin receptor gene at the BFU-E stage or earlier. Subsequent binding of erythropoietin to the receptor on the surface of the BFU-E results in an increase in the amount of GATA-1. This, in turn, activates the erythropoietin receptor and GATA-1 genes, thus leading to a
large increase in GATA-1, which can activate erythroid gene expression (93). In primary cultures of human erythroid progenitor cells, following erythropoietin addition, there is a sharp rise in mRNA for GATA-1 and GATA-1 protein and this immediately precedes a major rise in globin mRNA (94). These results agree with the concept that GATA-1 is a key transcriptional activator of erythroid cell differentiation and maturation. Murine erythroleukemic (MEL) cells are virus-transformed erythroid precursor cells blocked at a stage of development comparable to the colony-forming unit, and these cells have been useful for examining the terminal stages of erythroid cell differentiation. On exposure to a chemical inducer such as dimethyl sulfoxide (Me2SO), the cells follow a developmental program similar to that of normal erythroid cell differentiation. Transcription of the α- and β-globin genes is activated in the Me2SO-induced MEL cells and this is accompanied by an increase in the amount of mRNA for at least six of the heme-synthesizing enzymes (95-97). Although MEL cells express erythropoietin receptor, they do not respond to erythropoietin. An attractive model system for studying erythropoietin-induced biochemical events has been developed by Klinken et al. (98), who isolated a mouse erythroid cell line (J2E) that corresponds to the proerythroblast stage and responds to erythropoietin. In the presence of this hormone, the cells proliferate and differentiate to give mature erythroid cells. During this process, production of both heme and globin is increased concurrently, as are the activities of several enzymes of the heme biosynthetic pathway (99). The enzymes are 5-aminolevulinate synthase, porphobilinogen deaminase, uroporphyrinogen-III synthase, coproporphyrinogen oxidase, and ferrochelatase; the activities of the three remaining enzymes do not change (99).
Subsequent studies show that mRNA levels for the induced enzymes are elevated by erythropoietin whereas the mRNA levels for the uninduced enzymes remain unaltered (100). An important question relates to the mechanism by which the levels of heme pathway enzymes increase during erythropoiesis, and the mechanism by which this is coordinated with globin production.

A. Expression of Genes for Heme Biosynthetic Enzymes

As described, there is both a housekeeping and an erythroid isozyme for ALAS, each encoded by a separate gene. When MEL cells are chemically induced with Me2SO, there is a dramatic increase in the level of ALAS-2 mRNA, with the level of ALAS-1 transcript rapidly decreasing to a very low level (101). Studies with J2E cells have established that the level of ALAS-2 mRNA is markedly enhanced by erythropoietin and that this reflects increased gene transcription, whereas the level of ALAS-1 mRNA is below the level of detection (99). Hence, expression of the erythroid-specific gene for ALAS is activated during erythroid development. The immediate ALAS-2 promoter resembles the globin promoter and contains several putative control elements, including GATA-1, CACCC box, and NF-E2 binding sites (28). A possible GATA-1 element is located at the TATA box position at -27 and also at positions -100 and -125. Noncanonical TATA boxes that bind GATA-1 but not TFIID have been identified in other erythroid-specific genes (102, 103). In these studies, mutation of the GATA-1 sequence to an authentic TATA box substantially affected expression, raising the possibility that GATA-1 plays a role in the formation of a preinitiation transcriptional complex. A dependence on GATA-1 for the initiation of transcription would ensure that expression of the gene is confined to erythroid cells. Two distinct genes in the human genome encode ALAS isozymes, but all other enzymes of the heme pathway so far examined have a single structural gene. The gene for ALA dehydratase, the second enzyme of the heme pathway, has an interesting organization, with separate housekeeping and erythroid-specific promoters (48). Erythroid and housekeeping mRNA isoforms arise by use of the appropriate promoter and alternative splicing. The isozymes are identical; the alternative splicing occurs in the 5'-untranslated region and the resulting mRNAs differ only in this region, and encompass identical coding sequences. The erythroid promoter contains a putative GATA-1 site, and transcription from this promoter is induced during erythroid differentiation. There are also separate housekeeping and erythroid-specific promoters for the gene encoding porphobilinogen deaminase, the third enzyme of the pathway (44, 104).
However, unlike ALA dehydratase, the two porphobilinogen deaminase isozymes differ in sequence at the N-terminus as a result of differential splicing. The erythroid porphobilinogen deaminase gene promoter, induced during erythropoiesis, contains functional binding sites for GATA-1, NF-E2, and CACCC box proteins. The housekeeping gene promoter is devoid of these elements, but contains binding sites for the ubiquitous transcription factor Sp1, a feature characteristic of many housekeeping genes. Human cDNA clones have been isolated for the fourth enzyme, uroporphyrinogen-III synthase, and the hepatic and erythroid forms are identical (105). The mRNA for this enzyme increases during erythropoiesis (99), but the isolation of the gene has not yet been reported and the underlying control mechanisms are unknown. The gene for the fifth enzyme, uroporphyrinogen-III decarboxylase, is expressed in all cell types, but expression is substantially increased during
erythroid differentiation in MEL cells, although this is not observed in the J2E mouse erythroid cell line (99). There is no evidence for different isozymes. Although the promoter contains an Sp1 binding site, control elements responding during erythropoiesis have not been identified (43). Recently, cDNA clones were isolated for human mitochondrial coproporphyrinogen oxidase, the sixth enzyme of the heme biosynthetic pathway (97, 106). Identical transcripts are expressed in erythroid and nonerythroid cells, and a single gene has recently been characterized (106a). Unlike the ALA dehydratase and porphobilinogen deaminase genes, which have separate promoters, the gene for coproporphyrinogen oxidase has a composite promoter with multiple copies of Sp1 and GATA-1 binding sites. This promoter presumably responds in both nonerythroid and erythroid cell types. The seventh enzyme, protoporphyrinogen oxidase, has not as yet been cloned; there is thus no information on whether transcription of this gene is increased during erythropoiesis, although there is no increase in enzyme activity in erythropoietin-treated J2E cells (99). The final enzyme of the heme pathway, ferrochelatase, is encoded by a single gene and, as for coproporphyrinogen oxidase, there is a composite promoter (49). Several potential Sp1 binding sites interspersed by potential binding sites for NF-E2 and GATA-1 have been identified in this promoter. Thus, from work with MEL (95-97) and J2E cells (99) it is evident that the synthesis of most, if not all, enzymes of the heme pathway increases in response to differentiation stimuli, and expression of their genes is directed by promoters that bind erythroid-specific transcription factors such as GATA-1. It is not known whether there is a coordinated increase in the levels of induced mRNAs; the precise timing of their appearance has not been established during erythropoiesis.
The requisite increased activity of the heme pathway during erythropoiesis results in a large production of protoporphyrin, but the subsequent formation of heme is critically dependent on the availability of iron. We now deal with the control of cellular iron levels and thus discuss further the role of ALAS-2 in determining the overall rate of protoporphyrin synthesis in response to iron availability.

B. Regulation of Cellular Iron Levels

Iron is essential to all living species. In mammalian cells, it is required for important iron-containing proteins, in which it may be a component of heme or an iron-sulfur cluster. In addition, there are enzymes, such as ribonucleotide reductase, that contain iron not present in heme or in an iron-sulfur cluster. Iron has a low solubility, which is overcome by iron-binding proteins. In animals, the major iron-binding protein in the serum is transferrin, and iron is transported and delivered to cells as iron-transferrin.

Inside cells, iron is complexed with protoporphyrin as heme and stored within ferritin shells. Because iron in conjunction with oxygen is a generator of harmful oxygen radicals that damage cells, the association of iron with protoporphyrin and the specific iron-binding proteins serves to maintain iron in a nontoxic form. Environmental iron, in the ferrous state, is absorbed by the intestinal mucosal cells and is subsequently oxidized to ferric iron, which binds to serum transferrin and is carried to cells. The mechanism of iron delivery to cells via transferrin has been well defined (107). The iron-transferrin complex binds specifically to cell-surface transferrin receptors. The transferrin-receptor complex is internalized in coated vesicles by receptor-mediated endocytosis, and the iron, released within endosomes, is reduced to Fe2+ and transported into the cytosol. The iron in the cytosol may either be utilized (for example, enter the mitochondria to form heme), or be stored (as cytosolic ferritin). Some cytosolic iron may also constitute a regulatory “free” iron pool, as described further below. The mechanism of intracellular iron transport is unknown. A large amount of iron is required by erythroid cells in the bone marrow to form hemoglobin. Indeed, erythroid cells have the greatest requirement for iron, hemoglobin iron accounting for about 70% of the total iron in the body. Hemoglobin does not turn over in erythroid cells, and serum transferrin-bound iron therefore is the major physiological source of iron for this protein. This may not be so in the liver, where heme proteins such as P450s do turn over significantly, and a main source of iron is probably intracellular heme iron, liberated by the action of heme oxygenase. In nonerythroid cells iron homeostasis is maintained by a coordinated regulation of transferrin receptor and ferritin expression (107, 108).

FIG. 5. Comparison of the proposed iron-responsive element structures in the 5'-untranslated region of chicken, mouse, and human ALAS-2 mRNAs and human ferritin mRNA. The initiation AUG codon is shaded, as are conserved bases. The first nucleotide (cap site) of the mRNAs is at +1. The loop sequence of the chicken ALAS-2 iron-responsive element differs by one nucleotide from the consensus.

When iron is abundant, transferrin receptor levels decrease, resulting in less cellular uptake of iron, whereas ferritin levels are increased, promoting the sequestration of excess intracellular iron. When iron is scarce, the opposite is observed, with transferrin receptor amounts increasing and those of ferritin decreasing. These mechanisms are predominantly post-transcriptional and permit a rapid response to changes in cellular iron status. Regulation of these proteins in nonerythroid cells is modulated by a cytosolic protein of about 100 kDa, referred to as the iron-responsive-element binding protein (IRE-BP) (107). This protein binds with high affinity to a specific RNA stem-loop structure, the iron-responsive element (IRE) (Fig. 5). The IRE consists of a stable stem interrupted by an unpaired C, five nucleotides 5' of the loop. The loop of six nucleotides has the conserved sequence CAGUGN. An IRE is located in the 5'-untranslated region of ferritin mRNA and five IREs are present in the 3'-untranslated region of the transferrin receptor mRNA (107). When IRE-BP binds to the single IRE in the ferritin mRNA, the initiation of translation is inhibited and synthesis of ferritin is prevented. In the transferrin receptor mRNA, binding of the IRE-BP to the 3'-untranslated region increases the stability of the mRNA, perhaps by preventing the action of a ribonuclease (109), and the synthesis of transferrin receptor is increased. The IRE-BP is a metal sensor; its RNA binding activity can be modulated by iron in the absence of any alteration in its rate of synthesis or its degradation (107). When the iron concentration is sufficiently low, the IRE-BP is modified so that it can bind strongly to an IRE. The nature of this modification became clear when cloning of the IRE-BP revealed sequence identity to the mitochondrial aconitase, which converts citrate to isocitrate (107). It is known that mitochondrial aconitase contains an iron-sulfur cluster that plays an essential role in regulating the enzyme's activity. Indeed, in the presence of high iron levels, the IRE-BP gains aconitase activity and corresponds to the previously identified cytosolic aconitase. Therefore, it is assumed that
the RNA binding and enzyme activity of the IRE-BP depend on a labile iron-sulfur cluster (110, 111). When the IRE-BP assembles an iron-sulfur cluster in the presence of high cellular iron levels, an accompanying conformational change prevents RNA binding and the protein gains aconitase activity (111). That is, the IRE-BP utilizes a labile iron-sulfur cluster as an environmentally sensitive switch. A possible role for this aconitase activity in iron metabolism has been suggested (12). The iron constituting a cytoplasmic pool may be complexed with citrate so that iron-citrate and iron-isocitrate complexes are in equilibrium, with iron-isocitrate being more labile (12). Therefore, when IRE-BP gains aconitase activity, the isocitrate concentration increases at the expense of the cytosolic citrate pool, and the loading of ferritin with iron would be facilitated by use of the iron-isocitrate complex. Although the regulation of transferrin receptor in nonerythroid cells is well documented, the question arises as to whether the receptor is similarly controlled in erythroid cells. Transferrin receptors are detected at the BFU-E stage, and their expression increases as the cells mature through to the CFU-E and erythroblast stages. Thereafter, the expression of transferrin receptors in reticulocytes is reduced and is not detectable in erythrocytes. Recent studies demonstrate that the control of transferrin receptor in erythroid cells is different from that in nonerythroid cells. When MEL cells are induced to differentiate with Me2SO, transferrin receptor numbers increase but, in contrast to nonerythroid cells, there is a substantial increase in the rate of transcription of the transferrin receptor gene (112). In addition, the stability of the transferrin receptor mRNA is increased during MEL cell differentiation, but by a mechanism that does not apparently involve iron and the IRE-BP (112).
Studies on J2E cells similarly show that the level of transferrin receptor increases in response to erythropoietin and that this represents an activation of gene transcription accompanied by stabilization of the mRNA for the receptor (100). In differentiating chick erythroid cells, the expression of the transferrin receptor gene is regulated predominantly at the transcriptional level (113). It is now important to determine at the molecular level how the gene for transferrin receptor is controlled transcriptionally during differentiation, and whether erythroid-specific factors play a role, and, in addition, to determine the mechanism for increased mRNA stability. As described, ferritin synthesis in nonerythroid cells is predominantly regulated by iron at a post-transcriptional level. There is evidence that such a mechanism also operates in Me2SO-treated MEL cells, wherein translation of ferritin can be modulated in response to iron (114). Translational control of ferritin synthesis therefore seems likely during erythropoiesis, but there is also an increase in transcription of this gene (114a). The promoter for the H subunit of ferritin contains Sp1 and CAAT-box protein-binding sites, typical of a housekeeping promoter, and this gene is expressed ubiquitously. The increased expression of this gene during erythropoiesis is due to an enhancer region identified about 4.5 kb upstream from the transcription start site, but the nuclear factors that bind to this region remain unidentified.
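
The sequence features given for the IRE in this section (a six-nucleotide loop with the conserved sequence CAGUGN, and an unpaired C five nucleotides 5' of the loop) can be illustrated with a simple sequence scan. The following Python sketch is offered only as an illustration: the names IRE_CORE and find_ire_cores are hypothetical, the test sequence is a made-up fragment and not a real mRNA, and the scan checks primary sequence only. It does not test whether the flanking nucleotides can pair to form the stem, so any hit is at most a candidate.

```python
import re

# Candidate IRE core: the unpaired C, then five nucleotides of upper stem,
# then the six-nucleotide loop CAGUGN (N = any nucleotide).
# The lookahead lets overlapping candidates be reported.
IRE_CORE = re.compile(r"(?=(C[ACGU]{5}CAGUG[ACGU]))")


def find_ire_cores(rna):
    """Return (0-based position, matched core) for each candidate IRE core.

    Assumption: only the linear motif is checked; stem base pairing,
    which a real IRE requires, is not evaluated.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in IRE_CORE.finditer(rna)]


if __name__ == "__main__":
    # Illustrative fragment (hypothetical, not taken from any database).
    fragment = "GGCUUCAACAGUGCUUGG"
    print(find_ire_cores(fragment))
    # The same fragment with a mutated loop gives no candidates.
    print(find_ire_cores("GGCUUCAACAGAGCUUGG"))
```

A scan of this kind could serve as a first filter before structure prediction, which would be needed to confirm that a stable stem surrounds the candidate loop.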

C. Heme Synthesis Is Coupled to Iron Availability

The erythroid ALAS plays a key regulatory role in providing a link between the availability of cytosolic iron and subsequent mitochondrial heme formation in erythroid cells. The 52-nucleotide 5'-untranslated region of the human ALAS-2 mRNA is composed almost entirely of an IRE structure (28) that is very similar to the highly conserved IRE found in ferritin and transferrin receptor mRNAs. An almost identical IRE is present in the 5'-untranslated region of the mouse (8, 115) and chicken (35) erythroid mRNAs (Fig. 5). No such IRE sequence is present in the mRNA for ALAS-1 or in the mRNAs for other heme pathway enzymes. When the IRE sequence from human ALAS-2 is fused to the heterologous human growth hormone cDNA sequence and transcripts generated and translated in vitro, the addition of purified IRE-BP strongly inhibits translation of the mRNA in wheat-germ extracts (116). This inhibition is not observed if the IRE in the mRNA is mutated in the conserved loop. These results, together with RNA gel mobility-shift experiments (28), suggest that IRE-BP binds to the IRE to prevent initiation of mRNA translation. There is iron-dependent translation of ALAS-2 mRNA in MEL cells and the IRE, fused to a reporter mRNA, confers translational control when transfected into MEL cells (114). The IRE-BP bound to the IRE prevents association of the small ribosomal subunit with the mRNA (116a). However, it is not clear whether the IRE-BP directly inhibits binding of the small ribosomal subunit, or whether it inhibits binding of cap-site initiation factors required for attachment of the small ribosomal subunit. It is noteworthy that the location of the IRE is important for translational control.
If the ALAS-2 IRE is relocated further from the cap site (about 60 nucleotides), in vitro translation is not inhibited by IRE-BP, indicating that the ribosome can now bind to the 5’ cap site of the mRNA, dislodge the IRE-BP, and traverse the stem-loop structure of the IRE (116). An intriguing model can be proposed for the regulation of ALAS-2 in developing erythroid cells (see Fig. 6). In response to erythropoietin, transcription of the ALAS-2 gene is activated (together with genes for other heme pathway enzymes and globin). Iron transported into erythroid cells by the transferrin receptor forms an iron pool, some of which is free or bound to ferritin. Translation of ALAS-2 mRNA takes place if the level of free intracellular iron is sufficient to modify the IRE-BP, which no longer binds RNA


FIG. 6. Proposed model for the translational regulation of ALAS-2 and globin mRNAs in erythroid cells. Erythropoietin (Epo) activates transcription of ALAS-2 and globin genes; Tf, transferrin; TfR, transferrin receptor; IRE-BP, iron-responsive-element binding protein. The iron pool regulates ALAS-2 mRNA translation and is incorporated into protoporphyrin to give heme. The cytosolic heme pool may act in a negative fashion to regulate its own synthesis or to regulate globin mRNA translation.

but assumes aconitase activity. The level of mitochondrial ALAS-2 therefore increases, and because this enzyme is rate-limiting, the amount of protoporphyrin also increases. Ferrochelatase then catalyzes the insertion of available iron into the newly synthesized protoporphyrin, to give heme. Hence, through the regulation of ALAS-2 translation, erythroid heme production is coupled to iron availability. This mechanism would ensure


that, with iron deficiency, there is not a large accumulation of protoporphyrin. The observation that ALAS activity is low in erythroid cells of patients with iron deficiency is consistent with the proposal (117). The role of ferritin in erythroid cells is not clear; it may serve a detoxification purpose. Most studies show that erythroid cells are unable to mobilize intracellular ferritin iron for heme synthesis (118).

D. Does Heme Regulate Erythroid ALAS Synthesis?

In liver, heme regulates its own level by controlling the expression of ALAS through a negative feedback mechanism (Fig. 4). Can heme in erythroid cells regulate the production of ALAS and hence the rate of heme formation? As already described, heme inhibits the import of erythroid ALAS into mitochondria. There is also evidence (119) that, in rabbit reticulocytes, heme exerts a negative feedback control by inhibiting the release of iron from the internalized iron-transferrin receptor complex. This would be expected to reduce the iron pool and translation of the erythroid ALAS-2 mRNA (see Fig. 6). Whether this mechanism operates in differentiating erythroid cells (as depicted in Fig. 6) is not known.

There is no evidence that the production of ALAS-2 mRNA is negatively regulated by heme during erythropoiesis. Indeed, there is a small increase of ALAS-2 mRNA in MEL cells treated with heme (101), but whether heme acts directly on the gene or indirectly through its ability to stimulate MEL cell differentiation (120) is unknown. When Me2SO-differentiating MEL cells are treated with succinylacetone, a specific inhibitor of protoporphyrin and thus of heme synthesis, the level of ALAS-2 mRNA remains essentially unaltered, in contrast to the effect of Me2SO alone (8, 101). Interestingly, total ALAS enzyme activity can be increased at least 10-fold by the addition of succinylacetone to Me2SO-treated MEL cells compared with Me2SO alone, and heme prevents this (121, 122). It is possible that more cytosolic iron becomes available during succinylacetone treatment, thus allowing increased ALAS-2 translation. However, it is unclear how heme would prevent this. Alternatively, it can be proposed that there is a greater import of ALAS-2 into mitochondria in the presence of lowered heme levels and this results in increased levels of enzyme activity.

E. Globin Formation Is Coupled to Heme Production

Although protoporphyrin and heme formation are coupled to iron availability in erythroid cells, the translation of globin chains is, in turn, coordinated to the supply of heme (Fig. 6). The mechanism of translational control by heme is well-documented (123, 124). In eukaryote protein synthesis, initiation factor eIF-2 is required for binding the initiating Met-tRNA species to the small ribosomal subunit. In heme deficiency, an erythroid-specific eIF-2α protein kinase, referred to as heme-regulated inhibitor kinase (HRI), present in erythroid cells, is activated and catalyzes the phosphorylation of the α subunit of the initiation factor eIF-2. This leads to inhibition of translation of globin mRNA and other erythroid mRNAs. In the presence of sufficient heme, the HRI dimer binds heme and is inactivated. Two possible heme-responsive motifs in the HRI resembling those in the ALAS-2 precursor and the transcription factor HAP-1 have been identified (124). Heme may bind to the HRI through these motifs and promote dimer inactivation by catalyzing intersubunit disulfide bond formation. Hence, if heme levels are sufficient, the kinase action is prevented and initiation of translation can take place.

It is to be noted that even though translational control mechanisms governing the production of ALAS-2 and globin are assumed to occur in nucleated differentiating erythroid cells, such mechanisms would be of particular importance in mammalian reticulocytes, which lack a nucleus, but continue to synthesize heme and globin from existing mRNAs.

VII. Molecular Biology of Human Hereditary Porphyrias

The porphyrias (see Table I) are an interesting and diverse group of inherited disorders of heme biosynthesis, and in each there is a decreased activity of one of the seven enzymes of the heme pathway after ALAS (4, 5, 125). As a result, in each porphyria there is a specific pattern of accumulation and excretion of porphyrins or their precursors, porphobilinogen and ALA. Clinically, porphyrias are characterized by either neurologic abnormalities or cutaneous photosensitivity, or both. Porphyrias are also classified as hepatic or erythropoietic, depending on the primary organ in which the accumulation of porphyrins or precursors is recognized, but this is not always clear-cut. There are no reports of a disorder associated with a defect in ALAS-1. Such a mutation may be lethal even in the heterozygous state, or the defect may be undetected if induced expression of the normal gene allele can compensate for the mutant allele. We will focus on the current status of the molecular biology of the porphyrias. For a more complete picture of the medical aspects, the reader is referred to recent reviews (5, 125, 126).


TABLE I
GENE DEFECTS IN PORPHYRIA DISEASES

Porphyria                       Inheritance   Defective enzyme (% remaining of normal enzyme)   Major tissue site
Aminolevulinate dehydratase     Recessive     Aminolevulinate dehydratase (1%)                  Liver
Acute intermittent              Dominant      Porphobilinogen deaminase (50%)                   Liver
Congenital erythropoietic       Recessive     Uroporphyrinogen-III cosynthase (                 Erythropoietic
Familial cutanea tarda          Dominant      Uroporphyrinogen decarboxylase                    Liver/(erythropoietic)
Hereditary coproporphyria       Dominant      Coproporphyrinogen oxidase                        Liver
Variegate                       Dominant      Protoporphyrinogen oxidase (50%)                  Liver
Erythropoietic protoporphyria   Dominant      Ferrochelatase (20%)                              Erythropoietic

A. Hepatic Porphyrias

1. ACUTE HEPATIC PORPHYRIAS

Four porphyrias constitute the acute hepatic porphyrias: acute intermittent porphyria, hereditary coproporphyria, variegate porphyria, and the very rare porphyria due to ALA dehydratase deficiency. Clinical manifestations are often in the form of acute attacks of neurologic dysfunction and may be life threatening. These disorders are usually latent and most patients do not show symptoms. Attacks can be precipitated by a variety of exogenous and endogenous factors; these include pharmaceutical drugs (such as sulfonamides, barbiturates, phenytoin), alcohol, fasting, and steroid hormones, so that these diseases are also called inducible porphyrias. There is indirect evidence that George III, “the mad king,” suffered from acute hepatic porphyria (variegate porphyria) (127), and, more recently, it has been implied that Vincent van Gogh was afflicted with acute intermittent porphyria (128).

Of the heme pathway enzymes, apart from ALAS, porphobilinogen deaminase, coproporphyrinogen oxidase, and protoporphyrinogen oxidase show the lowest relative activities in liver compared with the other enzymes (4), and reduced levels of these three enzymes to 50% of normal are responsible for the three autosomal dominant types of the acute hepatic porphyrias (Table I). In the autosomal recessive porphyria due to ALA dehydratase deficiency, enzyme activity in tissues is markedly reduced, to about 1% of


normal. These genetic disorders have also provided important models for the study of the regulation of heme biosynthesis in the liver.

a. Acute Intermittent Porphyria. The dominant trait of acute intermittent porphyria, the commonest of the acute hepatic porphyrias, results from a decrease in the activity of porphobilinogen deaminase in all tissues to about 50% of normal. In rare double heterozygous (homozygous) cases, the activity is reduced to about 15%. In the latent state of the porphyria it is assumed that sufficient heme is synthesized by the impaired pathway in liver and other tissues to supply cellular needs and to control ALAS negatively. Compounds that precipitate a clinical attack in patients (for example, the barbiturate phenobarbital) increase the production of drug-metabolizing P450s, particularly in the liver, the major site of P450 production, and thereby increase the demand for heme. Because of the partial block at the porphobilinogen deaminase step of the heme pathway, the increased need for heme is not met and lowered heme levels result in a large induction of ALAS. (In addition, drugs may also induce ALAS directly as described earlier.) Increased levels of hepatic ALAS have been demonstrated in patients with acute intermittent porphyria; in one fatal case, the level of ALAS activity in the liver was elevated 40-fold by phenobarbital (5). The increased ALAS activity greatly accentuates the accumulation of porphobilinogen (and to a lesser extent, 5-aminolevulinate) caused by the deaminase deficiency, and these precursors are released into the plasma and excreted in urine.

The relationship between the genetic defect and the clinical manifestations, exclusively neurologic in this porphyria, remains unclear. Aminolevulinate is structurally similar to the inhibitory neurotransmitter γ-aminobutyric acid (GABA) and shows GABA agonist properties in experimental animals; however, elevated levels of it have not been demonstrated in the nervous system.
Alternatively, a deficiency of heme in neural tissues may compromise enzymatic oxidations and energy-producing reactions involving hemoproteins. It is not known whether P450s are induced in neural tissues during a porphyric attack and thus whether heme levels are indeed lowered. Third, experimental work in animals indicates that depletion of heme impairs the catabolism of L-tryptophan by the heme-dependent hepatic tryptophan pyrrolase and results in accumulation of tryptophan and serotonin to neurotoxic levels (129). More recently it has also been suggested that systems involving the hemoproteins nitric oxide synthase and guanylyl cyclase may be involved. The beneficial effect of heme administration in reversing the clinical manifestations of acute porphyric attacks may be the result of lowered hepatic ALAS levels and reduced 5-aminolevulinate and porphobilinogen levels, or an increased supply of heme to neural tissues. Starvation can cause porphyric attacks, and carbohydrate administration is beneficial, possibly by blocking induction of hepatic ALAS by an unknown “glucose effect” (5).

Acute intermittent porphyria is now understood at the molecular level. A defective gene for porphobilinogen deaminase is responsible, and there is considerable heterogeneity, with 52 distinct mutations so far discovered (130). The mutations either prevent expression of the gene or give rise to a protein that is catalytically inactive or rapidly degraded, resulting in a 50% decrease of enzyme activity in the heterozygous state. The gene for porphobilinogen deaminase gives rise to a ubiquitous and an erythroid mRNA, each having a different first exon and 13 common exons. In patients with the most prevalent type, mutations occur in the 13 common exons and result in a deficiency of the deaminase in all tissues. The largest number of mutations are in exons 10 and 12, often changing amino acids in the active site of the enzyme. The three-dimensional structure of porphobilinogen deaminase from E. coli has been defined by X-ray analysis (131). It indicates strong conservation of structurally and functionally important amino acids in the human enzyme, and thus has allowed analysis of the reported mutations for likely effects on the structure and catalytic properties of the human enzyme and correlation with manifestations of the disease. In an uncommon subtype of acute intermittent porphyria, the erythrocyte enzyme is normal, with only the ubiquitous isozyme being defective. In this type, two mutations that affect the splicing of exon 1 during pre-mRNA processing and thus impair the formation of mRNA only for the ubiquitous isozyme have been found in 7 of 10 unrelated patients (132).

b. Hereditary Coproporphyria. Heterozygotes with coproporphyria have half the normal activity of coproporphyrinogen oxidase in all tissues and rare homozygotes have 2% of normal enzyme activity. During clinical attacks, following P450 induction and derepression of hepatic ALAS levels, coproporphyrin, the substrate of the defective enzyme, accumulates and can give rise to skin photosensitivity. Substantial amounts of 5-aminolevulinate and porphobilinogen also accumulate, because porphobilinogen deaminase becomes rate limiting and imparts a second bottleneck to the pathway. Two mutations in coproporphyrinogen oxidase that result in hereditary coproporphyria have been identified (106a, 133). In a patient with a homozygous form, a point mutation resulting in an arginine-to-tryptophan substitution produces a protein with reduced catalytic activity (133). A different mutation, which results in exon skipping, was found in a patient with the more common heterozygous form (106a).


c. Variegate Porphyria. Heterozygotes with this porphyria have half the normal activity of protoporphyrinogen oxidase in all tissues; in homozygotes enzyme activity is reduced to 10%. Protoporphyrin accumulates and ALAS is also derepressed, resulting in a biochemical and clinical phenotype similar to that of hereditary coproporphyria. However, photosensitivity is seen in up to 80% of patients. A cDNA clone for this enzyme has not yet been isolated to permit identification of molecular defects.

d. ALA Dehydratase Deficiency Porphyria. This porphyria is the last porphyria to be discovered and so far only four cases have been reported. Due to the severe reduction of enzyme activity seen only in homozygotes, the enzyme becomes rate limiting in the production of heme, and its substrate, 5-aminolevulinate, accumulates. Clinical manifestations are limited to the nervous system. Distinct maternal and paternal point mutations have been identified in the coding region of the dehydratase gene in two patients, resulting in the production of a protein that has markedly diminished activity or is very unstable.

2. PORPHYRIA CUTANEA TARDA

Two variants of this disease, familial and acquired, are recognized. We focus on the familial form (also called type II), which accounts for about 20% of cases. Most affected individuals have half the normal activity of uroporphyrinogen decarboxylase in all tissues, causing accumulation of uroporphyrin and heptacarboxylic porphyrin in the liver. These porphyrins are released into the circulation and give rise to skin fragility and blister formation because they promote oxygen radical generation and complement activation on photoexcitation. Like the acute hepatic porphyrias, the disease is usually latent; additional factors such as liver disease, alcoholism, estrogen ingestion, or sufficient iron overload are necessary for enough uroporphyrin to accumulate to produce symptoms. It is proposed that iron somehow lowers the decarboxylase activity in the liver, perhaps through direct oxidant effects on the enzyme or by promoting oxidation of uroporphyrinogen to inhibitory levels of uroporphyrin. Venesections abolish both the symptoms and the uroporphyrin accumulation. Neurologic symptoms do not occur in this porphyria, and 5-aminolevulinate and porphobilinogen do not accumulate.
Presumably there is sufficient residual decarboxylase to cope with the demand for heme without induction of ALAS even when the patient is exposed to P450-inducing drugs.

Studies show that a defective gene for uroporphyrinogen decarboxylase underlies this disease. A point mutation in the coding region of the gene that causes instability of the mutant protein has been identified (136). In addition,


a splice-site mutation producing deletion of exon 6 in the mRNA and resulting in an inactive, unstable protein has been found in several kindred (137). A presumed homozygous form of familial porphyria cutanea tarda (also called hepatoerythropoietic porphyria) is much rarer. It is characterized by severe enzyme deficiency (activity less than 25% of normal) and symptoms occur spontaneously early in life. The disease is biochemically heterogeneous; to date, four different point mutations and/or a deletion in the uroporphyrinogen decarboxylase gene have been described in four separate families (138). Interestingly, uroporphyrinogen decarboxylase gene mutations that cause porphyria cutanea tarda in the heterozygote state have not been found in patients with hepatoerythropoietic porphyria, possibly because these mutations would be lethal in the homozygous state (139).

B. Erythropoietic Porphyrias

The erythropoietic porphyrias comprise congenital erythropoietic porphyria and erythropoietic protoporphyria. These porphyrias have the common feature of cutaneous photosensitivity and patients do not suffer neurologic attacks.

1. CONGENITAL ERYTHROPOIETIC PORPHYRIA

This is a very rare disease and is manifested only in homozygotes. The residual uroporphyrinogen-III synthase activity is usually below 10% and results in marked accumulation of uroporphyrin I, which is formed nonenzymatically from the linear tetrapyrrole hydroxymethylbilane, the substrate of the enzyme. In most cases, very large amounts of the uroporphyrin are produced and cause pronounced cutaneous phototoxic injury of the skin from infancy, resulting in grotesque scarring of light-exposed areas. There is also variably severe anemia from a defective erythropoiesis and/or increased destruction of erythrocytes.

A defective uroporphyrinogen-III synthase gene underlies the disorder. A large heterogeneity of mutations has been found in 28 unrelated patients; these include point mutations in the coding region and insertions and deletions of sequences in the mRNA (140, 141). Most patients are compound heterozygotes. One point mutation causing a cysteine-to-arginine substitution at position 73 interestingly accounts for about 50% of the mutant alleles; most homozygotes for this mutation are severely affected, whereas a threonine-to-methionine substitution at position 228 in one allele seems to be associated with a less severe form of the disease (140). Overall, no clear relationship between genotype and clinical phenotype has yet been established.


2. ERYTHROPOIETIC PROTOPORPHYRIA

This porphyria follows autosomal dominant inheritance and is clinically characterized by an acute photosensitivity on sun exposure, in contrast to the other porphyrias with cutaneous manifestations. Some individuals display a mild hypochromic, microcytic anemia. A subset of patients develop fatal liver failure by mid-life or before, and usually have more severe biochemical abnormalities. In this disorder, ferrochelatase activity is less than the half-normal level expected for an autosomal dominant defect. The functional unit of the enzyme appears to be a dimer (142), and an interaction between normal and mutant subunits that would decrease the activity by more than 50% has been postulated. Alternatively, there may be additional genetic defects, because carriers in the same family usually do not have symptoms.

The deficiency results in accumulation of protoporphyrin mainly in erythroid cells at the reticulocyte stage of development. The free erythrocyte protoporphyrin promptly diffuses out of the young red cells on their release from the bone marrow into the circulation. Photoexcitation of the free protoporphyrin in skin capillaries generates reactive oxygen species that damage cell membranes. In addition, protoporphyrin activates the complement cascade, which elicits an inflammatory response that further damages skin.

To date, 12 molecular defects in the ferrochelatase gene have been found. Most of these cause aberrant splicing, with loss of exon 2, 3, 7, or 10, resulting in defective enzyme proteins. In two cases point mutations in one or both alleles produced an enzyme that is unstable or lacks activity (143, 144). The mutations have not been correlated with severity of clinical phenotype, but one report suggests that certain mutations occurring in both alleles may place patients at special risk of developing liver failure (145).
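The postulated interaction between normal and mutant ferrochelatase subunits can be illustrated with a short calculation. The sketch below is only a toy model: it assumes that a heterozygote produces normal and mutant subunits in equal amounts, that subunits pair at random, and that mutant/mutant dimers are inactive. None of these assumptions is established in the text, and the function name `residual_dimer_activity` is hypothetical.

```python
def residual_dimer_activity(wild_type_fraction: float,
                            heterodimer_activity: float = 0.0) -> float:
    """Fraction of normal activity for a dimeric enzyme under random pairing.

    wild_type_fraction: share of subunits that are normal (0.5 in a
        heterozygote, assuming equal expression of both alleles).
    heterodimer_activity: relative activity of a normal/mutant dimer
        (0.0 = fully inactive, 0.5 = the normal subunit works independently).
    Mutant/mutant dimers are assumed to have no activity.
    """
    if not 0.0 <= wild_type_fraction <= 1.0:
        raise ValueError("wild_type_fraction must lie between 0 and 1")
    p = wild_type_fraction
    q = 1.0 - p
    # Random pairing gives dimer frequencies p^2 (normal/normal),
    # 2pq (normal/mutant) and q^2 (mutant/mutant).
    return p * p + 2.0 * p * q * heterodimer_activity


if __name__ == "__main__":
    # Mutant subunit inactivates its normal partner.
    print(residual_dimer_activity(0.5))
    # Subunits act independently of one another.
    print(residual_dimer_activity(0.5, 0.5))
```

Under these assumptions a heterozygote with a fully inactivating mutant subunit retains a quarter of normal activity, whereas independent subunits leave half. This shows how a dimeric enzyme could fall below the 50% level, although the model says nothing about which mechanism actually operates.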

C. Perspectives on Porphyrias

There have been rapid advances in the identification of specific gene defects associated with all of the porphyrias except for variegate porphyria, for which the gene corresponding to the deficient enzyme has not been investigated. In each of the other porphyrias there is a marked heterogeneity in the mutations identified, and this reflects the clinical heterogeneity observed. The molecular basis for the neurologic symptoms of the acute porphyrias remains unclear. Whether excessive 5-aminolevulinate (and porphobilinogen) or a deficiency of heme in neural tissues or some other mechanism is responsible remains to be established. Transgenic animal experiments are underway and should help solve this enigma. It is noteworthy that neurologic defects are not associated with all the porphyrias, even when the level of activity of the heme pathway enzyme is


very low. Presumably there is sufficient enzyme activity remaining to cope with the demand for heme so that ALAS is not induced. In addition, heme synthesis in erythroid cells is only minimally affected in the various porphyrias and severe anemia is rare. Tissue-specific features and levels of the deficient enzymes can be postulated to account for this. The observed relationship between porphyria cutanea tarda and iron stores is intriguing, but a molecular basis for this has not been established. It needs to be determined whether iron reduces expression for the wild-type allele in the liver or inhibits its enzyme activity. Similarly, it is not clear why patients with erythropoietic protoporphyria have levels of ferrochelatase that are substantially less than 50%. Also, it is not clear why “carriers” of the same defect in families have no symptoms; once again, there is the possibility that expression of the wild-type allele for ferrochelatase is somehow inhibited, or additional genetic defects remain undetected. Finally, development of methods for accurate prenatal diagnosis as well as gene therapy will be most important for the severe and debilitating forms of porphyria, such as congenital erythropoietic porphyria, homozygous familial porphyria cutanea tarda, and erythropoietic protoporphyria.

D. Lead Poisoning

Lead intoxication produces neurological symptoms that resemble those of acute hepatic porphyria. Several of the enzymes of the heme pathway are inhibited by lead to varying degrees, and this is reflected in vivo by the accumulation and excretion of substrate intermediates in the pathway (10). ALA dehydratase is most sensitive to inhibition by lead. The other enzymes that can be inhibited by lead levels achieved in vivo are coproporphyrinogen oxidase and, in severe cases, porphobilinogen deaminase. Protoporphyrin accumulates as zinc protoporphyrin, indicating that ferrochelatase activity per se is not impaired but rather that iron is not available to the enzyme. Overall, these effects of lead usually do not greatly compromise heme production, at least in erythroid cells, so as to produce anemia, and although frequently stated in the past, a ring sideroblast defect is not seen in lead poisoning.

VIII. Molecular Biology of Hereditary Sideroblastic Anemia

The sideroblastic anemias are a group of inherited or acquired disorders and are characterized by anemia of variable severity, hypochromic/microcytic erythrocytes in the blood, and ring sideroblasts in the bone marrow (10). Ring sideroblasts are erythroblasts in which the mitochondria are loaded


with iron, and often assume a perinuclear distribution. Thus, in contrast to iron-deficiency anemia, there is an abundance of iron in erythroid cells of patients with sideroblastic anemia. We focus here on hereditary sideroblastic anemia, which, in most cases, follows an X-linked pattern of inheritance and thus is manifested almost exclusively in males.

It has been considered for many years that X-linked sideroblastic anemia represents a lowered amount of one of the enzymes of the heme pathway with reduced production of protoporphyrin. Although globin synthesis also appears impaired in erythroid cells, this is thought to be secondary to reduced heme formation, because in vitro studies have shown that reticulocytes from the patients retain the capacity to synthesize globin provided that exogenous heme is added (10). The delivery of iron to erythroid cells via endocytosis of the iron-transferrin receptor complex is also apparently normal (10). Therefore, it can be proposed that the reduced amount of hemoglobin reflects impaired production of protoporphyrin so that when iron enters mitochondria, it is not incorporated into heme and accumulates.

Approximately one-third of patients with hereditary sideroblastic anemia respond to pyridoxine administration (10). This finding implicates ALAS in the disease because the enzyme requires pyridoxal phosphate as a cofactor. In addition, the level of ALAS activity measured in bone marrow lysates of patients is often reduced and may be restored by addition of pyridoxal phosphate and/or after pyridoxine administration. Because hereditary sideroblastic anemia follows an X-linked pattern of inheritance, the localization of the ALAS-2 gene to the X chromosome (38, 39) makes it highly probable that defects in this gene underlie the disease.

We have investigated several families with X-linked sideroblastic anemia. Initial studies were made in two probands; one patient (XSA-7 in Fig. 7) is partially responsive to pyridoxine and the other (XSA-12) is unresponsive. We observed that the amounts of mRNA for ALAS-2 and globin were low in bone marrow cells in both cases (146), but the reason for this is not clear. A female patient (XSA-6) with moderate anemia that is partly responsive to pyridoxine administration was examined several months after pyridoxine was discontinued. Although the amounts of ALAS-2 and globin mRNA were normal, the activity of ALAS in bone marrow cells was very low and was significantly enhanced by pyridoxal phosphate in vitro (146). A fourth patient (XSA-4) with severe anemia completely responded to pyridoxine and was studied before and after treatment with this vitamin. Before treatment the level of ALAS activity in bone marrow lysates was about 50% of the normal level; after treatment with pyridoxine and remission of the anemia the enzyme activity was restored to above normal levels (147). To investigate whether a defect in the gene for ALAS-2 is responsible for the disease, the mRNA for ALAS-2 was isolated from the bone marrow of the


[Figure: exon map of the ALAS-2 gene (exons 1-11) showing the positions of the mutations found in XSA patients]

Position   Nucleotide change   Amino acid change
Exon 5     T545C               F165L
Exon 5     G566A               A172T
Exon 7     A947C               K299Q
Exon 8     C1215G              T388S
Exon 9     C1283T              R411C
Exon 9     G1299A              G416D
Exon 9     G1395A              R448Q
Exon 9     C1406A              R452S
Exon 9     C1406T              R452C
Exon 9     G1407A              R452H
Exon 9     T1479A              I476N

FIG. 7. Location of point mutations identified in the ALAS-2 gene of patients with X-linked sideroblastic anemia (XSA) and the resulting amino-acid substitutions. The pyridoxal phosphate binding site is located in exon 9.

fourth patient (XSA-4) and was sequenced (147). A single base change (C to G) was identified in the mRNA and subsequently in the genomic DNA (Fig. 7). The same mutation was found in the genomic DNA of an affected maternal nephew and a heterozygous daughter of the patient, but not in other unaffected male members of the family. A genetic linkage analysis was also carried out using a highly polymorphic dinucleotide repeat sequence identified in intron 7 of the ALAS-2 gene (148). This analysis showed that the same ALAS-2 allele was present in all affected male family members as well as in obligate female carriers, but not in unaffected male relatives (147). The mutation results in the replacement of a threonine by a serine residue at position 388 in exon 8. This threonine is invariant (among all ALAS proteins) and is located only three residues from the lysine that binds pyridoxal phosphate. Expression studies in E. coli showed that the mutant ALAS-2 enzyme has a lower activity compared with the wild type (147). We have subsequently identified (unpublished; 149) four different single base changes in exon 9 of the ALAS-2 gene in seven other kindred with


X-linked sideroblastic anemia (XSA-6 to XSA-12 in Fig. 7). Of interest is the female proband (XSA-6) who is heterozygous at the mutation site in the genomic DNA, but only the mutated gene is expressed in the mRNA, presumably due to skewed inactivation of the X chromosomes. Her mother and daughter are heterozygous at the genomic DNA site, but, in contrast, express both the mutant and wild-type ALAS-2 mRNAs and are not clinically affected.

A point mutation in exon 9 of the ALAS-2 gene in a male patient with pyridoxine-responsive X-linked sideroblastic anemia (XSA-13 in Fig. 7) has been found (150). Other mutations have been reported in exons 9, 7, and 5 (151, 152) and correspond to patients XSA-1, -3, and -5 in Fig. 7. The substitution of an asparagine for an isoleucine at position 476 results in an enzyme with reduced affinity for pyridoxal phosphate (150). In our studies of the patient XSA-4, the mutation at position 388 apparently does not reduce the affinity of the mutant enzyme for the cofactor. Rather, a computer analysis suggests that the mutation introduces a conformational change in the vicinity of the lysine residue, and this may alter substrate binding (147). How the administration of pyridoxine in this patient raises bone marrow levels of the mutant enzyme is not clear, but perhaps the resulting increase in cellular pyridoxal phosphate leads to stabilization of the protein.

All of the reported amino-acid alterations lie in regions that are highly conserved among the eukaryote ALAS proteins. Whether these mutations lead to a reduction in the catalytic activity of ALAS-2 or to reduced protein stability is being investigated, as is the role of pyridoxal phosphate to determine why some patients respond to pyridoxine and others do not. The heterogeneity of mutations observed so far in the ALAS-2 gene is expected, because patients with the disease have variable degrees of anemia and show variable responses to pyridoxine treatment.
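The nucleotide and amino-acid coordinates listed in Fig. 7 can be related by simple arithmetic. The sketch below assumes that nucleotides are numbered from the first base of the mRNA and that the coding sequence begins immediately after the 52-nucleotide 5'-untranslated region described earlier. This numbering convention is an inference, not something stated in the text, and the helper names `codon_position` and `is_consistent` are hypothetical.

```python
import re
from typing import Tuple

UTR_LENGTH = 52  # assumed length of the 5'-untranslated region (nucleotides)

# Mutations as listed in Fig. 7 (nucleotide change, amino-acid change).
FIG7 = [
    ("T545C", "F165L"), ("G566A", "A172T"), ("A947C", "K299Q"),
    ("C1215G", "T388S"), ("C1283T", "R411C"), ("G1299A", "G416D"),
    ("G1395A", "R448Q"), ("C1406A", "R452S"), ("C1406T", "R452C"),
    ("G1407A", "R452H"), ("T1479A", "I476N"),
]


def codon_position(nucleotide: int, utr_length: int = UTR_LENGTH) -> Tuple[int, int]:
    """Return (codon number, position within the codon), both 1-based."""
    offset = nucleotide - utr_length - 1  # 0-based offset into the coding sequence
    if offset < 0:
        raise ValueError("nucleotide lies in the 5'-untranslated region")
    return offset // 3 + 1, offset % 3 + 1


def is_consistent(nucleotide_change: str, amino_acid_change: str) -> bool:
    """Check that, e.g., 'C1215G' falls in the codon named by 'T388S'."""
    nt = re.fullmatch(r"[ACGT](\d+)[ACGT]", nucleotide_change)
    aa = re.fullmatch(r"[A-Z](\d+)[A-Z]", amino_acid_change)
    if nt is None or aa is None:
        raise ValueError("unrecognized mutation notation")
    codon, _ = codon_position(int(nt.group(1)))
    return codon == int(aa.group(1))


if __name__ == "__main__":
    for nt_change, aa_change in FIG7:
        codon, pos = codon_position(int(nt_change[1:-1]))
        print(nt_change, aa_change, codon, pos, is_consistent(nt_change, aa_change))
```

Every pair in Fig. 7 agrees under this offset, which supports the assumed numbering without proving it. The position within the codon may also help when checking a reported substitution against the genetic code.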
A model for the role of a defective ALAS-2 protein in hereditary sideroblastic anemia is shown in Fig. 8. As a result of a mutant ALAS-2 with lowered catalytic activity, the production of protoporphyrin and heme in the erythroid cells is reduced, and this, in turn, results in diminished production of globin chains. Iron continues to enter the mitochondria in the face of reduced protoporphyrin formation, and iron levels increase in the mitochondria. There are interesting aspects relating to the control of ALAS-2 in these erythroid cells. The mechanism by which iron enters mitochondria is not known, but in this disease state iron continues to enter in the face of its reduced utilization. A similar accumulation of iron in mitochondria is seen in rabbit reticulocytes treated with succinylacetone, the heme synthesis inhibitor (118). With reduced heme levels in these cells it may also be predicted that there will be an increase in the import of the mutant ALAS-2 protein

HEME BIOSYNTHESIS IN HIGHER VERTEBRATES

45

FIG. 8. Model for reduced hemoglobin production in X-linked sideroblastic anemia. Tf, Transferrin; TfR,transferrin receptor. A mutant ALAS-2 with reduced catalytic activity results in lower protoporphyrin (PP) and hence heme in the erythroid cell. Lowered heme leads to impaired globin mRNA translation.

precursor from its site of synthesis in the cytosol into the mitochondria. However, high levels of iron in the mitochondria may inhibit ALAS-2 activity in situ or the import of ALAS-2 protein, so that the activity of mitochondrial ALAS-2 remains inadequate. As discussed earlier, the formation of ALAS-1 in normal erythroid cells may be repressed by heme during erythropoiesis (101); it is thus possible that in cells from a sideroblastic anemia patient, synthesis of this isozyme is derepressed in the low-heme environment. This was suggested by our findings of increased levels of mRNA for ALAS-1 in two cases (146,148).In any case, induction of this isozyme is apparently unable to compensate fully for the defective ALAS-2. If the iron accumulation seen in hereditary sideroblastic anemia is a result of a defective synthesis of protoporphyrin, it could be predicted that a protoporphyria patient with a deficiency of ferrochelatase would have ring sideroblasts in the bone marrow. This has not been fully excluded. However,

46

BRIAN K. MAY ET AL.

in most protoporphyria cases there is probably no opportunity for iron to accumulate in mitochondria, because ferrochelatase is not rate limiting until very late in the maturation of the erythroid cell, and protoporphyrin accumulation is not observed until the reticulocyte stage in these patients. It should be mentioned that mutations at another site on the X chromosome may cause a form of sideroblastic anemia. A kindred has been reported with X-linked sideroblastic anemia and ataxia that shows linkage to phosphoglycerate kinase at Xq13 rather than to ALAS-2 at X p l l (153).

IX. Final Comments

All enzymes of the heme biosynthetic pathway, except for protoporphyrinogen oxidase, have been cloned from higher vertebrates. The genes encoding these enzymes are located on different chromosomes. Only the first enzyme, ALAS, is encoded by separate erythroid and housekeeping genes, but the reason, if any, for this is unknown. Future studies will shed light on how the genes for all these enzymes are regulated, in particular in erythroid and hepatic tissues. The basis for their coordinated expression during erythropoiesis will be of interest. Hepatic ALAS is induced by drugs, but the mechanism of this induction and that of the accompanying P450s is not clear. Levels of ALAS-1 are tightly regulated by heme. The underlying mechanisms by which heme prevents import of ALAS into mitochondria and destabilizes ALAS-1 mRNA will continue to attract attention. At the transcriptional level, heme can repress ALAS-1 gene transcription or induce heme oxygenase-1 transcription; whether regulatory heme proteins are involved in these processes remains to be determined. Other important posttranscriptional mechanisms have been identified: translation of ALAS-2 is dependent on iron whereas globin mRNA translation requires adequate heme. There has been rapid progress in identifying the gene defects responsible for the porphyrias; a marked heterogeneity in the mutations is one feature. No doubt, clones for the remaining enzyme, protoporphyrinogen oxidase, will soon be isolated and the genetic defects responsible for variegate porphyria elucidated. Many different point mutations in the ALAS-2 gene responsible for X-linked sideroblastic anemia have been identified. Heme pathway enzymes, in addition to porphobilinogen deaminase, will be crystallized and their three-dimensional structures determined. This will allow an analysis of the effect of observed mutations, in porphyria and sideroblastic anemia, on the structure and catalytic properties of the mutant enzymes.


ACKNOWLEDGMENTS

We thank Bill Elliott, Peter Klinken, Ric Rossi, and Dave Elder for their comments on the manuscript, Chris Matthews and other members of the BKM group for their interest and helpful discussions, and Ros Murrell for preparation of the manuscript. We wanted to raise pertinent issues in this review rather than comprehensively cover all aspects. We apologize to those workers who have not been adequately acknowledged. We would also like to thank Peter Klinken for sharing unpublished data.

REFERENCES

1. G. Balla, G. M. Vercellotti, U. Müller-Eberhard, J. Eaton and H. S. Jacob, Lab. Invest. 64, 648 (1991).
2. U. Müller-Eberhard and M. Fraig, Am. J. Hematol. 42, 59 (1993).
3. B. K. May, I. A. Borthwick, G. Srivastava, B. A. Pirola and W. H. Elliott, Curr. Top. Cell. Regul. 28, 233 (1986).
4. S. S. Bottomley and U. Müller-Eberhard, Semin. Hematol. 25, 282 (1988).
5. A. Kappas, S. Sassa, R. A. Galbraith and Y. Nordmann, in “The Metabolic Basis of Inherited Disease” (C. R. Scriver, A. L. Beaudet, W. S. Sly and D. Valle, eds.), 6th ed., p. 1305. McGraw-Hill, New York, 1989.
6. B. K. May and M. J. Bawden, Semin. Hematol. 26, 150 (1989).
7. B. K. May, C. R. Bhasker, M. J. Bawden and T. C. Cox, Mol. Biol. Med. 7, 405 (1990).
8. P. Dierks, in “Biosynthesis of Heme and Chlorophylls” (H. A. Dailey, ed.), p. 201. McGraw-Hill, New York, 1990.
9. P. Ponka and H. M. Schulman, Stem Cells 11(Suppl. 1), 24 (1993).
10. S. S. Bottomley, in “Wintrobe’s Clinical Hematology” (G. R. Lee, T. C. Bithell, J. Foerster, J. W. Athens and J. N. Lukens, eds.), 9th ed., p. 852. Lea & Febiger, Philadelphia, 1993.
11. J. W. Harris, N. Engl. J. Med. 330, 709 (1994).
12. T. V. O’Halloran, Science 261, 715 (1993).
13. D. J. Waxman and L. Azaroff, BJ 281, 577 (1992).
14. T. Helfman and V. Falanga, Am. J. Med. Sci. 306, 37 (1993).
15. A. Verma, D. J. Hirsch, C. E. Glatt, G. V. Ronnett and S. H. Snyder, Science 259, 381 (1993).
16. A. Smith, in “Biosynthesis of Heme and Chlorophylls” (H. A. Dailey, ed.), p. 435. McGraw-Hill, New York, 1990.
17. G. C. Ferreira, P. J. Neame and H. A. Dailey, Protein Sci. 2, 1959 (1993).
18. S. Taketani, H. Kohno, M. Okuda, T. Furukawa and R. Tokunaga, JBC 269, 7527 (1994).
19. S. I. Beale and J. D. Weinstein, in “Biosynthesis of Heme and Chlorophylls” (H. A. Dailey, ed.), p. 287. McGraw-Hill, New York, 1990.
20. I. A. Borthwick, G. Srivastava, B. A. Pirola, B. K. May and W. H. Elliott, Methods Enzymol. 123, 395 (1986).
21. M. Rohde, G. Srivastava, D. B. Rylatt, P. Bundesen, J. Zamattia, D. I. Crane and B. K. May, ABB 280, 331 (1990).
22. B. A. Pirola, F. Mayer, I. A. Borthwick, G. Srivastava, B. K. May and W. H. Elliott, EJB 144, 577 (1984).
23. I. A. Borthwick, G. Srivastava, A. R. Day, B. A. Pirola, M. A. Snoswell, B. K. May and W. H. Elliott, EJB 150, 481 (1985).


24. M. J. Bawden, I. A. Borthwick, H. M. Healy, C. P. Morris, B. K. May and W. H. Elliott, NARes 15, 8563 (1987).
25. G. Srivastava, I. A. Borthwick, D. J. Maguire, C. J. Elferink, M. J. Bawden, J. F. B. Mercer and B. K. May, JBC 263, 5202 (1988).
26. R. D. Riddle, M. Yamamoto and J. D. Engel, PNAS 86, 792 (1989).
27. D. S. Schoenhaut and P. J. Curtis, NARes 17, 7013 (1989).
28. T. C. Cox, M. J. Bawden, A. Martin and B. K. May, EMBO J. 10, 1891 (1991).
29. H. Munakata, T. Yamagami, T. Nagai, M. Yamamoto and N. Hayashi, J. Biochem. (Tokyo) 114, 103 (1993).
30. G. C. Ferreira and H. A. Dailey, JBC 268, 584 (1993).
32. M. Marceau, S. D. Lewis, C. L. Kojiro, K. Mountjoy and J. A. Shafer, JBC 265, 20421 (1990).
33. D. J. Maguire, A. R. Day, I. A. Borthwick, G. Srivastava, P. L. Wigley, B. K. May and W. H. Elliott, NARes 14, 1379 (1986).
34. K. Yomogida, M. Yamamoto, T. Yamagami, H. Fujita and N. Hayashi, J. Biochem. (Tokyo) 113, 364 (1993).
35. K. C. Lim, H. Ishihara, R. D. Riddle, Z. Yang, N. Andrews, M. Yamamoto and J. D. Engel, NARes 22, 1226 (1994).
36. J. G. Conboy, T. C. Cox, S. S. Bottomley, M. J. Bawden and B. K. May, JBC 267, 18753 (1992).
37. G. R. Sutherland, E. Baker, D. F. Callen, V. J. Hyland, B. K. May, M. J. Bawden, H. M. Healy and I. A. Borthwick, Am. J. Hum. Genet. 43, 331 (1988).
38. D. F. Bishop, A. S. Henderson and K. H. Astrin, Genomics 7, 207 (1990).
39. T. C. Cox, M. J. Bawden, N. G. Abraham, S. S. Bottomley, B. K. May, E. Baker, I. Z. Chen and G. R. Sutherland, Am. J. Hum. Genet. 46, 107 (1990).
40. J. C. Conboy, R. Shitamoto, M. Parra, R. Winardi, A. Kabra, J. Smith and N. Mohandas, Blood 78, 2438 (1991).
41. G. Srivastava, S. K. Kwong, K. S. Lam and B. K. May, EJB 203, 59 (1992).
42. B. Lewin, Cell 61, 1161 (1990).
43. M. Romana, A. Dubart, D. Beaupain, C. Chabret, M. Goossens and P. H. Romeo, NARes 15, 7343 (1987).
44. V. Mignotte, L. Wall, E. de Boer, F. Grosveld and P. H. Romeo, NARes 17, 37 (1989).
45. G. Braidotti, I. A. Borthwick and B. K. May, JBC 268, 1109 (1993).
46. M. J. Evans and R. C. Scarpulla, JBC 264, 14361 (1989).
47. C. A. Virbasius, J. V. Virbasius and R. C. Scarpulla, Genes Dev. 7, 2431 (1993).
48. A. H. Kaya, M. Plewinska, D. M. Wong, R. J. Desnick and J. G. Wetmur, Genomics 19, 242 (1994).
49. S. Taketani, J. Inazawa, Y. Nakahashi, T. Abe and R. Tokunaga, EJB 205, 217 (1992).
50. S. Granick, JBC 241, 1359 (1966).
51. A. J. Hansen, L. A. Elferink and B. K. May, DNA 8, 179 (1989).
52. D. Darr and I. Fridovich, J. Invest. Dermatol. 102, 671 (1994).
53. A. B. Sachs, Cell 74, 413 (1993).
54. J. W. Hamilton, W. J. Bement, P. R. Sinclair, J. F. Sinclair, J. A. Alcedo and K. E. Wetterhahn, ABB 289, 387 (1991).
55. P. D. Drew and I. Z. Ades, BBRC 162, 102 (1989).
56. N. Hayashi, Y. Kurashima and G. Kikuchi, ABB 148, 10 (1972).
57. G. Srivastava, I. A. Borthwick, J. D. Brooker, J. C. Wallace, B. K. May and W. H. Elliott, BBRC 117, 344 (1983).
58. N. Hayashi, M. Terasawa and G. Kikuchi, J. Biochem. (Tokyo) 88, 921 (1980).
59. J. T. Lathrop and M. P. Timko, Science 259, 522 (1993).


60. K. Pfeifer, K. S. Kim, S. Kogan and L. Guarente, Cell 56, 291 (1989).
61. D. Urban-Grimal, C. Volland, T. Garnier, P. Dehoux and R. Labbe-Bois, EJB 156, 511 (1986).
62. M. Kiebler, P. Keil, H. Schneider, I. J. van der Klei, N. Pfanner and W. Neupert, Cell 74, 483 (1993).
63. S. C. Dogra, C. N. Hahn and B. K. May, ABB 300, 531 (1993).
64. J. W. Hamilton, W. J. Bement, P. R. Sinclair, J. F. Sinclair, J. A. Alcedo and K. E. Wetterhahn, ABB 298, 96 (1992).
65. L. A. Mattschoss, A. A. Hobbs, A. W. Stegles, B. K. May and W. H. Elliott, JBC 261, 9438 (1986).
66. C. N. Hahn, A. J. Hansen and B. K. May, JBC 266, 17031 (1991).
67. J. S. He and A. J. Fulco, JBC 266, 7864 (1991).
68. R. Ramsden, K. M. Sommer and C. J. Omiecinski, JBC 268, 21722 (1993).
68a. E. A. Shephard, L. A. Forrest, A. Shervington, L. M. Fernandez, G. Ciaramella and I. R. Phillips, DNA Cell Biol. 13, 793 (1994).
69. P. M. Shaw, M. Adesnik, M. C. Weiss and L. Corcos, Mol. Pharmacol. 44, 775 (1994).
70. M. D. Maines, G. M. Trakshel and R. K. Kutty, JBC 261, 411 (1986).
71. G. M. Trakshel, R. K. Kutty and M. D. Maines, JBC 261, 11131 (1986).
72. I. Cruse and M. D. Maines, JBC 263, 3348 (1988).
73. W. K. McCoubrey and M. D. Maines, Gene 139, 155 (1994).
74. V. S. Raju and M. D. Maines, BBA 1217, 273 (1993).
74a. D. Lautier, P. Luscher and R. M. Tyrrell, Carcinogenesis 13, 227 (1992).
75. Y. Sun, M. O. Rotenberg and M. D. Maines, JBC 265, 8212 (1990).
76. M. D. Maines, Mol. Cell. Neurosci. 4, 389 (1993).
77. C. F. Stevens and Y. Wang, Nature 364, 147 (1993).
78. J. Alam, J. Cai and A. Smith, JBC 269, 1001 (1994).
78a. K. Takeda, S. Ishizawa, M. Sato, T. Yoshida and S. Shibahara, JBC 269, 22858 (1994).
79. J. Alam and A. Smith, JBC 267, 16379 (1992).
81. S. Okinaga and S. Shibahara, EJB 212, 167 (1993).
82. D. Metcalf, Blood 82, 3515 (1993).
83. S. B. Krantz, Blood 77, 419 (1991).
84. V. C. Broudy, N. Lin, M. Brice, B. Nakamoto and T. Papayannopoulou, Blood 77, 2583 (1991).
85. S. H. Orkin, Cell 63, 665 (1990).
86. K. L. Blanchard, J. Fandrey, M. A. Goldberg and H. F. Bunn, Stem Cells 11(Suppl. 1), 1 (1993).
87. L. J. Ko and J. D. Engel, MCBiol 13, 4011 (1993).
88. M. Merika and S. H. Orkin, MCBiol 13, 3999 (1993).
89. I. J. Miller and J. J. Bieker, MCBiol 13, 2776 (1993).
90. N. C. Andrews, H. Erdjument-Bromage, M. B. Davidson, P. Tempst and S. H. Orkin, Nature 362, 722 (1993).
91. G. Stamatoyannopoulos and A. W. Nienhuis, in “The Molecular Basis of Blood Diseases” (G. Stamatoyannopoulos, A. W. Nienhuis, P. W. Majerus and H. Varmus, eds.), p. 107. Saunders, Philadelphia, 1994.
91a. J. J. Caterina, D. J. Ciavatta, D. Donze, R. R. Behringer and T. M. Townes, NARes 22, 1006 (1994).
92. B. A. Witthuhn, F. W. Quelle, O. Silvennoinen, T. Yi, B. Tang, O. Miura and J. N. Ihle, Cell 74, 227 (1993).
93. T. Chiba, Y. Ikawa and K. Todokoro, NARes 19, 3843 (1991).


94. N. Dalyot, E. Fibach, A. Ronchi, E. A. Rachmilewitz, S. Ottolenghi and A. Oppenheim, NARes 21, 4031 (1993).
95. H. Fujita, M. Yamamoto, T. Yamagami, N. Hayashi, T. R. Bishop, H. De-Verneuil, T. Yoshinaga, S. Shibahara, R. Morimoto and S. Sassa, BBA 1090, 311 (1991).
96. B. Grandchamp, C. Beaumont, H. De-Verneuil and Y. Nordmann, JBC 260, 9630 (1985).
97. H. Kohno, T. Furukawa, T. Yoshinaga, R. Tokunaga and S. Taketani, JBC 268, 21359 (1993).
98. S. P. Klinken, N. A. Nicola and G. R. Johnson, PNAS 85, 8506 (1988).
99. S. J. Busfield, K. J. Riches, A. J. Sainsbury, E. Rossi, P. Garcia-Webb and S. P. Klinken, Growth Factors 9, 87 (1993).
100. P. Klinken, S. Busfield, U. Keil, T. Farr, S. Colley, B. Callus, D. Chappel and J. Papadimitriou, J. Comput.-Assist. Microsc. 5, 81 (1993).
101. H. Fujita, M. Yamamoto, T. Yamagami, N. Hayashi and S. Sassa, JBC 266, 17494 (1991).
102. I. Max-Audit, J. F. Eleouet and P. H. Romeo, JBC 268, 5431 (1993).
103. M. C. Barton, N. Madani and B. M. Emerson, Genes Dev. 7, 1796 (1993).
104. S. Chrétien, A. Dubart, D. Beaupain, N. Raich, B. Grandchamp, J. Rosa, M. Goossens and P. H. Romeo, PNAS 85, 6 (1988).
105. S. F. Tsai, D. F. Bishop and R. J. Desnick, PNAS 85, 7049 (1988).
106. P. Martasek, J. M. Camadro, M.-H. Delfau-Larue, J. B. Dumas, J. J. Montagne, H. De-Verneuil, P. Labbe and B. Grandchamp, PNAS 91, 3024 (1994).
106a. M.-H. Delfau-Larue, P. Martasek and B. Grandchamp, Hum. Mol. Genet. 3, 1325 (1994).
107. R. D. Klausner, T. A. Rouault and J. B. Harford, Cell 72, 19 (1993).
108. E. C. Theil, JBC 265, 4771 (1990).
109. R. Binder, J. A. Horowitz, J. P. Basilion, D. M. Koeller, R. D. Klausner and J. B. Harford, EMBO J. 13, 1969 (1994).
110. H. Hirling, B. R. Henderson and L. C. Kuhn, EMBO J. 13, 453 (1994).
111. C. C. Philpott, D. Haile, T. A. Rouault and R. D. Klausner, JBC 268, 17655 (1993).
112. R. Y. Chan, C. Seiser, H. M. Schulman, L. C. Kuhn and P. Ponka, EJB 220, 683 (1994).
113. L. N. Chan and E. M. Gerhardt, JBC 267, 8254 (1992).
114. O. Melefors, B. Goossen, H. E. Johansson, R. Stripecke, N. K. Gray and M. W. Hentze, JBC 268, 5974 (1993).
114a. C. Beaumont, A. Seyhan, A.-K. Yachou, B. Grandchamp and R. Jones, JBC 269, 20281 (1994).
115. T. Dandekar, R. Stripecke, N. K. Gray, B. Goossen, A. Constable, H. E. Johansson and M. W. Hentze, EMBO J. 10, 1903 (1991).
116. C. R. Bhasker, G. Burgiel, B. Neupert, A. Emery-Goodman, L. C. Kuhn and B. K. May, JBC 268, 12699 (1993).
116a. N. K. Gray and M. W. Hentze, EMBO J. 13, 3882 (1994).
117. T. Houston, M. R. Moore, K. E. McColl and E. Fitzsimons, Br. J. Haematol. 78, 561 (1991).
118. M. L. Adams, I. Ostapiuk and J. A. Grasso, BBA 1012, 243 (1989).
119. P. Ponka, H. M. Schulman and J. Martinez-Medellin, BJ 251, 105 (1988).
120. J. L. Granick and S. Sassa, JBC 253, 5402 (1978).
121. C. Beaumont, J.-C. Deybach, B. Grandchamp, V. DaSilva, H. De-Verneuil and Y. Nordmann, Exp. Cell Res. 154, 474 (1984).
122. C. J. Elferink, S. Sassa and B. K. May, JBC 263, 13012 (1988).
123. C. E. Samuel, JBC 268, 7603 (1993).
124. J. S. Crosby, K. Lee, I. M. London and J. J. Chen, MCBiol 14, 3906 (1994).


125. S. S. Bottomley, in “The Fundamentals of Clinical Hematology” (J. L. Spivak and E. R. Eichner, eds.), 3rd ed., p. 101. Johns Hopkins University Press, Baltimore, 1993.
126. G. R. Lee, in “Porphyria in Wintrobe’s Clinical Hematology” (G. R. Lee, T. C. Bithell, J. Foerster, J. W. Athens and J. N. Lukens, eds.), 9th ed., p. 1272. Lea & Febiger, Philadelphia, 1993.
127. I. Macalpine and R. Hunter, Br. Med. J. 8, 5479 (1966).
128. L. S. Loftus and W. N. Arnold, Br. Med. J. 303, 1589 (1991).
129. D. Litman and M. A. Correia, J. Pharmacol. Exp. Ther. 232, 237 (1985).
130. P. D. Brownlie, R. Lambert, G. V. Louie, P. M. Jordan, T. L. Blundell, M. J. Warren, J. B. Cooper and S. P. Wood, Protein Sci. 3, 1644 (1994).
131. G. V. Louie, P. D. Brownlie, R. Lambert, J. B. Cooper, T. L. Blundell, S. P. Wood, M. J. Warren, S. C. Woodcock and P. M. Jordan, Nature 359, 33 (1992).
132. F. Bourgeois, X. F. Gu, J. C. Deybach, M. P. Te Velder, F. De Rooij, Y. Noordmann and B. Grandchamp, Clin. Chem. 38, 93 (1992).
133. P. Martasek, Y. Nordmann and B. Grandchamp, Hum. Mol. Genet. 3, 477 (1994).
134. M. Plewinska, S. Thunell, L. Holmberg, J. G. Wetmur and R. J. Desnick, Am. J. Hum. Genet. 49, 167 (1991).
135. N. Ishida, H. Fujita, Y. Fukuda, T. Noguchi, M. Doss, A. Kappas and S. Sassa, J. Clin. Invest. 89, 1431 (1992).
136. J. R. Garey, J. L. Hansen, L. M. Harrison, J. B. Kennedy and J. P. Kushner, Blood 73, 892 (1989).
137. J. R. Garey, L. M. Harrison, K. F. Franklin, K. M. Metcalf, E. S. Radisky and J. P. Kushner, J. Clin. Invest. 86, 1416 (1990).
138. K. Meguro, H. Fujita, N. Ishida, R. Akagi, T. Kurihara, R. A. Galbraith, A. Kappas, J. B. Zabriskie, A. C. Tobach, L. C. Harber and S. Sassa, J. Invest. Dermatol. 102, 681 (1994).
139. H. De-Verneuil, J. Hansen, C. Picat, B. Grandchamp, J. Kushner, A. Roberts, G. Elder and Y. Nordman, Hum. Genet. 78, 101 (1988).
140. C. A. Warner, H. W. Yoo, A. G. Roberts and R. J. Desnick, J. Clin. Invest. 89, 693 (1992).
141. S. Boulechfar, V. Da Silva, J.-C. Deybach, Y. Nordmann, B. Grandchamp and H. De-Verneuil, Hum. Genet. 88, 320 (1992).
142. J. G. Straka, J. R. Bloomer and E. S. Kempner, JBC 266, 24637 (1991).
143. D. A. Brenner, J. M. Didier, F. Frasier, S. R. Christensen, G. A. Evans and H. A. Dailey, Am. J. Hum. Genet. 50, 1203 (1992).
144. H. A. Dailey, V. M. Sellers and T. A. Dailey, JBC 269, 390 (1994).
145. R. P. E. Sarkany, G. J. M. A. Alexander and T. M. Cox, Lancet 343, 1394 (1994).
146. S. S. Bottomley, H. M. Healy, M. A. Brandenburg and B. K. May, Am. J. Hematol. 41, 76 (1992).
147. T. C. Cox, S. S. Bottomley, J. S. Wiley, M. J. Bawden, C. S. Matthews and B. K. May, N. Engl. J. Med. 330, 675 (1994).
148. T. C. Cox, H. M. Kozman, W. H. Raskind, B. K. May and J. C. Mulley, Hum. Mol. Genet. 1, 639 (1992).
149. S. S. Bottomley, P. D. Wise, L. H. Whetsell and F. V. Schaefer, Blood 82(Suppl.) 433a (1993).
150. P. D. Cotter, M. Baumann and D. F. Bishop, PNAS 89, 4028 (1992).
151. P. D. Cotter, M. Baumann, D. L. Rucknagel, E. J. Fitzsimons, A. May and D. F. Bishop, Am. J. Hum. Genet. 51, A45 (1992).
152. A. May, A. Al-Sabah, E. J. Fitzsimons, T. Houston, B. Woodcock, P. D. Cotter, L. Wong and D. F. Bishop, Br. J. Haematol. 81(Suppl. 1), 148 (1994).
153. W. H. Raskind, E. Wijsman, R. A. Pagon, T. C. Cox, M. J. Bawden, B. K. May and T. D. Bird, Am. J. Hum. Genet. 48, 335 (1991).


The Flp Recombinase of the 2-μm Plasmid of Saccharomyces cerevisiae

PAUL D. SADOWSKI

Department of Medical Genetics
University of Toronto
Toronto, Canada M5S 1A8

I. Structure and Function of the 2-μm Plasmid
II. Flp Is a Conservative Site-specific Recombinase
III. Flp-mediated Recombination: The in Vitro Reaction
IV. The Mechanism of Action
   A. DNA Binding
   B. DNA Bending
   C. Synapsis
   D. Strand Cleavage
   E. Strand Exchange
   F. Strand Ligation
   G. Resolution
   H. The Structure and ...
V. Flp as a Reagent for Chromosome Engineering
References

Most strains of the yeast, Saccharomyces cerevisiae, harbor about 100 copies of an autonomously replicating plasmid, the 2-μm plasmid (1, 2). Although the plasmid confers no advantage to its host, and in fact confers a slight disadvantage, it is very stable (3, 4). Its rate of loss is about 7.6 × 10⁻⁵ per cell per generation. Because of its high copy-number and stability, the 2-μm plasmid has been a useful model for studying DNA replication, recombination, regulation of gene expression, and plasmid segregation in eukaryotic organisms. The plasmid has also served as the basis of many important autonomously replicating high-copy expression vectors in yeast (5). One of the plasmid's gene products, the Flp protein, is the subject of this review. Flp is the first eukaryotic, conservative site-specific recombinase to be subjected to detailed study.
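To give a feel for how small the quoted loss rate is, the following minimal sketch estimates the fraction of cell lineages that still carry the plasmid after a number of generations. It assumes a constant, independent loss probability of 7.6 × 10⁻⁵ per cell per generation and ignores any growth advantage of plasmid-free cells; the function name is hypothetical and chosen only for illustration.

```python
def fraction_retaining_plasmid(loss_rate: float, generations: int) -> float:
    """Fraction of lineages still carrying the plasmid after `generations`.

    Assumption (not from the text): loss events are independent and occur
    with the same probability in every generation, with no selection.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be a probability between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - loss_rate) ** generations


LOSS_RATE = 7.6e-5  # per cell per generation, as quoted in the text

for n in (10, 100, 1000):
    print(n, round(fraction_retaining_plasmid(LOSS_RATE, n), 4))
```

Under these assumptions, more than 90% of lineages would still carry the plasmid after 1000 generations, which is consistent with the description of the plasmid as very stable.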

I. Structure and Function of the 2-μm Plasmid

A. Structure

The 2-μm plasmid is a duplex circular DNA molecule of 6318 base pairs (bp) (6). It contains two inverted repeats of 599 bp that divide the molecule into two unique regions, large (L) and small (S) (see Fig. 1). Yeast cells contain approximately equal amounts of two isomeric forms (A and B) that arise from recombination across the 599-bp repeats (7). The entire 6318-bp sequence of the plasmid revealed four major open reading frames (ORFs) as well as several repeated sequences. The function of a fifth open reading frame (ORF E) is not known. The largest open reading frame (A) is required to promote formation of the two isomeric forms, and was nicknamed FLP¹ (8). Two other open reading frames (B and C) were called REP1 and REP2 and are needed along with a cis-acting sequence (originally called REP3, now STB) for maintenance of plasmid copy-number (9, 10). Although REP1, REP2, and REP3 (STB) were originally thought to be involved in DNA replication, it was subsequently shown that they are actually required for faithful partition of the plasmid from mother to daughter cells.

The actual origin of replication was located at the junction of one inverted repeat and the large unique region (8). It is cell-cycle regulated in that it is activated only once during the cell cycle (11). This sequence behaves as an autonomously replicating sequence (ARS) and contains a consensus ARS (12). Two-dimensional electrophoresis confirms that this sequence behaves as a true origin of replication (13, 14). Because plasmid-encoded functions do not seem to be required for the function of the 2-μm origin (15), this sequence serves as a useful adjunct to the study of host DNA replication.

FIG. 1. The 2-μm plasmid of Saccharomyces cerevisiae. The plasmid consists of large (L) and small (S) unique regions (solid lines) separated by two 599-bp inverted repeats (striped lines). These contain the two FRT sites. The approximate positions of the four functional open reading frames are shown (FLP, REP1, REP2, RAF) along with two other cis-acting sequences, the origin of replication (ORI) and the STB locus. An Flp-mediated recombination event (cross) converts the A isoform (top) to the B form (bottom).

¹ The names Flp and FLP arose from the fact that the gene product (Flp) causes an inversion (flip-flop) of a segment of the 2-μm plasmid.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
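The relationship between the two isomeric forms described in this section can be illustrated with a small sketch. It treats the plasmid as an ordered list of segments, each with an orientation, and models recombination across the inverted repeats as an inversion of everything between them. The segment labels, the choice of which arrangement is called "A", and the function name are assumptions made for illustration; the sketch ignores the actual sequence and the position of the FRT site within each repeat.

```python
from typing import List, Tuple

Segment = Tuple[str, int]  # (label, orientation: +1 or -1)

# Hypothetical starting arrangement; "IR" marks each 599-bp inverted repeat.
FORM_A: List[Segment] = [("L", +1), ("IR", +1), ("S", +1), ("IR", -1)]


def flp_invert(plasmid: List[Segment]) -> List[Segment]:
    """Invert the stretch lying between the two inverted repeats.

    Inversion reverses the order of the intervening segments and flips
    the orientation of each one; the repeats themselves stay in place.
    """
    repeat_positions = [i for i, (label, _) in enumerate(plasmid) if label == "IR"]
    if len(repeat_positions) != 2:
        raise ValueError("expected exactly two inverted repeats")
    first, second = repeat_positions
    between = plasmid[first + 1 : second]
    inverted = [(label, -orientation) for label, orientation in reversed(between)]
    return plasmid[: first + 1] + inverted + plasmid[second:]


form_b = flp_invert(FORM_A)
print(form_b)              # S is now inverted relative to L
print(flp_invert(form_b))  # a second inversion restores the starting form
```

Because a second inversion restores the original arrangement, repeated recombination would be expected to leave a population with roughly equal amounts of the two forms, in line with the observation cited above.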

B. Function of Flp

1. Flp ACTS IN A SITE-SPECIFIC MANNER

Disruption of the FLP open reading frame abolishes the isomerization of the 2-μm plasmid (8). It was not certain whether the recombinational inversion event was due to the occurrence of general meiotic or mitotic recombination anywhere across the 599-bp inverted repeats, or whether recombination was restricted to a specific region therein (8). Transposon and deletional mutagenesis of the plasmid proved (16) that Flp is a site-specific recombinase. By analyzing the recombination of 2-μm plasmids bearing transposons at various positions throughout the inverted repeats, it was possible to localize the event to a restricted (65-bp) region within the entire 599-bp repeats. These conclusions were confirmed by deletion analysis. This region was later called the FRT site (Flp Recognition Target) (17). The establishment of an in vitro assay for the Flp recombinase confirmed these in vivo studies (18) and has led to a detailed examination of the mechanism of action of the protein by three groups. These studies form the bulk of this essay.

2. FUNCTION AND REGULATION OF FLP

a. The Futcher Model. The 2-μm plasmid attains a high copy-number even though its origin fires only once during the cell cycle (11). This paradox was addressed by the model of Futcher (19), who postulated that Flp-mediated recombination plays an important role in plasmid amplification (Fig. 2). Divergent bidirectional replication begins at the origin, and replication of the nearby FRT site gives a theta (Cairns) replication intermediate that contains three FRT sites (Fig. 2A and 2B). Flp-mediated recombination inverts one replication fork with respect to the other with the result that the formerly divergent replication forks now are oriented in the same direction. As the forks “chase” one another around the circle, multimers of the 2-μm plasmid are generated. The Flp recombinase can then catalyze recombination across two directly oriented FRT sites to generate monomeric plasmid molecules. This model received experimental support from other studies (20, 21). A computer simulation of the possible replication intermediates generated by Flp revealed a much more complicated array of structures than portrayed in Fig. 2 and the original Futcher model (22). These studies also suggest that up to six copies of the plasmid can arise during each cell generation.

FIG. 2. The Futcher model (19) for amplification of the 2-μm plasmid. DNA replication begins at ori and bidirectional replication proceeds past an adjacent FRT site (A, B). Flp mediates recombination between the unreplicated and one newly replicated FRT site (C) with the result that the previously divergent replication forks are now oriented in the same direction (D). The forks “chase” one another around the circles, generating multimers of the plasmid by a rolling-circle mechanism (E). These multimers can then be resolved to monomers via Flp-mediated recombination across two directly oriented FRT sites. (Reproduced from Julie Dixon’s 1994 Ph.D. thesis by permission.)

b. Regulation of Expression of the FLP Gene. Excessive amplification of the 2-μm plasmid is toxic to yeast cells (21). Because overproduction of the Flp protein leads to uncontrolled amplification of the 2-μm plasmid (23), it is not surprising that expression of the FLP gene is subject to tight control. Transcription of FLP is negatively regulated by the products of REP1 and REP2 and positively regulated by the ORF-D (RAF) product (21, 24, 25). The REP1 and REP2 gene products also negatively regulate expression of their own genes and the ORF-D (RAF) gene. Because the REP1 and REP2 gene products are presumed to interact with the STB site, two sequence motifs (A and B) in the FLP-REP2 intergenic regions, in the REP1 gene, and in the STB locus are possible DNA binding sites for the REP1 and REP2 gene products (26). However, direct demonstration of a DNA-binding activity of either product has not been reported. The Rep1 protein appears to be located in the karyoskeleton subfraction of the yeast nucleus, where it may function in partition of the 2-μm plasmid (27).

Thus the 2-μm plasmid possesses a rather sophisticated mechanism for control of its copy number. When the copy number drops due to inefficient partition during cell division, the levels of the Rep1 and Rep2 proteins drop. This stimulates transcription of the ORF-D gene and of the FLP gene. The Orf-D protein further stimulates expression of the FLP gene. The Flp protein then converts the theta DNA intermediate to a rolling circle with an attendant rise in the copy-number. Increased levels of the Rep1 and Rep2 proteins facilitate partition of the plasmids from mother to daughter cells while repressing the expression of the FLP and RAF genes. Possible variation of expression from the 2-μm plasmid during the cell cycle has not been examined, although passage through stationary phase seems to inhibit spontaneous loss of the plasmid (4).
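The logic of this feedback loop, combined with the Futcher model, can be sketched as a toy calculation. The sketch assumes that an origin firing once per cycle yields two copies per template, that Flp-mediated amplification (switched on only below a set point) yields a hypothetical three copies per template, and that the copies are split evenly at division. The set point of 100 copies follows the copy number quoted in the introduction; the amplified yield, the function names, and the treatment of copy number as a continuous quantity are illustrative assumptions, not values from the text.

```python
from typing import Optional


def next_copy_number(copies: float, set_point: float = 100.0,
                     amplified_yield: float = 3.0) -> float:
    """Copy number per cell after one round of replication and division."""
    # Below the set point, Rep1/Rep2 are assumed too scarce to repress FLP,
    # so each template is amplified; otherwise it simply duplicates once.
    per_template = amplified_yield if copies < set_point else 2.0
    # Even partition between mother and daughter halves the total; the cap
    # stands in for repression of FLP once the set point is reached.
    return min(copies * per_template / 2.0, set_point)


def generations_to_recover(start: float, set_point: float = 100.0,
                           amplified_yield: float = 3.0,
                           max_generations: int = 1000) -> Optional[int]:
    """Generations needed to regain the set point, or None if never reached."""
    if start >= set_point:
        return 0
    copies = start
    for generation in range(1, max_generations + 1):
        copies = next_copy_number(copies, set_point, amplified_yield)
        if copies >= set_point:
            return generation
    return None


print(generations_to_recover(10.0, amplified_yield=3.0))  # recovery with amplification
print(generations_to_recover(10.0, amplified_yield=2.0))  # no recovery without it
```

With simple duplication the copy number stays where it is after each division, so a cell that has dropped to a low copy number cannot recover in this sketch; only a yield above two per template restores the set point, which is the paradox the Futcher model addresses.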

II. Flp Is a Conservative Site-specific Recombinase

There are two categories of site-specific recombination: conservative site-specific recombination and transpositional recombination. Conservative site-specific recombination occurs at precise sequences in DNA without any gain or loss of nucleotides. This is in contrast to transpositional recombination, in which small or large sequences of the target or of the transposable element, respectively, are duplicated. [Several reviews of the site-specific recombination field have appeared recently (28-36).] With the availability of an in vitro assay and a source of pure Flp recombinase (18, 37-39), it soon became apparent that Flp is a conservative site-specific recombinase belonging to the integrase family.

A. The Integrase Family of Site-specific Recombinases

This family is named after the integrase protein of bacteriophage λ (λ-Int); this was the first conservative site-specific recombinase to be subjected to detailed biochemical analysis (40). There are some thirty Int family members that are grouped together because they share certain mechanistic and structural features (41, 42). Although there is only limited sequence homology, all family members have four absolutely conserved amino acids: arginine 191, histidine 305, arginine 308, and tyrosine 343 (the numbers refer to the amino-acid positions in the 2-μm Flp protein). It is assumed that these conserved residues underlie a common mechanism of catalysis of the breakage and reunion of the DNA target. All members of the Int family create a 5’-OH group at the site of the break in DNA and covalently attach to the adjacent 3’-phosphoryl group (28, 33, 35, 43). The nucleophile for the breakage reaction is the conserved tyrosine of the recombinase. The other three conserved residues (the RHR triad) are thought to be involved in the catalysis of cleavage and strand reunion. The energy of the phosphodiester bond is presumed to be conserved in the phosphotyrosine linkage; this accounts for the fact that site-specific recombinases do not require an external energy source.

Members of the integrase family promote inversion of the DNA between two inversely oriented target sequences (e.g., 2-μm Flp), excision of the DNA between two directly oriented target sites (e.g., phage P1-Cre, phage λ-Int, Flp), and intermolecular recombination between two molecules, each of which harbors a target site (e.g., Cre, Flp, Int) (8, 18, 44-48). The two target sequences are usually brought together in a synaptic complex by a process of “random collision” (49-51), and the substrates may be supercoiled circular molecules or relaxed circular or linear molecules. Thus the topological and structural requirements of the substrates are far less stringent than those of the members of the resolvase/invertase family (see Section II,B).

The members of the integrase family promote such diverse biological functions as integration and excision of phage genomes into and from the host chromosome (λ-Int) (52), resolution of dimeric chromosomes of a low-copy-number plasmid to aid plasmid segregation (Cre protein of prophage P1) (53, 54), resolution of replicating Col-E1 plasmids (Xer-C and -D proteins) (55), and amplification of yeast plasmids (2-μm Flp and Flp-like proteins of other 2-μm-like plasmids of yeasts) (19).

B. The Resolvase/Invertase Family of Site-specific Recombinases

As the name implies, members of this family are responsible for resolution of cointegrate intermediates formed during transposition of transposons (e.g., Tn3, γδ) (56, 57) or the inversion of segments of DNA (e.g., the Gin invertase that changes the host range of phage Mu) (58). There are at least 44 family members (M. Boocock and S. Rowland, personal communication) as defined by sequence homology. These recombinases are classified as conservative by virtue of the fact that they rearrange DNA without gain or loss of nucleotides, and conserve the energy of the broken phosphodiester bond in a covalent recombinase-DNA intermediate. However, the nucleophile is a serine that attaches covalently to the 5’-phosphoryl end of the broken DNA (cf. integrases, whose nucleophilic tyrosine attaches to the 3’-phosphoryl group). Furthermore, the structural and topological requirements of the resolvases/invertases are much more stringent than those of the integrase members. The resolvase/invertase members assemble an ordered synaptic complex in a precise way so as to achieve the correct functional result, viz., excision (resolution) for the resolvases, inversion for invertases. The substrate must be supercoiled and the target sites must be in the proper orientation with respect to one another, i.e., in direct orientation for resolvases and inverted orientation for the invertases. The proper assembly of the synaptic complex is aided by the energy of the supercoiled substrate, and the products have a very precise topology (e.g., the products of resolution by Tn3 resolvase are always singly linked catenanes). This topology reflects the precise manner in which the recombinase/substrate complexes are assembled and contrasts with the “random collision” mechanism used by the integrase family.
[There are several recent reviews of site-specific recombination giving further details of the differences between the two major families of conservative site-specific recombinases (e.g., 28, 31, 33-36).]

III. Flp-mediated Recombination: The in Vitro Reaction

A detailed knowledge of the mechanism of action of Flp has come from in vitro studies using well-characterized DNA substrates and purified protein.

A. The FRT Site

The 2-μm plasmid contains two FRT sites, each of which is embedded in one of the two 599-bp inverted repeats. The FRT site (Fig. 3) consists of three 13-bp repeats (symmetry elements a, b, and c) (59, 60). Symmetry element c is in direct orientation with respect to element b and has the identical sequence. Elements a and b are in inverted orientation surrounding the 8-bp (A.T)-rich core region; elements a and b contain one nonidentical base pair (positions +6 and -6, Fig. 3). The sites of cleavage by Flp are at the junction of the core and symmetry elements a and b (Fig. 3, vertical arrows). The core region is not symmetrical; this asymmetry dictates the directionality of the reaction, i.e., recombination between inverted FRT sites causes inversion of the DNA between them (as in the 2-μm plasmid, in vivo), whereas recombination between directly oriented sites leads to excision of the DNA between them (61). The length of the core region is also important; increasing its size by as little as 2 bp inactivates the FRT site. The sequence of the bases of the core regions of the FRT site is not important for recombination, provided the cores of the two participating sites are identical (61, 62). Another asymmetry in the FRT site is the presence of pyrimidine-rich tracts that radiate out from the core region on opposite strands (Fig. 3, stippled bars). The integrity of these tracts is important for recombination, although their precise function is unknown (63).

Although the FRT site comprises 48 bp (59), the entire sequence is not required for recombination in vivo (64) or in vitro (65, 66). Symmetry element c is dispensable and has no obvious function. Up to 7 bp of the peripheral part of symmetry elements a or b can be deleted with little effect on the reaction in vitro. Senecoff et al. (67) systematically changed every position of the symmetry elements and found that certain changes have profound effects on the affinity of Flp for the target (see also 68). Especially important were the G residue at position -5, the A residues at positions -6 and -7, and the G residue at position -11 (bottom strand of element b as shown in Fig. 3). Generally, the effects of a mutation were more marked when both symmetry elements of an FRT site bore the mutation than when a single mutation was present. Likewise, the effect was more marked when recombination between two mutated FRT sites was measured than when recombination took place between one wild-type site and a mutated site. Where examined, the changes led to reduced binding by Flp. The importance of various nucleotide positions in stages of the reaction subsequent to DNA binding is not known.
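The architecture of the FRT site described above can be checked with a few lines of code. The following is a minimal Python sketch. The nucleotide sequences are the commonly cited FRT sequence and are not given in this section, so the sequences, the assignment of the labels a, b, and c to particular repeats, and all function names should be read as assumptions made for illustration only.

```python
# Illustrative sketch only. The sequences and the a/b/c labeling are
# assumptions (commonly cited FRT sequence), not data from this section.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Top strand of the site, split into its parts.
ELEMENT_C = "GAAGTTCCTATTC"  # 13-bp repeat, direct orientation relative to b
SPACER = "C"                 # single base pair between elements c and b
ELEMENT_B = "GAAGTTCCTATTC"  # 13-bp repeat flanking the core
CORE = "TCTAGAAA"            # 8-bp (A.T)-rich core
ELEMENT_A = "GTATAGGAACTTC"  # 13-bp repeat, inverted relative to b

FRT = ELEMENT_C + SPACER + ELEMENT_B + CORE + ELEMENT_A


def mismatches(x, y):
    """0-based indices at which two equal-length sequences differ."""
    return [i for i, (p, q) in enumerate(zip(x, y)) if p != q]


def core_is_asymmetric(core):
    """A core that differs from its own reverse complement has a direction."""
    return core != reverse_complement(core)


def cores_compatible(core1, core2, length=8):
    """Rule from the text: cores keep the wild-type length and are identical."""
    return len(core1) == length == len(core2) and core1 == core2
```

Under these assumptions, comparing element b with the reverse complement of element a reports a single differing position, the second base pair out from the core, which is consistent with the one nonidentical base pair at positions +6 and -6 if the core occupies positions 1 to 4 on each side. The `cores_compatible` helper encodes only the two rules stated in the text (length and identity) and says nothing about the pyrimidine tracts.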

B. The Flp Protein

The FLP gene encodes a 423-amino-acid protein, Flp. Its in vitro activity was first detected in extracts from yeast cells bearing a plasmid containing the FLP gene expressed from the ADH1 promoter (18). However, much higher yields were subsequently obtained from Escherichia coli cells expressing the FLP gene from strong, inducible promoters (37-39, 68). The yields of protein from the T7 gene-10 promoter are about 1 mg of purified protein from 1 liter of E. coli cells, and homogeneous preparations of Flp are obtainable with a relatively simple procedure (39). The N-terminal methionine is removed in E. coli (37). The protein can be cleaved specifically by proteolysis into an amino-terminal domain of 13 kDa (P13) and a carboxy-terminal domain of 32 kDa (P32). Further digestion of P32 yields a ~21-kDa domain (P21). Certain functions have been assigned to each of these domains (see 39, 69, 70) (see Section IV,H).

C. Assays of Flp Recombinase Activity

The purification of Flp depends on the availability of convenient assays for its activity. The first assays measured the overall recombination activity on DNA substrates containing FRT sites. Initially, the inversion of a DNA segment in a circular plasmid containing two inverted FRT sites was detected as a change in position of restriction enzyme sites (18). Subsequent assays measured excision of DNA from a linear molecule or intermolecular recombination (Fig. 4) (37, 38, 47). The variety of substrates used reflects the absence of strict topological requirements (supercoiling is not needed) and the fact that the enzyme performs inversion or excision equally efficiently (51). There are no specific factors required other than buffer and a monovalent or divalent cation; specifically, no accessory protein factors are required (18). The reaction proceeds well at several temperatures and is extremely efficient; yields of recombinant products can approach 100%. Although Flp, like other site-specific recombinases, is usually used in stoichiometric excess over the substrate, there are conditions under which Flp behaves as an enzyme (turnover rate ~0.12 min⁻¹ per Flp monomer) (71). The simplicity and efficiency of the reaction have made the Flp system attractive for detailed mechanistic studies and as a reagent for genetic engineering in vivo in several organisms.

FIG. 4. Assays of the Flp recombinase. The substrates are represented by horizontal lines and the orientation of the FRT sites by horizontal arrows. The DNA sequence between the FRT sites is shown by the letters a, b, c, d, and e. (A) Flp inverts the DNA between two inverted FRT sites and (B) excises a circular molecule from between two directly oriented FRT sites. Each product contains one FRT site. (C) Flp also mediates recombination between two molecules that contain an FRT site. All substrates are shown here as linear molecules for the sake of simplicity but Flp acts equally well on supercoiled or relaxed circular molecules. The products of the Flp reaction can be detected readily by agarose or polyacrylamide gel electrophoresis.
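The three outcomes assayed in Fig. 4 follow from the relative orientation of the FRT sites, and this bookkeeping can be sketched in code. The Python sketch below is a toy model under stated assumptions: a DNA molecule is represented as a list of (name, orientation) pairs, FRT sites are treated as indivisible markers, and all names (`reverse_molecule`, `recombine_intramolecular`, `recombine_intermolecular`) are hypothetical. It ignores strand exchange within the core and any topology.

```python
# Toy model of the assays in Fig. 4; the representation and all names
# are illustrative assumptions, not a description of the enzymology.
def seg(name):
    """A DNA segment in its original (+1) orientation."""
    return (name, +1)


FRT_FWD = ("FRT", +1)
FRT_REV = ("FRT", -1)


def reverse_molecule(mol):
    """Flip a linear molecule end for end, reversing every orientation."""
    return [(name, -orient) for name, orient in reversed(mol)]


def frt_positions(mol):
    """Indices of the FRT sites in a molecule."""
    return [i for i, (name, _) in enumerate(mol) if name == "FRT"]


def recombine_intramolecular(mol):
    """Recombine the two FRT sites of one molecule.

    Directly oriented sites give excision: (kind, linear product, circle).
    Inverted sites give inversion: (kind, rearranged molecule, None).
    """
    i, j = frt_positions(mol)  # expects exactly two sites
    if mol[i][1] == mol[j][1]:
        linear = mol[:i + 1] + mol[j + 1:]   # keeps one FRT site
        circle = mol[i + 1:j + 1]            # carries the other FRT site
        return "excision", linear, circle
    inverted = mol[:i + 1] + reverse_molecule(mol[i + 1:j]) + mol[j:]
    return "inversion", inverted, None


def recombine_intermolecular(m1, m2):
    """Exchange the arms of two molecules that each carry one FRT site."""
    (i,) = frt_positions(m1)
    (j,) = frt_positions(m2)
    if m1[i][1] != m2[j][1]:
        # Align the sites in the same direction before exchanging arms.
        m2 = reverse_molecule(m2)
        (j,) = frt_positions(m2)
    return m1[:i + 1] + m2[j + 1:], m2[:j + 1] + m1[i + 1:]
```

For example, a substrate written as a, FRT, b, c, d, FRT, e yields an "excision" result (a linear a, FRT, e and a circle b, c, d, FRT) when both sites point the same way, and an "inversion" result with d, c, b between the sites when they point in opposite directions. In both cases each product site remains a single FRT, as stated in the legend to Fig. 4.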

IV. The Mechanism of Action of the Flp Protein

The wealth of mechanistic detail uncovered for site-specific recombinases has depended on the use of in vitro assays that detect partial activities of the recombinase. Several of these assays have been applied to the Flp system and some novel ones have been developed using the Flp system. The general steps in the Flp recombination reaction are shown in Fig. 5. The details of each step of the reaction, including assays, are dealt with individually in this section.

FIG. 5. Steps of the Flp recombination reaction. (A) DNA binding. A DNA molecule (two horizontal lines containing an FRT site, horizontal arrows labeled a, b, c, and open box) is shown: one, two, or three molecules of Flp (ovals) bind to the symmetry elements to form complexes I, II, and III that are detectable by gel mobility-shift electrophoresis. (B) DNA bending. Concurrently with DNA binding, Flp induces bends in the FRT site. The type-I bend occurs when Flp binds to a single symmetry element (left). The type-II bend (middle) results from Flp molecules bound across the core and is much more acute than shown here. A type-III bend (not shown here) is thought to result from protein-protein interactions between Flp molecules bound to elements b and c. (C) Synapsis. Two FRT sites loaded with Flp molecules (left) are brought together by protein-protein interactions (right) in a synaptic complex. For simplicity, the bends have been omitted in this and subsequent steps. (D) Cleavage, first strand exchange, and ligation. A pair of nicks on homologous strands (here, the bottom strands, vertical arrow) are introduced by Flp (left); the strands are swapped and ligated to the homologous strand of the partner duplex to form a Holliday intermediate (right). (E) Resolution. A second round of nicking, strand exchange, and ligation resolves the Holliday intermediate into two recombinant molecules (right). Note that the arms of the two molecules have been exchanged and that the core regions of the two recombinants are heteroduplex.

A. DNA Binding

Site-specific binding of Flp to the FRT site was detected initially by DNase and chemical footprinting (59, 72). Flp protects the entire FRT site (~50 bp) from DNase digestion (59, 60). Subsequently, extensive use has been made of gel mobility-shift assays (Fig. 6) combined with various chemical and enzymatic footprinting techniques (73, 74). Gel mobility-shift assays reveal that the symmetry elements of the FRT site are the DNA binding targets of Flp (73) (Fig. 6). Flp forms three distinct protein-DNA complexes that represent binding of monomeric Flp to one, two, or three of the symmetry elements. The initial binding to a single symmetry element is of a “loose” nature in that the DNA of the first complex is not protected against DNase. However, exonuclease footprinting showed that Flp first localizes to symmetry element b (74). A second molecule of Flp then binds to symmetry element a to form complex II. The symmetry elements a and b as well as the core in complex II are protected against DNase digestion as well as chemical modification. The entire FRT site is protected against chemical modification or DNase when complex III is analyzed.

A detailed footprinting analysis revealed that Flp makes extensive contacts with bases in both major and minor grooves throughout the entire symmetry elements (75). Furthermore, phosphate contacts that are important for recombination are present opposite the sites of cleavage (72, 74). No contacts important for DNA binding are present in the core region; moreover, two guanine residues of the core sequence are hypermethylated by dimethyl sulfate on DNA binding by Flp. These hypermethylations are associated with Flp-induced bending of the FRT site (76). In fact, it is likely that the extensive contacts of Flp with the FRT site are involved in DNA bending (see below).

Active Flp protein can be synthesized in vitro. This technique has been used to study the effect of mutations on the DNA binding activity of Flp (77). The amino-acid sequence of the protein reveals no obvious DNA-binding motif (e.g., helix-turn-helix, helix-loop-helix, zinc finger, etc.) and insertional mutagenesis shows that many regions throughout the protein are required for DNA binding to the FRT site. Partial proteolysis identified a 13-kDa amino-terminal fragment (P13) and a 21-kDa internal peptide (P21); the latter bound in a site-specific manner to the symmetry elements of the FRT site (39). Somewhat larger peptides were isolated after V8 protease digestion (69). Although the affinity of P21 for the FRT site was approximately the same as that of intact Flp, DNA bending was impaired. Less extensive digestion of Flp permitted the isolation of P32, a 32-kDa carboxy-terminal domain that also retains the site-specific DNA binding property of Flp (70). P13 stimulates the binding of P32 to the symmetry element. Subsequent footprinting and chemical cross-linking studies (78) show that the amino-terminal 13-kDa peptide of Flp (P13) binds to the core-proximal 4 bp of a symmetry element, whereas the carboxy-terminal 32-kDa peptide (P32) binds to the core-distal 9 bp of a symmetry element (Fig. 7). The precise residues that contact the bases of the symmetry elements are not known, although several mutations that adversely affect DNA binding have been isolated. A more accurate picture awaits the solution of the physical structure of Flp bound to its target DNA or the isolation of mutations of Flp that alter DNA binding specificity.

FIG. 6. Schematic representation of a gel mobility-shift assay with Flp. The substrate (S) contains an FRT site with three 13-bp symmetry elements (horizontal arrows). Binding of Flp to the symmetry elements generates complexes I, II, and III (CI, CII, and CIII, respectively), which can be separated by electrophoresis on a nondenaturing polyacrylamide gel shown schematically at right; direction of electrophoresis is top to bottom.

FIG. 7. Schematic representation of the binding of P13 and P32 to a symmetry element. P13 and P32 bind to the core-proximal or core-distal regions (top and middle). Protein-protein interactions trigger a conformational change in P32. (Reproduced from 78, with permission of the Journal of Biological Chemistry.)

B. DNA Bending

A genetic screen for mutations that cause a recombination-negative phenotype led to the isolation of two mutations in amino acid position 328 of Flp (76). Both mutant proteins exhibited a defect in DNA bending when studied by permutation analysis combined with gel mobility-shift assays (Fig. 8). Detailed analysis (79) revealed that Flp induces three types of bend in the FRT site: (1) a type-I bend of ~60° occurs when a single Flp molecule binds to one symmetry element; (2) a type-II bend of >144° occurs when two Flp molecules bind to symmetry elements a and b (the magnitude of the type-II bend is larger than simply the sum of two type-I bends and is thought to be due to protein-protein interactions across the core); (3) a type-III bend occurs when Flp binds to symmetry elements b and c and is thought to be due to protein-protein interactions between Flp molecules bound to symmetry elements b and c.

FIG. 8. Schematic representation of a circular permutation assay to detect DNA bending. Two DNA molecules of identical base composition (right) have the FRT site in the middle (top) or at the end (bottom). Binding of Flp (ovals) induces a bend of the same magnitude but the shortened end-to-end distance of the top molecule produces a molecule that migrates more slowly with respect to the bottom molecule. As shown schematically on the left, the molecule with the bend in the middle (M) migrates more slowly than the molecule bearing the bend at its end (E). Both substrates (S) migrate at the same rate. Such assays can be used to measure the position of the bend (center) and the magnitude of the bend.

We have recently conducted a detailed study of the bend centers using circular permutation analysis and of the direction of the Flp-induced bends using phasing analysis (80-82; K. Luetke, unpublished). The type-I bend center, when assayed on a substrate containing a sole symmetry element, resides at nucleotide -17, well toward the core-distal end of the symmetry element. Surprisingly, when the bend center of a type-I bend is determined using a full FRT site, the bend center is now at position -6, a shift of 11 bp. The type-II bend induced by the wild-type Flp is around the middle of the core, between the two pyrimidine tracts (position -1). The type-II bend produced when one of the two Flp molecules of the complex is covalently attached to the FRT site actually has a different bend center depending on which strand of the FRT site is cleaved (83; K. Luetke, unpublished). When Flp is attached to the bottom strand (i.e., next to symmetry element a), the type-II bend is in the middle of the core (position +2), whereas when the top strand is cleaved, the bend center shifts to position -9. It is possible that the apparent shift in the center of the type-II bend influences the direction of the recombination reaction (see Section IV,E,1). Experiments using phased substrates (81) measure the direction of the type-II bend (i.e., whether toward the major or minor groove). These experiments show that the direction of the type-II bend is toward the major groove (K. Luetke, unpublished).

Flp mutations that cause defects in the type-II bend have been localized to many regions of the protein. An insertion of four amino acids at position 115 (ins 115) causes a defect in the type-II bend, as do point mutations at positions 60, 309, 328, 329, 336, 339, 343, and 345 (77, 84, 85). Although mutations that affect DNA bending inhibit recombination in vivo and in vitro, it has been difficult to discern a common defect in the recombination pathway that would link the defect in bending to the recombination-defective phenotype.
For example, two mutations at position 328 abolish strand cleavage whereas Flp ins 115 is cleavage competent; all these proteins have a marked defect in the type-II bend. The mutations at 309-345 are all associated with defects in strand cleavage and dimerization of half-sites, but some show normal resolution of Holliday intermediates, indicating that the catalytic activity for cleavage and ligation is intact. It is probable that DNA bending is needed for proper assembly of the Flp synaptic complex that precedes strand exchange. It is also probable that the planar tyrosine rings at positions 60 and 343 play a structural role in bending, because replacement of these residues with phenylalanine does not produce as severe a bending defect as does substitution with serine. There are no mutations known that affect the type-I bend, although the P21 proteolytic fragment is severely deficient in inducing this bend. Apparently, residues 128-145 and/or the extreme carboxy terminus of Flp are needed for the type-I bend. Little is known about the type-III bend.
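The geometric idea behind the circular permutation assay (Fig. 8) can be illustrated with a short calculation. The Python sketch below assumes a deliberately crude model that is not taken from the text: the DNA fragment is treated as two rigid straight arms joined at the bend, lengths are counted in base pairs, and slower gel migration is taken to track a shorter end-to-end distance. The function names and the 100-bp fragment are hypothetical.

```python
import math


def end_to_end(length_bp, bend_pos_bp, bend_deg):
    """End-to-end distance of a fragment bent once at bend_pos_bp.

    bend_deg is the deflection from a straight line, so 0 means unbent.
    By the law of cosines, d^2 = p^2 + q^2 + 2*p*q*cos(bend).
    """
    p = bend_pos_bp
    q = length_bp - bend_pos_bp
    theta = math.radians(bend_deg)
    return math.sqrt(p * p + q * q + 2 * p * q * math.cos(theta))


def apparent_bend_center(length_bp, candidate_positions, bend_deg):
    """Position giving the shortest end-to-end distance in a permuted set.

    In this rigid-arm model the shortest distance always occurs for the
    permutation that places the bend nearest the fragment midpoint.
    """
    return min(candidate_positions,
               key=lambda pos: end_to_end(length_bp, pos, bend_deg))
```

With a 100-bp fragment and a 60° bend, this model gives an end-to-end distance of about 86.6 bp when the bend is central and 100 bp when it sits at the very end, which is the qualitative difference between molecules M and E in Fig. 8. The model cannot, by itself, convert a mobility ratio into a bend angle; that requires an empirical calibration that is not sketched here.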

C. Synapsis

For recombination to take place, the two FRT sites must be brought together by the process called synapsis. Synapsis is presumed to be mediated by protein-protein interactions between Flp molecules bound to the FRT sites.

1. TOPOLOGICAL ANALYSIS

By analyzing the topology of the products of a recombination reaction, it is possible to make certain deductions about how the recombining sites are brought together in synapsis. Flp, like other members of the integrase family, carries out synapsis by a “random collision” mechanism (51, 86). When circular supercoiled substrates are recombined, most of the supercoils between the target sequences (interdomainal supercoils) are trapped in the products (knots or catenanes) and the complexity of the products depends on the superhelix density. The random collision mode contrasts with the much more ordered synapsis shown by the resolvase-invertase group, whose synapsis excludes interdomainal supercoils from the synaptosome. The synapsis of these recombinases assures that only properly oriented sites will recombine (e.g., the res resolvase targets must be in direct orientation for resolvase to act; invertase targets must be in inverted orientation).

2. ASSAY OF SYNAPSIS

It would be ideal if a synaptic complex that was a true intermediate in the reaction could be isolated. Although attempts to isolate such intermediates have not succeeded, it is possible to trap a structure that has the properties of a Flp-mediated synaptosome using the protein cross-linking agent, glutaraldehyde (87, 88) (Fig. 9). These structures show that synapsis can take place in either the parallel or antiparallel configuration (parallel means that the FRT sites are aligned in the same direction). Thus it seems that homology of the cores of the FRT sites is not required for synapsis. Furthermore, DNase footprinting experiments and deletion analyses show that symmetry elements a and b of both FRT sites must be present and occupied by Flp. The nature of the protein-protein interactions that mediate synapsis by Flp is not known, because there are no known mutations that specifically affect synapsis. A multimeric intermediate of the Int reaction that is mediated at high levels of Int protein may be a model for synapsis in that system (89). It is uncertain whether synapsis precedes cleavage, or vice versa. It is possible that assembly of the synaptosome precedes trans cleavage (see Section IV,D,3) and indeed that the trans-cleavage mechanism would assure that a properly configured synaptic complex had been assembled before productive cleavage could occur.

FIG. 9. Assay of synapsis by Flp. A linear DNA molecule bearing two directly oriented FRT sites is bound by Flp to form a synaptosome. The structure is stabilized by cross-linking with glutaraldehyde and can be detected by its slow mobility on a nondenaturing polyacrylamide gel. (Reproduced from 87, with permission of the Journal of Molecular Biology.)

D. Strand Cleavage

1. SITES OF CLEAVAGE: TERMINI

DNA cleavage was first detected using denaturing polyacrylamide gels of end-labeled, FRT-site-containing substrates (59, 60, 72). Flp protein was found to cleave the FRT site at the margins of the 8-bp core (see vertical arrows in Fig. 3). Flp covalently attaches to the 3’-phosphoryl group at the site of the break and leaves a free 5’-OH end (59, 60). Covalent Flp-DNA complexes can be detected using SDS-polyacrylamide gel electrophoresis (Fig. 10) (70, 88). Flp attaches to the 3’-phosphate group via a tyrosine residue (90). Other integrase proteins (e.g., λ-Int, P1-Cre) also generate 5’-OH and 3’-PO4-Tyr termini. An amino-acid sequence comparison of several integrase family members (41) revealed three absolutely conserved residues, one of which is a tyrosine (position 343 in Flp). Change of this residue to a phenylalanine abolishes recombination and strand cleavage (91) and direct amino-acid sequencing shows indeed that Flp attaches to the DNA via the tyrosine hydroxyl of amino acid 343 (92). Similar studies have been done for the λ-Int protein (93). The Int family members thus use a mechanism for strand cleavage that is also used by topoisomerases (94). These enzymes break and join DNA and use an active-site tyrosine to attach to the DNA. As with conservative site-specific recombinases, it is assumed that the energy of the tyrosine-phosphate bond is used to rejoin the phosphodiester backbone.

2. DIMERIZATION OF HALF-FRT SITES

Partial FRT sites consisting of a single symmetry element and part of the core engage in a reaction called “half-site dimerization” (95, 96) (Fig. 11). Multimers that consist of two or more half-sites are held together by protein-protein interactions between Flp molecules bound to the half-sites; these protein-protein interactions are called cross-core interactions. The multimers may contain two, three, or four such half-sites. The formation of these multimers requires cleavage of the half-site; mutant Flp proteins that are unable to cleave the DNA cannot form half-site multimers (95). In a cross-core dimer, one-half of the half-FRT sites are cleaved; in the trimer about two-thirds of the half-sites are cleaved, and in the tetramer all of the half-sites are cleaved. It has been proposed that covalent attachment of Flp to the DNA triggers a conformational change in the Flp that stabilizes the half-site dimer by strong protein-protein interactions. Consistent with this is the finding that a cleavage-competent mutant Flp protein (R191K; see footnote 2), which is unable to form half-site dimers, can be complemented by Flp Y343F for formation of dimers (83). It appears that Flp R191K, which can readily cleave a full FRT site, is unable to cleave a half-site due to defective protein-protein interactions needed to bring the two half-sites together. This defect can be corrected if one of the two half-sites contains bound Flp Y343F.

The half-site can participate in Flp-induced strand-exchange reactions with a full FRT site (88, 96, 97). More importantly, the assay has been used to establish complementation between certain mutant Flp proteins (83, 98) (Fig. 12). These assays have been useful in dissecting the mechanism of strand cleavage by Flp, where they have revealed a mechanism (“cleavage in trans”) that appears to be unique to the Flp-type recombinases (98).

Footnote 2. Flp R191K indicates an Flp protein in which the arginine at position 191 has been changed to lysine. This notation is used for other mutant proteins in this essay.

FIG. 10. Cleavage of an FRT site and covalent attachment by Flp. The FRT site (end-labeled; asterisk) is cleaved by Flp to yield a 5’-OH end. Flp (filled oval) is covalently attached to the 3’-phosphoryl terminus via tyrosine 343. This slowly migrating complex can be detected by SDS-polyacrylamide gel electrophoresis.

FIG. 11. Half-site dimerization reaction. The half-FRT sites are shown as horizontal lines and arrows. Flp molecules (ovals) bind to the symmetry elements; one Flp molecule (stippled) cleaves and covalently attaches to the half-site (filled circle). A conformational change (here shown by the tilting of the oval) allows strong protein-protein interactions (striped lines) to stabilize the dimer. Although not shown here, the cross-core dimer contains a type-II bend that is similar in magnitude to the type-II bend of an intact FRT site (K. Luetke, unpublished).

FIG. 12. Use of half-site dimers to demonstrate cleavage in trans. Two mutant Flp proteins (Flp Y343F and Flp R191K) are bound separately to the half-sites (top). Flp Y343F has a ligation+, cleavage- phenotype whereas Flp R191K is ligation-, cleavage+. Neither protein by itself can dimerize half-sites. After the two half-sites are mixed (middle), dimerization and trans cleavage occur. Flp R191K donates its nucleophilic tyrosine to the left-hand FRT site to catalyze cleavage (“in trans”) with the release of the TCT trinucleotide. Ligation of the half-site by Flp Y343F (“in cis”, see below) leads to a hairpin molecule (bottom), which is detected by denaturing polyacrylamide gel electrophoresis. The size of each oligonucleotide (1, 2, 3, and 4) permits distinction of a hairpin arising from the left-hand half-site from that arising from the right-hand site.

3. Trans CLEAVAGE

The half-site complementation assay has been used to demonstrate a remarkable feature of the Flp reaction, namely, that cleavage of the FRT site takes place “in trans” (98, 99). This means the Flp active site is actually composed of two molecules of Flp (35). One molecule binds to the symmetry element adjacent to the cleavage site and activates the scissile phosphodiester bond for cleavage (Figs. 12 and 13). A second molecule of Flp, bound to a symmetry element in trans to the cleavage site, donates the nucleophilic tyrosine (position 343) that actually executes the nucleophilic attack. These experiments were done by binding two mutant Flp proteins separately to two half-sites, then mixing the occupied half-sites together. The results convincingly showed that the half-site that underwent cleavage (and subsequent strand transfer) was the one to which the cleavage-defective mutant protein (Flp Y343F) had been bound. In other words, a mutant Flp protein that contained an intact tyrosine 343 residue provided this residue in trans to bring about cleavage. Further studies support the bimolecular nature of the Flp active site (99). Flp molecules bearing more than one mutation in the RHR triad can nevertheless supply the nucleophilic tyrosine in trans. This surprising result has been extended to include at least one other Flp-like recombinase, the ARg recombinase of Zygosaccharomyces rouxii (100). Moreover, an Flp protein bearing alterations in one residue of the RHR triad and tyrosine 343 cannot be complemented by wild-type Flp protein. Although it has not been possible to establish unequivocally whether the tyrosine donor is located in a trans-horizontal, trans-vertical, or trans-diagonal position with respect to the cleavage site (98) (Fig. 13B), the trans-horizontal position is favored (98a).

Attempts to extend the trans-cleavage paradigm to other integrase family members have yielded mixed results. Experiments using the λ-Int protein with suicide attL substrates to trap cleavage intermediates seemed to support the trans-cleavage mechanism for the top strand only (101). However, use of Holliday intermediates containing Int binding sites of differing specificity clearly supports the cis-cleavage model (102). The mechanism of cleavage by the Cre recombinase has not been examined. Beyond the integrase family, the γδ resolvase (of the resolvase-invertase family) uses a cis-cleavage mechanism [i.e., the resolvase molecule bound adjacent to the cleavage site contributes the nucleophile (a serine) that executes the strand cleavage (36, 103, 104)].

FIG. 13. Trans cleavage by Flp on a full-FRT site. (A) Two FRT sites are aligned in parallel with Flp molecules (1-4) bound to the a and b symmetry elements. Molecules 1 and 4 have activated the cleavage sites adjacent to symmetry elements b (open circles) and molecules 2 and 3 are donating their position-343 tyrosines in trans to the cleavage pockets of 1 and 4. As explained in B, molecules 1 and 3, 2 and 4, are situated trans-diagonal with respect to one another. (B) Definitions of trans cleavage. Where the FRT sites are in parallel alignment, as in A, molecules 1 and 3 are situated trans-diagonal (i). However, trans-vertical (ii) or trans-horizontal (iii) cleavage could also occur.

Although the trans-cleavage mode may be peculiar to Flp-like recombinases, it is worth considering whether the mechanism can be extended to other DNA-cleaving enzymes, such as topoisomerases (94), φX174 gene-A protein (105), and restriction enzymes (106).
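The logic of the half-site complementation experiment can be made explicit with a small truth-table model. The Python sketch below is only an illustration under stated assumptions: each Flp variant is reduced to two yes/no properties (whether it carries tyrosine 343, and whether it can activate the scissile phosphate next to its own binding site), half-site dimerization is taken for granted, and the variant `RHR_Y343F` is a hypothetical stand-in for a protein altered in both an RHR residue and tyrosine 343. Assigning activation competence to Flp Y343F and Flp R191K is an inference from the phenotypes described above, not a measured property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlpVariant:
    name: str
    has_tyr343: bool        # can supply the nucleophilic tyrosine
    activates_in_cis: bool  # can activate the phosphate beside its own site


# Assumed property assignments (see the hedges in the text above).
WT = FlpVariant("Flp", True, True)
Y343F = FlpVariant("Flp Y343F", False, True)
R191K = FlpVariant("Flp R191K", True, True)
RHR_Y343F = FlpVariant("hypothetical RHR + Y343F mutant", False, False)


def cleaved_sites(bound_left, bound_right, mode):
    """Predict which half-sites of a dimer are cleaved.

    mode="trans": the tyrosine comes from the partner molecule.
    mode="cis":   the tyrosine comes from the molecule bound at the site.
    """
    if mode not in ("cis", "trans"):
        raise ValueError("mode must be 'cis' or 'trans'")
    result = []
    pairs = (("left", bound_left, bound_right),
             ("right", bound_right, bound_left))
    for side, here, partner in pairs:
        donor = partner if mode == "trans" else here
        if here.activates_in_cis and donor.has_tyr343:
            result.append(side)
    return result
```

With Flp Y343F on the left half-site and Flp R191K on the right, the trans model predicts cleavage of the left site only, whereas the cis model predicts cleavage of the right site only. The observation reported in the text, that the half-site bound by Flp Y343F is the one cleaved, matches the trans prediction. The same model returns no cleavage for the hypothetical double mutant paired with wild-type Flp, in line with the statement that such a protein cannot be complemented.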


4. THE CHEMISTRY OF STRAND CLEAVAGE As explained in Section IV,D,1, tyrosine 343 is the key nucleophile that brings about cleavage by Flp. Half-site complementation assays have been used to define residues of the Flp protein needed for cleavage (107). Mutations that affect amino acids 339-345 cause the altered Flp proteins to be cleavage defective in this assay. This may mean that the amino acids surrounding tyrosine 343 are responsible for presenting the tyrosine to the scissile phosphodiester bond that lies within the “cleavage pocket” of the Flp molecule that is bound in cis to the cleavage site. However, a direct role for these residues in activating the tyrosine hydroxyl is not excluded. Mutant Flp proteins bearing alterations in and around the other three conserved residues (arginine 191, histidine 305, arginine 308, the RHR triad) can donate the nucleophilic tyrosine properly in these half-site complementation assays, and most behave as if they are defective in the strand-joining step of the reaction. Nevertheless, two of these three conserved residues (R191 and R308) have also been implicated in the chemistry of strand cleavage (108, 109). Although certain mutations in these two residues allow the Flp proteins to retain cleavage activity (e.g., R191K, R308K), changes to other amino acids (e.g., R191S, R191E, R308G, R308Q, R308P) abolish the cleavage activity of the resulting Flp proteins. These “step-arrest” mutants are adduced as evidence for a direct role of these conserved residues in strand cleavage. Further evidence comes from studies that used the exogenous nucleophile H2O2 to stimulate strand cleavage (110). When an FRT site is incubated with the cleavage-defective Flp Y343F protein in the presence of 1 M H2O2, cleavage of the phosphodiester bonds normally cleaved by Flp is observed. Presumably activation of the scissile phosphodiester bonds still occurs. 
The fact that such cleavages do not occur in the presence of Flp proteins bearing mutations in both of the arginines of the RHR triad is taken as evidence that these residues are involved in the catalysis of strand cleavage, possibly by the activation of the phosphate. The histidine residue at position 309 is also highly conserved among the integrase family members and is a possible candidate for activation of the hydroxyl of tyrosine 343. Because trans cleavage by Flp depends on the nucleophilic tyrosine, it was of interest to ask whether tyrosine or tyrosine mimetics could act as nucleophiles when provided from the solution (Fig. 14). Surprisingly, tyrosine and tyramine could complement the cleavage-defective phenotype of Flp Y343F in a cleavage-ligation assay (110). Further studies of the range of chemicals that can substitute for tyrosine have identified additional aromatic compounds that stimulate cleavage in this assay (111). These include p-aminophenol, p-cresol, 3,4-dimethylphenol, tyramine, p-nitrophenol, phenol,


FIG. 14. Trans complementation of the cleavage defect of Flp Y343F by tyrosine from solution. Flp Y343F binds to the half-site activating the scissile phosphodiester bond (top). Tyrosine from solution acts as a nucleophile and attaches covalently to the 3'-phosphate (hypothetical), releasing the TCT trinucleoside. Ligation by Flp Y343F promotes nucleophilic attack by the 5'-hydroxyl group, with resealing of the phosphodiester bond to form a hairpin structure.

resorcinol, and tyrosine (in decreasing order of activity). All of the active compounds have a benzene ring and a hydroxyl. Substituents on the 3, 4, or 5 positions of the benzene ring stimulate the activity. Several other aromatic compounds, nucleophiles, and amino acids were inactive. It is not certain whether these chemicals promote cleavage by the same mechanism as tyrosine 343, i.e., whether they attach covalently to the 3'-phosphoryl group.

E. Strand Exchange 1. THE HOLLIDAY INTERMEDIATE After assembly of the synaptic complex and cleavage of the two substrates, the cleaved molecules must be apposed to reciprocal partners in such a way as to facilitate recombination, that is, the joining of one part of a DNA molecule to another part of a different molecule, and vice versa. If the cleaved molecules simply rejoined without strand exchange, no recombinants would occur. Reciprocal recombination of two duplex molecules requires four strand-breakage events and strand exchange, followed by strand joining (ligation). Strand exchange may, in principle, occur by one of two mechanisms.

1. In the one-step mechanism, the two DNA molecules are cleaved by a double-strand break; the ends are then rotated 180° with respect to one another and rejoined to form recombinants.
2. Alternatively, the reaction may take place in two stages. First, a pair of nicks is introduced into identical positions on the two recombining molecules and two single strands are exchanged and rejoined. This forms an X-structure or Holliday intermediate (112) (Fig. 5). In the second step the X-structure is resolved by a second pair of nicks on the opposite strands and a second pair of single-strand exchanges.

The resolvase-invertase group uses the one-step mechanism, whereas the integrase family uses the two-stage mode. The existence of a Holliday intermediate was initially suggested by in vivo studies of λ-Int-promoted recombination (113, 114). Then it was shown (115) that artificial Holliday-like molecules containing λ att (attachment) sites could be resolved by Int in a site-specific manner. The existence of a Holliday intermediate in the in vitro recombination reaction was demonstrated for λ-Int (116, 117) and Cre (118) and then for the Flp recombinase (119, 120) (Fig. 5D and E). The λ and Cre systems share the property of initiating strand exchange with a specific strand (“top” strand), whereas the Flp recombinase seems not to have such an asymmetric strand preference (120). Interestingly, the use of the “top” strands by the λ integrase is conditioned by the arm-type sequences of the target att site (121). The position of the type-II bends may influence strand exchange. If cleavage of the bottom strands occurs, the type-II bend in the center of the core might drive the first strand exchange, whereas cleavage of the top strands might be reversed, because the bend center is to the left of the cleavage site (K. Luetke, unpublished).

2. TOPOLOGICAL STUDIES As described above, topological studies show that synapsis by Flp occurs by a random collision mechanism typical of the integrase family (51, 86). However, the direction of strand exchange is not random (Fig. 15). Flp-mediated excisive recombination of a relaxed circle bearing two directly oriented FRT sites yields exclusively unlinked circles. A random direction of strand exchange would have been expected to give a mixture of unlinked and linked circles (catenanes). These results are supported by studies with relaxed, knotted substrates. Likewise, studies with supercoiled substrates show that, with multiple rounds of recombination, topological products of increasing complexity accumulate (knots or catenanes with increased numbers of nodes). Topological studies have also been useful in defining the sense of the exchange of strands. It is possible to visualize the crossings of duplex DNA (nodes) over one another in the products of a recombination reaction and thereby to assign an arbitrary sign to the node ("+" or "-"). These studies have invariably shown that the "sense" of the strand exchange (i.e., the direction of rotation of the recombining strands) is not random but takes place in the same direction for each recombination event (122, 123). Although such studies have not been done with Flp, the topological studies and the precedent from other systems make it likely that the direction of strand exchange during Flp recombination also has a unique sense.

FIG. 15. Direction of strand exchange during site-specific recombination. Two duplexes in a synaptosome are represented by two ladder-like structures (solid and striped lines). The view looks at a cleaved duplex end-on. If the cleavage is on the C and C' strands, the C' strand must rotate through 90°, whereas the C strand must rotate through 270°, here both in the counterclockwise direction. Topological studies of conservative site-specific recombinases show that the direction of strand rotation is always the same.

F. Strand Ligation Strand ligation is formally a reversal of the strand-cleavage reaction. Cleavage is brought about by a nucleophilic attack of tyrosine 343 on the scissile phosphodiester bond with formation of a 3'-phosphotyrosine bond (35). In ligation, the nucleophile is a 5'-OH group of the DNA strand adjacent to the nick. This hydroxyl attacks the phosphotyrosine bond to reestablish a phosphodiester bond (Fig. 16). If the attacking 5'-OH comes from the same DNA molecule that bears the Flp-phosphotyrosine linkage, then the reaction simply reverses the strand breakage by Flp. Recombination will occur only if the nucleophilic 5'-OH group comes from a partner FRT site that has also been suitably nicked by Flp.


FIG. 16. Flp-mediated cleavage and ligation proceed through two trans-esterifications. (A) Cleavage. Flp cleaves the phosphodiester bond and covalently attaches via a tyrosine hydroxyl to the 3'-phosphoryl group, leaving a free 5'-hydroxyl. (B) Ligation. A free 5'-hydroxyl group from another FRT acts as the nucleophile that attacks the phosphotyrosine linkage and rejoins the phosphodiester bond with the release of Flp.

1. ASSAY OF LIGATION Conservative site-specific recombinases employ a mechanism of cleavage that is similar to that of the topoisomerases (94) in that all of these enzymes use an amino acid nucleophile to attach covalently to the DNA. It has been thought that the energy of the phosphoamino-acid bond conserved the energy of the phosphodiester bond and allowed the enzyme to reseal the DNA without an external energy source. A direct test of this idea was possible with the development of assays that measure ligation by Flp. Substrates were constructed that bore a Flp binding site, a 5'-OH, and a 3'-phosphoryl group that had an attached tyrosine. These substrates were made initially by digestion with Flp (Flp R191K, which has a hypercleavage phenotype) followed by protease digestion (Fig. 17A) (124). Then suitable synthetic oligonucleotides with 3'-phosphotyrosine groups were made (125). Both such substrates underwent efficient ligation when incubated with Flp (Fig. 17B). Most significantly, this ligation activity was equally efficient when mutant Flp proteins that lacked the nucleophilic tyrosine were used (e.g., Flp Y343F). Because these proteins are incapable of strand cleavage, this result clearly shows that the ligation reaction is separable from the cleavage reaction. This reaction also bespeaks a certain amount of flexibility in the active site for ligation, in that the “ligation pocket” can apparently accommodate both the 3'-phosphotyrosine of the substrate and the amino acid at position 343 (e.g., tyrosine, phenylalanine, serine).


FIG. 17. Ligation assay of Flp recombinase. (A) Generation of a substrate for Flp-mediated ligation. An FRT site (top) is treated with Flp R191K protein, which cleaves and covalently attaches to the FRT site (middle). After Pronase treatment, the two half-sites can be separated by electrophoresis. The open circles represent the tyrosine residues that remain attached to the 3'-phosphate termini. (B) Flp-promoted ligation reactions. Each half-site is incubated separately with Flp (left and middle). The 5'-hydroxyl group attacks the phosphotyrosine linkage (arrows), the tyrosine acts as a leaving group, and the phosphodiester bond is formed with the generation of a hairpin structure. When both half-sites are incubated together with Flp (right), the original duplex full-FRT site is reconstituted. The products are detected as radioactive bands on a denaturing polyacrylamide gel. Asterisks indicate 32P-labeled 5' termini. (Reproduced from 124, with permission of the Journal of Biological Chemistry.)


These studies also showed that the R191K Flp protein is defective in ligation activity. This reaction was generalized by the finding that two other members of the integrase family (λ-Int and P1-Cre proteins) as well as mammalian topoisomerase I can carry out a similar reaction (125).

2. LIGATION OCCURS IN Cis Initial ligation assays that used a half-FRT site (with a single symmetry element) suggested that a single bound Flp molecule is sufficient for ligation. With the discovery of cleavage in trans, it was of interest to learn whether ligation by Flp takes place in cis or trans, i.e., whether ligation is performed by an Flp molecule bound in cis or in trans to the ligation site. A series of complementation tests was done using two half-FRT sites, one of which contained a strand that bore a 3'-phosphotyrosine (107). Each of the half-sites was loaded with a different Flp protein: one that was ligation competent but cleavage defective (e.g., Y343F) and another that was cleavage competent but ligation defective (e.g., R191K) (Fig. 18). The results


FIG. 18. Ligation occurs in cis. A half-site complementation assay is done between two half-sites; the left-hand site contains Flp R191K whereas the right-hand one bears Flp Y343F. Because cleavage must take place in trans and because Y343F is cleavage defective, the left-hand half-site is not cleaved (middle). Y343F is ligation competent and promotes cis ligation (bottom), yielding a 40-nt intermolecular product (left) and a 48-nt intramolecular hairpin ligation product (right). Reversal of the proteins would lead to ligation products from the left-hand half-site.


clearly showed that the Flp molecule that executes ligation is bound in cis to the ligation site. These studies also showed a spatial segregation of the functions of cleavage and ligation within the Flp protein. Flp proteins bearing mutations in amino acids 191-329 showed a defect in ligation in these tests, but were cleavage competent. Mutations that changed amino acids 336-345 caused a cleavage defect, but did not affect ligation.
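
The spatial segregation described above can be summarized as a small lookup. The sketch below is a coarse, hypothetical summary written in Python: the residue ranges are the ones quoted in the text, but the function name `predicted_phenotype` and the idea of classifying a mutant purely by residue position are assumptions made for illustration. Only some positions within each range were actually tested, and individual substitutions can behave differently (for example, R191S also abolishes cleavage, as noted in the section on the chemistry of strand cleavage).

```python
# Residue ranges quoted in the text; the phenotype labels are a simplified summary.
LIGATION_RANGE = range(191, 330)   # amino acids 191-329
CLEAVAGE_RANGE = range(336, 346)   # amino acids 336-345


def predicted_phenotype(position: int) -> str:
    """Map a residue position to the phenotype class seen in the complementation tests."""
    if position in LIGATION_RANGE:
        return "ligation-defective, cleavage-competent"
    if position in CLEAVAGE_RANGE:
        return "cleavage-defective, ligation-competent"
    return "not covered by these complementation tests"


for mutant in ("R191K", "H305L", "R308K", "Y343F"):
    position = int(mutant[1:-1])  # strip the original and substituted amino-acid letters
    print(mutant, "->", predicted_phenotype(position))
```

A position-based rule of this kind is only a first approximation; it is meant to make the two regions easy to compare, not to predict the behavior of an untested mutant.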

3. CHEMISTRY OF LIGATION Although some of residues 191-329 are important for the ligation reaction, a key question that remains concerns which residues are actually important for the chemistry of the reaction. A possible mechanism for catalysis of cleavage and rejoining by Flp that involves acid-base catalysis has been proposed (98). The conserved residues constituting the RHR triad could be involved in proton abstraction from the nucleophilic hydroxyl, or in hydrogen bonding to and activation of the phosphate toward nucleophilic attack. Although the two arginine residues of the catalytic triad have been implicated directly in the cleavage reaction, it is possible that some or all of the three could play an analogous role in the ligation step. Mutations that affected the catalytic triad (R191K, H305L, and R308K) abolished intramolecular ligation activity on a half-site that contained a 3'-phosphotyrosine terminus (85). Nevertheless, it is becoming clear that certain mutant proteins that score as negative in this assay can, under some conditions, perform ligation. For example, Flp N329D could not perform intramolecular ligation but could resolve Holliday junctions to covalently closed products. Flp R308K is also unable to carry out intramolecular ligation but can promote ligation if it can cleave and covalently attach to the FRT site (X. Zhu, unpublished). The tyrosine residue acts as the leaving group for the nucleophilic attack by the 5'-OH group. The finding that several aromatic phenolic compounds can cleave the FRT site (in the presence of Flp Y343F) suggests that the requirements for the leaving group are flexible (110, 111). It is not certain whether the common features of the active compounds (aromatic hydroxyl with opposed side chains) that are needed for cleavage reflect chemical or steric requirements. We are approaching this question by attempting to synthesize oligonucleotides with various 3'-phosphoryl substituents.

G. Resolution 1. Flp RESOLVES SYNTHETIC X-STRUCTURES Like other integrases, Flp uses a two-stage mode of strand exchange with the generation of a Holliday-like (or X) intermediate (119, 120). Because the X intermediate in the in vitro reaction is short-lived, and therefore difficult to isolate, synthetic oligonucleotides were used to construct X-like intermediates consisting of two FRT sites linked by a pair of exchanged strands (127). Flp effectively resolved these structures in an unbiased fashion; that is, resolution was equally efficient in the “forward” direction (to products) or the “backward” direction (to substrates; Fig. 19). Resolution occurring adjacent to a given symmetry element requires binding of Flp to that symmetry element. Resolution of the X-structure requires the binding of at least two Flp molecules to two symmetry elements. [Resolution of an analogous X-structure by the λ-Int protein requires the binding of at least three Int


FIG. 19. Resolution of a synthetic X-structure by Flp. The synthetic X-structure is assembled from four oligonucleotides (shown by different patterns and numbers 1-4). The X-structure contains four arms, each of which contains either of the a or b symmetry elements separated by the 8-bp core. The junction is freely mobile throughout the FRT sites but each arm bears a different sequence to prevent spontaneous resolution of the junction by branch migration. Resolution by Flp at symmetry element a (left) or b (right) generates two linear products of different lengths. Because only oligonucleotide 3 is 5'-32P-labeled (asterisk), only two of the four resolution products are visualized by autoradiography.


molecules (128).] Protein-protein interactions can substitute for binding of an Flp molecule to a symmetry element and still allow some resolution. Studies of resolution of synthetic X-structures are compatible with a trans mode of cleavage during resolution. For example, Flp Y343F protein can be complemented by Flp G328E for resolution of an X-structure, whereas neither protein has any activity by itself (A. Shaikh, unpublished). Studies with X-structures that have a nick in the core region of one strand are also compatible with a trans-cleavage mechanism for resolution (J. Dixon and P. Sadowski, unpublished). However, a definitive test of the trans-cleavage model has not been carried out with Flp. Such an experiment seems to indicate that λ-Int protein cleaves X-structures in cis (102).

2. THE ROLE OF CORE HOMOLOGY IN THE RESOLUTION REACTION Homology between the cores of two recombining FRT sites is essential for recombination (62), a finding consistent for all conservative site-specific recombinases. Furthermore, heterologies in the core region of recombining λ att sites cause accumulation of a Holliday intermediate (flq, and more Holliday intermediates accumulate when the heterology is further away from the site of initial strand exchange. A reasonable explanation of these findings is that the heterology blocked the branch migration needed to deliver the branch point to the site of the second strand exchange (resolution point). A direct test of the need for homology of the cores in the resolution of an X-structure by Flp was carried out by placing changed base pairs at various positions in the core of an X-like substrate (129). These changes were shown to immobilize the X-junction by use of the X-specific endonuclease, T4 endonuclease VII. Nevertheless, the X-structure containing the immobilized junction was efficiently resolved by Flp irrespective of the position of the junction. Thus it seems that homology of the core regions of two FRT sites is required at a stage prior to the first strand exchange. It seems reasonable that homology is required to stabilize the first strand exchange. A heterology in the core region might destabilize the strand-exchange intermediate and favor reversal of the reaction. Support for this idea comes from studies of the strand exchange that occurs in synaptic intermediates or between a full FRT site and a half-site (87, 88). Studies of synapsis showed that core homology is not required for synapsis but at least 2 bp of homology is necessary to stabilize the strand-exchange event either between two full FRT sites or between one FRT site and a half-FRT site. It has also been observed (G. Pan, unpublished) that a minimum of 4 bp of homology is required for ligation of a single-stranded oligonucleotide to a nicked duplex FRT site.


H. The Structure and Function of Flp Flp exists as a ~46-kDa monomer in solution. Although mutational analysis failed to identify a discrete DNA-binding domain (77), partial proteolysis yielded an internal 21-kDa fragment (P21, amino acids 148 to ~346) that retained site-specific DNA-binding activity to the symmetry elements (39). Although the binding affinity was approximately the same as with intact Flp, a 13-kDa amino-terminal domain (P13, amino acids 2-122) stimulated binding of P21. Milder proteolysis yielded, in addition to P13, a 32-kDa carboxy-terminal fragment (P32), which also bound in a site-specific manner to the symmetry elements of the FRT site (Fig. 20) (70). Although P21 induces a type-I bend of only 24°, P32 induces a type-I bend of near-normal magnitude, i.e., 55° vs. 63° for intact Flp (see Table I). Neither P21 nor P32 carries out the type-II bend. P13 stimulates the binding of both P21 and P32 to the FRT site. P13 binds to the core-proximal 4 bp of a symmetry element whereas P32 protects the core-distal 9 bp of a symmetry element (78). P21 and P32 in combination with P13 can each promote the ligation reaction, although P32 is more effective than P21. P32 can also covalently attach to an FRT site when incubated in the presence of P13. P21 cannot covalently attach to the FRT site, however, possibly because it lacks the nucleophilic tyrosine (D. Kuntz, unpublished). Sequence comparisons among the members of the integrase family identified the four absolutely conserved residues (RHRY) that are thought to be

FIG. 20. Structure and function of Flp. P13 is composed of the N-terminal 122 amino acids; P32 begins at amino acid 124 and is thought to include amino acids 124 to 423. P21 starts at position 148 and may include up to amino acid ~350. The first and second conserved regions (as defined in reference 130) are shown (CRI and CRII) along with the four conserved residues of the catalytic tetrad.

TABLE I
ACTIVITIES OF Flp PEPTIDES

Peptide      Function
P13          Binds to core-proximal 4 bp of symmetry element; cross-core protein-protein interactions
P21          Specific DNA binding; 24° type-I bend; no type-II bend
P32          Specific DNA binding; binds to core-distal 9 bp of symmetry elements; 55° type-I bend; no type-II bend
P13 + P21    DNA ligation
P13 + P32    DNA ligation; DNA cleavage
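
For readers who want to query the activities of the Flp peptides programmatically, Table I can be transcribed into a small Python data structure. This is a minimal sketch: the dictionary name `TABLE_I`, the helper `peptides_with`, and the short activity labels are hypothetical choices made here, and the entries simply restate the table rather than add new data.

```python
# Activities of the Flp proteolytic fragments, transcribed from Table I.
# Keys are peptide combinations; values are sets of short activity labels.
TABLE_I = {
    ("P13",): {
        "binds core-proximal 4 bp",
        "cross-core protein-protein interactions",
    },
    ("P21",): {"specific DNA binding", "type-I bend (24 deg)"},
    ("P32",): {
        "specific DNA binding",
        "binds core-distal 9 bp",
        "type-I bend (55 deg)",
    },
    ("P13", "P21"): {"DNA ligation"},
    ("P13", "P32"): {"DNA ligation", "DNA cleavage"},
}


def peptides_with(activity: str) -> list:
    """Return the peptide combinations (as 'A + B' strings) listed with an activity."""
    return sorted(
        " + ".join(combo) for combo, activities in TABLE_I.items() if activity in activities
    )


print(peptides_with("DNA ligation"))
print(peptides_with("DNA cleavage"))
```

The lookup makes the asymmetry in the table explicit: both P13 + P21 and P13 + P32 are listed for ligation, but only P13 + P32 is listed for cleavage.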

involved in the catalysis of strand breakage and reunion (41, 42). However, other extensive regions of conservation are not seen. A more informative picture was obtained by sequence comparisons among six Flp-like proteins from various yeast species (130) (Fig. 20). These studies showed that the proteins contain two regions of extensive homology (or identity). The first conserved region (Box I) stretches from amino acids 185-203 (Flp numbering) and contains the first absolutely conserved arginine (R191). The second conserved region (Box II) includes amino acids 294-313; it contains two other residues of the catalytic triad (H305 and R308). Changes of residues in the first conserved region (83, 131) may affect DNA binding, strand cleavage, half-site dimerization, and strand ligation. Most of the mutant proteins can complement the Y343F protein in the trans-cleavage assay, i.e., they are able to provide the nucleophilic tyrosine in trans. A study of residues of the second conserved region has also revealed a wide spectrum of defects, i.e., in DNA binding, DNA bending, strand cleavage, and half-site dimerization (85). Most of the Flp proteins with changes in this region are also competent to donate the nucleophilic tyrosine 343 to a Y343F protein in a half-site complementation assay (107). They show a ligation-defective phenotype in these complementation tests. The fact that the six sequenced FLP genes from various yeast species have diverged so widely while preserving the catalytic machinery to recombine DNA has led to speculation that the evolution of the FLP genes is driven by rapid changes in the target DNA sequences (132). As the FRT sites change, the DNA-binding specificity of the Flp proteins changes so as to preserve the ability to bind to the targets. This idea of molecular coevolution has also been applied to the evolution of the REP1 genes to enable the Rep1 protein to bind to the cis-acting STB sequences and stabilize the plasmid (133). 
Although mechanistic and genetic studies have greatly enhanced our knowledge about Flp recombination, our understanding would be greatly aided by a three-dimensional structure of the protein. Such information is emerging for the γδ resolvase and is an exceedingly valuable adjunct in interpreting mechanistic studies and introducing informative mutations (36). Adequate amounts of Flp can be produced in E. coli, but the intact protein has been refractory to attempts at crystallization and is too large for heteronuclear NMR. However, the functional domains of Flp (P13, P21, P32) may offer an approach to the solution of the three-dimensional structure of Flp.

V. Flp as a Reagent for Chromosome Engineering An important offshoot of our understanding of the mechanism of site-specific recombination has been the use of site-specific recombinases to engineer chromosomal rearrangements (reviewed in 134). The Flp protein and the P1-Cre protein are the agents of choice because of their simple target sequences (FRT site and lox site, respectively), the simple topological requirements (no supercoils needed, excisions or inversions are carried out), and the lack of the need for accessory protein factors (e.g., IHF for λ-Int, FIS for Hin and Gin proteins). Both proteins work very efficiently in vitro for removing or inverting DNA between the two target sites or for integrating a plasmid DNA molecule into another large molecule. But their most extensive use is to be found in vivo for the engineering of chromosomes of eukaryotic organisms. The following brief discussion cites instances wherein either the Flp protein or the Cre protein is used in vivo.

A. Targeting There is a wide variability in the expression of eukaryotic genes that depends on the position in the chromosome where the gene is located. If one wishes to study the expression of a modified gene in a chromosome, it would be desirable to place those modified copies of the gene at the same position and in the same orientation in the chromosome. Site-specific recombination can be used for this purpose to generate isogenic cell lines. Such experiments have been successful with both Flp and Cre in animal cells (135-137), and in yeast with Cre (138). The efficiency with these recombinases is little better than that of homologous recombination, which relies on the cell's general recombinational machinery. This may be because the site-specific integration reaction is reversible. The resulting product contains the inserted DNA bounded by two directly oriented recombinase target sequences, an arrangement that is a substrate for the excision of the DNA. This problem is solved by providing the recombinase for only a short time, by


inducing its expression briefly, or by directly introducing the recombinase protein or RNA into the cells. The Flp protein in general has been less widely used than the Cre protein for animal cells because of the apparent instability of its mRNA. This problem was addressed by modifying the sequence of the FLP gene to remove potential cryptic poly(A) addition sites or splice sites (135).

B. Excisive Recombination Flp and Cre also catalyze the removal of DNA that is between two directly oriented target sites. This strategy is used for removing unwanted DNA from a transgene, for example to remove a selectable marker that might otherwise interfere with expression of the transgene. This strategy can be used in concert with homologous recombination and has been successful in yeast cells (139), transgenic plants (140), and animal cells (141, 142). A similar protocol has been used to activate a transgene in a mouse (143). One parent bears the gene whose expression is inhibited by sequences that are flanked by recombinase target sites. The excision is activated by mating to another mouse that expresses the recombinase gene. The progeny now carry an activated transgene and the effects of this activated gene on the mouse can be studied.
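
The dependence of the outcome on target-site orientation (excision between directly oriented sites, inversion between inverted sites) can be sketched with a toy Python function. This is an illustration under stated assumptions, not a description of any published tool: the function name `flp_recombine`, the representation of DNA as a list of named segments, and the use of '>' and '<' to mark target sites in the two orientations are all hypothetical conventions introduced here.

```python
def flp_recombine(dna):
    """Recombine a linear DNA carrying exactly two target sites (toy model).

    `dna` is a list of segment names; '>' and '<' mark target sites in the two
    orientations. Returns (product, excised_circle); excised_circle is None
    when the reaction is an inversion.
    """
    sites = [i for i, segment in enumerate(dna) if segment in (">", "<")]
    if len(sites) != 2:
        raise ValueError("expected exactly two target sites")
    i, j = sites
    between = dna[i + 1:j]
    if dna[i] == dna[j]:
        # Directly oriented sites: the intervening DNA leaves as a circle that
        # carries one site, and one site remains in the parent molecule.
        product = dna[:i + 1] + dna[j + 1:]
        circle = between + [dna[j]]
        return product, circle
    # Inverted sites: the intervening DNA is flipped in place. Only the order of
    # the segments is tracked here; each segment is also read in reverse.
    product = dna[:i + 1] + between[::-1] + dna[j:]
    return product, None


# Removing a selectable marker flanked by directly oriented sites.
print(flp_recombine(["transgene", ">", "marker", ">", "tail"]))
# Inverting a block of two segments flanked by inverted sites.
print(flp_recombine(["a", ">", "b", "c", "<", "d"]))
```

In this representation, the product of an integration event (inserted DNA bounded by two directly oriented sites) is itself a substrate for the excision branch, which is one way to picture why the integration reaction is reversible.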

C. Mosaic Analysis The Flp recombinase under control of a heat-shock promoter is active in Drosophila (144). Mosaic animals are detected after mitotic recombination between homologous chromosomes that are heterozygous for centromere-distal markers. Such mosaics are extremely useful for tracking cell fates during development. Normally mosaics are induced by X-rays, but site-specific recombination by Flp provides a much more controlled method (145). By creating strains with FRT sites in centromere-proximal positions to cell-autonomous marker genes, it has been possible to extend, in principle, mosaic analysis to almost all of the Drosophila genome (146).

D. Miscellaneous Uses The Flp recombinase has been used in conjunction with mobilization of P-element transposons to generate mosaics, chromosome deficiencies and duplications, and dicentric and acentric chromosomes (147). The FRT sites and Flp-mediated recombination have also been used to develop a general method for mapping chromosomal markers in yeast (148,149). Flp is also being used in attempts to generate transgenic mosquitoes to facilitate genetic analysis (150).


ACKNOWLEDGMENTS Work in the author's laboratory is supported by the Medical Research Council of Canada and the Natural Sciences and Engineering Research Council of Canada. I thank Linda Houston and Tracy Pegg for secretarial assistance and Karen Luetke for critical comments on the manuscript.

REFERENCES
1. A. B. Futcher, Yeast 4, 27 (1988).
2. F. C. Volkert, D. W. Wilson and J. R. Broach, Microbiol. Rev. 53, 299 (1989).
3. A. B. Futcher and B. S. Cox, J. Bact. 154, 612 (1983).
4. D. J. Mead, D. C. J. Gardner and S. G. Oliver, MGG 205, 417 (1986).
5. M. D. Rose and J. R. Broach, Methods Enzymol. 194, 195 (1991).
6. J. L. Hartley and J. E. Donelson, Nature 286, 860 (1980).
7. J. D. Beggs, Nature 275, 104 (1978).
8. J. R. Broach and J. B. Hicks, Cell 21, 501 (1980).
9. M. Jayaram, Y.-Y. Li and J. R. Broach, Cell 34, 95 (1983).
10. Y. Kikuchi, Cell 35, 487 (1983).
11. V. A. Zakian, B. J. Brewer and W. L. Fangman, Cell 17, 923 (1979).
12. J. R. Broach, V. R. Guarascio, M. H. Misiewicz, J. Abraham, K. A. Nasmyth and J. B. Hicks, CSHSQB 47, 1165 (1983).
13. B. J. Brewer and W. L. Fangman, Cell 51, 463 (1987).
14. J. A. Huberman, L. D. Spotila, K. A. Nawotka, S. M. El Assouli and L. R. Davis, Cell 51, 473 (1987).
15. D. M. Livingston and D. M. Kupfer, JMB 116, 259 (1977).
16. J. R. Broach, V. R. Guarascio and M. Jayaram, Cell 29, 227 (1982).
17. M. McLeod, S. Craft and J. R. Broach, MCBiol 6, 3357 (1986).
18. D. Vetter, B. J. Andrews, L. Roberts-Beatty and P. D. Sadowski, PNAS 80, 7284 (1983).
19. A. B. Futcher, J. Theor. Biol. 119, 197 (1986).
20. F. C. Volkert and J. R. Broach, Cell 46, 541 (1986).
21. A. E. Reynolds, A. W. Murray and J. W. Szostak, MCBiol 7, 3566 (1987).
22. F. D. Russo, I. Scherson and J. R. Broach, J. Theor. Biol. 155, 369 (1992).
23. J. A. H. Murray, Mol. Microbiol. 1, 1 (1987).
24. J. A. H. Murray, M. Scarpa, N. Rossi and G. Cesareni, EMBO J. 6, 4205 (1987).
25. T. Som, K. A. Armstrong, F. C. Volkert and J. R. Broach, Cell 52, 27 (1988).
26. B. E. Veit and W. L. Fangman, MCBiol 8, 4949 (1988).
27. L.-C. C. Wu, P. A. Fisher and J. R. Broach, JBC 262, 883 (1987).
28. N. L. Craig, ARGen 20, 385 (1988).
29. M. M. Cox, in “Mobile DNA” (D. E. Berg and M. M. Howe, eds.), p. 661. American Society for Microbiology, Washington, DC, 1989.
30. K. Mizuuchi, ARB 61, 1011 (1992).
31. W. M. Stark, M. R. Boocock and D. J. Sherratt, Trends Genet. 8, 432 (1992).
32. D. B. Haniford and G. Chaconas, Curr. Opin. Genet. Dev. 2, 698 (1992).
33. P. D. Sadowski, FASEB J. 7, 760 (1993).
34. A. Landy, Curr. Opin. Genet. Dev. 3, 699 (1993).
35. M. Jayaram, TIBS 19, 78 (1994).
36. N. D. F. Grindley, Nucleic Acids Mol. Biol. 8, 236 (1994).


37. D. Babineau, D. Vetter, B. J. Andrews, R. M. Gronostajski, G. A. Proteau, L. G. Beatty and P. D. Sadowski, JBC 260, 12313 (1985).
38. L. Meyer-Leon, C. A. Gates, J. M. Attwood, E. A. Wood and M. M. Cox, NARes 15, 6469 (1987).
39. H. Pan, D. Clary and P. D. Sadowski, JBC 266, 11347 (1991).
40. H. A. Nash, PNAS 72, 1072 (1975).
41. P. Argos, A. Landy, K. Abremski, J. B. Egan, E. H. Ljungquist, R. H. Hoess, M. L. Kahn, B. Kalionis, S. V. L. Narayana, L. S. Pierson III, N. Sternberg and J. M. Leong, EMBO J. 5, 433 (1986).
42. K. E. Abremski and R. H. Hoess, Protein Eng. 5, 87 (1992).
43. P. Sadowski, J. Bact. 165, 341 (1986).
44. K. Mizuuchi and M. Mizuuchi, CSHSQB 43, 1111 (1979).
45. K. Mizuuchi, M. Gellert, R. A. Weisberg and H. A. Nash, JMB 141, 485 (1980).
46. K. Abremski, R. Hoess and N. Sternberg, Cell 32, 1301 (1983).
47. R. M. Gronostajski and P. D. Sadowski, JBC 260, 12328 (1985).
48. R. A. Weisberg and A. Landy, in “Lambda II” (R. W. Hendrix, J. W. Roberts, F. W. Stahl and R. A. Weisberg, eds.), p. 211. CSH Press, Cold Spring Harbor, NY, 1983.
49. T. J. Pollock and H. A. Nash, JMB 170, 1 (1983).
50. H. A. Nash and T. J. Pollock, JMB 170, 19 (1983).
51. L. G. Beatty, D. Babineau-Clary, C. Hogrefe and P. D. Sadowski, JMB 188, 529 (1986).
52. A. M. Campbell, Adv. Genet. 11, 101 (1962).
53. N. Sternberg and D. Hamilton, JMB 150, 467 (1981).
54. D. E. Adams, J. B. Bliska and N. R. Cozzarelli, JMB 226, 661 (1992).
55. G. Blakely, G. May, R. McCulloch, L. K. Arciszewska, M. Burke, S. T. Lovett and D. J. Sherratt, Cell 75, 351 (1993).
56. J. A. Shapiro, PNAS 76, 1933 (1979).
57. A. Arthur and D. Sherratt, MGG 175, 267 (1979).
58. P. van de Putte, S. Cramer and M. Giphart-Gassler, Nature 286, 218 (1980).
59. B. J. Andrews, G. A. Proteau, L. G. Beatty and P. D. Sadowski, Cell 40, 795 (1985).
60. J. F. Senecoff, R. C. Bruckner and M. M. Cox, PNAS 82, 7270 (1985).
61. J. F. Senecoff and M. M. Cox, JBC 261, 7380 (1986).
62. B. J. Andrews, M. McLeod, J. Broach and P. D. Sadowski, MCBiol 6, 2482 (1986).
63. S. W. Umlauf and M. M. Cox, EMBO J. 7, 1845 (1988).
64. M. Jayaram, PNAS 82, 5875 (1985).
65. R. M. Gronostajski and P. D. Sadowski, JBC 260, 12320 (1985).
66. G. A. Proteau, D. Sidenberg and P. Sadowski, NARes 14, 4787 (1986).
67. J. F. Senecoff, P. J. Rossmeissl and M. M. Cox, JMB 201, 405 (1988).
68. P. V. Prasad, D. Horensky, L.-J. Young and M. Jayaram, MCBiol 6, 4329 (1986).
69. J.-W. Chen, B. R. Evans, S.-H. Yang, D. B. Teplow and M. Jayaram, PNAS 88, 5944 (1991).
70. G. Pan and P. D. Sadowski, JBC 268, 22546 (1993).
71. C. A. Gates and M. M. Cox, PNAS 85, 4628 (1988).
72. R. C. Bruckner and M. M. Cox, JBC 261, 11798 (1986).
73. B. J. Andrews, L. G. Beatty and P. D. Sadowski, JMB 193, 345 (1987).
74. L. G. Beatty and P. D. Sadowski, JMB 204, 283 (1988).
75. G. B. Panigrahi, L. G. Beatty and P. D. Sadowski, NARes 20, 5927 (1992).
76. C. J. E. Schwartz and P. D. Sadowski, JMB 205, 647 (1989).
77. A. A. Amin and P. D. Sadowski, MCBiol 9, 1987 (1989).
78. G. B. Panigrahi and P. D. Sadowski, JBC 269, 10940 (1994).
79. C. J. E. Schwartz and P. D. Sadowski, JMB 216, 289 (1990).


80. H. H. Wu and D. M. Crothers, Nature 308, 509 (1984).
81. S. S. Zinkel and D. M. Crothers, Nature 328, 178 (1987).
82. J. J. Salvo and N. D. F. Grindley, NARes 15, 9771 (1987).
83. H. Friesen and P. D. Sadowski, JMB 225, 313 (1992).
84. J.-W. Chen, B. R. Evans, L. Zheng and M. Jayaram, JMB 218, 107 (1991).
85. J. Kulpa, J. E. Dixon, G. Pan and P. D. Sadowski, JBC 268, 1101 (1993).
86. P. D. Sadowski, L. G. Beatty, D. Clary and S. Ollerhead, in “DNA Replication and Recombination” (R. McMacken and T. J. Kelly, eds.), p. 691. Alan R. Liss, New York, 1987.
87. A. A. Amin, L. G. Beatty and P. D. Sadowski, JMB 214, 55 (1990).
88. A. Amin, H. Roca, K. Luetke and P. D. Sadowski, MCBiol 11, 4497 (1991).
89. A. M. Segall and H. A. Nash, EMBO J. 12, 4567 (1993).
90. R. M. Gronostajski and P. D. Sadowski, MCBiol 5, 3274 (1985).
91. P. V. Prasad, L.-J. Young and M. Jayaram, PNAS 84, 2189 (1987).
92. B. R. Evans, J.-W. Chen, R. L. Parsons, T. K. Bauer, D. B. Teplow and M. Jayaram, JBC 265, 18504 (1990).
93. C. A. Pargellis, S. E. Nunes-Düby, L. M. de Vargas and A. Landy, JBC 263, 7678 (1988).
94. J. J. Champoux, in “DNA Topology and its Biological Effects” (N. R. Cozzarelli and J. J. Wang, eds.), p. 217. CSH Lab Press, Cold Spring Harbor, NY, 1990.
95. X.-H. Qian, R. B. Inman and M. M. Cox, JBC 265, 21779 (1990).
96. X.-H. Qian, R. B. Inman and M. M. Cox, JBC 267, 7794 (1992).
97. M. C. Serre and M. Jayaram, JMB 225, 643 (1992).
98. J.-W. Chen, J. Lee and M. Jayaram, Cell 69, 647 (1992).
98a. J. Lee, I. Whang, J. Lee and M. Jayaram, EMBO J. 13, 5346 (1994).
99. J.-W. Chen, S. H. Yang and M. Jayaram, JBC 268, 14417 (1993).
100. S. H. Yang and M. Jayaram, JBC 269, 12789 (1994).
101. Y. W. Han, R. I. Gumport and J. F. Gardner, EMBO J. 12, 4577 (1993).
102. S. E. Nunes-Düby, R. S. Tirumalai, L. Dorgai, E. Yagil, R. A. Weisberg and A. Landy, EMBO J. 13, 4421 (1994).
103. P. Droge, G. Hatfull, N. D. Grindley and N. R. Cozzarelli, PNAS 87, 5336 (1990).
104. N. D. F. Grindley, Science 262, 738 (1993).
105. R. Hanai and J. C. Wang, JBC 268, 23830 (1993).
106. M. D. Topal and M. Conrad, NARes 21, 2599 (1993).
107. G. Pan, K. Luetke and P. D. Sadowski, MCBiol 13, 3167 (1993).
108. R. L. Parsons, P. V. Prasad, R. M. Harshey and M. Jayaram, MCBiol 8, 3303 (1988).
109. R. L. Parsons, B. R. Evans, L. Zheng and M. Jayaram, JBC 265, 4527 (1990).
110. J. Lee and M. Jayaram, JBC 268, 17564 (1993).
111. O. Uziel, M.Sc. thesis. University of Toronto, 1994.
112. R. Holliday, Genet. Res. 5, 282 (1964).
113. H. Echols and L. Green, Genetics 93, 297 (1979).
114. L. W. Enquist, H. Nash and R. A. Weisberg, PNAS 76, 1363 (1979).
115. P. L. Hsu and A. Landy, Nature 311, 721 (1984).
116. S. E. Nunes-Düby, L. Matsumoto and A. Landy, Cell 50, 779 (1987).
117. P. A. Kitts and H. A. Nash, Nature 329, 346 (1987).
118. R. Hoess, A. Wierzbicki and K. Abremski, PNAS 84, 6840 (1987).
119. L. Meyer-Leon, L.-C. Huang, S. W. Umlauf, M. M. Cox and R. B. Inman, MCBiol 8, 3784 (1988).
120. M. Jayaram, K. L. Crain, R. L. Parsons and R. M. Harshey, PNAS 85, 7902 (1988).
121. P. A. Kitts and H. A. Nash, JMB 204, 95 (1988).
122. J. D. Griffith and H. A. Nash, PNAS 82, 3124 (1985).
123. S. J. Spengler, A. Stasiak and N. R. Cozzarelli, Cell 42, 325 (1985).


124. G. Pan and P. D. Sadowski, JBC 267, 12397 (1992).
125. G. Pan, K. Luetke, C. D. Juby, R. Brousseau and P. D. Sadowski, JBC 268, 3683 (1993).
127. J. E. Dixon and P. D. Sadowski, JMB 234, 522 (1993).
128. B. Franz and A. Landy, JMB 215, 523 (1990).
129. J. E. Dixon and P. D. Sadowski, JMB 243, 199 (1994).
130. J. Utatsu, S. Sakamoto, T. Imura and A. Toh-E., J. Bact. 169, 5537 (1987).
131. J.-W. Chen, B. R. Evans, S.-H. Yang, H. Araki, Y. Oshima and M. Jayaram, MCBiol 12, 3757 (1992).
132. J. A. H. Murray, G. Cesareni and P. Argos, JMB 200, 601 (1988).
133. W. Xiao, L. E. Pelcher and G. H. Rank, J. Mol. Evol. 32, 145 (1991).
134. B. Sauer, Curr. Opin. Biotechnol. 5, 521 (1994).
135. S. O’Gorman, D. T. Fox and G. M. Wahl, Science 251, 1351 (1991).
136. B. Sauer, W. Baubonis, S. Fukushige and L. Santomenna, Methods: Companion to Methods Enzymol. 4, 143 (1992).
137. W. Baubonis and B. Sauer, NARes 21, 2025 (1993).
138. B. Sauer, MCBiol 7, 2087 (1987).
139. H. Matsuzaki, R. Nakajima, J. Nishiyama, H. Araki and Y. Oshima, J. Bact. 172, 610 (1990).
140. E. C. Dale and D. W. Ow, PNAS 88, 10558 (1991).
141. S. Fiering, C. G. Kim, E. M. Epner and M. Groudine, PNAS 90, 8569 (1993).
142. H. Gu, Y. R. Zou and K. Rajewsky, Cell 73, 1155 (1993).
143. M. Lakso, B. Sauer, J. B. Mosinger, E. J. Lee, R. W. Manning, S.-H. Yu, K. L. Mulder and H. Westphal, PNAS 89, 6232 (1992).
144. K. G. Golic and S. Lindquist, Cell 59, 499 (1989).
145. T.-B. Chou and N. Perrimon, Genetics 131, 643 (1992).
146. T. Xu and G. M. Rubin, Development 117, 1223 (1993).
147. K. G. Golic, Genetics 137, 551 (1994).
148. S. C. Falco and D. Botstein, Genetics 105, 857 (1983).
149. L. P. Wakem and F. Sherman, Genetics 125, 333 (1990).
150. A. C. Morris, T. L. Schaub and A. A. James, NARes 19, 5895 (1991).


Reconstitution of Mammalian DNA Replication

ROBERT A. BAMBARA AND LIN HUANG

Department of Biochemistry and Cancer Center, University of Rochester School of Medicine and Dentistry, Rochester, New York 14642

I. Initiation at Replication Origins
II. Priming Reactions That Initiate DNA Replication
III. Mechanisms of Leading- and Lagging-strand DNA Synthesis
IV. Completion of Lagging-strand Synthesis
V. Regulation of Replication Reactions
References

This review is focused on efforts to perform the series of reactions necessary to carry out mammalian chromosomal DNA replication using purified enzymes in vitro. This work fits into a larger framework of genetic and cell biological experiments, using prokaryotes, viruses, and eukaryotes, that must be discussed in parallel with the work emphasized here. What follows is also meant to complement a number of excellent reviews on eukaryotic DNA replication (1-12). The particular features of DNA replication that we will emphasize include initiation at replication origins, and components and propagation of the replication fork.

I. Initiation at Replication Origins

A. The SV40 System

Current approaches to the reconstitution of the mammalian DNA replication fork derive from the work of Li and Kelly (13). They were able to carry out replication of SV40 origin-containing plasmids in monkey-cell extracts, which led to closed circular form I products. This required use of SV40-infected cell extracts or supplementation with purified T antigen. The accomplishment of complete synthesis suggested that all functions necessary for the replication process not directed by T antigen are carried out by cellular proteins. Furthermore, the extract system provided a means to purify these cellular proteins by complementation. Shortly thereafter, three groups began the identification of these replication proteins in human cells (13-16).

Abbreviations: ARS, autonomously replicating sequence; ABF, ARS binding factor; ORC, origin recognition complex; OBF, origin binding factor; CBF, core sequence binding factor; SSB, single-stranded DNA binding protein; RP-A, replication protein A; RF-C, replication factor C; PCNA, proliferating cell nuclear antigen; SV40, simian virus 40; HIV-RT, human immunodeficiency virus-1 reverse transcriptase.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

DNA replication begins when T antigen recognizes and binds SV40's origin of replication. T antigen binds at positions designated site I and site II (17) (Fig. 1). Although T antigen function is unique to SV40, the details of its action are important because it probably has common features with proteins that initiate DNA replication at all eukaryotic replication origins.

Significant structural features of the SV40 origin have been determined by genetic analysis and site-directed mutagenesis (18). The minimal or core origin contains a 64-bp viral genome sequence (Fig. 1). This core origin suffices to support the initiation of viral DNA replication both in vivo and in vitro. The 64-bp sequence has three domains. The center domain is the SV40 T-antigen recognition and binding site II, which contains four inverted repeats of the sequence GAGGC in 27 bp (17, 19-25). On the side of T-antigen binding domain site II, in the direction of early transcription, is a 17-bp imperfect palindrome sequence (Fig. 1), which is conserved in other papovaviruses (18). On the side of site II in the direction of late transcription is a 20-bp (A·T)-rich domain that serves as a DNA bending center (26). T-antigen binding site I is not part of the minimal origin, but lies just adjacent to the palindrome domain on its early side. Nevertheless, binding of T antigen to site I results in a major increase in the efficiency of initiation of replication both in vivo and in vitro (27-31).

Electron microscopy shows that ATP not only stabilizes binding of T antigen to the core origin, but also allows it to form a unique structure called the double hexamer (25, 32-34). After the double hexamer assembles, the T antigen carries out an unwinding process (35-37). T antigen initiates unwinding by melting an 8-bp region in the palindrome domain of the SV40 core origin, and concurrently induces a conformational change in the (A·T)-rich domain (38, 39).

The discovery that T antigen has an intrinsic 3'-to-5' helicase activity (40-42) added to our understanding of the role of T antigen in the initiation of SV40 DNA replication. T antigen is capable of continually unwinding DNA in the presence of RP-A, which stabilizes the melted single-strand region, and topoisomerase I, which relieves positive supercoils in the circular molecule (26, 36, 37, 43). The unwound (U) protein-DNA complex allows initiation of synthesis by DNA polymerase α/primase (44).
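The domain arithmetic of the core origin can be checked with a short sketch. It assumes only the lengths quoted above (17, 27, and 20 bp) and uses a made-up 27-bp sequence, `TOY_SITE_II`, that is not the real SV40 site II sequence; it was constructed merely to carry four GAGGC pentanucleotides, two on each strand. The function names are illustrative.

```python
# Domain lengths (bp) of the 64-bp SV40 core origin, as quoted in the text.
CORE_ORIGIN_DOMAINS = [
    ("imperfect palindrome (early side)", 17),
    ("T-antigen binding site II", 27),
    ("(A.T)-rich domain (late side)", 20),
]

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_both_strands(seq, motif):
    """Return (top_hits, bottom_hits) as 0-based start positions on the top strand.

    top_hits: positions where the motif itself occurs on the top strand.
    bottom_hits: positions where its reverse complement occurs on the top
    strand, i.e. where the motif reads 5'->3' on the bottom strand.
    """
    rc = reverse_complement(motif)
    last_start = len(seq) - len(motif)
    top_hits = [i for i in range(last_start + 1) if seq.startswith(motif, i)]
    bottom_hits = [i for i in range(last_start + 1) if seq.startswith(rc, i)]
    return top_hits, bottom_hits


core_length = sum(length for _, length in CORE_ORIGIN_DOMAINS)

# Hypothetical 27-bp sequence for illustration only (NOT the SV40 sequence).
TOY_SITE_II = "GCCTCAAGCCTCTATGAGGCAAGAGGC"
top, bottom = find_motif_both_strands(TOY_SITE_II, "GAGGC")

print("core origin length (bp):", core_length)
print("GAGGC on top strand at:", top)
print("GAGGC on bottom strand at:", bottom)
```

Searching both strands matters here because pentanucleotides arranged as inverted repeats appear on the top strand partly as GAGGC and partly as its reverse complement, GCCTC.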


FIG. 1. Initiation at the SV40 replication origin.

B. Initiation in Yeast and Mammals

In Saccharomyces cerevisiae, replication initiates at cis-acting elements called autonomously replicating sequences (ARS) (45-48). Their position and activity have been assessed by two-dimensional gel electrophoretic mapping techniques (49, 50). The ARS contains an 11-bp (A·T)-rich sequence located within the A element. On its 3' side is an element of more diverse sequence, about 80 nucleotides long, designated the B element. The B element has been called the DNA unwinding element because it readily melts (48, 51). It has recently been subdivided into three distinct elements, B1, B2, and B3 (52, 53). A six-subunit protein named the origin recognition complex (ORC) binds the A and B1 elements (54). Interaction of the ORC with ARS DNA causes a periodicity of DNase I hypersensitive sites in the A and B elements both in vitro (54) and in permeabilized cells (55).

The B3 element (52, 53) need not be located 3' to the A element to be functional. In ARS121, it is at the 5' end of the A element, where it can function as a DNA replication enhancer in an orientation- and distance-independent manner (56). The B3 element binds the OBF1/ABF1 protein (53, 56), which has been purified independently in several laboratories as an ARS binding protein (57-63). A specific and stable multiprotein complex has been assembled stepwise in vitro at the ARS121 origin by Eisenberg and colleagues (63), and analyzed by gel-shift assay. The first step involves ATP-independent binding of ABF-1 and a factor designated OBF-2. The next step is the ATP-dependent binding of the protein designated CBF to the essential core sequence. This suggests that CBF is a component of the ORC. Yeast origin structure, hypotheses concerning origin function, and the ORC have been recently reviewed (7, 64, 65).

Animal-cell replication origins are distributed throughout regions called “initiation zones” of several kilobasepairs (kbp) or larger (66). Little is known about the essential features of animal cell origins because of the complexity of direct genetic manipulations in their large chromosomes and the lack of suitable assays for origin position and activity (67-69). Schizosaccharomyces pombe also displays replication at initiation zones (70), making it an interesting model for mammalian initiation. Recent results with S. pombe suggest that the 4-kbp zone in the ura4 origin region consists of multiple mutually interfering origins (71). This is consistent with the observation of clusters of origin sequence elements in initiation zones of chromosomes from many eukaryotes (72).

II. Priming Reactions That Initiate DNA Replication

In the middle 1970s, several groups reported that the nascent DNA chains synthesized from normal or virus-infected cells contain transient, covalently linked, 5'-terminal oligoribonucleotides about 10 residues long, and of varying sequences (73-76). They were designated initiator RNAs (73).

Synthesis of RNA that can be used to initiate DNA synthesis requires the action of a primase. Lehman and colleagues first described the copurification of primase activity with the major species of polymerase α from Drosophila melanogaster embryos (77-79). The immunological evidence for a tight association of primase activity with a partially purified polymerase α from murine Ehrlich ascites cells was provided by Yagura et al. (80, 81). Although the primases obtained from various eukaryotic cells have different structures, their function is well conserved (reviewed in 1). The initiator RNA synthesized is in the range of 6 to 14 nucleotides long if there is concomitant DNA synthesis, but could be 24 to 36 nucleotides long if DNA synthesis is not allowed (82, 83). Initiation is usually with rA, but the RNA-DNA junction sequence is not specific.

The primase activity of human polymerase α has been characterized in the presence of specific and complete monoclonal antibody inhibition of the DNA polymerase active site. In the presence of physiological concentrations of NTPs, but with dNTPs between 0.08 and 0.8 µM, the primase made a product consisting of strictly alternating tracts of RNA and DNA about 12 nucleotides long (83). These conditions reveal a unique alternating synthesis mechanism of the primase. At higher concentrations of dNTPs, the first segment of RNA was made, followed by a variable-length segment of DNA. The authors postulated that at physiological dNTP concentration, the enzyme becomes stabilized in the deoxy-polymerizing mode after synthesis of the first tract of RNA. This was presumed to be the product transferred to the polymerase active site for further elongation.

A unique feature of primase is its low fidelity. Calf thymus primase will misincorporate NTPs at a frequency as high as 1/200 on homopolymeric templates (84). Sheaff and Kuchta showed that primase readily misincorporates nucleotides in the central and 3' regions of the RNA primer, but accurately incorporates the first two nucleotides (85).

After synthesis of a primer on single-stranded DNA, the newly generated primer-template is transferred intramolecularly to the active site for DNA synthesis on the large subunit of polymerase α, for subsequent deoxynucleotide addition (86, 87). This coordination of RNA and then DNA synthesis occurs each time a primer is initiated. The signal that governs the switch from RNA to DNA synthesis is intrinsic in the primase mechanism, and is generated by ambient dNTPs (83). Interestingly, polymerase α will efficiently elongate primase-generated primers that contain many misincorporated nucleotides (85).

The priming and synthesis reactions for SV40 DNA replication in vitro begin after a time delay of 8 to 10 minutes (35, 37, 38, 88-90). This period has been designated the presynthesis stage. This delay can be avoided by a prior incubation with T antigen, the single-stranded binding protein RP-A, and topoisomerase I or II (25, 33, 34). This period involves the steps of origin recognition and unwinding, followed by polymerase α binding and primase action, as shown in Fig. 1.
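To give a feel for what a misincorporation frequency of 1/200 means for a single primer, the following back-of-the-envelope sketch may help. It rests on simplifying assumptions that are not claims of the studies cited: the upper-bound frequency measured on homopolymeric templates is applied uniformly and independently at every error-prone position, the primer is taken as 10 nucleotides long, and the first two nucleotides are treated as error-free. All names in the code are illustrative.

```python
def prob_at_least_one_error(error_rate, error_prone_positions):
    """Probability that at least one of n independent positions is misincorporated."""
    return 1.0 - (1.0 - error_rate) ** error_prone_positions


# Assumed values for illustration (see the lead-in for where they come from).
ERROR_RATE = 1 / 200          # upper-bound misincorporation frequency per position
PRIMER_LENGTH = 10            # nucleotides; initiator RNAs are "about 10 residues long"
ACCURATE_5PRIME_POSITIONS = 2 # first two nucleotides are incorporated accurately

error_prone = PRIMER_LENGTH - ACCURATE_5PRIME_POSITIONS
expected_errors = ERROR_RATE * error_prone
p_any_error = prob_at_least_one_error(ERROR_RATE, error_prone)

print(f"error-prone positions per primer: {error_prone}")
print(f"expected misincorporations per primer: {expected_errors:.3f}")
print(f"fraction of primers with at least one error: {p_any_error:.4f}")
```

Under these assumptions, roughly one primer in 25 would carry at least one misincorporated nucleotide, which puts in context the observation that polymerase α tolerates mismatched primers.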


III. Mechanisms of Leading- and Lagging-strand DNA Synthesis

A. The Monopolymerase System

The minimum complement of cellular proteins that carries out extensive bidirectional replication of SV40 includes T antigen, DNA polymerase α/primase, topoisomerase I, and human SSB (RP-A) (91). As discussed below, hybridization analysis showed that larger products resulted from leading-strand synthesis, whereas shorter products were derived from lagging-strand synthesis (92). This is reminiscent of the situation in prokaryotes, in which the same DNA polymerase carries out leading- and lagging-strand DNA replication, with the necessary asymmetry of function provided by auxiliary subunits. However, the nucleus of eukaryotic cells contains three DNA polymerases, α (93, 94), δ (95), and ε (96), each of which is required for viability in yeast. Their presence suggests that there is a more complex replication mechanism in eukaryotes. Alternatives include a role for all three polymerases in the process of chromosomal replication, or participation of one or more of the polymerases in a repair process that is critical for viability.

B. Polymerase δ and PCNA

First indications that chromosomal DNA replication might require more than one DNA polymerase came from the study of DNA polymerase δ (97, 98). This nuclear enzyme, after purification from calf thymus, was distinguished from other polymerases by its intrinsic 3'-to-5' exonuclease activity and subunits of 125 and 48 kDa (98, 99). The larger catalytic subunit (100-102) and smaller subunit (Antero So, personal communication) have now been cloned from human and calf. In early stages of purification, polymerase δ utilized the substrate oligo(dT)·poly(dA) (97, 98). Further purification reduced this activity, because of the removal of an important auxiliary factor. This auxiliary factor was later purified based on the ability to stimulate polymerase δ synthesis on the homopolymer substrate (103).

Proliferating cell nuclear antigen (PCNA) had been under investigation for some years prior to the discovery of the δ auxiliary factor, as a protein that might be involved in the control of cell growth. It appears at higher levels in tumor cells compared to normal cells (104-109). Immunostaining analysis showed that its level rises with cell growth (110-113) and DNA synthesis (114-119). The polypeptide is about 36 kDa and has been cloned from rat and human (120, 121) cells.

Two important logical connections followed: the realization that PCNA is the auxiliary subunit of DNA polymerase δ (103, 122, 123), and that PCNA is required for replication of SV40 in vitro (124, 125). These connections led to the proposal that DNA polymerase δ and PCNA elongate the leading strand in SV40, whereas DNA polymerase α participates in lagging-strand synthesis (124, 126-128).

C. The Single-stranded DNA Binding Protein RP-A (RP-A, SSB)

Single-stranded DNA binding proteins assist in the strand separation reactions necessary for DNA metabolism in prokaryotes and eukaryotes (10). The mammalian single-stranded binding protein RP-A was identified as an essential protein for SV40 DNA replication in vitro (43, 91, 129). It is a heterotrimer with subunits of 70, 32, and 14 kDa (43, 129, 130). The cDNAs for the subunits have been isolated (131-133) and coexpressed in E. coli to form an active complex (134). The genes for all three RP-A polypeptides are essential in yeast (135, 136). The DNA binding ability resides in the 70-kDa subunit (124, 130, 132). The 32-kDa subunit becomes phosphorylated at the beginning of S phase (137), suggesting cell-cycle-dependent regulation of function.

RP-A has high affinity for single-stranded DNA, but low affinity for double-stranded DNA and RNA (138). Human RP-A forms two distinct complexes with single-stranded DNA (139). One complex cooperatively binds 8-10 nucleotides; the other binds 30 nucleotides with a preference for pyrimidine-rich sequences and displays low binding cooperativity (139, 140). The binding-site size of the single-stranded DNA binding protein isolated from Drosophila was about 22 nucleotides (141).

RP-A increases the fidelity of DNA synthesis by polymerase α 2- to 8-fold in vitro (142). RP-A also stimulates the activity of calf thymus DNA helicases A, B, C, D (143), and E (144), and HeLa cell helicases ε (145) and α (146). RP-A stimulates pol α/primase in a species-specific manner, whereas its stimulatory effect on polymerase δ can be provided by other single-strand DNA binding proteins (147). Polymerase ε activity from both yeast and human cells is significantly increased in the presence of RP-A (148). Aside from its role in DNA replication, RP-A is also thought to act in DNA repair (149, 150) and recombination (135, 151).

D. RF-C, RP-A, and the Assembly of the Highly Processive Leading-strand Complex

A key to the understanding of the leading-strand replication mechanism was the discovery of RF-C. RF-C was identified as a necessary factor for complete SV40 DNA replication (152). Replication reactions performed in the absence of RF-C or PCNA resulted in accumulation of short products. Hybridization analysis showed that these were primarily lagging-strand products. RF-C is a multisubunit factor with five components of 140-145, 40, 38, 37, and 36 kDa (153, 154) in human cells. The structural and functional properties of RF-C are highly conserved from yeast to humans (155-157). The genes for all the known subunits of RF-C have been cloned and sequenced from both human (153, 158-160) and yeast cells (161, 162, and Bruce Stillman, personal communication).

RF-C is also a DNA-dependent ATPase (163), with greater affinity for primer-template structures than for single- or double-stranded DNA. The ATPase activity is stimulated by PCNA (163). Binding analysis showed that the presence of ATP increases the specific interaction of RF-C with primer-template junctions (164). PCNA interacts with the RF-C-primer-template complex. Specificity of primer-template junction recognition by RF-C is heightened by the presence of RP-A (164).

Although PCNA stimulates highly processive DNA synthesis on oligo(dT)·poly(dA) (103), Tsurimoto and Stillman (127) found that the action of polymerase δ, PCNA, and RF-C is necessary to allow highly processive synthesis on primed circular M13 DNA, making products thousands of nucleotides in length. RP-A, RF-C, and PCNA cooperatively stimulate the activity of DNA polymerase δ (127, 147). [RF-C has also been designated A1 by Hurwitz and colleagues (165).]

Other work (166) suggests that assembly of the leading-strand replication complex is the result of an interaction between RF-C, RP-A, PCNA, and DNA polymerase δ. An important role of RP-A is to control the use of primed single-stranded DNA by DNA polymerases. As the concentration of RP-A was increased (166), it first stimulated and then inhibited DNA polymerase α. Over the same concentration range, it thoroughly inhibited DNA polymerase δ, in the presence or absence of PCNA, RF-C, or both. However, when ATP was present with all four proteins, active polymerization occurred. When RF-C, RP-A, and PCNA were present, DNA polymerase α was still active for synthesis. However, addition of ATP stopped the reaction by polymerase α. The authors (166) concluded that RF-C and PCNA can form a primer recognition complex in the presence of ATP that allows primer binding and extension by polymerase δ but not α. Polymerase δ appeared to carry along the RF-C during synthesis, because addition of ATPγS immediately inhibited primer elongation.
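As a quick consistency check on the subunit sizes quoted in this section, the sketch below sums the reported masses for RP-A, RF-C, and polymerase δ. It assumes one copy of each listed subunit per complex, which the text states only for the RP-A heterotrimer; for RF-C and polymerase δ this stoichiometry is an assumption made for illustration, and the constant names are hypothetical.

```python
# Subunit masses in kDa, as quoted in the text.
RPA_SUBUNITS_KDA = (70, 32, 14)
POL_DELTA_SUBUNITS_KDA = (125, 48)
RFC_LARGE_SUBUNIT_RANGE_KDA = (140, 145)   # reported as a range
RFC_SMALL_SUBUNITS_KDA = (40, 38, 37, 36)


def complex_mass(subunits_kda):
    """Sum subunit masses, assuming one copy of each subunit."""
    return sum(subunits_kda)


def rfc_mass_range():
    """Return (low, high) for RF-C, carrying the range of the large subunit through."""
    small_total = complex_mass(RFC_SMALL_SUBUNITS_KDA)
    low, high = RFC_LARGE_SUBUNIT_RANGE_KDA
    return low + small_total, high + small_total


rpa_kda = complex_mass(RPA_SUBUNITS_KDA)
pol_delta_kda = complex_mass(POL_DELTA_SUBUNITS_KDA)
rfc_low_kda, rfc_high_kda = rfc_mass_range()

print(f"RP-A heterotrimer: {rpa_kda} kDa")
print(f"polymerase delta (two subunits): {pol_delta_kda} kDa")
print(f"RF-C (five subunits): {rfc_low_kda}-{rfc_high_kda} kDa")
```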

E. Reconstitution of SV40 DNA Replication with Two DNA Polymerases

The specificity for primer use conferred by RP-A and the characterization of the primer recognition complex prompted Tsurimoto and Stillman to propose the currently accepted mechanism by which polymerases α and δ participate in SV40 replication (166). They examined replication of an SV40 origin-containing plasmid in the presence of T antigen, topoisomerases I and II, and DNA polymerase α. DNA synthesis required RP-A, and produced long leading-strand products and short lagging-strand products, as expected for the monopolymerase system. However, when the RP-A concentration was raised from the stimulation peak at 12.5 µg/ml to 50 µg/ml, the long products disappeared. Evidently high RP-A concentration blocked rebinding of polymerase α to the nascent leading strands. Addition of PCNA and RF-C eliminated long products, even at low RP-A levels. Most importantly, the addition of polymerase δ, in the presence of PCNA and RF-C, restored the synthesis of long products, even in the presence of high RP-A concentrations. DNA polymerase α was still necessary to synthesize significant amounts of short lagging-strand products.

These observations led Tsurimoto and Stillman to propose the polymerase switching model (166), in which the leading strand is initiated by DNA polymerase α on an RP-A-coated template. After completion of the first Okazaki fragment, RF-C, driven by ATP, forms a complex with the primer terminus and then recruits PCNA to make the primer recognition complex. Polymerase δ then replaces polymerase α, and begins highly processive synthesis in a complex with PCNA and RF-C. This model is depicted in Fig. 2.

At the time of this work, polymerase α was proposed to carry out both priming and elongation of the lagging-strand segments. RF-C was thought to function as a stimulator of polymerase α (164, 166). More recent evidence based on investigation of completion of lagging-strand synthesis, discussed below, suggests that polymerase switching also occurs on the lagging strand.
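The reconstitution results that motivated the switching model amount to a small truth table, which the following sketch encodes as a qualitative rule. It is a simplified summary under stated assumptions, not a kinetic model: T antigen, topoisomerases, RP-A, ATP, and polymerase α/primase are taken to be present in every reaction, RP-A is reduced to a low/high flag (12.5 versus 50 µg/ml in the experiments described), and polymerase δ without PCNA and RF-C is assumed to contribute no long products. The function name and its parameters are hypothetical.

```python
def predicted_products(pol_delta, pcna_and_rfc, high_rpa):
    """Qualitative product pattern for an SV40 reconstitution reaction.

    Assumes T antigen, topoisomerases, RP-A, ATP and polymerase alpha/primase
    are always present. Returns a dict of booleans for long leading-strand
    and short lagging-strand products.
    """
    # Polymerase alpha alone extends leading strands only when RP-A is low
    # and no PCNA/RF-C primer recognition complex blocks its rebinding.
    long_by_alpha = (not high_rpa) and (not pcna_and_rfc)
    # Polymerase delta makes long products only with PCNA and RF-C,
    # and does so regardless of the RP-A level.
    long_by_delta = pol_delta and pcna_and_rfc
    return {
        "long_leading": long_by_alpha or long_by_delta,
        # Short products depend on polymerase alpha, assumed present here.
        "short_lagging": True,
    }


conditions = [
    ("pol alpha only, low RP-A", False, False, False),
    ("pol alpha only, high RP-A", False, False, True),
    ("pol alpha + PCNA/RF-C, low RP-A", False, True, False),
    ("pol alpha + pol delta + PCNA/RF-C, high RP-A", True, True, True),
]
for label, pol_delta, pcna_and_rfc, high_rpa in conditions:
    result = predicted_products(pol_delta, pcna_and_rfc, high_rpa)
    print(f"{label}: long products = {result['long_leading']}")
```

Writing the rule this way makes the logic of the model explicit: the same factors that exclude polymerase α from the leading-strand primer terminus are the ones that admit polymerase δ.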

F. Details of the Leading-strand Replication Process

Tsurimoto and Stillman (163) pointed out the remarkable functional resemblance between the interaction of DNA polymerase δ, PCNA, and RF-C, and the mechanisms of bacteriophage T4 and E. coli DNA replication. The T4 gene 44/62 protein complex has ATPase and primer binding activities similar to those of RF-C (167-170). The T4 gene-45 protein is a functional and partial sequence homolog of PCNA that stimulates the ATPase of the 44/62 protein just as PCNA stimulates the ATPase of RF-C (167). Both proteins are needed to assemble a processive replication complex with the gene-43 DNA polymerase (167).

In E. coli, the bulk of chromosomal DNA replication is carried out by DNA polymerase III (171). Components of this enzyme include an interactive group of the protein subunits γ, δ, δ′, χ, and ψ called the γ complex (172). The γ complex carries out an ATP-dependent assembly of the β subunit of DNA polymerase III onto primed DNA (173).


FIG. 2. The polymerase switching mechanism.

The β subunit is assembled into a toroidal homodimer (174) that can slide freely on DNA but cannot readily dissociate. It then interacts with the polymerizing subunit to make it highly processive for DNA synthesis.

It has been proposed that PCNA is a toroidal homotrimer (174, 175) that can slide onto the ends of linear DNA, but must be assembled around closed circular DNA by the ATP-driven action of RF-C (176). Recent crystallization of yeast PCNA (177) verified the toroidal structure. Binding of polymerase δ and possibly other replication proteins to PCNA locks them onto the template strand for highly processive DNA synthesis. Burgers and Yoder (176) postulated that PCNA could assemble into a highly processive complex with polymerase δ and a linear homopolymer substrate because the PCNA could slide over the end of the linear template. Because primed M13 is circular, the RF-C-mediated, ATP-dependent assembly of the PCNA toroid around the template is required.

G. DNA Polymerase ε

DNA polymerase ε was first isolated from rabbit bone marrow as a polymerase containing an intrinsic 3'-to-5' exonuclease (178). It has also been isolated from human placenta (179, 180), cultured cells (181, 182), calf (99, 183-189), and yeast cells (reviewed in 186). The major active-site subunits range from 140 kDa, for the major form in calf, to 258 kDa in humans. However, the calf has a minor form at 220 kDa, suggesting that the smaller form is a product of proteolysis (187). The yeast enzyme also appears as two forms, a tetramer with subunits of 200, 80, 30, and 29 kDa and a smaller form of 145 kDa (186). The 258-kDa catalytic subunit of the human enzyme has been cloned and sequenced (188).

Unlike polymerase δ, the capacity of polymerase ε to synthesize on oligo(dT)·poly(dA) was retained throughout purification (189, 190). This suggests that polymerase ε is active in the absence of PCNA. Indeed, it is also processive for 500-1500 nucleotides on primed single-stranded DNA (182, 191, 192), with little effect of added PCNA (185). Furthermore, addition of RF-C, ATP, and PCNA in various combinations to the yeast polymerase ε did not significantly alter processivity, but did reduce salt sensitivity of the activity (148). In general, analyses of both human and yeast polymerase ε (148) showed that approximately 200 mM salt inhibits synthesis on primed poly(dA) or M13 DNA. However, addition of SSB from one of several organisms (human, yeast, E. coli, or T4), with HeLa PCNA, A1 protein (RF-C), and ATP, allowed synthesis to occur. These components also stabilized binding of either polymerase δ or ε to the primer-template, although binding was still rather weak for polymerase ε. These results suggest that DNA polymerase ε and polymerase δ may participate in similar reactions such as polymerase switching.

Yeast polymerase ε is encoded by a gene distinct from that of other DNA polymerases, and its disruption is lethal (96). Morrison et al. (96) also pointed out that defects in polymerase α, δ, or ε all result in the same terminal cell morphology. This was a dumbbell-shaped cell with the nucleus at the isthmus. It is indicative of a defect in DNA replication (193). There is also evidence of involvement of polymerase ε in repair (181). However, participating in repair does not detract from the likelihood of an involvement in replication, because polymerases α and δ are also implicated in repair (194-196). Interestingly, PCNA is also required for DNA excision repair (196). This further suggests that either or both DNA polymerases δ and ε have dual roles in replication and excision repair. Mutations in many known DNA repair genes in yeast almost always result in viable cells (197), suggesting that cells can survive disruption of a polymerase solely involved in DNA repair. Of course, because repair processes are redundant, if polymerase ε were required in all repair pathways, its absence could be lethal.

DNA polymerase ε was proposed to serve as the mammalian leading-strand polymerase based on its requirement for viability and capacity for highly processive DNA synthesis (96). Meanwhile, Lee et al. (148) substituted DNA polymerase ε for DNA polymerase δ in the dipolymerase SV40 DNA replication system in vitro. Long products characteristically made by polymerase δ disappeared. Overall, products were only about one or two times longer than those made by polymerase α alone, suggesting that polymerase ε is not designed to carry out leading-strand synthesis. The somewhat longer products made in the presence of polymerase ε were also dependent on the presence of polymerase α/primase. This result suggests that polymerase ε had extended Okazaki fragments made by polymerase α.

The likelihood that polymerases δ or ε complete Okazaki fragment synthesis is consistent with other observations (198) that synthesis of the longer lagging-strand fragments of SV40 replication in vitro is less sensitive to inhibition by butylphenyl-dGTP than that of the shorter fragments. Because this inhibitor affects polymerase α at considerably lower concentration than polymerase δ or ε, the result suggests that in their system polymerase δ or ε was adding to primers initiated by polymerase α. Furthermore, Bullock et al. (44) showed that antibodies to PCNA shorten lagging-strand SV40 segments made in vitro. This suggests that a PCNA-dependent enzyme was completing synthesis of these fragments. However, that polymerase also could be DNA polymerase δ. This was again pointed out by Waga et al. (199, 200), who failed to identify polymerase ε as a protein that detectably improves the efficiency of production of form I closed-circular SV40 DNA in vitro. If polymerases δ or ε can be used interchangeably for completion of lagging-strand synthesis of SV40 DNA in vitro, the need for inclusion of polymerase δ will make it difficult to verify the possible roles for polymerase ε.

IV. Completion of Lagging-strand Synthesis

A. The Prokaryotic Mechanism

In E. coli, the lagging strands are elongated by DNA polymerase III up to the initiator RNA on the next nascent DNA strand. The initiator RNA is thought to be removed by the action of the 5'-to-3' exonuclease of DNA polymerase I (10). The E. coli RNase H may participate in the process (201), but genetic experiments show that it is not essential (202). The 5'-to-3' exonucleases of both DNA polymerase I and the Taq polymerase are greatly stimulated by synthesis from the upstream primer to generate a nick (203). The coordinated action of synthesis and nuclease functions, termed nick translation (204), is thought to be the means of RNA removal. It also generates a nicked structure between the resulting two nascent DNA segments. DNA ligase then joins the DNAs.

Polymerase I and the Taq polymerase also have an endonuclease function (203). A very specific substrate, consisting of two primers annealed to a template, is required. The downstream primer must have an unannealed 5' end region. The upstream primer must be annealed with its 3' end directly adjacent to the first annealed nucleotide of the downstream primer. The endonuclease then removes the unannealed region as an intact segment. The role, if any, of the endonuclease function in lagging-strand DNA synthesis is not known.

B. Reconstitution of the Joining of Mammalian Lagging-strand Products Requires a 5'-to-3' Exonuclease

There is no homolog of DNA polymerase I in mammalian cells. However, a thorough characterization of a mammalian 5'-to-3' exonuclease suggests that it performs the same role as the 5'-to-3' exonuclease of DNA polymerase I. A key observation was that a 5'-to-3' exonuclease is required for generation of closed circular products during SV40 DNA replication in vitro (92). The authors began with the monopolymerase SV40 DNA replication system described by Wobbe et al. (91), utilizing SV40 origin-containing DNA, HeLa DNA polymerase α/primase, topoisomerase I, and the HeLa SSB (RP-A). This group of components produced form-II nicked double-stranded DNA products. Supplementation with HeLa DNA ligase, RNase H1, and topoisomerase II still did not lead to closure. However, HeLa extracts contained an activity that would allow generation of form-I DNA. Using closure as an assay, a 5'-to-3' exonuclease was purified (92) that was a monomeric protein of 44 kDa. It was a 5'-to-3' exonuclease that could degrade oligo(dA) or oligo(rA) annealed to poly(dT). Mononucleotide products were released from the 5', but not the 3', ends of the substrate. Primers not annealed to templates resisted cleavage. The preparation contained no phosphatase activity. RNase H1, isolated by a variation of the method of DeFrancesco and Lehman (205), stimulated priming and the subsequent DNA synthesis reaction severalfold.

In the absence of the 5'-to-3' exonuclease, products were made that spanned about half of the length of the SV40-origin-containing plasmid; a second population was about 200 nucleotides long (92). Hybridization experiments with leading- and lagging-strand probes verified that longer products were leading-strand intermediates, whereas the shorter ones were lagging-strand Okazaki fragments. Overall, these results implicated the 5'-to-3' exonuclease in the removal of initiator RNA.

The 5'-to-3' exonuclease identified by complementation of SV40 DNA replication was identical to a cellular enzyme designated the "pL protein" (206). This protein aided the initiation of adenovirus DNA replication when the viral DNA lacked terminal protein. The nuclease acted by removing nucleotides from the 5' end of the double-stranded viral genome, providing a single-strand origin region.

Working in parallel on DNA enzymes isolated from mouse cells, Goulian and Heard (207) came to virtually the same conclusions. They created a substrate to investigate the reactions of lagging-strand synthesis. DNA polymerase α/primase was used to randomly prime fd DNA and to continue synthesis until the DNA extended from each primer encountered the next adjacent downstream RNA primer. This substrate, when exposed to purified mouse DNA ligase I, RNase H1, and an extract fraction from mouse cells, was converted into closed circular DNA (207). This allowed an assay from which the essential factor in the extract could be purified. The factor was a 49-kDa 5'-to-3' exonuclease with specificity for primed homopolymeric DNA virtually identical to that of the HeLa exonuclease. They named it the cca (circle-closing activity) nuclease. A trade of the enzymes between the two laboratories showed that each could function in the other system.

Experiments were then conducted to determine the respective roles of the 5'-to-3' exonuclease and the RNase H1 in the process of RNA primer removal (208). In terms of nucleotide release, the exonuclease alone degraded about half of the RNA. The RNase H1 alone degraded 80 to 90% of the RNA. Together, these enzymes effectively removed all of the RNA.
It was concluded that the 5'-to-3' exonuclease is relatively inert on the intact initiator RNA, with the observed activity likely caused by contaminating RNase H. They postulated that the RNase H removed most of the primer by endonuclease action. The remaining one or two nucleotides would then be susceptible to the 5'-to-3' exonuclease. It is not clear why the original ribonucleotides in the initiator RNA resisted cleavage, whereas those remaining after RNase H action were susceptible. Harrington and Lieber (209) purified a mouse 5'-to-3' exonuclease, designated FEN-1 (flap endonuclease), that appears to be the same as the cca nuclease. It did not degrade RNA oligomers annealed to DNA, which suggests that RNA length or sequence, or overall substrate structure, determines whether RNA is susceptible to cleavage.

The genes for the mouse and human FEN-1 nucleases have recently been cloned, sequenced, and expressed (210). The murine FEN-1 gene is highly homologous with the S. cerevisiae genes YKL510 and RAD2. RAD2 is essential for yeast excision repair (211) and was previously shown to be a nuclease (212). The expressed YKL510 peptide and a truncated form of RAD2 have structure-specific endonuclease activity (210). These results support a dual role for the 5'-to-3' exonuclease in DNA replication and excision repair.

C. Unique Substrate Specificity of the 5'-to-3' Exonuclease

An important feature of DNA polymerase I is its ability to carry out nick translation, the simultaneous synthesis from an upstream primer and degradation of a downstream primer. This results in movement of a nick in double-stranded DNA in the 5'-to-3' direction. This reaction is thought to be part of the process of initiator RNA removal and replacement of damaged DNA (10).

The nick-translation phenomenon was investigated using the calf 5'-to-3' exonuclease, a homolog of the nucleases from HeLa and mouse cells. This 5'-to-3' exonuclease copurified through a number of chromatography steps with calf DNA polymerase ε (213). A fraction containing the polymerase and the nuclease degraded a synthetic DNA or RNA oligonucleotide annealed to M13 DNA, but its activity depended on DNA synthesis from an upstream primer. Purification revealed the nuclease to be a monomeric protein that degraded a downstream deoxyoligomer irrespective of whether the DNA polymerase used to extend the upstream primer was calf polymerase α, δ, or ε, E. coli polymerase I Klenow fragment, or T7 polymerase. If only one, two, or three of the four deoxynucleoside triphosphates were used in the reaction, limiting the furthest position of extension of the upstream primer, the degradation of the downstream primer was likewise limited. These results suggested that the 5'-to-3' exonuclease was designed to work with a DNA polymerase for simultaneous synthesis and degradation.

It was not clear from these results, however, what was responsible for the activation of the nuclease. One likely possibility is that the presence of a polymerase abutting the downstream primer is significant. Another is that the upstream primer had to be extended directly up to the downstream primer. We later found that the nuclease alone acts efficiently in a two-primer system if only a nick separates the primers (214). Under these circumstances the first nucleotide is rapidly removed from the downstream primer. However, there is essentially no further cleavage. Similarly, when a substrate with a one-nucleotide gap between the primers, or one with no upstream primer at all, was tested, the activity of the nuclease was negligible. Parallel work (209) reached the same conclusions concerning the exonuclease from the mouse.

These results explain convincingly why polymerization from an upstream primer stimulates exonuclease activity on a downstream primer. Polymerization continuously generates the nicked structure necessary for exonuclease action. It is also clear why either calf or prokaryotic DNA polymerases stimulated the calf exonuclease. As long as the polymerase could generate a nick, the nuclease would be stimulated. These observations show that the nuclease can act with any polymerase to carry out nick translation.

The actual process of nick translation was demonstrated using a substrate consisting of a synthetic template with a foldback upstream primer (215). A downstream primer separated by a four-nucleotide gap was then annealed. Any of DNA polymerases α, δ, or ε could extend the upstream primer to generate a nick that could be sealed by calf DNA ligase I. Podust and Hübscher (216) obtained similar results, but had more difficulty obtaining ligation in reactions containing polymerase α, possibly because the polymerase is inefficient at complete gap filling (217). In fact, they point out that secondary structure in their template may have exacerbated this problem. These results show that the 5'-to-3' exonuclease does not work uniquely with only one DNA polymerase to carry out nick translation. Therefore, specificity of the nuclease cannot be used to define the DNA polymerase that participates in the joining of Okazaki fragments.

Synthesis and ligation also occur in the presence of the calf 5'-to-3' exonuclease (215). In this case, it appears that nucleotides are removed from the downstream primer prior to ligation. This means that synthesis from the upstream primer replaces the removed nucleotides, continually generating a nick that can be sealed. Because the nuclease works best on a nicked structure, genuine nick translation must occur, moving the nick into the downstream primer, followed by ligation.
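The nick dependence described in this section can be summarized as a small simulation. The following Python sketch is illustrative only: the class and function names, the integer template coordinates, and the `extension_limit` parameter (standing in for the position at which synthesis stops when only some of the four dNTPs are supplied) are assumptions introduced here, not details of the cited experiments. It encodes two observations from the text: the nuclease alone removes one nucleotide at a nick and then stalls, and a polymerase that keeps regenerating the nick allows degradation to continue. Ligation is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TwoPrimerSubstrate:
    # Hypothetical coordinates along the template, in the direction of synthesis.
    upstream_end: Optional[int]  # position just past the upstream primer's 3' end (None = no upstream primer)
    downstream_start: int        # position of the downstream primer's 5' nucleotide
    downstream_end: int          # position just past the downstream primer's 3' end

    def gap(self) -> Optional[int]:
        """Number of unpaired template nucleotides between the two primers."""
        if self.upstream_end is None:
            return None
        return self.downstream_start - self.upstream_end


def exonuclease_step(s: TwoPrimerSubstrate) -> bool:
    """Remove one 5' nucleotide from the downstream primer, but only at a nick (gap == 0)."""
    if s.gap() == 0 and s.downstream_start < s.downstream_end:
        s.downstream_start += 1
        return True
    return False


def polymerase_fill(s: TwoPrimerSubstrate, extension_limit: int) -> None:
    """Extend the upstream primer up to the downstream primer or the assumed dNTP-imposed limit."""
    if s.upstream_end is None:
        return
    target = min(s.downstream_start, extension_limit)
    s.upstream_end = max(s.upstream_end, target)


def nick_translate(s: TwoPrimerSubstrate, extension_limit: int) -> int:
    """Alternate gap filling and nuclease action; return the number of nucleotides removed."""
    removed = 0
    while True:
        polymerase_fill(s, extension_limit)
        if not exonuclease_step(s):
            return removed
        removed += 1


if __name__ == "__main__":
    nicked = TwoPrimerSubstrate(upstream_end=10, downstream_start=10, downstream_end=30)
    print("nuclease alone, first step:", exonuclease_step(nicked))
    print("nuclease alone, second step:", exonuclease_step(nicked))

    gapped = TwoPrimerSubstrate(upstream_end=10, downstream_start=14, downstream_end=34)
    print("removed with polymerase:", nick_translate(gapped, extension_limit=20))
```

In this toy model the degradation stops one nucleotide beyond the point where synthesis stalls, which mirrors the observation that limiting extension of the upstream primer likewise limits degradation of the downstream primer.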

D. Lagging-strand Completion in the Two-polymerase Reconstitution of DNA Replication in SV40

The most complete reconstitution of mammalian DNA replication to date was recently carried out in the SV40 system (199, 200). As with earlier reconstitution reactions, the components necessary for initiation at the origin and production of leading- and lagging-strand products were present. These included T antigen, DNA polymerases α and δ, RF-C, PCNA, RP-A, and topoisomerases I and II. However, the goal was to identify the additional components necessary to produce closed-circular form-I DNA. Two fractions from human cells were identified that allowed generation of the form-I product. From one was purified a component named maturation factor I. This protein was a 44-kDa 5'-to-3' exonuclease specific for double-stranded DNA. It was clearly identified by the authors as the same 5'-to-3' exonuclease described above. The other fraction contained DNA ligase I. Purified calf DNA ligase I was an effective substitute for this latter factor.

Another important observation was that DNA ligase I rather than III was uniquely required for Okazaki fragment maturation during SV40 replication in vitro. Mammalian cells contain at least three DNA ligases (218, 219). Evidence suggests that DNA ligase I functions in DNA replication (220-223) and that DNA ligase II participates in repair (224). The role of ligase III is not known.

The system of Waga et al. (199, 200) differs from that of Ishimi et al. (92) in that DNA polymerase δ, PCNA, and RF-C are present. Topoisomerase I included in the two-polymerase system allowed formation of the form-I product, whereas it inhibited its formation in the monopolymerase system. The latter system also needed a relatively high concentration of polymerase α and a low concentration of RP-A for polymerase α to carry out leading-strand DNA synthesis. It was suggested (199) that the presence of DNA polymerase δ, PCNA, RF-C, and high levels of RP-A alter the reactions associated with initiator RNA removal and joining of Okazaki fragments. This seems particularly likely in the case of RP-A, which can bind directly adjacent to the initiator RNA. It was also pointed out that the preparation of DNA ligase I used contained RNase H1, and that the RNase H1 could have participated in the removal of initiator RNA (199, 200).

E. Polymerase Switching on the Lagging Strand Is Required for Ligation

Waga and Stillman (200) provided additional compelling evidence that a polymerase switching process is an essential feature of lagging-strand synthesis. They used a synthetic Okazaki fragment model substrate consisting of a 445-nucleotide DNA template with a 30-nucleotide primer annealed to its 3' end. A 15-nucleotide RNA was annealed with its 3' end 227 nucleotides from the 5' end of the template, and then fully extended with DNA. Addition of RP-A, RF-C, PCNA, DNA polymerase δ, maturation factor I, and a mixture of DNA ligase I and RNase H1 resulted in elongation of the upstream primer, RNA removal, and ligation. The same reaction could also be completed efficiently if DNA polymerase α was used in place of RP-A, RF-C, PCNA, and polymerase δ. However, addition of RP-A to the latter reaction partially inhibited product formation, and addition of PCNA and RF-C completely prevented generation of the ligated product. Therefore, the components of the reaction that load DNA polymerase δ onto the primer also suppress the action of polymerase α.

F. Specificities of RNase H1 and the 5'-to-3' Exonuclease Suggest Their Functions in Okazaki Fragment Processing

We used a synthetic Okazaki fragment substrate to model the initiator RNA removal and ligation reactions (225). It consisted of a 13-nucleotide RNA transcript annealed to a longer DNA template, then extended by polymerization for 60 nucleotides. A DNA primer was annealed upstream on the same template, separated by 29 nucleotides from the 5' end of the initiator RNA of the downstream primer. The complete reaction required removal of the initiator RNA, extension of the upstream primer, and ligation. Use of this substrate showed that some of our preparations of calf 5'-to-3' exonuclease contained a contaminating RNase H. Preparations with both activities removed the initiator RNA, whereas those with the exonuclease alone did not. Further purification resolved the two enzymes and allowed identification of the second activity as RNase H1 (226).

In a time- and concentration-dependent manner, purified calf thymus RNase H1 degraded the RNA, generating a distinct degradation product. This product consists of the DNA portion of the primer with a single remaining ribonucleotide at the 5' end. The rest of the RNA primer was released as an intact oligonucleotide, sustaining no further cleavage. The double-strand-specific 5'-to-3' exonuclease, added as a purified enzyme, removed the remaining monoribonucleotide. We used DNA polymerase ε to generate the nick, which was then sealed by DNA ligase I. The exonuclease could potentially be stimulated by polymerization from the upstream primer, or by the cleaved RNA segment still annealed after the action of the RNase H. The polymerization steps could also be carried out by polymerases α or δ (215). The unique specificities of the two nucleases for primers with initiator RNA strongly suggest that they perform the same reactions in vivo. This series of reactions is depicted in Fig. 3.

Eukaryotes contain two classes of RNase H, designated types 1 and 2 (227). RNase H1 activity correlates with DNA synthesis, suggesting a role in DNA replication (227).
DeFrancesco and Lehman (205) showed that RNase H from D. melanogaster not only removes initiator primers from Okazaki fragments in vitro, but also stimulates DNA synthesis by purified DNA polymerase α/primase.

We have also examined mammalian RNase H1 cleavage products from three Okazaki fragment model substrates with different structures (228). Results show that the initiator RNA was removed in each case by a single cut made between the last two ribonucleotides upstream of the RNA-DNA junction (Fig. 3). The initiator RNA was released intact. Specific cleavage occurred even though the RNA segments ranged from 13 to 31 nucleotides in length. Furthermore, the position of cleavage was not influenced by the nucleotide sequence in the junction region. Cleavage specificity was lost if the RNA was not extended with DNA, or if there was a nick at the RNA-DNA junction. Specific cleavage was observed in the presence of Mg2+, or both Mg2+ and Mn2+. Cleavage with only Mn2+ was more random. Recent analysis of the solution structure of an Okazaki fragment model substrate indicates that the RNA-DNA junction has unique groove and bending features that could contribute to the specificity of RNase H cleavage (229). Comparison with E. coli RNase H or the human immunodeficiency virus reverse transcriptase RNase H showed that neither enzyme cut with the same specificity as the mammalian RNase H1 (228). HIV-RT made a preferred cut a fixed distance from the 5' end of the RNA, whereas cleavages by E. coli RNase H appeared to occur at random positions. Overall, these results suggest that calf RNase H1 is designed to recognize the distinct structure of Okazaki fragments.

FIG. 3. Completion of lagging-strand DNA synthesis.
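The two-step primer removal described in this section (a junction-directed RNase H1 cut, then exonuclease removal of the last ribonucleotide) can be written out as string operations. This is a minimal sketch under stated assumptions: the function names are hypothetical, ribonucleotides are written in lowercase and deoxynucleotides in uppercase purely as a notation chosen here, and the sequences are arbitrary placeholders rather than the substrates used in the cited work.

```python
from typing import Tuple


def rnase_h1_cut(rna: str, dna: str) -> Tuple[str, str]:
    """Model the single cut between the last two ribonucleotides before the RNA-DNA junction.

    Returns (released_rna, remaining_strand). The remaining strand keeps one
    5' ribonucleotide attached to the DNA portion of the primer.
    """
    if len(rna) < 2:
        raise ValueError("initiator RNA needs at least two ribonucleotides")
    if not dna:
        # The text reports that cleavage specificity is lost without a DNA extension,
        # so this sketch refuses to predict a cut site in that case.
        raise ValueError("RNA must be extended with DNA for junction-directed cleavage")
    return rna[:-1], rna[-1] + dna


def exonuclease_trim(strand: str, n: int = 1) -> str:
    """Model 5'-to-3' exonuclease removal of n nucleotides from the 5' end."""
    return strand[n:]


if __name__ == "__main__":
    rna = "gggaauucgagcu"   # placeholder 13-nucleotide initiator RNA
    dna = "ACGT" * 15       # placeholder 60-nucleotide DNA extension
    released, remaining = rnase_h1_cut(rna, dna)
    print(len(released), "ribonucleotides released intact")
    print("5' residue left on the DNA:", remaining[0])
    print("RNA fully removed:", exonuclease_trim(remaining) == dna)
```

Because the cut site is defined relative to the junction, the same function applies whether the RNA segment is 13 or 31 nucleotides long, which is the length independence reported above.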

G. Endonuclease Function of the 5'-to-3' Exonuclease

Another distinctive feature of the mouse 5'-to-3' exonuclease reported by Goulian et al. (208) was its action on poly(dA-dT), a perfectly alternating double-stranded polymer. Surprisingly, the products were mostly the dimers d(T-A) or d(A-T), plus some alternating oligomers. This suggested that the enzyme has endonucleolytic activity.

We later demonstrated a DNA endonuclease function in the calf 5'-to-3' exonuclease that requires a substrate with a very specific structure (214). Cleavage requires a primer annealed to a template such that there is an unannealed region at its 5' end. Furthermore, there must be a second primer annealed such that its 3' nucleotide is directly upstream of the first annealed nucleotide of the downstream primer. An endonucleolytic cleavage can then occur either just before or just after the first annealed nucleotide of the downstream primer. Endonuclease action removes unannealed segments of 2 to at least 12 nucleotides in length. It is likely that the unique structure of the alternating poly(dA-dT) allowed foldbacks that could transiently create the needed upstream primer and unannealed tail for the observed endonuclease action (208).

Parallel work by Harrington and Lieber (209) showed that FEN-1 is also an endonuclease. It efficiently removes the unannealed segment of two-primer substrates as described above, and does not cleave 3' unannealed regions, Holliday junctions, or RNA 5' unannealed regions (209). These observations suggest that, regardless of source, the mammalian 5'-to-3' exonuclease has endonuclease function.

These specificities of nuclease action are not only unique compared to other mammalian nucleases, but are the same as the specificities of the 5'-to-3' exonucleases of E. coli DNA polymerase I and Taq polymerase (203). This means that it is most likely that the mammalian 5'-to-3' exonuclease is the functional homolog of the nuclease in the bacterial polymerase I. Because more than one of the mammalian DNA polymerases may perform nick translation, it is probable that the 5'-to-3' exonuclease can serve all of them as a partner in that role. Rather than being physically attached to a single polymerase, the nuclease has evolved a specificity that requires coordination with any DNA polymerase in order to carry out cleavage associated with nick translation.
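The structural requirements for the endonuclease reaction can be collected into a single predicate. The sketch below is a hedged illustration: the function name, its parameters, and the convention of reporting cut sites as the number of nucleotides counted from the 5' end of the downstream primer are all assumptions made here. Flaps shorter than two nucleotides are not described in the text, so the sketch treats them as non-substrates by assumption only.

```python
from typing import List, Optional


def flap_cleavage_sites(
    flap_len: int,
    upstream_gap: Optional[int],
    flap_is_dna: bool = True,
) -> List[int]:
    """Return candidate endonuclease cut sites, or [] if the structure is not a substrate.

    flap_len     -- unannealed nucleotides at the 5' end of the downstream primer
    upstream_gap -- template nucleotides between the upstream primer's 3' end and the
                    first annealed nucleotide of the downstream primer (None = no upstream primer)
    flap_is_dna  -- False models an RNA 5' unannealed region, reported as not cleaved
    """
    if upstream_gap != 0:   # upstream primer must sit directly adjacent
        return []
    if not flap_is_dna:     # RNA 5' flaps are not cleaved (209)
        return []
    if flap_len < 2:        # assumption: only the reported 2-nucleotide-and-longer range is modeled
        return []
    # Cut just before or just after the first annealed nucleotide.
    return [flap_len, flap_len + 1]


if __name__ == "__main__":
    print(flap_cleavage_sites(5, 0))
    print(flap_cleavage_sites(5, 1))
    print(flap_cleavage_sites(5, None))
    print(flap_cleavage_sites(5, 0, flap_is_dna=False))
```

A flap of zero with an adjacent upstream primer is simply the nicked structure that supports the exonuclease mode discussed in Section IV,C, so it is excluded from this endonuclease predicate.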

H. DNA Polymerase β

Nuclei of eukaryotic cells also contain DNA polymerase β. It is smaller than the other nuclear polymerases, lacks associated nuclease activity, and does not undergo major changes in activity with cell growth (230). Rat and human forms have been cloned and expressed in bacteria (231), and the three-dimensional structure of the rat enzyme has been determined (232). Synthesis on single-stranded substrates and long gaps is distributive (231, 233-240). Synthesis on short gaps is most efficient and goes to completion (241). Interestingly, filling of short gaps of up to six nucleotides is processive, if the gap has a 5' phosphate (242). Photochemical cross-linking analysis indicates that on long gaps, the polymerase binds the 5' side of the gap. When the gap is shorter than six nucleotides, the 3' terminus can contact the polymerase, resulting in processive synthesis to fill the gap.

These results suggest a role for DNA polymerase β in DNA repair rather than in replication. They are consistent with the observation that mutations in the polymerase β gene in yeast are not lethal (243). Also, recent experiments in which the level of polymerase β is down-regulated in cultured cells by antisense techniques show that the cells become sensitized to a wide variety of DNA-damaging reagents (Samuel Wilson, personal communication). Polymerase β might also participate in some aspect of gap filling associated with DNA replication, but in an optional pathway.

V. Regulation of Replication Reactions

A. Specificity of Interaction of DNA Replication Components

Analysis of the interaction specificity of a protein with other replication proteins can be used to verify its unique participation at a particular step in DNA replication, compared to proteins with the same catalytic activity. One example is the interaction of initiation proteins with DNA polymerase α. Initiation at the SV40 replication origin requires a DNA polymerase α from a cell type that is permissive for SV40 replication (244, 245). This is not true of the bovine papilloma system, in which the E1 protein, which performs the same functions as the SV40 T antigen, can initiate replication using polymerase α from a nonpermissive cell type (246). Similarly, the RP-A from S. cerevisiae cannot substitute for the human RP-A in SV40 replication (247). Although the yeast RP-A supports T-antigen-directed origin unwinding, it does not allow normal T-antigen stimulation of priming by polymerase α. Additionally, T4 DNA replication proteins can substitute for DNA polymerase δ and its auxiliary factors for leading-strand synthesis in SV40 DNA replication, but these proteins cannot support generation of form-I SV40, suggesting that they cannot interact with the components that remove initiator RNA and allow ligation (200). Also, DNA ligase I, rather than III, was needed for the latter reaction. Evidently, depending on the system, there is a specificity of interaction that may be necessary, beyond the need for a particular catalytic action.

Such observations prompted Waga and Stillman (200) to propose that the components of DNA replication are part of a single complex. This is similar to the proposed structure of prokaryotic replication forks (173, 247-249). The model first proposed by Sinha et al. (247) for T4 DNA replication argues that the lagging strand is threaded through the complex in a loop and periodically released. A similar structure has been proposed for the bacterial and now the mammalian replication fork.

B. Regulation Mechanisms

Only now are direct connections emerging between the growth regulation systems in eukaryotic cells and the protein components of the DNA replication machinery. Examples include phosphorylation of SV40 T antigen (as reviewed in 11), phosphorylation of DNA polymerase α (250) and RP-A (138), and direct inhibition of PCNA by p21 (251).

Phosphorylation plays an important role in the regulation of T antigen transcription and replication activities. Biochemical and mutational studies indicate that the direct effect of phosphorylation of T antigen is on its ability to bind site II in the replication core origin (252). In contrast, the helicase activity and single-strand DNA binding activity of the T antigen were not affected (253-256). Phosphorylation of threonine-124 by the cell-cycle-regulated cdc2/cyclin kinase was found to activate T antigen to initiate SV40 DNA replication (257). However, T-antigen kinase (casein kinase I) inhibits SV40 replication by phosphorylation of intact T antigen on serines-120 and -123 (258). The cellular protein phosphatase designated RP-C, the catalytic subunit of protein phosphatase 2A, stimulates the initiation activity of T antigen (259, 260) and also can reverse the inhibitory effects produced by casein kinase I (261).

DNA polymerase α is a phosphoprotein (250). Phosphorylation correlates with maximum synthetic activity (262, 263). The level of the phosphorylated protein increases as cells move from quiescence to proliferation (264). The 180-kDa subunit is phosphorylated throughout the cell cycle, but is hyperphosphorylated in G2/M phase, whereas the 70-kDa subunit is phosphorylated only in G2/M phase (250). The results suggest that the p34cdc2 kinase is responsible for cell-cycle-related phosphorylation of DNA polymerase α (250).

RP-A is also phosphorylated in a cell-cycle phase-specific manner (137), the 34-kDa subunit much more in S and G2 phases than in G1 phase (137). Both the 34- and 70-kDa subunits can be phosphorylated in vitro by the cdc2-cyclin B kinase complex (265, 266). G1 extracts from HeLa cells, incubated in advance with human cyclin A, hyperphosphorylate the 34-kDa subunit (267). Phosphorylation was carried out by the cdk-cyclin A complex and DNA-dependent p350 protein kinase (DNA-PK).

The tumor suppressor p53 protein regulates expression of the cyclin-dependent protein-kinase regulator p21, a protein that inhibits SV40 DNA replication in vitro (251). Trimeric PCNA forms a one-to-one complex with p21, preventing stimulation of processive DNA synthesis by DNA polymerase δ (251). This suggests a means by which p53 can regulate entry into S phase and damage control.

Genetic analyses of the requirements for initiation of DNA replication offer the promise of elucidating the control of DNA replication. The power of this approach is exemplified in the analysis of origin-associated proteins. The genes ORC2 and ORC6 that encode ORC proteins are necessary for cell-cycle progression and to bind the ARS A element in vivo (268-271). The minichromosome maintenance genes MCM1, MCM2, and MCM5 in yeast (272, 273) are essential for growth, and encode structurally related proteins necessary to maintain ARS-specific plasmids. The proteins localize to the nucleus between M phase and the beginning of S phase, binding tightly to DNA. They disappear from the nucleus at the onset of DNA replication, suggesting that they control the timing of replication initiation. These are characteristics of the "licensing factor" proposed to allow one round of chromosomal replication per cell cycle (274).
Elucidation of the biochemical steps in the control of initiation of DNA replication will be the next major advance in this field. It offers the promise of more sophisticated efforts to prevent and treat diseases resulting from breakdown of growth regulation.

ACKNOWLEDGMENTS

We thank John Turchi and Lynn Rust for critical reading of the manuscript, and Richard Murante for expert computer graphic illustrations. We are grateful to Bruce Stillman and Mark Kenny for reviewing our presentation. Also, our thanks to Shlomo Eisenberg for providing an excellent summary of the yeast origin literature. Support for this work was provided by National Institutes of Health Grant GM 24441.


ROBERT A. BAMBARA AND LIN HUANG

REFERENCES

1. L. S. Kaguni and I. R. Lehman, BBA 950, 87 (1988).
2. P. M. J. Burgers, This Series 37, 235 (1989).
3. M. D. Challberg and T. J. Kelly, ARB 58, 671 (1989).
4. B. Stillman, Annu. Rev. Cell Biol. 5, 197 (1989).
5. J. Hurwitz, F. B. Dean, A. D. Kwong and S.-H. Lee, JBC 265, 18043 (1990).
6. R. A. Bambara and C. B. Jessee, BBA 1088, 11 (1991).
7. J. L. Campbell and C. S. Newlon, in “The Molecular and Cellular Biology of the Yeast Saccharomyces.” CSH Lab Press, Cold Spring Harbor, NY, 1991.
8. T. S.-F. Wang, ARB 60, 513 (1991).
9. K. C. Sitney, M. E. Budd and J. L. Campbell, Ann. N.Y. Acad. Sci. 665, 52 (1992).
10. A. Kornberg and T. A. Baker, “DNA Replication.” Freeman, New York, 1992.
11. T. Melendy and B. Stillman, Nucleic Acids Mol. Biol. 6, 129 (1992).
12. A. G. So and K. M. Downey, Crit. Rev. Biochem. Mol. Biol. 27, 129 (1992).
13. J. J. Li and T. J. Kelly, PNAS 81, 6973 (1984).
14. J. J. Li and T. J. Kelly, MCBiol 5, 1238 (1985).
15. B. W. Stillman and Y. Gluzman, MCBiol 5, 2051 (1985).
16. C. R. Wobbe, F. Dean, L. Weissbach and J. Hurwitz, PNAS 82, 5710 (1985).
17. R. Tjian, Cell 13, 165 (1978).
18. S. Deb, A. L. DeLucia, C. Baur, A. Koff and P. Tegtmeyer, MCBiol 6, 1663 (1986).
19. D. DiMaio and D. Nathans, JMB 140, 129 (1980).
20. A. L. DeLucia, B. A. Lewton, R. Tjian and P. Tegtmeyer, J. Virol. 46, 143 (1983).
21. P. Tegtmeyer, B. A. Lewton, A. L. DeLucia, V. G. Wilson and K. Ryder, J. Virol. 46, 151 (1983).
22. D. G. Tenen, T. S. Taylor, L. L. Haines, M. K. Bradley, R. G. Martin and D. M. Livingston, JMB 168, 791 (1983).
23. K. A. Jones, R. M. Myers and R. Tjian, EMBO J. 3, 3247 (1984).
24. P. Gottlieb, M. S. Nasoff, E. F. Fisher, A. M. Walsh and M. H. Caruthers, NARes 3, 6621 (1985).
25. S. P. Deb and P. Tegtmeyer, J. Virol. 61, 3649 (1987).
26. S. Deb, A. L. DeLucia, K. A. Koff, S. Tsui and P. Tegtmeyer, MCBiol 6, 4578 (1986).
27. R. M. Myers and R. Tjian, PNAS 77, 6491 (1980).
28. K. A. Jones and R. Tjian, Cell 36, 155 (1984).
29. B. Stillman, R. D. Gerard, R. A. Guggenheimer and Y. Gluzman, EMBO J. 4, 2933 (1985).
30. J. J. Li, K. W. C. Peden, R. A. F. Dixon and T. Kelly, MCBiol 6, 1117 (1986).
31. A. L. DeLucia, S. Deb, K. Partin and P. Tegtmeyer, J. Virol. 57, 138 (1986).
32. J. A. Borowiec and J. Hurwitz, PNAS 85, 64 (1988).
33. F. B. Dean, M. Dodson, H. Echols and J. Hurwitz, PNAS 84, 8981 (1987).
34. I. A. Mastrangelo, P. V. C. Hugh, J. S. Wall, M. Dodson, F. B. Dean and J. Hurwitz, Nature 338, 658 (1989).
35. F. B. Dean, P. Bullock, Y. Murakami, C. R. Wobbe, L. Weissbach and J. Hurwitz, PNAS 84, 16 (1987).
36. M. Dodson, F. B. Dean, P. Bullock, H. Echols and J. Hurwitz, Science 238, 964 (1987).
37. M. S. Wold, J. J. Li and T. J. Kelly, PNAS 84, 3643 (1987).
38. J. A. Borowiec and J. Hurwitz, EMBO J. 7, 3149 (1988).
39. R. Parsons, M. E. Anderson and P. Tegtmeyer, J. Virol. 64, 509 (1990).
40. H. Stahl, P. Droge and R. Knippers, EMBO J. 5, 1939 (1986).
41. G. S. Goetz, F. B. Dean, J. Hurwitz and S. W. Matson, JBC 263, 383 (1988).
42. M. Wiekowski, M. W. Schwarz and H. Stahl, JBC 263, 436 (1988).

43. M. S. Wold and T. Kelly, PNAS 85, 2523 (1988).
44. P. A. Bullock, Y. S. Seo and J. Hurwitz, MCBiol 11, 2350 (1991).
45. D. T. Stinchcomb, K. Struhl and R. W. Davies, Nature 282, 39 (1979).
46. D. H. Rivier and J. Rine, Science 256, 659 (1992).
47. A. M. Deshpande and C. S. Newlon, MCBiol 12, 4305 (1992).
48. R.-Y. Huang and D. Kowalski, EMBO J. 12, 4521 (1993).
49. B. J. Brewer and W. L. Fangman, Cell 51, 463 (1987).
50. J. A. Huberman, L. D. Spotila, K. A. Nawotka, S. M. El-Assouli and L. R. Davis, Cell 51, 473 (1987).
51. D. A. Natale, R. M. Umek and D. Kowalski, NARes 21, 555 (1993).
52. S. E. Celniker, K. Sweder, F. Srienc, J. E. Baily and J. L. Campbell, MCBiol 4, 2455 (1984).
53. Y. Marahrens and B. Stillman, Science 255, 817 (1992).
54. S. P. Bell and B. Stillman, Nature 357, 128 (1992).
55. J. F. X. Diffley and J. H. Cocker, Nature 357, 169 (1992).
56. S. S. Walker, S. C. Francesconi and S. Eisenberg, PNAS 87, 4665 (1990).
57. S. Eisenberg, D. Civalier and B. K. Tye, PNAS 85, 743 (1988).
58. J. F. X. Diffley and B. Stillman, PNAS 85, 2120 (1988).
59. K. S. Sweder, P. R. Rhode and J. L. Campbell, JBC 263, 17270 (1988).
60. S. C. Francesconi and S. Eisenberg, MCBiol 9, 2906 (1989).
61. H. Halfter, U. Muller, E.-L. Winnacker and D. Galwitz, EMBO J. 8, 3029 (1989).
62. R. A. Buchman and R. D. Kornberg, MCBiol 10, 887 (1990).
63. H. G. Estes, B. S. Robinson and S. Eisenberg, PNAS 89, 11156 (1992).
64. C. S. Newlon, Science 262, 1830 (1993).
65. B. Stillman, JBC 269, 7047 (1994).
66. W. C. Burhans and J. A. Huberman, Science 263, 639 (1994).
67. T. L. Orr-Weaver, BioEssays 13, 97 (1991).
68. S. Handeli, A. Klar, M. Meuth and H. Cedar, Cell 57, 909 (1989).
69. D. Kitsberg, S. Selig, I. Keshet and H. Cedar, Nature 366, 588 (1993).
70. J. Zhu, C. Brun, H. Kurooka, M. Yanagida and J. A. Huberman, Chromosoma 102, S7 (1992).
71. D. D. Dubey, J. Zhu, D. L. Carlson, K. Sharma and J. A. Huberman, EMBO J. 13, 3638 (1994).
72. D. L. Dobbs, W.-L. Shaiu and R. M. Benbow, NARes 22, 2479 (1994).
73. P. Reichard, R. Eliasson and G. Soderman, PNAS 71, 4901 (1974).
74. V. Pigiet, R. Eliasson and P. Reichard, JMB 84, 197 (1974).
75. M. A. Waqar and J. A. Huberman, Cell 6, 551 (1975).
76. B. Y. Tseng and M. Goulian, Cell 12, 483 (1977).
77. R. C. Conaway and I. R. Lehman, PNAS 79, 2523 (1982).
78. R. C. Conaway and I. R. Lehman, PNAS 79, 4585 (1982).
79. L. S. Kaguni, J.-M. Rossignol, R. C. Conaway and I. R. Lehman, PNAS 80, 2221 (1983).
80. T. Yagura, S. Tanaka, T. Kozu, T. Seno and D. Korn, JBC 258, 6698 (1983).
81. T. Yagura, T. Kozu and T. Seno, JBC 257, 11121 (1982).
82. T. S.-F. Wang, S.-Z. Hu and D. Korn, JBC 259, 1854 (1984).
83. S.-Z. Hu, T. S.-F. Wang and D. Korn, JBC 259, 2602 (1984).
84. S.-S. Zhang and F. Grosse, JMB 216, 475 (1990).
85. R. J. Sheaff and R. D. Kuchta, JBC 269, 19225 (1994).
86. R. J. Sheaff and R. D. Kuchta, Bchem 32, 3027 (1993).
87. R. J. Sheaff, R. D. Kuchta and D. Ilsley, Bchem 33, 2247 (1993).
88. J. A. Borowiec, F. B. Dean, P. A. Bullock and J. Hurwitz, Cell 60, 181 (1990).


89. T. Tsurimoto, M. P. Fairman and B. Stillman, MCBiol 9, 3839 (1989).
90. J. M. Roberts, PNAS 86, 3939 (1989).
91. C. R. Wobbe, L. Weissbach, J. A. Borowiec, F. B. Dean, Y. Murakami, P. Bullock and J. Hurwitz, PNAS 84, 1834 (1987).
92. Y. Ishimi, A. Claude, P. Bullock and J. Hurwitz, JBC 263, 19723 (1988).
93. L. M. Johnson, M. Snyder, L. M. Chang, R. W. Davis and J. L. Campbell, Cell 43, 369 (1985).
94. K. C. Sitney, M. E. Budd and J. L. Campbell, Cell 56, 599 (1989).
95. A. Boulet, M. Simon, G. Faye, G. A. Bauer and P. M. Burgers, EMBO J. 8, 1849 (1989).
96. A. Morrison, H. Arai, A. B. Clark, R. K. Hamatake and A. Sugino, Cell 62, 1143 (1990).
97. M. Y. W. T. Lee, C.-K. Tan, K. M. Downey and A. G. So, This Series 26, 83 (1981).
98. M. Y. W. T. Lee, C.-K. Tan, K. M. Downey and A. G. So, Bchem 23, 1906 (1984).
99. M. Goulian, S. M. Hermann, J. W. Sackett and S. L. Grimm, JBC 265, 16402 (1990).
100. C.-L. Yang, L.-S. Chang, P. Zhang, H. Hao, L. Zhu, N. L. Toomey and M. Y. W. T. Lee, NARes 20, 735 (1992).
101. D. W. Chung, J. Zhang, C.-K. Tan, E. W. Davie, A. G. So and K. M. Downey, PNAS 88, 11197 (1991).
102. J. Zhang, D. W. Chung, C.-K. Tan, K. M. Downey, E. W. Davie and A. G. So, Bchem 30, 11742 (1991).
103. C.-K. Tan, C. Castillo, A. G. So and K. M. Downey, JBC 261, 12310 (1986).
104. R. Bravo, S. J. Fey and J. E. Celis, Carcinogenesis 2, 769 (1981).
105. J. E. Celis, P. Madsen, S. Nielsen and H. H. Rasmussen, Anticancer Res. 7, 605 (1987).
106. R. Bravo, S. J. Fey, J. Bellatin, P. M. Larsen and J. E. Celis, “Embryonic Development. Part A. Genetic Aspects.” pp. 235-248. Alan R. Liss, New York, 1982.
107. J. E. Celis and R. Bravo, FEBS Lett. 165, 21 (1984).
108. J. E. Celis, R. Bravo, P. M. Larsen and S. J. Fey, Leuk. Res. 8, 143 (1984).
109. J. E. Celis, S. J. Fey, P. M. Larsen and A. Celis, PNAS 81, 3128 (1984).
110. K. Miyachi, M. Fritzler and E. M. Tan, J. Immunol. 121, 2228 (1978).
111. Y. Takasaki, J.-S. Deng and E. M. Tan, J. Exp. Med. 154, 1899 (1981).
112. P.-K. Chan, R. Frakes, E. M. Tan, M. G. Brattain, K. Smetana and H. Busch, Cancer Res. 43, 3770 (1983).
113. J. E. Celis and A. Celis, PNAS 82, 3262 (1985).
114. R. Bravo, FEBS Lett. 169, 185 (1984).
115. P. Madsen and J. E. Celis, FEBS Lett. 193, 5 (1985).
116. J. E. Celis and P. Madsen, FEBS Lett. 209, 227 (1986).
117. R. Bravo and H. MacDonald-Bravo, EMBO J. 4, 655 (1985).
118. R. Bravo, Exp. Cell Res. 163, 287 (1986).
119. L. Toschi and R. Bravo, J. Cell Biol. 107, 1623 (1988).
120. K. Matsumoto, T. Moriuchi, T. Koji and P. K. Nakane, EMBO J. 6, 637 (1987).
121. J. M. Almendral, D. Huebsch, P. A. Blundell, H. MacDonald-Bravo and R. Bravo, PNAS 84, 1517 (1987).
122. G. Prelich, C.-K. Tan, M. Kostura, M. B. Mathews, A. G. So, K. M. Downey and B. Stillman, Nature 326, 517 (1987).
123. R. Bravo, R. Frank, P. A. Blundell and H. MacDonald-Bravo, Nature 326, 515 (1987).
124. M. S. Wold, D. H. Weinberg, D. M. Virshup, J. J. Li and T. J. Kelly, JBC 264, 2801 (1989).
125. G. Prelich, M. Kostura, D. R. Marshak, M. B. Mathews and B. Stillman, Nature 326, 471 (1987).
126. D. H. Weinberg and T. J. Kelly, PNAS 86, 9742 (1989).
127. T. Tsurimoto and B. Stillman, EMBO J. 8, 3883 (1989).
128. T. Tsurimoto, T. Melendy and B. Stillman, Nature 346, 534 (1990).

129. M. P. Fairman and B. Stillman, EMBO J. 7, 1211 (1988).
130. M. K. Kenny, U. Schlegel, H. Furneaux and J. Hurwitz, JBC 265, 7963 (1990).
131. L. F. Erdile, M. S. Wold and T. J. Kelly, JBC 265, 317 (1990).
132. L. F. Erdile, W.-D. Heyer, R. Kolodner and T. J. Kelly, JBC 266, 12090 (1991).
133. C. B. Umbricht, L. F. Erdile, E. W. Jabs and T. J. Kelly, JBC 268, 6131 (1993).
134. L. A. Henricksen, C. B. Umbricht and M. S. Wold, JBC 269, 11121 (1994).
135. W.-D. Heyer, M. R. S. Rao, L. F. Erdile, T. J. Kelly and R. D. Kolodner, EMBO J. 9, 2321 (1990).
136. S. J. Brill and B. Stillman, Genes Dev. 5, 1589 (1991).
137. S.-U. Din, S. J. Brill, M. P. Fairman and B. Stillman, Genes Dev. 4, 968 (1990).
138. C. Kim, R. O. Snyder and M. S. Wold, MCBiol 12, 3050 (1992).
139. L. J. Blackwell and J. A. Borowiec, MCBiol 14, 3993 (1994).
140. C. Kim, B. F. Paulus and M. S. Wold, Bchem 33, 14197 (1994).
141. P. G. Mitsis, S. C. Kowalczykowski and I. R. Lehman, Bchem 32, 5257 (1993).
142. M. P. Carty, A. S. Levine and K. Dixon, Mutat. Res. 274, 29 (1992).
143. P. Thommes, E. Ferrari, R. Jessberger and U. Hubscher, JBC 267, 6063 (1992).
144. J. J. Turchi, R. S. Murante and R. A. Bambara, NARes 20, 6075 (1992).
145. Y.-S. Seo, S.-H. Lee and J. Hurwitz, JBC 266, 13161 (1991).
146. Y.-S. Seo and J. Hurwitz, JBC 268, 10282 (1993).
147. M. K. Kenny, S.-H. Lee and J. Hurwitz, PNAS 86, 9757 (1989).
148. S.-H. Lee, Z.-Q. Pan, A. D. Kwong, P. M. J. Burgers and J. Hurwitz, JBC 266, 22707 (1991).
149. D. Coverley, M. K. Kenny, M. Munn, W. D. Rupp, D. P. Lane and R. D. Wood, Nature 349, 538 (1991).
150. D. Coverley, M. K. Kenny, D. P. Lane and R. D. Wood, NARes 20, 3873 (1992).
151. S. P. Moore, L. Erdile, T. Kelly and R. Fishel, PNAS 88, 9067 (1991).
152. T. Tsurimoto and B. Stillman, MCBiol 9, 609 (1989).
153. F. Bunz, R. Kobayashi and B. Stillman, PNAS 90, 11014 (1993).
154. Z.-Q. Pan, M. Chen and J. Hurwitz, PNAS 90, 6 (1993).
155. B. L. Yoder and P. M. J. Burgers, JBC 266, 22689 (1991).
156. P. M. J. Burgers, JBC 266, 22698 (1991).
157. K. Fien and B. Stillman, MCBiol 12, 155 (1992).
158. M. O’Donnell, R. Onrust, F. B. Dean, M. Chen and J. Hurwitz, NARes 21, 1 (1993).
159. M. Chen, Z.-Q. Pan and J. Hurwitz, PNAS 89, 5211 (1992).
160. M. Chen, Z.-Q. Pan and J. Hurwitz, PNAS 89, 2516 (1992).
161. X. Li and P. M. Burgers, PNAS 91, 868 (1994).
162. X. Li and P. M. J. Burgers, JBC 269, 21880 (1994).
163. T. Tsurimoto and B. Stillman, PNAS 87, 1023 (1990).
164. T. Tsurimoto and B. Stillman, JBC 266, 1950 (1991).
165. S.-H. Lee, A. D. Kwong, Z.-Q. Pan and J. Hurwitz, JBC 266, 594 (1991).
166. T. Tsurimoto and B. Stillman, JBC 266, 1961 (1991).
167. T.-A. Cha and B. M. Alberts, in “Eukaryotic DNA Replication: Cancer Cells 6” (T. Kelly and B. Stillman, eds.), pp. 1-10. CSH Lab Press, Plainview, NY, 1988.
168. M. M. Munn and B. M. Alberts, JBC 266, 20034 (1991).
169. M. M. Munn and B. M. Alberts, JBC 266, 20024 (1991).
170. T. L. Capson, S. J. Benkovic and N. G. Nossal, Cell 65, 249 (1991).
171. C. S. McHenry, ARB 57, 519 (1988).
172. K. J. Marians, ARB 61, 673 (1992).
173. M. O’Donnell and P. S. Studwell, JBC 265, 1179 (1990).
174. X.-P. Kong, R. Onrust, M. O’Donnell and J. Kuriyan, Cell 69, 425 (1992).


175. G. A. Bauer and P. M. J. Burgers, PNAS 85, 7506 (1988).
176. P. M. J. Burgers and B. L. Yoder, JBC 268, 19923 (1993).
177. T. S. R. Krishna, X.-P. Kong, S. Gary, P. M. Burgers and J. Kuriyan, Cell 79, 1233 (1994).
178. J. J. Byrnes, K. M. Downey, V. L. Black and A. G. So, Bchem 15, 2817 (1976).
179. M. Y. W. T. Lee, Y. Jiang, S.-J. Zhang and N. L. Toomey, JBC 266, 2423 (1991).
180. M. Y. W. T. Lee and N. L. Toomey, Bchem 26, 1076 (1987).
181. C. Nishida, P. Reinhard and S. Linn, JBC 263, 501 (1988).
182. J. Syvaoja and S. Linn, JBC 264, 2489 (1989).
183. J. J. Crute, A. F. Wahl and R. A. Bambara, Bchem 25, 26 (1986).
184. T. W. Myers and R. A. Bambara, UCLA Symp. Mol. Cell. Biol., New Ser. 234, 165 (1990).
185. F. Focher, S. Spadari, B. Ginelli, M. Hottiger, M. Gassman and U. Hubscher, NARes 16, 6279 (1988).
186. R. K. Hamatake, H. Hasegawa, A. B. Clark, K. Babenek, T. A. Kunkel and A. Sugino, JBC 265, 4072 (1990).
187. G. Siegal, J. J. Turchi, C. B. Jessee, L. M. Mallaber, R. A. Bambara and T. W. Myers, JBC 267, 3991 (1992).
188. T. Kesti, H. Franti and J. E. Syvaoja, JBC 268, 10238 (1993).
189. J. J. Byrnes and V. L. Black, Bchem 17, 4226 (1978).
190. J. J. Byrnes, MCBiol 62, 13 (1984).
191. R. D. Sabatino, T. W. Myers, R. A. Bambara, O. Kwon-Shin, R. L. Marraccino and P. H. Frickey, Bchem 27, 2998 (1988).
192. R. A. Bambara, T. W. Myers and R. D. Sabatino, in “The Eukaryotic Nucleus: Molecular Biochemistry and Macromolecular Assemblies” (P. Strauss and S. Wilson, eds.), pp. 69-94. Telford Press, Caldwell, NJ, 1990.
193. J. R. Pringle and L. H. Hartwell, in “The Molecular Biology of the Yeast Saccharomyces: Life Cycle and Inheritance” (J. N. Strathern, E. W. Jones and J. R. Broach, eds.), pp. 97-142. CSH Press, Cold Spring Harbor, NY, 1981.
194. J. A. DiGiuseppe and S. L. Dresler, Bchem 28, 9515 (1989).
195. X.-R. Zeng, Y. Jiang, S.-J. Zhang, H. Hao and M. Y. W. T. Lee, JBC 269, 13748 (1994).
196. K. K. Shivji, M. K. Kenny and R. D. Wood, Cell 69, 367 (1992).
197. E. C. Friedberg, Microbiol. Rev. 52, 70 (1988).
198. T. Nethanel and G. Kaufmann, J. Virol. 64, 5912 (1990).
199. S. Waga, S. Bauer and B. Stillman, JBC 269, 10923 (1994).
200. S. Waga and B. Stillman, Nature 369, 207 (1994).
201. B. E. Funnell, T. A. Baker and A. Kornberg, JBC 261, 5616 (1986).
202. T. Ogawa and T. Okazaki, MGG 193, 231 (1984).
203. V. Lyamichev, M. A. D. Brow and J. E. Dahlberg, Science 260, 778 (1993).
204. R. B. Kelly, M. R. Atkinson, J. A. Huberman and A. Kornberg, Nature 224, 495 (1969).
205. R. A. DeFrancesco and I. R. Lehman, JBC 260, 14764 (1985).
206. M. K. Kenny, L. A. Balogh and J. Hurwitz, JBC 263, 9801 (1988).
207. M. Goulian and C. J. Heard, JBC 265, 13231 (1990).
208. M. Goulian, S. H. Richards, C. J. Heard and B. M. J. Bigsby, JBC 265, 18461 (1990).
209. J. J. Harrington and M. R. Lieber, EMBO J. 13, 1235 (1994).
210. J. J. Harrington and M. R. Lieber, Genes Dev. 8, 1344 (1994).
211. S. Prakash, P. Sung and L. Prakash, ARGen 27, 33 (1993).
212. Y. Habraken, P. Sung, L. Prakash and S. Prakash, Nature 366, 365 (1993).
213. G. Siegal, J. J. Turchi, T. W. Myers and R. A. Bambara, PNAS 89, 9377 (1992).
214. R. S. Murante, L. Huang, J. J. Turchi and R. A. Bambara, JBC 269, 1191 (1994).
215. J. J. Turchi and R. A. Bambara, JBC 268, 15136 (1993).
216. V. N. Podust and U. Hübscher, NARes 21, 841 (1993).

217. S. K. Davey and E. A. Faust, JBC 265, 4098 (1990).
218. A. E. Tomkinson, E. Roberts, G. Daly, N. F. Totty and T. Lindahl, JBC 266, 21728 (1991).
219. T. Lindahl and D. E. Barnes, ARB 61, 251 (1992).
220. L. M. Henderson, C. F. Arlett, S. A. Harcourt, A. R. Lehmann and B. C. Broughton, PNAS 82, 2044 (1985).
221. A. R. Lehmann, A. E. Willis, B. C. Broughton, M. R. James, H. Steingrimsdottir, S. A. Harcourt, C. F. Arlett and T. Lindahl, Cancer Res. 48, 6343 (1988).
222. U. Lonn, S. Lonn, U. Nylen and G. Winblad, Carcinogenesis 10, 981 (1989).
223. D. E. Barnes, A. E. Tomkinson, A. R. Lehmann, D. B. Webster and T. Lindahl, Cell 69, 495 (1992).
224. D. Creissen and S. Shall, Nature 296, 271 (1982).
225. J. J. Turchi, L. Huang, R. S. Murante, Y. Kim and R. A. Bambara, PNAS 91, 9803 (1994).
226. P. S. Eder and J. A. Walder, JBC 266, 6472 (1991).
227. W. Büsen, J. H. Peters and P. Hausen, EJB 74, 203 (1977).
228. L. Huang, Y. Kim, J. J. Turchi and R. A. Bambara, JBC 269, 25922 (1994).
229. M. Salazar, O. Y. Fedoroff, L. Zhu and B. R. Reid, JMB 241, 440 (1994).
230. S. H. Wilson, in “The Eukaryotic Nucleus: Molecular Biochemistry and Macromolecular Assemblies” (P. R. Strauss and S. H. Wilson, eds.), Vol. 1, p. 199. Telford Press, Caldwell, NJ, 1990.
231. J. Abbotts, D. N. SenGupta, B. Zmudzka, S. G. Widen, V. Notario and S. H. Wilson, Bchem 27, 901 (1988).
232. M. R. Sawaya, H. Pelletier, A. Kumar, S. H. Wilson and J. Kraut, Science 264, 1930 (1994).
233. L. M. S. Chang, JBC 248, 6983 (1973).
234. L. M. S. Chang, JMB 93, 219 (1975).
235. R. A. Bambara, D. Uyemura and T. Choi, JBC 253, 413 (1978).
236. A. Matsukage, M. Nishizawa and T. Takahashi, J. Biochem. (Tokyo) 85, 1551 (1979).
237. A. Matsukage, M. Yamaguchi, K. Tanabe, Y. N. Taguchi, M. Nishizawa and T. Takahashi, in “New Approaches to Eukaryotic DNA Replication” (A. M. de Recondo, ed.), p. 81. Plenum, New York, 1983.
238. S. D. Detera, S. P. Becerra, J. A. Swack and S. H. Wilson, JBC 256, 6933 (1981).
239. M. Fry, in “Enzymes of Nucleic Acid Synthesis and Modification” (S. T. Jacob, ed.), p. 39. CRC Press, Boca Raton, FL, 1983.
240. M. Fry and L. Loeb, “Animal Cell DNA Polymerases.” CRC Press, Boca Raton, FL, 1986.
241. T. S.-F. Wang and D. Korn, Bchem 21, 1597 (1982).
242. R. K. Singhal and S. H. Wilson, JBC 268, 15906 (1993).
243. M. E. Budd and J. L. Campbell, in “Methods in Enzymology,” Vol. 262, in press, 1995.
244. Y. Murakami, T. Eki, M. Yamada, C. Prives and J. Hurwitz, PNAS 83, 6347 (1986).
245. F. Muller, Y.-S. Seo and J. Hurwitz, JBC 269, 17086 (1994).
246. T. Melendy and B. Stillman, JBC 268, 3389 (1993).
247. N. K. Sinha, C. F. Morris and B. M. Alberts, JBC 255, 4290 (1980).
248. C. S. McHenry and K. O. Johanson, in “Proteins Involved in DNA Replication” (U. Hubscher and S. Spadari, eds.), p. 315. Plenum, New York, 1984.
249. H. Maki, S. Maki and A. Kornberg, JBC 263, 6570 (1988).
250. H.-P. Nasheuer, A. Moore, A. F. Wahl and T. S.-F. Wang, JBC 266, 7893 (1991).
251. S. Waga, G. J. Hannon, D. Beach and B. Stillman, Nature 369, 574 (1994).
252. J. Schneider and E. Fanning, J. Virol. 62, 1598 (1988).
253. D. T. Simmons, W. Chou and K. Rodgers, J. Virol. 60, 888 (1986).
254. F. A. Grasser, K. Mann and G. Walter, J. Virol. 61, 3373 (1987).
255. I. J. Mohr, B. Stillman and Y. Gluzman, EMBO J. 6, 153 (1987).


256. K. Klausing, K.-H. Scheidtmann, E. A. Baumann and R. Knippers, J. Virol. 62, 1258 (1988).
257. D. McVey, L. Brizuela, I. Mohr, D. R. Marshak, Y. Gluzman and D. Beach, Nature 341, 503 (1989).
258. A. Cegielska, I. Moarefi, E. Fanning and D. M. Virshup, J. Virol. 68, 269 (1994).
259. D. M. Virshup and T. J. Kelly, PNAS 86, 3584 (1989).
260. D. M. Virshup, M. G. Kauffman and T. J. Kelly, EMBO J. 8, 3891 (1989).
261. A. Cegielska and D. M. Virshup, MCBiol 13, 1202 (1993).
262. S. W. Krauss, D. Mochly-Rosen, D. E. Koshland and S. Linn, JBC 262, 3432 (1987).
263. R. W. Donaldson and E. W. Gerner, PNAS 84, 759 (1987).
264. J. Cripps-Wolfman, E. C. Henshaw and R. A. Bambara, JBC 264, 19478 (1989).
265. A. Dutta and B. Stillman, EMBO J. 6, 2189 (1992).
266. Z.-Q. Pan and J. Hurwitz, JBC 268, 20433 (1993).
267. Z.-Q. Pan, A. A. Amin, E. Gibbs, H. Niu and J. Hurwitz, PNAS 91, 8343 (1994).
268. S. P. Bell, R. Kobayashi and B. Stillman, Science 262, 1844 (1993).
269. M. Foss, F. J. McNally, P. Laurenson and J. Rine, Science 262, 1838 (1993).
270. J. J. Li and I. Herskowitz, Science 262, 1870 (1993).
271. G. Micklem, A. Rowley, J. Harwood, K. Nasmyth and J. F. X. Diffley, Nature 366, 87 (1993).
272. H. Yan, A. M. Merchant and B. K. Tye, Genes Dev. 7, 2149 (1993).
273. Y. Chen, K. M. Hennessy, D. Botstein and B. K. Tye, PNAS 89, 10459 (1992).
274. J. J. Blow and R. A. Laskey, Nature 332, 548 (1988).

Transcription of the Herpes Simplex Virus Genome during Productive and Latent Infection

EDWARD K. WAGNER, JOHN F. GUZOWSKI AND JASBIR SINGH

Department of Molecular Biology and Biochemistry and Program in Animal Virology University of California, Irvine Irvine, California 92717

I. Herpes Simplex Virus Type 1
   A. Physical Description of HSV-1
   B. Productive Infection
   C. Latent Infection
II. Transcriptional Switches during HSV Infection
   A. Transcriptional Regulation during Product ... General Considerations
   B. Experimental Analysis of HSV Cis-acti ... Vivo
III. Functional Analysis of Specific HSV Promoters in Vivo
   A. The LAT Promoter
   B. Early HSV Promoters
   C. Late HSV Promoters
IV. Other Factors in the Early/Late Switch in HSV mRNA Expression
V. In Vitro Analysis of HSV Promoters
VI. Conclusions
References

¹ To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

Herpesviruses are nuclear-replicating, icosahedral, enveloped DNA viruses that infect members of all groups of vertebrates. To date, over 80 distinct types have been described, of which 7 infect humans (1). Herpesviruses are generally grouped into three divisions:

1. The α-herpesviruses are neurotropic, tend to have a broad cell specificity, and many grow to high titers with a rapid productive cycle in cell cultures. These include herpes simplex virus types 1 and 2 (HSV-1 and -2) and varicella zoster virus (VZV). Several animal herpesviruses, including pseudorabies virus (PRV; Herpesvirus suis), equine herpesvirus type 1 (EHV-1), and bovine herpesvirus (BHV) have interesting parallels to the human α-herpesviruses.

2. β-Herpesviruses include human cytomegalovirus (HCMV) and human herpesvirus type 6 (HHV-6); members of this group have restricted host ranges and a lengthy replication cycle with low yields of infectious virus. Details of their latent state of infection also differ from the other groups of herpesvirus (2).

3. γ-Herpesviruses are best represented by Epstein-Barr virus (EBV); the newly characterized human herpesvirus type 7 (HHV-7) also appears to be a member of this group (3). These viruses are lymphotropic, have a highly restricted range of cells in which they can replicate, and have a highly regulated latent state of infection (4).

Unlike adenoviruses, which share a general genomic architecture, the genomic structure of the various herpesviruses presents a bewildering array of individual variations on a general theme. Still, within these variations, gene order within large blocks of the genome is generally maintained, and varying degrees of genetic homology are clearly evident. Herpesvirus genomes can, perhaps, be best described as complex, containing significant regions of inverted repeat sequences, and displaying wide variations both in total genomic size (100-240 kbp) and base composition (45 to 70% G + C).

Productive replication of all herpesviruses studied to date involves a regulated cascade of viral gene products in which control of viral mRNA abundance plays a central role. Although there are great differences in the details depending on the specific herpesvirus in question, the process can be readily generalized as follows: During infection, a small group of viral regulatory proteins is first expressed.
Following the synthesis of these α (immediate-early) proteins, there is expression of a number of viral enzymes and proteins involved in mediating various aspects of vegetative viral genome replication. Finally, concomitant with viral DNA replication, genes encoding viral structural and assembly proteins are expressed at high levels. This early/late cascade in which viral genomic replication serves as a “watershed” event is, of course, characteristic of the replication of most groups of DNA viruses.

The plasticity in structure and organization of herpesvirus genomes is indicative of a basic feature of productive viral replication: herpesvirus genomes are promoter-rich, and generally the expression of a given protein is mediated by a specific promoter mapping at that gene. Thus, extensive transcription units processed into a variety of mRNAs are the exception rather than the rule. Therefore, no strict constraint on precise genomic order of genes or genomic organization is required, and only the genomic content and co-regulation of essential functions are conserved.

Another general feature of herpesvirus infection is the intimate and lasting association between herpesviruses and their hosts, the process of latency (reviewed in 5, 6). Virus persistence restricted to specific tissues of the mammalian host is a common, if not universal, phenomenon of the natural course of infection of nuclear-replicating DNA viruses. Despite this universality, latent infections in which there is maintenance of a genetically intact viral genome as an episome, and a major or complete restriction of the expression of productive cycle genes, appear to be confined to the herpesvirus group. However, latent infection with most herpesviruses does result in the transcription of a restricted portion of the viral genome. Depending on the herpesvirus in question, latent-phase transcription encodes genes or genetic elements mediating maintenance of the viral genome and/or the re-establishment of productive-phase gene expression during the process of reactivation.

I. Herpes Simplex Virus Type 1

Herpes simplex virus type 1 (HSV-1), the subject of this and many other reviews (5, 7-14), is the prototype and best studied representative of the α-herpesvirus group. It is neurotropic and establishes latent infections in sensory neurons; further, it is characterized by an extremely rapid productive replication cycle, and can replicate in a large group of animals, tissues, and cultured cells.

Virions consist of a dense core containing the viral genome, a 120-nm (diameter) icosahedral capsid, a protein-rich tegument (matrix) surrounding the capsid, and an envelope containing at least 10 different viral glycoproteins. Purified virions contain over 33 distinct proteins, and mature capsids are composed of 7 different proteins. The tegument contains two important host-modifying proteins: the α-trans-inducing factor, α-TIF (also known as VP16, VMW65, UL48, or virion stimulatory protein, VSP), and UL41 (virion host shut-off protein, vhs). The functions of these two proteins are discussed briefly in Section I,B,1.

A. Physical Description of HSV-1

1. THE VIRAL GENOME

The complete sequence of the 17syn+ strain of HSV-1 has been known for some time (15, 16); comparison with other strains indicates that this sequence is a valid prototype for HSV-1 in general. The genome, schematically shown in Fig. 1, is a 152-kbp double-stranded linear DNA molecule, consisting of two unique segments (UL = 108 kbp; US = 13 kbp), each flanked by a pair of inverted repeats. The UL is surrounded by the 8.8-kbp long repeat (RL), and the US is bounded by the 6.6-kbp short repeat (RS). Due to the genomic organization, genes contained entirely within the repeats are present in two copies. Three origins of DNA replication have been identified within the HSV-1 genome; two are positioned within the short repeats (oriS) and one within the unique long region (oriL). DNA packaging signals are located in highly repetitive sequence elements at the ends of the genome.

FIG. 1. The genetic/transcription map of HSV-1 (strain 17syn+). The map is shown as a modified circle to indicate the fact that the linear genome circularizes on infection. Open reading frames (ORFs) predicted from sequence data are shown as the outside solid ring, and transcripts expressing all or portions of them are indicated with the arrowheads showing the transcription termination/polyadenylation sites. Known kinetic classes of transcripts (α, immediate-early; β, early; βγ, leaky late; γ, strict late) are indicated. The latency-associated transcripts (LAT) in the RL regions are the only transcripts expressed during latent infection. Also shown are the locations of the origins of replication and the “a” sequences at the genome ends, which are involved in encapsidation. In addition, and where known, the functions of individual genes are shown. (An earlier version of this map appears in reference 10.)
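As a rough consistency check on the segment sizes quoted for the genome, the unique and repeat lengths can be summed, counting each inverted repeat twice because it flanks its unique segment on both sides. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the rounded kbp values from the text are simply converted to base pairs.

```python
# Approximate HSV-1 segment lengths in base pairs, taken from the rounded
# kbp figures quoted in the text (names here are illustrative assumptions).
SEGMENTS_BP = {"UL": 108_000, "US": 13_000, "RL": 8_800, "RS": 6_600}

# Each unique segment occurs once; each inverted repeat occurs twice.
COPIES = {"UL": 1, "US": 1, "RL": 2, "RS": 2}


def genome_length_bp(segments=SEGMENTS_BP, copies=COPIES):
    """Sum segment lengths, weighting each by its copy number."""
    return sum(length * copies[name] for name, length in segments.items())


def gene_copy_number(region):
    """Copies of a gene lying entirely within the named region."""
    return COPIES[region]


total = genome_length_bp()
print(f"{total} bp, about {round(total / 1000)} kbp")
```

With these rounded inputs the total comes to 151,800 bp, consistent with the quoted 152-kbp genome, and a gene lying wholly within RL or RS is counted twice.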

2. VIRAL TRANSCRIPTS

More than 90% of the viral genome contains open translational (reading) frames (ORFs) encoding proteins initiated by ATG and terminated by TAA, TAG, or TGA.2 Nucleotide sequence analyses show at least two extensive unique ORFs (ICP34.5 and α0 = ICP0) in the long repeats, 56 in the UL, one in the short repeat (α4 = ICP4), and 12 in the US. Transcription of the HSV-1 genome is highly regulated and sequentially ordered. During productive infection, four kinetic classes of transcripts are expressed by the virus: immediate-early (α), early (β), leaky-late (βγ), and strict-late (γ). Early and late transcripts are interspersed throughout the unique portions of the genome (UL and US). In contrast, the immediate-early transcripts are clustered within or bordering the repeat regions (RL and RS). During latent infection, only the latency-associated transcription (LAT) unit mapping within the RL is expressed. Unlike productive-cycle transcripts, the 8.5-kb polyadenylated LAT unit does not encode an extensive ORF, and its function during latent infection probably does not involve the action of a protein encoded by it (8).

A standard approach toward characterization of HSV transcripts has been developed over several decades (12, 17-27). Essentially, transcripts are first located by Northern-blot analysis using defined DNA fragments as probes and mRNA isolated under various conditions of infection. Northern-blot band signal intensity provides a rough measure of viral mRNA abundance; this has been shown to correlate with cDNA abundances where studied (27). There is at least a 10-fold variation in the levels of abundant and rare transcripts of a common kinetic class expressed during infection; differential

2 The genetic complexity of herpes simplex virus has led to a general consensus for the nomenclature of HSV proteins based on the sequence of the genome. Proteins are numbered by the order of the ORFs occurring in the UL and US regions of the genome; i.e., the ORF encoding the major capsid protein (VP5) is the nineteenth extensive one numbering from the left of the UL and is, thus, ORF UL19. Other semisystematic nomenclatures for HSV proteins, based on observed migration rates in "standard" denaturing acrylamide gels, are also in general use. One still in use identifies proteins as virion protein (VP) or infected cell protein (ICP) along with a number corresponding to location on a gel: the lower the number, the slower the migration rate (corresponding to a greater Mr). This trivial nomenclature has been combined with a notation employing the time during the replication cycle at which the protein is expressed, especially in the case of the α or immediate-early proteins, which are encoded by mRNA abundantly transcribed in the absence of de novo protein synthesis in the productively infected cell. An example is the relatively large immediate-early transcriptional activator, α4. Precise locations of HSV proteins are indicated in Fig. 1.


EDWARD K. WAGNER ET AL.

mRNA abundance generally reflects rates of mRNA synthesis and accumulation as determined by pulse-labeling of infected cells in vitro (28-33). The precise locations of the 5’ and 3’ ends of particular transcripts are determined using S1 nuclease or RNase protection analysis, although other methods such as cDNA mapping and primer-extension analysis have also been utilized. Generally, transcripts are expressed as unique, unspliced mRNA molecules, bounded on the 5’ end by recognizable promoter elements and terminated by canonical cleavage polyadenylation signals (AATAAA). The average size of HSV mRNAs is between 1500 and 2000 bases, although transcripts as long as 9000 bases (UL36) are expressed. Generally, there is a 150- to 250-base leader between the transcript cap site and the translation initiator, and a short trailer (10 to 25 bases) between translation terminator and polyadenylation signal; polyadenylation typically occurs 10 to 20 bases 3’ of this signal. In vitro translation of isolated viral mRNA species generates products consistent with the sizes of the ORFs encoded (23, 25, 26).

Although there is a good first-order correlation between the location of viral transcripts and ORFs predicted by sequence analysis (16, 34), the actual number of independent transcripts expressed and the total number of proteins encoded by HSV-1 are significantly greater than the 70 ORFs indicated. Many factors make anything more than a rough correlation between ORFs, viral proteins, and individual transcripts impossible. These complications can interfere with simple genetic manipulation of specific genes; failure to consider them can preclude direct interpretation of genetic analyses. Nested, partially overlapping transcripts, each encoding a unique ORF but utilizing a common polyadenylation site, are common (cf. transcripts encoding UL24, 25, and 26; those encoding UL39 and 40; those encoding UL5, 6, and 7, etc.). There are occasions wherein the same ORF is utilized to express partially overlapping proteins through expression of independently controlled, partially overlapping transcripts. Notable examples include the expression of overlapping transcripts containing the ORF encoding alkaline exonuclease (UL12), the capsid assembly proteins (UL26), and transcripts encoding portions of UL8 and UL9 (21, 35, 36). In at least one case (UL3), the transcript identified does not have the capacity to encode the whole ORF, and it is unclear how the complete ORF conserved between HSV-1 and -2 is expressed (27). Post-translational modification of viral proteins is also common and further increases the complexity of viral products. For example, virus-encoded protease is involved in the maturation of the UL26 capsid protein (35, 37-40). Finally, although the occurrence of transcript splicing is rare during productive infection, a few important transcripts do contain exons: α0, α22, α47, the latency-associated transcripts (LATs), and UL15. The UL15-encoded protein is ubiquitous among herpesviruses, appears to be distantly related to genomic packaging proteins encoded by large DNA

HERPES SIMPLEX VIRUS TRANSCRIPTIONS


bacteriophages, and may fulfill the same function in HSV (41-43). Other spliced transcripts occur as low-abundance mRNAs partially overlapping unspliced transcripts encoding major ORFs; such spliced transcripts could result in the expression of antigenically related but distinct proteins. Two examples are minor spliced variants of the transcript encoding gC (UL44) (44), and a recently described variant of the major unspliced mRNA encoding DNA polymerase (UL30) (45).
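The sequence features described in this section (ORFs initiated by ATG and terminated by TAA, TAG, or TGA, followed by a short trailer and a canonical AATAAA signal) can be illustrated with a minimal sketch. The function names, the `min_codons` cutoff for an "extensive" ORF, and the toy sequence used to exercise the code are assumptions made for illustration; they are not taken from the studies cited here, and only the strand supplied is scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
POLYA_SIGNAL = "AATAAA"  # canonical cleavage/polyadenylation signal


def find_orfs(seq, min_codons=100):
    """Return sorted (start, end) pairs for ATG...stop ORFs on the given strand.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    min_codons is an illustrative threshold, not a value from the text.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame initiator opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next initiator in this frame
    return sorted(orfs)


def next_polya_signal(seq, pos):
    """Return the index of the first AATAAA at or after pos, or -1 if absent."""
    return seq.upper().find(POLYA_SIGNAL, pos)
```

For a candidate ORF ending at position `end`, `next_polya_signal(seq, end) - end` gives the trailer length, which can be compared with the 10- to 25-base range quoted above. Scanning the opposite strand would require passing the reverse complement of the sequence.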

B. Productive Infection

1. VIRAL GENOME ENTRY AND α (IMMEDIATE-EARLY) GENE EXPRESSION

Productive infection of a cell by HSV involves a number of stages representing different levels of viral gene expression and interaction of viral gene products with host machinery. Virus entry requires sequential interaction between specific viral membrane glycoproteins and cellular receptors, notably heparan sulfate proteoglycans (reviewed in 46). Following virus entry, the nucleocapsid is transported to the nuclear pores, presumably mediated by the cellular cytoskeleton (9), where viral DNA is released into the nucleus. The virion-associated α-TIF protein can be inferred to accompany the viral genome into the nucleus, in that it functions in enhancing viral α-gene transcription. Release of the viral genome into the nucleus appears to require a viral function, because a temperature-sensitive mutant that releases DNA only at the permissive temperature has been characterized (47).

HSV-1 brings another important regulatory protein into the cytoplasm. The virion host shut-off protein (vhs, UL41) causes the disaggregation of polyribosomes and increased degradation of both cellular and viral RNAs (48, 49). This function is believed to be beneficial to the virus replication strategy for two reasons. First, it quickly decreases cellular RNA pools so that actively transcribed viral RNAs are preferentially expressed. Second, by decreasing the half-lives of viral messages, it allows for efficient transition from one phase of viral gene expression to the next.

Five HSV genes [α4 (ICP4), α0 (ICP0), α27 (ICP27/UL54), α22 (ICP22/US1), and α47 (ICP47/US12)] are expressed as α transcripts and function in the earliest stages of the lytic infection cycle. Immediate-early (α) expression is mediated by the action of α-TIF through its interaction with cellular transcription factors at specific enhancer elements. In the absence of virus-encoded protein synthesis, only α transcripts are expressed.
Because promoters controlling expression of all kinetic classes of HSV transcripts have features of cellular promoters and can be expressed by unmodified cellular transcription systems, the restriction of viral transcription in the absence of virus-induced protein synthesis is, in itself, sufficient to infer that the nature


of the viral genome as a transcription template has a critical role in subsequent viral gene expression. In addition to these well-characterized transcripts, at least two other promoters appear to be active under conditions of infection wherein all de novo protein synthesis is inhibited. These are the ribonucleotide-reductase large-subunit promoter (UL39), and a partially characterized promoter in the long repeat. The functional relevance of these promoters in productive viral infection and pathogenesis is unknown.

Proteins encoded by the α4, α0, and α27 transcripts have clear roles in regulation of viral gene expression at the level of transcription, or at least, mRNA expression (10, 18). They functionally interact to form nuclear complexes with viral genomes (50-52); the role of these interactions on global [...] only two (α4 and α27) have extensive areas of sequence similarity among a large number of α-herpesviruses. Recently, it has been shown that a cellular protein (p15) shares features and activities with the α4 protein (52a). Despite this, only amino-acid sequences in α27 appear to be conserved extensively among the more distantly related β- and γ-herpesviruses (53).

Much less is known about the two other α proteins, α22 and α47. Both are dispensable for virus replication in many types of cultured cells, but α22 is required for HSV replication in certain types of cultured cells and may have a role in maintaining the virus’s ability to replicate in a broad range of cells in the host (54, 55). The α47 protein has a role in modulating host response to infection by specifically interfering with the presentation of viral antigens on the surface of infected cells (56).

2. EARLY GENE EXPRESSION

Activation of the host-cell transcriptional machinery by the action of the α-gene products results in the expression of the early or β genes. Seven of these are necessary and sufficient for viral replication under all conditions: DNA polymerase (UL30), DNA binding proteins (UL42 and UL29), ori binding protein (UL9), and the helicase/primase complex (UL5, 8, and 52) (57-59). When sufficient levels of these proteins have accumulated within the infected cell, viral DNA replication ensues by a rolling-circle mechanism (9). Other early proteins, including thymidine kinase (UL23) and ribonucleotide reductase (UL39 and 40), are involved in increasing the deoxyribonucleotide pools of the infected cells, and still others, including uracil DNA-glycosylase (UL2) and alkaline exonuclease (UL12), presumably function as repair enzymes for the newly synthesized viral genomes. These accessory proteins are “nonessential” for virus replication in that cellular products can substitute for their function in one or another cell type or on replication of previously quiescent cells; however, disruption of such genes often has


profound effects on viral pathogenesis and/or ability to replicate in specific cells (60-66). Thus, any deficiencies in these genes greatly impair virus replication in the natural host.

3. LATE GENE EXPRESSION

The vegetative replication of viral DNA represents a critical and central event in the viral replication cycle. High rates of DNA replication irreversibly commit a cell to producing virus, which eventually results in cell destruction. DNA replication also has a significant influence on viral gene expression. Early expression is significantly reduced or shut off following the start of DNA replication, whereas late genes begin to be expressed at high levels. These late genes can be divided into two subclasses: leaky-late (βγ) and strict-late (γ). Leaky-late (βγ) genes are expressed at low levels prior to DNA replication, but reach maximum expression after viral DNA replication has been initiated. In contrast, strict-late (γ) genes are difficult to detect at all until the onset of viral DNA replication.

Immunofluorescence studies show that DNA replication occurs at discrete sites, or “replication compartments” (67, 68). Prior to DNA replication, the α4 protein and the β single-stranded DNA binding protein ICP8 (UL29) are distributed diffusely throughout the nucleus; concomitant with viral DNA replication, the distribution of these proteins changes to a punctate pattern. In the case of α4, this change involves interaction with α0 and α27 (50-52). These observations have two important implications. First, the virus can cause a restructuring of the nucleus and in effect create its own organelle-like structures. Second, transcription of early and late genes is likely to occur in two distinct environments. The transcriptional environment within replication compartments may lead to preferential transcription of late genes.

More than 30 HSV-1 gene products are structural components of the virion, and all are expressed with late kinetics; however, it is not clear whether the distinction between βγ and γ proteins is of functional significance or is merely a convenient experimental discrimination. For example, although expression of the major capsid protein, VP5, with βγ kinetics, and the penton protein, UL38, with γ kinetics, might be important in achieving maximum virus yields, there is no experimental evidence to suggest that this is indeed the case. Viral capsids assemble in the nucleus in the presence of the γ capsid assembly protein, UL26 (69, 70); concatamers of genomes generated by the rolling-circle mechanism are cleaved into monomer lengths and packaged into the preformed capsids. Capsids bud through the nuclear membrane, which contains the viral glycoproteins. During this process, the capsids are surrounded by tegument proteins, including α-TIF and vhs, which may


functionally interact to aid in envelopment (71). Enveloped capsids are transported via the Golgi apparatus, and are eventually released from the cell (63, 72, 73). The entire replication cycle is surprisingly rapid, with mature virions being formed in as little as 8 hours in some cell culture systems.
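The operational definitions of the four kinetic classes given in this section can be summarized as a small decision rule. The sketch below is a deliberate simplification: the function name, its three boolean inputs, and the order in which they are tested are assumptions made for illustration, not an assay described in the studies cited. It also ignores exceptions mentioned in the text, such as the UL39 promoter, which appears to be active when de novo protein synthesis is inhibited.

```python
from enum import Enum


class KineticClass(Enum):
    IMMEDIATE_EARLY = "alpha"
    EARLY = "beta"
    LEAKY_LATE = "beta-gamma"
    STRICT_LATE = "gamma"


def classify_transcript(expressed_without_protein_synthesis,
                        detectable_before_dna_replication,
                        increases_after_dna_replication):
    """Assign a kinetic class from three yes/no observations (hypothetical inputs)."""
    if expressed_without_protein_synthesis:
        # Only alpha transcripts are expressed without virus-encoded protein synthesis.
        return KineticClass.IMMEDIATE_EARLY
    if not detectable_before_dna_replication:
        # Strict-late transcripts are hard to detect until DNA replication begins.
        return KineticClass.STRICT_LATE
    if increases_after_dna_replication:
        # Leaky-late: low levels before replication, maximum expression after it.
        return KineticClass.LEAKY_LATE
    # Early: expressed before replication, reduced or shut off afterward.
    return KineticClass.EARLY
```

Under these assumptions, a transcript such as that encoding VP5, detectable before DNA replication and increasing afterward, would be returned as `KineticClass.LEAKY_LATE`.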

C. Latent Infection

1. ESTABLISHMENT AND MAINTENANCE

Latent infection by HSV in sensory neurons involves three distinct stages: establishment, maintenance, and reactivation (6, 18, 74-76). First, the virus enters neurons via infection at sensory endings and travels by retrograde axonal transport to the nucleus (77, 78). Restriction of viral gene expression occurs when productive-cycle genes are transcriptionally quiescent, and only a single transcription unit is expressed. It is likely that differences in the transcriptional environment within individual neurons play a critical role in the attenuation of productive infection, although other factors are also important. For example, the immune competency of the host has a major role in limiting viral productive infection in peripheral neurons, as well as spread to the CNS (79, 80). Viral genomes persisting in sensory neurons provide the reservoir of infectious virus for reactivation; therefore, virus must leave these neural cells to allow productive infection of peripheral cells without causing excessive damage, either by virus-induced reactivation or from the host’s physiological response to recrudescence. This latter requirement is critical for the laboratory analysis of factors involved in reactivation and limits the usefulness of neuron explantation and subsequent recovery of infectious virus as a model for the reactivation process.

The restriction of HSV productive-cycle gene expression during the establishment and maintenance of latency does not require the expression of any latent-phase-specific viral gene (6, 74, 81-83); however, restriction occurs only in a subset of infected neurons (84). Plausible models have been constructed involving the lack of, or a reduced amount of, the specific transcription factors interacting with the enhancers controlling expression of the viral α proteins in neurons, or negative competition for such sites by other transcription factors (85-89). As with the establishment of latency, no latent-phase gene expression is required for maintenance of viral genomes in latently infected animal models (90, 91); neurotropic herpesviruses are in a protected and nonreplicating environment where no viral gene products are necessary for genome survival. The viral genome is a histone-associated, super-coiled episome during latent infection (92-94). This state must preclude frequent sporadic episodes of transcription via unmodified cellular processes occurring in latently infected neurons, because low-level expression of viral transcripts, except those


specific to the latent phase, is either not detectable or occurs at such low frequency as to require PCR amplification for detection (74, 95, 96a). This is an important consideration, because many herpesvirus promoters are active at detectable levels even in the absence of viral transcriptional activation in biochemically transformed cells and in transient assays (97-103).

2. REACTIVATION

In humans and in several in vivo models, reactivation of HSV results in the appearance of infectious virus at the site of initial primary infection (see 104, 105 for reviews). Simply put, reactivation involves the presentation of a small amount of infectious virus or replication-competent viral genomes from the latent reservoir to peripheral tissues, leading to productive replication at the periphery until cleared by the host. PCR analysis of the earliest steps in reactivation in several animal models confirms this description (95, 96). Unlike the situation with establishment and maintenance of the latent infection, it is clear that expression of the latency-associated-transcription (LAT) unit has a significant role in reactivation, but the precise mechanism of this involvement is unknown; LAT expression is not an absolute requirement for reactivation, although it greatly increases the efficiency of induced reactivation in several in vivo models (96, 105, 106). To complicate matters further, the role of LAT expression in several popular (and relatively inexpensive) in vitro models is problematic (cf. 95). The investigation of very early events in reactivation is complicated by the fact that it is necessarily a low-frequency event leading to a low-multiplicity infection, and it is difficult to differentiate those viral functions required for efficient productive-phase infection from those required for reactivation itself. Animal models are an important resource for the rational experimental study of latency and reactivation of neurotropic herpesviruses, but no model fully reflects the actual situation in the natural host (104).

II. Transcriptional Switches during HSV Infection

As with other nuclear-replicating DNA viruses, all transcripts are expressed via cellular transcriptional enzymes and factors. In HSV, as well as other herpesvirus infections, expression is governed by individual promoters. It is, thus, of great interest to know how the promoters for the different kinetic classes of transcripts interact with cellular transcriptional machinery, and how these interactions change as the infection progresses. Further, it is also important to know whether the structure of the transcription template in toto or in specific regions has a role in the differential expression of various


viral genes. This is an especially interesting question in regard to the expression of LAT during the latent phase of infection, when all other viral genes are quiescent.

A. Transcriptional Regulation during Productive HSV Infection: General Considerations

The transcription regulatory proteins encoded by HSV-1 could exert their influence on transcription in a variety of ways, including, but not limited to, the following mechanisms: (1) enzymatic modification or release of sequestered cellular transcription factors to alter their activity; (2) direct or indirect binding to cis-acting promoter elements to influence transcription through interaction or recruitment of transcription factors; or (3) modification of the viral transcription template to change the accessibility of specific promoters to transcriptional machinery. The interaction between viral enhancer elements and α-TIF illustrates the best-documented example of a virus-encoded transcriptional activator functioning through interaction with specific DNA binding sites. As detailed below, it is not yet possible to be as precise concerning the specific mechanism of action of the three viral α regulatory proteins, α4, α27, and α0.

1. INITIAL STAGES OF HSV-MEDIATED TRANSCRIPTION: α-TIF-MEDIATED TRANSACTIVATION OF TRANSCRIPTION

The enhancement of α gene transcription by α-TIF has been the subject of much intensive research, and is relatively well understood (cf. 107). The 50,000-Mr α-TIF protein has two experimentally separable functional domains. The N-terminal half interacts with the cellular transcription factor, Oct-1, and accessory proteins (HCF), to form a DNA/protein complex at consensus “TAATGARAT” sequences found at single or multiple sites in the upstream enhancers of α-gene promoters (108-112). The C-terminal half of the protein is extremely acidic and acts as a potent “acidic blob” type of transcriptional activator (113-116). Experimental evidence indicates that this acidic activation domain of α-TIF can interact with both TFIID and TFIIB (115, 117-119). Thus, the protein may increase transcription by recruiting either of these transcription factors to the α promoters.
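The consensus “TAATGARAT” can be read with the standard IUPAC convention in which R denotes a purine (A or G). As a minimal sketch of how candidate sites of this kind might be located in an enhancer sequence, the code below scans both strands with a regular expression. The function names and the toy sequence are assumptions made for illustration; a literal match to the consensus says nothing about whether a site is bound in vivo.

```python
import re

# R is the IUPAC code for a purine (A or G); the lookahead permits
# adjacent sites that share a terminal T to be reported separately.
TAATGARAT = re.compile(r"(?=(TAATGA[AG]AT))")
MOTIF_LENGTH = 9
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_taatgarat(seq):
    """Return sorted (position, strand, site) tuples for consensus matches.

    position is the 0-based leftmost coordinate of the site on the strand
    supplied; strand is "+" or "-".
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in TAATGARAT.finditer(seq)]
    rc = reverse_complement(seq)
    for m in TAATGARAT.finditer(rc):
        # Convert a coordinate on the reverse complement back to the input strand.
        hits.append((len(seq) - m.start() - MOTIF_LENGTH, "-", m.group(1)))
    return sorted(hits)
```

With the toy input `"GGCTAATGAGATCCGGATCTCATTAGC"`, the function reports one site on each strand, at positions 3 and 16.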

2. Trans ACTIVATION BY THE HSV-1 α PROTEINS

a. The α4 Protein. The 132,000-Mr protein encoded by the 4.7-kb α4 transcript is a promiscuous trans-activator that stimulates transcription of a large number of promoters, including HSV-1 early and late promoters, heterologous cellular promoters, and minimal promoters containing TATA boxes as the only obvious cis-acting element (reviewed in 120). This observation has been used to suggest that α4 may function through interaction with


general transcription factors or even RNA polymerase. This interpretation is supported by two studies indicating that α4 protein and the closely related homolog encoded by pseudorabies virus (PRV), the immediate-early protein (IEP), can associate with TFIID. Partially purified fractions of IEP facilitate interactions between the adenovirus major-late promoter and TFIID in in vitro transcription studies (121), and α4 specifically interacts with the TATA binding protein (TBP) of TFIID and with TFIIB at the α4 promoter (122). Further, a reconstituted in vitro transcription system, consisting of partially purified general transcription factors, purified α4 from infected cells, and various glycoprotein C (gC) promoter constructs, has been described for studying the mechanism of transcriptional activation by ICP4 more precisely (123). This study demonstrated that ICP4 can activate transcription directly through the general transcription factors by increasing the rate of transcription initiation complex formation. An interesting finding of this study was that trans-activation by α4 requires the involvement of TBP-associated factors (TAFs), because recombinant TBP could not substitute for the multisubunit TFIID. Additionally, sequence elements at the gC start site were shown to play a critical role in ICP4-mediated transactivation of the gC promoter.

The α4 transcriptional activator is essential for virus replication and is continuously required for the transcription of all save the α (immediate-early) kinetic class of viral mRNAs (120), and activation is modulated by both α0 and α27 (124-126). Multiple phosphorylated forms of the protein have been identified in the infected cell; this phosphorylation has a role in modulating protein/DNA complex formation and may play a regulatory role during infection (127, 128). The α4 protein shares common features with GTP-binding proteins, and can also be post-translationally modified by uridine ribosylation, adenylation, and guanylation; however, no role in protein activity has been clearly correlated with these modifications (123, 130).

The α4 protein binds strongly to a consensus sequence at its own transcriptional start site, and interaction between the protein and this site leads to autoregulation of the α4 transcript as well as inhibition of transcription of other heterologous promoters containing consensus strong binding sites at or near their cap sites (131-133). The α4 protein also binds weakly to other nonconsensus sequences present in the promoters and sequences downstream of the cap sites of early and late genes (123, 131, 134-142). The relevance of this weak binding to the trans-activation of promoters is not clear at this time. Regions upstream and downstream of the TK (UL23) transcriptional start site form protein/DNA complexes that contain α4 (103, 135, 136), but deletion of the upstream α4 binding sites or the entire 5’ untranslated leader of the TK gene does not affect promoter activity (103, 139, 143). Similar results were obtained with the βγ glycoprotein-D (gD)


promoter, wherein mutation of upstream α4 binding sites did not affect transcription following infection with recombinant viruses (144). Although such results suggest that the ability of α4 to bind to DNA is not important for the trans-activation of β, βγ, and γ genes, mutational analyses of the α4 protein show that the DNA-binding and transcriptional-activation domains largely overlap (140, 145). This has led to the suggestion that the redundancy and degeneracy of α4 binding sites throughout the viral genome allow for mutation of individual sites within the context of specific promoters. In this view, α4 functions as a global trans-acting enhancer that binds at numerous locations throughout the genome and recruits transcription factors to the viral template (123, 138).

b. The α27 Protein. The essential α27 protein (51,000 Mr) is a nuclear protein that, like α4, undergoes differential phosphorylation during the course of infection. Transient expression assays show that, although α27 does not have an effect on reporter gene expression by itself, it can act as either a trans-repressor of immediate-early and early promoters or as a trans-activator of late promoters in conjunction with α0 and α4 (126, 146). The α27 protein also influences gene expression co- or post-transcriptionally by facilitating the utilization of specific cleavage polyadenylation sites, thereby promoting efficient transport of some late mRNAs from the nucleus. In addition, the protein has a role in altering basic splicing activity in the infected cell (147). Viruses containing temperature-sensitive mutations of α27 protein express elevated levels of immediate-early and early proteins, whereas expression of late-gene products is drastically reduced at the nonpermissive temperature (53, 148). Such results imply that the α27 protein is involved in the switch from early to late transcription. Consistent with this hypothesis is the fact that transcription of the α27 gene is shut off only following appreciable viral DNA replication, resulting in a large pool of this protein during the critical period of the early-to-late switch (28).

c. The α0 Protein. The nonessential α0 protein (79,000 Mr) enhances the transcriptional activation of the α4 protein in transient assays, and by itself can act as a powerful transcriptional activator on some promoters (149-151). It is also localized in the nucleus of infected cells, and although its function is dispensable for virus replication in cell culture, α0 null mutants have very poor plaquing efficiency. This implies an important role for the protein during low-multiplicity infections, and is sufficient to explain the observation that α0 null viruses reactivate poorly from latent infections (151, 152).
Although structure/function studies have defined regions within the a0 protein important in transcriptional activation, its mode of action remains


obscure, and specific DNA binding sites for it have not been identified. However, cysteine-rich and C-terminal regions of the protein have been identified as being important in its synergistic transcriptional activation with the α4 protein (51, 153). It has been suggested that one mode of action involves association with the cellular transcription factor AP1 (154).

B. Experimental Analysis of HSV Cis-acting Promoter Elements in Vivo

The coordinate regulation of groups of individual HSV transcripts, each controlled by an independent promoter, and the random general distribution of early and late promoters throughout the genome strongly suggest that local cis-promoter elements are sufficient to define kinetic class. As outlined below, the ability to insert defined minimal promoter/reporter gene constructs into different regions within the genome and faithfully reproduce the kinetic behavior of the native promoter confirms this conclusion (155-157). The different activities of α, β, βγ, and γ promoters at various times throughout the infection cycle must result from changes in the association between the cellular transcription machinery and kinetic-class-specific promoter elements. In other words, separate kinetic classes of viral transcripts are controlled by recognizably distinct promoters, which utilize cellular factors differently throughout the productive cycle. It might be noted here that virus infection also results in the modification of RNA polymerase, but it is not clear whether this has any major role in transcriptional switching (158).

The functional elements of a number of HSV promoters have been analyzed using a variety of methods. The most direct method is in situ modification of promoter elements followed by assays of gene expression, but this method can be difficult if not impossible for essential genes, and “cryptic” promoter elements removed from the immediate vicinity of the transcript start site may not be identified. To overcome such limitations, three general approaches have been exploited for experimental definition of cis-acting elements involved in regulation. These approaches generally involve standard molecular techniques to alter viral promoters or suspected regulatory sequences controlling the expression of a reporter gene, such as the bacterial β-galactosidase and chloramphenicol acetyltransferase (CAT) or the firefly luciferase genes.
Reporter gene constructs are then transiently or stably introduced into a cell as a plasmid or by infection with a recombinant virus, and the effect of the mutation is assessed by measuring reporter gene enzymatic activity or RNA levels. One approach has been to insert modified reporter constructs into the host genome and measure basal level expression as well as expression on infection by HSV (97, 99, 101). This technique has allowed identification of elements necessary for basal level expression and response to infection;


however, all promoters studied in this fashion show only β kinetics in infected cells, regardless of their kinetic class in the environment of the viral genome.

A second, and more widely used, approach has been to transfer a reporter construct into a permissive cell line and measure expression in the presence or absence of super-infecting virus or co-transfected plasmids expressing viral regulatory proteins (such as α4 and α0). This approach, which has the great advantage of relative speed and ease of manipulation of modified promoter elements, is effective in defining autoregulatory and enhancer elements of α gene promoters, in delineating the functional limits of other productive-cycle promoters, and in investigating elements of the LAT promoter (21, 87, 88, 98, 100, 102, 132, 159-175). Although early and late promoters can be partially differentiated in transient expression assays by virtue of the higher basal activity of the former in uninfected cells (159) and by their differential response to virus-induced reporter plasmid replication (98), both early and late promoters generally behave similarly in cells super-infected with virus, and early transcription is not shut off. Therefore, results obtained using transient expression assays are artificial and possibly artifactual as compared to the situation seen in the viral genome. The limitation of the approach is further emphasized by the fact that critical elements defined in transient expression assays do not always clearly correspond to those acting in the context of the viral genome (144, 176).

The best way around the limitations of the methods described above is to introduce modifications of the promoters in question into the viral genome. Many studies using recombinant virus place the modified promoter back into its normal genomic location (97, 132, 177-187). Such an approach, however, allows the possibility that redundant or degenerate regulatory elements outside the altered area may not be properly identified, thereby interfering with the interpretation of results. Also, the promoter in question must control expression of a nonessential gene in such systems, or complementing cell lines must be generated. The latter is a time-consuming process and can generate artifacts because of differential replication efficiencies in different complementing cell lines, problems with maintaining multiple lines, etc. For example, complementing cell lines expressing capsid proteins are difficult if not nearly impossible to generate in such a way as to produce reasonable levels of progeny virus when infected. Further, each promoter modification must be compared to wild-type virus in parallel experiments to ensure validity of results.

Given these considerations, we have exploited two standardized recombination sites within the HSV genome, one within the gC (UL44) gene and one


[Figure 2 artwork not reproduced: schematic map of the HSV-1 genome (TRL, UL, IRL, IRS, US, TRS) with expanded views of the recombination sites, marking TATA boxes, polyadenylation (pA) signals, and Xba and Sal restriction sites.]

FIG. 2. Sites for recombining modified HSV promoters controlling β-galactosidase into the HSV-1 genome. Recombination in the gC location can be accomplished by inserting the reporter gene/promoter cassette backward in the interior of the gC ORF with a bidirectional polyadenylation signal derived from SV40 serving to truncate normal gC transcription. A second set of recombination vectors in which the gC promoter has been deleted allows insertion of cassettes in either orientation. Recombination into the RL location is accomplished with a vector in which the LAT promoter and 5’ 1200 bases have been deleted and replaced by the β-galactosidase promoter cassette in either orientation. Together these recombination vectors provide a means of assaying the same promoter in different genomic environments for both specific and global elements controlling expression. Details concerning the construction and use of these recombination vectors have been described in a number of publications (155-157, 159, 176, 188-190a).

in the RL near the LAT promoter, where we could insert any viral promoter (Fig. 2) (155-157, 176, 188-190a). The system has been set up so that the same reporter gene containing a significant amount of its endogenous leader as a “buffer” is utilized with all promoters to be tested; thus, we have a recombination “cassette” standardizing all of our constructs. Because one of our goals has been to eliminate the possible interference of other promoter elements that might be captured in our study of productive cycle promoters, much of our work uses a region within the interior of the gC ORF that does not contain a promoter element within 600 or so bases. We have also used recombination vectors in which the entire gC promoter has been eliminated to confirm that orientation and upstream promoter elements do not influence reporter gene expression at this locus (190a, 191). Features and advantages of the approach are as follows:

1. The locations for insertion of modified promoters are within genes dispensable for efficient productive replication in cultured cells.
2. The activity of the wt promoter in its normal location and the modified promoter can be assayed in the same infected cells. This provides an invaluable internal control.
3. All promoters we have assayed to date (including 2 β, 2 βγ, and 2 γ) have behaved with kinetics identical to their wt counterparts in their normal location.
4. General or nonspecific modifications that might influence levels of promoter activity can be made in the vicinity of the inserted promoter without disrupting other viral genes.
5. All reporter constructs express very similar or identical mRNAs.
6. Any promoter can be assayed under standard and equivalent conditions. This means that late promoters encoding capsid proteins can be readily studied in the same context as early promoters. Importantly, variations in complementing cell lines are avoided.
7. All recombinants are essentially equivalent; therefore, the possibility of introducing deleterious mutations by the recombination event, although not eliminated, is equivalent for all promoters and constructs studied.

Despite the advantages of this approach, it does not, of course, entirely eliminate the possibility of artifact. For example, we must consider the effect of the transcriptional environment of the locus chosen for recombination, because it has been demonstrated that genomic “context” can have a significant effect on message levels even though it does not affect the kinetics of expression (155). However, such complications can easily be noted by virtue of having the wt internal control.

III. Functional Analysis of Specific HSV Promoters in Vivo

A generalized description of salient features of the architecture of some well-studied HSV promoters is presented in Fig. 3; the results of the detailed analysis that led to these generalizations are described below.


FIG. 3. The architecture of selected specific HSV promoters. Analyses of specific HSV promoters representing various kinetic classes of genes expressed during productive and latent infection are described in Section III and summarized here.

A. The LAT Promoter

HSV latency, LAT, and the LAT promoter have been the subject of many recent reviews (5, 8, 12, 75, 89, 192, 193). The abundant nuclear RNA hybridization seen in latently infected sensory neurons is the result of the accumulation of two stable uncapped, poly(A)- introns (2 and 1.4 kb in size) derived by the splicing of a large (~9 kb) primary transcript (19, 194, 195). Precise mapping of the 5' and 3' ends of the poly(A)- LAT species from latently infected tissue has been carried out using both S1 nuclease and RNase protection analysis. Northern blot analysis of poly(A)+ RNA from productively infected cells reveals the presence of a 9-kb transcript expressed from the LAT strand beginning just 3' of the LAT promoter and extending to a polyadenylation signal in the short repeat region (196, 197). It should be noted that others have reported detection of the primary transcript using northern blots of RNA from latently infected neural tissue (196); we have not, although we have detected both this and the processed form, derived by the splicing of the LAT intron, by PCR analysis (95, 96). The presence of the primary transcript can also be inferred from in situ hybridization data. The LAT promoter was first identified by generating recombinant viruses in which a canonical pol II promoter located ~1700 bases downstream of the 3' end of the α0 transcript was shown to mediate neuronal expression of a rabbit β-globin reporter gene (197). All transcription is abolished in latent infections with viruses in which this promoter element is deleted (96, 197, 198). This latter point is important because at least one other region of the RL downstream of the LAT promoter has promoter activity in transient expression assays, and appears to mediate long-term expression of a reporter gene in latently infected neurons (199); this “minor LAT” promoter is not active when the upstream promoter is deleted, and its physiological significance is unclear. RNase protection experiments with RNA from latently infected neurons and productively infected cells show that the LAT primary transcript initiates about 28 bases 3’ of the canonical TATA box of the LAT promoter. Work with recombinant viruses suggests that sequences 3’ of the transcription start site are also important for stable neuronal expression from this LAT promoter (200). The sequence of the promoter contains a number of potential control elements that could have a role in LAT expression during latent infections. These include a canonical cyclic AMP response element (CRE), CAAT box, and Sp1 binding sites. Further, there is a consensus α4 binding element at the cap site that may be involved in repression of LAT expression during productive infection (201). Although the actual importance of such sequences in the control of LAT expression is unclear, LAT expression is responsive to cAMP levels in cell culture (202-204). Further, mutation of promoter elements in situ results in viruses with altered expression of LAT during latent-phase infection (205). In order to examine neuronal specificity of the LAT promoter, we and others have utilized transient expression analysis (163, 170, 194, 196, 202, 206-209).
Our results, in general agreement with those of others, can be summarized as follows: A construct containing 360 bases upstream of the LAT cap expresses somewhat more β-galactosidase activity than a β promoter in uninfected rabbit skin cells. However, in cells infected with HSV, levels of indicator enzyme activity increase only approximately 3-fold compared to a greater than 50-fold increase with promoters active in productive infection. One factor that accounts for the lack of high-level induction of the LAT promoter by superinfection is the presence of the α4 autoregulatory site at the cap. However, other cis-acting elements in the promoter are also involved; for example, several constructs containing small deletions within the LAT promoter region show slightly higher activities in uninfected rabbit skin cells and in cells after HSV infection. Transient expression assays in murine neuroblastoma cells display a significantly different pattern of expression. Here, efficient expression of the LAT promoter requires sequences upstream of the distal PstI site, in that deletions of the 120 or so bases 5’ of this region significantly lower LAT promoter activity in neuroblastoma cells. Interestingly, elements of the DNA sequence in question seem to reduce LAT promoter activity in rabbit skin cells, because increased activity is observed with the deletions in such cells. Work in other laboratories shows that this region binds to a putative transcription factor found in neuron-derived IMR-32 cells (87, 163, 207). The HSV LAT promoter is active during productive infection of peripheral and cultured cells as well as during latent infection (8, 20, 194); therefore, it is not, strictly speaking, a neuron-specific promoter. The factors influencing the expression of LAT during latent infections of neurons are complex, and although their study is technically demanding, some conclusions are clear. For example, recombinant viruses in which the LAT promoter and various amounts of upstream sequence controlling reporter genes are introduced into the UL are not stably active for long periods, and removal of sequences downstream of the LAT cap site also yields viruses in which LAT promoter activity is not stable in latently infected neurons. Such results suggest that dispersed elements within the RL are important in stable neuronal expression of LAT (89). We have utilized a LAT promoter with a tighter neuronal restriction, that controlling the expression of the PRV analog of HSV-1 LAT, to begin the experimental investigation of cell-specific regulatory elements (190). The PRV latency-associated transcript and its corresponding promoter elements are not as well characterized as the corresponding HSV ones; however, the PRV transcript has features reminiscent of HSV-1 LAT. High-resolution in situ hybridization and cDNA mapping have located the 5’ end of the PRV latency transcript just downstream of a region of the genome containing sequence elements similar to those present in the HSV-1 LAT promoter (210-212).
These elements include nominal promoter-control elements as well as putative CRE and PRV autoregulatory immediate-early protein (PRVIEP) binding sites, the latter clearly homologous to the strong HSV-1 α4 binding sequence. We generated reporter constructs containing differing extents of the PRV LAT promoter linked to the bacterial β-galactosidase gene and recombined them into two locations in the HSV genome, within gC and in the RL replacing the LAT promoter (Fig. 2). We then assayed reporter gene activity following productive infection of cultured cells of various origins. Unlike recombinants containing HSV-1 promoters (including LAT), all variants of the PRV latency promoter are essentially inactive during lytic infection of rabbit skin cells or mouse embryo fibroblast cells whether present in the RL or the gC location of the genome. In contrast, all PRV latency promoters recombined into the RL express significant reporter gene activity in productively infected cells of neuronal origin. The lack of activity of the PRV LAT promoter in all cell types when resident in the UL indicates that the expression of this promoter is strongly dependent on its location within the HSV genome. The fact that even 900 bases of sequence upstream of the cap site are not sufficient to allow expression in the UL suggests that specific HSV sequences in the RL are important for the neuronal specificity observed. The exploitation of this system will be useful in further defining critical cis-acting elements important in neuronal expression of the PRV and HSV LAT promoters during latent infection.

B. Early HSV Promoters

1. α (IMMEDIATE-EARLY) PROMOTERS

Similar to the promoters controlling expression of β transcripts (see below), α promoters have many features reminiscent of most cellular pol II promoters, including TATA and CAAT boxes and other cellular transcription factor binding sites. Superimposed on this general scheme are α-specific control elements. The first of these are multiple upstream copies of the α enhancer that interacts with α-TIF to stimulate transcription (described in Section II,A,1). A second specific control element present in some α promoters mediates transcriptional repression by the α4 protein. This protein binds strongly to a consensus sequence “ATCGTCnnnnYCGRC” at its own transcriptional cap site (131-133); this sequence is also found at an analogous location in the transcription unit for another transcriptional activator, α0. As previously mentioned, the consensus α4 binding site is found in the LAT promoter and also occurs in the promoter controlling expression of unusual transcripts of unknown function (the L/STs) within the RL (213). The autoregulation of α4 protein expression is presumably important in the life cycle of the virus, and may also be important in keeping levels of LAT and L/ST low during productive infection. The lack of such a repressive site in the α27 promoter results in this transcript and the protein it encodes being expressed at high levels until expression of all early promoters is shut off (Section III,B,3); the importance of the α27 protein in the expression of strict late transcripts (see Section I,B,2) may correlate with its continued synthesis. It is not clear how the strong α4 binding site functions in the rather unusual pattern of expression of the α0 transcript.
Here, following an initial repression of transcription, expression reattains a high level late in the productive cycle (28, 214). It may be that the α0 transcript has a role in the expression of some late genes, but the fact that its product is dispensable in productive infection of cultured cells makes it more difficult to identify this role. It is also not clear what feature of the α0 promoter is involved in this resumed late expression, a property not seen with other early promoters. This is an interesting question, which is subject to careful experimental analysis.
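The consensus α4 binding sequence quoted above is written in standard IUPAC notation (n = any base, Y = C or T, R = A or G), so candidate sites can be located in a promoter sequence by a simple pattern search. The following is a minimal sketch of such a scan, not a method used in the studies cited; the function names are illustrative, and the example sequence in the usage note is hypothetical rather than taken from the HSV genome.

```python
import re

# Standard IUPAC codes needed for this consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "Y": "[CT]", "R": "[AG]"}

# Consensus alpha4 binding sequence as quoted in the text.
ALPHA4_CONSENSUS = "ATCGTCnnnnYCGRC"


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, consensus=ALPHA4_CONSENSUS):
    """Return (0-based start, strand, matched text) for non-overlapping matches on both strands."""
    pattern = consensus_to_regex(consensus)
    seq = seq.upper()
    hits = [(m.start(), "+", m.group()) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Map the reverse-strand match back to forward-strand coordinates.
        hits.append((len(seq) - m.end(), "-", m.group()))
    return sorted(hits)
```

For example, calling `find_sites("TTGACATCGTCCACACCGGCAAGT")` on this made-up 24-base sequence reports a single plus-strand match beginning at index 5. A real analysis would also need to consider near-consensus sites, which this exact-match sketch ignores.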


2. THE THYMIDINE KINASE PROMOTER

As exemplified by the now classic studies of McKnight and co-workers on the promoter controlling expression of the early (β) transcript encoding TK (UL23) (100, 102, 103, 175, 215-218), transcription of early transcripts is mediated by promoters that serve as models for cellular pol II promoters in general (101). Additional experimental evidence that promoters controlling early HSV transcripts are typical cellular pol II promoters comes from the observation that a cellular promoter of the same apparent general architecture, recombined into the viral genome, is expressed with normal early (β) kinetics (99, 101). Mutational studies of the TK promoter assayed by transient transfection experiments indicate that upstream transcription factor binding sites (two Sp1 binding sites and a CAAT box) and a TATA box are required for efficient expression. Although mutation of upstream Sp1 binding sites and the CAAT box greatly reduces the levels of RNA expression in recombinant virus, it does not alter the β pattern of expression (219). Deletion of the sequences from -12 to +189 of the TK gene does not significantly affect levels or kinetics of RNA expression (143); also, linker-scanning mutagenesis shows that sequences around the cap are not critical, and mutation results more in the loss of accuracy of transcription start site selection than in the quantitative alteration of promoter activity. These results suggest that there are no “early-specific” promoter elements identifiable within HSV early promoters, and argue that the elements critical for early expression of the TK promoter cluster between the TATA box and about 100 bases or so upstream. The TATA box and critical upstream transcription factor binding sites are separated by DNA stretches important in maintaining appropriate spatial requirements without possessing any specific sequence elements. Sequences downstream of the TATA element do not appear to play a dominant role in early gene expression.

3. SHUT-OFF OF EARLY TRANSCRIPTION

Transcription rates of all early transcripts peak at a time either prior to or coincident with maximal rates of viral DNA synthesis and decline markedly thereafter, so they are difficult to detect on polysomes following the onset of genome replication. The only known exception to this rule is seen with the transcription of the α0 mRNA, which increases late in infection after an extended hiatus. Actual times of maximum rates of β transcription fall into a minimum of three groups that do not readily correlate with the function of the proteins encoded (18, 28); for example, synthesis of the major DNA binding protein (UL29) mRNA peaks and declines significantly earlier than rates of transcription for some other mRNAs encoding DNA replication machinery (UL5, UL30, and UL42). This heterogeneity in times of attaining maximal rates of synthesis and shut-off may indicate that no single factor is involved. The specific mechanism or mechanisms of the shut-off of early promoters observed late in infection have yet to be determined. It is not known whether cis-acting elements within β promoters function as “shut-off” elements, although linker-scanning mutagenesis of the TK promoter argues against the presence of such elements within the promoter (215). It seems plausible that turnover of trans-acting factors required for efficient expression of β promoters leads to a progressive decrease in transcription from β promoters at late times. Further, the generation of large quantities of viral DNA could serve to titrate critical factors. Also, relocation of viral genomes into replication compartments at the time of viral DNA synthesis could lead to an abrupt alteration in the transcriptional environment. In support of these last two hypotheses is the fact that inhibition of viral DNA synthesis prevents the formation of replication compartments and the expression of γ genes, but does not affect β gene expression (67, 68).

4. ANALYSIS OF OTHER EARLY PROMOTERS

Mutational analyses of the β UL37 and dUTPase (UL50) promoters have been initiated in our laboratory both to determine whether the model derived from analysis of the TK promoter accurately describes other early promoters and to investigate the phenomenon of shut-off more closely. The UL37 promoter contains obvious TATA and CAAT boxes, but the nearest Sp1 site is over 130 bases upstream of the cap (188). Deletional analysis of this promoter in recombinant viruses demonstrates at least one critical region about 50 bases upstream of the cap site that plays a major role in expression (220); this element has no sequence similarity to any known transcription factor binding site. We have used viruses in which the dUTPase promoter controlling expression of β-galactosidase has been recombined into either the UL or RL as a control early promoter in a number of studies (156, 157, 190). Two interesting chimeras have been generated with this promoter: (1) UTP/VP5, wherein the dUTPase sequences from -235 to -4 have been juxtaposed onto the βγ VP5 promoter elements from -13 to +226, and (2) VP5/UTP, wherein the VP5 elements from -364 to -13 have been fused to dUTPase elements from -4 to +158. Both promoters demonstrate wt levels of reporter gene expression, and both display early kinetics of mRNA accumulation and shut-off. Such results, along with the fact that the VP5 cap sequence is critical in late, but not early, expression (see Section III,C,2), suggest that elements upstream of -4 are important in dUTPase promoter function, and that sequence elements near the cap site are not. This result is consistent with the lack of such elements in the model TK promoter, and also indicates that any cis-acting element involved in shut-off of early transcription must lie upstream of -4. The promoter controlling the transcript encoding the large subunit of the ribonucleotide reductase enzyme (UL39) is expressed with unusual early kinetics. This transcript is expressed very early following infection and can be detected at low levels in infected cells in which virus-induced protein synthesis is inhibited by cycloheximide (17, 26, 221). Because these conditions are used to define α kinetics of expression, the mRNA has been classified as an α transcript by some workers (221-223). Consistent with this classification, the promoter controlling this and the homologous HSV-2 transcript has an apparent TAATGARAT element, which defines the α promoter enhancer sequence responsive to α-TIF (222-224). Despite this, the UL39 upstream element does not have a fully equivalent role in the expression of UL39 mRNA during productive infection by HSV-1, because high levels of cycloheximide reduce expression to virtually undetectable levels in some cells (28), and α-TIF only weakly activates expression (225). The unusual nature of the UL39 promoter is confirmed by the fact that it (unlike any other productive cycle HSV promoter yet characterized) does not respond to activation by the α4 protein; rather, it responds only to activation by the α0 protein mediated by an AP1 site (223, 225). It is not known what biological role this very early expression of the UL39 transcript might play during productive infection, but it is perhaps not surprising that, given the large number of promoters in the HSV genome, some are atypical.

C. Late HSV Promoters

1. THE UL38 PROMOTER IS A MODEL FOR A NUMBER OF γ PROMOTERS

Mutational analyses of several γ promoters show them to lack functional cis-acting elements upstream of the TATA box, and to consist of a TATA box and critical cis-acting elements at or near the transcript cap site (168, 182, 185-187, 226). This structure is in striking contrast to the architecture of early promoters. We have extensively characterized the promoter regulatory regions governing expression of the γ UL38 transcript encoding the VP19C capsid protein. Comparison between the results of our research and data on other late promoters demonstrates that the UL38 promoter serves as a model γ promoter. Deletional and mutational analysis of the UL38 promoter defines three important cis-acting elements (188, 189): a TATA box (TTTAAA) at -31, an initiation element spanning the transcript start site to +9, and the downstream activation sequence (DAS) located within the 5' untranslated leader. The TATA box and initiation element form an irreducible core promoter, in that removal of either leads to a dramatic loss of promoter activity. The UL38 DAS lies within bases +20 to +33, and increases transcription from the core promoter approximately 10-fold in a number of cell lines. Unlike an enhancer element, the orientation and spacing of DAS with respect to the core promoter are critical for this stimulatory effect, and the region from +10 to +19 of the UL38 gene serves as a spacer region. Experimental evidence suggests that although a recognizable TATA box is required, specific TATA box homologies are largely interchangeable in the context of intact strict late promoters. For example, the TATA box of the γ glycoprotein H (gH; UL22) can be replaced by the TATA element of either the α ICP4, the β TK, or the γ gC promoters in the context of the gH promoter; each of these recombinants expressed the reporter gene with γ kinetics (184). Also, change of the wt noncanonical TATA box of the UL38 promoter, TTTAAAC, to the strong consensus, TATAAA, does not have a significant effect on transcription levels in a promoter context that contains the other critical elements of the UL38 promoter (the initiation element and DAS); in contrast, this TATA alteration increases transcription twofold in a UL38 promoter context in which DAS is mutated (157). These results indicate that the TATA box and elements near and downstream of the transcript start site together determine absolute levels of expression at late times. Analyses of the γ gC, gH, and UL38 promoters and the βγ gB and VP5 promoters show that specific sequences at the transcriptional start site play a critical role in transcription at late times (157, 186, 187, 189, 227). Deletion analysis of the leader demonstrates that elements to +9 are required for measurable expression from the UL38 promoter.
Replacement of UL38 promoter sequences from -14 to +18 with those of the β UL37 promoter results in an inactive promoter; interestingly, the addition of DAS downstream of a similar chimeric UL38/UL37 core promoter increases transcription to approximately 30% of UL38 promoter levels. Despite the quantitative difference, transcripts expressed from this chimeric promoter exhibit strict late kinetics indistinguishable from the wt promoter. Mutational analysis of UL38 DAS indicates that the most critical portion is the sequence G(G/T)AGC (157). This core element is present as a direct repeat, and both copies are required for full activity, although significant activity is observed with a single copy. Other γ promoters contain elements that share positional, sequence, and functional similarities with DAS. We have demonstrated that a similarly positioned element from the gC gene has partial DAS function in the UL38 promoter context; also, the US11 promoter contains a functional DAS element with striking homology to the UL38 element. Because of its strong effect on transcription and the presence of DAS-like elements in other γ promoters, we have characterized the mechanism of action of DAS on UL38 promoter activity. The strict spacing requirement of DAS within the viral mRNA leader and direct analysis of mRNA stability demonstrate that DAS does not act post-transcriptionally to improve RNA stability or processing, but rather functions to increase transcription directly. Two lines of molecular genetic evidence also support our conclusion that DAS functions to increase transcription initiation and not the efficiency of transcriptional elongation: (1) there is an interdependence between TATA box homology and DAS effect, i.e., DAS has a more pronounced effect on promoters containing “poor” TATA boxes; and (2) the addition of DAS downstream of an otherwise inactive promoter increases transcription to readily measurable levels. Further support of this conclusion is provided by our inability to detect short transcripts indicative of transcriptional pausing or premature termination during infection with a recombinant virus containing a UL38 promoter construct in which DAS is mutated. The structure of the UL38 promoter is very similar to the glial fibrillary acidic protein (GFAP) promoter and some other cellular promoters (228, reviewed in 157). This suggested to us that the transcription factor interacting with DAS might be of cellular origin. Accordingly, we performed in vitro transcription reactions with uninfected-cell nuclear extract and UL38 promoter constructs with or without DAS. The results of these assays were striking; the effect of DAS deletion on UL38 promoter activity in vitro is virtually identical to that observed in vivo. Further, by electrophoretic mobility shift and UV cross-linking assays we have partially characterized an unmodified cellular transcription factor that binds DAS.
The sum total of our results establishes that DAS specifically interacts with a cellular protein (DAS binding factor, DBF) of approximately 35 kDa. Work currently in progress suggests that DAS plays a critical role in nucleating higher order protein complex formation near the UL38 transcriptional start site; such a role is consistent with our earlier conclusions, based on molecular genetic studies, suggesting that DAS is involved with preinitiation complex formation. Additionally, the UL38 promoter interacts indirectly with the α4 protein via the interaction of cellular proteins (128, 188). Thus, α4 protein may function at the UL38 promoter by activating a basal transcriptional process potentiated by cellular factors. The similar features of a number of HSV-1 γ promoters suggest a model of γ-gene regulation in which the TATA box, a strict late (γ) initiator element, and DAS interact with cellular transcription factors to form a stable preinitiation complex; removal or alteration of any of these elements has a dramatic effect on complex formation and hence promoter activity. The interaction between the TATA box and the TATA binding protein (TBP) of the multisubunit general transcription factor TFIID is critical and may be stabilized and/or facilitated by DAS interaction with cellular DBF. The relative importance of this stabilization is dependent on the structure of the core γ promoter; thus, the contribution of DAS to total promoter activity will vary depending on the basal core promoter activity. The interactions between the preinitiation complex and DAS are equivalent to those occurring in uninfected cells with similar cellular promoters; partial viral specificity may be afforded by the additional interaction between the γ initiator element and an as yet uncharacterized cellular or virus-modified cellular protein. This protein could be a TBP associated factor (TAF) (in the terminology of 229), and thus a component of a late gene-specific TFIID complex, or could function independently of TFIID. The low levels of expression of γ transcripts observed prior to genome replication could be a result of the low inherent activity of γ promoters (e.g., the UL38 promoter), or could result from specific repression of some γ promoters prior to DNA replication, as has been suggested by others (230, 231). In addition to the possible virus-induced modification of cellular factors, two other events would be required to see the characteristic strong expression of γ promoters concomitant with viral genome replication: (1) an increase in transcription templates via genome replication, and (2) trans-activation of the basic transcription reaction by the α protein. Although much experimental justification for this model is in place, many specific details remain obscure. DBF is a potentially significant transcription factor important not only in HSV gene expression but also for cellular transcription by RNA polymerase II.
As such, purifying and characterizing DBF and identifying the protein or proteins it interacts with at the promoter are goals of current studies.
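The positional constraints described in this section (a core motif G(G/T)AGC present as a direct repeat, lying within +20 to +33 of the leader) can be expressed as a small sequence check. The sketch below is only an illustration of that logic under stated assumptions: the function names are invented here, positions are counted with the transcript start as +1, and the leader sequence used in the usage note is hypothetical, because the actual UL38 leader sequence is not given in the text.

```python
import re

# Core DAS motif from the text, G(G/T)AGC, as a regular expression.
# The lookahead allows overlapping matches to be reported as well.
DAS_CORE = re.compile(r"(?=(G[GT]AGC))")


def das_core_positions(leader):
    """Return leader positions (transcript start = +1) where a core motif begins."""
    return [m.start() + 1 for m in DAS_CORE.finditer(leader.upper())]


def has_direct_repeat(leader, window=(20, 33)):
    """Return True if at least two 5-base core motifs lie entirely inside the window."""
    first, last = window
    inside = [p for p in das_core_positions(leader)
              if p >= first and p + 4 <= last]
    return len(inside) >= 2


# HYPOTHETICAL leader used only to exercise the functions:
# 19 filler bases (+1 to +19), a 14-base stretch with two core motifs, then 3 more bases.
HYPOTHETICAL_LEADER = "ACCTTCATCACCAATCCAT" + "GGAGCAAAGTAGCC" + "TTT"
```

With this made-up leader, `das_core_positions(HYPOTHETICAL_LEADER)` returns `[20, 28]` and `has_direct_repeat(HYPOTHETICAL_LEADER)` is true; altering one copy (for example, changing GTAGC to GTACC) leaves a single motif and the check fails, mirroring the distinction drawn in the text between one and two copies without saying anything about the quantitative effect on transcription.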

2. PROMOTERS CONTROLLING βγ TRANSCRIPTS ARE HETEROGENEOUS IN OVERALL ORGANIZATION, BUT SHARE SOME COMMON FEATURES

Analyses of recombinant viruses containing engineered chimeric promoters composed of both β and γ promoter elements have led to the argument that βγ promoters are natural and functional chimeras of early and strict late promoters, where cis-acting elements upstream of the TATA box of the former act to “turn on” the promoter prior to viral genome replication, whereas late elements function to obviate shut-off and allow maximal expression following genome replication (181, 187). This view has been bolstered by the fact that many βγ promoters contain recognizable transcription-factor binding sites well upstream of their TATA box/cap sites, which have a role in transient expression. Despite this, our extensive functional analysis of critical elements of the βγ major-capsid-protein (VP5 or UL19) transcript promoter in recombinant viruses fails to identify any element other than the TATA box as being important in early expression. Further, late-specific elements near the cap site cannot readily be transported to change the kinetics of an early promoter (156, 176, 190). Thus, a chimeric arrangement of βγ promoters is not an accurate global model if, indeed, it describes any promoters of this class.

A number of sequence elements having homology to known transcription factor binding sites are located 5’ of the VP5 transcript start site. There is an excellent TATA box at -30; consensus binding sites for the Sp1 transcription factor at -48, -144, -208, and -268; a site for CAAT-binding protein at -111; and a YY1 factor binding site at -78. In addition, nominal binding sites for the HSV ICP4 regulatory protein occur at -228 and -353 (232). Transient expression assays have suggested that the VP5 promoter can be dissected into three domains: a core promoter domain containing sequences downstream of -83 responsible for the basal promoter activity; a domain between -83 and -125 that negatively regulates basal activity; and a domain upstream of -125 that allows expression to be activated by immediate-early (α) gene products (162, 233, 234). However, the VP5 promoter looks quite different when recombinant viruses are examined, thus providing a powerful argument against relying solely on transient expression data to formulate functional models of HSV promoters (156, 176).
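The element map and the three-domain interpretation from transient expression assays described above can be written out as a small data structure. What follows is only an illustrative sketch, not an analysis tool: the positions and the domain boundaries (-83 and -125) are those quoted in the text, whereas the names `VP5_ELEMENTS`, `transient_domain`, and `elements_by_domain`, the domain labels, and the assignment of a position lying exactly on a boundary are assumptions introduced here for illustration.

```python
# Sequence elements 5' of the VP5 transcript start site, as listed in the text.
# Positions are relative to the transcript start site. All names are illustrative.
VP5_ELEMENTS = [
    ("TATA box", -30),
    ("Sp1", -48),
    ("YY1", -78),
    ("CAAT", -111),
    ("Sp1", -144),
    ("Sp1", -208),
    ("ICP4", -228),
    ("Sp1", -268),
    ("ICP4", -353),
]


def transient_domain(position):
    """Assign a position to one of the three domains suggested by
    transient expression assays (boundaries at -83 and -125).
    How a position exactly on a boundary is treated is an assumption."""
    if position > -83:
        return "core"                 # downstream of -83: basal activity
    if position >= -125:
        return "negative regulatory"  # between -83 and -125
    return "alpha-responsive"         # upstream of -125


def elements_by_domain():
    """Group the listed elements by their transient-assay domain."""
    groups = {}
    for name, pos in VP5_ELEMENTS:
        groups.setdefault(transient_domain(pos), []).append((name, pos))
    return groups


if __name__ == "__main__":
    for domain, members in elements_by_domain().items():
        print(domain, members)
```

The grouping reflects the transient-expression view only; as noted above, the promoter looks quite different in recombinant viruses, so this layout should not be read as a functional model.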
Although the sequences upstream of the TATA element in the VP5 promoter appear to influence activity in transient expression, the deletion and/or modification of sequences upstream of -48 (removing three Sp1 sites, the CAAT site, two ICP4 sites, and the YY1 site) has no significant effect on either the levels or the kinetics of expression from the VP5 promoter in recombinant viruses. From this we conclude that the upstream boundary for full and regulated expression from the VP5 promoter lies within 48 bases of the transcript cap site. We cannot, however, exclude a role for the upstream sites in some other aspect of viral pathogenesis in certain critical tissues or under certain conditions of viral gene expression, such as early during the reactivation process. Such a role might correlate with the measurable importance of these sites seen in transient expression assays, wherein the transcription template is clearly different from that present on the viral genome during productive infection.

Our studies of the VP5 promoter in recombinant viruses show that it consists of a 60-bp region extending from a critical Sp1 site at -48 and includes sequences spanning the transcript start site to +10. In vitro footprinting and mobility-shift assays using purified Sp1 and crude nuclear extracts, respectively, show that this putative site does indeed specifically interact with Sp1. The critical element at the transcriptional start site is homologous to an equivalently located element in the LTR of HIV-1 that interacts with cellular transcription factors (235-237). Interestingly, perturbation of either extreme of the functional VP5 promoter has the same net effect: mRNA expression at early times prior to viral template replication is not significantly affected, whereas RNA levels at late times are drastically reduced. Indeed, the pattern of reporter mRNA expression from recombinant viruses either deleted of the critical Sp1 site or containing mutations near the transcript start site closely resembles that of nominal early promoters. The requirement for the Sp1 site late, rather than early, was unexpected and shows that the chimeric model proposed for βγ promoters is not valid for the VP5 promoter.

As noted previously (Section III,C,1), specific cis-acting elements at the transcriptional start site are a common structural feature of HSV γ promoters as well as of the gB promoter and other βγ promoters studied in detail; this structural similarity shows that βγ promoters are more clearly related to γ promoters than to β promoters, and argues that such elements play a crucial role in transcription at late times. The presence of two elements critical in transcription kinetics at distal ends of the VP5 promoter is consistent with action in a cooperative fashion; this would be analogous to late-phase promoter activity during infections with other DNA viruses. For example, during adenovirus late-gene expression, synergistic promoter activation occurs under conditions wherein no cooperative DNA binding of the cognate activators occurs.
From this, it was suggested that the late-phase-specific activation results from the simultaneous action of factors bound at the upstream and downstream elements onto a common component of the transcriptional machinery (238, 239). Thus, there may well be common structural features in the ability of promoters to be active on replicating templates in infections with nuclear-replicating DNA viruses.

Again, as observed for the UL38 and other γ promoters, the functional architecture of the VP5 promoter is similar to that of certain cellular promoters, as evidenced by a recent analysis of the triose phosphate isomerase (TPI) promoter (240). The defined cis-acting elements in this promoter are an Sp1 binding site at approximately -50, a TATA box, and a cap-proximal element; as noted elsewhere in this review, these elements are required for both constitutive expression and trans-activation by the E1A protein of adenovirus or the IEP of PRV. As with the VP5 promoter, there appears to be a functional interdependence between the Sp1 and cap site elements, because deletion of either profoundly affects promoter activity.

Such analyses illustrate a recurring theme from HSV promoter mutagenesis studies: the cis-acting elements of α, β, βγ, and γ promoters interact with cellular transcription factors to potentiate promoter activity. The ability to express differentially a large number of promoters during the course of infection must reflect differences in the ability of viral trans-acting proteins to interact with specific assembled transcription complexes, and differences in the nature of the transcription template at early and late times.

Despite the success of our approach for studying the cis-acting elements of γ promoters and for developing a consistent general model for promoters controlling this kinetic class of transcripts, preliminary analysis of other βγ promoters suggests that their architectures may be heterogeneous. For example, the α-TIF promoter appears similar to that of VP5 in that it contains sequences near its cap site required for full activity and has a potential Sp1 factor binding site upstream of the TATA box (162, and unpublished), but the promoter controlling the βγ gB (UL27) transcript appears to contain a functional DAS-like element downstream of the cap site that is required, in addition to sequences upstream of the cap, for full activity (241).

A tentative general model for βγ promoters can be formulated as follows: As with γ promoters, the core promoter comprises approximately 40 bases from the TATA box and sequences spanning the transcript start site; this core mediates the basic interaction between TFIID and initiator element binding factors. Again, like γ promoters, the full activity of the promoter late in infection requires the additional participation of a third, accessory transcription factor binding element or elements. In the case of the VP5 promoter, this is the Sp1 site just upstream of the TATA box, but DAS may fulfill this role in the gB promoter. Analysis of other βγ promoters should identify other elements with analogous functions. The relatively high activity of such promoters prior to viral genome replication is a result of greater inherent core promoter strength compared to γ promoters; indeed, minimal activity of βγ promoters essentially requires only a good TATA box.

IV. Other Factors in the Early/Late Switch in HSV mRNA Expression

Although this review emphasizes the role of specific promoters in the regulation of expression of productive and latent-phase transcripts, other factors are also important. The UL1 ORF, which codes for glycoprotein L (gL) (242), is spanned by two transcripts of 1.8 and 2.6 kb. Although both transcripts utilize the same promoter and start site, the 1.8-kb transcript is expressed with βγ kinetics and the 2.6-kb species is expressed with γ kinetics. This difference in kinetics of expression is clearly not related to promoter elements; rather, it is due to differential polyadenylation efficiency at the 3' end of the 1.8-kb transcript (27). This is a concrete example of a situation wherein differential polyadenylation has a role in determining the kinetics of accumulation of a specific HSV-1 transcript. Such a mechanism has been described in the regulation of expression of cytomegalovirus transcripts (see 243 for a review), and differential poly(A)-site utilization is an important factor in temporal regulation of adenovirus gene expression (cf. 244).

Although it has not been extensively characterized in HSV infections, differential efficiency in polyadenylation site usage early and late after infection has been seen with other polyadenylation signals in the genome as well (245-247). Further, it is quite clear that this differential polyadenylation efficiency is due, in part, to the actions of the α27 protein, which is generally conserved throughout the herpesviruses (53, 125, 147, 247, 248). It is not so clear, however, whether this phenomenon has any biologically significant role in differential viral gene expression during productive infection, although it could be important in the expression of a protein encoded by the full UL3 ORF that is strongly conserved, but is not expressed as a unique transcript (27).

V. In Vitro Analysis of HSV Promoters

As relevant cis-acting elements and trans-acting factors have become more clearly defined, it is reasonable to attempt reconstructing transcription complexes containing viral promoter elements and transcription factors from uninfected and/or infected cells to better understand their interactions. Although simple in concept, such in vitro approaches have been applied only sporadically in the investigation of HSV gene expression (122, 157, 249-252). Early studies showed that an uninfected cell transcription system efficiently utilizes an early viral promoter as a template (253). These studies utilized run-off transcription as an assay and relatively large (600-1200 bp) promoter elements, and high backgrounds significantly interfered with sensitivity. For these reasons, we concentrated much early effort in the definition of HSV promoters on the use of transient expression assays.

Recently, armed with the knowledge that different HSV promoters represent different types of cellular promoters, we have returned to in vitro transcription analysis (157). Currently, we are in the process of examining the transcription of model α (α4), β (dUTPase), βγ (VP16), and γ (UL38) promoters in an uninfected cell transcription system, using primer extension analysis to detect specific products (unpublished). These studies show interesting correlations between transcription levels and template concentration. At low template concentrations, the α4 and dUTPase promoters are highly active in this system; interestingly, increasing template concentration does not result in an increase in product formation, but instead leads to a decrease. With the VP16 promoter, the amount of transcription products formed is relatively constant over a wide range of template concentrations. In contrast, transcription product generated from the UL38 promoter increases in a nearly linear fashion until a plateau is reached. In short, like the situation with transient expression assays with replicating reporter plasmids (159), the early/late switch seen during infection can be partially mimicked through increasing template concentrations in vitro.

This system is an oversimplification of productive infection, and crude extracts demonstrate considerable variability. Also, it does not consider the effects of α proteins on the transcription mechanisms. Nonetheless, it does suggest that promoters of different classes can respond differently to changes in template concentrations. This provides a simple mechanistic basis for some aspects of the transcriptional patterns observed during productive infection.

The use of in vitro transcription would seem to be well suited for a relatively straightforward examination of the specific role that various HSV α regulatory proteins play in transcription during infection. As noted in Section II,A,2,a, this approach has been used with some success in the analysis of the interactions between the PRV IEP and cellular transcription factors, specifically TFIID (121, 254), and has recently been applied to the study of α4 protein function in both transcriptional activation and repression. The addition of semipurified α4 protein has a threefold stimulatory effect on gD promoter transcription in vitro, and deletion of an upstream weak binding site partially reduces this stimulation (255). A more dramatic effect is seen with a partially purified system (Section II,A,2,a) (123). A study of the repression of transcription by the α4 protein via its interaction with its strong binding site showed that although the α4 protein does not repress basal expression of its promoter, it does repress Sp1-activated expression (252). Further analysis should provide some valuable insights into the specific mechanism of activation and repression.

VI. Conclusions

The experimental work described in this review is fully consistent with our conclusion that the different stages of gene expression seen at various stages in latent and productive infection with HSV are controlled by readily differentiable promoters. The transcriptional events occurring during the programmed interplay between virus and host are shown in Fig. 4 in schematic form.

[Figure 4 schematic; panel labels: “Latent genomes are histone associated and transcriptionally quiescent”; “Early transcription templates dispersed”; “Late transcription requires cellular factors”.]

FIG. 4. Transcriptional switches in the life cycle of HSV. As outlined in this review, HSV transcription occurs in phases correlated with the stage of infection. Latent transcription occurs only from the LAT encoded in the RL. In this stage of infection, the viral genome is histone associated, and restriction of the productive cycle cascade results in stable association between viral genomes and neuronal cells. Both general cellular transcription factors and neuron-specific factors are involved in the continued expression of LAT. Only a subset of latently infected cells express this transcript, and a significant number of cells infected with LAT+ virus, as well as the great majority of cells infected with LAT- virus, maintain the viral genome in a completely inactive and unreactivatable state. LAT expression expedites reactivation by an as yet unknown mechanism. Productive infection involves the association of α promoters with cellular transcription factors mediated by the virion-associated α-TIF interacting with cellular Oct-1 binding at the TAATGARAT-containing enhancers. The expression of transcriptional activation proteins during this earliest stage of productive infection leads to general stimulation of cellular transcriptional processes and high-level expression of β genes and relatively lower level expression of βγ genes. In the case of both the β and the βγ VP5 promoter, only the TATA homology is clearly required for this early expression, although in the case of β promoters, upstream transcription-factor binding sites are important in maximum promoter activity. Concomitant with genome replication and in the presence of the α regulatory proteins, late (γ and βγ) promoters function at maximal levels, whereas β transcription diminishes. The active elements of both βγ and γ promoters required for maximum expression are clustered near the TATA box and include essential sequences at or near the transcription start site as well as other transcription-factor binding sites either upstream of the TATA box or downstream of the cap site.

All of the HSV promoters extensively analyzed are clearly representative of types of cellular promoters and require cellular transcription factors for activity. A simple demonstration of this fact is that all have appreciable template activity when incubated with transcription machinery from uninfected cells (Section V). What, then, functionally differentiates these promoters, and why do they display the differential kinetics of expression observed? One factor appears to be the activity of the virus-induced transcriptional regulator, α27, which is required for efficient late-gene expression (Section II,A,2,b). Yet, many of the observed changes in transcription patterns can be viewed as indicating a subtractive process, an increasing loss of transcriptional ability in the infected cell, so that efficient expression of promoters active in low-copy-number templates (i.e., latent and early) cannot be maintained while infection proceeds into the time of genome replication. Viral modification of cellular transcription factors can be expected to accelerate this decline in overall transcriptional potential.

The decrease in complexity of promoters outlined in Fig. 3 correlates well with time of maximum expression, and we suggest that the high copy number of late promoters, as well as their limited requirements for interaction with transcription factors in a very compact spatial organization relative to the transcription start site, together lead to production of relatively large amounts of late transcripts concomitant with genome replication. It may well be significant that the late promoters characterized to date have architectures reminiscent of constitutively expressed and/or cellular housekeeping gene promoters that are measurably active in transcriptionally “quiescent” cells. The compartmentalization of presumptive late transcription templates in the infected cell nucleus is observed using immunomicroscopy (Section I,B,3).
This can be expected to concentrate the infected cell’s waning transcriptional activity, and a molecular understanding of the ability of HSV-encoded regulatory proteins to interact with cellular transcriptional machinery and to influence their distribution in the nucleus is a goal of immense experimental and practical importance. We suggest that in addition to the compartmentalization of HSV transcription templates, viral regulatory proteins can influence what transcription factors are available and how they are ordered on different promoters. Thus, more extensive analysis of transcriptional switches requires the application of increasingly sophisticated ultrastructural methods in concert with careful biochemistry.

ACKNOWLEDGMENTS

Experimental work carried out by members of this laboratory described herein was supported by Grants CA11861 and AI06246 from the National Institutes of Health. Further support was provided by the UC Irvine Program in Animal Virology and The UC Irvine Biotechnology Corporate Affiliates. JFG was supported by training grants T32GM07311 and T32AI07319. JS was supported by training grant T32CA09054. We thank C. J. Huang, M. Petroski, N. Pande, and R. M. Sandri-Goldin for reading and suggesting modifications to this review.

REFERENCES

1. B. Roizman, in “Virology” (B. N. Fields, R. M. Chanock, D. M. Knipe, M. S. Hirsch, J. L. Melnick, M. T. P. Monath and B. Roizman, eds.), 2nd ed., p. 1787. Raven Press, New York, 1990.
2. J. Sinclair and J. G. P. Sissons, Semin. Virol.-Herpesvirus Latency 5, 249 (1994).
3. N. A. Peterslund, Scand. J. Infect. Dis., Suppl. 80, 15 (1991).
4. B. Sugden, Semin. Virol.-Herpesvirus Latency 5, 197 (1994).
5. J. G. Stevens, Curr. Top. Microbiol. Immunol. 70, 31 (1975).
6. J. G. Stevens, Semin. Virol.-Herpesvirus Latency 5, 191 (1994).
7. E. K. Wagner, ed., “Herpesvirus Transcription and its Regulation.” CRC Press, Boca Raton, FL, 1991.
8. M. K. Rice, G. B. Devi-Rao and E. K. Wagner, in “Genome Research in Molecular Medicine and Virology” (K. W. Adolph, ed.), p. 305. Academic Press, Orlando, FL, 1993.
9. B. Roizman and A. E. Sears, in “Virology” (B. N. Fields, D. M. Knipe, R. M. Chanock, M. S. Hirsch, J. L. Melnick, M. T. P. Monath and B. Roizman, eds.), 2nd ed., p. 1795. Raven Press, New York, 1990.
10. E. K. Wagner, in “Encyclopedia of Virology” (R. G. Webster and A. Granoff, eds.), p. 593. Academic Press, London, 1994.
11. J. Hay and W. T. Ruyechan, Curr. Top. Microbiol. Immunol. 179, 1 (1992).
12. E. K. Wagner, in “Pathogenicity of Human Herpesviruses Due to Specific Pathogenicity Genes” (Y. Becker and G. Darai, eds.), p. 210. Springer-Verlag, Heidelberg, 1994.
13. L. Aurelian, in “Encyclopedia of Virology” (R. G. Webster and A. Granoff, eds.), p. 587. Academic Press, San Diego, 1994.
14. A. J. Davison, ed., Semin. Virol.-Alpha-Herpesviruses 4 (1993).
15. D. J. McGeoch, Annu. Rev. Microbiol. 43, 235 (1989).
16. D. J. McGeoch, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 29. CRC Press, Boca Raton, FL, 1991.
17. K. P. Anderson, L. E. Holland and E. K. Wagner, J. Virol. 34, 9 (1980).
18. E. K. Wagner, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 1. CRC Press, Boca Raton, FL, 1991.
19. E. K. Wagner, W. M. Flanagan, G. B. Devi-Rao, Y. F. Zhang, J. M. Hill, K. P. Anderson and J. G. Stevens, J. Virol. 62, 4577 (1988).
20. E. K. Wagner, G. B. Devi-Rao, L. T. Feldman, A. T. Dobson, Y. F. Zhang, W. M. Flanagan and J. G. Stevens, J. Virol. 62, 1194 (1988).
21. K. G. Draper, G. B. Devi-Rao, R. H. Costa, E. D. Blair, R. L. Thompson and E. K. Wagner, J. Virol. 57, 1023 (1986).
22. E. K. Wagner, J. Invest. Dermatol. 83, 48s (1984).
23. E. K. Wagner, in “The Herpesviruses” (B. Roizman, ed.), Vol. 3, p. 45. Plenum, New York, 1985.
24. E. K. Wagner, Viral Oncol. 3, 239 (1983).
25. E. K. Wagner, R. H. Costa, G. B. Devi-Rao, K. Draper, R. J. Frink, L. Hall, M. Rice and W. Steinhart, in “Developments in Molecular Virology” (Y. Becker, ed.), Vol. 6, p. 79. Martinus Nijhoff, The Hague, 1985.

26. E. K. Wagner, K. P. Anderson, R. H. Costa, G. B. Devi-Rao, B. Gaylord, L. Holland, J. Stringer and L. Tribble, in “Herpesvirus DNA: Recent Studies on the Internal Organization and Replication of the Viral Genome” (Y. Becker, ed.), p. 45. Martinus Nijhoff, The Hague, 1981.
27. J. Singh and E. K. Wagner, Virology 196, 220 (1993).
28. Y. F. Zhang and E. K. Wagner, Virus Genes 1, (1987).
29. Y. F. Zhang, G. B. Devi-Rao, M. Rice, R. M. Sandri-Goldin and E. K. Wagner, Virology 157, 99 (1987).
30. J. R. Stringer, L. E. Holland, R. I. Swanstrom, K. Pivo and E. K. Wagner, J. Virol. 21, 889 (1977).
31. R. I. Swanstrom and E. K. Wagner, Virology 60, 522 (1974).
32. E. K. Wagner, R. I. Swanstrom and M. G. Stafford, J. Virol. 10, 675 (1972).
33. E. K. Wagner, Virology 47, 502 (1972).
34. P. A. Schaffer, E. K. Wagner, G. B. Devi-Rao and V. G. Preston, in “Genetic Maps 1987” (S. O’Brien, ed.), 4th ed., p. 93. CSH Lab Press, Cold Spring Harbor, NY, 1987.
35. F. Liu and B. Roizman, J. Virol. 65, 206 (1991).
36. K. Baradaran, C. E. Dabrowski and P. A. Schaffer, J. Virol. 68, 4251 (1994).
37. F. Liu and B. Roizman, J. Virol. 65, 5149 (1991).
38. V. G. Preston, F. J. Rixon, I. M. McDougall, M. McGregor and M. F. Al Kobaisi, Virology 186, 87 (1992).
39. F. Liu and B. Roizman, PNAS 89, 2076 (1992).
40. S. P. Weinheimer, P. J. McCann III, D. R. O’Boyle II, J. T. Stevens, B. A. Boyd, D. A. Drier, G. A. Yamanaka, C. L. DiIanni, I. C. Deckman and M. G. Cordingley, J. Virol. 67, 5813 (1993).
41. R. H. Costa, K. G. Draper, T. J. Kelly and E. K. Wagner, J. Virol. 54, 317 (1985).
42. A. J. Davison, Virology 186, 9 (1992).
43. A. P. W. Poon and B. Roizman, J. Virol. 67, 4497 (1993).
44. R. J. Frink, R. Eisenberg, G. Cohen and E. K. Wagner, J. Virol. 45, 634 (1983).
45. K. K. Wobbe, P. Digard, D. Staknis and D. M. Coen, J. Virol. 67, 5419 (1993).
46. P. G. Spear, M.-T. Shieh, B. C. Herold, D. WuDunn and T. I. Koshy, Adv. Exp. Med. Biol. 313, 341 (1992).
47. W. Batterson, D. Furlong and B. Roizman, J. Virol. 45, 397 (1983).
48. G. S. Read, B. M. Karr and K. Knight, J. Virol. 67, 7149 (1993).
49. A. A. Oroskar and G. S. Read, J. Virol. 63, 1897 (1989).
50. Z. Zhu, W. Cai and P. A. Schaffer, J. Virol. 68, 3027 (1994).
51. M.-A. Mullen, D. M. Ciufo and G. S. Hayward, J. Virol. 68, 3250 (1994).
52. D. M. Ciufo, M.-A. Mullen and G. S. Hayward, J. Virol. 68, 3267 (1994).
52a. M. Kretzschmar, K. Kaiser, F. Lottspeich and M. Meisterernst, Cell 78, 525 (1994).
53. R. M. Sandri-Goldin, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 77. CRC Press, Boca Raton, FL, 1991.
54. K. L. Poffenberger, P. E. Raichlen and R. C. Herman, Virus Genes 7, 171 (1993).
55. F. C. Purves, W. O. Ogle and B. Roizman, PNAS 90, 6701 (1993).
56. I. A. York, C. Roop, D. W. Andrews, S. R. Riddell, F. L. Graham and D. C. Johnson, Cell 77, 525 (1994).
57. C. A. Wu, N. J. Nelson, D. J. McGeoch and M. D. Challberg, J. Virol. 62, 435 (1988).
58. S. K. Weller, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 105. CRC Press, Boca Raton, FL, 1991.
59. P. D. Olivo and M. D. Challberg, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 137. CRC Press, Boca Raton, FL, 1991.

60. J. G. Jacobson, D. A. Leib, D. J. Goldstein, C. L. Bogard, P. A. Schaffer, S. K. Weller and D. M. Coen, Virology 173, 276 (1989).
61. C. R. Brandt, R. L. Kintner, A. M. Pumfery, R. J. Visalli and D. R. Grau, J. Gen. Virol. 72, 2043 (1991).
62. A. D. Idowu, E. B. Fraser-Smith, K. L. Poffenberger and R. C. Herman, Antiviral Res. 17, 145 (1992).
63. L. Shao, L. M. Rapp and S. K. Weller, Virology 196, 146 (1993).
64. R. B. Tenser, in “Pathogenicity of Human Herpesviruses Due to Specific Pathogenicity Genes” (Y. Becker and G. Darai, eds.), p. 68. Springer-Verlag, Berlin, 1994.
65. C. R. Brandt, R. Kintner, R. J. Visalli and A. M. Pumfery, in “Pathogenicity of Human Herpesviruses Due to Specific Pathogenicity Genes” (Y. Becker and G. Darai, eds.), p. 136. Springer-Verlag, Berlin, 1994.
66. M. Levine, D. J. Fink, R. Ramakrishnan, P. Desai, W. F. Goins and J. C. Glorioso, in “Pathogenicity of Human Herpesviruses Due to Specific Pathogenicity Genes” (Y. Becker and G. Darai, eds.), p. 222. Springer-Verlag, Berlin, 1994.
67. A. de Bruyn Kops and D. M. Knipe, Cell 55, 857 (1988).
68. D. M. Knipe, D. Senechek, S. A. Rice and J. L. Smith, J. Virol. 61, 276 (1987).
69. F. J. Rixon, A. M. Cross, C. Addison and V. G. Preston, J. Gen. Virol. 69, 2879 (1988).
70. D. R. Thomsen, L. L. Roof and F. L. Homa, J. Virol. 68, 2442 (1994).
71. C. A. Smibert, B. Popova, P. Xiao, J. P. Capone and J. R. Smiley, J. Virol. 68, 2339 (1994).
72. J. D. Baines and B. Roizman, J. Virol. 66, 5168 (1992).
73. L. A. Tengelsen, N. E. Pederson, P. R. Shaver, M. W. Wathen and F. L. Homa, J. Virol. 67, 3470 (1993).
74. J. G. Stevens, Microbiol. Rev. 53, 318 (1989).
75. R. Ahmed and J. G. Stevens, in “Virology” (B. N. Fields and D. M. Knipe, eds.), 2nd ed., p. 211. Raven Press, New York, 1990.
76. E. K. Wagner, ed., Semin. Virol.-Herpesvirus Latency 5 (1994).
77. J. G. Stevens and M. L. Cook, Science 173, 843 (1971).
78. M. L. Cook, V. B. Bastone and J. G. Stevens, Infect. Immun. 9, 946 (1974).
79. G. A. Lewandowski, D. Lo and F. E. Bloom, PNAS 90, 2005 (1993).
80. A. Simmons, D. Tscharke and P. Speck, Curr. Top. Microbiol. Immunol. 179, 31 (1992).
81. R. T. Javier, J. G. Stevens, V. B. Dissette and E. K. Wagner, Virology 166, 251 (1988).
82. J. P. Katz, E. T. Bodin and D. M. Coen, J. Virol. 64, 4288 (1990).
83. F. Sedarati, T. P. Margolis and J. G. Stevens, Virology 192, 687 (1993).
84. T. P. Margolis, F. Sedarati, A. T. Dobson, L. T. Feldman and J. G. Stevens, Virology 189, 150 (1992).
85. K. A. Lillycrop, J. K. Estridge and D. S. Latchman, Virology 196, 888 (1993).
86. K. A. Lillycrop, M. K. Howard, J. K. Estridge and D. S. Latchman, NARes 22, 815 (1994).
87. D. S. Latchman, in “Pathogenicity of Human Herpesviruses Due to Specific Pathogenicity Genes” (Y. Becker and G. Darai, eds.), p. 238. Springer-Verlag, Berlin, 1994.
88. G. S. Hayward, Semin. Virol.-Transcript. Regul. Viruses 4, 15 (1993).
89. L. T. Feldman, Semin. Virol.-Herpesvirus Latency 5, 207 (1994).
90. F. Sedarati, K. M. Izumi, E. K. Wagner and J. G. Stevens, J. Virol. 63, 4455 (1989).
91. K. M. Izumi, A. M. McKelvey, G. B. Devi-Rao, E. K. Wagner and J. G. Stevens, Microb. Pathog. 7, 121 (1989).
92. S. L. Deshmane and N. W. Fraser, J. Virol. 63, 943 (1989).
93. D. L. Rock and N. W. Fraser, Nature 302, 523 (1983).
94. D. L. Rock and N. W. Fraser, J. Virol. 62, 3820 (1985).
95. G. B. Devi-Rao, D. C. Bloom, J. G. Stevens and E. K. Wagner, J. Virol. 68, 1271 (1994).

96. D. C. Bloom, G. B. Devi-Rao, J. M. Hill, J. G. Stevens and E. K. Wagner, J. Virol. 68, 1283 (1994).
96a. M. F. Kramer and D. M. Coen, J. Virol. 69, 1389 (1995).
97. S. Silver and B. Roizman, MCBiol 5, 518 (1985).
98. R. H. Costa, K. G. Draper, G. B. Devi-Rao, R. L. Thompson and E. K. Wagner, J. Virol. 56, 19 (1985).
99. J. R. Smiley, C. Smibert and R. D. Everett, J. Virol. 61, 2368 (1987).
100. S. L. McKnight, CSHSQB 47, 945 (1983).
101. J. R. Smiley, B. Panning and C. A. Smibert, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 151. CRC Press, Boca Raton, FL, 1991.
102. S. L. McKnight and R. Tjian, Cell 46, 795 (1986).
103. S. P. Eisenberg, D. M. Coen and S. L. McKnight, MCBiol 5, 1940 (1985).
104. L. R. Stanberry, Semin. Virol.-Herpesvirus Latency 5, 213 (1994).
105. L. R. Stanberry, Curr. Top. Microbiol. Immunol. 179, 15 (1992).
106. J. M. Hill, F. Sedarati, R. T. Javier, E. K. Wagner and J. G. Stevens, Virology 174, 117 (1990).
107. A. C. Wilson, M. A. Cleary, J.-S. Lai, K. LaMarco, M. G. Peterson and W. Herr, CSHSQB 58, 167 (1993).
108. P. Xiao and J. P. Capone, MCBiol 10, 4974 (1990).
109. B. Roizman and D. Spector, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 17. CRC Press, Boca Raton, FL, 1991.
110. S. Stern and W. Herr, Genes Dev. 5, 2555 (1991).
111. T. M. Kristie and P. A. Sharp, JBC 268, 6525 (1993).
112. A. C. Wilson, K. LaMarco, M. G. Peterson and W. Herr, Cell 74, 115 (1993).
113. L. Donaldson and J. P. Capone, JBC 267, 1411 (1992).
114. P. O’Hare and G. Williams, Bchem 31, 4150 (1992).
115. M. Ptashne, Nature 335, 683 (1988).
116. S. J. Triezenberg, R. C. Kingsbury and S. L. McKnight, Genes Dev. 2, 718 (1988).
117. S. Hayes and P. O’Hare, J. Virol. 67, 852 (1993).
118. B. Choy and M. R. Green, Nature 366, 531 (1993).
119. J. Greenblatt, Cell 66, 1067 (1991).
120. D. M. Knipe, Adv. Virus Res. 37, 85 (1989).
121. S. M. Abmayr, J. L. Workman and R. G. Roeder, Genes Dev. 2, 542 (1988).
122. C. A. Smith, P. Bates, R. Rivera-Gonzalez, B. Gu and N. A. DeLuca, J. Virol. 67, 4676 (1993).
123. B. Gu and N. DeLuca, J. Virol. 68, 7953 (1994).
124. R. M. Sandri-Goldin, R. E. Sekulovich and K. Leary, NARes 15, 905 (1990).
125. I. L. Smith, M. A. Hardwicke and R. M. Sandri-Goldin, Virology 186, 74 (1992).
126. R. E. Sekulovich, K. Leary and R. M. Sandri-Goldin, J. Virol. 62, 4510 (1988).
127. K. W. Wilcox, A. Kohn, E. Sklyanskaya and B. Roizman, J. Virol. 33, 167 (1980).
128. A. G. Papavassiliou, K. W. Wilcox and S. J. Silverstein, EMBO J. 10, 397 (1991).
129. J. A. Blaho and B. Roizman, J. Virol. 65, 3759 (1991).
130. J. A. Blaho, N. Michael, V. Kang, N. Aboul-Ela, M. E. Smulson, M. K. Jacobson et al., J. Virol. 66, 6398 (1992).
131. N. Michael and B. Roizman, PNAS 86, 9808 (1989).
132. M. S. Roberts, A. Boundy, P. O’Hare, M. C. Pizzorno, D. M. Ciufo and G. S. Hayward, J. Virol. 62, 4307 (1988).
133. S. W. Faber and K. W. Wilcox, NARes 16, 555 (1988).
134. N. Michael, D. Spector, P. Mavromara-Nazos, T. M. Kristie and B. Roizman, Science 239, 1531 (1988).

162

EDWARD K. WAGNER ET AL.

A. G . Papavassiliou and S. J. Silverstein, JBC 265, 9402 (1990). A. 6 . Papavassiliou and S. J. Silverstein, JSC 265, 1648 (1990). P. Kattar-Cooley and K. W. Wilcox, J. Virol. 63, 696 (1989). P. Beard, S. Faber. K. W. Wilcox and L. I. Pizer, PNAS 83, 4016 (1986). A . N. Imbalzano, A. A. Shepard and N. A. DeLuca, J. Virol. 64, 2620 (1990). A. A . Shepard, A. N. Imbali.ano and N. A. DeLuca, J. Virol. 63, 3714 (1989). A. A. Shepard and N. A. DeLuca, J. Virol. 65, 787 (1991). J. A. DiDonato, J. R . Spitzner and M. T. Muller, JMB 219, 451 (1991). M. E. Hdpern and J. R. Smiley, J . Virol. 50, 733 (1984). J. R. Smiley, D. C. Johnson, L. I. Pizer and R. D. Everett, J. Virol. 66, 623 (1992). N. A. DeLuca and P. A. SchaEer, J. Virol. 62, 732 (1988). S. A . Rice and D. M. Knipe, J. Virol. 62, 3814 (1988). R. M. Sandri-Coldin and G . E. Mendoza, Genes Deu. 6 , 848 (1992). W. R . Sacks, C. C. Greene. D. P. Aschrnan and P. A. SchafFer,]. Virol. 55, 796 (1985). R. D. Everett, R. M. Preston and N. D. Stow, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 49. CRC Press, Boca Raton, FL, 1991. 150. W. Cai and P. A. SchaEer, J. Virol. 66, 2904 (1992). 151. W. Cai, T. L. Astor, I.. M. Liptak, C. Cho, D. M. Coen and P. A. Schatfer, J . Virol. 67, 7501 (1993). 152. X. Zhu, J. Chen, C. S. H. Young and S. J. Silverstein, J. Virol. 64, 4489 (1990). 153. R. D. Everett, /. Gen. Virol. 70, 1185 (1989). 154. K.-L. Jang, B. Pulverer, J. R. Woodgett and D. S. Latchman, NARes 19, 4879 (1991). 155. S. A. Goodart, J. F. Cuzowski, M. K. Rice and E. K. Wagner, J. Virol. 66, 2973 (1992). 156. C. Huang, S. A. Coodart, M . K. Rice, J. F. Guzowski and E. K. Wagner,]. Virol. 67,5109 (1993). 157. J. F. Guzowski, J. Singh and E. K. Wagner, J. Virol. 68, 7774 (1994). 158. S. A. Rice, M. C. Long, V. Lam and C. A. Spencer, J. Virol. 68, 988 (19%). 159. B. W. Snowden, E. D. Blair and E. K. Wagner, Virus Genes 2, 129 (1988). 160. W. M. Flanagan and E. K. Wagner, Virus Genes I, 61 (1987). 
161. E. D. Blair, C. C. Blair and E. K. Wagner, J. Virol. 61, 2499 (1987). 162. E. D. Blair and E. K. Wagner, J. Virol. 60, 460 (1986). 1 6 3 . A. H. Batchelor and P. O’Hare, /. Virol. 64, 3269 (1990). 164. P. Mavromara-Nazos, S. Silver, J. Hubenthal Voss, J. L. McKnight and 8 . Roizrnan, Virology 149, 152 (1986). 165. R. D. Everett, J M B 203, 739 (1988). 166. C. I. Ace, M. A. Dalrymple, F. H. Ramsay, V. G. Preston and C. M. Preston, J. Gen. Virol. 69, 2595 (1988). 167. R. D. Everett, E M B O J . 6, 2069 (1987). 168. P. A. Johnson and R. I). Everett, NARes 14, 8247 (1986). 169, E. D. Blair and B. W. Snowden, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.). p. 181. CRC Press, Boca Raton, FL, 1991. 170. K. A. Lillycrop, C . L. Dent, S. C. Wheatley, M. N. Beech, N. N. Ninkina, J. N. Wood and D. S. Latchman, Neuron 7, 381 (1991). 171. R. D. Everett, E M B O J . 3, 3135 (1984). 172. P. O’Hare and G. S. Hayward, Cancer Cells 4, 175 (1986). 173. M. Campbell, J. Palfreyman and C. M. Preston, J M B 18, 1 (1984). 174. R. D. Everett, NARes 7. 3037 (1984). 175. S. L. McKnight, Cell 31, 355 (1982). 176. C. Huang and E. K. Wagner, J . Virol. 68, 5738 (1994). 177. B. Roizman, Science 239. GllO (1988).

135. 236. 137. 138. 139. 140. 141. 142. 143. 144. 145. 146. 147. 148. 149.

HERPES SIMPLEX VIRUS TRANSCRIPTIONS

163

178. M. Arsenakis, G. Campadelli Fiume and B. Roizman, J. Virol. 62, 148 (1988). 179. F. J. Jenkins and B. Roizman, BioEssays 5 , 244 (1986). 180. D. Spector, F. Purves and B. Roizman, PNAS 87, 5268 (1990). 181. P. Mavromara-Nazos and B. Roizman, PNAS 86, 4071 (1989). 182. F. L. Homa, T. M. Otal, J. C. Glorioso and M. Levine, MCBiol6, 3652 (1986). 183. J. P. Weir and P. R. Narayanan, J. Virol. 64, 445 (1990). 184. K. R. Steffy and J. P. Weir, J. Virol. 65, 972 (1991). 185. F. L. Homa, A. Krikos, J. C. Glorioso and M. Levine, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 207. CRC Press, Boca Raton, FL, 1991. 186. K. R. Steffy and J. P. Weir, J . Virol. 65, 6454 (1991). 187. M. Sethna and J. P. Weir, Virology 196, 532 (1993). 188. W. M. Flanagan, A. G. Papavassiliou, M. Rice, L. B. Hecht, S. J. Silverstein and E. K. Wagner, J . Virol. 65, 769 (1991). 189. J. F. Guzowski and E. K. Wagner, J. Virol. 67, 5098 (1993). 190. C. Huang, M. K. Rice, G. B. Devi-Rao and E. K. Wagner, J. Virol. 68, 1972 (1994). 190a. J. Singh and E. K. Wagner, Virus Genes, in press (1995). 191. J. Singh, “Molecular Analysis of Herpes Simplex Virus Type 1 Promoters.” University of California, Irvine, 1994. 192. L. T. Feldman, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 223, CRC Press, Boca Raton, FL, 1991. 193. N. W. Fraser, T. M. Block and J. G. Spivack, Virology 191, 1 (1992). 194. G. B. Devi-Rao, S. A. Goodart, L. B. Hecht, R. Rochford, M. K. Rice and E. K. Wagner, J. Virol. 65, 2179 (1991). 195. M. J. Farrell, A. T. Dobson and L. T. Feldman, PNAS 88, 790 (1991). 196. J. C. Zwaagstra, H. Ghiasi, S. M. Slanina, A. B. Nesburn, S. C. Wheatley, K. Lillycrop, J. Wood, D. S. Latchman, K. Patel and S. L. Wechsler, J. Virol. 64, 5019 (1990). 197. A. T. Dobson, F. Sedarati, G. B. Devi-Rao, W. M. Flanagan, M. J. Farrell, J. G. Stevens, E. K. Wagner and L. T. Feldman, J. Virol. 63, 3844 (1989). 198. W. J. Mitchell, I. Steiner, S. M. Brown, A. R. 
MacLean, J. H. Suhak-Sharpe and N. W. Fraser, J. Gen. Virol. 71, 953 (1990). 199. W. F. Goins, L. R. Sternberg, K. D. Croen, P. R. Krause, R. L. Hendricks, D. J. Fink, S. E. Strauss, M. Levine and J. C. Glorioso, J. Virol. 68, 2239 (1994). 200. T. P. Margolis, D. C. Bloom, A. T. Dobson, L. T. Feldman and J. G. Stevens, Virology 197, 585 (1993). 201. R. Rivera-Gonzalez, A. N. Imbalzano, B. Gu and N. A. DeLuca, Virology 202,550 (1994). 202. J. C. Zwaagstra, H. Ghiasi, A. B. Nesburn and S. L. Wechsler, J. Gen. Virol. 70, 2163 (1989). 203. D. A. Leib, K. C. Nadeau, S. A. Rundle and P. A. Schaffer, PNAS 88, 48 (1991). 204. A. H. Batchelor, K. W. Wilcox and P. O’Hare, J . Gen. Virol. 75, 753 (1994). 205. K. A. Rader, C. E. Ackland-Berglund, J. K. Miller, J. S. Pepose and D. A. Leib, J. Cen. Virol. 74, 1859 (1993). 206. J. C. Zwaagstra, H. Ghiasi, A. B. Nesburn and S. L. Wechsler, Virology 182, 287 (1991). 207. A. H. Batchelor and P. O’Hare, J. Virol. 66, 3573 (1992). 208. J. A. Morrow and F. J. Rixon, J. Gen. Virol. 75, 309 (1994). 209. J. J. Kenny, F. C. Krebs, H. T. Hartle, A. E. Gartner, B. Chatton, J. M. Leiden, J. P. Hoefiler, P. C. Weber and B. Wigdahl, Virology 200, 220 (1994). 210. S. A. Priola, D. P. Gustafson, E. K. Wagner and J. G. Stevens, J. Vird. 64,4755 (1990). 211. S. A. Priola and J. G . Stevens, Virology 182, 852 (1991). 212. A. K. Cheung, J. Virol. 65, 5260 (1991). 213. L. Yeh and P. A. Schaffer, J . Virol. 67, 7373 (1993).

164

EDWARD K. WAGNER ET AL.

E. Hams-Hamilton and S. Bachenheimer, J. Virol. 53, 144 (1985). D. M. Coen, S . P. Weinheimer and S. L. McKnight, Science 234, 53 (1986). S. L. McKnight, NARes 8, 5949 (1980). S . L. McKnight and R. Kingsbury, Science 217, 316 (1982). J. R. Smiley, in “Viral Messenger RNA: Transcription, Processing, Splicing and Molecular Structure” (Y. Becker, ed.), p. 101. Martinus Nijhoff, Boston, 1985. 219. A. N. Imbalzano, D. M. Coen and N. A. DeLuca,J. Virol. 65, 565 (1991). 220. N. T. Pande and E. K. Wagner, unpublished. 221. R. J. Watson, C. M. Preston and J. B. Clements,]. Virol. 31, 42 (1979). 222. J. P. Wymer, C. M. J. Aprhys, T. D. Chung, C.-P. Feng, M. Kulkaand L. Aurelin, Virus Res. 23, 253 (1992). 223. P. Desai, R. Ramakrishnan, 2. W. Lin, B. Osak, J. C. Glorioso and M. Levine, J . Virol. 67, 6125 (1993). 224. J. P. Wymer, T. D. Chung, Y.-N. Chang, C. S. Hayward and L. Aurelian, J. Virol. 63, 2773 (1989). 225. P. Sze and R. C. Herman, Virus Res. 26, 141 (1992). 226. F. L. Homa, J. C. Glorioso and M. Levine, Genes Den 2, 40 (1988). 227. E. B. Rasmussen and J. T.Lis, PNAS 90, 7923 (1993). 228. Y. Nakatani, M. Brenner and E. Freese, PNAS 87, 4289 (1990). 229. B. D. Dyntacht, T. Hoey and R. Tjian, CeU 66, 563 (1991). 230. B. Meignier, R . Longnecker, P. Mavromara-Nazos, A. E. Sears and B. Roizman, Virology 162, 2-51 (1988). 231. R. Rivera-Conzalez. B. Gu and N. A. DeLuca, 19th Int. Herpesoirus Workshop, Vancouver, p. 63 (1994). 232. B. Roizman and F. J. Jenkins, Science 229, 1208 (1985). 233. S. Chen, L. Mills, P. Perry, S. Riddle, R. Wobig, R. Lown and R. L. Millette, J. Virol. 66, 4304 (1992). 234. L. K. Mills, Y. Shi and R. L. Millette, J . Virol. 68, 1234 (1994). 235. H. Du, A. L. Roy and R. C. Roeder, E M B O J . 12, 501 (1993). 236. D. M. Margolis, M. Somasundaran and M. R. Green, J Virol. 68, 905 (1994). 237. D. M. Margolis, J. M. Ostrove and S. E. Straus, Virology 192, 370 (1993). 238. C . Mondesert and C. Kedinger, NARes 19, 3221 (1991). 239. Y. S. Lin, M. 
Carey, M. Ptashne and M. R. Green, Nature 345, 359 (1990). 240. T. G. Boyer and L. E. Maquat, JBC 265, 20524 (1990). 241. N. E. Pederson, S. Person and F. L. Homa, J . Virol. 66, 6226 (1992). 242. L. Hutchinson, H. Browne, V. Wargent, N. Davis-Poynter, S. Primorac, K. Goldsmith, A. C. Minson and D. C. Johnson, J. Virol. 66, 2240 (1992). 243. M. F. Stinski, C. L. Malone, T. W. Hermiston and B. Liu, in “Herpesvirus Transcription and its Regulation” (E. K. Wagner, ed.), p. 245. CRC Press, Boca Raton, FL, 1991. 244. M. S . Horwitz, in “Virology” (B. N. Fields and D. M. Knipe, eds.), 2nd ed., p. 1679. Raven Press, New York, 1990. 245. J. McLauchlan, S. Simpson and J. B. Clements, Cell 59, 1093 (1989). 246. J. McLauchlan, C. L. Moore, S. Simpson and J. B. Clements, NARes 16, 5323 (1986). 247. J. McLauchlan, A. Phelan, C. Loney, R. M. Sandri-Coldin and J. B. Clements, 1.Virol. 66, 6939 (1992). 248. C. J. Chapman, J. D. Harris, M. A. Hardwicke, R. M. Sandri-Goldin, M. K. L. Collins and D. S . Latchman, Virology 186, 573 (1992). 249. N. A. DeLuca and P. A. SchalTer, M C B w l 5 , 1997 (1985). 250. A. A. Shepard, P. Tolentino and N. A. DeLuca, J. Virol. 64, 3916 (1990). 251. L. 1. Pizer, D. C. Tedder, J. L. Betz, K. W. Wilcox and P. Beard, J. Virol. 60,950 (1986). 214. 215. 216. 217. 218.

HERPES SIMPLEX VIRUS TRANSCRIPTIONS

252. 253. 254. 255.

165

B. Gu, R. Rivera-Gonzalez, C. A. Smith and N. A. DeLuca, PNAS 90, 9528 (1993). R. J. Frink, K. G . Draper and E. K. Wagner, PNAS 78, 6139 (1981). T. Gerster and R. G . Roeder, PNAS 85, 6347 (1988). D. G . Tedder, R. D. Everett, K. W. Wilcox, P. Beard and L. I. Pizer, J . Virol.63, 2510 (1989).


Structure, Function, and Inhibition of O⁶-Alkylguanine-DNA Alkyltransferase

ANTHONY E. PEGG,*,¹ M. EILEEN DOLAN† AND ROBERT C. MOSCHEL‡

*Departments of Cellular and Molecular Physiology and of Pharmacology
Pennsylvania State University College of Medicine
Hershey, Pennsylvania 17033

†Section of Hematology-Oncology
The University of Chicago
Chicago, Illinois 60637

‡Carcinogen-Modified Nucleic Acid Chemistry
ABL-Basic Research Program
National Cancer Institute-Frederick Cancer Research and Development Center
Frederick, Maryland 21702

I. Alkyltransferase Structure and Function
   A. Alkyltransferase Occurrence and Purification
   B. Alkyltransferase Structure and Reaction Mechanism
   C. Chemical Studies of O-Alkylated Purine and Pyrimidine Dealkylation
II. Inhibition of Alkyltransferase Activity
   A. Small Alkyltransferase Inactivators
   B. Mechanism of Alkyltransferase Inhibition
   C. Use of O⁶-Benzylguanine or Derivatives as Substrates to Study Alkyltransferase Activity
III. Substrate Specificity and Metabolism of Alkyltransferase
   A. Substrate Specificity
   B. Location and Degradation
IV. Regulation of Alkyltransferase Expression
   A. Structure of the Gene and Activity of the Promoter
   B. Alkyltransferase Induction and Tissue-specific Levels
V. Function of Alkyltransferase
   A. Role in Preventing Toxicity, DNA Aberrations, Sister Chromatid Exchanges, Mutations, and Carcinogenesis
   B. Transgenic Expression of Alkyltransferase
VI. Inactivation of Alkyltransferase to Enhance Chemotherapy
   A. Methylating Agents
   B. O⁶-Alkylguanines
VII. Glossary
References

¹ To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

The repair of damaged DNA is an essential cellular function that involves a large, and ever increasing, number of gene products. Some pathways for DNA repair can deal with many types of DNA lesions, whereas others are highly specific. Some repair reactions require multiple proteins and many different gene products. In contrast, the subject of this essay, O⁶-alkylguanine-DNA alkyltransferase (alkyltransferase,²,³ EC 2.1.1.63), is a remarkable protein that alone can, in a single step, remove adducts from DNA that are formed at the O⁶-position of guanine and the O⁴-position of thymine, and can thus restore the original DNA. Production of such adducts is a major contributor to the toxic, mutagenic, and carcinogenic effects of alkylating agents (1-4). The ability of potent alkylating carcinogens such as alkylnitrosoureas to form higher proportions of these adducts has been attributed to the SN1 character of their reactivity (1), which produces a "hard" alkylating agent exhibiting a propensity for reaction at exocyclic oxygen atoms (5, 6). It has recently been suggested that these agents be referred to as "high oxyphilic" (7). Regardless of their characterizations, the exocyclic oxygen-substituted adducts they produce with guanine or thymine residues are strongly miscoding lesions. Their repair is therefore of major importance in limiting the production of mutations in response to such agents.

Although alkyltransferase plays this critical role in the prevention of mutations, and thus in tumor initiation, the occurrence of alkyltransferase in tumors has a protective effect against killing by those cancer chemotherapeutic drugs that are monofunctional methylating agents (such as procarbazine, dacarbazine, temozolomide) or by chloroethylating agents (CENUs) (such as BCNU, ACNU, CCNU, MeCCNU, fotemustine, and clomesone). The reason for the sensitivity to killing by methylating agents is that O⁶-methylguanine in DNA can act as a lethal lesion.
The mechanism of this lethality is not well understood and may be multifaceted, but is likely to involve inappropriate processing of the O⁶-methylguanine during DNA replication (8). Replication of SV40 DNA sequences by human cell extracts is inhibited when O⁶-methylguanine is present (9). Furthermore, the results of recent site-specific mutagenesis experiments in Escherichia coli are consistent with significant retardation or arrest of modified plasmid strand replication by O⁶-benzyl- and O⁶-ethylguanine as well as by O⁶-methylguanine (10). Obviously, the removal of the O⁶-alkylguanine lesions by the alkyltransferase prevents this means of killing.

² Although the name MGMT (for methylguanine methyltransferase) has been suggested for the gene and protein, on the basis of its action to demethylate O⁶-methylguanine in DNA, the fact that the protein acts effectively on other alkyl groups implies that alkyltransferase is a more appropriate name.
³ See Section VII for a glossary of terms and abbreviations.

Killing can also be prevented by another mechanism in which "tolerance" to the presence of O⁶-methylguanine (m⁶G) lesions in the DNA occurs (8, 11). Such tolerant cells are frequently cross-resistant to 6-thioguanine (s⁶G) and have a much higher mutation induction in response to high doses of methylating agents (12, 13). The underlying reason for the tolerance may be the loss of a component of the G·T mismatch repair system that involves long-patch DNA repair. A factor that binds to G·T mispairs is absent from extracts of tolerant cells (13-16). [The G·T mismatch incision pathway that brings about short-patch repair can incise s⁶G·T and G·T mismatches and is active in extracts of tolerant cells. Therefore, this short-patch repair is unlikely to be involved in the cytotoxic effects of m⁶G (15).] After the recognition of m⁶G·T base-pairs by human cell proteins (15, 16), the attempt to correct this site would fail because the repair synthesis occurs in the strand opposite m⁶G and no good match for the complementary base can be found. In concert with this, it has been shown (10) that the methylation-directed mismatch repair system of E. coli recognizes m⁶G·T base-pairs much more efficiently than m⁶G·C base-pairs, indicating that mismatch repair involving m⁶G residues occurs after replication.
Plasmids containing m⁶G·T base-pairs showed sharply reduced viability in mismatch repair-competent cells, consistent with the model that methylation-directed mismatch repair diverts plasmids containing promutagenic m⁶G·T base-pairs into replication-arrested complexes that are lethal to the plasmid. This is quite analogous to the mismatch repair-mediated cell-killing proposed for human cells exposed to methylating carcinogens (8). In the E. coli experiments (10), a disabled mismatch repair system led to enhanced viability of plasmids containing m⁶G·T base-pairs at the expense of increased mutation, providing additional support for the probability that a "tolerance" to m⁶G lesions in DNA results from mismatch repair deficiencies (8, 11-16). Other mechanisms for resistance to killing by methylating agents may also exist, involving lethal lesions in DNA other than m⁶G (17) or proteins that may not be part of the mismatch repair system (11).

The predominant mechanism of killing by CENUs involves the formation of a 1-(3-cytosyl)-2-(1-guanyl)ethane cross-link (18, 19). Such cross-links are very effective lesions for cell killing. However, the initial adduct formed in DNA is O⁶-(2-chloroethyl)guanine, which then undergoes an intramolecular cyclization to 1,O⁶-ethanoguanine. This then reacts with the N-3 of cytosine in the complementary strand to form the cross-link. Cross-link production is prevented by removal of the chloroethyl group by alkyltransferase or by the reaction of the alkyltransferase with the 1,O⁶-ethanoguanine precursor (20). The latter reaction may not actually occur in cells treated with CENUs because the reaction of the alkyltransferase with O⁶-(2-chloroethyl)guanine may be more rapid than the rearrangement to 1,O⁶-ethanoguanine. The "tolerance" phenomenon described above does not provide resistance to CENUs. There is, therefore, an even stronger correlation between alkyltransferase activity and resistance to chloroethylation than with resistance to methylation (21-23). Although there are some other mechanisms that impart resistance to CENUs, some but not all of which are mediated by glutathione S-transferase (18, 24-27), these appear to be generally less prevalent than the alkyltransferase-mediated resistance.

The design of inactivators or modulators of alkyltransferase that can increase the effectiveness of those cancer chemotherapeutic agents that form O⁶-alkylguanine adducts is therefore a major focus of research and of this essay. This article focuses on studies of the alkyltransferase protein published in the past 4 years; several previous reviews have described in detail the earlier papers in this field (21-23, 28-30).

I. Alkyltransferase Structure and Function

A. Alkyltransferase Occurrence and Purification

1. OCCURRENCE

Alkyltransferase activity has been detected in many species, including microorganisms, insects, fish, and mammals (21-23, 28-31). Complete cDNA or genomic clones containing the alkyltransferase sequence have been isolated from human, rat, mouse, Chinese hamster, rabbit, and Saccharomyces cerevisiae cells, and from the E. coli ada and ogt genes, the Salmonella typhimurium adaST gene, and the Bacillus subtilis adaB and dat genes. There is considerable homology among these sequences (Fig. 1). There are 22 invariant residues in all the sequences, and many other places have only highly conservative changes. Most prominent is the sequence surrounding the cysteine acceptor site to which the alkyl group is transferred from guanine. This sequence is (V/I)PCHR(I/V) in all cases. Many of the other conserved residues are present in the putative DNA binding domain made up of a helix-turn-helix structure (32). With the exception of the rabbit alkyltransferase (31), all of the mammalian alkyltransferases have a carboxyl-terminal extension of about 25 amino acids that is quite poorly conserved and can be used to generate species-specific antibodies (33, 34).

FIG. 1. Amino-acid sequences of alkyltransferases. The 11 known amino-acid sequences are shown (one-letter amino-acid code). The complete sequences of all the alkyltransferases are shown, except for the E. coli Ada sequence, which shows only the carboxyl-terminal fragment corresponding to residues Thr176-Arg354 (Ada-C), and the S. typhimurium AdaST sequence, which shows only residues Thr175-Glu352 (AdaST-C). Residues in common with the human sequence are boxed.

The E. coli Ada alkyltransferase, which is highly inducible in response to alkylation damage as part of the adaptive response (28-30), consists of a single 39-kDa protein with two domains, each containing a cysteine acceptor. The 19-kDa carboxyl-terminal domain is responsible for the repair of m⁶G and m⁴T, whereas the 20-kDa amino-terminal domain repairs the Sp diastereomer of methylphosphotriesters by transferring the methyl group to cysteine-69, which is located in a sequence -PCKR- (28-30). The transfer of a methyl group to this cysteine leads to the appearance of AAAGCGCA sequence-specific DNA binding activity that allows the protein to activate the transcription of a number of genes encoding proteins that counteract alkylation damage. The amino-terminal domain of Ada binds Zn²⁺ very tightly, and studies of its protein structure by cadmium-113 nuclear magnetic resonance indicate that the Zn²⁺ is bound to four cysteine residues, one of which is the cysteine-69 acceptor (35, 36). This coordination aids in the activation of the cysteine for methyl transfer. The formation of a thioether-S ligand that would bind the Zn²⁺ less tightly than the thiolate-S may cause the structural changes that unmask the DNA binding conformation of the protein (35, 36). Many other microorganisms also show an adaptive response to alkylation damage that involves the induction of an alkyltransferase protein (30). The equivalent protein from the S. typhimurium adaST gene is similar to Ada but is a less powerful transcriptional regulator (37).
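The conserved acceptor-site pattern quoted above, (V/I)PCHR(I/V), lends itself to a simple pattern search. The following is a minimal sketch, not a published method: the function name and the three toy fragments are hypothetical and were invented purely for illustration (they are not real alkyltransferase sequences). It shows how such a motif could be located in a one-letter protein sequence, and that a -PCKR- type site, like the one described for the amino-terminal domain of Ada, is not matched by this pattern.

```python
import re

# Acceptor-site motif as written in the text: (V/I)PCHR(I/V).
ACCEPTOR_MOTIF = re.compile(r"[VI]PCHR[IV]")


def find_acceptor_sites(sequence):
    """Return (1-based cysteine position, matched motif) for each motif hit."""
    hits = []
    for match in ACCEPTOR_MOTIF.finditer(sequence.upper()):
        # The cysteine is the third residue of the six-residue motif,
        # so its 1-based position is the 0-based match start plus 3.
        hits.append((match.start() + 3, match.group()))
    return hits


# Hypothetical toy fragments, invented for illustration only.
toy_fragments = {
    "toy_A": "GNPVPILIPCHRVVSSG",
    "toy_B": "ANPLPIVVPCHRIVGSG",
    "toy_C": "AWPLRLAAPCKRAGSG",  # -PCKR- type site: not matched
}

for name, fragment in toy_fragments.items():
    print(name, find_acceptor_sites(fragment))
```

A regular expression is used here only because the motif is short and fixed in length; a profile-based search would be the more usual choice for the less strictly conserved positions of the alignment.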
Bacillus subtilis has two distinct genes, adaA, whose product corresponds to the methylphosphotriester repair domain, and adaB, corresponding to the O⁶-methylguanine repair domain. These genes are in the same operon and overlap by 11 bp, but two distinct proteins are the primary products (38). All of the other currently cloned alkyltransferases consist of a single domain responsible for the repair of O⁶-alkylguanine.

2. PURIFICATION

The 19-kDa C-terminal domain of the Ada protein was prepared in sufficient quantities for the production of crystals by use of a high-level expression vector under the control of the tac promoter (32). The 39-kDa Ada protein was produced in very large amounts from a lac-promoter-driven expression plasmid from which the ada regulatory sequences had been deleted (39). Purification of large quantities of the recombinant mammalian alkyltransferase after expression at a high level in E. coli with various vectors has been achieved by many groups (20, 40-47). It is noteworthy that, in general, the specific activities of these preparations are around 35 nmol methyl groups accepted per mg protein, which is less than the theoretical value of 46 nmol/mg (42, 46). This may be due to the presence of a significant fraction of the protein in an inactive form due to incorrect folding, or to denaturation during purification. It is unlikely that this is due to part of the protein having reacted with alkylated DNA, because the level of expression of the protein far exceeds the likely number of alkylation lesions in the DNA of untreated cells.

Human alkyltransferase has also been expressed as a fusion protein with glutathione-S-transferase (47, 48) or bacteriophage λN (49). Although these preparations have alkyltransferase activity, some care should be taken in interpretation of results on kinetics and substrate specificity with such preparations, because they may not undergo the same conformational changes as the normal protein on binding the DNA substrate. For example, we were unable to confirm a report, based on studies with an alkyltransferase fusion protein, that a deletion of the 28 amino acids at the carboxyl end led to an 80% reduction in the rate of reaction with O⁶-benzylguanine (48) when a similarly truncated protein was tested without additional sequences at the amino terminus (50). Additionally, the rate constants for the repair of various oligodeoxyribonucleotides by the 21-kDa human alkyltransferase differed significantly from those obtained using its fusion protein with glutathione-S-transferase (51).
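The comparison of observed and theoretical specific activity above is simple stoichiometric arithmetic, which the following sketch makes explicit. It assumes that each protein molecule accepts exactly one alkyl group and that the molecular mass is about 21.7 kDa; that mass is an assumption chosen here because it reproduces the quoted theoretical value of about 46 nmol per mg (the text gives the size only as 21 kDa). The function names are hypothetical.

```python
def theoretical_specific_activity_nmol_per_mg(molecular_mass_da, sites_per_molecule=1):
    """nmol of alkyl groups accepted per mg of fully active protein."""
    mol_protein_per_mg = 1e-3 / molecular_mass_da  # mol of protein in 1 mg
    return mol_protein_per_mg * sites_per_molecule * 1e9  # mol -> nmol


def active_fraction(observed_nmol_per_mg, theoretical_nmol_per_mg):
    """Fraction of the preparation that behaves as active acceptor."""
    return observed_nmol_per_mg / theoretical_nmol_per_mg


ASSUMED_MASS_DA = 21_700  # assumption: approximate mass of the human protein

theoretical = theoretical_specific_activity_nmol_per_mg(ASSUMED_MASS_DA)
print(round(theoretical, 1))                  # 46.1
print(round(active_fraction(35.0, 46.0), 2))  # 0.76
```

On these figures, roughly three-quarters of a typical preparation would be active, which is consistent with the suggestion that a significant fraction of the protein is misfolded or denatured.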

3. ANTIBODIES

Specific polyclonal and monoclonal antibodies have been made to the human alkyltransferase and to peptides from the sequence (33, 34, 41, 47, 49; L. Wiest and A. E. Pegg, unpublished). Few of these antibodies reacted well with the native enzyme. An important exception is one made to the sequence of amino acids from His171-Ser184 in the human protein (33), suggesting that these must be located on the surface of the protein. Antibodies made to sequences near the amino (Lys8-Glu20 and Met1-Thr11) and carboxyl terminus (Ala197-Asn207) reacted very well with the denatured protein, but not with the native form (34). Sensitive in situ immunohistochemical staining procedures for the alkyltransferase have been set up (41, 47, 52-54), and can be used to evaluate the possibility that expression of alkyltransferase in tumors may be heterogeneous. This may be of importance in planning therapy with alkyltransferase inhibitors such as O⁶-benzylguanine (see Section II,A), because the small group of patients with Mer⁻ tumors that do not express alkyltransferase would not benefit from such therapy and could be spared any additional toxicity associated with the treatment. However, studies with experimental tumors derived from mixing a few cells expressing alkyltransferase with Mer⁻ cells indicate that it is necessary to use the inhibitor in order to prevent the selection of the Mer⁺ cells and relapse with a tumor producing alkyltransferase (55). Therefore, a very sensitive test will have to be used to establish the complete absence of Mer⁺ cells.

B. Alkyltransferase Structure and Reaction Mechanism

The X-ray crystal structure of the 19-kDa C-terminal domain of Ada has been determined (32). This 178-amino-acid polypeptide (Ada-C) spans residues Thr176-Arg354 of the full-length protein. The polypeptide shows a high degree of amino-acid sequence and secondary-structure homology with the related sequence of other alkyltransferases (Fig. 1), which suggests that the structural features (Fig. 2) observed for this C-terminal domain (32) will be common to the wider range of alkyltransferases, including the human protein. Several interesting features of the polypeptide are apparent in the crystal structure (32). First, the N-terminal domain of this sequence spanning the first 80-90 amino-acid residues forms a fold resembling that found in ribonuclease H (56). Second, the remaining C-terminal residues fold to form three helices in a helix-turn-helix arrangement with a previously unobserved organization for a DNA-binding protein (57). The presumed DNA binding helix is quite remote from the active-site Cys146 residue (corresponding to Cys321 in the full-length Ada protein). Furthermore, Cys146 is surprisingly buried within the polypeptide secondary structure. Indeed, the crystal structure conformation for this protein causes the reactive cysteine to be completely inaccessible for transfer of an O⁶-methyl group from an O⁶-methylguanine residue in a modified DNA segment. This suggests that a very significant conformational change must occur when the alkyltransferase binds to a DNA substrate. The conformational change proposed (32) was one in which the C-terminal helix encompassing residues Arg165-Ala174 would swivel to expose a DNA binding surface and allow Cys146 access to a methyl group on an O⁶-methylguanine in double-stranded DNA. The C-terminal helix would then reside in the wide groove of the DNA. In this conformation the protein covers roughly eight DNA bp, which is consistent with previously described data based on fluorescence anisotropy measurements (see below). Furthermore, the conformation allows the adjacent His147 residue to adopt a favorable orientation with respect to Cys146 to perhaps abstract the sulfhydryl proton and generate a reactive Cys146 thiolate anion.

FIG. 2. Structure of E. coli Ada-C alkyltransferase protein (from reference 32). The structure is presented as a schematic stereo drawing. The side chains of the residues common to all known alkyltransferases are indicated by ball-and-stick structures and the α-helical and β-strand regions are shown. The helical regions, H3, H4, and H5, which make up the putative DNA binding region and the C-terminal helix H6 that may move to accommodate the DNA substrate, are indicated. The figure was kindly provided by P. C. E. Moody and was generated by the computer program MOLSCRIPT (552).

All alkyltransferases contain the highly conserved amino-acid sequence PCHR in the active site for the methyl transfer reaction (23, 28, 29) (Fig. 1). It is thought that the proline residue facilitates a bend in the protein chain, allowing the active cysteine sulfhydryl group to protrude, whereas the histidine residue or some other basic residue (e.g., the arginine of this sequence) abstracts the sulfhydryl proton to produce a thiolate anion that subsequently reacts with the methyl group of an O⁶-methylguanine (28). The active site of the thymidylate synthase protein from a variety of sources also contains a PCH sequence. In this protein, a reactive cysteine thiolate anion is generated to add to the C-6 of 2'-deoxyuridine 5'-phosphate.
These findings suggest that the PCH sequence might be generally important for generating strongly nucleophilic cysteine thiolate anions for addition or displacement reactions (28, 29). Methyl group transfer from O6-methylguanine residues by alkyltransferase proteins produces a neutral guanine either by protonation of the liberated guanine anion through reaction with water or some other proton source, or by a preliminary protonation of the O6-methylguanine residue prior to cysteine attack, which liberates a zwitterionic or normal neutral guanine residue. Arguments supporting a preliminary protonation of O6-methylguanine residues have been put forth on the basis of studies of reactions involving methylated free bases not incorporated in DNA segments (58, 59) (see Sections I,C and II,B). However, whatever the mechanism, the reaction sequence with DNA-bound O6-methylguanine restores a normal guanine at the site of the original O6-methylguanine residue and produces an S-methylcysteine residue in the protein's active site. The resulting S-methyl thioether is extremely stable and there is no apparent mechanism to demethylate it and restore a normal active-site cysteine. Studies with antibodies to the human alkyltransferase suggest that a conformational change in the protein occurs after methyl group transfer that renders the protein susceptible to proteolysis (60, 61). This may provide a signal for additional protein production. In any event, restoration of active alkyltransferase levels requires de novo protein synthesis.

Fluorescence anisotropy studies have been used to study the interaction of the 39-kDa Ada with DNA (62). These studies indicate that DNA binding is noncooperative, with association constants in the range of 4 × 10⁷ to 3 × 10⁵ (depending on NaCl concentration), and that the protein covers 7 to 8 bp in the DNA complex. Binding to longer DNA segments was strongest and the protein's affinity for methylated DNA was greater than for unmethylated DNA (62). Earlier studies revealed that the rate of methyl transfer to the 19-kDa Ada alkyltransferase from O6-methylguanine residues in methylated DNA is extremely rapid. The second-order rate constant for this process was ≥10⁸ M⁻¹ sec⁻¹ at 37°C (28, 29, 63). This is a very fast reaction; the rate resembles those for interactions between DNA and gene regulatory proteins or the EcoRI restriction endonuclease (28, 64, 65). High rates for these interactions have been proposed to result from exchange of the protein between specific domains on a DNA molecule, or from protein diffusion along the DNA within domains. The second-order rate constant for the reaction between homogeneous recombinant human O6-alkylguanine-DNA alkyltransferase and DNA-bound O6-methylguanine residues has been determined to be 1 × 10⁹ M⁻¹ min⁻¹ (43), somewhat higher than the values previously reported for partially purified human protein (23), although about a fifth of that for the Ada alkyltransferase.
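The rate constants quoted above are given in different time units, so a like-for-like comparison needs a unit conversion. The following sketch is illustrative only: it takes the human value of 1 × 10⁹ M⁻¹ min⁻¹ from the text and assumes a value of about 10⁸ M⁻¹ sec⁻¹ for the Ada protein; the function and variable names are hypothetical.

```python
def per_min_to_per_sec(k_per_min):
    """Convert a second-order rate constant from M^-1 min^-1 to M^-1 sec^-1."""
    return k_per_min / 60.0

# Human alkyltransferase with DNA-bound O6-methylguanine (value from the text).
k_human = per_min_to_per_sec(1e9)

# ASSUMPTION: Ada rate constant taken as about 1e8 M^-1 sec^-1.
k_ada = 1e8

print(f"human: {k_human:.2e} M^-1 sec^-1")
print(f"human/Ada ratio: {k_human / k_ada:.2f}")
```

With these inputs the human rate constant comes out at roughly one-sixth of the assumed Ada value, which is of the same order as the fraction stated in the text.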
The second-order rate constant for methyl transfer by the human protein is reduced to one-third of this value with DNA that had previously been depurinated (43). This was proposed to result from a conformational change in the DNA that inhibits access of the alkyltransferase to the bound O6-methylguanine residues. Like the Ada protein, the human alkyltransferase shows a higher affinity for methylated than for unmethylated DNA and covers 8 bp in the associated complex. The association constant for binding of the human alkyltransferase to normal DNA [4.7 × 10⁵ M⁻¹ at 25°C (43)] is one-tenth that of the Ada protein (62). This may contribute to the difference in methyl transfer rate constants exhibited by the two proteins. Fluorescence quenching and circular dichroism studies of the interaction of the human protein with DNA indicate that the human alkyltransferase undergoes a conformational change when it binds to DNA (43), which may be similar to the conformational change proposed for the 19-kDa E. coli Ada fragment (32). Site-specific mutagenesis techniques have been used to investigate the importance of some of the highly conserved residues in the alkyltransferase.


As expected, the cysteine acceptor site in the Ada (66), Ogt (67, 68), and human (69, 70) alkyltransferases was absolutely essential for activity. The role of the amino acids near the methyl acceptor site in the sequence PCHRV of both the human and the Ogt alkyltransferases was studied by multiple replacement techniques (68, 70). A variety of hydrophobic residues could replace the valine residue. However, changes of the His and Arg residues in the Ogt protein could not be tolerated. Some replacements for Pro138 in the Ogt protein (equivalent to Pro144 in the human sequence) did give rise to an active protein, as did some of the substitutions for His146 in the human alkyltransferase, but the level of protein present was very low, suggesting that these proteins are unstable. This hypothesis is consistent with our findings that human alkyltransferase mutants derived by replacing His146 or Arg147 with alanine (H146A or R147A) have half-lives of less than 20 minutes in E. coli, whereas the wild-type protein has a half-life of >24 hours (T. M. Crone and A. E. Pegg, unpublished), and with the finding that the transposition of Cys321 and His322 in the Ada protein also leads to instability (66). However, the presence of plasmids leading to expression of either H146A or R147A did increase the resistance of an ada⁻ ogt⁻ E. coli strain to MNNG, suggesting that these proteins do retain some ability to repair methylated DNA (T. M. Crone and A. E. Pegg, unpublished). Therefore, His146 may not be absolutely required for the formation of the thiolate anion. It is possible that in the absence of His146, another residue (e.g., Arg147) may be able to assume this function. Thus, the conserved PCHR motif may be of critical importance both for the stability of the alkyltransferase and for its activity.
The human mutant alkyltransferase derived by replacing Cys145 (the cysteine acceptor site that is equivalent to Cys146 in Ada-C) with Ala (C145A), which has absolutely no activity in in vitro assays (69, 70) and does not protect ada⁻ ogt⁻ strains from MNNG toxicity (69), is as stable as wild-type protein in E. coli and is easily purified to homogeneity using the same techniques developed for the wild-type protein (69). Truncation of the carboxyl end of the human alkyltransferase, terminating the protein at positions 176 and 185 (50) and 182 (71), did not affect the activity. Deleting residues 172 to 207 or introducing the mutation E172Q led to a loss of all measurable alkyltransferase activity and to a very large reduction in the amount of immunoreactive protein present in extracts of E. coli expressing alkyltransferase from plasmids containing these mutations (50, 72). This is consistent with Glu172, a totally conserved residue, being critical for enzyme activity and stability. The removal of amino acids 2-9 or 2-19 at the amino terminus of the protein also led to a loss of alkyltransferase activity and a decrease in half-life (50). Therefore, it is likely that the mutant alkyltransferase protein formed by converting the codon for Gly177 to a stop codon (50) represents the smallest alkyltransferase that can readily be produced in large amounts and possibly used for NMR studies of the reaction mechanism.

C. Chemical Studies of O-Alkylated Purine and Pyrimidine Dealkylation

An in vitro reaction system has been developed to compare the rates of dealkylation of 21 O-alkylated purine and pyrimidine derivatives, including those illustrated in Fig. 3 (73). The reaction system consisted of the O-alkylated purine or pyrimidine in methanol at 60°C in the presence of a 100-fold molar excess of thiophenolate anion derived from triethylamine and thiophenol. Pseudo-first-order rate constants (kobs) and t1/2 values for some of these reactions are presented in Table I. The reactivity order in these studies (73) is not always consistent with observations made for related reactions involving alkyltransferase proteins. For example, the rates of inactivation of the Ada alkyltransferase by O6-methylguanine and O6-methylhypoxanthine have been measured; these indicate a guanine-to-hypoxanthine analog rate ratio of 1.6 (59), whereas an inverse rate ratio of 0.42 was observed in the chemical system (Table I) (73). Similarly, benzylated analogs of many of the heterocycles used in the model system exhibit quite different reactivity orders in inactivation of the human alkyltransferase. For example, inactivation by O6-benzylguanine is quite sensitive to substitution at the 9-position (see Section II,A). Furthermore, O6-benzyl-7-methylguanine and O6-benzylhypoxanthine are far less effective than O6-benzylguanine. However, in the model system, reaction rates are insensitive to changes at the 9-position of O6-methylguanine, although all such derivatives react more slowly than O6,7-dimethylguanine or O6-methylhypoxanthine. Additionally, O4-benzylthymidine is totally ineffective as an alkyltransferase inactivator (Section II,A), whereas in the model system (73) all O4-methylthymine derivatives are relatively very reactive. Thus, the relevance of these model reactions (73) in predicting reactivities toward alkyltransferase proteins has yet to be established.
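As a quick consistency check on Table I, the pseudo-first-order relation t1/2 = ln 2 / kobs can be applied to the tabulated values. The sketch below is illustrative only: it assumes that the kobs entries are expressed in units of 10⁻³ per unit time and the t1/2 entries in that same time unit (this scaling is an assumption made here, not something stated in the table), and the function name half_life is hypothetical.

```python
import math

# (k_obs, listed t1/2) pairs transcribed from Table I.
# ASSUMPTION: k_obs is in units of 1e-3 per time unit and t1/2 is in
# that same time unit; the units are not taken from the table itself.
TABLE_I = {
    "O4-methylthymine": (23.9, 29),
    "O4-methyluracil": (27.7, 25),
    "O4-methylthymidine": (18.2, 38),
    "1,O4-dimethylthymine": (18.5, 37.5),
    "1,O4-dimethyluracil": (16.1, 43),
    "O6,7-dimethylguanine": (12.4, 56),
    "O6-methylhypoxanthine": (13.6, 51),
    "O6-methyl-2'-deoxyguanosine": (5.68, 122),
    "O6,9-dimethylguanine": (4.71, 147),
    "O6-methylguanine": (4.78, 145),
    "O6-ethyl-2'-deoxyguanosine": (0.854, 811),
    "O6-ethylguanine": (0.346, 2000),
}

def half_life(k_obs_milli):
    """Half-life of a pseudo-first-order reaction: t1/2 = ln 2 / k_obs."""
    return math.log(2) / (k_obs_milli * 1e-3)

for name, (k_obs, t_listed) in TABLE_I.items():
    print(f"{name:30s} calculated {half_life(k_obs):7.1f}   listed {t_listed}")
```

Under the stated assumption, every calculated half-life agrees with the listed value to within about 1%, which supports reading the two numeric columns of Table I as kobs and t1/2, respectively.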
FIG. 3. Structures of O-alkylated purine and pyrimidine derivatives dealkylated by thiophenoxide in methanol. Structures shown: O4-methylthymine, O4-methyluracil, 1,O4-dimethylthymine, 1,O4-dimethyluracil, O4-methylthymidine, O6,7-dimethylguanine, O6-methylhypoxanthine, O6-methylguanine, O6,9-dimethylguanine, O6-methyl-2'-deoxyguanosine, O6-ethylguanine, and O6-ethyl-2'-deoxyguanosine.

TABLE I
RATES OF DEALKYLATION OF SOME O-ALKYLATED HETEROCYCLES BY THIOPHENOXIDE IN METHANOL AT 60°Cᵃ

Compound                         kobs      t1/2
O4-Methylthymine                 23.9      29
O4-Methyluracil                  27.7      25
O4-Methylthymidine               18.2      38
1,O4-Dimethylthymine             18.5      37.5
1,O4-Dimethyluracil              16.1      43
O6,7-Dimethylguanine             12.4      56
O6-Methylhypoxanthine            13.6      51
O6-Methyl-2'-deoxyguanosine      5.68      122
O6,9-Dimethylguanine             4.71      147
O6-Methylguanine                 4.78      145
O6-Ethyl-2'-deoxyguanosine       0.854     811
O6-Ethylguanine                  0.346     2000

ᵃ Modified from reference 73.

It has been proposed that O6-methylguanine residues in DNA are likely to be protonated in the active site of the alkyltransferase protein to enhance the ease of transfer of the O6-methyl group (58, 59, 74). One group suggested that the histidine residue in the PCH sequence of the alkyltransferase might serve a dual role, both abstracting a proton from the adjacent cysteine residue to form a thiolate anion and protonating the modified guanine residue at the 7-position (58, 74). To determine whether positive charge development at the 7-position accelerates methyl group transfer to a thiol nucleophile, a platinum complex containing two molecules of O6,9-dimethylguanine coordinated to platinum through their 7-nitrogens [i.e., cis-diamminebis(O6,9-dimethylguanine-7)platinum(II) dichloride] was prepared by incubating cis-diamminedichloroplatinum(II) with two equivalents of O6,9-dimethylguanine. The coordinated platinum at the 7-position of the complex was regarded as representative of protonation. When reacted with a large excess of thiophenol in methanol at 60°C, the complex decomposed to liberate largely O6,9-dimethylguanine (indicating sulfur attack at Pt) together with 9-methylguanine (indicating O6 demethylation by sulfur followed by degradation of the complex). Although 9-methylguanine was not the major product of this reaction, its rate of formation was much higher than that observed in a system containing thiophenol and O6,9-dimethylguanine without platinum coordination, suggesting that some positive charge character at the 7-position facilitates transfer of the O6-methyl group to a sulfur nucleophile. It was suggested that positive charge development at the 7-position of a DNA-bound O6-methylguanine residue through protonation by the active-site histidine residue would facilitate the methyl group transfer to the alkyltransferase cysteine sulfur (58, 74).

In contrast to this proposal, others have argued that protonation at the O6-position of an O6-methylguanine residue is more important for facilitating methyl group transfer than is protonation at ring nitrogen sites (59). This was suggested after studying the rates of inactivation (kinact) and the dissociation constants for the interaction of the Ada alkyltransferase with nine methylated purine analogs [O6-methylguanine, O6-methylhypoxanthine, O6-methyl-1-deazaguanine (5-amino-7-methoxy-3H-imidazo[4,5-b]pyridine), O6-methyl-3-deazaguanine (6-amino-4-methoxy-1H-imidazo[4,5-c]pyridine), and O6-methyl-7-deazaguanine (2-amino-4-methoxypyrrolo[2,3-d]pyrimidine) as well as S6-methyl-6-thioguanine, 6-methylthiopurine, Se6-methyl-6-selenoguanine, and 6-methylselenopurine]. With respect to the O-methylated derivatives, it was hypothesized that, if a particular ring nitrogen site were involved in protonation during methyl group transfer to the alkyltransferase, the reaction rate for the corresponding deaza analog would be significantly lower. However, all the kinact values for these particular substrates fell within the range of 1.0 to 2.0 hr⁻¹ regardless of the presence or absence of a ring nitrogen at the 1-, 3-, or 7-position. On the other hand, the absence of a 1-nitrogen (as in O6-methyl-1-deazaguanine) or a 2-amino group (as in O6-methylhypoxanthine) increased the dissociation constant for these substrates (to 15.1 and 10.6 mM, respectively) compared to a value in the range of 1.5 to 2.6 mM for the three other O-methylated derivatives studied. It was argued that the 1- and N2-sites are likely to be involved in interactions that contribute to binding of the substrate to the protein. The seleno analogs, Se6-methyl-6-selenoguanine and 6-methylselenopurine, also exhibited high dissociation constants (10.6 and 15.7 mM, respectively) but were, surprisingly, demethylated by the alkyltransferase at rates similar to those of the O-methyl analogs (kinact = 1.76 and 2.51 hr⁻¹, respectively). One might expect the transfer of a selenomethyl group to sulfur to be disfavored because the nucleophilicity of selenium relative to sulfur toward the O-methyl group of O6-methyl-2'-deoxyguanosine is greater by a factor of 4.5 (73). Also noteworthy is the observation (59) that although the inactivation rate observed with S6-methyl-6-thioguanine (kinact = 0.63 hr⁻¹) is somewhat reduced relative to that for O6-methylguanine (kinact = 1.66 hr⁻¹), 6-methylthiopurine is completely unreactive. In any event, these data were interpreted as reflecting a decreased affinity for protonation at the exocyclic 6-substituent atom in the order O > S > Se.
However, because the basicity of the anionic form of 6-selenoguanine (pKa = 7.7) is lower than that of thioguanine (pKa = 8.2) or guanine (pKa = 9.2), it was argued that the improved leaving-group properties of the weakly basic seleno derivative offset its need for preliminary protonation for methyl transfer and thereby increased its kinact value relative to that for the sulfur-containing analog. With these considerations in mind, a model was presented whereby an O6-methylguanine residue binds to the alkyltransferase with hydrogen bonds to the exocyclic amino and oxo substituents as well as to the nitrogen at the 1-position. Protonation at oxygen facilitates transfer of the O6-methyl group to the cysteine sulfur by neutralizing the developing charge on the guanine leaving group (59). Although it is certainly reasonable that protonation of an O6-methylguanine residue by the alkyltransferase prior to cysteine thiolate attack will improve the ease of methyl group transfer, additional experiments will be required to establish the site of this protonation on the modified guanine with certainty.


II. Inhibition of Alkyltransferase Activity

A. Small Alkyltransferase Inactivators

It was shown almost 10 years ago that O6-methylguanine can inactivate the alkyltransferase from human tumor cells, although at rates far below those observed for the loss of activity when O6-methylguanine residues are incorporated in DNA (75, 76). It was also known that rates of alkyltransferase inactivation by DNA-bound O6-substituted guanines decreased dramatically with increasing O6-substituent size in the series methyl > ethyl > propyl (75, 77). It therefore seemed unlikely that guanines bearing substituents larger than methyl on O-6 would be capable of reacting with alkyltransferase proteins more rapidly than O6-methylguanine does. However, DNA-repair studies using oligodeoxyribonucleotides containing site-specifically incorporated O6-benzylguanine residues indicated that the human alkyltransferase can be inactivated by DNA-bound O6-benzylguanine; this led to the discovery that O6-benzylguanine, as the free base, can also inactivate the protein, and does so at least 2000 times more effectively than O6-methylguanine (78). This initial observation stimulated several surveys of the alkyltransferase-inactivating properties of a wide range of other compounds.

In the first such survey (79), several O6- and S6-substituted purines, as well as 7- and 9-substituted O6-benzylguanine derivatives, were tested as inactivators of the alkyltransferase from HT29 human colon tumor cells in either HT29 cell-free extracts or intact HT29 cells. Many of these compounds are illustrated in Fig. 4. The most active of those tested were O6-(p-Y-benzyl)guanines with fairly small para substituents (Y = hydrogen, fluorine, chlorine, and methyl). All exhibited similar activity regardless of the electronic character of the para substituent (Table II). O6-Benzyl-2'-deoxyguanosine exhibited good activity, but was a tenth as active as O6-benzylguanine. Ribonucleoside analogs, i.e., O6-(p-Y-benzyl)guanosines (Y = hydrogen, chlorine, and methyl), were 1/50th as active as O6-benzylguanine, but also exhibited similar activity regardless of the para substituent attached to the benzyl moiety (Table II). O6-Allylguanine and some additional 9-substituted O6-benzylguanines bearing fairly small lipophilic substituents were 1/65th to 1/235th as active as O6-benzylguanine (79). O6-Benzylhypoxanthine was 1/425th as active as O6-benzylguanine, whereas the sodium salt of a 9-carboxymethyl derivative of O6-benzylguanine was nearly 1/800th as active. The latter compound's low activity suggested that polar 9-substituents other than carbohydrates are not compatible with efficient alkyltransferase inactivation. S6-Benzylated 6-thioguanine and 6-thioguanosine derivatives were inactive, as were 7-substituted O6-benzylguanines and 7-benzylguanine. Based on these observations, it was concluded that efficient alkyltransferase inactivation requires a benzyl or allyl group attached through exocyclic oxygen at the 6-position of a 2-aminopurine derivative. A variety of substituents at the 9-position were compatible with alkyltransferase inactivation, whereas substitution at the 7-position led to a complete loss of activity (79).

Several other O6-benzylguanine analogs bearing increasingly bulky substituents on the benzene ring or at position 9 were tested subsequently (80). These experiments extended the earlier comparisons (79) and provided further information about the types of groups and the locations at which they could be attached to O6-benzylguanine without significantly lowering its alkyltransferase-inactivating potential. The results demonstrated that substitution on the benzene ring with quite bulky substituents is well tolerated, as evidenced by the activity of O6-(p-Y-benzyl)guanines bearing para substituents as large as Y = isopropyl or phenyl (Fig. 4); these were essentially as active as O6-benzylguanine (Table II). Increasing the size of the substituent on the benzene ring by attaching methyl groups to the 3- and 5-positions, as in O6-(3,5-dimethylbenzyl)guanine, or attaching an n-butyl substituent to the para position, as in O6-(p-n-butylbenzyl)guanine (Fig. 4), reduced activity relative to that for O6-benzylguanine by factors of approximately 3 and 13, respectively. In contrast to these observations, certain substituent changes at the 9-position had a much greater influence on alkyltransferase inactivation. Quite large lipophilic substituents, such as the steroid on O6-benzyl-9-[(3-oxo-5α-androstan-17β-yloxycarbonyl)methyl]guanine, or a 2-hydroxy-3-(isopropoxy)propyl group, as in O6-benzyl-9-[2-hydroxy-3-(isopropoxy)propyl]guanine (Fig. 4), decreased activity by factors of only 13 and 23, respectively. Surprisingly, O6-benzyl-9-[2-hydroxy-3-(isopropylamino)propyl]guanine, bearing a 9-substituent very similar in size to that on the 9-[2-hydroxy-3-(isopropoxy)propyl] derivative, although positively charged under physiological conditions, was 1/15th as active as the neutral 2-hydroxy-3-(isopropoxy)propyl compound and correspondingly 1/350th as active as O6-benzylguanine. These findings (80), together with those made previously (79), led to the conclusions that both large and small lipophilic substituents at the 9-position preserve the alkyltransferase-inactivating potential of O6-benzylguanine derivatives, whereas large and small noncarbohydrate polar groups at this position significantly reduce activity (80).

FIG. 4. Structures of O6-benzylguanine derivatives tested as inactivators of alkyltransferase. Structures shown: O6-benzylguanine; O6-(p-fluorobenzyl)guanine; O6-(p-chlorobenzyl)guanine; O6-(p-methylbenzyl)guanine; O6-(p-hydroxymethylbenzyl)guanine; O6-(p-formylbenzyl)guanine; O6-(p-isopropylbenzyl)guanine; O6-(p-phenylbenzyl)guanine; O6-(p-n-butylbenzyl)guanine; O6-(3,5-dimethylbenzyl)guanine; O6-benzyl-2'-deoxyguanosine; O6-benzylguanosine; O6-(p-chlorobenzyl)guanosine; O6-(p-methylbenzyl)guanosine; O6-allylguanine; O6-benzylhypoxanthine; O6-benzyl-9-(cyanomethyl)guanine; O6-benzyl-9-(2-hydroxybutyl)guanine; O6-benzyl-9-[(ethoxycarbonyl)methyl]guanine; O6-benzyl-9-(carbamoylmethyl)guanine; O6-benzyl-9-(carboxymethyl)guanine, sodium salt; O6-benzyl-9-[2-hydroxy-3-(isopropoxy)propyl]guanine; O6-benzyl-9-[2-hydroxy-3-(isopropylamino)propyl]guanine; and O6-benzyl-9-[(3-oxo-5α-androstan-17β-yloxycarbonyl)methyl]guanine.

TABLE II
ALKYLTRANSFERASE INACTIVATION IN HT29 CELL-FREE EXTRACTS AND IN HT29 CELLSᵃ

Compound                                                             In cell-free extracts   In cells   Ref.
O6-Benzylguanine                                                     0.2                     0.05       78, 79
2,4-Diamino-6-benzyloxy-5-nitrosopyrimidine                          0.06                    0.02       82
2,4-Diamino-6-benzyloxy-5-nitropyrimidine                            0.06                    0.02       82
8-Aza-O6-benzylguanine                                               0.07                    0.06       82
O6-Benzyl-8-bromoguanine                                             0.08                    0.05       82
O6-(p-Fluorobenzyl)guanine                                           0.2                     0.05       79
O6-(p-Chlorobenzyl)guanine                                           0.2                     0.08       78, 79
O6-(p-Methylbenzyl)guanine                                           0.2                     0.08       78, 79
O6-(p-Isopropylbenzyl)guanine                                        0.5                     0.6        80
O6-(p-Phenylbenzyl)guanine                                           0.3                     0.1        80
O6-(3,5-Dimethylbenzyl)guanine                                       1                       0.4        80
O6-Benzyl-2'-deoxyguanosine                                          2                       0.5        79
O6-(p-n-Butylbenzyl)guanine                                          4                       1          80
O6-Benzyl-9-[(3-oxo-5α-androstan-17β-yloxycarbonyl)methyl]guanine    4                       0.5        80
O6-Benzyl-9-[2-hydroxy-3-(isopropoxy)propyl]guanine                  7                       0.8        80
O6-Benzylguanosine                                                   11                      2          79
O6-(p-Chlorobenzyl)guanosine                                         10                      5          79
O6-(p-Methylbenzyl)guanosine                                         9                       3          79
O6-Benzyl-9-(cyanomethyl)guanine                                     13                      0.8        79
O6-Benzyl-9-(2-hydroxybutyl)guanine                                  13                      2          79
N2-Acetyl-O6-benzylguanine                                           24                      2          80
O6-Benzyl-9-[(ethoxycarbonyl)methyl]guanine                          30                      3          79
O6-Benzyl-9-(carbamoylmethyl)guanine                                 47                      7.5        79
O6-Benzylhypoxanthine                                                85                      36         79
O6-Benzyl-9-[2-hydroxy-3-(isopropylamino)propyl]guanine              106                     23         80
O6-Benzyl-9-(carboxymethyl)guanine, sodium salt                      157                     Inactive   79
O6-Allylguanine                                                      20                      4          79

Compounds from ref. 81 (NDᵇ in cells): O6-(2-Ethylallyl)guanine, O6-(2-Methylallyl)guanine, O6-[2-(2-Methyl)ethylallyl]guanine, O6-(2-Phenylallyl)guanine, O6-(2-Oxopropyl)guanine, O6-(2-Oxobutyl)guanine, O6-(2-Oxo-3-methylbutyl)guanine, O6-(2-Oxo-2-phenylethyl)guanine, O6-(n-Butyl)guanine. Values in cell-free extracts for this group, listed in order: 8.5, 16, 25, >1000, 77, 192, 151, >1000, >1000, 493, [illegible].

ᵃ The concentration required to produce inactivation in cell-free extracts on incubation for 30 minutes or in cells on incubation for 4 hours.
ᵇ ND, Not determined.

A later report (81) on HT29 cell alkyltransferase inactivation extended the range of O6-substituents on O6-substituted guanines to include 2-oxoalkyl substituents, 2-substituted allyl substituents sterically similar to the 2-oxoalkyl groups (Fig. 5), and some previously untested alkyl substituents, e.g., 3-methylbutyl, cyclohexylmethyl, and 2-phenylethyl. The inactivating ability of O6-methyl-, -benzyl-, -allyl-, -ethyl-, -propyl-, and -n-butylguanine was also reexamined for comparison. The 2-oxoalkyl substituents and sterically related 2-substituted allyl substituents were selected to test whether the efficiency of alkyltransferase inactivation by O6-substituted guanines bearing these groups would parallel the chemical reactivity of these groups toward iodide ion when attached to more conventional leaving groups (e.g., Cl). Because 2-oxo-2-phenylethyl chloride is at least 1000 times more reactive toward I⁻ than allyl chloride, it seemed reasonable to expect that O6-(2-oxo-2-phenylethyl)guanine might be considerably more active than O6-allylguanine as an alkyltransferase inactivator. Unfortunately, the O6-(2-oxoalkyl)-substituted guanines were rather poor alkyltransferase inactivators (Table II), as were the newly tested O6-alkylated guanines (81). Interestingly, O6-(2-ethylallyl)- and O6-(2-methylallyl)guanine showed only one-half and one-third the alkyltransferase-inactivating activity of O6-allylguanine. This latter derivative and O6-benzylguanine were the most active of the alkyltransferase inactivators tested (81), which is consistent with earlier observations (79). It was concluded from these comparisons that alkyltransferase inactivation by O6-substituted guanines is not readily predicted by reactivity patterns established for substituents in model chemical reactions. It was also suggested that if the alkyltransferase harbors a hydrophobic pocket in the active site that surrounds a nonpolar chain on an O6-substituted guanine inactivator, this pocket must be very strictly defined so as to allow only certain hydrophobic substituents to fit properly and thereby undergo accelerated transfer to the active-site cysteine residue (81).

FIG. 5. Structures of O6-allylguanine and O6-oxoalkylguanine derivatives tested as inactivators of alkyltransferase. Structures shown include O6-(n-butyl)guanine, R = (CH2)3CH3.

Although all these studies provided a variety of analogs that were as potent as or less potent than O6-benzylguanine as alkyltransferase inactivators, none of the analogs was better than O6-benzylguanine. However, a recently completed survey of the human alkyltransferase-inactivating ability of several additional 2- and/or 8-substituted 6-benzyloxypurines and 6(4)-benzyloxypyrimidines revealed that two classes of compounds are more effective than O6-benzylguanine as alkyltransferase inactivators (82). These classes were 8-substituted O6-benzylguanine derivatives bearing electron-withdrawing substituents at the 8-position, and 5-substituted 2,4-diamino-6-benzyloxypyrimidine derivatives bearing electron-withdrawing substituents at the 5-position. Specifically, O6-benzyl-8-bromoguanine and 8-aza-O6-benzylguanine (83) (Fig. 6) were each approximately 2.7 times more effective than O6-benzylguanine as alkyltransferase inactivators in HT29 cell-free extracts, whereas both 2,4-diamino-6-benzyloxy-5-nitro- (84) and 2,4-diamino-6-benzyloxy-5-nitrosopyrimidine (85) were 3.3 times more active (82). The latter pyrimidines were also very effective inactivators in intact HT29 cells (Table II), although the two purine derivatives were slightly less active than expected from their activity in extracts. In any event, the greater alkyltransferase-inactivating efficiency of these compounds in extracts is probably a consequence of the weaker basicity of the anionic forms of their heterocyclic leaving groups compared to that of the guanine anion, which makes them easier to displace by the cysteine thiolate anion in the alkyltransferase active site. Provided these (Fig. 6) and related agents do not exhibit undesirable toxicity, they should be superior to O6-benzylguanine as chemotherapeutic adjuvants for enhancing the effectiveness of antitumor drugs whose mechanism of action involves modification of the O6-position of DNA guanine residues (see Section VI,B).

FIG. 6. Structures of 8-substituted O6-benzylguanine derivatives and 5-substituted 2,4-diamino-6-benzyloxypyrimidine derivatives tested as inactivators of alkyltransferase. Structures shown: O6-benzyl-8-bromoguanine, 8-aza-O6-benzylguanine, 2,4-diamino-6-benzyloxy-5-nitrosopyrimidine, and 2,4-diamino-6-benzyloxy-5-nitropyrimidine.
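The relative potencies quoted in this section follow directly from the concentrations listed in Table II. As a hedged illustration, the sketch below recomputes a few of them as ratios to the O6-benzylguanine value in cell-free extracts. The dictionary holds only values transcribed from Table II, a lower concentration is assumed to mean greater potency, and the helper name fold_vs_bg is hypothetical.

```python
# Concentrations required for inactivation in HT29 cell-free extracts,
# transcribed from Table II (lower value is assumed to mean more potent).
CELL_FREE = {
    "O6-benzylguanine": 0.2,
    "2,4-diamino-6-benzyloxy-5-nitrosopyrimidine": 0.06,
    "2,4-diamino-6-benzyloxy-5-nitropyrimidine": 0.06,
    "8-aza-O6-benzylguanine": 0.07,
    "O6-benzyl-8-bromoguanine": 0.08,
    "O6-benzyl-2'-deoxyguanosine": 2,
    "O6-benzylhypoxanthine": 85,
}

def fold_vs_bg(name):
    """Potency relative to O6-benzylguanine (values above 1 are more potent)."""
    return CELL_FREE["O6-benzylguanine"] / CELL_FREE[name]

for name in CELL_FREE:
    fold = fold_vs_bg(name)
    if fold >= 1:
        print(f"{name}: {fold:.1f} times as active")
    else:
        print(f"{name}: 1/{1 / fold:.0f} as active")
```

On these inputs the two pyrimidines come out 3.3 times as active as O6-benzylguanine and the two 8-substituted purines average about 2.7 times, while O6-benzyl-2'-deoxyguanosine and O6-benzylhypoxanthine come out at a tenth and 1/425th, matching the figures given in the text.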

B. Mechanism of Alkyltransferase Inhibition

The mechanism of inactivation of the purified recombinant human alkyltransferase by O6-benzylguanine has been investigated (42). These studies clearly demonstrated that O6-benzylguanine served as an alternative substrate for the protein, causing formation of an S-benzylcysteine residue in the active site of the protein with concomitant production of a stoichiometric amount of guanine. The second-order rate constant for alkyltransferase inactivation, determined from the rate of guanine liberation, was approximately 600 M⁻¹ sec⁻¹, which is roughly 1/30,000th of the rate constant for removal of methyl groups from O6-methylguanine residues in DNA by the human protein, but is 2000-3000 times the rate constant for reaction with O6-methylguanine as the free base.

Although the human alkyltransferase is very efficiently inactivated by O6-benzylguanine, this compound is far less effective in inactivating the alkyltransferase from the E. coli ogt gene (42, 45) and was inactive against either the E. coli Ada alkyltransferase (42, 45, 86) or the S. cerevisiae alkyltransferase (42). Related comparisons with other inactivators indicated that O6-benzyl-2'-deoxyguanosine is effective only against the human protein. O6-Allylguanine was active at high concentrations against the human protein and the E. coli Ogt protein, whereas O4-benzylthymidine was inactive against all alkyltransferases studied (42). The decreasing ease of inactivation by O6-benzylguanine of alkyltransferases in the series human > Ogt > Ada ≥ yeast was interpreted as reflecting a decrease in size of the respective alkyltransferase active site in the same order, such that the bulky O6-benzyl- or O6-allylguanine inactivators can only be accommodated by the human and Ogt proteins (42). Mutant human alkyltransferases have been prepared to test this hypothesis (50, 69). In the first study, mutations were made that replaced either Trp100 or Pro140 by Ala (W100A or P140A). These proteins were tested for their ability to demethylate O6-methylguanine residues in DNA and to undergo inactivation by pretreatment with O6-benzylguanine (69). Mutants W100A and P140A were fully active in repairing O6-methylguanine residues in DNA.
However, although mutant W100A was very effectively inactivated by low concentrations of O6-benzylguanine, mutant P140A was some 40-fold more resistant to O6-benzylguanine inactivation. Mutant P140A was also very ineffective in producing guanine from an O6-benzyl-[8-3H]guanine substrate. The P140A mutant was only 2.8-fold more resistant to inactivation by O6-allylguanine (69). Presumably, the smaller O6-allylguanine fits better in the active site pocket of reduced size in the mutant alkyltransferase. It was concluded that a proline residue at position 140 in the normal human protein produces a bend in the active site that increases its size and allows the bulky O6-benzylguanine to be accommodated in the active site (69). In the yeast and Ada proteins, which are not inactivated by O6-benzylguanine (42), this proline is replaced by an alanine. In the Ogt protein, this proline residue is replaced by a serine, although there is another proline residue at position 138 in the Ogt sequence that probably contributes to its sensitivity to both O6-benzyl- and O6-allylguanine inactivation (42).

Subsequent experiments (50) are consistent with these conclusions, and demonstrate, in addition, that at least two domains on either side of the active cysteine residue in the human alkyltransferase affect access of O6-benzylguanine to the active site of the protein. Thus, mutant proteins obtained by changing the Pro138 to Lys (P138K) or Ala (P138A) were 8- and 10-fold more resistant to inactivation by O6-benzylguanine than the normal human protein. Double mutants resulting from changes in both prolines, i.e., P138A/P140A and P138K/P140A, were 88- and 116-fold more resistant to O6-benzylguanine inactivation, respectively. Changes in a second domain at the other side of the active-site cysteine were made by replacing Gly156 by Ala (G156A) or Trp (G156W) (50). These proteins were 240- and 320-fold more resistant to inactivation by O6-benzylguanine, whereas a double mutant resulting from changing one amino acid in each domain, i.e., P140A/G156A, was more than 1200-fold more resistant to O6-benzylguanine. These data not only support the likelihood of a major conformational change in the human alkyltransferase on binding to DNA (32, 43), but also emphasize the structural contributions made by prolines at 138 and 140 and glycine at 156 in facilitating access of O6-benzylguanine to the human protein active-site cysteine residue (50). These prolines (see above) and the glycine residue are not present in the analogous methyl transfer domain of the Ada protein, which is totally resistant to inactivation by O6-benzylguanine (42, 45, 86). Furthermore, Gly156 is not present in any of the microbial alkyltransferases that show only limited inactivation, if any, by O6-benzylguanine (42, 45, 86), and Pro140 is absent from the E. coli Ogt protein, which is inactivated by O6-benzylguanine, albeit only at high concentrations (42, 45).
Finally, the fact that simple point mutations in the human alkyltransferase can produce significant resistance to inactivation by O6-benzylguanine raises the possibility that the clinical usefulness of O6-benzylguanine may be limited when combined with mutagenic chloroethylating or methylating antitumor drugs that could produce mutations leading to resistant alkyltransferases. Additional alkyltransferase inactivators that retain activity against mutant proteins could be used to advantage in such an adjuvant chemotherapy scenario.
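The rate constants and fold-resistance figures quoted in this section can be related to one another with a little arithmetic. The following is a minimal sketch, not a method taken from the cited studies: the 10 µM inactivator concentration, the pseudo-first-order treatment, and the idea of comparing double mutants against a simple multiplicative expectation are all illustrative assumptions, and the assignment of 8-fold to P138K and 10-fold to P138A assumes the values are listed in the same order as the mutants.

```python
import math

# Second-order rate constant for inactivation of the human alkyltransferase
# by O6-benzylguanine, as quoted in the text (M^-1 s^-1).
K_BENZYLGUANINE = 600.0

# The text describes this value as roughly 1/30,000th of the rate constant for
# repair of O6-methylguanine in DNA, and 2000-3000 times the rate constant for
# reaction with O6-methylguanine itself.
k_methylated_dna = K_BENZYLGUANINE * 30_000
k_methylguanine_range = (K_BENZYLGUANINE / 3000, K_BENZYLGUANINE / 2000)


def half_life_s(k_second_order, inactivator_molar):
    """Pseudo-first-order half-life in seconds.

    Assumption (not from the text): the inactivator is in large excess over
    the protein, so its concentration stays effectively constant.
    """
    return math.log(2) / (k_second_order * inactivator_molar)


# Fold-resistance to O6-benzylguanine relative to the wild-type protein.
# Order of P138K/P138A values assumed to follow the order given in the text.
SINGLE = {"P138K": 8, "P138A": 10, "P140A": 40, "G156A": 240, "G156W": 320}
DOUBLE_OBSERVED = {("P138A", "P140A"): 88, ("P138K", "P140A"): 116}
# P140A/G156A is reported only as a lower bound (>1200-fold), so it is left out.


def multiplicative_expectation(a, b):
    """Fold-resistance expected if the two changes acted independently
    (a hypothetical reference model, not a claim made by the authors)."""
    return SINGLE[a] * SINGLE[b]


if __name__ == "__main__":
    print(f"Implied k for methylated DNA: {k_methylated_dna:.2e} M^-1 s^-1")
    print(f"Half-life at 10 uM inactivator: {half_life_s(K_BENZYLGUANINE, 10e-6):.0f} s")
    for pair, observed in DOUBLE_OBSERVED.items():
        print(pair, "observed", observed, "vs multiplicative", multiplicative_expectation(*pair))
```

Under these assumptions, the two proline double mutants fall well short of the multiplicative expectation, which is at least compatible with the two prolines belonging to the same structural domain; the sketch does not by itself establish this.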

C. Use of O6-Benzylguanine or Derivatives as Substrates to Study Alkyltransferase Activity

The finding that O6-benzylguanine can be a substrate for the alkyltransferase allows its use to study features of the reaction that are difficult to investigate with DNA substrates. The comparatively slow rate of reaction with the O6-benzylguanine allows kinetic measurements to be carried out much more easily. For these studies, it was necessary to synthesize O6-benzylguanine analogs in radioactively labeled forms with the label at a defined site. Catalytic tritium-exchange provided O6-benzyl-[8-3H]guanine and O6-benzyl-[8-3H]-2'-deoxyguanosine (42, 87). Labeling of the benzyl portion of a substituted O6-benzylguanine derivative was carried out by synthesizing O6-(p-formylbenzyl)guanine (Fig. 4), which was then converted to labeled O6-(p-hydroxy-[3H]methylbenzyl)guanine by reduction with sodium boro[3H]hydride (80, 88). When the human alkyltransferase was incubated with O6-benzyl-[8-3H]guanine, [8-3H]guanine was formed (42, 50, 69, 87). When the human alkyltransferase was incubated with O6-(p-hydroxy-[3H]methylbenzyl)guanine, radioactivity was incorporated into the protein (88). Both reactions ceased when all of the alkyltransferase activity was depleted, and did not occur when the alkyltransferase was first inactivated by reaction with a methylated DNA substrate. All of the protein-bound radioactivity from the reaction with O6-(p-hydroxy-[3H]methylbenzyl)guanine was incorporated into a single tryptic peptide; sequencing of this peptide showed that it had the sequence GNPVPILIPXH where all of the label was released with residue X (88). This peptide corresponds exactly to that expected from the amino-acid sequence and the binding of the p-hydroxy-[3H]methylbenzyl group to Cys145. The rate of production of [3H]guanine from O6-benzyl-[8-3H]guanine and of the labeling of the alkyltransferase protein by O6-(p-hydroxy-[3H]methylbenzyl)guanine was greatly reduced with mutants P140A, P138K/P140A, G156A, G156W, or P140A/G156A, confirming that these mutants react much less readily with O6-benzylguanine (50, 69, 88).
A mutation of Arg128 to Ala (R128A) produced an alkyltransferase that reacted with O6-benzyl-[8-3H]guanine and O6-(p-hydroxy-[3H]methylbenzyl)guanine at a normal rate but was virtually inactive with methylated DNA substrates, repairing these at a rate 1/1000th that of the wild-type alkyltransferase. Furthermore, this mutant protein did not lead to a “gel shift” when added to ssDNA, whereas the wild-type protein produces such a shift (88a). These results indicate that the R128A mutation affects the DNA binding of the substrate and support the suggestion that the helix-turn-helix region in which Arg128 is located is indeed involved in this DNA binding. However, it is noteworthy that mutation W100A, which is also in this region, has no detectable effect on the activity (69). The Trp100 residue, like Arg128, is totally conserved in all alkyltransferase sequences.

The rate of alkyltransferase reaction with O6-benzyl-[8-3H]guanine or O6-(p-hydroxy-[3H]methylbenzyl)guanine was increased about 5- or 6-fold by the addition of DNA (87, 88). This supports the hypothesis that a conformational change in the protein occurs on binding to DNA that facilitates the reaction (32). The reaction with mutants P138A/K and P140A was stimulated to a greater extent (15- to 20-fold) than with the wild-type protein, suggesting that the conformational change is even more effective in these proteins (87, 88). In contrast, guanine production by mutant G156A was only slightly increased (about 2-fold) by DNA addition. Gly156 is in the region forming a hinge, movement about which is postulated to change the position of the carboxyl-terminal helix region in response to DNA binding. Its alteration may therefore prevent the rotation from occurring.

It is well-known that the alkyltransferase reaction with methylated DNA substrates is reduced by about 80% by increasing ionic strength, with a maximal effect at about 0.2 M NaCl (23). The increase in [8-3H]guanine production from O6-benzyl-[8-3H]guanine, but not the basal rate of reaction, was also inhibited by increasing ionic strength (87). This suggests that it is the DNA binding or the configurational change in response to DNA binding that is inhibited by salt.

The formation of [8-3H]-2'-deoxyguanosine from O6-benzyl-[8-3H]-2'-deoxyguanosine is not stimulated by DNA but is, in fact, inhibited (87). This can be explained if the binding of O6-benzyl-[8-3H]deoxyguanosine occurs in a way such that the sugar moiety intrudes into the DNA binding site and thus prevents or reduces the simultaneous binding of both molecules. This might explain the somewhat unexpected finding that O6-benzylguanine is a more potent inactivator of the alkyltransferase than its deoxyribonucleoside, which more closely resembles a DNA substrate.

III. Substrate Specificity and Metabolism of Alkyltransferase

A. Substrate Specificity

1. REPAIR OF O4-ALKYLTHYMINE

Although all known alkyltransferases react most rapidly with adducts at the O6-position of guanine, it has been well-established for some time that the Ada and Ogt alkyltransferases are able to remove methyl groups from O4-methylthymine in DNA (21-23, 28-30, 63, 89, 90). Initial attempts to show unequivocally that the mammalian alkyltransferases also carry out this reaction were unsuccessful (90), but indirect evidence that the E. coli, S. cerevisiae, and human alkyltransferases can react with oligodeoxyribonucleotides containing O4-methylthymine was obtained (91). The availability of large quantities of recombinant mammalian alkyltransferase has allowed this question to be investigated in more detail, and it was found that both the human and the rat protein do react with O4-methylthymine in DNA (40, 44; N. Loktionova and A. E. Pegg, unpublished). However, the rate of repair is about 1/200th (rat) or 1/5500th (human) that of the repair of O6-methylguanine in DNA (44). Thus, in order to demonstrate repair of O4-methylthymine, a large excess of the mammalian alkyltransferase is needed. Furthermore, in substrates containing both O6-methylguanine and O4-methylthymine, the latter is not repaired until virtually all of the O6-methylguanine has been demethylated. These results suggest that the repair of O4-alkylthymine in vivo by alkyltransferase is likely to occur only under conditions wherein the extent of alkylation is so low that all of the O6-methylguanine can be repaired without depleting the cellular content of alkyltransferase. The results of a study in which the persistence of O6-methylguanine and O4-methylthymine in rat liver DNA was measured after treatment with 1,2-dimethylhydrazine with and without O6-benzylguanine are consistent with this (92). Treatment with O6-benzylguanine did reduce the rate of loss of O4-methylthymine from the DNA, indicating that alkyltransferase was involved in its removal, but O4-methylthymine disappearance was slow when O6-methylguanine was also present in the DNA (92). Thus, alkyltransferase activity may play a limited role in mammalian tissues in repairing O4-methylthymine, which is formed at much lower levels than O6-methylguanine (1-3). It has been suggested that the presence in DNA of O4-methylthymine paired with adenine causes a large degree of local curvature and that this may account for the low efficiency of alkyltransferase repair (93). Further comparisons of the structures of oligodeoxyribonucleotides containing O6-alkylguanine and O4-alkylthymine with their rates of repair by alkyltransferase are needed to evaluate this suggestion.
The biochemical basis for the marked species differences in the ability of the alkyltransferases to act on thymine and guanine is at present unknown. The use of site-specific mutagenesis of amino-acid residues that differ in alkyltransferases showing differences in specificity should allow this to be investigated. The fact that the rat and the human alkyltransferases, which vary only slightly in primary sequence (Fig. 1), show a 25-fold difference in this rate (44) should be particularly helpful in identifying key residues.
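As a quick consistency check on the figures above, the relative rates of O4-methylthymine repair quoted for the rat (about 1/200th) and human (about 1/5500th) proteins can be compared directly. This is only an illustrative calculation; it assumes both fractions are expressed against comparable O6-methylguanine repair rates, which the text does not state explicitly.

```python
from fractions import Fraction

# Rate of O4-methylthymine repair relative to O6-methylguanine repair,
# as quoted in the text for each species.
rat_relative_rate = Fraction(1, 200)
human_relative_rate = Fraction(1, 5500)

# Ratio of the two relative rates (rat versus human).
rat_over_human = rat_relative_rate / human_relative_rate

if __name__ == "__main__":
    print(f"rat/human ratio: {float(rat_over_human):.1f}-fold")
```

The ratio comes out at 27.5, in line with the roughly 25-fold species difference cited.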

2. REPAIR OF DIFFERENT ALKYL GROUPS

It is well-established that a variety of groups can be removed from the O6-position of guanine in DNA by the alkyltransferase. These include methyl-, ethyl-, n-propyl-, n-butyl-, 2-chloroethyl-, 2-hydroxyethyl-, iso-propyl-, iso-butyl-, benzyl- (21-23, 63, 75-78, 89, 94), and perhaps pyridyloxobutyl- (95). With the aliphatic alkyl groups, the rate of repair decreases with increasing size, but the ability of O6-benzylguanine to serve as a substrate for the mammalian alkyltransferases illustrates the point that it is dangerous to generalize about steric effects on reaction rates without experimental evidence. The inability of the microbial alkyltransferases to react with O6-benzylguanine also emphasizes marked species specificity in the relative rates of repair with different substrates. Several studies have established that the Ada, Ogt, and mammalian alkyltransferases differ in their relative abilities to act on larger alkyl adducts (21-23, 63, 77, 89). A better understanding of the biochemical basis for the differences in rates of repair may be revealed from comparative studies with alkyltransferases containing point mutations. A preliminary study of this type has revealed a striking difference between the human and the rat alkyltransferase in the rate of repair of O6-ethylguanine in a dodecamer substrate (94).

In addition to the expected increase in mutations in response to alkylating agents, the E. coli ada- ogt- strain showed an increased frequency of spontaneous mutations (96). This could be due to endogenous alkylation by S-adenosylmethionine or other methylating agents, but an increase was observed in a wide variety of point mutations, not just the G·C to A·T transitions seen with simple methylating agents. This suggests that these alkyltransferases do act in vivo on some other endogenous alkylation lesions.

3. SEQUENCE SPECIFICITY IN REPAIR

The sequence specificity of alkyltransferase action can be investigated using oligodeoxyribonucleotide substrates containing an O6-alkylguanine in a defined sequence (21-23, 51, 94, 97, 98). Early studies showed that double-stranded dodecamers suffice to give rates of repair equivalent to high-molecular-weight DNA, and later work indicates that octamers may be sufficient (94). Investigations with oligodeoxyribonucleotides containing O6-methylguanine in a variety of sequences show that the mammalian alkyltransferase can repair O6-methylguanine in all sequences tested, but there were some differences in rates of repair (21-23, 63, 89, 97-102). The presence of a guanine on the 5' side of the O6-methylguanine reduced the rate of repair (98, 99) and an O6-methylguanine located in a sequence -Gm6GA-, which corresponds to a hot spot for mutation at the twelfth codon of the H-ras proto-oncogene, was repaired particularly slowly by the Ada alkyltransferase (100, 101). There was a correlation between the avidity of binding of an antibody to O6-methyl-2'-deoxyguanosine and the ease of repair by the alkyltransferase when a number of pentadecamers containing O6-methylguanine were compared, suggesting that the rate of repair reflects the conformation of the sequence (101, 103). Studies of the transformation of Rat-4 cells by O6-substituted guanines incorporated in codon 12 (GGA) of the rat H-ras gene (104) indicated a more enhanced transforming potency of O6-methylguanine residues in alkyltransferase-depleted cells compared to normal cells when the modified base was incorporated in place of the second guanine residue of codon 12 rather than the first. These data also suggest more efficient repair by the alkyltransferase of the -m6GGA- sequence compared to a -Gm6GA- sequence even in mammalian cells. It has been suggested that repair differences appear to be due to the strength of the stacking interaction between the 5' preceding base and the O6-methylguanine (102).

Transfection into COS-7 cells of an SV-40-based shuttle vector containing O6-methylguanine in place of either the first or second guanine of the human H-ras codon-12 sequence (GGC) showed that the mutation frequency is significantly higher when the modified guanine is at the second position (105). This report is also consistent with differential repair, but it differs from reports of related experiments with the GGA sequence in alkyltransferase repair-competent Rat-4 fibroblasts (104, 106). This difference may be a consequence of the different cell types used, the different codon-12 sequences, or the fact that the modified GGC sequence was incorporated in an extrachromosomally replicating shuttle vector (105), whereas the GGA sequence was replicated intrachromosomally (104, 106).

Remarkably, a 5-methylcytosine base 5' to an O6-methylguanine completely prevents repair by the alkyltransferase in some sequences and reduces it in others (107). This result suggests that the lack of repair of O6-methylguanine formed at CpG sites that are frequently methylated on the cytosine could contribute to the known increased mutation frequencies at these sites.
The striking effect of 5-methylcytosine may be related to the previous observations that the Ada alkyltransferase cannot repair O6-methylguanine in Z-DNA substrates (108). Because the protein can act on the free base and very small oligodeoxyribonucleotides (21, 22, 97, 102), these results suggest that some DNA configurations prevent the access of the O6-methylguanine to the active site of the protein.

The accessibility of O6-alkylguanine lesions in nuclear DNA to the alkyltransferase may be affected by either chromatin structure or transcription (109, 110). A more rapid repair of O6-alkylguanine appears to occur in transcribed genes (109, 110), but it is not certain that the repair was mediated by alkyltransferase. Most (90%) of the O6-methylguanine formed after exposure to MNU in nuclear DNA of hamster fibroblast cells expressing the Ada alkyltransferase was repaired within 1-2 hours, but some of the remainder was repaired much more slowly over the next 48 hours (110). The persistent lesions were present in the nuclear matrix DNA. However, all the adducts could be repaired by the Ada alkyltransferase in vitro when presented with purified nuclear DNA. These results suggest that packing of DNA into chromatin structure drastically affects its accessibility to repair.

B. Location and Degradation

1. FATE OF ALKYLATED PROTEIN

After transferring an alkyl group to the cysteine acceptor site, the alkyltransferase protein has a different conformation. This conformation is much less stable in the cell, and the alkyltransferase protein disappears rapidly after treatment of HT29 cells with O6-benzylguanine or MNNG (53, 60). This conformational change and/or degradation could be of physiological importance in reducing the nuclear content of the alkyltransferase protein and thus allowing additional transport or diffusion of the active protein into the nucleus. It is also possible that the degradation of the consumed alkyltransferase protein provides a signal for the synthesis of additional protein.

The alkylated alkyltransferase is more sensitive to digestion with protease V8, forming bands of 18 and 14 kDa that correspond to cleavages at E30 and E172 (61). It was suggested that this provides a method to evaluate exposure to alkylating agents by running immunoblots developed with antibodies to alkyltransferase on samples with and without treatment with protease V8. A positive result was obtained with cells treated in culture with millimolar concentrations of alkylnitrosoureas; a small fraction of the alkyltransferase in one patient treated with CCNU was converted to the V8-sensitive form (61). Although others have also noted the presence of immunoreactive protein corresponding to inactive alkyltransferase in some cultured cells (49), it remains to be determined whether this method is sensitive enough to prove useful as a general method.

2. NUCLEAR LOCALIZATION

To repair methylated DNA, the mammalian alkyltransferase must be present in the nucleus, but the protein contains no nuclear transport signal sequence. It is possible that it is sufficiently small to pass through the pores in the nuclear membrane and that the DNA binding domain of the protein helps to retain it in the nucleus. However, the DNA binding is quite weak and very sensitive to ionic strength, so it is possible that only a fraction of the protein is located in the nucleus. Immunocytochemical visualizations of the alkyltransferase have given rather contradictory results; some studies show a predominantly nuclear location (47, 52, 54) whereas others show a more general distribution with a substantial part of the protein in the cytoplasm (53, 111). A difference in the ability of the antibody to detect the DNA-bound form of the alkyltransferase could account for the discrepancies.

A fusion protein with the estrogen receptor sequence attached to the amino terminus of the human alkyltransferase was expressed from a plasmid vector transfected into HeLa cells (111). This protein repaired DNA and protected the cells from the toxic effects of ACNU only when estrogen was added to bring about its translocation to the nucleus. This experiment shows clearly the need for nuclear translocation for the alkyltransferase to be effective, but the larger size (55 kDa) of the estrogen receptor-alkyltransferase fusion protein may have been a factor in its failure to localize in the nuclei.

Comparison of the relative abilities of the Ada and the human alkyltransferases to produce resistance to alkylation damage in CHO cells by transfections with plasmids expressing these proteins showed that the human protein is considerably more active (112). This may indicate that the mammalian protein has better access to DNA lesions, or is more active or stable in a mammalian cell environment.

IV. Regulation of Alkyltransferase Expression

A. Structure of the Gene and Activity of the Promoter

1. GENE STRUCTURE

The human alkyltransferase gene has been localized to the telomeric end, 26q, of the long arm of chromosome 10 (113). The mouse and human alkyltransferase genes have been only partially characterized. The gene is clearly very large; its size has been estimated to be >145 kb in the mouse (114) and >170 kb in the human (115), despite the short transcripts of about 950 nucleotides. The exon/intron junctions are virtually identical and there are five exons (115). Two very large introns of about 46 kb and >50 kb flank exon III.

2. PROMOTER

A portion of the gene having promoter activity has been isolated and tested by fusion to a reporter gene and transfection into a variety of mammalian cells (115-117). The promoter has similarities to many “housekeeping genes” whose activity is present in most cells, and does not show rapid responses to external stimuli. The promoter region contains no TATA or CAAT consensus sequences. The promoter is very (G+C)-rich and contains multiple potential binding sites for the Sp1 transcription factor, but there was no correlation between cellular content of Sp1 and alkyltransferase expression. There were striking differences in the activity of the promoter in different cell types, amounting to more than 2500-fold between the most active and the least active (117). In some lymphoblast cell lines, the alkyltransferase promoter was even stronger than the CMV promoter activity. These results suggest that different levels of expression of alkyltransferase in Mer+ cells may be due to variations in the content of various as yet unidentified transcription factors (117). However, some caution should be exerted in interpretation of results using the 1.2-kb fragment as the promoter sequence. Although this has maximal promoting activity, the possibility that other regions of the very large gene have powerful modifying effects on transcription cannot be ruled out.

3. BIOCHEMICAL BASIS OF THE MER- PHENOTYPE

The underlying biochemical basis of the Mer- phenotype is only partially understood. Although there are a few exceptions wherein inactive protein appears to be produced, presumably as a result of a mutation affecting activity or stability (49, 118), it is well established that Mer- cells contain no alkyltransferase protein or mRNA, but that the great majority of Mer- cells do contain the alkyltransferase gene (23, 49, 115, 119, 120). This suggests a lack of transcription of the gene. Plasmid constructs containing the alkyltransferase promoter fused to a reporter gene give good expression of the reporter when transfected into Mer- cells, indicating that the appropriate transcription factors are present in Mer- cells (115-117). These results focused attention on the methylation status of the alkyltransferase gene. Although initial results indicated differences in the methylation status of the gene, some studies reported a greater methylation of the gene in Mer- cells and others a greater methylation in Mer+ cells (115, 120-123). More detailed studies have clarified this situation. Methylation in the body of the gene correlated directly with expression of the protein, whereas methylation at 21 of the 25 CpG sites in the promoter region varied inversely with expression (124). In vitro methylation of the alkyltransferase promoter by either HpaII or HhaI methylases reduced the promoting activity (125). One of the sites for HpaII methylation (which is also at a SmaI site) located at position 69 appears to be particularly well correlated with the Mer- phenotype (126). Thus, it is possible that selective methylation at a few sites in the promoter does suppress alkyltransferase transcription and that this can be overcome by methylation at other enhancer sequences in the gene. Additional methylation in the promoter could then overcome the positive effect of the methylation at other sites.
It still remains to be determined whether these correlations are related to the cause of the Mer- phenotype or are a consequence of it. Evidence for the former is provided by studies with 5-azacytidine, whose inhibitory effect on DNA methylation did produce a conversion of some Mer- cells to a Mer+ phenotype and a demethylation of the SmaI site at position 69 (126). Continued culture of these reverted cells in the absence of 5-azacytidine led to an increase in expression, which correlated with additional methylation in the body of the gene. The finding that the conversion of Mer- to Mer+ by 5-azacytidine exposure is a relatively rare event is consistent with the presence of a few critical methylation sites and a larger number of sites having reinforcing but not controlling effects. The ability of methylation in the body of the gene to increase transcription may account for the finding that prolonged exposure of Mer+ cell lines to 5-azacytidine actually reduces alkyltransferase expression (121).

The high frequency of occurrence of the Mer- phenotype in cultured human cell lines (20-40% of lines are Mer-) is still unexplained, although it suggests that the absence of the protein provides some advantage for growth in an environment normally lacking alkylating agents. The possibility that alkyltransferase levels may change significantly in culture due to changes in gene expression (127) needs further study.

An obvious strategy for the depletion of alkyltransferase activity in tumors would be to induce the Mer- phenotype. Unfortunately, despite the high prevalence of this phenotype in cultured human tumor cells, there is no clear protocol for accomplishing this. The finding that exposure of a multiple myeloma cell line RPMI 8226 to a combination of doxorubicin and verapamil led to the loss of alkyltransferase mRNA production and activity and hence to an increase in sensitivity to nitrosoureas (128) suggests that it may be possible to find protocols for the reproducible conversion of cells to a Mer- status.

An attempt has been made to reduce alkyltransferase expression by the transfection of a construct expressing a ribozyme specific for the cleavage of the alkyltransferase mRNA sequence (129). However, although the ribozyme clearly cleaved the mRNA in vitro, only 1 clone from a total of 16 clones of HeLa CCL2 cells resulting from the transfection and selection showed a complete lack of alkyltransferase activity.
Clearly, this innovative approach will require more refinement to increase the level of ribozyme expression to that needed for effective mRNA reduction before the next step of trying to devise a procedure for the delivery of the ribozyme to tumor cells in a cancer patient.
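The promoter methylation analysis discussed in this subsection depends on locating CpG dinucleotides and the recognition sequences of the enzymes mentioned (HpaII, CCGG; HhaI, GCGC; SmaI, CCCGGG). The following is a minimal sketch of how such sites could be mapped in a sequence. The function names and the toy sequence are hypothetical and are not taken from the alkyltransferase promoter; positions are 1-based for illustration only.

```python
import re

# Recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {"HpaII": "CCGG", "HhaI": "GCGC", "SmaI": "CCCGGG"}


def cpg_positions(seq):
    """Return 1-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def site_positions(seq, site):
    """Return 1-based start positions of a recognition sequence.

    A lookahead pattern is used so that overlapping occurrences are reported.
    """
    return [m.start() + 1 for m in re.finditer(f"(?={site})", seq.upper())]


if __name__ == "__main__":
    toy = "ATCCCGGGTAGCGCAT"  # hypothetical sequence, for illustration only
    print("CpG:", cpg_positions(toy))
    for enzyme, site in RECOGNITION_SITES.items():
        print(enzyme, site_positions(toy, site))
```

Because every SmaI site (CCCGGG) contains an HpaII site (CCGG), a single CpG can be reported by both, which is the situation described for the site at position 69.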

B. Alkyltransferase Induction and Tissue-specific Levels

1. INDUCTION

There is no mammalian counterpart to the massive induction of the Ada alkyltransferase in E. coli as part of the adaptive response to alkylating agents (28-30). Modest (up to 6-fold) increases in alkyltransferase occur in rat liver and some other tissues in response to cellular damage by alkylating agents and some other toxins and in response to partial hepatectomy (2, 3, 21-23). Some experiments indicate a similar type of induction in human and rat tumor cells in culture (23, 130-133). Treatment with γ-radiation brings about a substantial rise in alkyltransferase activity in various rat tissues and cultured cells (23, 134-136). It is possible that the common link in these changes is the presence of DNA breaks, which have been postulated to serve as a signal for increased transcription of the alkyltransferase gene (133). The induction may be negatively regulated by protein phosphorylation and poly(ADP-ribose) content, because inhibitors of protein kinases (133) or poly(ADP-ribose) polymerase (132) increase the induction.

In all cases wherein it has been investigated with appropriate antibodies or cDNA probes, the rise in alkyltransferase activity produced by these treatments appears to be due to an increase in the content of alkyltransferase protein as a result of an increase in the mRNA content, presumably as a result of increased transcription. Detailed studies of alkyltransferase expression in a range of Mer+ cells and tissue samples also indicate that variations in the content of alkyltransferase activity correlate with differences in the level of protein and in differences in the level of mRNA (23, 119, 133, 137).

Small changes in alkyltransferase activity occur during the cell cycle in some cultured mammalian cells (21-23). In mouse embryo fibroblasts and in human promonocytic leukemia cells, alkyltransferase levels were minimal prior to the onset of S phase, low during S phase, but increased during G , and highest in G0 (23, 138). This is surprisingly different from the changes in other enzymes involved in DNA repair and metabolism, but the generality of these changes is not established because alkyltransferase did not decline in resting human ovarian cancer cells (138).

2. INDIVIDUAL VARIATIONS

Many studies have confirmed early findings (reviewed in 21-23) that there are substantial individual variations in human alkyltransferase activity in both normal tissues and tumors (137, 139-148). The extent to which variations in alkyltransferase are due to exposure to inducers, which might increase activity, or to alkylating agents, which might reduce it by producing substrate lesions in DNA, is unknown. Alkyltransferase activity is lower in lymphocytes of clinical workers handling cancer chemotherapeutic agents, and in tire storage workers (143). Activity is increased in lungs of smokers and appears to decrease after more than a year of abstinence from smoking (146), but studies with lymphocytes showed no effects of smoking (143). Alkyltransferase in the gastric mucosa was greater in patients with chronic atrophic gastritis or intestinal metaplasia (144).

Assays of human leukocytes from patients undergoing cancer chemotherapy show a strong inverse relationship between the initial alkyltransferase activity and the amount of O6-methylguanine found in DNA after treatment with a combination of fotemustine with either dacarbazine or 1-p-carboxyl-3,3-dimethylphenyltriazine, another methylating agent (147, 148). There is also a clear relationship between alkyltransferase depletion and the amount of O6-methylguanine persisting in DNA. These results show clearly the critical role of alkyltransferase in removing DNA damage from tumor cells during therapy with alkylating agents.

The incidence of primary Mer- tumors is clearly much less than that of Mer- tumor cell lines. There is a possibility that some cases of apparent absence of alkyltransferase may be due to sample mishandling. However, there are sufficient examples of Mer- samples that contain normal levels of other control enzymes (137, 139, 140, 142, 145) such that it appears probable that this phenotype does actually occur, albeit at a very low frequency.

3. TISSUE SPECIFICITY

It is well known that there are quite striking differences in the level of expression of alkyltransferase between different mammalian tissues and between different tissues within organs (2, 3, 21-23). There are also quite large interindividual variations in alkyltransferase activity in both normal tissues and tumors. A deficiency in alkyltransferase levels was found in patients with liver cirrhosis, a major risk factor for the development of hepatocellular carcinoma (149). The close correlation between mRNA levels and alkyltransferase expression, which, as described above, applies also to Mer- cells, suggests that in situ hybridization will prove to be a useful method to examine cell specificity in alkyltransferase expression. Such studies, using a hybridization probe consisting of 39 nucleotides complementary to positions 511-549 of the human cDNA, have shown cell type-specific expression in human kidney and liver samples (150, 151).
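The in situ hybridization probe mentioned above is described only by its length (39 nucleotides) and the cDNA positions it complements (511-549). As a minimal sketch of how such an antisense probe could be derived from a cDNA sequence, the following assumes 1-based, inclusive coordinates and a DNA probe; the function name and the toy sequence are hypothetical, and the actual human cDNA sequence is not reproduced here.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_probe(cdna, start, end):
    """Return the reverse complement of cdna[start..end].

    Coordinates are assumed to be 1-based and inclusive, so a probe for
    positions 511-549 would span 549 - 511 + 1 = 39 nucleotides.
    """
    if not (1 <= start <= end <= len(cdna)):
        raise ValueError("probe coordinates fall outside the cDNA")
    target = cdna[start - 1:end].upper()
    return target.translate(COMPLEMENT)[::-1]


# Length implied by the coordinates quoted in the text.
probe_length = 549 - 511 + 1

if __name__ == "__main__":
    toy_cdna = "ATGCCGTA"  # hypothetical sequence, for illustration only
    print(antisense_probe(toy_cdna, 2, 5))
    print(probe_length)
```

The coordinate arithmetic agrees with the stated probe length of 39 nucleotides under the inclusive-numbering assumption.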

V. Function of Alkyltransferase

Although the role of alkyltransferase in protection against the lethal, mutagenic, and carcinogenic effects of certain alkylating agents has been recognized for some time from cell culture and animal studies, direct evidence in support of this notion based on human data is still lacking. There are three lines of experimental evidence indicating strongly that the alkyltransferase protein plays an important role in resistance to the cytotoxic, mutagenic, and carcinogenic effects of alkylating agents, such as the nitrosoureas. First, there is an inverse correlation between the level of this protein and the sensitivity of cells to the cytotoxic and mutagenic effects of chloroethylating and methylating agents. Second, transfection experiments show that expression of prokaryotic or eukaryotic alkyltransferase cDNA in mammalian cells protects them from the toxic and mutagenic effects of alkylnitrosoureas. Additionally, expression of a human alkyltransferase transgene in mice efficiently protects these animals from developing tumors in target organs after nitrosourea treatment. Third, treatment of cells with agents that deplete alkyltransferase activity prior to nitrosourea exposure results in an increase in the number of cross-links, sister chromatid exchanges, mutations, and cell lethality as compared to cells treated with nitrosourea alone.

A. Role in Preventing Toxicity, DNA Aberrations, Sister Chromatid Exchanges, Mutations, and Carcinogenesis

1. RESISTANCE TO THE CYTOTOXIC EFFECTS OF ALKYLNITROSOUREAS

As described above, the alkyltransferase protein plays a critical role in the resistance to chemotherapeutic agents that produce a toxic lesion at the O6-position of guanine, including CENUs (18-23), methylating agents (21-23, 152, 153), and nitrososemicarbazides, a new class of bifunctional alkylating drugs (154). Expression of either the Ada or the mammalian alkyltransferase in Mer- cells protects against killing by these agents (reviewed in 21-23). The expression of the Ogt alkyltransferase (which, as described in Section IIJ, has relatively good activity against O4-alkylthymine and higher alkyl derivatives) also protects against killing by ethylating and butylating agents (155). However, the larger monofunctional alkyl groups are also repaired by other pathways; alkyltransferases may be only a minor factor in resistance to killing by such agents. The limited usefulness of alkylnitrosoureas in curative cancer chemotherapy may be accounted for by the relatively large number of tumors that express alkyltransferase activity. ACNU chemotherapy was ineffective in two patients with glioblastomas. Both tumors had high levels of alkyltransferase activity (156). Lower alkyltransferase activities were found in melanoma metastases from patients who responded to dacarbazine compared to nonresponders (157). The importance of alkyltransferase-mediated prevention of the formation of DNA interstrand cross-links was demonstrated by the increased number of such cross-links in Mer- cells and cells treated with inactivators of the alkyltransferase (18-23, 158, 159). Furthermore, increasing alkyltransferase activity in Mer- cells through transfection of DNA or plasmids expressing the alkyltransferase results in a decrease in the number of BCNU-induced DNA interstrand cross-links and cell killing (18-23, 160-162).


2. RESISTANCE TO FORMATION OF SISTER CHROMATID EXCHANGES OR CHROMOSOMAL ABERRATIONS

Sister-chromatid-exchange (SCE) formation has frequently been used as a sensitive indicator of DNA damage. SCE formation is proposed to reflect double-strand breaks followed by recombination with the intact sister chromatid. There is a correlation between SCE and cell killing, suggesting that SCE induction and reproductive cell death are interrelated (163, 164). Many DNA-damaging agents induce SCE formation, albeit with different efficiencies. Chloroethylnitrosoureas are potent inducers of SCEs, whereas methylating and, particularly, ethylating agents are much less efficient. There is substantial evidence indicating that the alkyltransferase protein protects cells from methylating and bifunctional alkylating agent-induced SCEs, implicating lesions at the O6-position of guanine as inducers of SCEs (21-23, 112, 165-167). The sensitivity of cell lines to chloroethylating agents (CENUs) correlates well with the induction of SCEs (112, 165-167). Cells expressing low levels of alkyltransferase are 5- to 10-fold more sensitive to the cytotoxic effects of CENUs and 7- to 8-fold more sensitive to SCE induction than cells with high levels of alkyltransferase (168). Taking into consideration the differences in the extent of DNA alkylation, induction of SCEs by the chloroethylating agent, ACNU, was 45-fold greater than for the ethylating agent, ENU (165). This strongly suggests that SCEs result from the formation of DNA interstrand cross-links and not just from alkylation. Furthermore, pretreatment of Mer+ cells with MNU prior to BCNU resulted in a dose-dependent increase in cytotoxicity and the frequency of SCEs compared to BCNU alone (168, 169). When alkyltransferase activity was inhibited by the presence of O6-methylguanine (170) or O6-benzylguanine (171, 172; M. E. Dolan, J. L. Schwartz and A. E. Pegg, unpublished), BCNU-induced SCEs were markedly increased.
Among the many lesions induced in DNA by methylating agents, the lesion(s) that gives rise to SCEs has been debated for years. Although there is not a perfect correlation between alkyltransferase activity and the production of SCEs by methylating agents (112, 1 0), numerous reports demonstrate an important role for the alkyltransferase protein in the protection against SCE formation (164; reviewed in 21-23). These include observations that Mer- cells, defective in the removal of O6-alkylguanine, have a higher frequency of SCE induction in response to methylating agents, and that reductions in MNNG- or MNU-induced cell killing and SCE formation occur after transfection with plasmids expressing the Ada or human alkyltransferase (21-23, 112). Similarly, inactivation of alkyltransferase by O6-benzylguanine prior to exposure to MNNG increases the yield of aberrations and SCEs and the production of aberrations through several cell cycles (171).


Transfected CHO cell lines overexpressing N3/N7-methylpurine-DNA glycosylase did not show protection against methylation-induced SCEs and aberrations (112, 173). The probability that O6-methylguanine will be converted into a cytogenetic effect is estimated to be about 1:30 for SCEs and 1:147,000 and 1:22,000 for chromosomal aberrations in the first and second post-treatment mitosis, respectively (112). Treatment of Mer- cells with MNNG produces 1 SCE for every 42 ± 10 O6-methylguanine lesions formed in the genome of Mer- cells, and 1 lethal event per 6650 ± 1200 O6-methylguanine lesions in the coding region of the hypoxanthine phosphoribosyltransferase (HPRT) gene (164). Thus, O6-methylguanine lesions are recombinogenic in cells lacking the alkyltransferase protein, and alkyltransferase contributes toward protection against toxicity and SCE formation. Methylating agents are more potent than ethylating agents as inducers of chromosomal aberrations (171). There was only a small decrease in SCEs induced by ethylating agents such as ENU and EMS in cells transfected with plasmids expressing human or Ada alkyltransferases compared to control cells (reviewed in 21-23, 112, 164). These results suggest that a lesion other than O6-ethylguanine plays the major role in SCE formation in cells treated with ethylating agents.
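The conversion odds quoted above from reference 112 (about 1:30 for SCEs, 1:147,000 and 1:22,000 for aberrations in the first and second post-treatment mitosis) can be turned into expected event counts with simple arithmetic. The following is only a sketch: it assumes that every O6-methylguanine lesion converts independently with the quoted probability, which the text does not state, and the lesion burden of 3000 per genome is a hypothetical figure chosen for illustration. The function and variable names are likewise illustrative.

```python
# Odds quoted in the text (reference 112): one O6-methylguanine lesion in
# `one_in` lesions is converted into the named cytogenetic event.
CONVERSION_ODDS = {
    "SCE": 30,
    "aberration_first_mitosis": 147_000,
    "aberration_second_mitosis": 22_000,
}


def expected_events(lesions: float, one_in: float) -> float:
    """Expected number of events for a given lesion count.

    Assumption (not from the text): each lesion converts independently
    with probability 1 / one_in.
    """
    return lesions / one_in


def probability_at_least_one(lesions: int, one_in: float) -> float:
    """Probability that at least one event occurs among `lesions` lesions,
    under the same independence assumption."""
    return 1.0 - (1.0 - 1.0 / one_in) ** lesions


if __name__ == "__main__":
    hypothetical_lesions = 3000  # illustrative lesion burden, not a measured value
    for event, one_in in CONVERSION_ODDS.items():
        mean = expected_events(hypothetical_lesions, one_in)
        p_any = probability_at_least_one(hypothetical_lesions, one_in)
        print(f"{event}: expected {mean:.4g}, P(at least one) {p_any:.4g}")
```

Under these assumptions an SCE is several thousand times more likely than a first-mitosis aberration for the same lesion burden, which is the contrast the quoted odds are meant to convey.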


3. PREVENTION OF MUTATIONS

a. Methylation. Although other DNA adducts can contribute to the initiation of mutations, the production of O6-alkylguanine and O4-alkylthymine is a major factor in the production of mutations by carcinogenic methylating agents (1-7, 174). The predominant miscoding adduct, O6-methylguanine, miscodes by pairing with thymine instead of cytosine during DNA replication, resulting in G·C to A·T transition mutations (1-7, 175-177). Studies of mutagenesis in mammalian cells show that MNNG produces many more mutants in Mer- human cells lacking repair of O6-methylguanine than in repair-proficient cells (21-23). Transfection of Mer- cell lines with plasmid-borne genes encoding the Ada or mammalian alkyltransferase protein reduces the incidence of these mutations (21-23, 178), and inactivation of alkyltransferase by O6-benzylguanine leads to an increased number of mutations (179). Similarly, the presence or absence of alkyltransferase activity in bacteria greatly alters the incidence of mutations caused by MNNG or MNU (30, 180, 181). The incidence of spontaneous G·C to A·T transitions is also reduced by the presence of the alkyltransferase in CHO cells (182) and in E. coli (96), suggesting that the protein protects cells from endogenous methylation. In all cases, including those where alkyltransferase is inhibited or absent, the mutations produced by MNNG and MNU are predominantly G·C to A·T transitions in mammalian cells (178, 179, 183-191) and in bacteria (191-194). Thus, O6-methylguanine and not O4-methylthymine is the major adduct responsible for production of mutations. This is probably due to the very low-level production of O4-methylthymine, because it is known to be a strongly miscoding lesion in vitro and in plasmids containing O4-methylthymine at defined sites (4, 175, 195). In E. coli, O4-methylthymine residues are up to 15- to 20-fold more mutagenic than O6-methylguanine residues when incorporated in M13 genomes (195). It is of some interest that the percentage of the mutations that are not G·C to A·T transitions was greater in experiments where mutations were measured in a shuttle vector plasmid (183, 185) rather than in an endogenous gene (178, 179, 184, 186-189). It is possible that the extrachromosomal location of such plasmids renders them more likely to suffer damage by strand breaks and depurination. It is also possible that such damage in shuttle vector sequences is not repaired efficiently. Several studies have shown a nonrandom distribution of mutations in the HPRT gene or other target genes of cells treated with MNU or MNNG (178, 179, 186-191). Certain DNA sequences are mutational hot-spots, particularly the second G in the sequence GGR. A similar bias in the production of mutations has been observed in experiments studying mutations in defined target genes in E. coli wherein the sequence RG is frequently mutated (191-194). There is an interesting parallel between these results and findings of mutation of the K-ras or Ha-ras gene in nitrosamine-induced tumors, wherein activating G-to-A transitions occur exclusively in the second G of the GGA of codon 12 (100, 196-198).
Some of the sequence specificity in mutational spectra resulting from these methylating agents may be attributed to the differences in rates of repair by the alkyltransferase (see Section III,A,3 and reference 178), but an additional contributing factor is likely to be the nonrandom alkylation of DNA (98, 100, 109, 199-202) or sequence context effects on the miscoding of modified guanines (176, 203). The contention that repair by alkyltransferase is not the major factor is strengthened by studies showing that inactivation by O6-benzylguanine does not alter the distribution of mutations caused by MNNG in human fibroblasts (179). Similarly, the bias toward mutants at the RG sites was also apparent in E. coli strains lacking alkyltransferase (181). There was also a bias in mutations toward those derived from the nontranscribed strand of the DNA both in mammalian cells (178, 186-189) and in E. coli (181). This bias is probably greater than can be accounted for by the possible bias in mutation selection. Again, alkyltransferase inactivation (179) or absence (181) did not affect this bias, suggesting that it is more likely to be due to the alteration in the formation of alkylation products or their conversion to mutations than to differential repair. However, there are reports of more rapid removal of O6-alkylguanine from the transcribed strand, although it has not been shown definitively that this is due to alkyltransferase (109, 110).

b. Ethylation and Higher Alkylation. The formation of mutants by ethylating agents such as EMS, ENU, and ENNG is more complex than with related methylating agents, particularly in mammalian cells (174, 187, 190, 191, 195, 204-206). This probably reflects the fact that ethylating agents produce a greater variety of potentially miscoding lesions in relatively similar amounts. Exposure to ethylating agents results in G·C to A·T transitions but also A·T to G·C transitions. The latter correlates with the miscoding potential of O4-alkylthymine (1-7, 175, 177). Ethylating agents such as ENU form much more O4-alkylthymine than do methylating agents. In mammalian cells, ethylating agents also cause a substantial number of A·T to T·A and of A·T to C·G transversions. In E. coli, the predominant mutations produced by ethylation are G·C to A·T transitions and A·T to G·C transitions, expected from the formation of O6-ethylguanine and O4-ethylthymine, respectively. The incidence of mutations induced by ethylating agents is increased when alkyltransferase-deficient cells are employed (205-208). As the size of the alkyl group increases, adducts at the O6-position become increasingly better substrates for excision repair; in E. coli this becomes a predominant method of repair (209). Nevertheless, alkyltransferase activity, particularly that of Ogt, which is better than Ada at the repair of longer chain alkylation products (89), plays a significant role in reduction of mutations by ethylation and propylation (104, 207). The contribution of alkyltransferase to the repair of O6-substituted-guanine damage at codon 12 (GGA) of the rat H-ras coding sequence decreases as the O6-substituent is changed from methyl to ethyl to benzyl (104). However, treatment of human fibroblasts with ENU led to a much higher frequency of mutations at the middle position of the GGC sequence at the twelfth codon of the H-ras proto-oncogene than at the first position (210).
This result is similar to that found with O6-methylguanine incorporated into a GGC triplet (105) as described above. There was also a strong bias for mutations arising from the nontranscribed strand (210). Preferential repair of O6-ethylguanine from the transcribed strand of the β-actin gene has been reported (109). The ability of excision repair to remove O6-ethylguanine and O6-butylguanine from DNA in human lymphocytes and leukemic cells was shown clearly when O6-benzylguanine was used to block alkyltransferase. This reduced the rate of O6-ethylguanine elimination but did not abolish it (211). Removal of O6-butylguanine did not correlate with alkyltransferase activity (211). There was a significantly greater incidence of G·C to A·T transitions in Mer- human fibroblasts (205) or B-lymphoblastoid cells (206) treated with ENU, compared to an equivalent Mer+ line, but no difference in the frequencies of other mutations, including A·T to G·C transitions. This suggests that the human alkyltransferase repairs O6-ethylguanine but not O4-ethylthymine in these cells. Attempts have been made to determine the relative contributions of excision repair and alkyltransferase to the repair of O6-ethylguanine and O4-ethylthymine in mammalian cells. Mutations produced in two EBV-transformed human B-lymphoblastoid cell lines, which were positive and negative for alkyltransferase activity, were compared with mutations produced in a similar cell line from a xeroderma pigmentosum patient lacking excision repair (206). The effects of O6-benzylguanine on mutation frequency were also studied (212). Lack of either pathway led to a significant increase in mutations and to an increase in G·C to A·T transitions likely to be caused by O6-ethylguanine. The frequency of A·T to G·C transitions presumably resulting from O4-ethylthymine was not altered, suggesting that this product is not repaired sufficiently rapidly by either pathway to influence the production of mutants. Further studies supported these conclusions by assaying the loss of ethylated bases from the DNA (213). The repair of ethylated pyrimidines, including O4-ethylthymine, was slow and was not affected by alkyltransferase or excision repair, whereas the efficient removal of O6-ethylguanine required the presence of both pathways. It was concluded that alkyltransferase and nucleotide excision repair cooperate in the repair of O6-ethylguanine, but the mechanism of this cooperation is unknown and the generality of the conclusion is not yet established (214). Some confirmation of these results in other cell types would be useful.
In Mer+ human fibroblasts, competent in excision repair, inactivation of the alkyltransferase by O6-benzylguanine had little effect on killing or mutagenesis by ENU. In Mer+ fibroblasts, which also had the xeroderma pigmentosum defect in excision repair, there was a significant increase in mutations when O6-benzylguanine was used (V. Maher and A. E. Pegg, unpublished). These results show that either pathway is able to remove the lesions efficiently in the time period allowed for repair. Plasmids containing a single O6-alkylguanine or O4-alkylthymine in a defined site do lead to mutations, but the frequency of mutation is quite low (4, 10, 104, 175-177, 194, 201, 215-217). This suggests that these plasmids can be repaired quite effectively. Removing the alkyltransferase activity either by giving a saturating dose of methylating agents or by using an ada- ogt- strain increased the mutation frequency slightly in some experiments, although the occurrence of mutations was still lower than expected. This suggests that some other repair functions may be able to act effectively when the number of adducts is very low. This could be mismatch repair or, more probably, it may reflect an adduct-induced strand bias in plasmid DNA replication (10), which would not be distinguishable from repair in these experiments. It should be noted that other studies suggest that no repair of O6-alkylguanine occurs in E. coli strains in which excision repair and all alkyltransferase activity have been removed (208).

c. Chloroethylation. BCNU induces primarily G·C to T·A transversions at the hemizygous adenine phosphoribosyl transferase (EC 2.4.2.7) gene target in alkyltransferase-deficient CHO cells (218). The G·C to A·T transitions, likely to be caused by an adduct at the O6-position of guanine, were the second most frequent mutations but amounted to only 16% of the total. The frequency of both of these mutations was reduced markedly in cells transformed to express high levels of alkyltransferase, suggesting that the lesions causing these mutations are both repaired by this protein (219). The source of the G·C to T·A transversions is not known but could be due to the formation of an abasic site during resolution, to bypass of the interstrand cross-link, or to coding of either the O6-(2-chloroethyl)guanine or the 1,O6-ethanoguanine adducts for adenine.

4. PROTECTION AGAINST CARCINOGENICITY CAUSED BY ALKYLATING AGENTS

Evidence from animal and cell culture systems indicates that repair of O6-methylguanine protects cells from malignant conversion. It has been known for some time that carcinogenic alkylnitrosoureas induce tumors preferentially in tissues with low alkyltransferase activity (1-3, 220-223). Rodent cell variants with low alkyltransferase activity undergo malignant conversion with much higher frequency than their counterpart cells with high alkyltransferase activity (224). Preliminary studies in monkeys treated with ENU indicate that those with low alkyltransferase activities in the gastric mucosa are most sensitive to the induction of gastric cancer (225). Mutant animals defective in the alkyltransferase gene would be expected to reflect the importance of alkyltransferase-mediated DNA repair in vivo by showing increased neoplastic susceptibility to methylating agents. On the other hand, increasing alkyltransferase activity in specific target tissues would be expected to decrease the initiation of neoplasia by methylating carcinogens through repair of mutagenic DNA lesions. Tumorigenesis could then be prevented if repair occurs prior to formation of point mutations in critical oncogenes or tumor suppressor genes. This hypothesis has been tested recently with the successful generation of transgenic animals expressing high levels of alkyltransferase activity (reviewed in 226). The production of "gene knock-out" mice in which the alkyltransferase has been deleted has not yet been reported, but is in progress in several laboratories; such animals will be useful in adding to the evidence implicating alkyltransferase in protection against tumor production. Another way to approach this question is to use a potent alkyltransferase inactivator such as O6-benzylguanine to block activity during treatment with a chemical carcinogen. Initial attempts to do this during treatment of rats with MNU and ENU showed only a small effect on tumor incidence (227), but the dose of nitrosoureas used may have been too high and the dose of O6-benzylguanine was probably not adequate to alter significantly the persistence of O6-alkylguanines in the DNA. Additional studies using a higher dose of O6-benzylguanine to produce a greater increase in the persistence of O6-alkylguanines are in progress (228).

B. Transgenic Expression of Alkyltransferase

Transgenic mice carrying the bacterial or human alkyltransferases offer a unique model with which to study the protective role of the alkyltransferase in the genotoxicity induced by N-nitroso compounds (229-237). A summary of the current experiments with such mice has appeared (226). The first successful transgenic mice expressing the ada gene were generated by introduction of the ada gene sequence attached to the Chinese hamster metallothionein-I gene promoter, which was inducible up to 8-fold by exposure to zinc (229). The metallothionein promoter targeted Ada production to the liver and testes. These transgenic mice showed significantly reduced rates of development of liver tumors after treatment with dimethylnitrosamine or diethylnitrosamine (233, 236). A second group of transgenic mice expressing Ada was produced using a chimeric gene consisting of the ada gene and a phosphoenolpyruvate (GTP) carboxykinase (EC 4.1.1.32) promoter (231). This promoter allowed targeting to the liver and kidney and resulted in increased expression of liver activity in animals fed a high-protein diet. Several experiments demonstrate expression of the human alkyltransferase in transgenic mice. One used a transgene consisting of the mouse metallothionein-I promoter with the human alkyltransferase from CEM cells in C57BL/6 × DBA F1 mice (230, 234). Alkyltransferase expression was mainly in the liver, small intestine, and testes (~3- to 20-fold above basal activity), but small increases were also observed in brain, colon, pancreas, ovary, and kidney. Overexpression of the human alkyltransferase in brain and liver of transgenic mice was produced using the alkyltransferase cDNA with a portion of the human transferrin 5'-flanking region (235). Alkyltransferase levels in the brain and liver were 150- and 25-fold greater than in nontransgenics, respectively.
A chimeric gene consisting of chicken β-actin promoter, human alkyltransferase cDNA from VACO 8 human colon cancer cells, the poly(A) region from bovine growth hormone, and the locus control region from the human CD2 gene produced very high levels of alkyltransferase in the thymus (237). These transgenic mice carrying the human alkyltransferase gene targeted to T-cells had an overall lower incidence of thymic lymphomas (84% in nontransgenic versus 10% in transgenic mice) when treated with a single dose of MNU (237). Rapid O6-methylguanine repair due to enhanced levels of alkyltransferase in these transgenic mice was responsible for blocking the initiation of MNU-induced carcinogenesis (238).
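The size of the protective effect reported for the thymus-targeted transgene (84% lymphoma incidence in nontransgenic mice versus 10% in transgenic mice, reference 237) can be expressed as standard risk measures. The sketch below uses only those two percentages. Group sizes are not given in the text, so no significance test is attempted, and the function name and returned keys are illustrative choices rather than anything taken from the cited study.

```python
def risk_summary(control_incidence: float, treated_incidence: float) -> dict:
    """Summarize the difference between two incidences given as fractions.

    `control_incidence` is the incidence without protection (here the
    nontransgenic mice) and `treated_incidence` the incidence with it
    (here the transgenic mice).
    """
    if not 0.0 < control_incidence <= 1.0:
        raise ValueError("control incidence must be in (0, 1]")
    if not 0.0 <= treated_incidence <= 1.0:
        raise ValueError("treated incidence must be in [0, 1]")
    relative_risk = treated_incidence / control_incidence
    return {
        "absolute_risk_reduction": control_incidence - treated_incidence,
        "relative_risk": relative_risk,
        "relative_risk_reduction": 1.0 - relative_risk,
    }


if __name__ == "__main__":
    # Incidences quoted in the text for MNU-treated mice (reference 237).
    summary = risk_summary(control_incidence=0.84, treated_incidence=0.10)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

On these figures the transgene removes roughly seven-eighths of the lymphoma risk seen in nontransgenic animals, which is consistent with the description of a markedly lower overall incidence.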

VI. Inactivation of Alkyltransferase to Enhance Chemotherapy

The unique activity of alkyltransferase suggests that it is an ideal target for biochemical modulation. The efficient repair of toxic lesions formed at the O6-position of guanine without additional enzymes or cofactors provides a less complex target to modulate than other DNA repair proteins. Furthermore, the high degree of correlation that exists between alkyltransferase activity and sensitivity to nitrosoureas indicates that elimination of this protein may reverse resistance in many cases. Successful modulation of tumor drug resistance requires an increase in the therapeutic response without an equivalent increase in the toxicity. Two methods have been used to overcome alkylnitrosourea resistance by inactivation of alkyltransferase. One uses methylating agents that indirectly decrease alkyltransferase levels by introducing O6-methylguanine residues in DNA that are then repaired by the alkyltransferase (239-243). The second method uses direct alkyltransferase inactivators such as O6-methylguanine (75, 76, 160, 244, 245) or O6-benzylguanine (78).

A. Methylating Agents

Methylating agents such as MNNG or streptozotocin form sufficient O6-methylguanine DNA adducts to deplete alkyltransferase activity in vitro (239). Exposure of colon tumor cells to streptozotocin therefore results in a depletion of alkyltransferase activity, an increase in BCNU-induced interstrand cross-linking, and a two- to three-log enhancement of BCNU cytotoxicity in vitro (242, 243, 246). Temozolomide, dacarbazine, and streptozotocin also depleted alkyltransferase levels in human lymphocytes (241, 247-249). The promising data from studies on cultured cells led to the evaluation of FDA-approved methylating agents (streptozotocin or dacarbazine) with BCNU or fotemustine in human clinical trials. The results indicate that there is a depletion in alkyltransferase by streptozotocin in peripheral blood mononuclear cells in the range necessary to produce sensitivity to CENUs. Clinical trials with streptozotocin and BCNU on carcinoid tumors produced some encouraging results; however, thrombocytopenia was the dose-limiting toxicity (250, 251). When BCNU is combined with streptozotocin, the maximal tolerated dose of BCNU is reduced to 50% of that when BCNU is used as a single agent, and platelet suppression occurs earlier (251). Dacarbazine treatment combined with fotemustine was effective against malignant melanoma (252), although a rapid fatal pulmonary toxicity appeared in some patients treated with this combination (253). Human tumor xenograft studies in nude mice demonstrated that although depletion of alkyltransferase activity can be demonstrated in tissues and tumors with methylating agents, this does not result in an increase in the therapeutic index of BCNU (254, 255). Doses of methylating agent resulting in a modest depletion of alkyltransferase activity are quite toxic prior to the addition of BCNU. Equally important to the added toxicity are the mutagenic and carcinogenic properties of the methylating agents (1-3). These properties may limit the clinical usefulness of combining methylating agents with CENUs.

B. O6-Alkylguanines

An alternative method of alkyltransferase depletion involves the use of O6-alkylguanines. The first O6-alkylguanine developed as a potential inactivator of the alkyltransferase was O6-methylguanine (75, 76, 160, 244, 245). Exposure of cells or cell extract to millimolar amounts of O6-methylguanine for several hours results in a loss of alkyltransferase activity and subsequent increase in the sensitivity of tumor cells to alkylating agents (reviewed in 21, 22). Although the results looked promising, the maximal reduction that could be achieved was about 85%, presumably due to the slow inactivation process such that the new steady-state alkyltransferase concentration resulting from inactivation and de novo synthesis left 15-20% of the activity remaining. There was no enhancement of the therapeutic index of BCNU when combined with O6-methylguanine to treat mice carrying human tumor xenografts (256). The problems associated with the use of O6-methylguanine include poor solubility, poor affinity for the alkyltransferase, poor uptake into cells, and lack of selectivity of alkyltransferase reduction, resulting in a requirement for very high doses of O6-methylguanine. O6-Benzylguanine is a much more promising agent for the therapeutically useful depletion of alkyltransferase (55, 78, 257). Inactivation of the alkyltransferase protein in Mer+ cells renders these cells more sensitive to the cytotoxic effects of alkylating agents, including BCNU, CCNU, ACNU, chlorozotocin, clomesone, streptozotocin, procarbazine, and dacarbazine (55, 78, 257-264). There is a strong correlation between the degree of enhancement and the level of alkyltransferase activity, with little or no enhancement observed in Mer- cells and the greatest enhancement in cells with high alkyltransferase activity. Enhancement does not occur with alkylating agents that do not produce a toxic lesion at the O6-position of guanine in DNA, such as cisplatin and melphalan (258). Treatment with O6-benzylguanine increases the cytotoxicity of BCNU in both oxic and hypoxic brain tumor cells (265). Attempts have been made to increase the sensitivity of cells to BCNU even further by adding agents to the regimen that act by a different mechanism. Thus, the combination of α-difluoromethylornithine, which blocks polyamine synthesis, with O6-benzylguanine potentiated the killing of cultured brain tumor cells by BCNU to a greater extent than when either agent was used alone. However, this enhancement was not seen in animal studies, possibly because exposure to exogenous polyamines was not prevented (A. Sarkar, D. Deen, M. E. Dolan, A. E. Pegg and L. J. Marton, unpublished). Glutathione may protect cells from CENUs. To assess the relative contributions of alkyltransferase and glutathione toward resistance to BCNU, L-buthionine sulfoximine, which inhibits glutathione synthesis, and/or O6-benzylguanine were added to breast tumor cells that express high levels of glutathione-S-transferase. The treatment with O6-benzylguanine led to a much larger increase in the sensitivity of the cells to BCNU, and L-buthionine sulfoximine produced no further increase. This suggests that the alkyltransferase protein plays the major role in resistance to BCNU. The mechanism of enhancement of BCNU toxicity by O6-benzylguanine involves an increase in the number of interstrand cross-links (159). Some studies indicate a lack of correlation between alkyltransferase activity and sensitivity of cells to bifunctional alkylating agents (24-27, 263, 266).
This may be due to repair mechanisms other than the alkyltransferase present in these cells. There are also some isolated reports of a lack of enhancement of BCNU cytotoxicity by 06-benzylguanine in lymphoma and glioma cells expressing alkyltransferase activity (263, 266). These observations are in such sharp contrast to numerous reports (55, 78, 257-262, 264, 265) of enhancement by 06-benzylguanine of the toxicity of BCNU toward Mer+ tumor cells that some reexamination of these experiments and of the alkyltransferase status of the cells should be considered. A more water-soluble 06-benzylguanine derivative, Nz-acetyl-06-benzylguanosine, was suggested to be useful for increasing the sensitivity of human melanomas to CENUs (267). However, unless it is metabolized to 0 6 benzylguanine, its activity as an alkyltransferase inhibitor would not be expected to be very high, because we have shown (79, 80; Table 11) that both 06-benzylguanosine and N2-acetyl-06-benzylguanineare very much poorer

inactivators of the human alkyltransferase than O⁶-benzylguanine. A molecule with both modifications would therefore be expected to be even less active than either O⁶-benzylguanosine or N²-acetyl-O⁶-benzylguanine.

Alkyltransferase activity in tissues and tumors of rodents is depleted with doses as low as 10 mg/kg O⁶-benzylguanine, although higher doses are needed for effective sensitization to chloroethylnitrosoureas (159, 227, 228, 268-274). Treatment of nude mice carrying SF767 tumors with O⁶-benzylguanine prior to MeCCNU or BCNU leads to a significant inhibition of tumor growth as compared to the CENU alone (159, 268). There were 14/15 tumor regressions in mice carrying the D341MED and 10/10 in mice carrying the D456MG brain tumor xenografts when treated with O⁶-benzylguanine prior to BCNU, compared to 1/16 and 1/10, respectively, for animals treated with BCNU alone (269, 270). Xenografts of the human rhabdomyosarcoma line, TE-671, were also sensitized to BCNU by O⁶-benzylguanine, with 8/10 regressions for the combination and 0/10 for mice treated with BCNU alone (269, 270). Further studies, using the intracranial D341MED medulloblastoma model, showed a significant increase in median survival in animals treated with O⁶-benzylguanine prior to BCNU compared to BCNU alone (271).

The ability of O⁶-benzylguanine to reverse BCNU resistance in colon tumor xenografts was also evaluated (159, 272, 273). Colon tumors with low alkyltransferase activity responded well to BCNU alone and did not benefit significantly from the addition of O⁶-benzylguanine. In contrast, tumors with intermediate to high alkyltransferase activity were resistant to BCNU alone and required O⁶-benzylguanine in combination with BCNU for tumor growth inhibition (273). Tumor growth inhibition was also observed in the Dunning rat prostate tumor model (274). Toxicity associated with the combination of BCNU and O⁶-benzylguanine in rats, mice, and dogs included bone marrow suppression, loss of intestinal crypts, and a decreased number of lymphocytes in the spleen (274-276), but there was an improvement in the chemotherapeutic index when tumors containing higher levels of alkyltransferase were treated with this combination rather than BCNU alone using equitoxic doses (159, 269, 270, 272).

Clinical trials combining O⁶-benzylguanine with BCNU have begun. The usefulness of O⁶-benzylguanine and of the inactivation of alkyltransferase for cancer chemotherapy should become clearer when the results of these trials are known. However, although O⁶-benzylguanine is clearly the most promising compound for alkyltransferase inactivation at present, it is not an ideal drug because it has only limited water solubility and rapid plasma clearance. Clearance is due to urinary excretion and metabolism to O⁶-benzyl-8-oxoguanine, N²-acetyl-O⁶-benzyl-8-oxoguanine, N²-acetyl-O⁶-benzylguanine, and debenzylated products (277). Acetylation to form the N²-acetylated

derivatives is a species-specific reaction, observed in rats but not mice. Following intravenous injection of O⁶-benzylguanine in rats, dogs, and nonhuman primates, the plasma profile was best described by a one-compartment model. The plasma half-life was 1.5 hours for rats (M. E. Dolan and E. Gupta, unpublished observations) and 3.4-6.3 hours for Beagle dogs (276). The half-lives in plasma and cerebrospinal fluid of nonhuman primates were 1.5 and 1.9 hours, respectively (278). It has been estimated that at least 37% of the parent drug in rats is oxidized to O⁶-benzyl-8-oxoguanine (M. E. Dolan and E. Gupta, unpublished). This metabolite was found to be only slightly less potent as an inactivator of the alkyltransferase protein than O⁶-benzylguanine (277). This product is formed primarily by the cytochrome P450 microsomal system. Studies using membrane fractions expressing high levels of various human P450 forms indicate that O⁶-benzylguanine is metabolized primarily by CYP1A2 (Km = 1.3 µM) and to a lesser extent by CYP3A4 (Km = 52.2 µM) (M. E. Dolan and S. K. Roy, unpublished). It is expected that there will be interindividual variations in the extent to which this reaction occurs in patients; in some cases the metabolite may be primarily responsible for inactivation of alkyltransferases.

The low solubility of O⁶-benzylguanine has led to its formulation in a polyethylene glycol-400-based vehicle (279). This formulation was effective in sensitizing D456MG glioblastoma xenografts in nude mice to BCNU (279) at doses of O⁶-benzylguanine lower than those used in the original experiments that employed a Cremophor EL vehicle (159, 268-273), but its general suitability for effective, prolonged modulation of alkyltransferase has not been established. Effective modulation requires a prolonged period of inactivation to allow the full conversion of the O⁶-(2-chloroethyl)guanine adducts to interstrand cross-links (261, 262).
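The one-compartment description mentioned above can be illustrated with a short calculation. The following is only a minimal sketch, assuming an intravenous bolus with purely first-order elimination, so that C(t)/C0 = exp(-kt) with k = ln 2 / t½. The half-lives are those quoted in the text; the function names, the dictionary labels, and the 4-hour time point are hypothetical choices made for illustration and are not taken from the cited studies.

```python
import math

# Half-lives (hours) quoted in the text, stored as (low, high) pairs.
# Only the dog value was reported as a range (3.4-6.3 h).
# The dictionary labels are illustrative, not from the cited studies.
HALF_LIFE_H = {
    "rat plasma": (1.5, 1.5),
    "dog plasma": (3.4, 6.3),
    "nonhuman primate plasma": (1.5, 1.5),
    "nonhuman primate CSF": (1.9, 1.9),
}


def elimination_rate(half_life_h):
    """First-order rate constant k (per hour) for a given half-life."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h, half_life_h):
    """C(t)/C0 for a one-compartment model after an IV bolus (assumed)."""
    return math.exp(-elimination_rate(half_life_h) * t_h)


if __name__ == "__main__":
    t = 4.0  # hours after dosing; an arbitrary illustrative time point
    for label, (low, high) in HALF_LIFE_H.items():
        f_low = fraction_remaining(t, low)
        f_high = fraction_remaining(t, high)
        print(f"{label}: {f_low:.3f} to {f_high:.3f} of C0 left at {t} h")
```

Under these assumptions the drug falls to one-half of its initial plasma concentration after one half-life and to one-quarter after two, which is why a short half-life works against the prolonged alkyltransferase inactivation that effective modulation requires.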
As noted above, mutant alkyltransferases might be selected in tumor cells after exposure to O⁶-benzylguanine and a nitrosourea, leading to a population of tumor cells refractory to the inhibitor. The newer, more potent inactivators described in Section III,B or some compounds based on these leads may be of value in overcoming some of these problems. If a suitably active compound can be obtained, it may be possible to give a single dose in an aqueous vehicle that produces a sufficiently long-lasting inactivation of the alkyltransferase in tumor tissues. Also, although the P140A and G156A mutant human alkyltransferases described above were more resistant than the control alkyltransferase to inactivation by 2,4-diamino-6-benzyloxy-5-nitrosopyrimidine, there was much less of a difference than with O⁶-benzylguanine, and inactivation by this pyrimidine could be achieved at concentrations likely to be achievable physiologically (T. M. Crone, R. C. Moschel and A. E. Pegg, unpublished).

Although in the experimental animal xenograft models described above there was an improvement in the therapeutic index of BCNU when combined with O⁶-benzylguanine, additional problems with this adjuvant approach involving a non-tumor-specific inhibitor such as O⁶-benzylguanine may arise from a general increase in toxicity of the alkylating agent (275, 276). One approach to dealing with this would be the synthesis of alkyltransferase inhibitors that are directed more selectively toward tumor tissues. The database of some 60 compounds tested as alkyltransferase inhibitors provides some clues about modifications that might lead to tumor specificity. Finally, a potential way to avoid the dose-limiting bone marrow toxicity associated with nitrosourea therapy would be to transfect stem cells with a retroviral vector expressing either the Ada alkyltransferase or a mutant mammalian alkyltransferase resistant to inactivation by O⁶-benzylguanine. Thus, the detailed knowledge of alkyltransferase structure, function, and mechanism of action that has become available should enable an in-depth study of the therapeutic opportunities afforded by this protein as a target for drug design and as a means of modulating the toxic effects of alkylating agents.

VII. Glossary

alkyltransferase    O⁶-alkylguanine-DNA alkyltransferase (EC 2.1.1.63)
HPRT                hypoxanthine-guanine phosphoribosyltransferase (EC 2.4.2.8)
Ada                 product of the E. coli ada gene regulating the adaptive response
Ada-C               the carboxyl-terminal 179 amino acids from the Ada protein
Ogt                 product of the E. coli ogt gene for constitutive repair of O⁶-alkylguanine damage
AdaST               product of the S. typhimurium adaST gene for the repair of O⁶-methylguanine damage
AdaST-C             the carboxyl-terminal 178 amino acids from the AdaST protein
AdaB                product of the inducible B. subtilis adaB gene for the repair of O⁶-methylguanine damage
Dat                 product of the B. subtilis dat gene for constitutive repair of O⁶-alkylguanine damage
Mer⁻ phenotype      the lack of expression of activity repairing O⁶-methylguanine damage
procarbazine        N-(1-methylethyl)-4-[(2-methylhydrazino)methyl]benzamide
dacarbazine         5-(3,3-dimethyl-1-triazeno)imidazole-4-carboxamide
temozolomide        8-carbamoyl-3-methylimidazo[5,1-d]-1,2,3,5-tetrazin-4(3H)-one
fotemustine         diethyl-1-[3-(2-chloroethyl)-3-nitrosoureido]ethylphosphonate
clomesone           2-chloroethyl(methylsulfonyl)methanesulfonate
MNNG                N-methyl-N'-nitro-N-nitrosoguanidine
ENNG                N-ethyl-N'-nitro-N-nitrosoguanidine
EMS                 ethyl methanesulfonate
MNU                 N-methyl-N-nitrosourea
ENU                 N-ethyl-N-nitrosourea
CENU                chloroethylnitrosourea
BCNU                1,3-bis(2-chloroethyl)-1-nitrosourea
ACNU                1-(4-amino-2-methyl-5-pyrimidinyl)methyl-3-(2-chloroethyl)-3-nitrosourea
CCNU                1-(2-chloroethyl)-3-cyclohexyl-1-nitrosourea
MeCCNU              1-(2-chloroethyl)-3-(4-methylcyclohexyl)-1-nitrosourea

ACKNOWLEDGMENTS

This work has been supported in part by the National Cancer Institute through Grants CA-18137 (AEP), CA-47228 (MED), and CA-57725 (AEP, MED) and by NCI Contract N01-CO-46000 with ABL (RCM). We are most grateful to P. C. E. Moody and his colleagues for providing Fig. 2.

REFERENCES

1. P. D. Lawley, ACS Monogr. 182, 325 (1984).
2. A. E. Pegg, Rev. Biochem. Toxicol. 5, 83 (1983).
3. R. Saffhill, G. P. Margison and P. J. O'Connor, BBA 823, 111 (1985).
4. B. Singer and J. M. Essigmann, Carcinogenesis 12, 949 (1991).
5. R. C. Moschel, W. R. Hudgins and A. Dipple, J. Org. Chem. 51, 4180 (1986).
6. R. C. Moschel, IARC Sci. Publ. 125, 25 (1994).
7. E. L. Loechler, Chem. Res. Toxicol. 7, 277 (1994).
8. P. Karran and M. Bignami, NARes 20, 2933 (1992).
9. S. Ceccotti, E. Dogliotti, J. Gannon, P. Karran and M. Bignami, Bchem 32, 13664 (1993).
10. G. T. Pauly, S. H. Hughes and R. C. Moschel, Bchem 33, 9169 (1994).
11. G. Fritz, J. Dosch, H. W. Thielmann and B. Kaina, JBC 268, 21102 (1993).

12. G. Aquilina, R. Biondo, E. Dogliotti and M. Bignami, Carcinogenesis 14, 2097 (1993).
13. A. Kat, W. G. Thilly, W. Fang, M. J. Longley, G. Li et al., PNAS 90, 6424 (1993).
14. P. Branch, G. Aquilina, M. Bignami and P. Karran, Nature 362, 652 (1993).
15. S. Griffin, P. Branch, Y. Xu and P. Karran, Bchem 33, 4787 (1994).
16. Sibghat-Ullah and R. S. Day, Bchem 31, 7998 (1992).
17. P. Lefebvre and F. Laval, Carcinogenesis 14, 1671 (1993).
18. D. B. Ludlum, Mutat. Res. 233, 117 (1990).
19. T. P. Brent, Pharmacol. Ther. 31, 121 (1985).
20. P. E. Gonzaga, P. M. Potter, T. Niu, D. Yu, D. B. Ludlum et al., Cancer Res. 52, 6052 (1992).
21. A. E. Pegg, Cancer Res. 50, 6119 (1990).
22. A. E. Pegg and T. L. Byers, FASEB J. 6, 2302 (1992).
23. S. Mitra and B. Kaina, This Series 44, 109 (1993).
24. Z. Matijasevic, M. Boosalis, W. Mackay, L. Samson and D. B. Ludlum, PNAS 90, 11855 (1993).
25. F. Ali-Osman, D. E. Stein and A. Renwick, Cancer Res. 50, 6976 (1990).
26. H. S. Friedman, M. E. Dolan, S. H. Kaufmann, O. M. Colvin, O. W. Griffith et al., Cancer Res. 54, 3487 (1994).
27. M. C. Walker, J. R. W. Masters and G. P. Margison, Br. J. Cancer 66, 840 (1992).
28. B. Demple, in "Protein Methylation" (W. K. Paik and S. Kim, eds.), p. 285. CRC Press, Boca Raton, FL, 1990.
29. T. Lindahl, B. Sedgwick, M. Sekiguchi and Y. Nakabeppu, ARB 57, 133 (1988).
30. L. Samson, Mol. Microbiol. 6, 825 (1992).
31. A. Iyama, K. Sakumi, Y. Nakabeppu and M. Sekiguchi, Carcinogenesis 15, 627 (1994).
32. M. H. Moore, J. M. Gulbis, E. J. Dodson, B. Demple and P. C. E. Moody, EMBO J. 13, 1495 (1994).
33. L. E. Ostrowski, C. N. Pegram, M. A. von Wronski, P. A. Humphrey, X. He et al., Cancer Res. 51, 3339 (1991).
34. A. E. Pegg, L. Wiest, C. Mummert and M. E. Dolan, Carcinogenesis 12, 1671 (1991).
35. L. C. Myers, M. P. Terranova, A. E. Ferentz, G. Wagner and G. L. Verdine, Science 261, 1164 (1993).
36. L. C. Myers, G. L. Verdine and G. Wagner, Bchem 32, 14089 (1993).
37. A. Hakura, K. Morimoto, T. Sofuni and T. Nohmi, J. Bact. 173, 3663 (1991).
38. F. Morohoshi, K. Hayashi and N. Munakata, NARes 18, 5473 (1990).
39. D. Bhattacharyya, K. Tano, G. J. Bunick, E. C. Uberbacher, W. D. Behnke et al., NARes 16, 6397 (1988).
40. G. Koike, H. Maki, H. Takeya, H. Hayakawa and M. Sekiguchi, JBC 265, 14754 (1990).
41. S. M. Lee, J. A. Rafferty, R. H. Elder, C. Y. Fan, M. Bromley et al., Br. J. Cancer 66, 355 (1992).
42. A. E. Pegg, M. Boosalis, L. Samson, R. C. Moschel, T. L. Byers et al., Bchem 32, 11998 (1993).
43. C. Chan, Z. Wu, T. Ciardelli, A. Eastman and E. Bresnick, ABB 300, 193 (1993).
44. P. Zak, K. Kleibl and F. Laval, JBC 269, 730 (1994).
45. R. H. Elder, G. P. Margison and J. A. Rafferty, BJ 298, 231 (1994).
46. P. M. Potter, A. Lasiter and T. P. Brent, Methods Mol. Cell. Biol. 4, 139 (1993).
47. T. C. Ayi, K. C. Loh, R. B. Ali and B. F. L. Li, Cancer Res. 52, 6423 (1992).
48. S. E. Morgan, M. R. Kelley and R. O. Pieper, JBC 268, 19802 (1993).
49. N. Zhukovskaya, B. Rydberg and P. Karran, NARes 20, 6081 (1992).
50. T. M. Crone, K. Goodtzova, S. Edara and A. E. Pegg, Cancer Res. 54, 6221 (1994).
51. L. K. Liem, C. W. Wong, A. Lim and B. F. L. Li, JMB 231, 950 (1993).

52. T. P. Brent, M. A. von Wronski, C. C. Edwards, M. Bromley, G. P. Margison et al., Oncol. Res. 5, 1 (1993).
53. M. Belanich, T. L. Ayi, J. T. Kibitel, D. W. Gob, T. Randall et al., Oncol. Res. 6, 129 (1994).
54. S. M. Lee, M. Harris, J. Rennison, A. McGown, M. Bromley et al., Eur. J. Cancer 29A, 1306 (1993).
55. S. L. Gerson, L. Liu, W. P. Phillips, N. H. Zaidi, A. Heist et al., Proc. Am. Assoc. Cancer Res. 35, 699 (1994).
55a. P. J. Kraulis, J. Appl. Crystallogr. 24, 946 (1990).
56. W. Yang, W. A. Hendrickson, R. J. Crouch and Y. Satow, Science 249, 1398 (1990).
57. J. W. R. Schwabe and A. Travers, Curr. Biol. 3, 628 (1993).
58. K. Kohda, I. Terashima, N. Sawada, I. Nozaki, M. Yasuda et al., Chem. Res. Toxicol. 5, 8 (1992).
59. T. E. Spratt and H. de los Santos, Bchem 32, 3688 (1992).
60. A. E. Pegg, L. Wiest, C. Mummert, L. Stine, R. C. Moschel et al., Carcinogenesis 12, 1679 (1991).
61. T. C. Ayi, H. K. Oh, T. K. Y. Lee and B. F. L. Li, Cancer Res. 54, 3726 (1994).
62. M. Takahashi, K. Sakumi and M. Sekiguchi, Bchem 29, 3431 (1990).
63. R. J. Graves, B. F. L. Li and P. F. Swann, Carcinogenesis 10, 661 (1989).
64. M. G. Fried and D. M. Crothers, JMB 172, 263 (1984).
65. B. J. Terry, W. E. Jack and P. Modrich, JBC 260, 13130 (1985).
66. K. Tano, D. Bhattacharyya, R. S. Foote, R. J. Mural and S. Mitra, J. Bact. 171, 1535 (1989).
67. L. C. Harris, P. M. Potter and G. P. Margison, BBRC 187, 425 (1992).
68. K. Ihara, H. Kawate, L. L. Chueh, H. Hayakawa and M. Sekiguchi, MGG 243, 379 (1994).
69. T. M. Crone and A. E. Pegg, Cancer Res. 53, 4750 (1993).
70. C. Ling-Ling, T. Nakamura, Y. Nakatsu, K. Sakumi, H. Hayakawa et al., Carcinogenesis 13, 837 (1992).
71. R. H. Elder, J. Tumelty, K. T. Douglas and G. P. Margison, BJ 285, 707 (1992).
72. J. A. Rafferty, J. Tumelty, M. Skorvaga, R. H. Elder, G. P. Margison et al., BBRC 199, 285 (1994).
73. K. Kohda, M. Yasuda, N. Sawada, K. Itano and Y. Kawazoe, Chem.-Biol. Interact. 78, 23 (1991).
74. Y. Yamagata, K. Kohda and K. Tomita, NARes 16, 9307 (1988).
75. M. E. Dolan, K. Morimoto and A. E. Pegg, Cancer Res. 45, 6413 (1985).
76. P. Karran, PNAS 82, 5285 (1985).
77. K. Morimoto, M. E. Dolan, D. Scicchitano and A. E. Pegg, Carcinogenesis 6, 1027 (1985).
78. M. E. Dolan, R. C. Moschel and A. E. Pegg, PNAS 87, 5368 (1990).
79. R. C. Moschel, M. G. McDougall, M. E. Dolan, L. Stine and A. E. Pegg, J. Med. Chem. 35, 4486 (1992).
80. M.-Y. Chae, M. G. McDougall, M. E. Dolan, K. Swenn, A. E. Pegg et al., J. Med. Chem. 37, 342 (1994).
81. C. E. Arris, C. Bleasdale, A. H. Calvert, N. J. Curtin, C. Dalby et al., Anti-Cancer Drug Des. 9, 401 (1994).
82. M.-Y. Chae, K. Swenn, S. Kanugula, M. E. Dolan, A. E. Pegg et al., J. Med. Chem. 38, 359 (1995).
83. Y. F. Shealy, J. D. Clayton, C. A. O'Dell and J. A. Montgomery, J. Org. Chem. 27, 4518 (1962).
84. J. Kosary, E. Diesler, P. Matyus and E. Kasztreiner, Acta Pharm. Hung. 59, 241 (1989).
85. W. Pfleiderer and R. Lohrmann, Chem. Ber. 94, 12 (1961).

86. M. E. Dolan, A. E. Pegg, L. L. Dumenco, R. C. Moschel and S. L. Gerson, Carcinogenesis 12, 2305 (1991).
87. K. Goodtzova, T. Crone and A. E. Pegg, Bchem 33, 8385 (1994).
88. G. M. Ciocco, A. E. Pegg, M. Chae and R. C. Moschel, Proc. Am. Assoc. Cancer Res. 35, 394 (1994).
88a. S. Kanugula, K. Goodtzova, S. Edara and A. E. Pegg, Bchem in press (1995).
89. M. C. Wilkinson, P. M. Potter, L. Cawkwell, P. Georgiadis, D. Patel et al., NARes 17, 8475 (1989).
90. T. P. Brent, M. E. Dolan, H. Fraenkel-Conrat, J. Hall, P. Karran et al., PNAS 85, 1759 (1988).
91. M. Sassanfar, M. K. Dosanjh, J. M. Essigmann and L. Samson, JBC 266, 2767 (1991).
92. S. M. O'Toole, A. E. Pegg and J. A. Swenberg, Cancer Res. 53, 3895 (1993).
93. L. Cruzeiro-Hansson and J. M. Goodfellow, Carcinogenesis 15, 1525 (1994).
94. L. K. Liem, A. Lim and B. F. L. Li, NARes 22, 1613 (1994).
95. L. A. Peterson, X. Liu and S. S. Hecht, Cancer Res. 53, 2780 (1993).
96. W. J. Mackay, S. Han and L. D. Samson, J. Bact. 176, 3224 (1994).
97. D. Scicchitano, R. A. Jones, S. Kuzmich, B. Gaffney, D. D. Lasko et al., Carcinogenesis 7, 1383 (1986).
98. M. E. Dolan, M. Oplinger and A. E. Pegg, Carcinogenesis 9, 2139 (1988).
99. A. E. Pegg and M. E. Dolan, in "DNA Repair Mechanisms and Their Biological Implications in Mammalian Cells" (M. W. Lambert and J. Laval, eds.), p. 45. Plenum, New York, 1989.
100. M. D. Topal, Carcinogenesis 9, 691 (1988).
101. P. Georgiadis, C. A. Smith and P. F. Swann, Cancer Res. 51, 5843 (1991).
102. C. Wong, N. Tan and B. F. L. Li, JMB 228, 1137 (1992).
103. R. E. Bishop and R. C. Moschel, Chem. Res. Toxicol. 4, 647 (1991).
104. R. E. Bishop, L. L. Dunn, G. T. Pauly, M. E. Dolan and R. C. Moschel, Carcinogenesis 14, 593 (1993).
105. V. Pletsa, A. Gentil, A. Margot, J. Armier, S. A. Kyrtopoulos et al., NARes 20, 4897 (1992).
106. G. Mitra, G. T. Pauly, R. Kumar, G. K. Pei, S. H. Hughes et al., PNAS 86, 8650 (1989).
107. S. S. Bentivegna and E. Bresnick, Cancer Res. 54, 327 (1994).
108. S. Boiteux, R. Costa de Oliveira and J. Laval, JBC 260, 8711 (1985).
109. J. Thomale, K. Hochleitner and M. F. Rajewsky, JBC 269, 1681 (1994).
110. A. T. Gordon, F. C. R. Manning, D. P. Cooper, G. P. Margison, P. J. O'Connor et al., Biochem. Soc. Trans. 21, 374S (1993).
111. T. Ishibashi, Y. Nakabeppu and M. Sekiguchi, JBC 269, 7645 (1994).
112. B. Kaina, G. Fritz and T. Coquerelle, Environ. Mol. Mutagen. 22, 283 (1993).
113. A. T. Natarajan, S. Vermeulen, F. Darroudi, M. B. Valentine, T. P. Brent et al., Mutagenesis 7, 83 (1992).
114. A. Shiraishi, K. Sakumi, Y. Nakatsu, H. Hayakawa and M. Sekiguchi, Carcinogenesis 13, 289 (1992).
115. Y. Nakatsu, K. Hattori, H. Hayakawa, K. Shimizu and M. Sekiguchi, Mutat. Res. 293, 119 (1993).
116. L. C. Harris, P. M. Potter, K. Tano, S. Shiota, S. Mitra et al., NARes 19, 6163 (1991).
117. L. C. Harris, P. M. Potter, J. S. Remack and T. P. Brent, Cancer Res. 52, 6404 (1992).
118. G. Fritz and B. Kaina, BBRC 183, 1184 (1992).
119. X. He, L. E. Ostrowski, M. A. von Wronski, H. S. Friedman, C. J. Wikstrand et al., Cancer Res. 52, 1144 (1992).
120. Y. Wang, T. Kato, H. Ayaki, K. Ishizaki, K. Tano et al., Mutat. Res. 273, 221 (1992).

121. R. O. Pieper, J. F. Costello, R. A. Kroes, B. W. Futscher, U. Marathi et al., Cancer Commun. 3, 241 (1991).
122. S. Cairns-Smith and P. Karran, Cancer Res. 52, 5257 (1992).
123. M. von Wronski, L. C. Harris, K. Tano, S. Mitra, D. D. Bigner et al., Oncol. Res. 4, 167 (1992).
124. J. F. Costello, B. W. Futscher, K. Tano, D. M. Graunke and R. O. Pieper, JBC 269, 17228 (1994).
125. L. C. Harris, J. S. Remack and T. P. Brent, BBA 1217, 141 (1994).
126. M. A. von Wronski and T. P. Brent, Carcinogenesis 15, 577 (1994).
127. P. Karran, C. Stephenson, P. Macpherson, S. Cairns-Smith and A. Priestley, Cancer Res. 50, 1532 (1990).
128. B. W. Futscher, K. Campbell and W. S. Dalton, Cancer Res. 52, 5013 (1992).
129. P. M. Potter, L. C. Harris, J. S. Remack, C. C. Edwards and T. P. Brent, Cancer Res. 53, 1731 (1993).
130. M. Fukuhara, H. Hayakawa, K. Sakumi and M. Sekiguchi, Jpn. J. Cancer Res. 83, 72 (1992).
131. P. Lefebvre, P. Zak and F. Laval, DNA Cell Biol. 12, 233 (1993).
132. P. Lefebvre and F. Laval, BBRC 163, 599 (1989).
133. G. Fritz and B. Kaina, BBA 1171, 35 (1992).
134. Y. Habraken and F. Laval, Cancer Res. 51, 1217 (1991).
135. C. L. Chan, Z. Wu, A. Eastman and E. Bresnick, Cancer Res. 52, 1804 (1992).
136. R. E. Wilson, B. Hoey and G. P. Margison, Carcinogenesis 14, 679 (1993).
137. M. Citron, M. Graver, M. Schoenhaus, S. Chen, R. Decker et al., JNCI 84, 337 (1992).
138. P. Coccia, S. Sen, E. Erba, P. Pagani, C. Marinello et al., Cancer Chemother. Pharmacol. 30, 77 (1992).
139. M. Citron, R. Decker, S. Chen, S. Schneider, M. Graver et al., Cancer Res. 51, 4131 (1991).
140. M. Citron, M. Schoenhaus, M. Graver, M. Hoffman, M. Lewis et al., Cancer Invest. 11, 258 (1993).
141. M. Citron, M. Schoenhaus, H. Rothenberg, K. K. Kostroff, P. Wasserman et al., Cancer Invest. 12, 978 (1994).
142. J. Chen, Y. Zhang, C. Wang, Y. Sun, J. Fujimoto et al., Carcinogenesis 13, 1503 (1992).
143. F. Oesch and S. Klein, Cancer Res. 52, 1801 (1992).
144. G. W. Dyke, J. L. Craven, R. Hall and R. C. Garner, Cancer Lett. 68, 169 (1993).
145. J. R. Silber, B. A. Mueller, T. G. Ewers and M. S. Berger, Cancer Res. 53, 3416 (1993).
146. I. Drin, B. Schoket, S. Kostic and I. Vincze, Carcinogenesis 15, 1535 (1994).
147. S. M. Lee, G. P. Margison, N. Thatcher, P. J. O'Connor and D. P. Cooper, Br. J. Cancer 69, 853 (1994).
148. S. M. Lee, P. J. O'Connor, N. Thatcher, D. Crowther, G. P. Margison et al., Cancer Res. 54, 4072 (1994).
149. J. D. Collier, K. Guo, A. D. Burt, M. F. Bassendine and G. N. Major, Lancet 341, 207 (1993).
150. G. Wani, A. A. Wani and S. M. D'Ambrosio, Carcinogenesis 13, 463 (1992).
151. G. Wani, A. A. Wani and S. M. D'Ambrosio, Carcinogenesis 14, 737 (1993).
152. J. C. Baer, A. A. Freeman, E. S. Newlands, A. J. Watson, J. A. Rafferty et al., Br. J. Cancer 67, 1299 (1993).
153. R. B. Mitchell and M. E. Dolan, Cancer Chemother. Pharmacol. 32, 59 (1993).
154. C. Schell, O. Lantermann, W. Popp, C. Vahrenholz, J. Thomale et al., J. Cancer Res. Clin. Oncol. 120, 403 (1994).
155. L. C. Harris and G. P. Margison, Br. J. Cancer 67, 1196 (1993).

156. I. Izumi, K. Mineura, K. Watanabe and M. Kowada, J. Neuro-Oncol. 17, 111 (1993).
157. S. Egyhazi, J. Hansson and U. Ringborg, Proc. Am. Assoc. Cancer Res. 35, 2119 (1994).
158. L. C. Erickson, M. O. Bradley, J. M. Ducore, R. A. G. Ewig and K. W. Kohn, PNAS 77, 467 (1980).
159. R. B. Mitchell, R. C. Moschel and M. E. Dolan, Cancer Res. 52, 1171 (1992).
160. D. B. Yarosh, S. Hurst-Calderone, M. A. Babich and R. S. Day, Cancer Res. 46, 1663 (1986).
161. M. E. Dolan, L. Norbeck, C. Clyde, N. K. Hora, L. C. Erickson and A. E. Pegg, Carcinogenesis 10, 1613 (1989).
162. Z. Wu, C. L. Chan, A. Eastman and E. Bresnick, Cancer Res. 52, 32 (1992).
163. L. Samson and S. Linn, Carcinogenesis 8, 227 (1987).
164. A. Rasouli-Nia, Sibghat-Ullah, R. Mirzayans, M. C. Paterson and R. S. Day, Mutat. Res. 314, 99 (1994).
165. W. J. Bodell, T. Aida and J. Rasmussen, Mutat. Res. 149, 95 (1985).
166. W. J. Bodell, K. Tokuda and D. B. Ludlum, Cancer Res. 48, 4489 (1988).
167. J. L. Schwartz, T. Turkula, T. D. Sagher and B. Strauss, Carcinogenesis 10, 681 (1989).
168. K. Tokuda and W. J. Bodell, Cancer Res. 48, 3100 (1988).
169. T. Aida, R. A. Cheitlin and W. J. Bodell, Carcinogenesis 8, 1219 (1987).
170. J. E. Trey and S. L. Gerson, Cancer Res. 49, 1899 (1989).
171. C. L. Bean, C. I. Bradt, R. Hill, T. E. Johnson, M. Stallworth et al., Mutat. Res. 307, 67 (1994).
172. S. Galloway, Environ. Mol. Mutagen. 23, 44 (1994).
173. G. Ibeanu, B. Hartenstein, W. C. Dunn, L. Y. Chang, E. Hofmann et al., Carcinogenesis 13, 1989 (1992).
174. J. G. Jansen, A. J. L. de Groot, C. M. M. van Teijlingen, P. H. M. Lohman, G. R. Mohn et al., Mutat. Res. 307, 95 (1994).
175. B. Singer and M. K. Dosanjh, Mutat. Res. 233, 45 (1990).
176. H. B. Tan, P. F. Swann and E. M. Chance, Bchem 33, 5335 (1994).
177. P. F. Swann, Mutat. Res. 233, 81 (1990).
178. J. L. Yang, F. P. Hsieh, P. C. Lee and H. J. R. Tseng, Cancer Res. 54, 3857 (1994).
179. L. L. Lukash, J. Boldt, A. E. Pegg, M. E. Dolan, V. M. Maher et al., Mutat. Res. 250, 397 (1991).
180. G. W. Rebeck and L. Samson, J. Bacteriol. 173, 2068 (1991).
181. T. Roldán-Arjona, F. L. Luque-Romero, R. R. Ariza, J. Jurado and C. Pueyo, Mol. Carcinog. 9, 200 (1994).
182. G. Aquilina, R. Biondo, E. Dogliotti, M. Meuth and M. Bignami, Cancer Res. 52, 6471 (1992).
183. M. O. Sikpi, L. C. Waters, K. H. Kraemer, R. J. Preston and S. Mitra, Mol. Carcinog. 3, 30 (1990).
184. W. C. Dunn, K. Tano, G. J. Horesovsky, R. J. Preston and S. Mitra, Carcinogenesis 12, 83 (1991).
185. S. Moriwaki, T. Yagi, C. Nishigori, S. Imamura and H. Takebe, Cancer Res. 51, 6219 (1991).
186. F. Palombo, E. Kohfeldt, A. Calcagnile, P. Nehls and E. Dogliotti, JMB 223, 587 (1992).
187. N. F. Cariello and T. R. Skopek, JMB 231, 41 (1993).
188. J. Yang, J. Lin, M. Hu and C. Wu, Cancer Res. 53, 2865 (1993).
189. T. Akagi, K. Hiromatsu, H. Iyehara-Ogawa, H. Kimura and T. Kato, Carcinogenesis 14, 725 (1993).

190. J. G. Jansen, G. R. Mohn, H. Vrieling, C. M. M. van Teijlingen, P. H. M. Lohman et al., Cancer Res. 54, 2478 (1994).
191. R. Begnini, F. Palombo and E. Dogliotti, Mutat. Res. 267, 77 (1992).
192. M. J. Horsfall, A. J. E. Gordon, P. A. Burns, M. Zielenska, G. M. E. van der Vliet et al., Environ. Mol. Mutagen. 15, 107 (1990).
193. J. Jiao, B. W. Glickman, M. W. Anderson and M. Zielenska, Mutat. Res. 301, 27 (1993).
194. R. R. Ariza, T. Roldán-Arjona, C. Hera and C. Pueyo, Carcinogenesis 14, 303 (1993).
195. M. K. Dosanjh, B. Singer and J. M. Essigmann, Bchem 30, 7027 (1991).
196. H. Zarbl, S. Sukumar, A. V. Arthur, D. Martin-Zanca and M. Barbacid, Nature 315, 382 (1985).
197. S. A. Belinsky, T. R. Devereux, R. R. Maronpot, G. D. Stoner and M. W. Anderson, Cancer Res. 49, 5305 (1989).
198. Y. Wang, M. You, S. H. Reynolds, G. Stoner and M. W. Anderson, Cancer Res. 50, 1591 (1990).
199. F. C. Richardson and K. K. Richardson, Mutat. Res. 233, 127 (1990).
200. K. Sendowski and M. F. Rajewsky, Mutat. Res. 250, 153 (1991).
201. S. Shoukry, M. W. Anderson and B. W. Glickman, Carcinogenesis 14, 155 (1993).
202. T. Basic-Zaninovic, F. Palombo, M. Bignami and E. Dogliotti, NARes 20, 6543 (1992).
203. M. K. Dosanjh, G. Galeros, M. F. Goodman and B. Singer, Bchem 30, 11595 (1991).
204. P. R. Harbach, A. L. Filipunas, Y. Wang and C. S. Aaron, Environ. Mol. Mutagen. 20, 96 (1992).
205. J. L. Yang, P. C. Lee, S. R. Lin and J. G. Lin, Carcinogenesis 15, 939 (1994).
206. M. S. Bronstein, J. E. Cochrane, T. R. Craft, J. A. Swenberg and T. R. Skopek, Cancer Res. 51, 5188 (1991).
207. N. Abril, C. Hera, E. Alejandre, J. A. Rafferty, G. P. Margison et al., MGG 242, 744 (1994).
208. N. Abril, T. Roldán-Arjona, M. J. Prieto-Alamo, A. A. van Zeeland and C. Pueyo, Environ. Mol. Mutagen. 19, 288 (1992).
209. L. Samson, J. Thomale and M. F. Rajewsky, EMBO J. 7, 2261 (1988).
210. C. Pourzand and P. Cerutti, Carcinogenesis 14, 2193 (1993).
211. J. Thomale, F. Seiler, M. R. Müller, S. Seeber and M. F. Rajewsky, Br. J. Cancer 69, 698 (1994).
212. S. M. Bronstein, M. J. Hooth, J. A. Swenberg and T. R. Skopek, Cancer Res. 52, 3851 (1992).
213. S. M. Bronstein, T. R. Skopek and J. A. Swenberg, Cancer Res. 52, 2008 (1992).
214. V. S. Goldmacher, Cancer Res. 52, 6983 (1992).
215. J. M. Essigmann and M. L. Wood, Toxicol. Lett. 67, 29 (1993).
216. G. T. Pauly, S. H. Hughes and R. C. Moschel, Bchem 30, 11700 (1991).
217. P. M. Baumgart, H. C. Kliem, J. Gottfried-Anacker, M. Wiessler and H. H. Schmeiser, NARes 21, 3755 (1993).
218. D. T. Minnick, M. L. Veigl and W. D. Sedwick, Cancer Res. 52, 4688 (1992).
219. D. T. Minnick, S. L. Gerson, L. L. Dumenco, M. L. Veigl and W. D. Sedwick, Cancer Res. 53, 997 (1993).
220. L. Y. Y. Fong, D. E. Jensen and P. N. Magee, Carcinogenesis 11, 411 (1990).
221. S. A. Belinsky, T. R. Devereux and M. W. Anderson, Mutat. Res. 233, 105 (1990).
222. L. A. Peterson and S. Hecht, Cancer Res. 51, 5557 (1991).
223. L. Y. Y. Fong, R. F. Bevill, J. C. Thurmon and P. N. Magee, Carcinogenesis 13, 2153 (1992).
224. J. Thomale, N. Hugh, P. Nehls, G. Eberle and M. F. Rajewsky, PNAS 87, 9883 (1990).

225. N. A. Loktionova, D. S. Beniashvili, M. S. Sartania, M. A. Zabeshinski, O. I. Kazanova et al., Biochimie 75, 821 (1993).
226. S. L. Gerson, N. H. Zaidi, L. L. Dumenco, E. Allay, C. Y. Fan et al., Mutat. Res. 307, 541 (1994).
227. W. Lijinsky, A. E. Pegg, M. R. Anver and R. C. Moschel, Jpn. J. Cancer Res. 85, 226 (1994).
228. V. M. Mikhailenko, W. L. Lijinsky, M. R. Anver, A. E. Pegg and R. C. Moschel, Proc. Am. Assoc. Cancer Res. 35, 113 (1994).
229. S. Matsukuma, Y. Nakatsuru, K. Nakagawa, T. Utakoji, H. Sugano et al., Mutat. Res. 218, 197 (1989).
230. C. Y. Fan, P. M. Potter, J. Rafferty, A. J. Watson, L. Cawkwell et al., NARes 18, 5723 (1990).
231. I. K. Lim, L. L. Dumenco, J. Yun, C. Donovan, B. Warman et al., Cancer Res. 50, 1701 (1990).
232. L. L. Dumenco, C. Arce, K. Norton, J. Yun, T. Wagner et al., Cancer Res. 51, 3391 (1991).
233. Y. Nakatsuru, S. Matsukuma, M. Sekiguchi and T. Ishikawa, Mutat. Res. 254, 225 (1991).
234. J. A. Rafferty, C. Y. Fan, P. M. Potter, A. J. Watson, L. Cawkwell et al., Mol. Carcinog. 6, 26 (1992).
235. C. A. Walter, J. Lu, M. Bhakta, S. Mitra, W. Dunn et al., Carcinogenesis 14, 1537 (1993).
236. Y. Nakatsuru, S. Matsukuma, N. Nemoto, H. Sugano, M. Sekiguchi et al., PNAS 90, 6468 (1993).
237. L. L. Dumenco, E. Allay, K. Norton and S. L. Gerson, Science 259, 219 (1993).
238. L. Liu, E. Allay, L. L. Dumenco and S. L. Gerson, Cancer Res. 54, 4648 (1994).
239. C. Zlotogorski and L. C. Erickson, Carcinogenesis 4, 759 (1983).
240. W. J. Zeller, M. R. Berger, T. Henne and E. Weber, Cancer Res. 46, 1714 (1986).
241. S. L. Gerson, Cancer Res. 49, 3134 (1989).
242. B. W. Futscher, K. C. Micetich, D. M. Barnes, R. I. Fisher and L. C. Erickson, Cancer Commun. 1, 65 (1989).
243. L. Meer, S. C. Schold and P. Kleihues, Biochem. Pharmacol. 38, 929 (1989).
244. M. E. Dolan, C. D. Corsico and A. E. Pegg, BBRC 132, 178 (1985).
245. P. Karran and S. A. Williams, Carcinogenesis 6, 789 (1985).
246. R. O. Pieper, B. W. Futscher, Q. Dong and L. C. Erickson, Cancer Res. 51, 1581 (1991).
247. S. M. Lee, N. Thatcher and G. P. Margison, Cancer Res. 51, 619 (1991).
248. S. M. Lee, N. Thatcher, D. Crowther and G. P. Margison, Br. J. Cancer 69, 452 (1994).
249. R. B. Mitchell, M. E. Dolan, L. Janisch, N. J. Vogelzang, M. J. Ratain et al., Cancer Chemother. Pharmacol. 34, 509 (1994).
250. T. J. Panella, D. C. Smith, S. C. Schold, M. P. Rogers, E. P. Winer et al., Cancer Res. 52, 2456 (1992).
251. K. C. Micetich, B. Futscher, D. Koch, R. I. Fisher and L. C. Erickson, JNCI 84, 256 (1992).
252. M.-F. Avril, J. Bonneterre, M. Delaunay, E. Grosshans, P. Fumoleau et al., Cancer Chemother. Pharmacol. 27, 81 (1990).
253. S. Aamdal, B. Gerard, T. Bohman and M. D'Incalci, Eur. J. Cancer 28, 447 (1992).
254. R. B. Mitchell and M. E. Dolan, Cancer Chemother. Pharmacol. 32, 59 (1993).
255. U. K. Marathi, M. E. Dolan and L. C. Erickson, Biochem. Pharmacol. 48, 2127 (1994).
256. M. E. Dolan, G. L. Larkin, H. F. English and A. E. Pegg, Cancer Chemother. Pharmacol. 25, 103 (1989).
257. A. E. Pegg, M. E. Dolan, H. S. Friedman and R. C. Moschel, Proc. Am. Assoc. Cancer Res. 34, 565 (1993).

258. M. E. Dolan, R. B. Mitchell, C. Mummert, R. C. Moschel and A. E. Pegg, Cancer Res. 51, 3367 (1991).
259. J. Chen, Y. Zhang, R. C. Moschel and M. Ikenaga, Anticancer Res. 13, 801 (1993).
260. J. Chen, Y. Zhang, R. C. Moschel and M. Ikenaga, Carcinogenesis 14, 1057 (1993).
261. U. K. Marathi, R. A. Kroes, M. E. Dolan and L. C. Erickson, Cancer Res. 53, 4281 (1993).
262. U. K. Marathi, M. E. Dolan and L. C. Erickson, Cancer Res. 54, 4371 (1994).
263. M. R. Muller, J. Thomale, C. Lensing, M. F. Rajewsky and S. Seeber, Anticancer Res. 13, 2155 (1993).
264. S. L. Gerson, S. J. Berger, M. E. Varnes and C. Donovan, Biochem. Pharmacol. 48, 543 (1994).
265. A. Sarkar, M. E. Dolan, G. G. Gonzalez, L. J. Marton, A. E. Pegg et al., Cancer Chemother. Pharmacol. 32, 477 (1993).
266. J. R. Silber, M. S. Bobola, T. G. Ewers, M. Muramoto and M. S. Berger, Oncol. Res. 4, 241 (1992).
267. C. Cussac, E. Mounetou, M. Rapp, J. C. Madelmont, J. C. Maurizis et al., Drug Metab. 22, 637 (1994).
268. M. E. Dolan, L. Stine, R. B. Mitchell, R. C. Moschel and A. E. Pegg, Cancer Commun. 2, 371 (1990).
269. H. S. Friedman, M. E. Dolan, R. C. Moschel, A. E. Pegg, G. M. Felker et al., JNCI 84, 1926 (1992).
270. H. S. Friedman, M. E. Dolan, R. C. Moschel, A. E. Pegg, G. M. Felker et al., JNCI 86, 1027 (1994).
271. G. M. Felker, H. S. Friedman, M. E. Dolan, R. C. Moschel and C. Schold, Cancer Chemother. Pharmacol. 32, 471 (1993).
272. S. L. Gerson, E. Zborowska, K. Norton, N. H. Gordon and J. K. V. Wilson, Biochem. Pharmacol. 45, 483 (1993).
273. M. E. Dolan, A. E. Pegg, R. C. Moschel and G. B. Grindey, Biochem. Pharmacol. 46, 285 (1993).
274. M. E. Dolan, A. E. Pegg, N. D. Biser, R. C. Moschel and H. F. English, Cancer Chemother. Pharmacol. 32, 221 (1993).
275. J. G. Page, H. D. Giles, W. Phillips, S. L. Gerson, A. C. Smith et al., Proc. Am. Assoc. Cancer Res. 35, 1952 (1994).
276. T. S. Rogers, L. E. Rodman, J. E. Tomaszewski, B. L. Osborn and J. G. Page, Proc. Am. Assoc. Cancer Res. 35, 1953 (1994).
277. M. E. Dolan, M.-Y. Chae, A. E. Pegg, J. H. Mullen, H. S. Friedman et al., Cancer Res. 54, 5123 (1994).
278. S. L. Berg, K. Godwin and F. M. Balis, Proc. Am. Assoc. Cancer Res. 35, 2543 (1994).
279. M. E. Dolan, A. E. Pegg, R. C. Moschel, B. R. Vishnuvajjala, K. P. Flora et al., Cancer Chemother. Pharmacol. 35, 121 (1994).

274. M. 275. 276. 277. 278. 279.


Replicable RNA Vectors: Prospects for Cell-free Gene Amplification, Expression, and Cloning

ALEXANDER B. CHETVERIN¹ AND ALEXANDER S. SPIRIN

Institute of Protein Research
Russian Academy of Sciences
142292 Pushchino
Moscow Region, Russia

I. Synthesis of RNA by Qβ Replicase
A. Qβ Replicase
B. Replication of Qβ RNA
C. Copying Heterologous Templates
II. RQ RNAs
A. Discovery of RQ RNAs: The Variant Hypothesis
B. Attempts to Separate Qβ Replicase from Contaminating RQ RNAs
C. Hypothesis of RNA Generation de Novo
D. Origin of RQ RNAs
E. Replication and Structure
III. RQ RNA Vectors
IV. Cell-free Molecular Cloning
B. Future Directions
V. Conclusion
VI. Glossary
References

¹ To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

What happens to genes of living cells strictly obeys the central dogma of molecular biology. Genetic information is stored and replicated in the form of DNA and is expressed in the form of proteins. RNA performs a service function by transferring the information from genes to proteins. It is therefore not surprising that traditional gene engineering, which utilizes cells for cloning, amplification, and expression of exogenous genes, uses the same strategy (1). This strategy has not changed even now that genes are amplified and expressed in vitro. Usually, genes are amplified at the DNA level in the polymerase chain reaction (PCR) (2) and are transcribed by a DNA-directed RNA polymerase into mRNAs, which are then used as templates for protein synthesis in cell-free translation systems.

At the same time, RNA is as potent a carrier of genetic information as DNA. The genomes of many viruses are RNAs. The reproductive cycles of retroviruses include RNA-directed synthesis of DNA, and no DNA intermediates are formed during reproduction of other RNA viruses. Moreover, the genomic RNAs of positive-strand RNA viruses can function directly as mRNAs, i.e., they serve as templates for the synthesis of virus-specific proteins. Replication of such genomes is performed by RNA-directed RNA polymerases, which are also called RNA replicases. Because the host cells lack RNA replicases, these enzymes are encoded by the viral genomes (in many cases viral RNA replicases function in association with some host proteins) and are synthesized in infected cells (3).

In this paper, we discuss the possibility of utilizing the reproductive system of Qβ bacteriophage, a positive-strand RNA virus that infects Escherichia coli cells, for amplification, expression, and cloning of foreign genes in vitro. This virus encodes a unique RNA replicase. In contrast to RNA replicases of other viruses, Qβ replicase can easily be isolated in preparative amounts and is long-lived under cell-free conditions. Qβ replicase is much more efficient in cell-free reactions than other systems utilized for amplification of nucleic acids. It can produce as many as 10^12 copies of RNA template in less than 30 minutes of incubation. Furthermore, because the replication product is RNA, it can be used directly as a template in cell-free translation systems.
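The amplification figure quoted above can be put into perspective with a short calculation. The sketch below is only an illustration: it assumes ideal doubling in every round for the whole incubation, which the source does not claim, and the function name is hypothetical.

```python
import math


def doublings_needed(fold_amplification: float) -> float:
    """Number of ideal doubling rounds needed for a given fold amplification."""
    return math.log2(fold_amplification)


# Figures quoted in the text; treating them as exact is an assumption.
copies_per_template = 1e12
reaction_minutes = 30.0

rounds = doublings_needed(copies_per_template)
seconds_per_round = reaction_minutes * 60.0 / rounds

print(f"doubling rounds needed: {rounds:.1f}")
print(f"implied average time per round: about {seconds_per_round:.0f} s or less")
```

Under this idealization, roughly 40 doublings are needed, so each round would have to take well under a minute on average.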
After the discovery of Qβ replicase in the 1960s (4), long before the invention of PCR, there were many attempts to utilize this enzyme for the cell-free amplification of foreign genes. However, this work soon virtually ceased for two main reasons. First, replication of heterologous RNA templates was very inefficient under all conditions tested, and, second, Qβ replicase exhibited uncontrolled synthesis of RNA, irrespective of template addition. The rate of this “spontaneous” RNA synthesis was often higher than the rate of replication of the genomic Qβ RNA. It was the study of the spontaneously synthesized RNAs that resulted in the recent breakthrough in this area. These RNAs, called RQ RNAs (Replicable by Qβ replicase) (5), appear capable of accommodating gene-long inserts without losing their ability to be amplified by Qβ replicase. In addition, the unraveling of the mystery of spontaneous RNA synthesis resulted in finding the conditions whereby its interfering effect is minimal.

I. Synthesis of RNA by Qβ Replicase

This topic is discussed here to the extent that is necessary for understanding of the subsequent sections. Interested readers are referred to a series of more detailed reviews (6-9).

A. Qβ Replicase

Qβ replicase consists of four subunits, only one of which [65,317 Da (10)] is encoded by the viral genome (11, 12). Three other subunits are host proteins normally involved in protein synthesis (7). These are elongation factors, EF-Tu (13) [MW 43,225 (14)] and EF-Ts (13) [MW 30,257 (15)], and ribosomal protein S1 (16-18) [MW 61,159 (19)]. The four subunits form a stable stoichiometric complex. A small fraction of the enzyme can lose S1 protein during purification (20, 21). Figure 1 presents a three-dimensional model of the Qβ replicase molecule (22).

In the host cells, EF-Tu and EF-Ts catalyze the GTP-dependent binding of aminoacyl-tRNA to the ribosome (7, 23), while protein S1 is a component of the small (30 S) ribosomal subunit and participates in the mRNA binding (24-26). No associations were detected between S1 and EF-Tu or EF-Ts outside the replicase complex. The exact roles of these host proteins in the Qβ replicase functioning are now known. The host subunits participate in the initiation of RNA synthesis and are not needed for elongation of the nascent RNA strands (27). Thus, the RNA polymerizing activity resides in the virus-encoded subunit.

Synthesis of this subunit in vivo is controlled by the tertiary structure of the genomic Qβ RNA and by the coat protein of Qβ phage (28-30). Early in the infection cycle the initiation region of the RNA polymerase cistron is hidden within the Qβ RNA tertiary structure and is not available for ribosome binding until the RNA is partially unfolded as a result of translation of the upstream coat protein cistron (31, 32). Later in infection, synthesis of the polymerase subunit is again repressed, now because of complexing of the polymerase initiation region with coat protein that accumulates to large amounts (29, 33). As a result, the content of the Qβ replicase in the E. coli cells infected with the wild-type Qβ phage is relatively low.
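The subunit masses quoted at the start of this subsection can be combined into an approximate mass for the whole enzyme. The sketch below assumes one copy of each subunit per complex; the text says only that the complex is stoichiometric, so the 1:1:1:1 ratio and the helper name are assumptions made for illustration.

```python
# Subunit masses in daltons, as quoted in the text.
SUBUNIT_MASS_DA = {
    "polymerase subunit (phage-encoded)": 65_317,
    "EF-Tu": 43_225,
    "EF-Ts": 30_257,
    "ribosomal protein S1": 61_159,
}


def complex_mass(masses: dict, copies: dict = None) -> int:
    """Sum subunit masses; one copy of each subunit is assumed by default."""
    if copies is None:
        copies = {name: 1 for name in masses}
    return sum(masses[name] * copies[name] for name in masses)


total = complex_mass(SUBUNIT_MASS_DA)
# Mass of the enzyme form that has lost S1 during purification.
without_s1 = total - SUBUNIT_MASS_DA["ribosomal protein S1"]
# Fraction of the complex mass contributed by the three host proteins.
host_fraction = 1 - SUBUNIT_MASS_DA["polymerase subunit (phage-encoded)"] / total

print(f"four-subunit complex: {total} Da")
print(f"complex lacking S1:   {without_s1} Da")
print(f"host-derived share of mass: {host_fraction:.0%}")
```

On these assumptions the complex is close to 200 kDa, with about two-thirds of the mass contributed by host proteins.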
A high content of Qβ replicase can be obtained by infecting the cells with Qβ phage strains that bear a nonsense mutation in the second half of the coat protein cistron (32, 34). In this case, translation of the first half of the coat protein cistron triggers the initiation of translation of the RNA polymerase cistron, which is not repressed later because no functionally active coat protein is synthesized. A high content of Qβ replicase is also observed in the E. coli cells transformed with plasmids carrying the isolated polymerase gene (35). More than 20 mg of homogeneous Qβ replicase preparation is routinely isolated from 100 g of either type of cells by employing a standard procedure. The purified Qβ replicase can be stored in a freezer for years without appreciable loss of activity. In cell-free reactions, it continues to synthesize RNA for several days if a sufficient amount of NTP substrates is provided (36).

FIG. 1. Electron-microscope images of Qβ replicase. (A) Three main types of images of individual Qβ replicase molecules (a-c), shown schematically at the top. (B) Three-dimensional model of the Qβ replicase molecule. Three views of the model (a-c) correspond to the three main types of images. The smaller (white-colored) subparticle presumably corresponds to the EF-Tu·EF-Ts complex. The larger subparticle probably consists of the phage-coded polymerase subunit and protein S1. From 22.

B. Replication of Qβ RNA

The genome of Qβ phage consists of a single-stranded RNA that is 4217 nucleotides (nt) in length (10). It is the positive (+) strand, containing three cistrons that encode (in the order of their appearance in RNA) maturation protein (A2), coat protein, and the polymerase subunit. The fourth protein, A1, is synthesized by reading the coat protein cistron through its stop codon (28, 30).

Qβ(+) RNA cannot serve as a template for the purified Qβ replicase. It can only be copied by the enzyme in the presence of a host factor (HF) (37-40). The HF consists of six identical subunits (40, 41), each with a molecular mass of 11,166 Da (42), and is associated with the 30 S ribosomal subunit (43). Its function in the uninfected E. coli cells is not known. HF binds with Qβ(+) RNA at two sites located at approximately 50 and 720 nt from the 3' end (10, 44, 45). The amount of HF required for a half-activation of RNA synthesis is proportional to the amount of (+) strands, and the maximum rate of RNA synthesis is observed when 1 mol of the HF hexamer is bound per mole of Qβ RNA (40). Binding of HF seems to induce a conformational change in the (+) strand that exposes its otherwise hidden 3' end (A. V. Munishkin and A. B. Chetverin, unpublished) and probably brings the 3' end into the proximity of the Qβ replicase bound with the internal S-site/M-site domain: in addition to the 3' end, HF was shown recently to interact with the Qβ replicase-binding S- and M-sites on the Qβ(+) strand (46). Unlike copying of the (+) strands, copying of the (-) strands of Qβ RNA does not require HF (47).

Like other polynucleotide polymerases, Qβ replicase elongates the product RNA strand at its 3' terminus, and thus reads the template strand in the 3' → 5' direction (47-50). The 3' end of both the (+) and (-) strands of Qβ RNA has the same sequence, . . . CCCA-OH (51). Copying starts at the penultimate C of the template strand, whereas the terminal A is skipped (51-53). Accordingly, the 5'-terminal nucleotide of the product strand is always G (54). The 3'-terminal nucleotide of the mature product strand is A, which is added subsequent to the last template-encoded CCC (the 5' end of template consists of pppGGG . . .) (51, 54).
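The terminal rules just described (skip the template's 3'-terminal A, copy the rest by base pairing, then add an untemplated A) can be expressed as a small function. This is a minimal sketch, not a model of the enzyme: the toy sequence is invented for illustration, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def copy_strand(template: str) -> str:
    """Return the product strand (5'->3') for a template written 5'->3'.

    Rules taken from the text: the 3'-terminal A of the template is not
    copied, the remainder is copied by Watson-Crick pairing, and an
    untemplated A is added to the 3' end of the product.
    """
    if not template.endswith("CCCA"):
        raise ValueError("template is expected to end with ...CCCA at its 3' end")
    copied_region = template[:-1]  # the terminal A is skipped
    # The template is read 3'->5', so the product grows 5'->3'.
    product = "".join(COMPLEMENT[base] for base in reversed(copied_region))
    return product + "A"  # untemplated 3'-terminal A


# Hypothetical toy template with the pppGGG... 5' end and ...CCCA 3' end.
plus_strand = "GGGAUCUACCCA"
minus_strand = copy_strand(plus_strand)

print(minus_strand)               # starts with GGG and ends with CCCA
print(copy_strand(minus_strand))  # copying the copy restores the original
```

The example shows why both strands can carry the same termini: a template that begins with GGG and ends with CCCA yields a product with exactly the same ends, so each strand is a valid template for the other.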
The 3'-terminal A is not essential for the template activity of Qβ RNA (54) and, if removed from mature RNA strands by chemical treatment, it cannot be reconstituted by Qβ replicase (51). Seemingly, the 3'-terminal adenylylation is an intrinsic event of the strand termination mechanism.

Substrates of Qβ replicase are the nucleoside 5'-triphosphates (ATP, GTP, CTP, and UTP) (55). During incorporation into the product strand, they lose their β,γ-pyrophosphoryl groups (56), with the exception of the 5'-terminal G, whose triphosphate moiety is preserved in mature strands (54). GTP plays a special role in the initiation of the RNA synthesis. Although ITP can replace GTP during strand elongation, it cannot do so at the initiation step (57, 58).

Only single-stranded RNA can serve as a template for Qβ replicase; perfect RNA duplexes are inactive (59, 60). The product RNA is also single-stranded. No double-stranded intermediates are formed during replication of Qβ RNA (48, 59, 61, 62). At the same time, the growing 3' end of the nascent strand must be base-paired with its template because template copying obeys the Watson-Crick complementarity rules (48), which are valid only within the structure of a double helix. It is not known what causes melting of the transient intermolecular helix and what prevents the strands from being reannealed during the elongation step. Qβ replicase participates in maintaining the single-strandedness of replication intermediates, inasmuch as denaturing the enzyme results in the immediate collapse of the template and the product strand into a duplex (48, 62, 63). At the same time, the structure of Qβ RNA must also play an important role, because copying heterologous templates by native enzyme results in duplex formation (see Section I,C).

In the first round of replication the (+) strand of Qβ RNA serves as a template for synthesis of the complementary (-) strand. Both strands serve as templates for synthesis of their complementary copies in the second round, 4 strands serve as templates in the third round, 8 in the fourth round, 16 in the fifth round, and so on. Thus, the number of templates (N) increases according to the formula N = 2^n, where n is the number of completed rounds. In other words, RNA is amplified exponentially (48, 49, 64). Exponential kinetics proceeds until the molar concentration of RNA reaches the molar concentration of active Qβ replicase. After this, linear synthesis is observed (65).
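The switch from exponential to linear kinetics follows directly from the enzyme becoming limiting. The sketch below is a simplified illustration under stated assumptions: rounds are synchronized, each enzyme molecule copies one template per round, and molecule counts stand in for molar concentrations. The numbers and the function name are hypothetical.

```python
def strands_after_rounds(initial_templates: int, active_enzyme: int, rounds: int) -> list:
    """Strand count after each round when at most `active_enzyme`
    templates can be copied per round (assumed synchronized rounds)."""
    counts = [initial_templates]
    n = initial_templates
    for _ in range(rounds):
        # While templates are scarce, every strand is copied (N doubles);
        # once strands outnumber enzyme, only `active_enzyme` copies are made.
        n += min(n, active_enzyme)
        counts.append(n)
    return counts


counts = strands_after_rounds(initial_templates=1, active_enzyme=1000, rounds=14)

for round_number, strands in enumerate(counts):
    print(round_number, strands)
```

In this run the count follows N = 2^n for the first ten rounds and then grows by a constant 1000 strands per round, which is the linear phase described in the text.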
In a cell-free reaction the (+) and (-) strands of Qβ RNA are produced in equal amounts, provided that HF is present in excess (66). It follows that both strands are equally efficient Qβ replicase templates. At the same time, in phage-infected cells, the (+) strands of Qβ RNA are accumulated in a 10-fold excess over the (-) strands (48, 62, 67). As discussed in Section III,B,1, the main reason for the observed replication asymmetry is the involvement of the (+) strands in the competing process of translation.

C. Copying Heterologous Templates

Various RNAs can serve as templates for synthesis of their complementary copies by Qβ replicase, including cellular RNA (68, 69) and synthetic polyribonucleotides (37, 70, 72). Polydeoxyribonucleotides can also be copied, although less efficiently (72). The presence in a template of an oligo(C) region is essential (37, 70, 73). Oligo(C) seems to play a key role in the initiation of RNA synthesis, because it occurs at the 3' end of every efficient Qβ replicase template (6, 74). The first (5'-terminal) nucleotide of the product strand is always pppG, no matter which sequence is present at the 3' end of the template (58, 75). Apparently, Qβ replicase scans a template from the 3' end until it meets the first oligo(C) region, and then starts copying. The best synthetic template is poly(C), which is routinely used in assays of the Qβ replicase activity (20, 21, 66). Initiation on a heterologous template can be enhanced by the following means.

1. Introduction of an oligo(C) or oligo(dC) stretch into the 3' end of the template (58, 73, 76).
2. Use of an oligoribonucleotide or oligodeoxyribonucleotide primer complementary to a region of the template (57, 77, 78).
3. Use of an elevated concentration of GTP (68).
4. Addition of Mn2+ ions (75, 76, 79).

However, even in the best instances, RNA synthesis on heterologous templates is much less efficient than on Qβ RNA and is confined to a single round of template copying (7). In contradistinction to the replication of Qβ RNA, the product strands are annealed to their templates (71, 75, 77). Consequently, it is the formation of dead duplexes rather than the absence of a special recognition structure that seems to prevent heterologous templates from being exponentially amplified by Qβ replicase.

These observations point to a key role played by the structure of Qβ RNA and other efficient Qβ replicase templates in maintaining the single-stranded state of the replication intermediates. The presence of stable secondary structure elements is important (80, 81), but not sufficient. Although the secondary structure of the potato spindle tuber viroid RNA is very stable (82), replication of this template ceases after synthesis of the complementary copy (76).
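The scanning behaviour suggested in this section (the enzyme moves from the 3' end until it meets the first oligo(C) region) can be sketched as a simple search. This is an illustration only: the minimum run length of three C residues is an assumption, since the text does not state how long an oligo(C) stretch must be, and the function name and test sequences are hypothetical.

```python
def find_initiation_site(template: str, min_run: int = 3):
    """Scan a template (written 5'->3') from its 3' end and return the
    0-based index of the 3'-most C of the first run of at least `min_run`
    consecutive C residues, or None if no such run exists.

    `min_run` is an assumed threshold, not a value given in the source.
    """
    i = len(template) - 1
    while i >= 0:
        if template[i] == "C":
            j = i
            while j >= 0 and template[j] == "C":
                j -= 1
            if i - j >= min_run:  # the run covers indices j+1 .. i
                return i
            i = j  # run too short: continue scanning upstream of it
        else:
            i -= 1
    return None


print(find_initiation_site("GGAUCUACCCA"))    # oligo(C) next to the 3'-terminal A
print(find_initiation_site("GGAUCCCCAUGCA"))  # a single 3'-proximal C is passed over
print(find_initiation_site("GAUUACA"))        # no oligo(C) region at all
```

Because the product strand begins opposite the 3'-most C of the selected run, its first nucleotide is G in every case, consistent with the observation that the product always starts with pppG.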

II. RQ RNAs

The inability of Qβ replicase to amplify exponentially heterologous templates, including the genomic RNAs of related bacteriophages (4, 83-85), has led to a widespread belief in the strict template specificity of this enzyme. At the same time, Qβ replicase does amplify exponentially numerous unrelated RNA species called RQ RNAs (5, 74).

A. Discovery of RQ RNAs: The Variant Hypothesis

RQ RNAs were observed for the first time by S. Spiegelman and colleagues in their experiments on serial transfers of Qβ RNA replication products between cell-free reactions that contained partially purified Qβ replicase and labeled NTPs (49, 86-88). The first reaction was initiated by RNA isolated from Qβ phage. After incubation for 5-20 minutes, a small sample of the replication products was transferred to a fresh medium. The procedure was repeated for several tens of consecutive cycles. The replication products were characterized by their ability to transfect E. coli protoplasts and by label distribution through sucrose gradients. The products of the first cycle contained a significant proportion of full-sized Qβ RNA, and were infectious. The infectiousness of the products and the content of the full-sized Qβ RNA gradually decreased in the subsequent cycles. The disappearance of Qβ RNA from the replication products was accompanied by accumulation of short RNA species (several hundreds of nucleotides in length) that replicated at an even higher rate than did Qβ RNA. These changes occurred faster the smaller the transferred sample, i.e., the greater the dilution of the products synthesized in the previous cycle. The authors concluded that the short RNA species had arisen from Qβ RNA as a result of evolution by virtue of spontaneous deletions and selection of those of the deleted variants whose replication rate was the highest. Accordingly, the short replicating RNAs were called Qβ RNA variants, referred to here also as RQ RNAs.

RQ RNAs were later found in the extracts of E. coli cells infected with Qβ phage (89). In contrast to healthy bacteria, the infected cells contained RNAs that sedimented at 6 S and were capable of amplification by Qβ replicase in vitro. The 6-S fraction amounted to 10% of the total cellular RNA. In accord with Spiegelman's conclusion, “6-S RNAs” were considered deleted variants of Qβ RNA generated during its replication in vivo (6).
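The dependence of the takeover on dilution in the serial-transfer experiments can be illustrated with a toy competition model. Everything numerical here is an assumption made for the sketch: a short species is taken to be present as a trace from the start, it is given a doubling time half that of the full-sized RNA, and each cycle is assumed to run until the total RNA returns to a fixed ceiling. None of these values comes from the source, and the function name is hypothetical.

```python
def serial_transfer(cycles: int, dilution: float, cap: float = 1e12,
                    long_doubling: float = 1.0, short_doubling: float = 0.5,
                    start_short_fraction: float = 1e-9, dt: float = 0.01) -> list:
    """Fraction of the short species after each transfer cycle.

    Each cycle: dilute both species by `dilution`, then let both grow
    exponentially (in small time steps `dt`) until the total reaches `cap`.
    All parameter values are illustrative assumptions.
    """
    long_rna = cap * (1 - start_short_fraction)
    short_rna = cap * start_short_fraction
    fractions = []
    for _ in range(cycles):
        long_rna *= dilution
        short_rna *= dilution
        while long_rna + short_rna < cap:
            long_rna *= 2 ** (dt / long_doubling)
            short_rna *= 2 ** (dt / short_doubling)
        fractions.append(short_rna / (long_rna + short_rna))
    return fractions


weak = serial_transfer(cycles=3, dilution=1e-2)    # large transferred sample
strong = serial_transfer(cycles=3, dilution=1e-4)  # small transferred sample

print("short-species fraction, 100-fold dilution:   ", weak)
print("short-species fraction, 10,000-fold dilution:", strong)
```

In this sketch a greater dilution forces more doublings per cycle, so the faster-replicating species gains more ground per transfer. That matches the qualitative observation reported above, although the model cannot distinguish between variants arising anew and variants present as contaminants.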
However, it was also observed that even the purest of the then available preparations of Qβ replicase contained 6-S RNAs detectable by ultraviolet absorption (89). This observation could have called into question the interpretation of the results obtained by Spiegelman and co-workers in their experiments in vitro (but did not). If their Qβ replicase and/or Qβ RNA preparations contained trace amounts of RQ RNAs that occur in infected cells, the results could be explained by merely supplanting Qβ RNA with the already existing short species, rather than by origination of RQ RNAs from Qβ RNA during serial transfers.

Finally, RQ RNAs are spontaneously synthesized in large amounts by a purified Qβ replicase when the enzyme is incubated with the four NTPs in the absence of added template (90-92). A great diversity of RQ RNA species can be synthesized in one reaction tube, with their lengths ranging from several tens to several hundreds of nucleotides (Fig. 2) (92a). The source of these RNAs seemed apparent because it was already known that Qβ replicase preparations isolated from the infected cells can be contaminated with 6-S RNAs, and that even a few such molecules are sufficient to initiate exponential RNA synthesis (88, 89).

FIG. 2. Gel electrophoresis patterns of RQ RNAs spontaneously synthesized in two independent Qβ replicase reactions. Arrows indicate the bands of RQ120 and RQ223 RNAs that are 120 and 223 nt in length, respectively. Lane 1, 1 µg, and lane 2, 5 µg of the products of the first reaction; lane 3, 1 µg, and lane 4, 5 µg of the products of the second reaction. RNAs were stained with ethidium bromide. From 92a.

Several of the spontaneously synthesized RQ RNAs were sequenced. Some of them, “midivariant” [MDV-1 (93)], “minivariant” [MNV-11 (94)], and “nanovariant” [WS-1 (92)], appeared to contain sequences homologous to segments of Qβ RNA (Fig. 3) and could indeed be considered as variants of the Qβ phage genome.

Thus, the following concept regarding the origin of RQ RNAs was commonly accepted early in the 1970s: these RNAs are deleted variants of the genomic Qβ RNA, which are generated in infected cells or in a cell-free reaction due to replication errors, and which contaminate Qβ replicase preparations, resulting in the intensive RNA synthesis occurring in cell-free reactions in the absence of added template. We will refer to this concept as the “variant hypothesis.”

FIG. 3. Homologies of MDV-1 (93), MNV-11 (94), and WS-1 (94) RNAs with segments of Qβ RNA (10). Bold dots indicate perfect matches, fine dots indicate transition-type base differences. Note the homology between the MDV-1 and MNV-11 sequences.

B. Attempts to Separate Qβ Replicase from Contaminating RQ RNAs

Much effort has been devoted to removal of traces of RQ RNAs from Qβ replicase preparations. Elaborate purification procedures, including multiple ion-exchange chromatography steps allowing Qβ replicase to be efficiently separated from RNAs and RNA-protein complexes (20, 21, 95), did not eliminate spontaneous RNA synthesis.

Reportedly, RQ RNAs can be completely removed in a procedure that quantitatively deprives Qβ replicase of S1, an RNA-binding protein. The resulting enzyme preparation exhibits no spontaneous RNA synthesis up to 24 hours of incubation with NTPs, but efficiently synthesizes RQ RNAs on template addition (96). However, a later work argues that even such a preparation does synthesize RQ RNA in the absence of added template if higher concentrations of Qβ replicase and NTPs are used (97).

To account for the possibility that RQ RNA molecules could be occluded within the Qβ replicase complex and therefore do not dissociate during purification, we isolated Qβ replicase polypeptides on a DEAE-Sephadex column under denaturing conditions (8 M urea and 0.2 M β-mercaptoethanol). Nevertheless, the active enzyme reconstituted by dialysis against a renaturation buffer (98) is still capable of spontaneously synthesizing RQ RNAs (A. A. Volkov and A. B. Chetverin, unpublished). We also treated the enzyme preparation for up to 4 days with a biotinylated RNase A present in a 35-fold molar excess over Qβ replicase. (Control experiments showed that at a 1:1 molar ratio of the RNase to RQ RNA, the latter is completely destroyed down to short oligonucleotides in a few minutes.) After removal of RNase on a streptavidin-Sepharose column, Qβ replicase still displayed the spontaneous synthesis (A. A. Volkov, E. E. Maximov and A. B. Chetverin, unpublished). Most surprising was the fact that neither the kinetics of spontaneous synthesis nor the pattern of synthesized RQ RNAs changed significantly after the above procedures.

Finally, spontaneous RNA synthesis occurs even when Qβ replicase is obtained from uninfected E. coli cells transformed with a plasmid encoding the Qβ phage polymerase subunit (97). Because the full-sized Qβ RNA was absent from these cells and thus did not replicate, its variants could not be generated there. Obviously, this observation contradicts the variant hypothesis.

C. Hypothesis of RNA Generation de Novo

Sumper and Luce (95) succeeded in purifying Qβ replicase to the extent that the addition of as few as five molecules of RQ RNA templates appreciably stimulated RNA synthesis above the spontaneous background. This indicated that the average aliquot of the Qβ replicase preparation added to the incubation mixture contained less than five molecules of active RQ RNAs. Nevertheless, after a reduction of the incubation volume by 10^-4 (and, consequently, of the amount of added Qβ replicase), the authors still observed spontaneous synthesis in each of the reactions. This observation allowed them to advance a concept that, during the reaction time, Qβ replicase can synthesize RQ RNAs without any template (de novo) by virtue of random condensation of nucleotides, with the fortuitous formation of replicable molecules and their subsequent evolution into rapidly amplifiable species. We refer to this concept as the “de novo hypothesis.” Should it prove correct, the Qβ replicase system could be used as an efficient model for studying the creation of informational molecules. However, this would bury any hope of eliminating the background caused by the spontaneously growing RQ RNAs, inasmuch as spontaneous RNA synthesis would appear to be an intrinsic property of Qβ replicase.
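The force of the Sumper and Luce dilution argument can be seen from a simple estimate. The sketch below assumes that any contaminating template molecules are distributed independently among aliquots (Poisson statistics) and takes the upper bound of five molecules per standard aliquot at face value; both are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def prob_at_least_one(mean_molecules: float) -> float:
    """Poisson probability that a reaction contains one or more molecules."""
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    return -math.expm1(-mean_molecules)


templates_per_standard_aliquot = 5.0  # upper bound inferred in the text
volume_reduction = 1e-4               # reduction of the incubation volume

mean_per_small_reaction = templates_per_standard_aliquot * volume_reduction
p_contaminated = prob_at_least_one(mean_per_small_reaction)

print(f"expected templates per reduced reaction: {mean_per_small_reaction:.1e}")
print(f"chance that a reduced reaction holds a template: {p_contaminated:.2e}")
```

On these assumptions only about one reduced reaction in two thousand should contain a carried-over template, whereas synthesis was reported in every reaction. This mismatch is what made template-free synthesis seem necessary at the time.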

D. Origin of RQ RNAs

1. SIMILARITIES BETWEEN RQ RNAs ISOLATED IN DIFFERENT LABORATORIES

If the de novo hypothesis were correct, one would predict that every new spontaneous reaction would yield different products. In this respect, an unexplained feature of the experiments of Sumper and Luce (95) is that most of their untemplated reactions yielded RQ RNA of the same size and fingerprint pattern. It turned out (99) that this RNA species is very similar, if not identical, to MDV-1 RNA isolated and sequenced earlier (90, 93). In our experiments (A. V. Munishkin, H. V. Chetverina and A. B. Chetverin, unpublished) spontaneous synthesis of RQ223 RNA, which was almost identical to MDV-1 (Fig. 4A), was also observed.

To explain these puzzling observations, the authors of the de novo hypothesis suggested that Qβ replicase can “instruct” synthesis of its own templates, thereby resulting in a “convergence” of the products of separate de novo reactions (95, 99). They pointed out that the 221-nt-long MDV-1 RNA can be viewed as built of 57 sequence blocks of four types: CCC(C) and UUCG, and their complements GGG(G) and CGAA. All the deviations found in the MDV-1 from a “master” sequence made of these blocks could be accounted for by 56 point-mutations and two insertions. Oligo(C) is recognized by Qβ replicase during the initiation step, whereas UUCG resembles the invariant TΨCG loop of tRNA and could therefore be recognized by the Qβ replicase subunits that participate in protein synthesis (EF-Tu, EF-Ts, or S1). According to the authors, “it is likely that the oligomers CCC(C) and UUCG are selected by the enzyme through its inherent affinity to these patterns and subsequently used as templates during elongation, thereby introducing their complementary oligomers GGG(G) and CGAA into the structure” (99).
However, even if the ability of Qβ replicase to synthesize the above oligonucleotides without template is demonstrated, this hypothesis would leave unexplained how the 57 blocks of four types always occur in the same order, which corresponds to only one of the 4^57 (about 10^34) possible arrangements. It would also remain unclear why the 56 mutations and four insertions always occur at the same positions and lead to the appearance of the same nucleotides. Finally, the hypothesis fails to explain the origin of the two extended homologies between MDV-1 and Qβ RNAs (Fig. 3), one of which (42 identical positions of 48) passes the most rigorous statistical tests.

Of course, independently arisen RQ RNAs might converge if there was a unique sequence characterized by the highest rate of replication. However, this is not the case; different RQ RNAs replicate at similar rates. For example, RQ135 RNA, whose sequence is quite dissimilar from the sequence of either MDV-1 or Qβ RNA, is as efficient a Qβ replicase template as MDV-1 RNA (74). Moreover, the rate of replication of MDV-1 RNA does not appreciably change on insertion into this RNA of different sequences of up to several tens of nucleotides in length (100, 101).

There are other examples of independent isolation of similar RQ RNAs from the products of spontaneous synthesis. The sequence of RQ87 RNA isolated in our laboratory (L. A. Bondareva, A. V. Munishkin and A. B. Chetverin, unpublished) is closely related to that of the MNV-11 RNA reported by Biebricher (94) (Fig. 4B). Recently, Biebricher and Luce (102) published the sequence of SV-7 RNA that is virtually identical to the sequence of the RQ135 RNA we reported earlier (74) (Fig. 4C). Because there was no exchange of RQ RNA samples between these two laboratories, the striking similarity between RQ RNAs spontaneously synthesized in independent reactions needs an explanation.

FIG. 4. Sequence similarity between RQ RNAs independently isolated in different laboratories. (A) RQ223+1 (this laboratory) and MDV-1 (93) RNAs; (B) RQ87 (this laboratory) and MNV-11 (94) RNAs; (C) RQ135 (74) and SV-7 (102) RNAs.

2. RQ RNAs ARE SATELLITES OF Qβ PHAGE

The RQ RNA species identical to those isolated in other laboratories appeared among the products of our spontaneous reactions after we had carried out experiments utilizing large amounts of wild-type Qβ phage. This observation led us to suggest that the source of these RQ RNAs could be Qβ phage particles (103). At that time, it was not known whether RQ RNAs can be transmitted by Qβ phage between host cells, or arise anew and disappear within the span of each infectious cycle (92).
Unlike infected cells (89), Qβ phage does not contain short RNA species in amounts detectable by standard methods (Fig. 5A). However, such RNAs can be revealed by labeling them at the 3' termini with [32P]pCp in the presence of T4 RNA ligase (103), as shown in Fig. 5B. In contrast to detection by ultraviolet absorption or staining techniques, the terminal labeling reveals only the RNAs with intact 3'-OH ends (thus differentiating them from Qβ RNA degradation products) and is proportional to their molar content (104). Remarkably, the short RNAs found in Qβ phage are indistinguishable by gel electrophoresis from the major RNA species spontaneously synthesized in our experiments. Some of these species also grew spontaneously in other laboratories. These RNAs appeared to be amplifiable by Qβ replicase (Fig. 5C), i.e., they were RQ RNAs. RQ RNAs remained associated within a core of Qβ phage particles on removal of more than 90% of coat protein by treatment with dodecyl sulfate (103). The particles containing RQ RNAs were indistinguishable from the infectious phage when analyzed by centrifugation through CsCl or sucrose gradients. The molar content of RQ RNAs in Qβ phage was less than stoichiometric, but comparable with that of Qβ RNA (Fig. 5B). The molar proportion of RQ RNAs did not significantly change after a number of consecutive passages of Qβ phage through fresh E. coli cells (103). It followed that RQ RNAs can be transmitted by Qβ phage, can co-infect the cells, and can propagate in vivo together with phage.

FIG. 5. Observation of RQ RNAs in wild-type Qβ phage particles. (A) The products of a Qβ replicase reaction (lane 1) initiated with the RNA (lane 2) isolated from a highly purified Qβ phage. The RNAs were stained with ethidium bromide after electrophoresis through a 2% agarose gel. (B) Electrophoresis through a 10% polyacrylamide gel of the Qβ phage RNA preparation end-labeled with [32P]pCp without addition (lane 1) and with the addition of RQ120 and RQ135 RNAs (lane 2). The molar proportion of individual RQ RNAs in the preparation was estimated from the increase in the relative intensity of the RQ120 and RQ135 RNA bands. (C) Toluidine blue-stained products of a Qβ replicase reaction carried out in a 1.5% agarose medium and initiated with 1 pg of the short-chain RNAs isolated from Qβ phage. The amplification factor was 10⁷ within 1 hour of incubation. From 103.

Thus, RQ RNAs seem to be typical viral satellites. They can be considered as the smallest known molecular parasites. It is not known whether they play any role in the infectious process, or are selfish molecules that merely use the reproductive system of Qβ phage for their own multiplication. Also, it is unknown how ancient these molecules are. If, as is likely, Qβ phage is the source of RQ223 (MDV-1) RNA spontaneously synthesized in different laboratories (90, 99, 103), the origin of this RQ RNA can be traced back to 1964, when Qβ phage was discovered by Watanabe (105) and soon after was sent to Spiegelman (4), who isolated MDV-1 RNA for the first time in 1972 (W), and to Weissmann (106). We isolated RQ223 RNA in 1986; the wild-type Qβ phage received by us from Kaesberg was brought from the laboratory of C. Weissmann (P. Kaesberg, personal communication). Interestingly, although RQ223 and MDV-1 RNAs diverged soon after the discovery of Qβ phage, their sequences remained almost unchanged (Fig. 4A), indicating a low rate of evolution.

3. SPONTANEOUS SYNTHESIS IS CAUSED BY AIRBORNE RQ RNA MOLECULES

Thus, a plausible source of the reproducibly synthesized RQ RNAs was found, and template dependence of their synthesis in “spontaneous” reactions became apparent. Yet, it remained unclear how the templates occur in incubation mixtures despite their absence from Qβ replicase preparations and despite every precaution taken to exclude carryovers between samples (95). To trace the source of these templates we have developed a method for detection of individual RQ RNA molecules (103). For this purpose, the Qβ replicase reaction was carried out in an agarose gel rather than in a liquid medium. In this format, the progeny of each RQ RNA molecule is accumulated in a limited zone around the progenitor template, i.e., it forms a colony. By counting the number of colonies it is possible to say how many RQ RNA molecules were present in the sample prior to solidification of the agarose. To prevent RQ RNAs from premature amplification, we used two layers of agarose cast in a Petri dish, one atop another. The lower layer contained NTPs, whereas the upper layer contained Qβ replicase and was cast after solidification of the first layer. Thus, the onset of the amplification reaction was controlled by diffusion of substrates into the enzyme layer, and the RNA colonies formed on the interface between the layers. Figure 6 depicts two dishes (A and B) that were prepared in a room where experiments with RQ RNAs were usually carried out. After the substrate layer had been cast, the dishes were left on a bench for 1 hour. While dish A was covered with a lid, dish B was left open. The enzyme layer was then cast in each dish and incubation was continued for 1 hour prior to agarose staining with ethidium bromide. As can be seen, both the number of distinguishable colonies and the overall fluorescence intensity are greater in dish B than in A.
Because the dishes were prepared under identical conditions, the only reason for the different number of colonies could be the additional exposure of the substrate layer of dish B to laboratory air. Figure 6C shows a dish prepared in a room where no experiments with RQ RNAs had been previously carried out. In this case, the number of RNA colonies is even smaller than in dish A.


FIG. 6. Spontaneous growth of RNA colonies in two-layer agarose sandwiches prepared in 35-mm dishes. The lower layer contained NTPs and the upper layer contained Qβ replicase. The upper layer was cast without exposure (A, C) or after a 1-hour exposure of the lower layer to laboratory air (B). The experiments were carried out in a room where RQ RNAs were often used (A, B) or outside that room (C). From 103 (A, B). Reprinted from 150 by permission of Oxford University Press (C).

These experiments show that RQ RNA molecules are present in laboratory air and can invade the reaction medium, resulting in the apparently spontaneous synthesis of RNA. Furthermore, these experiments suggest that the interference with amplification of desirable templates from background RNA growth can be substantially reduced by carrying out the amplification reactions in an immobilized medium such as agarose gel. This largely eliminates competition between different templates, because the progeny is not allowed to spread throughout the reaction volume.
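The comparison of dishes A and B lends itself to a simple estimate. The sketch below is a minimal illustration, not part of the original experiments: it assumes that every colony descends from a single RQ RNA molecule, that colonies do not overlap, and that counts follow Poisson statistics. The function names and any colony counts passed to them are hypothetical; only the 35-mm dish diameter and the 1-hour exposure come from the text.

```python
import math

DISH_DIAMETER_CM = 3.5  # 35-mm dish, as stated in the legend to Fig. 6


def dish_area_cm2(diameter_cm: float = DISH_DIAMETER_CM) -> float:
    """Surface area of the exposed substrate layer, treated as a flat disc."""
    return math.pi * (diameter_cm / 2.0) ** 2


def airborne_deposition(colonies_open: int, colonies_covered: int,
                        hours_exposed: float, area_cm2: float):
    """Excess colonies attributable to air exposure, per cm^2 per hour.

    Returns (rate, standard_deviation). The error term assumes that both
    counts are independent Poisson variables, so their variances add.
    """
    excess = colonies_open - colonies_covered
    sd_excess = math.sqrt(colonies_open + colonies_covered)
    scale = area_cm2 * hours_exposed
    return excess / scale, sd_excess / scale


if __name__ == "__main__":
    # Hypothetical counts for an open and a covered dish (not measured data).
    rate, sd = airborne_deposition(60, 15, 1.0, dish_area_cm2())
    print(f"deposition: {rate:.1f} +/- {sd:.1f} molecules per cm^2 per hour")
```

With the hypothetical counts of 60 colonies in the open dish and 15 in the covered one, the estimate comes to roughly 4.7 ± 0.9 replicable molecules deposited per cm² per hour; real counts would have to be read from the dishes themselves.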

4. RQ RNAs ARE FORMED BY RNA RECOMBINATION

The above findings seemingly rehabilitate the variant hypothesis, albeit with the essential additions that RQ RNAs can be transmitted by Qβ phage, and that they can initiate spontaneous RNA synthesis by invading the reaction mixture from the air, rather than by being present in the Qβ replicase preparation. With this reservation, the occurrence of spontaneous synthesis in cases when Qβ replicase is not contaminated by RQ RNAs, or when it is isolated from uninfected cells, is easily explained. However, these additions do not change the key point of the hypothesis, which assumes RQ RNAs to be defective Qβ phage genomes formed from Qβ RNA due to replication errors, such as deletions and nucleotide substitutions. Errors of this type are a likely source of the heterogeneity of RQ RNA populations, which is well documented (74, 92, 94). Any sequence perturbations were believed to occur only in cis, because all attempts to detect intermolecular recombination in RNA bacteriophages have been unsuccessful (107). Thus, according to the variant hypothesis, RQ RNAs can only be generated during replication of a Qβ replicase template, and the only ancestor of all RQ RNAs is the genomic RNA of Qβ phage.

The sequences of the first several RQ RNAs apparently support this hypothesis. Extended segments of midi-, mini-, and nanovariants appear to be homologous to various regions of Qβ RNA (Fig. 3). The presence in these RQ RNAs of regions that are nonhomologous to Qβ RNA, as well as the lack of homology between Qβ RNA and the “microvariant” (91) or its derivative CT-RNA (80), could easily be explained by sequence degeneration during evolution of the variants toward faster replicating phenotypes (49, 86-88). We were therefore surprised by the sequence of RQ120 RNA, the first RQ RNA isolated by us from the products of spontaneous synthesis (5). As seen in Fig. 7A, this RNA consists of two distinct segments, one derived from the last quarter of the Qβ coat protein cistron and the other originating from the 3'-terminal half of E. coli tRNA^Asp. Because no DNA intermediate exists in the Qβ phage life cycle, such a molecule could have arisen only as a result of intermolecular RNA recombination. Even more surprising has been the sequence of RQ135 RNA (74). This RNA consists entirely of segments homologous to 23 S ribosomal RNA and to an mRNA of phage λ (Fig. 7B).

The following conclusions can be drawn from the above observations.

1. RNA recombination can occur in RNA phages. This extends the earlier findings of RNA recombination in animal and plant viruses (108). The recent observations of RNA recombination in experiments with phage Qβ (109) and a double-stranded RNA phage φ6 (110) in vivo support this conclusion.

2. Qβ RNA is not the only source of RQ RNAs. Therefore, the view that RQ RNAs are Qβ RNA variants is not correct. It can be noticed that the tentative ancestors of known RQ RNAs are those RNAs that occur in E. coli cells in large amounts (phage RNAs, tRNA, rRNA), suggesting that it is the concentration of an RNA in the cell that determines the probability of its involvement in a recombination event.

3. Replicability of a template is neither a prerequisite for its participation in the recombination process, nor required for the resulting recombinant to be replicable. In this regard, the most striking example is RQ135 RNA, entirely made of segments of nonreplicable molecules. Nonetheless, it is one of the most efficient Qβ replicase templates (74).

One more unexpected observation is the small divergence of the final structures of RQ RNAs from the structures of their tentative ancestors, which is most striking in the case of RQ120 RNA (Fig. 7A). Even when the nucleotide sequence is considerably mutated (as in the case of RQ135 RNA), a great similarity to the ancestors can be seen at the level of secondary structure. Thus, one of the hairpins of RQ135 RNA is almost identical to the hairpin of E. coli 23 S RNA, from which it presumably descends (74). All the base differences reside in the hairpin stems, and all of them enhance the stability of the RQ135 hairpin as compared with that of 23 S RNA (Fig. 7C). It can be concluded that RQ RNAs are made of the same structural elements as those of nonreplicable RNAs, and that the peculiarity of the structure of RQ RNAs is in a different spatial arrangement of those elements, which is achieved via RNA recombination (111).

This observation also supports the conclusion drawn earlier from the comparison of the RQ223 and MDV-1 RNAs (Section II,D,2) that the rate of evolution of RQ RNA sequences is rather low. Therefore, if some extended segment of an RQ RNA shows no homology with a known RNA, one may suspect that this is because its ancestor has not been sequenced, rather than because that segment has intensively mutated. Indeed, searching a recent release of GenBank revealed homology between a previously unassigned segment of RQ223 (MDV-1) RNA and the large segment of the phage φ6 genome (Fig. 8) (111a) (L. A. Voronin and A. B. Chetverin, unpublished). Thus, recombination between preexisting RNAs (whether replicable or not) seems to be the main (if not the only) mechanism of generation of new RQ RNA species. Figure 9 summarizes currently available information on the origin of sequenced RQ RNAs.

FIG. 7. Comparison of (A) the sequence of RQ120(-) RNA with the sequences of E. coli tRNA^Asp and Qβ RNA; (B) the sequence of RQ135(+) RNA with the sequences of E. coli 23 S rRNA and λ phage O-protein mRNA; (C) the homologous hairpins predicted for RQ135(+) RNA (left) and E. coli 23 S RNA (right). The differing bases of the hairpins are shown in lightface letters.

FIG. 8. Homology between the sequences of RQ223 RNA and of the large segment of phage φ6 genome (data from 111a).

FIG. 9. Scheme of the likely origin of known RQ RNAs.

5. CAN RQ RNAs ORIGINATE de Novo?

The data reviewed clearly show that many RQ RNA species claimed to be generated de novo (95, 99) or to be evolved from early products of de novo synthesis (60) were actually synthesized in template-instructed reactions. The interference from background growth of these RQ RNAs can be reduced to a minimum and even eliminated by taking some precautions: using uninfected cells for isolation of Qβ replicase; ensuring a clean environment during enzyme purification and preparation of samples, and/or protecting the samples from the environment; maximally reducing any nucleic acid contaminations; and carrying out amplification reactions in immobilized media.

At the same time, the above observations do not rule out the conceptual possibility of RNA origination de novo. The authors of the de novo hypothesis argued that the processes of de novo RNA generation and template-instructed synthesis are fundamentally different. Biebricher et al. (112) reported that while the template-instructed process at the exponential and linear phases obeys Michaelis-Menten kinetics, the lag period of a de novo reaction is proportional to approximately the third power of both the enzyme and substrate concentrations. They suggested that this reflects the existence of a “nucleation” step that requires simultaneous association of several Qβ replicase and substrate molecules to synthesize the first RNA molecule de novo. This particular suggestion was not supported in a recent study (113) employing a capillary technique to observe single spontaneous RQ RNA colonies and finding, in a large number of events, that varying the replicase concentration from 3.6 to 0.6 µM (6-fold) increased the lag period of colony formation from 6.5 to 20 hours (3-fold), whereas varying the NTP concentration from 12.5 to 0.32 mM (40-fold) increased the lag period from 1.2 to 30 hours (25-fold). At the same time, both papers reported a considerable scatter in the lag times, distinguishing spontaneous reactions from those initiated by template addition, and noted that the observed lag periods substantially exceeded the time needed for a single RQ RNA molecule to produce detectable progeny. Moreover, McCaskill and Bauer (113) observed the late appearance of RNA colonies in a sealed capillary, into which contaminating RQ RNAs could not enter from air during the reaction. The RNAs that appear after long incubation times are often less efficient Qβ replicase templates and are shorter (35 to 50 nt) than the usual RQ RNAs (114).

It follows that replicating RNAs can be generated in a reaction mixture during long incubations, but does this necessarily mean that these RNAs are synthesized without a template? Apart from generation de novo, there can be other rare events producing replicating molecules after extended lag periods, such as copying DNA by Qβ replicase (72), spontaneous melting of an RQ RNA duplex (60), and recombination of nonreplicating RNA fragments (5, 74, 102). As there may be no more than trace amounts of putative nucleic acid templates in a reaction mixture prepared with highly purified Qβ replicase, their presence can only be ascertained if the products of spontaneous reaction are related to known sequences. Biebricher and Luce (114) cloned cDNA sequences of a number of short-chained RNAs spontaneously synthesized after unusually long lag periods, and claimed that “no significant sequence homologies were found when searching the E. coli sequences of the GeneBank, the Qβ sequence, or the numerous other RNA species sequences replicated by Qβ replicase.” RNA transcripts of two such cDNAs were amplifiable by Qβ replicase, although rather inefficiently. We compared sequences of the two RNAs with known RQ RNA sequences and found that, contrary to this claim, these RNAs are highly homologous to the RQ87 (MNV-11) RNA frequently used in their experiments and also independently isolated in our laboratory (Fig. 10). Moreover, each of them consists entirely of several segments of RQ87 RNA, suggesting that these RNAs may have been formed by recombination from RQ87 fragments. The presence of a few base differences distinguishing these RNAs from the corresponding RQ87 segments is consistent with the sequence heterogeneity always observed within RQ RNA populations (74, 92, 94). Thus, there are no facts whose explanation requires postulating template-independent generation of RQ RNAs.
Quite to the contrary, all available facts are readily explained by known properties of the replication system of Qβ phage, such as its template dependence, its ability to generate RQ RNAs by RNA recombination, the ability of those RNAs to be amplified exponentially, and their capability of being disseminated in different ways, including that by virtue of Qβ phage particles and through the air. There are no (and probably cannot be) observations rejecting the possibility of de novo origination of RQ RNAs, but such an event has never been documented and is therefore a matter of belief.
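The kinetic argument in this section can be made concrete by asking what apparent reaction order the capillary data imply. The following sketch is an illustration under stated assumptions: it reads the third-power dependence reported in (112) as a lag period that scales with concentration to the power of -3, it fits a power law through only the two end points quoted from (113), and the helper name `apparent_order` is hypothetical.

```python
import math


def apparent_order(conc_high: float, conc_low: float,
                   lag_at_high: float, lag_at_low: float) -> float:
    """Exponent n in the assumed power law lag ~ concentration**(-n),
    estimated from two (concentration, lag) points."""
    return math.log(lag_at_low / lag_at_high) / math.log(conc_high / conc_low)


# End points quoted in the text from the capillary study (113).
# Only concentration ratios matter, so the units cancel.
replicase_order = apparent_order(3.6, 0.6, 6.5, 20.0)   # lag in hours
ntp_order = apparent_order(12.5, 0.32, 1.2, 30.0)       # lag in hours

# Fold increase in lag that a strict third-power law would predict instead.
cubic_fold_replicase = (3.6 / 0.6) ** 3
cubic_fold_ntp = (12.5 / 0.32) ** 3

if __name__ == "__main__":
    print(f"apparent order in replicase: {replicase_order:.2f}")
    print(f"apparent order in NTP:       {ntp_order:.2f}")
    print(f"third-power prediction: {cubic_fold_replicase:.0f}-fold and "
          f"{cubic_fold_ntp:.0f}-fold longer lags")
```

Under these assumptions both exponents come out below 1 (about 0.6 for replicase and 0.9 for NTPs), whereas a third-power law would require lags roughly 216-fold and 59,600-fold longer over the same concentration ranges, rather than the 3-fold and 25-fold increases observed.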

E. Replication and Structure of RQ RNAs

The replication of RQ RNAs is similar to that of Qβ RNA. In excess replicase, the RQ RNAs are amplified exponentially, the template and product strands remain single-stranded during the entire replication cycle, and the (+) and (-) strands are produced in roughly equal amounts (90-92, 112). Because RQ RNAs do not encode proteins, designation of strands as (+) and (-) is rather arbitrary. By convention, the (+) strand is the one produced in a slight (5-10%) excess over its complement, the (-) strand (90). As in the case of Qβ RNA, the 3'-terminal A of a template is not read, and the 3'-terminal A of a product strand is not encoded by its template (92, 115, 116). At the same time, unlike the Qβ RNA (+) strand, both (+) and (-) strands of an RQ RNA are efficiently synthesized in the absence of host factor and protein S1 (116). Under optimal conditions, the replication cycle takes 20-30 seconds (87, 101, 113). It follows that a single copy of RQ RNA can give rise to 10¹² progeny molecules (about 50 ng RNA) within only 13-20 minutes.

Comparison of the primary structures of studied RQ RNAs reveals no common sequence motif except the terminal pppGGG... and ...CCCA-OH, which is evidence against the existence of a special replicase recognition sequence proposed earlier (117). Yet, Qβ replicase unambiguously distinguishes each of the RQ RNAs from a great variety of heterologous templates. Furthermore, the (+) and (-) strands of each RQ RNA are almost equally efficient templates despite the fact that their tertiary structures are quite different, as shown by the different mobilities of the (+) and (-) strands during electrophoresis through polyacrylamide gel under nondenaturing conditions (5, 74, 92, 118). This imposes certain constraints on the possible structure of the replicase recognition element, as this structure must be nearly the same in each of the complementary versions of an RQ RNA. A hairpin is located not far from the 3' end of the (+) and (-) strands of each RQ RNA (92). This, however, does not ensure the required uniformity of a recognized structure, because the distance between the hairpin and the 3' end varies significantly even for the complementary strands of the same RQ RNA.
In at least one of the complementary strands of each RQ RNA, the hairpin is always 4 to 6 nucleotides away from the 3' end, and these nucleotides are capable of base-pairing with the 5' end, as shown in Fig. 11 (119). As a result, a continuous double-helical structure can be produced by the hairpin and the base-paired termini, resembling the helix formed in tRNA by the TΨC hairpin and the aminoacyl acceptor stem (120). In the complementary strand, a virtually identical helical structure can be produced by the base-paired termini and the 5'-proximal hairpin (Fig. 11). The only difference between the helices formed by the (+) and (-) strands would be the orientation of a giant “bulge” that includes the rest of the RQ RNA molecule. A similar structure can also be formed by Qβ RNA and by genomic RNAs of other RNA phages (119). The existence of the predicted terminal helices in RQ120 and RQ135 RNAs has been supported by probing the RNAs with single-strand-specific and double-strand-specific RNases (5, 74). In spite of mismatches, the helices appeared to be unusually stable: they survived incubation at 50°C in the presence of 7 M urea (5). Probably, the helices are stabilized by tertiary interactions within a core formed by bulging nucleotides.

FIG. 11. Predicted secondary structures of the (+) and (-) strands of RQ120 and RQ135 RNAs. The terminal helices are shown in bold letters.

The terminal helix could constitute a structure specifically recognized by Qβ replicase. In addition, it could “lock” an RQ RNA template into a circle by fastening its termini, and thereby contribute to prevention of the annealing of the (+) and (-) strands during replication. As mentioned above, the ability of RNA to remain single-stranded during the replication cycle is a major prerequisite for its exponential amplification. A “butterfly model” of Qβ RNA replication has been proposed (33), in which Qβ replicase is permanently bound with the 3' end of the template and with the 5' end of the nascent strand. Simultaneously, the growing 3' end of the nascent strand and the region of the template being copied are bound within the polymerase active site. This would result in a double-circle structure (Fig. 12A) and thus impose topological constraints on the formation of a long interstrand helix, because two circles cannot wind around one another. However, no appreciable binding of Qβ replicase with the 3' end of the template is detected except during initiation (46, 121). At the same time, if the 3' and 5' ends of the template were held together by the terminal helix lock, the template circle could exist without the binding of the 3' end by Qβ replicase. The only requirement would be the permanent replicase binding with the 5' end of the nascent strand in order to form the second circle (Fig. 12B). This possibility is supported by the observation that isolated pppGGG, which resides at the 5' end of each replicating RNA, binds to Qβ replicase with a high affinity, at least 10² times that of GTP (H. V. Chetverina and A. B. Chetverin, unpublished). In contradistinction to the butterfly model, the terminal helix-lock model would allow a number of product strands to be simultaneously synthesized on the same template.

FIG. 12. Models of RNA replication in which annealing of the template with the nascent strand is prevented by formation of a double-circle structure. (A) The butterfly model (data from 33). (B) The terminal helix-lock model (data from 119).
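The amplification figures quoted at the start of this section (a 20-30 second replication cycle, and 10¹² progeny or about 50 ng of RNA from a single molecule within 13-20 minutes) can be checked with a few lines of arithmetic. This is a rough sketch that assumes ideal doubling in every cycle, as expected only while replicase is in excess; the 100-nt template length and the average residue mass of 330 g/mol are illustrative assumptions, not values given in the text.

```python
import math

AVOGADRO = 6.022e23           # molecules per mole
NT_MASS_G_PER_MOL = 330.0     # assumed average mass of one RNA residue


def doublings_needed(progeny: float) -> float:
    """Number of ideal doublings for one molecule to reach `progeny` copies."""
    return math.log2(progeny)


def time_minutes(progeny: float, cycle_seconds: float) -> float:
    """Time to reach `progeny` copies if every cycle doubles the population."""
    return doublings_needed(progeny) * cycle_seconds / 60.0


def mass_ng(molecules: float, length_nt: int) -> float:
    """Mass in nanograms of `molecules` RNA chains of `length_nt` residues."""
    grams = molecules * length_nt * NT_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e9


if __name__ == "__main__":
    target = 1e12
    print(f"doublings: {doublings_needed(target):.1f}")
    print(f"time at 20 s per cycle: {time_minutes(target, 20):.1f} min")
    print(f"time at 30 s per cycle: {time_minutes(target, 30):.1f} min")
    print(f"mass of 1e12 copies of a 100-nt RNA: {mass_ng(target, 100):.0f} ng")
```

About 40 doublings are needed, which take roughly 13.3 minutes at 20 seconds per cycle and 19.9 minutes at 30 seconds per cycle. For an assumed 100-nt RNA, 10¹² copies weigh roughly 55 ng, in line with the figure of about 50 ng.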

III. RQ RNA Vectors

A. Amplification of Heterologous Templates

1. RECOMBINANT RQ RNAs

As discussed above, RQ RNAs are natural recombinants consisting of sequences of various, including heterologous, RNAs. This suggests that any heterologous template could be made amplifiable by Qβ replicase if inserted into an RQ RNA sequence. Because the structure of terminal RQ RNA segments seems to be most important for replication, the foreign sequences should be inserted into an internal site of an RQ RNA molecule, preferably into a hairpin loop, in order to minimally disturb the RQ RNA structure. The feasibility of this approach was first demonstrated by insertion of decaadenylic acid into MDV-1 (RQ223) RNA (100); the resulting recombinant molecule was as efficient a Qβ replicase template as the original RQ RNA. Since that time, recombinant RQ RNAs carrying many different foreign inserts have been tested for their ability to replicate in a cell-free reaction, utilizing a purified Qβ replicase preparation. Relatively short inserts (30-40 nt) do not significantly alter the capacity of an RQ RNA vector for exponential amplification (81, 101). A further increase in the length of the inserts decreases the rate of replication. RQ RNAs with longer inserts (up to 150 nt) are amplifiable only if the inserted sequences have a potential for folding into a stable secondary structure (81). The inefficient replication of vectors carrying long unstructured inserts correlates with a high percentage of double-stranded RNA in the reaction products. This indicates that secondary structure formation plays an important role in maintaining the single-strandedness of RNA during replication (81). Very long inserts with a moderately developed secondary structure, such as mRNAs, seem to abolish the ability of RQ RNA vectors to be amplified by purified Qβ replicase (69, 122). An exponential amplification of MDV-CAT RNA, in which a 783-nt-long sequence coding for chloramphenicol acetyltransferase is embedded within MDV-1 RNA (123), was not reproduced in our experiments (69). At the same time, RQ-mRNA recombinants can be amplified in vivo (122) or in the presence of a coupled cell-free translation system, as discussed in Section III,B,1.

2. AMPLIFIABLE PROBES

The ability of RQ RNA vectors carrying short inserts to replicate exponentially allows them to be used as amplifiable reporter probes, e.g., for diagnostic purposes (124). In the simplest version, a full-sized RQ RNA includes a sequence complementary to a target, such as viral or microbial RNA or DNA (125-128). The reporter RQ RNA probe is hybridized with the target in the presence of a solid support, such as paramagnetic beads, where a “capture probe” (an oligonucleotide complementary to a different target site) is immobilized. If the target is present in the sample, it binds to the beads via the capture probe together with the reporter probe hybridized to it. After washing away the nonhybridized reporter probes in several cycles of the target release and recapture on fresh beads, the target-reporter hybrid is dissociated in a buffer of low ionic strength and the sample is mixed with Qβ replicase and NTPs. The presence of the target in the sample is revealed by amplification of the reporter RQ RNA, which is monitored by incorporation of labeled nucleotides or by fluorescence of intercalating dyes (125, 128). The drawback to this approach is the nonspecific binding of reporter molecules, which cannot be eliminated even by employing the most vigorous washing procedures. As a result, RNA growth is observed irrespective of the presence of target in the sample. The presence of target molecules can be established only by comparing the early kinetics of RNA synthesis in the experimental and control samples (125, 128). As a result, the sensitivity and reliability of the assay are worse than those of the PCR diagnostics. More attractive are diagnostic schemes that employ a target-directed formation of replicable probes directly in the sample (129). Instead of a complete RQ RNA reporter probe, two halves of its cDNA copy are used, one half carrying the T7 promoter sequence.
If a target molecule is present in the sample, it serves as a template for the synthesis of a full-sized reporter RQ DNA by DNA ligase (Fig. 13A) or DNA polymerase (Fig. 13B). T7 RNA polymerase is then used to transcribe the DNA into RQ RNA copies, which are amplified by Qβ replicase. In this format, a nonspecific probe binding does not interfere with the assay; the DNA half-probes do not replicate unless they are joined together in a target-dependent process and are transcribed into an RQ RNA reporter. However, even in this case, the assay suffers from the background caused by spontaneous synthesis of RQ RNA.


ALEXANDER B. CHETVERIN AND ALEXANDER S. SPIRIN

[Fig. 13 diagram; labels recoverable from the artwork. Panel A: target template; two halves of an RQ cDNA probe; DNA ligase; T7 RNA polymerase; RQ RNA reporter; target-specific segment. Panel B: target; “First half of” (label truncated); DNA polymerase; T7 promoter; T7 RNA polymerase; RQ RNA reporter; target-specific segment.]

FIG. 13. Schemes for target-dependent formation of a replicable RQ RNA reporter employing (A) DNA ligase or (B) DNA polymerase

The interference from this background can be eliminated by employing the molecular colony technique discussed in Section IV,A.
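The kinetic comparison described above for full-sized reporter probes can be illustrated with a minimal sketch. It assumes ideal exponential growth, N(t) = N0 · 2^(t/τ), which is a simplification; the function names, the template counts, and the choice of τ = 20 s (borrowed from the duplication time quoted later in this chapter for RQ RNA colonies) are assumptions made for illustration and are not taken from the cited assays.

```python
import math


def time_to_threshold(n0, threshold, doubling_time_s):
    """Seconds for n0 templates to reach `threshold` copies under ideal exponential growth."""
    if n0 <= 0 or threshold < n0:
        raise ValueError("need 0 < n0 <= threshold")
    return doubling_time_s * math.log2(threshold / n0)


def kinetic_shift(n_sample, n_control, doubling_time_s):
    """How many seconds earlier the sample crosses any threshold than the control."""
    if n_sample <= 0 or n_control <= 0:
        raise ValueError("template counts must be positive")
    return doubling_time_s * math.log2(n_sample / n_control)


# Hypothetical numbers: 100 nonspecifically bound reporters in both samples,
# plus 100 target-bound reporters in the experimental sample only.
background = 100
target_bound = 100
shift = kinetic_shift(background + target_bound, background, doubling_time_s=20.0)
print(f"experimental sample leads the control by {shift:.0f} s")
```

Under these assumptions the lead time grows only with the logarithm of the ratio of starting templates, so a target signal comparable to the nonspecific background shifts the curve by about one doubling time. This is one way to see why such an assay is hard to make both sensitive and reliable.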

B. Amplification and Expression of mRNAs

1. AMPLIFICATION OF RQ mRNAs IN THE PRESENCE OF A COUPLED TRANSLATION SYSTEM

As noted above, RQ mRNA recombinants are inefficient templates for purified Qβ replicase. In an in vitro reaction initiated by either the (+) or (-) strand, a rapid synthesis of the complementary copy is observed, but


REPLICABLE RNA VECTORS

replication then ceases due to formation of a duplex between the template and product strands (69). In this regard, RQ mRNA recombinants behave analogously to heterologous templates. At the same time, recombinant mRNAs appeared to replicate efficiently in E. coli cells (122), suggesting that some cellular components can promote replication, presumably by preventing duplex formation. Indeed, the addition of the E. coli cell-free translation system, which includes ribosomes and soluble cellular proteins, to the purified Qβ replicase reaction greatly stimulates replication of RQ mRNA (Fig. 14A) (69). Replication includes many consecutive rounds, indicating that the template and product strands remain single-stranded throughout. The stimulation takes place only when the complete translation system is added (the separate presence of ribosomes and soluble proteins has no effect) and requires functioning of the translation system; the effect is eliminated by the addition of puromycin, a specific inhibitor of protein synthesis. Furthermore, the translation system enhances replication of only those RNAs that contain both the RQ RNA and the mRNA moieties (69). The need for the RQ RNA moiety hints at the importance of the Qβ replicase-specific structure, whereas the

FIG. 14. Stimulation of replication of an RQ135-dihydrofolate reductase mRNA recombinant by Qβ replicase in the presence of a coupled E. coli translation system and [α-32P]UTP. (A) Electrophoretic separation of the labeled (+) and (-) strands after annealing with excess unlabeled (+) strands. RNA synthesis by the purified Qβ replicase alone (lanes 1), in the presence of the complete cell-free translation system without further additions (lanes 2) or with the addition of 0.5 mM puromycin (lanes 3), and in the presence of an incomplete translation system lacking the S100 protein fraction (lanes 4) or ribosomes (lanes 5). (B) Time course (abscissa: time, min) of accumulation of the (+) strands and (-) strands of the recombinant RNA in the coupled replication-translation system. Reprinted from 69 with permission.


need for the mRNA moiety suggests that the enhancement is due to involvement of the sense (+) strands of RQ mRNA in translation. Apparently, the translating ribosomes stimulate replication of RQ mRNA by sequestering the (+) strands and thereby protecting them from annealing with the (-) strands. This observation needs explanation, because if translation of the (+) strands were the only factor preventing duplex formation, no enhancement of replication would be expected. No problem exists during copying of the (-) strand. In this case, ribosomes can initiate on the nascent (+) strand without interfering with its elongation, because they move along the (+) strand in the same direction as it grows. The problem arises during copying of the (+) strand. Because translating ribosomes move along the template in the 5' → 3' direction, i.e., opposite to Qβ replicase, they block replication (130). Hence, the full-sized (-) strand can be synthesized only if ribosomes interact with the (+) strand after, rather than during, its copying. At the same time, if the copying had resulted in the formation of a duplex, ribosomes could not unwind it, because they can neither translate nor bind double-stranded nucleic acids (131-133). Therefore, the observed stimulation of replication requires that RQ mRNAs remain single-stranded during strand elongation irrespective of ribosome action. It follows that the structure of RQ mRNA satisfies the basic requirements for replicating RNAs, such as the ability to be recognized by Qβ replicase and the ability to maintain the template and product strands single-stranded during replication. However, in the absence of a translation system, the (+) and (-) strands of RQ mRNA collapse into a duplex subsequent to the synthesis of the product strand, presumably during its termination. As discussed above, termination plays a special role in replication.
It includes the 3'-terminal adenylylation of the product strand and is the slowest step in the replication cycle (65). During termination, the product strand leaves the Qβ replicase complex, which prevented it from annealing with the template during the elongation step. There seems to be a moment within the termination step (e.g., after strand detachment from replicase and before formation of the terminal helix that fastens strand ends) when the product strand could anneal with the template unless the strands are stabilized by a strong intramolecular secondary structure. If the secondary structure is weak (as in the case of RQ mRNA recombinants), annealing can be prevented by involvement of the (+) strand in a coupled translation process. It also follows from the above consideration that the addition of a translation system should affect synthesis of the (+) and (-) strands differently. During synthesis of the (+) strand [i.e., when the (-) strand is copied], its


binding to ribosomes anywhere in the replication cycle does not interfere with its elongation and prevents it from annealing with the template at the termination step. If the (+) strand functions as a template, its binding to ribosomes at the termination step is beneficial, as this prevents it from annealing with the product strand. However, entry of the (+) strand template into translation at the elongation step should lead to abortion of the (-) strand synthesis. This, in turn, should result in a lower content of the (-) strands in the replication products as compared with the (+) strands. Indeed, ribosome initiation on the coat protein cistron of the Qβ(+) template leads to synthesis of aborted Qβ(-) strands whose size is equal to the distance between the 3' end of the template and the coat protein initiation site (130). Also, during amplification of an RQ mRNA recombinant in a coupled replication-translation system in vitro, its (+) strands are accumulated in a fivefold excess over the (-) strands (Fig. 14B). Thus, the recombinant templates manifest the same replication asymmetry as observed for the amplification of Qβ RNA in vivo (see Section I,B). This similarity is striking, because in the cell-free reaction the (+) strands of RQ mRNA are not packaged into virions, which was believed to be the major reason for the 10:1 ratio of the Qβ RNA (+) and (-) strands in the cell (66). Furthermore, this similarity suggests that the mechanism of amplification of RQ mRNA recombinants is basically the same as the mechanism of replication of the Qβ phage genome. The observations of RQ mRNA amplification in a coupled replication-translation system suggest an important practical point that could be a subject of further studies. It seems likely that, apart from the translating ribosomes, any factor capable of preventing annealing of the (+) and (-) strands at the termination step will ensure an efficient amplification of the recombinant Qβ replicase templates.

2. TRANSLATION AND STABILITY OF RQ mRNA RECOMBINANTS

In addition to the capability of replication, the embedment of cellular mRNAs within RQ RNA structures results in a much higher yield of translation in vitro, which is a characteristic of viral mRNAs. Interestingly, this effect is observed even if mRNA is inserted into RQ135 RNA (69), which displays no homology to Qβ RNA (74), and is therefore not associated with the presence of virus-specific sequences. Analysis of RNA integrity during translation shows that the half-life of a recombinant RQ mRNA in E. coli cell-free extracts is extended 5 to 10 times above that of the original mRNA, and this accounts for much of the translation enhancement (134). The stabilizing effect of the RQ RNA moiety on the mRNA insert is consistent with the high resistance of RQ RNA vectors to RNases. Their stable terminal helices should effectively protect RNA


against 3'-exoribonucleases, which are abundant in bacterial cells and which play a major role in mRNA turnover in vivo (135). There is ample evidence that mRNAs can be stabilized by double-stranded structures introduced into their 3' termini, both in vivo and in vitro (135-137). In a cell-free system deprived of most of its endogenous RNases, some of the recombinant RQ mRNAs are still translated at a higher rate than their progenitor mRNAs. This effect cannot be explained by the elevated stability of the RQ recombinants, as it is observed during time intervals where no significant RNA degradation is seen. For example, insertion of chloramphenicol acetyltransferase mRNA into RQ135 RNA results in 20 times the yield of translation in crude cell-free extracts, and in a 5-fold higher initial translation rate in an RNase-deficient cell-free system (134). The observation that an RQ RNA vector, whose sole purpose has been a high replication rate and that performs no protein-coding function, can stimulate expression of the harbored gene is unexpected and suggests that both replication and translation processes may benefit from the same structural features of the vector, such as the proximity of its 3' and 5' ends.

3. RQ mRNA-DIRECTED PROTEIN SYNTHESIS IN A COUPLED REPLICATION-TRANSLATION SYSTEM

When RQ mRNA is translated in the presence of a coupled Qβ replication system, the rate and the yield of protein synthesis are further increased (69, 138). If the RQ mRNA concentration is rate-limiting, this can be explained by the synthesis of additional (+) strands (69). However, a significant stimulation is observed even if RQ mRNA is added in saturating amounts, i.e., when the addition of more template does not lead to a further increase in protein synthesis (138). The stimulation is not caused by introduction of more EF-Tu, EF-Ts, and S1 proteins, which are a part of the Qβ replicase complex and also a part of the translation machinery, because the addition of Qβ replicase has no effect on translation of mRNAs that lack the RQ moiety (69). The stimulation has been attributed to the fact that, in a coupled replication-translation reaction, ribosomes can initiate on nascent (+) strands whose tertiary structure has not yet formed, and this may provide more favorable conditions for translation initiation (138). At the same time, protein synthesis on RQ mRNA templates is more efficient when coupled to Qβ replication than when it is coupled to T7 RNA polymerase transcription, which also allows the ribosomes to initiate on nascent strands. Moreover, Qβ replicase stimulates protein synthesis when added to the coupled transcription-translation system (138). Taken together, these observations suggest that the coupling of replication and translation is more than the mere simultaneous occurrence of two processes in proximity to each other.


One can speculate that the coupling between translating ribosomes and Qβ replicase is more intimate than that between translation and transcription. In this regard, it may be essential that EF-Tu, EF-Ts, and S1 proteins are common to the two enzyme systems. The role of these proteins in the Qβ-replicase-effected RNA synthesis is still a mystery. Protein S1 is dispensable for replication of all Qβ-replicase templates except the (+) strand of Qβ RNA (116), but its role in the switching between translation and replication of Qβ RNA is well established (9, 115, 130, 139, 140). The earlier speculations that the EF-Tu·EF-Ts complex can participate in the binding of Qβ replicase templates in a manner analogous to the binding of aminoacyl-tRNA, and can also supply GTP for priming the synthesis of the product strand, were not supported in subsequent experiments (7). However, it was shown that Qβ replicase can replace the EF-Tu·EF-Ts complex in protein synthesis (27), and that EF-Tu and EF-Ts are needed for the Qβ phage-encoded subunit to acquire the active conformation (141). We cannot exclude the possibility that the main, or even the only, reason for participation of the protein synthesis factors in Qβ replicase has been to ensure the bona fide coupling and coordination between replication and translation of the Qβ phage genome, which is reflected in the properties of the coupled replication-translation of RQ mRNA recombinants. Thus, insertion of mRNAs into RQ RNA vectors has multiple consequences. It protects the mRNAs against ribonuclease degradation, increases the rate of translation of some of the mRNAs, enables the mRNAs to be amplified by Qβ replicase in the presence of a coupled translation system, and ensures a genuine coupling between mRNA replication and translation. Together, these factors result in a large cumulative enhancement of mRNA translation that increases the yield of synthesized protein 10- to 100-fold.
In effect, RQ RNA vectors convey to cellular mRNAs the characteristics of viral genomes. In view of the widespread opinion that RNA replication has low fidelity, one may wonder whether amplification by Qβ replicase could result in a rapid loss of the informational potential of mRNA, yielding a large proportion of defective protein. In fact, the error rate during RNA replication (10⁻³ to 10⁻⁴) is several orders of magnitude higher than during DNA replication (142-144), seemingly due to the lack of proofreading mechanisms. However, it is not higher than the error rate of translation (145). Furthermore, not every nucleotide substitution will result in the replacement of an amino acid residue, and not every amino acid substitution will inactivate protein function. Assays of the enzymes synthesized in coupled replication-translation reactions detected no appreciable decline in their specific activity in the course of the reactions (69, 138), suggesting that functionally active proteins were continuously produced.
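The fidelity argument above can be made concrete with a rough estimate. The sketch below assumes that misincorporations are independent and occur at a uniform per-nucleotide rate ε, so that the fraction of copies made without any error in one round of copying is (1 − ε)^L. The 1000-nucleotide template length is a hypothetical value chosen for illustration, not a figure from the text.

```python
def error_free_fraction(error_rate, length_nt):
    """Fraction of copies with no misincorporation in a single copying round.

    Assumes independent errors at a uniform per-nucleotide rate.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie between 0 and 1")
    if length_nt < 0:
        raise ValueError("length_nt cannot be negative")
    return (1.0 - error_rate) ** length_nt


HYPOTHETICAL_LENGTH_NT = 1000  # illustrative template length, not from the text

for rate in (1e-3, 1e-4):  # range of error rates quoted in the text
    frac = error_free_fraction(rate, HYPOTHETICAL_LENGTH_NT)
    print(f"error rate {rate:g}: {frac:.2f} of copies error-free per round")
```

This is a per-round figure only; errors accumulate over successive rounds, and, as noted above, many substitutions are silent or tolerated by the protein, so the fraction of functional product is expected to be higher than the error-free fraction.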


4. USE OF RQ mRNAs IN CONTINUOUS-FLOW CELL-FREE PROTEIN SYNTHESIS REACTORS

Continuous-flow cell-free (CFCF) reactors allow proteins to be synthesized in vitro on a preparative scale. In these reactors, a cell-free translation system is continuously fed with a solution containing substrates (amino acids and nucleotides), and the products are withdrawn through an ultrafiltration membrane that is permeable to the synthesized proteins but retains the components of the translation system (146-148). The membrane pore size is not the only factor that controls efflux of the reaction components. No leakage of the relatively small protein factors of translation and tRNAs (whose molecular mass is well below 100,000 Da) was detected when membranes with pores of up to 300,000-Da cutoff size were used (149), and this is probably due to association of these molecules with ribosomes and/or mRNAs. Translation goes on at a steady rate for several tens of hours, provided that enough intact mRNA remains in the reactor, and results in the production of a large amount of synthesized protein (up to 1 mg per 1 ml of the reactor volume). The most important feature of these reactors is that they produce homogeneous translation products that do not need isolation (146, 149). The high performance of RQ mRNA recombinants makes them ideal templates for large-scale protein synthesis in CFCF reactors. Especially attractive is the expression of RQ mRNAs in reactors that contain a coupled replication-translation system, because in this case mRNA losses due to degradation would be compensated by the continual production of new (+) strands, thus significantly extending the reactor lifetime.
However, there was a danger that during long incubations the coupled replication-translation would fail, for either or both of the following reasons: (1) spontaneously growing RQ RNAs might supplant the recombinant RQ mRNA, and (2) continual synthesis of the (+) strands might eventually result in their overproduction and inhibition of translation by excess template. In both cases, an early decline in the rate of protein synthesis would be observed. Nevertheless, it turned out that Qβ replicase does stimulate the translation of RQ mRNAs in CFCF reactors (I. Yu. Morozov, V. I. Ugarov and A. S. Spirin, unpublished), and that the high rate of protein synthesis remains unreduced for more than 40 hours of the CFCF reaction (138). The reasons why the mentioned possibilities do not diminish the performance of CFCF reactors remain to be investigated. Probably, the contaminating RQ RNA molecules are small enough to be swept out of the reactor by the flow of feeding solution; this reduces the interference from spontaneous RNA growth (138). On the other hand, any production of the (+) strands in excess over ribosomes should result in their annealing with the (-) strands, as


discussed above. This would make these (-) strands unavailable as templates for Qβ replicase and, therefore, would slow down the production of new (+) strands. Thus, the inability of RQ mRNAs to remain single-stranded in the absence of translating ribosomes could provide a basis for autoregulation of their synthesis in a long-term replication-translation process.

IV. Cell-free Molecular Cloning

A. Cloning of RQ RNA Molecules

Traditionally, cloning of nucleic acids is carried out in vivo. The cloning procedure comprises a series of steps, including the insertion of nucleic acid into suitable vectors such as plasmid or viral genomes, transformation of living cells with the resultant recombinant molecules, and obtaining single colonies of transformed cells or single viral plaques on the surface of nutrient agar (1). The procedure is often called “molecular cloning,” which is not quite correct, because the cloned entities are living cells or viruses rather than molecules and the recombinant nucleic acids are amplified within the cell milieu together with the host DNA and RNA. The ability of RQ RNAs to be exponentially amplified by Qβ replicase provides for true molecular cloning in vitro. The first attempt to clone RQ RNA molecules was made in Spiegelman’s laboratory as early as 1968 (87). The authors diluted an RQ RNA preparation down to a single molecule per tube, and observed RNA synthesis in approximately as many tubes as was expected from the Poisson distribution. They concluded that a single molecule can be amplified by Qβ replicase to produce a clone, i.e., the progeny of a single parent template. However, the validity of this conclusion was questioned by the later observation that RNA synthesis can occur even in the absence of added template (95). Moreover, because only a fraction of added RQ RNA molecules are active in replication (112, 113, 150), the resemblance to the Poisson distribution was most likely coincidental. Cloning of RQ RNA molecules has been made possible by utilizing the molecular colony technique, earlier used for demonstration of the ability of RQ RNAs to be disseminated through the air (103). To retard diffusion of RNA molecules, one of the two agarose layers was replaced with a nylon membrane capable of reversible interactions with RNA (150). This resulted in much sharper colonies (Fig. 15).
In addition, because RNA molecules were bound to the membrane in the course of synthesis, the membrane could be directly used for analyzing the colonies, omitting a transfer step. When [α-32P]NTPs were used as substrates, RNA colonies became detectable after 10 minutes of incubation (150). By this time, each colony contained


FIG. 15. The RQ RNA colony pattern obtained in a 20 × 20-mm sandwich that includes a Qβ replicase-containing agarose layer and a substrate-containing nylon membrane. Reprinted from 150 by permission of Oxford University Press.

as many as 10¹⁰ RNA copies, indicating a duplication time of approximately 20 seconds, i.e., the same as in liquid media. Several lines of evidence confirmed that each colony comprises a clone: the number of colonies was proportional to the number of seeded RNA molecules and each colony contained a single RQ RNA species; also, when a mixture of RQ RNAs was seeded, different RQ RNA species were found in different colonies. Depending on which RQ RNA species was used, the number of colonies accounted for 10-40% of the number of seeded molecules (150). This is comparable to the 10% proportion of viable (plaque-forming) particles in the wild-type Qβ phage isolates (151).
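Two numbers in this subsection can be checked with short calculations. The sketch below first recovers the duplication time from the colony size, assuming that each colony starts from a single template and grows exponentially without interruption for the whole 10 minutes. It then uses the Poisson distribution to estimate what fraction of tubes would show RNA synthesis in a limiting-dilution experiment when only a fraction of the seeded molecules is active. The function names are illustrative, and treating the 10-40% colony yield as the active fraction is an assumption.

```python
import math


def doubling_time_s(final_copies, elapsed_s, initial_copies=1):
    """Average doubling time, assuming uninterrupted exponential growth."""
    if final_copies <= initial_copies or elapsed_s <= 0:
        raise ValueError("need final_copies > initial_copies and elapsed_s > 0")
    doublings = math.log2(final_copies / initial_copies)
    return elapsed_s / doublings


def fraction_positive(mean_molecules, active_fraction=1.0):
    """Poisson probability that a tube receives at least one active template."""
    if mean_molecules < 0 or not 0.0 <= active_fraction <= 1.0:
        raise ValueError("invalid mean or active fraction")
    return 1.0 - math.exp(-mean_molecules * active_fraction)


# 10^10 copies per colony after 10 minutes (figures from the text)
tau = doubling_time_s(1e10, elapsed_s=10 * 60)
print(f"doubling time of about {tau:.0f} s")

# One molecule per tube on average; 10%, 40%, or all molecules active
for f in (0.1, 0.4, 1.0):
    print(f"active fraction {f:.1f}: {fraction_positive(1.0, f):.2f} of tubes positive")
```

The first result is consistent with the duplication time of approximately 20 seconds stated above. The second illustrates why agreement with the Poisson expectation for one molecule per tube is not, by itself, proof of single-molecule cloning when most molecules are inactive.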

B. Future Directions

The molecular-colony technique can be used in a number of applications, some of which are discussed below.

1. STUDYING RNA RECOMBINATION in Vitro

RNA recombination is a widespread phenomenon documented for RNA-containing viruses of any type of organism (108). It is believed that RNA recombination plays an important role in the evolution of viruses (152), and it has been suggested that RNA recombination could lead to gene formation reminiscent of the exon-intron structure of modern genes (153). So far, RNA recombination has been almost exclusively studied in vivo on viruses and defective interfering particles (108, 154). The ability of RQ RNAs to undergo recombination and to be cloned as colonies offers a unique opportunity for studying the mechanism of RNA recombination in vitro. The molecular-colony technique can allow the products of very rare recombination events to be identified against a nonrecombinant background, and to be isolated and analyzed. Carrying out experiments in vitro can answer questions such as whether recombination is performed by RNA replicase, by some cellular components, or by RNA molecules, and whether RNA recombination is a copy-choice or a breaking-rejoining process. Also, it can be very useful in direct testing of various models of RNA recombination that have been proposed (155-158).

2. ULTRASENSITIVE DIAGNOSTICS

The advances in nucleic-acid amplification by PCR (2, 159), self-sustained sequence replication (160), strand displacement amplification (161), and the Qβ replicase reaction have led to the invention of a series of novel diagnostic assays whose sensitivity is several orders of magnitude above that of immunodiagnostics. However, there is a major drawback common to all these techniques: the background synthesis of nucleic acids. Although these methods can detect just a few target molecules in model experiments employing purified targets, the sensitivity becomes much worse when the assay is carried out on biological samples in the presence of a great excess of irrelevant nucleic acids. The reason for the sensitivity loss is an elevated background level caused by the limited amplification specificity. In the case of Qβ replicase, the background is caused to a large extent by the spontaneous RNA synthesis discussed above. The background problem could be readily overcome by employing the molecular-colony technique. In this format, the target molecules and the nonspecifically amplified nucleic acids would produce separate colonies. The target-specific colonies could be distinguished from the background, for example, by hybridization with target-specific labeled probes. Such molecular-colony diagnostics are ideally suited for use in combination with the Qβ-replicase-based assays, whereby a replicable RNA reporter is generated in a target-dependent reaction utilizing the target RNA or DNA as a template (129). The molecular-colony technique would potentially provide for absolute diagnostics, as it is capable of detecting even a single target molecule or its fragment in the sample. Furthermore, it could be used to assess target titer directly by counting the number of positive colonies.
The technique would be of wide use in biomedical research for probing viruses, viroids, virusoids, bacteria, and other microorganisms, as well as their remains, and particular genes or their mRNAs in higher organisms. Samples could be taken from a variety of sources, such as biological fluid or tissue, raw water, or air.
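The idea of reading a target titer directly from the number of positive colonies can be sketched as follows. The estimate assumes that each positive colony arises from one target molecule, that a known fraction of targets gives rise to a colony (the hypothetical `colony_yield` parameter, analogous to the 10-40% yield reported above for seeded RQ RNA molecules), and that counting error is Poisson. All numbers in the example are invented for illustration.

```python
import math


def estimate_titer(positive_colonies, sample_volume_ml, colony_yield=1.0):
    """Return (targets per ml, approximate standard error) from a colony count.

    colony_yield is the assumed fraction of target molecules that form a colony.
    """
    if positive_colonies < 0:
        raise ValueError("colony count cannot be negative")
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    if not 0.0 < colony_yield <= 1.0:
        raise ValueError("colony_yield must be in (0, 1]")
    scale = 1.0 / (colony_yield * sample_volume_ml)
    titer = positive_colonies * scale
    std_err = math.sqrt(positive_colonies) * scale  # Poisson counting error
    return titer, std_err


# Hypothetical assay: 25 positive colonies from 0.1 ml, 25% colony yield
titer, err = estimate_titer(positive_colonies=25, sample_volume_ml=0.1, colony_yield=0.25)
print(f"estimated titer: {titer:.0f} +/- {err:.0f} targets per ml")
```

Because the relative counting error falls as one over the square root of the colony count, plating enough sample to obtain tens of colonies would be needed for a titer estimate better than a rough order of magnitude.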

3. CELL-FREE GENE CLONING AND PROTEIN ENGINEERING

As discussed above, RQ RNA recombinants carrying long mRNA inserts can be amplified efficiently by Qβ replicase if replication is coupled to translation (69). This opens the possibility for the cell-free cloning of genes. To this end, RQ mRNAs should be amplified in an immobilized medium


containing the components of replication and translation systems. RQ mRNA colonies could be screened in situ according to their ability to produce proteins that perform specific enzymatic reactions or bind specific ligands, antibodies, or antigens. Especially attractive is the use of this in vitro technique for protein engineering. Its advantages over in vivo cloning are the ease of colony screening and selection; the homogeneity of the expression products; and the possibility of obtaining proteins not compatible with a living cell, proteins or polypeptides that contain unnatural or modified amino acids, or proteins whose analogs are constitutively expressed in the cell.

4. SELECTION OF RIBOZYMES

Selection of efficient ribozymes is important for understanding the mechanisms of their action, and for numerous applications where a particular RNA target should be selectively destroyed without affecting other RNAs. At present, selection of RNA in vitro is carried out on large unseparated pools. It begins with generation of a complex pool of RNA variants initially containing molecules with the desired properties at a very low frequency. Several consecutive cycles of amplification in vitro and selection for the desired phenotype are performed to improve the average characteristics of the pool. Then the entire pool is reverse-transcribed into cDNAs and cloned in vivo. Finally, a number of individual clones are picked at random, amplified, isolated, and separately analyzed (162). The molecular-colony technique could allow the procedure to be significantly simplified and accelerated by cell-free cloning of the desired variants directly from the initial pool and testing of their activity in situ. To make a ribozyme amplifiable, it should be embedded within an RQ RNA molecule. Clones that possess desirable characteristics can then be amplified by Qβ replicase for further use.

5. EMPLOYING OTHER SYSTEMS OF EXPONENTIAL AMPLIFICATION

In addition to Qβ replicase, any enzyme system capable of exponentially amplifying nucleic acids in vitro should be suitable for nucleic-acid cloning in immobilized media, such as PCR (2, 159), self-sustained sequence replication (160), and strand-displacement amplification (161). In contrast to Qβ replicase, none of these systems is confronted with the template specificity problem, because any nucleic-acid template can be amplified by utilizing a pair of suitable primers. The amplification factors of each of these reactions (~10⁷) should ensure obtaining 10⁻¹⁷ mol of nucleic acid per colony, which exceeds the currently achievable detection limit of 10⁻¹⁸ to 10⁻¹⁹ mol of


DNA or RNA in dot blots (163). Temperature-resistant media, such as polyacrylamide gel, should be employed in the case of PCR.
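The per-colony estimate above follows from a simple unit conversion, sketched here. The calculation assumes that each colony starts from a single template molecule and reaches the quoted amplification factor; the function name is illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def copies_to_moles(copies):
    """Convert a number of nucleic-acid molecules to moles."""
    if copies < 0:
        raise ValueError("copies cannot be negative")
    return copies / AVOGADRO


# Amplification factor of about 10^7 from one template (figure from the text)
per_colony_mol = copies_to_moles(1e7)
DETECTION_LIMIT_MOL = 1e-18  # upper end of the dot-blot range quoted in the text

print(f"{per_colony_mol:.2e} mol per colony")
print(f"margin over a 1e-18 mol detection limit: {per_colony_mol / DETECTION_LIMIT_MOL:.1f}-fold")
```

On these assumptions a colony holds somewhat more than 10⁻¹⁷ mol, roughly an order of magnitude above the less sensitive end of the quoted detection range.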

V. Conclusion

It follows from the above consideration that RQ RNA vectors can be used for efficient amplification, expression, and even cloning of genes in vitro. This completes a basis for cell-free gene engineering. It can be used for all the same purposes as the traditional gene engineering that employs living cells, but it has a number of advantages over the in vivo techniques: it is 10 to 100 times faster; it does not require competent cells; it recovers a higher proportion of nucleic-acid molecules because there is no need for cell transformation and no penetration barriers exist; it allows genes to be cloned and expressed without any constraints from cellular control; and it permits the cloned nucleic acids to be screened, analyzed, and manipulated without prior isolation or purification. Because cloning, amplification, and expression of nucleic acids occur in an open system, are not mediated by the cell, and are not conditioned by cell survival, the medium composition and its physico-chemical parameters can be varied at will over a wide range. We believe that many fields of molecular biology and biotechnology will benefit from these new opportunities.

VI. Glossary

RQ RNAs: small nongenomic RNAs capable of amplification by Qβ replicase
MDV-1 (midivariant), WS-1 (nanovariant), MNV-11 (minivariant), and CT (cordycepin tolerant) RNAs: RQ RNAs of 223, 91, 87, and 77 nucleotides in length, respectively; proposed to be designated as RQ223, RQ91, RQ87, and RQ77 RNAs
HF: host factor; an E. coli protein required, in addition to Qβ replicase, for replication of the genomic RNA of phage Qβ
PCR: polymerase chain reaction; employs cyclic temperature changes and a thermostable DNA polymerase for primer-dependent amplification of a DNA sequence
SDA: strand-displacement amplification reaction; employs a 5' → 3' exonuclease-free DNA polymerase and a restriction endonuclease, producing nicks within oligonucleotide primers, to amplify a DNA sequence under isothermal conditions
3SR: self-sustained sequence replication reaction; employs a DNA-directed RNA polymerase, a reverse transcriptase, and RNase H for isothermal primer-dependent amplification of nucleic acids to produce a mixture of RNA and DNA
CFCF: cell-free continuous-flow reactor; used for large-scale synthesis of proteins in a cell-free translation system under continuous supply of substrates

ACKNOWLEDGMENTS

The authors thank H. V. Chetverina and A. G. Raiher for help in manuscript preparation. This work was supported by the Russian Academy of Sciences, by Grants 93-04-6638 and 93-04-6550 from the Russian Foundation for Basic Science, and by Grant MTO-000 from the International Science Foundation.

REFERENCES

1. J. Sambrook, E. F. Fritsch and T. Maniatis, “Molecular Cloning: A Laboratory Manual,” 2nd ed. CSH Lab Press, Cold Spring Harbor, NY, 1989.
2. N. Arnheim and H. Erlich, ARB 61, 131 (1992).
3. B. Roizman, in “Virology” (B. N. Fields, ed.), p. 69. Raven Press, New York, 1985.
4. I. Haruna and S. Spiegelman, PNAS 54, 579 (1965).
5. A. V. Munishkin, L. A. Voronin and A. B. Chetverin, Nature 333, 473 (1988).
6. C. Weissmann, M. A. Billeter, H. M. Goodman, J. Hindley and H. Weber, ARB 42, 303 (1973).
7. T. Blumenthal and G. Carmichael, ARB 48, 525 (1979).
8. C. K. Biebricher and M. Eigen, in “RNA Genetics” (E. Domingo, J. J. Holland and P. Ahlquist, eds.), p. 1. CRC Press, Boca Raton, FL, 1988.
9. J. van Duin, in “The Bacteriophages” (R. Calender, ed.), p. 117. Plenum, New York, 1988.
10. P. Mekler, Ph.D. thesis. University of Zurich, 1981.
11. M. Kondo, R. Gallerani and C. Weissmann, Nature 228, 525 (1970).
12. R. Kamen, Nature 228, 527 (1970).
13. T. Blumenthal, T. A. Landers and K. Weber, PNAS 69, 1313 (1972).
14. K. Arai, B. F. C. Clark, L. Duffy, M. D. Jones, Y. Kaziro, R. A. Laursen, J. L’Italien, D. L. Miller, S. Nagarkatti, S. Nakamura, K. M. Nielsen, T. E. Petersen, K. Takahashi and M. Wade, PNAS 77, 1326 (1980).
15. G. An, D. S. Bendiak, L. A. Mamelak and J. D. Friesen, NARes 9, 4163 (1981).
16. R. Kamen, M. Kondo, W. Romer and C. Weissmann, EJB 31, 44 (1972).
17. H. Inouye, Y. I. Pollack and J. Petre, EJB 45, 109 (1974).
18. A. J. Wahba, M. J. Miller, A. Niveleau, T. A. Landers, G. C. Carmichael, K. Weber, D. A. Hawley and L. I. Slobin, JBC 249, 3314 (1974).
19. J. Schnier, M. Kimura, K. Foulaki, A. R. Subramanian, K. Isono and B. Wittmann-Liebold, PNAS 79, 1008 (1982).
20. R. Kamen, BBA 262, 88 (1972).
21. T. Blumenthal, Methods Enzymol. 60, 628 (1979).

REPLICABLE RNA VECTORS

267

22. N. H. Berestowskaya, V. D. Vasiliev, A. A. Volkov and A. B. Chetverin, FEBS Lett. 228, 263 (1988). 23. Y. Kaziro, BBA 505, 95 (1978). 24. W. Szer and S. Leffler, PNAS 71, 3611 (1974). 25. J. M. Hermoso and W. Szer, PNAS 71, 4708 (1974). 26. G . van Dieijen, P. H. van Knippenberg and J. van Duin, EJB 64, 511 (1976). 27. T. A. Landers, T. Blumenthal and K. Weber, JBC 249, 5801 (1974). 28. M. Kozak and D. Nathans, Bacterial. Reu. 36, 109 (1972). 29. K. Weber and W. Konigsberg, in “RNA Phages” (N. D. Zinder, ed.), p. 51. CSH Lab Press, Cold Spring Harbor, NY, 1975. 30. M. R. Capecchi and R. E. Webster, in “RNA Phages” (N. D. Zinder, ed.), p. 279. CSH Lab Press, Cold Spring Harbor, NY, 1975. 31. M. L. Stewart, A. P. Grollman and M.-T. Huang, PNAS 68, 97 (1971). 32. L. A. Ball and P. Kaesberg, J M B 74, 574 (1973). 33. H. D. Robertson, in “RNA Phages” (N. D. Zinder, ed.), p. 113. CSH Lab Press, Cold Spring Harbor, NY, 1975. 34. A. Palmenberg and P. Kaesberg, J. Virol. 11, 603 (1973). 35. P. N. Shaklee, J. J. Miglietta, A. C. Palmenberg and P. Kaesberg, Virology 163, 209 (1988). 36. H. V. Chetverina, Ph.D. thesis. Institute of Protein Research, Pushchino, 1994. 37. T. S. Eikhom and S. Spiegelman, PNAS 57, 1833 (1967). 38. T. S . Eikhom, D. Stockley and S. Spiegelman, PNAS 59, 527 (1968). 39. M. T. Franze de Fernandez, L. Eoyang and J. T. August, Nature 219, 588 (1968). 40. M. T. Franze de Fernandez, W. S. Hayward and J. T. August, JBC 247, 824 (1972). 41. G . G . Carmichael, K. Weber, A. Niveleau and A. J. Wahba, JBC 250, 3607 (1975). 42. M. Kajitani and A. Ishihama, NARes 19, 1063 (1991). 43. M. S. DuBow, T. Ryan, R. Young and T. Blumenthal, MGG 153, 39 (1977). 44. A. W. Senear and J. A. Steitz, JBC 251, 1902 (1976). 45. M. A. Billeter, JBC 253, 8381 (1978). 46. I. Barrera, D. Schuppli, J. M. Sogo and H. Weber, JMB 232, 512 (1993). 47. J. T.August, A. K. Banerjee, L. Eoyang, M. T. Franze de Fernandez, K. Hori, C. H. Kuo, U. Rensing and L. 
Shapiro, CSHSQB 33, 73 (1968). 48. C. Weissmann, G. Feix and H. Slor, C S H S Q B 33, 83 (1968). 49. S. Spiegelman, N. R. Pace, D. R. Mills, R. Levinsohn, T. S. Eikhom, M. M. Taylor, R. L. Peterson and D. H. L. Bishop, CSHSQB 33, 101 (1968). 50. A. K. Banerjee, C. H. Kuo and J. T. August, JMB 40, 45 (1969). 51. H. Weber and C. Weissmann, J M B 51, 215 (1970). 52. M. A. Billeter, J. E. Dahlberg, H. M. Goodman, J. Hindley and C. Weissmann, Nature 224, 1083 (1969). 53. H. M. Goodman, M. A. Billeter, J. Hindley and C. Weissmann, PNAS 67, 921 (1970). 54. U . Rensing and J. T. August, Nature 224, 853 (1969). 55. I. Haruna and S. Spiegelman, PNAS 54, 1189 (1965). 56. A. K. Banerjee, L. Eoyang, K. Hori and J. T. August, PNAS 57, 986 (1967). 57. G . Feix and H. Hake, BBRC 65, 503 (1975). 58. G. Feix and H. Sano, EJB 58,59 (1975). 59. C. Weissmann, G . Feix, H. Slor and R. Pollet, PNAS 57, 1870 (1967). 60. C. K. Biebricher, S. Diekmann and R. Luce, JMB 154, 629 (1982). 61. G. Feix, H. Slor and C. Weissmann, PNAS 57, 1401 (1967). 62. T. S. Eikhom, JMB 93, 99 (1975). 63. C. Weissmann, L. Colthart and M. Libonati, Bchem 7, 865 (1968).

268

ALEXANDER B. CHETVERIN AND ALEXANDER S. SPIRIN

64. I. Haruna and S. Spiegelman, Science 150, 884 (1965). 65. C. K. Biebricher, M. Eigen and W. C. Gardiner, Bchem 22, 2544 (1983). 66. R. Kamen, in “RNA Phages” (N. D. Zinder, ed.), p. 203. CSH Lab Press, Cold Spring Harbor, NY, 1975. 67. M. A. Billeter, M. Libonati, E. Vitiuela and C. Weissmann, J B C 241, 4750 (1966). 68. T. Blumenthal, PNAS 77, 2601 (1980). 69. I. Yu. Morozov, V. 1. Ugarov, A. B. Chetverin and A. S. Spirin, PNAS 90, 9325 (1993). 70. K. Hori, L. Eoyang, A. K. Banerjee and J. T. August, PNAS 57, 1790 (1967). 71. Y. Mitsunari and K. Hori, J . Biochem. (Tokyo) 74, 263 (1973). 72. G. Feix and H. Sano, FEBS Lett.63, 201 (1976). 73. B. Kiippers and M. Sumper, PNAS 72, 2640 (1975). 74. A. V. Munishkin, L. A. Voronin, V. I. Ugarov, L. A. Bondareva, H. V. Chetverina and A. B. Chetverin, ] M B 221, 463 (1991). 75. A. Palmenberg and P. Kaesberg, PNAS 71, 1371 (1974). 76. R. A. Owens and T. 0 . Diener, Virology 79, 109 (1977). 77. G . Feix, Nature 259, 593 (1976). 78. J. N. Vournakis, G . G . Carmichael and A. Efstratiadis, B B R C 70, 774 (1976). 79. M. Obinata, D. S. Nasser and B. J. McCarthy, B B R C 64, 640 (1975). 80. C. Priano, F. R. Kramer and D. R. Mills, C S H S Q B 52, 321 (1987). 81. V. D. helrod, E. Brown, C. Priano and D. R. Mills, Virology 184, 595 (1991). 82. H. J. Gross, H. Domdey. C. Lossow, P. Jank, M. Raba, H. Alberty and H. L. Siinger, Nature 273, 203 (1978). 83. I. Haruna, Y. H. Itoh, K. Yamane, T. Miyake, T. Shiba and I. Watanabe, PNAS 68, 1778 (1971). 84. T. Miyake, I. Haruna, T. Shiba, Y. H. Itoh, K. Yamane and I. Watanabe, PNAS 68, 2022 (1971). 85. T Yonesaki, K. Furuse, I. Haruna and I. Watanabe, Virology 116, 379 (1982). 86. D. R. Mills, R. I. Peterson and S. Spiegelman, PNAS 58, 217 (1967). 87. R. Levisohn and S. Spiegelman, PNAS 60, 866 (1968). 88. R. Levisohn and S. Spiegelman, PNAS 63, 805 (1969). 89. A. K. Banerjee, U. Rensing and J. T August, J M B 45, 181 (1969). 90. D. L. Kacian, D. R. Mills, F. R. Kramer and S. 
Spiegelman, PNAS 69, 3038 (1972). 91. D. R. Mills, F. R. Kramer, C. Dobkin, T. Nishihara and S. Spiegelman, PNAS 72, 4252 (1975). 92. W. Schdner, K. J. Riiegg and C. Weissmann, J M B 117, 877 (1977). 92a. A. V. Munishkin, L. A. Voronin and A. B. Chetverin, in “Protein Structure and Biosynthesis” (A. S. Spirin, ed.), p. 35. Institute of Protein Research, Pushchino, 1987. 93. D. R. Mills, F. R. Kramer and S. Spiegelman, Science 180, 916 (1973). 94. C. K. Biebricher, C S H S Q B 52, 299 (1987). 95. M. Sumper and R. Luce, PNAS 72, 162 (1975). 96. D. Hill and T. Blumenthal, Nature 301, 350 (1983). 97. C. K. Biebricher, M. Eigen and R. Luce, Nature 321, 89 (1986). 98. T. Blumenthal and T. A. Landers, Bchem 15, 422 (1976). 99. C. K. Biebricher, M. Eigen and R. Luce, J M B 148, 369 (1981). 100. E. A. Miele, D. R. Mills and F. R. Kramer, ] M B 171, 281 (1983). 101. P. M. Lizardi, C. E. Guerra, H. Lomeli, I. Tussie-Luna and F. R. Kramer, BiolTechnology 6, 1197 (1988). 102. C. K. Biebricher and R. Luee, EMBO ]. 11, 5129 (1992). 103. A. B. Chetverin. H. V. Chetverina and A. V. Munishkin, ] M B 222, 3 (1991). 104. T. E. England and 0 C Uhlenbeck, Nature 275, 560 (1978).

REPLICABLE RNA VECTORS

269

105. I. Watanabe, Nihon Rinsho 22, 243 (1964). 106. C. Weissmann and G. Feix, PNAS 55, 1264 (1966). 107. K. Horiuchi, in “RNA Phages” (N. D. Zinder, ed), p. 29. CSH Lab Press, Cold Spring Harbnr. NY, 1975. 108. M. M . C. Lai, Microbiol. Reu. 56, 61 (1992). 109. K. Palasingam and P. N. Shaklee, J . Virol. 66,2435 (1992). 110. L. Mindich, X. Qiao, S. Onodera, P. Gottlieb and J. Strassman, J . Virol. 66, 2605 (1992). 111. A. B. Chetverin, L. A. Voronin, A. V. Munishkin, L. A. Bondareva, H. V. Chetverina and V. I. Ugarov, in “Frontiers in Bioprocessing 11” (P. Todd, S. K. Sikdar and M.Bier, eds.), p. 44.American Chemical Society, Washington, DC, 1992. 111a. L. Mindich, I. Nemhauser, P. Gottlieb, M. Romantschuk, J. Carton, S. Frucht, J. Strassman, D. H. Bamford and N . Kalkkinen, J . Virol. 62, 1180 (1988). 112. C. K. Biebricher, M. Eigen and R. Luce, J M B 148, 391 (1981). 113. J. S. McCaskill and G . J. Bauer, PNAS 90,4191 (1993). 114. C. K. Biebricher and R. Luce, Bchem 32, 4848 (1993). 115. P. W. Trown and P. L. Meyer, ABB 154, 250 (1973). 116. D. R. Mills, T. Nishihara, C. Dobkin, F. R. Kramer, P. E. Cole and S. Spiegelman, in “Nucleic Acid-Protein Recognition” (H. J. Vogel, ed.), p. 533. Academic Press, New York, 1977. 117. T. Nishihara, D. R. Mills and F. R. Kramer, J . Biochem. (Tokyo)93, 669 (1983). 118. D. R. Mills, C. Dobkin and F. R. Kramer, Cell 15, 541 (1978). 119. A. B. Chetverin and L. A. Voronin, in “Protein Structure and Biosynthesis” (A. S. Spirin, ed.), p. 43. Institute of Protein Research, Pushchino, 1987. 120. A. Rich and U. L. RajBhandary, ARB 45, 805 (1976). 121. F. Meyer, H. Weher and C. Weissmann, J M B 153, 631 (1981). 122. D. R. Mills, J M B 200, 489 (1988). 123. Y. Wu, D. Y. Zhang and F. R. Kramer, PNAS 89, 11769 (1992). 124. B. C. F. Chu, F. R. Kramer and L.. E. Orgel, NARes 14, 5591 (1986). 125. H. Lomeli, S. Tyagi, C. G. Pritchard, P. M. Lizardi and F. R. Kramer, Clin. Chem. 35, 1826 (1989). 126. J. D. Klinger and C. G. Pritchard, Clin. 
Microbiol. Newsl. 12, 133 (1990). 127. P. Cahill, K. Foster and D. E. Mahan, Clin. Chem. 37, 1482 (1991). 128. C. G. Pritchard and J. E. Stefano, Med. Virol. 10, 67 (1991). 129. F. R. Kramer and P. M. Lizardi, Nature 339, 401 (1989). 130. D. Kolakofsky and C. Weissmann, Nature N B 231, 42 (1971). 131. M. W. Nirenberg, 0. W. Jones, P. Leder, B. F. C. Clark, W. S. Sly and S. Pestka, CSHSQB 28, 549 (1963). 132. M. F. Singer, 0. W. Jones and M. W. Nirenberg, PNAS 49, 392 (1963). 133. M. Takami and T. Okamoto, J M B 7, 323 (1963). 134. V. I. Ugarov, I. Yu. Morozov, G. Y.Jung, A. B. Chetverin and A. S. Spirin, FEBS Lett. 341, 131 (1994). 135. J. G . Belasco and C. F. Higgins, Gene 72, 15 (1988). 136. R. S. McLaren, S. F. Newbury, G . S. C. Dance, H. C. Causton and C. F. Higgins, J M B 221, 81 (1991). 137. I. Hirao, S. Yoshikawa and K. Miura, FEBS Lett. 321, 169 (1993). 138. L. Ryabova, E. Volianik, 0. Kurnasov, A. S. Spirin, Y. Wu and F. R. Kramer, JBC 269, 1501 (1994). 139. I. V. Boni and D. M. Isaeva, Dokl. Akad. Nauk S S S R 298, 1015 (1988). 140. I. V. Boni, D. M. Isaeva, M. L. Musychenko and N. V. Tzareva, NARes 19, 155 (1991). 141. S. Brown and T. Blumenthal, JBC 251, 2749 (1976).

270

ALEXANDER B . CHETVERIN AND ALEXANDER S. SPIRIN

E. Batschelet, E. Domingo and C. Weissmann, Gene I, 27 (1976). E. Domingo, D. Sabo, T. Tanigushi and C. Weissmann, Cell 13, 735 (1978). J. W. Drake, PNAS 90, 4171 (1993). P. Edelrnan and J. Gallant, Cell 10, 131 (1977). A. S. Spirin, V. 1. Baranov, L. A. Ryahova, S. Yu. Ovodov and Yu. 8 . Alakhov, Science 242, 1162 (1988). 147. V. I. Baranov, I. Yu. Morozov, S . A. Ortlepp and A. S. Spirin, Gene 84, 463 (1989). 148. V. 1. Baranov and A. S. Spirin, Methods Enzymol. 217, 123 (1993). 149. A. S. Spirin, in “Frontiers in Bioprocessing 11” (P. Todd, S. K. Sikdar and M. Bier, eds.), p. 31. American Chemical Society, Washington, DC, 1992. 150. H. V. Chetverina and A. B. Chetverin, NARes 21, 2349 (1993). 151. W. Paranchych. in “RNA Phages” (N. D. Zinder, ed.), p. 85. CSH Lab Press, Cold Spring Harbor, NY, 1975. 152. A. M. Q . King, in “RNA Genetics” (E. Domingo, J. 1. Holland and P. Ahlquist, eds.), p. 149. CRC Press, Boca Raton, FL, 1988. 153. W. Gilbert, Nature 319, 613 (1986). 154. S. Schlesinger, in ”RNA Genetics” (E. Domingo, J. J. Holland and P. Ahlquist, eds.), p. 167. CRC Press, Boca Raton, FL, 1988. 155. L. I. Romanova, V. M. Blinov, E. A. Tolskaya, E. G. Viktorova, M. S. Kolesnikova, E. A. Guseva and V. I. Agol, Virology 155, 202 (1986). 156. S. Kuge, I. Saito and A. Nomoto, J M B 192, 473 (1986). 157. K. Kirkegaard and D. Baltimore, Cell 47, 433 (1986). 158. A. M. Q . King, NARes 16, 11705 (1988). 159. R. K. Saiki, S. Scharf, F. Faloona, K. B. Mullis, G. T. Horn, H. A. Erlich and N. h n h ei m , Science 230, 1350 (1985). 160. J. C. Guatelli, K. M. Whitfield, D. Y. Kwoh, K. J. Barringer, D. D. Richrnan and T. R. Gingeras, PNAS 87, 7797 (1990). 161. G. T. Walker, M. S. Fraiser, J. L. Schram, M. C. Little, J. C . Nadeau and D. P. Malinowski, NARm 20, 1691 (1992). 162. J. M. Burke and A. Berzal-Herranz, FASEB /. 7, 106 (1993). 163. U . Landegren, R. Kaiser, C. T Caskey and L. Hood, Science 242, 229 (1988).

142. 143. 144. 145. 146.

Examination of Mitotic Recombination by Means of Hyper-recombination Mutants in Saccharomyces cerevisiae

HANNAH L. KLEIN

Department of Biochemistry and Kaplan Comprehensive Cancer Center
New York University Medical Center
New York, New York 10016

I. Review of Mitotic Recombination in Yeast
II. Isolation of Hypo-recombination and Hyper-recombination Mutants
    A. Types of Hypo-rec and Hyper-rec Mutants
    B. Screens and Selections for Hyper-rec Mutants
    C. Characterization of Yeast hpr Mutants
III. Concluding Statements
IV. Glossary
References

Genetic Nomenclature

The following nomenclature has been adopted for genes and proteins of Saccharomyces cerevisiae:

LEU2        a locus or gene; this is used to refer to the wild-type gene or the locus when no genotype is specified
leu2        any recessive mutation at the LEU2 locus
leu2-112    a specific recessive allele or mutation
Leu2p       the protein encoded by the LEU2 gene; sometimes also designated as the Leu2 protein
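The naming convention above lends itself to a small illustration. The following is a minimal sketch in Python, assuming gene names of three letters followed by a number and purely numeric allele designations; the function name `classify_symbol` and the category labels are hypothetical and are not part of the chapter.

```python
import re

# Each pattern mirrors one of the four forms in the nomenclature list.
# Assumption: three-letter gene name + number; allele suffix is numeric only.
_PATTERNS = [
    ("wild-type gene or locus", re.compile(r"[A-Z]{3}\d+")),        # e.g. LEU2
    ("specific recessive allele", re.compile(r"[a-z]{3}\d+-\d+")),  # e.g. leu2-112
    ("recessive mutation", re.compile(r"[a-z]{3}\d+")),             # e.g. leu2
    ("protein", re.compile(r"[A-Z][a-z]{2}\d+p")),                  # e.g. Leu2p
]


def classify_symbol(symbol: str) -> str:
    """Return the nomenclature category of a yeast gene or protein symbol."""
    for label, pattern in _PATTERNS:
        # fullmatch requires the entire symbol to fit the pattern
        if pattern.fullmatch(symbol):
            return label
    return "unrecognized"


if __name__ == "__main__":
    for s in ("LEU2", "leu2", "leu2-112", "Leu2p"):
        print(s, "->", classify_symbol(s))
```

Symbols that fall outside these four forms (for example, a dominant allele or a spelled-out "Leu2 protein") are reported as unrecognized rather than guessed at.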

I. Review of Mitotic Recombination in Yeast

Eukaryotic organisms have two types of cell division: a mitotic division that forms two identical daughter cells, and a meiotic division that forms four nonidentical haploid products from one diploid cell. Recombination is an essential feature of the meiotic cell division. The physical joining of nonsister chromatids ensures the correct alignment and segregation of homologous chromosomes in the meiosis-I division. Because recombination is required for faithful chromosome segregation during meiosis, mutants that reduce or alter recombination often give no viable meiotic products. Thus, genetic studies to assess the effects of mutants on meiotic recombination must resort to strategies that interrupt the meiotic process or bypass the first meiotic division, which is the division that requires the successful completion of recombination.

An alternative approach is to study these same mutants genetically during mitosis. Again there are limitations. First, the gene in question may be expressed only in meiosis. Second, mitotic recombination occurs spontaneously at rates that are several orders of magnitude lower than those of meiotic recombination. This means that mitotic recombination events must be selected. Third, although many features of meiotic recombination are found also in mitotic recombination, there are some critical dissimilarities. For example, mitotic recombination occurs in the absence of any synaptonemal complex. Indeed, there are aspects of mitotic recombination that are unique to the vegetative cycle. The initiation events and sites for mitotic recombination are probably much more varied than those involved in meiotic recombination. Last, a mutant may display a different recombination behavior in mitosis compared to meiosis. For example, the rad50 null mutant shows a slight hyper-recombination phenotype in mitosis, but eliminates meiotic recombination. These problems notwithstanding, studies of mitotic recombination have yielded valuable information.

Mitotic recombination occurs primarily in response to DNA damage. It is one of several mechanisms used by the cell to repair DNA damage.
Cells experience a variety of DNA-damaging events, not all of which result in substrates that are efficiently used by the mitotic recombination pathway(s). However, one of these substrates, the double-strand break (DSB), is repaired almost exclusively by mitotic recombination, and the generation of a DSB in a recombination-deficient strain is a lethal event.

Mitotic recombination also is important for effecting rapid but transitory genetic changes. Some of these changes confer a selective advantage on the cell, as occurs in antigenic variation in trypanosomes. In other cases, mitotic recombination can lead to loss of heterozygosity (LOH) and uncover a recessive mutation in a growth-control gene. Such events are frequently found in tumor cells and are thought to be the causative event in unmasking the recessive phenotype of a deleterious mutation.

Both meiotic and mitotic recombination take place between homologous chromosomes and between repeated sequences. The repeated sequences may be tandemly arranged or may belong to a dispersed repeat family. However, the recombination events are homologous in all cases.


A recent comprehensive review of mitotic recombination in the yeast Saccharomyces cerevisiae is available (1), and recent short reviews discuss specific aspects of recombination (2-4). General features of mitotic recombination are presented here and the reader is referred to the review articles for further details.

Mitotic recombination was first studied in diploid strains bearing heterozygous markers at several loci. Recombinants are detected as prototrophic segregants that arise primarily from intragenic recombination events, which are gene conversions. Reciprocal recombination, or crossing-over, between a gene and the centromere is detected by the homozygosis of a heterozygous marker. This can be seen by the phenotypic unmasking of a recessive auxotrophy, but such events cannot be readily selected unless the recessive marker confers a resistance phenotype, such as cycloheximide or canavanine resistance. Because the spontaneous recombination rate is low, stimulatory agents such as UV light, ionizing radiation, or chemical treatments that introduce nicks or breaks in the DNA have been used to increase recombination events.

Such studies reveal differences between mitotic and meiotic recombination. First, mitotic recombination events often occur in the G1 phase of the cell cycle prior to the initiation of DNA replication. In contrast, meiotic recombination occurs at the four-strand stage after DNA replication. Recombination can occur in mitosis after DNA replication, but these are usually DNA damage-induced events. The damage-induced events can take place in either G1 or G2 phase, whereas spontaneous events occur in G1.

Meiotic gene conversion events within a gene often have a distinct pattern. Different alleles have different conversion frequencies and form a gradient of conversion frequencies, with one end of the gene having the highest frequency and the other end having the lowest frequency. This phenomenon is termed polarity.
The high end reflects a fixed initiation site, often a double-strand break, for recombination. Mitotic gene conversion events, both spontaneous and induced, do not show polarity. Instead, allele conversion frequencies are similar. This suggests that no fixed initiation site is used for mitotic gene conversion. In support of this view is the observation that deletion of the ARG4 meiotic DSB site eliminates the polarity gradient at that gene, but does not eliminate all gene conversions at ARG4. The remaining gene conversion events have similar frequencies at all sites within the gene, a pattern that resembles mitotic gene conversion (5). Elimination of the meiotic hotspot at HIS4 also eliminates the polarity gradient (6).

Meiotic gene conversions are associated with crossing-over in flanking regions approximately 50% of the time. Mitotic gene conversions also are associated with crossing-over in flanking regions, although the association values show a wider range, from 10 to 55% (1). Mitotic events selected first as cross-overs are associated with gene conversions 50% of the time (7, 8). Either the two events are independent, but concerted in some fashion, or conversions may be a prelude to cross-overs, as is proposed for meiotic recombination.

In meiosis, markers that are several hundred base-pairs apart show coincident gene conversion, indicating repair of a heteroduplex region of at least that size. In mitosis, coincident conversion of markers up to 36 kb apart has been observed (9). Although these long-tract conversions usually reflect concerted multiple events, recent experiments suggest that some long coincident conversions result from conversion-associated chromosome breakage with subsequent repair using another chromatid (10).

II. Isolation of Hypo-recombination and Hyper-recombination Mutants

Mitotic recombination usually occurs at a site of DNA damage. Because recombination is one of the mechanisms used to repair DNA damage, and more specifically double-strand breaks, mutants that are deficient in the repair of double-strand breaks often show reduced recombination levels. However, this depends on the recombination assay (discussed in Sections II,A,1 and II,A,2). In general, hypo-rec mutants are defective in some aspect of the recombination process and have reduced recombination rates. This is sometimes accompanied by increased sensitivity to DNA-damaging agents. In contrast, hyper-rec mutants produce more substrate for recombination, either through the induction of DNA lesions or through the inhibition of a repair pathway that is usually not a recombinational repair pathway. Alternatively, hyper-rec mutants may be altered in a function that normally suppresses mitotic recombination. The hyper-rec mutant phenotype is manifested as an increased spontaneous recombination rate.
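The distinction drawn above between hypo-rec and hyper-rec mutants amounts to comparing a mutant's recombination rate with the wild-type rate in the same assay. A minimal sketch in Python follows; the function name, the labels, and the default 2-fold cutoff are illustrative assumptions, since the text does not define a numerical threshold.

```python
def classify_recombination(mutant_rate: float,
                           wild_type_rate: float,
                           threshold: float = 2.0) -> str:
    """Label a mutant by its recombination rate relative to wild type.

    threshold is an assumed fold-change cutoff (not taken from the text):
    rates at least `threshold`-fold above wild type are called hyper-rec,
    rates at least `threshold`-fold below are called hypo-rec.
    """
    if wild_type_rate <= 0:
        raise ValueError("wild-type rate must be positive")
    if mutant_rate < 0:
        raise ValueError("mutant rate cannot be negative")
    if threshold <= 1:
        raise ValueError("threshold must be greater than 1")

    fold = mutant_rate / wild_type_rate
    if fold >= threshold:
        return "hyper-rec"
    if fold <= 1.0 / threshold:
        return "hypo-rec"
    return "wild-type-like"


if __name__ == "__main__":
    # Hypothetical rates (events per cell per generation), for illustration only.
    print(classify_recombination(1e-5, 1e-6))  # 10-fold increase
    print(classify_recombination(1e-8, 1e-6))  # 100-fold reduction
```

Because a given mutant can behave differently in different assays, such a comparison is only meaningful when both rates come from the same recombination substrate.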

A. Types of Hypo-rec and Hyper-rec Mutants

This section summarizes the types of mutants identified in both yeast and Escherichia coli that have reduced (hypo-rec) or increased (hyper-rec) recombination rates.

1. GENES INVOLVED IN VARIOUS ASPECTS OF DNA METABOLISM

The rec mutants of E. coli were isolated from screens for mutants with reduced conjugational recombination. The affected genes encode functions that are required for the recombinational process. Other mutants, such as the ruv mutants, were first identified on the basis of increased sensitivity to mitomycin C or reduced recombination in merodiploids. Several recent reviews describe these mutants in more detail (11-13). Bacterial hyper-rec mutants have been found in genes that function in DNA replication and DNA repair. These include mutations in uvrD (DNA helicase II), uvrA (excinuclease), mutS, mutL, mutH (mismatch repair), dam (DNA adenine methylase), lig (DNA ligase), xthA (exonuclease III), topB/mutR (DNA topoisomerase III), dut (deoxyuridine triphosphatase), and polA (DNA polymerase I) (14-18).

In yeast, the RAD genes (genes involved in the repair of DNA damage) have been the major source of mutants that directly affect the recombination process. Most of the rad mutants in the RAD52 epistasis group affect both meiotic and mitotic recombination, although there is no direct correlation between the effects on mitotic and meiotic recombination. The rad51, rad52, and rad54 mutants affect mitotic recombination (1, 19), but the effect of each mutant is very dependent on the recombination assay. For example, rad52 mutants show a 10^-2 (100-fold) reduction in spontaneous gene conversion events (19-21), but are not greatly reduced in spontaneous deletion events between direct repeats (22, 23) or in HO-induced recombination (a specific DSB event) in direct repeats (24, 25). The rad51 mutants also show 10^-2 (100-fold) reduced levels of spontaneous mitotic gene conversion (19), but have 10-fold increased rates for deletions between direct repeats (26, 27). The rad54 mutants reduce both spontaneous and induced mitotic recombination (19), but have not been reported to affect meiotic recombination significantly (28).

Recently, screens have been devised to recover mutants that specifically affect meiotic recombination. These mutants are called mei (29), mre (30), and rec (31). Other meiotic-specific Rec- mutants include spo11 (32), mer1 (33), red1 (34), and hop1 (35). Some of these mutants affect the synaptonemal complex structure and may not be directly involved in the recombination reaction per se.

RAD genes in other DNA repair epistasis groups also affect mitotic recombination. The RAD1 gene product functions as an endonuclease in conjunction with the RAD10 gene product in excision repair. Mutant strains show no gross changes in spontaneous mitotic recombination, but specific recombination products from recombination between direct repeats are reduced (23, 36-40).

Because recombination is usually not an essential process in the mitotic cell cycle, and in fact is stimulated by the presence of DNA lesions, many mutants that accumulate nicks or gaps in the DNA template have a mitotic hyper-rec phenotype. Table I lists groups of yeast mutants that increase some form of mitotic recombination. Most of these genes can be placed in known DNA metabolic functions, and all of the mutants, with the exception of the pms1, pms2, pms3, top3, srs2, hpr1, hpr2, hpr3, hpr4, hpr6, hpr7, hpr8, pkc1, MIC, rec, and rrm1-rrm3 mutants, were first identified on the basis of an alternate phenotype unrelated to mitotic recombination. The list of mutants is remarkably similar to the hyper-rec mutants identified in E. coli.

TABLE I
MITOTIC HYPER-REC MUTANTS OF Saccharomyces cerevisiae

Gene function                       Genes
Excision repair                     RAD3 rem1 alleles, rad1, rad4, rad24
Error-prone repair                  srs2(hpr5), rad6, rad18
Other repair                        mms9, mms13, mms21
Replication and cell cycle (G2)     cdc17(pol1/hpr3/rrm1), cdc2(pol3/hpr6/tex2), cdc5, cdc6, cdc8, cdc9,
                                    cdc13, cdc14, cdc15, cdc16, cdc20, cdc23, cdc44, cdc45, cdc46, cdc54,
                                    pri1, pri2, rfa1(srr1 allele), ctf4(chl15/pob1)
Mismatch repair                     pms1, pms2, pms3
Recombination repair                rad50, xrs2, rad51, rad54, rad55, rad57
Encode topoisomerase                top1, top2, top3
Other functions                     hpr1, hpr2, hpr4, hpr7, hpr8, soh1, slg10, sir2, pkc1, rrm2, rrm3, sgs1,
                                    rec46, rec141, rec146, rec193, rec199, rec276, rec395, rec409, rec952, MIC

2. GENES NOT KNOWN TO BE DIRECTLY INVOLVED IN DNA METABOLISM

Although most hyper-rec mutations affect functions known to be involved in DNA replication or DNA repair, some mutants do not have any known direct role in nucleic-acid metabolism. These mutants are listed at the end of Table I. They include the pkc1 (41), rec (42), MIC (43), hpr (44), and rrm (45) mutants, which were isolated on the basis of a hyper-rec phenotype but have no noticeable DNA repair defect such as UV, X-ray, or methyl methane sulfonate sensitivity. Most of these genes have not been cloned. In the cases of HPR1 and SOH1, wherein the gene has been cloned, the deduced protein sequence has partial homology to a protein known to function in DNA metabolism, but no activity has yet been demonstrated for the novel protein.

The HPR1 gene shows significant homology to DNA topoisomerases (46) and has a genetic interaction with yeast topoisomerase mutants. The HPR1 gene is not required for normal growth at 30°, nor is TOP1 an essential gene. However, the hpr1 top1 double mutant shows a synthetic growth phenotype. Similarly, hpr1 top2 and hpr1 top3 double mutants are synthetically lethal under permissive conditions for the single mutants.

SOH1 was identified as a suppressor of the temperature-conditional growth of an hpr1 mutant strain (26). The SOH1 gene product has some homology to DNA- and RNA-directed RNA polymerases and is not essential. Both the soh1-1 point mutant and a null allele strain show an increase in deletion events in a direct repeat, with the null allele increasing the recombination rate 10-fold. This hyper-recombination is RAD52 and RAD1 dependent, as is the case for hyper-deletion events in other yeast mutants.

The slg10 mutant was identified in an hpr1 synthetic lethal screen. The slg10 mutant shows a 10-fold increase in deletion events in a direct repeat (D. Dimova and H. Klein, unpublished) and this hyper-recombination is dependent on RAD52 and RAD1 functions. The identity of the SLG10 gene is unknown.

The SIR2 gene is required for transcriptional repression of the HML and HMR mating cassettes. The sir2 mutants show increased recombination rates at the rDNA locus, similar to top1, top2, and top3 mutants, but the SIR2 gene product has no homology to topoisomerases, nor has any topoisomerase activity been demonstrated for the Sir2 protein (47). Overexpression of SIR2 results in increased histone deacetylation in vivo (48). Whether Sir2p functions directly or indirectly in nucleosome hypoacetylation is unknown, and it is unknown how this activity relates to increased rDNA recombination. Presumably, alteration of the nucleosome structure at the rDNA locus in sir2- strains renders the DNA more readily damaged or more accessible to the recombination enzymes.

The PKC1 gene encodes the yeast protein kinase C and has an essential role in the regulation of growth. Conditional PKC1 alleles enhance mitotic recombination.
Both gene conversion events and deletions between direct repeats are increased in mutant strains (41). These results suggest that PKC1 regulates the activity of genes involved in DNA metabolism. Reduced activity in a pkc1 mutant would result in increased DNA lesions and hyper-recombination.

The rrm1, rrm2, and rrm3 mutants were isolated on the basis of increased recombination within the rDNA locus (45). The rrm1 and rrm2 mutants also increase deletions in a direct repeat of LYS2 genes and have a mutator phenotype. rrm1 is an allele of CDC17. The hyper-rec and mutator phenotypes of the rrm1 mutant have also been observed in other alleles of the CDC17 gene. The rrm2 mutant has a phenotype similar to that of other mutants defective in DNA replication functions and is a temperature-conditional mutant, but has not been found to be allelic to any CDC gene that increases mitotic recombination when mutated. Most likely, RRM2 will turn out to be involved in DNA replication.


B. Screens and Selections for Hyper-rec Mutants

Several different screens for hyper-rec mutants of E. coli and S. cerevisiae have been used. Screening for increased recombination between two homologous DNA molecules or between direct repeats has been productive. Another approach has been to isolate suppressors of Rec- mutants. Often these mutants have no effect in a wild-type (Rec+) background, but occasionally such mutants display a hyper-rec phenotype on their own. Still other mutants have been isolated on the basis of another defect, such as reduced DNA replication or repair (see Sections II,A,1 and II,A,2), and were subsequently shown to have a hyper-rec phenotype.

1. BACTERIAL MUTANTS

Increase in intragenic recombination in Hfr × F- crosses identified several mutants, all of which had the additional property of increasing spontaneous mutations (15). These mutants are alleles of the mismatch repair genes mutS and mutL. Other mut mutants, including uvrD(mutU) and mutH, were then examined for hyper-recombination and shown to be hyper-rec. Increased recombination between short direct repeats (49) or lacZ genes (14) identified mutants in topB, encoding topoisomerase III (17), and in the genes for DNA ligase (lig), DNA polymerase I (polA), DNA helicase II (uvrD), DNA adenine methylase (dam), exonuclease III (xthA), and deoxyuridine triphosphatase (dut) (14). The same assay was then used to demonstrate the hyper-rec phenotype of mutations in the recD, rep, xseA (exonuclease VII), and rdgB genes (16, 50, 51). The recD gene product is a subunit of the RecBCD enzyme, which has DNA nuclease and helicase activities. The RecD activity is not essential in vivo, as recD mutants do not significantly reduce recombination rates of a variety of substrates in wild-type strains. In fact, recombination rates are increased, particularly when plasmid substrates are used (52), although the increases are less than 10-fold. recB and recC mutants reduce conjugational recombination by 10⁻³. Suppressors of this hypo-recombination, called the sbcA, sbcB, and sbcC genes, have been recovered. These mutants are hyper-rec in recBC mutants, but not in wild-type strains.

2. YEAST MUTANTS

Hyper-rec mutants in this organism were first identified among mutants that were defective in the repair of DNA damage. rad3 (53-55), rad6 (53, 56), rad18 (57), and rad24 (58) mutants, defective in excision repair or error-prone repair, have increased intragenic recombination rates in diploid cells. Other DNA repair mutants (the mms mutants, which have increased sensitivity to methyl methane sulfonate, a DNA damaging agent) also have a hyper-rec phenotype. Some of these mutants affect intragenic recombination (gene conversion), others increase intergenic recombination (crossing-over), and still others increase both types of recombination (59). The rem mutants, subsequently shown to be alleles of the RAD3 DNA helicase gene, were selected as hyper-rec mutants for both gene conversion and crossing-over in mitotic diploid cells (54, 55). Because a diploid strain was used in the initial screen, these mutants turned out to be semidominant for the recombination phenotype. This is not usually the case for recombination mutants, although when the screen or selection is performed using mutagenized diploid cells, it is possible to recover dominant or semidominant alleles in genes where all previously known mutations are recessive. An example of this is the RAD51 group of suppressors of srs2 diploid strains (60).

One of the mms mutants (mms8) turned out to be an allele of CDC9, which encodes DNA ligase (61). Independently, cdc9 was shown to have a mitotic hyper-rec phenotype for gene conversion and crossing-over (62, 63). This was not unexpected, because E. coli lig mutants are hyper-rec (14). However, it did suggest that, because E. coli mutants defective in DNA replication were hyper-rec, the same should be true for S. cerevisiae mutants. Examination of the CDC cell-cycle mutants that arrested in S phase or G2 phase, indicative of a defect in replication, showed that many of these mutants were hyper-rec for mitotic crossing-over (64-66). At the time, only a few of the genes had been identified as being directly involved in replication, but later the CDC17 and CDC2 genes were shown to encode DNA polymerase I and DNA polymerase III, respectively. In a direct screen for mitotic hyper-rec mutants, mutations in the CDC17 and CDC2 genes were recovered (44), further strengthening the overlap between the bacterial and yeast hyper-rec mutants.
The yeast DNA polymerase mutants were shown to be hyper-rec for intrachromosomal gene conversions and deletions between direct repeats, using the substrate shown in Fig. 1. This substrate was used to examine several of the cdc mutants that had been identified (64) as mitotic hyper-crossing-over mutants. In each case the mutants previously shown to increase mitotic crossing-over also increased recombination in the direct repeat substrate (44). This suggests that the mutants increase mitotic recombination by providing substrate for recombination in the form of nicked or damaged DNA. Mutants in genes that are known to be required for DNA replication, such as the genes encoding primase subunits, show increased mitotic recombination (hyper-recombination) (67). Using the approach of isolating suppressors of a hypo-rec genotype, in this case a rad1 rad52 strain, an allele of the yeast single-strand DNA binding protein gene SRR1/RFA1 was identified (67a). This mutant not only

FIG. 1. Schematic diagram of intrachromosomal recombination events occurring in a heteroallelic direct repeat. A direct repeat of the LEU2 gene is shown. Both copies of the LEU2 gene are defective, but each copy has a different mutation. The pBR322 vector sequences and the yeast URA3 gene are located between the duplicated LEU2 genes. Gene conversion events are the Leu+ Ura+ segregants. Southern analysis has confirmed that segregants with this phenotype do indeed retain the duplicated copies of the LEU2 gene, with the URA3 gene placed in between. Three different gene-conversion events can be recovered: those that convert the leu2-112 copy to LEU2, those that convert the leu2-k copy to LEU2, and those that convert both the leu2-112 and the leu2-k copies to LEU2. The first two conversion events are the predominant events recovered. Deletion events, confirmed by Southern analysis, are those that lose the duplication and the URA3 copy between the duplicated genes. Deletions may be either Leu+ or Leu-. When URA3 is located between the repeats, as depicted in this figure, deletion events may be selected by growth on medium containing 5-fluoroorotic acid.

increases recombination of a direct repeat in the mutant strain, but also increases recombination in a wild-type strain. Deletions between direct repeats of δ sequences, the long terminal repeat (LTR) sequences of the yeast Ty retrotransposon, and deletions in the rDNA are increased, but intragenic gene conversion between homologs is reduced to 1/6 (67a). The RP-A SSB protein has a primary role in DNA replication, extending the list of hyper-rec mutants of replication genes. In this case, however, the hyper-rec phenotype appears to be allele specific, as other mutations in the RFA1 gene are hypo-rec (67b). Suppression of a hypo-rec mutant is unique to this gene; other hyper-rec mutants of DNA replication genes do not suppress rad52 mutants and in fact require the RAD52 gene for the hyper-recombination phenotype.

The overlap between the E. coli and S. cerevisiae hyper-rec mutants also extends to mismatch repair mutants. As described above, mutS, mutL, mutH, and mutU(uvrD) mutants are hyper-rec. In yeast, pms1 and srs2 mutants are increased in mitotic gene conversion (44, 68). PMS1 is homologous to mutL (69), and SRS2 is homologous to mutU(uvrD) (70). Both mutants show a 5- to 10-fold increase in intragenic recombination in diploids, and the srs2 mutant is also increased for gene conversion in a direct repeat (see Fig. 1). The pms1 mutant has a mutator phenotype, consistent with a role in mismatch repair, and recently the Pms1 protein has been shown to bind to mismatches in vitro (71). However, the srs2 mutant does not have a mutator phenotype and the meiotic data do not indicate that it functions in mismatch repair. The PMS2 and PMS3 genes have not yet been cloned, so their relationship to the bacterial mismatch repair genes is unknown. However, the mutants have a mutator phenotype and show increased mitotic recombination (72).

The RAD52 epistasis group of genes is required for the mitotic repair of double-strand breaks and for mitotic recombination, and mutations in these genes generally reduce or eliminate meiotic recombination. However, there are many exceptions to this statement. The rad50 and xrs2 mutants have similar phenotypes. They are X-ray sensitive, implying a defect in the repair of double-strand breaks, but do not eliminate mating-type switching, which involves a double-strand break (DSB). Mating-type switching is greatly slowed in these mutants, but eventually occurs. General mitotic recombination is not reduced in rad50 and xrs2 mutants, and in fact both gene conversion and crossing-over show a slight increase (73, 74). The rad51, rad54, rad55, and rad57 mutants are reduced for mitotic gene conversion between homologous chromosomes (19).
The rad51 mutants are reduced for spontaneous and induced mitotic gene conversion between direct repeats, but are slightly hyper-rec for spontaneous deletion events between direct repeats that produce a prototrophic segregant (Fig. 2) (75). When all deletion events are recovered (see Fig. 1), the rad51 mutant shows a greater enhancement of recombination (26, 27). Similarly, rad54, rad55, and rad57 mutants show a level of enhancement of mitotic deletions between direct repeats identical to that of the rad51 mutant (26, 27). One interpretation of these results is that in the mutant strains a substrate accumulates that can be repaired, either through a nonrecombinational mechanism when direct repeats are not present or through a recombinational mechanism when repeats are present, resulting in loss of the repeat. Loss of the repeat requires the RAD52 gene (J. McDonald and R. Rothstein, personal communication), suggesting that repair of a DSB through DNA homology is involved.

FIG. 2. Duplication to measure deletion events. Shown is the duplication construct used by Shinohara et al. (74) to examine intrachromosomal recombination in a rad51 mutant strain. The construct consists of two copies of the HIS4 gene, each bearing a mutation that results in a His- phenotype. The ADE2 gene is located between the duplicated HIS4 genes. The authors selected His+ segregants and then examined these for the adenine phenotype. His+ Ade- segregants are deletion events.

Screens for mitotic hyper-rec mutants in haploid cells, using direct repeat constructs, have been devised. The hpr1-hpr8 mutants were recovered from mutagenized cells carrying two different duplications, one of the LEU2 genes and one of the HIS3 genes (see Fig. 3). Each mutant increased recombination in both substrates (44). These mutants are discussed more fully in the next section. In another study involving a similar type of duplication of ADE2 genes, a mutant was recovered that increased both gene conversion and deletion loss of the duplication (41). Further studies showed that mitotic gene conversion between homologs at heteroallelic loci was also increased in the mutant. The mutation harbored by this strain turned out to be in the PKC1 gene, which

FIG. 3. Duplications used to isolate hpr mutants. Two duplications are shown: a duplication of LEU2 on chromosome III and a duplication of HIS3 on chromosome XV. Each duplication carries heteroallelic repeats and the phenotype of the strain is Leu- Ura+ His- Trp+. At a low frequency, Leu+ and His+ segregants can be recovered, due to the types of recombination events shown in Fig. 1. Following mutagenesis, colonies were recovered that increased at least 10-fold in Leu+ and His+ segregants.

encodes protein kinase C1. Although the screen used to isolate this mutant is similar to that used to identify the hpr mutants, none of the hpr mutations have been shown to reside in the PKC1 gene. The PKC1 gene is proposed to have a role in regulating DNA metabolism, and the defective pkc1 allele could result in the accumulation of recombinogenic substrates indirectly via regulation of genes involved in DNA replication or repair.

Deletion of a tRNA gene flanked by δ sequences was used to isolate a mutant called top3 (76). Further studies showed that recombination in the rDNA repeat was also enhanced, but recombination at other direct repeats, such as the ones used to study the hpr mutants, was not increased in top3 strains. The TOP3 gene encodes a type-I topoisomerase with negative supercoil unwinding activity (77) and is homologous to the E. coli topB gene (76). The recombination phenotype of the top3 mutant partially overlaps that found in top1 and top2 mutants. TOP1 encodes DNA topoisomerase I, a type-I topoisomerase, and TOP2 encodes topoisomerase II, a type-II topoisomerase. Mutations in either gene increase recombination in the rDNA locus, but neither mutant affects recombination between δ repeats or other direct repeats (76, 78).

The sgs1 mutant was not originally identified as a hyper-rec mutant. It was first isolated as a suppressor of the poor growth of top3 mutants (79). Further studies showed that this mutant also suppresses the hyper-rec phenotype of top3 mutants at the rDNA locus, but the suppressed rate was still significantly above the wild-type rate. This rate turned out to reflect the slight hyper-rec phenotype of the sgs1 allele, which is epistatic to the top3 mutant (79). Interestingly, the soh1 mutant appears to act in a similar epistatic fashion to suppress the hyper-rec rate of the hpr1 mutant to the level of the soh1 mutant (26).
The involvement of topoisomerases in rDNA hyper-recombination is interesting, in that a sequence from the rDNA called HOT1 stimulates recombination when placed adjacent to a duplication of HIS4 genes at the HIS4 locus (80). The HOT1 region contains the promoter of the 35S rRNA precursor (80). Hyper-recombination was clearly transcription-dependent, because the insertion of an RNA polymerase I transcription terminator between the HOT1 sequence and the HIS4 duplication eliminated the hyper-recombination (81). Recently, mutants called rrm1, rrm2, and rrm3 were isolated on the basis of increased recombination in the rDNA locus, using increased loss of a marker inserted into the rDNA array to screen for hyper-rec mutants (45). The RRM1 gene is CDC17, but the RRM2 and RRM3 genes are not allelic to any gene known to affect recombination in the rDNA locus. The rrm3 mutant also results in increased recombination at another repeated cluster, the CUP1 locus. This is a multiply repeated direct array of the CUP1 gene, which encodes copper chelatin. The rrm3 mutant does not increase recombination at other repeated sequences, such as a duplication of the LYS2 gene. It has been suggested that the rrm3 mutant is more sensitive to the particular sequence of the repeat than to the presence of a repeated sequence (45). The rrm2 mutant is not specific for rDNA hyper-recombination and affects recombination at other repeats. The mutant phenotype suggests that RRM2 functions in DNA replication, and presumably mutations in this gene result in increased recombination through the accumulation of damaged DNA.

In two separate studies, haploid yeast strains that carried one disomic chromosome, marked such that both mitotic crossing-over and gene conversion could be monitored, were used to isolate hyper-rec mutants. The MIC mutants recovered from the first screen were semidominant and increased mitotic gene conversion (43). The rec mutants, identified in the second study, showed a variety of hyper-rec phenotypes (42). Some mutants increased crossing-over, whereas others increased both gene conversion and crossing-over. The relationship of the MIC and rec mutants to other hyper-rec mutants is unknown because the genes have not yet been cloned and allelism with other hyper-rec mutants has not been demonstrated. However, some of the mutants do have a DNA-repair defect, manifest as UV and/or X-ray sensitivity, suggesting that accumulation of DNA damage may be involved in the hyper-recombination.

C. Characterization of Yeast hpr Mutants

This section contains a detailed description of the studies of two hyper-rec mutants, hpr1 and srs2(hpr5). These two mutants were chosen because they affect different types of mitotic recombination events with no apparent overlap. First a general description of the hpr mutants and the methods of analysis is presented, and then the individual mutants are discussed.

1. ISOLATION AND GENERAL CHARACTERIZATION OF THE hpr MUTANTS

A strain with two direct repeat duplications that carried heteroallelic mutant copies of the LEU2 and HIS3 genes (Fig. 3) was used to isolate mitotic hyper-rec mutants (44). The unmutagenized strain had a phenotype of leucine and histidine auxotrophy, but segregated Leu+ or His+ colonies at a frequency of 10⁻⁵. This appears as a very low level of papillation of growing colonies in a background patch of nongrowing cells on medium lacking leucine or histidine. Approximately 70,000 mutagenized colonies were screened for increased papillation on medium lacking leucine and histidine and on the single-omission media. The screen required both increased leucine and increased histidine papillation in order to eliminate mutants that were locus-specific. From this screen, eight mutants were recovered that consistently showed an increase in recombination rates of at least 10-fold over wild type. Each mutant was recessive and segregated as a single gene mutation. Further genetic characterization showed that the eight mutants represented eight separate complementation groups, called hpr (for hyper-recombination). This indicates that the collection is nonsaturating, because multiple alleles of a complementation group were not recovered. Later analyses showed that, within the hpr mutants, there were alleles of some previously identified hyper-rec mutants, cdc2 (hpr6) and cdc17 (hpr3). This gave confidence that the screen was positive for bona fide hyper-rec mutants and that the remaining mutants might be in novel functions. This was given further support on determining that the remaining mutants did not display a strong mutator phenotype or strong DNA repair defect, which reduced the likelihood that they were defective in the mismatch repair, excision repair, error-prone repair, or recombinational repair pathways. The exception to this was the srs2(hpr5) mutant, which had slight UV- and MMS-sensitive phenotypes and turned out to function in the error-prone repair pathway.

A second method of characterization of the hpr mutants was to examine the type(s) of recombination elevated. The duplication substrate can give a prototrophic segregant either through a gene conversion event, which retains the selectable marker between the duplicated copies, or through a deletion event, which eliminates the selectable marker (Fig. 1). In the case of the LEU2 duplication, a gene conversion event is seen as a Leu+ Ura+ recombinant and a deletion event as a Leu+ Ura- recombinant (Fig. 1). Therefore, independent Leu+ recombinants were examined for the uracil phenotype and the His+ recombinants were examined for the tryptophan phenotype. This analysis allowed us to categorize the mutants into three classes: those that exclusively increased deletion events, those that exclusively increased gene conversion events, and those that increased both events (Table II). Last, the mutants were examined for effects on recombination between homologous chromosomes in diploids.
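The scoring logic just described can be written out as a small sketch. The Python below is only an illustration, not the authors' procedure: the function names and the example fold values are hypothetical, and applying a 10-fold cutoff to each event type separately is an assumption (the text states a 10-fold criterion only for the overall screen).

```python
def score_leu_segregant(ura_plus):
    """Event type of a Leu+ segregant from the LEU2 duplication (Fig. 1).

    Leu+ Ura+ segregants retain URA3 between the repeats (gene conversion);
    Leu+ Ura- segregants have lost it (deletion).
    """
    return "gene conversion" if ura_plus else "deletion"


def classify_mutant(conversion_fold, deletion_fold, cutoff=10.0):
    """Assign a mutant to a class from event-specific rates.

    Both fold values are rates relative to wild type. The cutoff is a
    hypothetical threshold chosen for illustration.
    """
    hyper_conversion = conversion_fold >= cutoff
    hyper_deletion = deletion_fold >= cutoff
    if hyper_conversion and hyper_deletion:
        return "both"
    if hyper_conversion:
        return "hyper-gene conversion"
    if hyper_deletion:
        return "hyper-deletion"
    return "neither"


# Hypothetical fold increases, for illustration only.
print(score_leu_segregant(ura_plus=False))                        # deletion
print(classify_mutant(conversion_fold=1.0, deletion_fold=700.0))  # hyper-deletion
print(classify_mutant(conversion_fold=40.0, deletion_fold=1.0))   # hyper-gene conversion
```

The same scoring would apply to the HIS3 duplication, with the tryptophan phenotype taking the place of the uracil phenotype.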

2. THE HPR1 GENE

As can be seen from Table II, the hpr1 mutant increases deletion events between direct repeats, but has no effect on intrachromosomal gene conversion events. The mutant has no apparent DNA repair defect, suggesting that the primary role of the gene is not in DNA repair. Studies using other recombination substrates, such as inverted repeats, ectopic repeats (repeats located on nonhomologous chromosomes), and allelic gene copies, have shown that recombination in these systems is not increased by the hpr1 mutant (44, 82). Mitotic or meiotic crossing-over also is unaffected by the hpr1 mutant, and spore viability is normal (44, 82). These data indicate that

TABLE II
SUMMARY OF hpr MUTANTS

Columns: Mutant; Hyper-gene conversion; Hyper-deletion; UV sensitive; MMS sensitive; Mutator
Mutants: hpr1, hpr2, hpr3, hpr4, hpr5, hpr6, hpr7, hpr8
[Individual table entries are not legible in this copy.]

"+" indicates a slight mutator effect of five- to eightfold over wild type.

"+" indicates a 10-fold increase in UV sensitivity over wild type.

"+ +" indicates a 50-fold increase in spontaneous mutation frequency over wild type.

the HPR1 gene product is not a recombination protein. Most likely the mutant increases the production of a recombinogenic substrate.

a. Comparison to Other Mitotic Hyper-deletion Mutants. Table III lists mutants of S. cerevisiae that increase deletion formation between direct repeats. Several features of the hpr1 mutant are outstanding. First, the hpr1 mutant shows the highest increase in instability of a direct repeat, by at least 10-fold in comparison to other hyper-rec mutants, with the exception of the hpr6 mutant. Second, this 700-fold increase in instability is accompanied by no apparent repair defect or evidence of DNA damage. We have examined hpr1 cells directly for the accumulation of double-strand DNA breaks and find no evidence for such (M. Aquino de Muro and H. Klein, unpublished). The only other mutant whose hyper-rec phenotype approaches that of the hpr1 mutant is hpr6. As can be seen from Table II, this mutant has a DNA-repair defect, is MMS-sensitive, and has increased mutation rates. This is in great contrast to the hpr1 mutant. Third, the hpr1 mutant increases the instability only of direct repeats outside of the rDNA. In this regard, it is different from the hpr3 mutant, in that the rrm1 allele of HPR3/CDC17 has been shown to increase rDNA recombination. It also differs from the top mutants, which increase rDNA recombination. Although HPR1 has homology to TOP1 and shows a genetic interaction (discussed in Section II,C,2,b) with a top1 mutant, the specificity of the hyper-rec phenotype is completely different.

In spite of these distinguishing features of the hpr1 mutant, the recombination event that is observed appears to utilize the same recombination functions that are required for other hyper-rec mutants. The hpr1 hyper-rec phenotype partially requires the RAD52 gene, and hyper-deletion rates are reduced 10⁻²-fold in the hpr1 rad52 double mutant (26, 44). Table III shows the RAD52 dependency of other hyper-deletion mutants. In all cases where this has been tested, the hyper-rec phenotype completely or partially requires the RAD52 gene, with the exception of the hyper-rec rfa1 mutant. This is particularly intriguing because spontaneous mitotic rDNA recombination in wild-type strains is independent of RAD52 (45, 78), and spontaneous deletions in direct repeats of LEU2 or HIS3 or other similar duplication constructs are also independent of RAD52 in wild-type strains (26, 27, 37-39, 44, 45). Either the events that occur in these hyper-rec mutants arise from a substrate that is different from the one giving rise to spontaneous events in wild-type strains, or they occur through a different recombination pathway. The observation that some hyper-rec phenotypes are only partially dependent on RAD52 suggests that more than one pathway may be utilized to produce a recombinant product in the mutants.
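The arithmetic behind "partially requires" can be made explicit. The following Python sketch assumes that the hundredfold reduction is measured relative to the hpr1 single-mutant rate and that the 700-fold figure quoted above refers to the same substrate; the function names and the simple three-way decision rule are illustrative assumptions, not taken from the source.

```python
def residual_fold(mutant_fold, reduction):
    """Fold over wild type that remains after the mutant rate is divided by `reduction`."""
    return mutant_fold / reduction


def rad52_dependence(mutant_fold, double_mutant_fold):
    """Label the RAD52 dependence of a hyper-rec phenotype (illustrative rule only)."""
    if double_mutant_fold >= mutant_fold:
        return "independent"   # no reduction in the rad52 background
    if double_mutant_fold <= 1.0:
        return "complete"      # back to (or below) the wild-type rate
    return "partial"           # reduced, but still above wild type


hpr1_fold = 700.0                                   # hpr1 relative to wild type
hpr1_rad52_fold = residual_fold(hpr1_fold, 100.0)   # rate reduced to one-hundredth
print(hpr1_rad52_fold)                              # 7.0
print(rad52_dependence(hpr1_fold, hpr1_rad52_fold)) # partial
```

Under these assumptions the double mutant would still show deletions at about seven times the wild-type rate, which is consistent with the statement that part of the hpr1 hyper-deletion phenotype does not depend on RAD52.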

b. hpr1 Synthetic Lethal Mutants. Because the HPR1 gene showed homology to DNA topoisomerases and both the hpr1 mutant and top mutants are hyper-rec, we were prompted to examine the double mutants hpr1 top1, hpr1 top2, and hpr1 top3. We observed synthetic lethality in all three double mutants (46; A. Aguilera and H. Klein, unpublished). Each single mutant was viable, but the double mutant was inviable. This could occur through an overlap in function or through functioning in the same process. Alternatively, there may be an accumulation of too much recombination substrate, some sort of damaged DNA, in the double mutant, and the unrepaired damage becomes lethal.

Other genes have been tested directly for synthetic lethality with the hpr1 mutant. We have found that a deletion of copy 1 of the histone H3-H4 genes, while fully viable in an otherwise wild-type strain, is lethal in combination with an hpr1 mutation (26). As yet we do not understand the basis for this synthetic lethality. We have not found any change in overall nucleosome organization or positioning in the hpr1 mutant, including an examination of the LEU2 duplication used as a recombination substrate (H. Fan and H. Klein, unpublished).

We have looked directly for hpr1 synthetic lethal mutants by a screen for mutants that have an absolute requirement for Hpr1 function. We recovered six mutants that segregated in a monogenic fashion from a screen of 20,000 mutagenized colonies. The six mutants, called slg (for synthetic lethal gene), represent four complementation groups. None of these are allelic to the TOP genes or to copy 1 of histone H3-H4 (HHT1-HHF1). Preliminary characterization of the mutants showed that one mutant, slg10, has a hyper-deletion

TABLE III
MITOTIC HYPER-DELETION MUTANTS

Gene | Gene product/function | Fold increase | RAD52 dependent | Ref.

Replication and cell cycle genes
hpr3/cdc17/rrm1 | DNA polymerase α | 50×, 10× for rDNA in rrm1 | Yes | 44, 45
hpr6/cdc2/tex2 | DNA polymerase δ | 400×, 35× for Tn5 excision in tex2 | Yes | 44, 92
pri1 | DNA primase p48 | [see note] | Yes | 67
pri2 | DNA primase p58 | [see note] | Yes | 67
cdc5 | Protein kinase | [see note] | ND(a) | 44
cdc6 | - | [see note] | ND | 44
cdc9 | DNA ligase | [see note] | ND | 44
cdc13 | - | [see note] | ND | 44
cdc14 | Tyrosine phosphatase homology | [see note] | ND | 44
srr1 (rfa1-D288Y) | Subunit of RF-A | [see note] | No | 67a
ctf4/chl15/pob1 | Binds to DNA polymerase α | [see note] | ND | 93
[Note: the fold increases for pri1 through ctf4/chl15/pob1 appear in this copy in the following order, and their assignment to rows is unclear: 30×; 100×; 10×; 10×; 15×, 25× for rDNA; 5×(b); 6×; 100×; 10×.]

Excision repair genes
rad3-102 (rem1-2) | DNA helicase | 10× | ND | 85; Y. Zhang and H. Klein, unpublished

Recombination repair genes
rad51 | recA homology | 10× | Yes | H. Fan and H. Klein, unpublished; J. McDonald and R. Rothstein, unpublished
rad54 | DNA helicase homology | 10× | Yes | H. Fan and H. Klein, unpublished; J. McDonald and R. Rothstein, unpublished
rad55 | recA homology | 10× | Yes | H. Fan and H. Klein, unpublished; J. McDonald and R. Rothstein, unpublished
rad57 | RAD51/recA homology | 10× | Yes | H. Fan and H. Klein, unpublished; J. McDonald and R. Rothstein, unpublished

Topoisomerase genes
top1 | DNA topoisomerase | 25× for rDNA | Partial | 78
top2 | DNA topoisomerase | 50× for rDNA | ND | 78
top3 | DNA topoisomerase | 80× for rDNA, 100× for δ repeats | Yes | 76, 79

Other genes
hpr1 | DNA topoisomerase homology | 700× | Partial | 26, 36, 44
hpr7 | - | 20× | Yes | 44
slg10 | - | 10× | Yes | D. Dimova and H. Klein, unpublished
soh1 | RNA polymerase homology | 10× | Yes | 26
sgs1 | recQ homology | 30× for rDNA, 15× for δ repeats | Yes | 78
pkc1 | Protein kinase C | 15× | Yes | 41
sir2 | Histone deacetylation | 15× | Yes | 47
rrm2 | - | 10×, 7× for rDNA | ND | 45
rrm3 | - | 100× for CUP1 repeats, 10× for rDNA | Yes | 45

(a) ND, Not determined.
(b) This rate was determined using a heteroallelic his4 duplication with URA3 in between the repeats, and only His+ Ura- recombinants were studied. The rate for all Ura- segregants was not reported.

phenotype and is sensitive to MMS (D. Dimova and H. Klein, unpublished). Further characterization of the mutants is in progress.

c. Suppressors of hpr1. The hpr1 null allele strains grow at almost wild-type rates at 30°, but fail to grow at 37°. We have selected hpr1 suppressors by isolating segregants that grow at 37°. Fourteen mutants were recovered that contained a suppressor mutation unlinked to the hpr1 allele, indicating that these were second-site bypass suppressors. The fourteen mutants represented eight different complementation groups called SOH (for suppressor of hpr1) (26). The soh mutants were examined for suppression of the hpr1 top1 and hpr1 hht1-hhf1 synthetic lethalities; all eight complementation groups suppressed these phenotypes. Recombination rates of soh and soh hpr1 strains were determined. The soh mutants showed varying degrees of suppression of hpr1 hyper-recombination, but none suppressed fully to wild-type levels. The suppression of recombination ranged from a 1/70th reduction of hpr1 rates by an soh1 mutant to no effect by an soh8 mutant, where wild-type recombination is 1/700th that of hpr1. Most of the soh mutants have no effect on wild-type (HPR1) recombination rates, but the soh1 null mutant shows a 10-fold enhancement of recombination and is listed as a hyper-rec mutant in Tables I and III. Similar to hpr1, the soh1 mutant affects only deletion recombination between direct repeats and has no effect on gene conversion events. The soh1 mutant may act by not allowing most of the hpr1 lesions to occur, or by allowing them to be processed in a nonrecombinogenic fashion, thus suppressing hyper-rec, but in this case one might not expect the soh1 mutant alone to have a hyper-rec phenotype.

Three of the SOH genes have been identified. SOH1, located on chromosome VII, is a novel gene with some homology to DNA- and RNA-directed RNA polymerases. The null allele shows normal viability and has no effect on meiosis or sporulation. SOH2 is allelic to RPB2, which encodes the β subunit of RNA polymerase II, and SOH4 is allelic to SUA7, which encodes TFIIB (H. Fan and H. Klein, unpublished). RPB2 and SUA7 are essential genes, and therefore the soh2 and soh4 alleles must be leaky. Although the soh2 and soh4 mutants grow normally at 30°, their growth is impaired when combined with an soh1 mutant, and the triple mutant soh1 soh2(soh4) hpr1 is inviable. Although we do not know the function of SOH1, it is not unreasonable to suggest that it functions as an accessory protein to the Pol-II holoenzyme.

We suggest that Soh1p and Hpr1p are in separate protein complexes. The Soh1p complex functions in transcription, and may possibly also be part of a postreplication repair complex, but we have no data as to the type of complex of which Hpr1p is a part. When either complex is deficient, through deletion of the SOH1 gene or the HPR1 gene, a recombinogenic substrate is formed that is specific for deletions between direct repeats. Because we have found no evidence for double-strand breaks in hpr1 strains, we suggest that this is not the substrate. One possibility is that the hpr1 mutant promotes nascent sister chromatid exchange in a replication bubble when long, repeated sequences are present, as has been suggested (83). Such events would result in a deletion between direct repeats. The soh1 mutation permits a bypass of the hpr1 defect, possibly by changing the hpr1 substrate so that it is not recombinogenic, or by shunting the substrate into a nonrecombinogenic repair pathway, possibly the nonrecombinogenic postreplication repair pathway. However, the soh1 mutant is slightly hyper-rec and must lead to the formation of some amount of a recombinogenic substrate. It is not possible to distinguish whether soh1 is epistatic to hpr1 or just fails to suppress the hpr1 mutant fully.
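The numbers quoted in this section show why the two interpretations cannot be separated. The short Python sketch below assumes that the soh1 hpr1 double mutant sits exactly at the soh1 single-mutant level, as stated earlier for the epistatic-like behavior of soh1; the function name, the tolerance, and the returned fields are hypothetical choices made for illustration.

```python
import math


def summarize_suppression(mutant_fold, suppressor_fold, double_fold, rel_tol=0.2):
    """Return (suppression factor, residual fold over wild type,
    whether the double mutant is indistinguishable from the suppressor alone).

    All fold values are deletion rates relative to wild type. `rel_tol` is a
    hypothetical tolerance standing in for experimental variation.
    """
    factor = mutant_fold / double_fold
    same_as_suppressor = math.isclose(double_fold, suppressor_fold, rel_tol=rel_tol)
    return factor, double_fold, same_as_suppressor


HPR1_FOLD = 700.0   # hpr1 relative to wild type (from the text)
SOH1_FOLD = 10.0    # soh1 null alone relative to wild type (from the text)
DOUBLE_FOLD = 10.0  # assumption: soh1 hpr1 equals the soh1 single-mutant level

print(summarize_suppression(HPR1_FOLD, SOH1_FOLD, DOUBLE_FOLD))  # (70.0, 10.0, True)
```

A double-mutant rate that matches the soh1 rate is the outcome expected both if soh1 is epistatic to hpr1 and if suppression is simply incomplete and happens to stop at that level, so the rate comparison alone does not decide between them.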

3. THE SRS2/HPR5 GENE

a. Mitotic and Meiotic Roles of SRS2. The srs2/hpr5 mutant was identified in the initial screen for hpr mutants (44). It was selected for further studies because the mutant resulted in an exclusive increase in gene conversion events; this is in complete contrast to the hpr1 mutant, which increased only deletion events between direct repeats. Further studies showed that the hpr5 mutant is allelic to the radH and srs2 mutants, which had been identified on the basis of different phenotypes. The RADH gene was cloned (70) and shown to encode a protein with high homology to the products of the uvrD and rep genes of E. coli, which are DNA helicases. Rong and Klein (84) purified the Srs2 protein and demonstrated in vitro ATPase activity and DNA helicase activity with a 3'-to-5' polarity. The first clue as to the role of the SRS2 gene in mitosis came from the observations that mutants showed a slight increase (10-fold over wild type) in UV sensitivity (70, 85), and that diploids showed an increase in MMS sensitivity (70); this suggested that the gene might function in an error-prone DNA repair pathway. Further studies showed that srs2 mutants suppress the strong UV sensitivity of rad6 and rad18 mutants, which are deficient in error-prone repair (70, 85). We and others (70, 85) have interpreted this as a channeling of DNA lesions normally repaired by the error-prone repair process into the recombinational repair pathway (Fig. 4). This has been borne out by the observation that mutations in the recombinational repair pathway eliminate the srs2 hyper-gene conversion and the ability of these mutants to suppress the rad6 and rad18 high UV-sensitivity phenotypes (70, 85, 86).

[Figure 4 diagram, panels A and B. A LESION is routed either into the recombinational repair pathway (RAD51, strand exchange; RAD52, process DSBs; RAD54, helicase homology), ending in RECOMBINATION (A) or HYPER-RECOMBINATION (B), or into the error-prone repair pathway (SRS2, helicase; RAD6, ubiquitination; RAD5, helicase homology; RAD18, DNA binding?), ending in REPAIR.]

FIG. 4. Channeling model for the srs2 mutant. (A) The SRS2 gene, encoding a DNA helicase, is proposed to act in the error-prone repair pathway in an early step. Downstream genes involved in repair are RAD6, RAD5, and RAD18. The recombinational repair pathway is shown as an alternative pathway for repair. The heavy line indicates that the repair pathway is the major pathway used to repair the lesion. (B) When the repair pathway is blocked by a mutation in the SRS2 gene, the lesion is shunted into the recombinational repair pathway, indicated by the heavy line. Events that are normally repaired in a nonrecombinational mode are now repaired in the recombinational mode, with the end result of hyper-recombination. DSBs, double-strand breaks.

We have used the suppression of rad18 UV sensitivity to isolate additional alleles of SRS2 and have also constructed deletion null alleles. Although the null alleles suppress rad18 UV sensitivity, they do not increase gene conversion to the same extent as the original srs2-101/hpr5-1 allele (86). The new alleles show a range of phenotypes. A few are as hyper-rec as the original hpr5 allele, but others show only a slight increase in gene conversion rates. However, there are two additional criteria suggesting that these are defective SRS2 alleles. First, all mutants show some degree of UV sensitivity in a Rad+ background and the increase in UV sensitivity is correlated with the increase in gene conversion (86). Second, among the Leu+ segregants in wild-type strains, half are gene-conversion events and half are deletion events. In


all of the srs2 mutants, 80-90% of the Leu+ segregants are gene-conversion events. The hyper-rec srs2 alleles display what we have termed gene-conversion bias. Figure 1 shows the three possible Leu+ gene-conversion products from the duplication substrate. If the first two are considered, where only one mutant allele is converted to wild type, these are recovered in approximately equal frequency in the Leu+ Ura+ segregants in SRS2 strains. This ratio is altered in the hyper-rec srs2 alleles (44, 87) (Fig. 5). We have determined that the effect is neither allele- nor locus-specific. However, the position of the allele within the LEU2 gene and the position of the gene in the duplication (left or right, according to Fig. 5) are important. Biased gene conversion or gene-conversion polarity is generally associated with meiotic gene conversion, not mitotic gene conversion. It reflects preferential initiation of recombination within a gene and has been correlated with double-strand breaks at the high end of the polarity gradient (1). Deletion of the initiation site at ARG4 results in a flat meiotic conversion rate across the gene (5). Our observations of a gene-conversion gradient in srs2-101 strains during mitosis suggested that (1) some region flanking the duplication might be acting as a preferential site of initiation, and (2) that

Duplication: leu2-r (L) - URA3 - leu2-k (R)

Genotype          Rates (×10⁶)    Conversion L   Conversion R   Ratio (R/L)   χ²
SRS2               2.6            21             35             1.7
srs2-101          28.8 (×11)       8             48             6.0           6.70*
srs2-101 (Ty-)     4.5 (×2)       21             22             1.0           0.01

Duplication: his3-513 (L) - TRP1 - his3-537 (R)

Genotype          Rates (×10⁶)    Conversion L   Conversion R   Ratio (R/L)   χ²
SRS2               5.5            23             19             0.8
srs2-101          34.0 (×6)       43             12             0.2           5.00*
srs2-101 (Ty-)    25.0 (×5)       28             10             0.3           2.32

FIG. 5. Bias in mitotic conversion events in the srs2 mutant. Two duplications, one at LEU2 and the other at HIS3, are shown. srs2-101 (Ty-) is a strain that lacks the Ty1-17 element that is adjacent to the LEU2 locus. All three strains, SRS2, srs2-101, and srs2-101 (Ty-), are isogenic. Conversion rates are shown for Leu+ Ura+ or His+ Trp+ events. The numbers in parentheses are the fold increase over the SRS2 rate. L refers to conversion of the L allele to prototrophy, either the leu2-r or the his3-513 gene, and R refers to conversion of the R allele to prototrophy, either the leu2-k or the his3-537 gene. Southern analysis was used to determine which allele had converted, because all four mutations are ablations of a restriction enzyme site. The contingency χ² was performed to compare the distribution of the SRS2 L and R conversion events to those in the srs2-101 or the srs2-101 (Ty-) strain. The asterisks indicate χ² values that are statistically significant (P < 0.05).
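
The contingency comparison described in the Fig. 5 legend can be sketched as follows. This is a minimal illustration rather than the author's stated procedure: the legend does not say whether a continuity correction was applied, so the sketch assumes Yates' correction for a 2 x 2 table. The function name `yates_chi_square` and the constant `CRITICAL_5_PERCENT_DF1` are hypothetical names chosen for this example. The counts are the L and R conversion events at the LEU2 duplication for the SRS2 and srs2-101 strains.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for a 2 x 2 contingency table with Yates' continuity correction.

    Table layout (assumed for this sketch):
        row 1 = (a, b)   e.g. L and R conversions in the SRS2 strain
        row 2 = (c, d)   e.g. L and R conversions in the mutant strain
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected counts under independence of strain and converted allele.
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    # Yates' correction: shrink each |O - E| by 0.5 (not below zero).
    return sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e
               for o, e in zip(observed, expected))


# Critical chi-square value for P = 0.05 with one degree of freedom.
CRITICAL_5_PERCENT_DF1 = 3.841

wild_type_counts = (21, 35)   # SRS2: L, R conversion events at LEU2
srs2_101_counts = (8, 48)     # srs2-101: L, R conversion events at LEU2

chi2 = yates_chi_square(*wild_type_counts, *srs2_101_counts)
print(f"{chi2:.2f}", chi2 > CRITICAL_5_PERCENT_DF1)  # prints: 6.70 True
```

Under this assumption the computed value matches the 6.70 reported for srs2-101 at LEU2 and exceeds the 5% critical value for one degree of freedom, consistent with the asterisk on that entry.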


[Figure 6 diagram: the Ty1-17 element adjacent to the leu2 duplication, with arrows between leu2 copies; one arrow is labeled 13x.]

FIG. 6. Schematic illustration of the effect of the Ty1-17 element on biased gene conversion in the LEU2 duplication in srs2-101 strains. The arrows indicate the direction of the conversion event, which changes a negative allele to a positive allele, with resulting Leu+ prototrophy. The numbers beside the arrows indicate the fold increase in conversion rate of each allele over the SRS2 strain. Note that the conversion rate is unchanged for the EcoRI allele, but that the conversion rate for the KpnI allele is greatly increased over the SRS2 strain when the Ty1-17 element is present. This increase is lost when the Ty1-17 element is deleted.

absence or malfunctioning of the Srs2 DNA helicase enhanced the effects of the preferential initiation site. Transcription of a gene can enhance mitotic recombination (80, 88). Immediately upstream of the LEU2 duplication is a transposable element, Ty1-17. The Ty elements transpose through a cDNA intermediate and have a high transcription rate in haploid mitotic cells (89). To determine whether the adjacent Ty1-17 element affects biased gene conversion in the srs2-101 mutant, we deleted this sequence upstream of the LEU2 duplication. This deletion had two effects: hyper-gene conversion at the LEU2 duplication was eliminated, and the gene-conversion bias was lost (Figs. 5 and 6). Deletion of the Ty element upstream of LEU2 did not eliminate hyper-gene conversion at the HIS3 duplication, and biased gene conversion was still observed. This indicates that the Ty1-17 element has a local effect on biased hyper-recombination, presumably by affecting the recombination initiation events. We do not believe that the effect is specific to Ty elements, as there is no Ty element in the vicinity of the HIS3 duplication. Rather, transcription in a


region adjacent to the duplication can influence recombination rates in the srs2 mutant. We have preliminary evidence that gene-conversion tract lengths are reduced in the srs2-101 mutant (85). This could explain the biased gene conversion. If a preferential initiation site occurs downstream of the Ty1-17 element in the first copy of the LEU2 duplication, this could result in heteroduplex formation involving only the distal LEU2 copy (shown as the upper duplex in Fig. 7A). The heteroduplex tract length is reduced in the helicase-defective mutant and thus the heteroduplex would cover only one allele, the leu2-k allele in the R copy as shown in Fig. 5. This will result in preferential conversion of the R copy and hyper-recombination. If heteroduplex covers both alleles, mismatch repair on one strand will not give a Leu+ segregant. The probability of an initiation event occurring decreases toward the 3' end of the gene; hence alleles at the 3' end are less likely to be converted, as is shown in Fig. 7B. These observations suggest that the Srs2 DNA helicase may be involved in extension of heteroduplex DNA through branch migration. The same activity could explain the channeling hypothesis. The Srs2 helicase may recognize heteroduplex intermediates that form during postreplication or error-prone repair. The intermediate could be disrupted by the helicase (Fig. 8) to ensure that the lesion is repaired in a nonrecombinogenic fashion. When the helicase is defective, the intermediate is recognized by the recombination repair apparatus, with the consequent phenotype of hyper-recombination. This is an antirecombinase activity, as recombination intermediates are disrupted by the action of the helicase. A similar role has been proposed for the uvrD helicase II of E. coli (90). Although the SRS2 gene is not essential for vegetative growth, it does have a critical role in meiosis.
Diploid strains that are homozygous null for the SRS2 gene show normal levels of sporulation, but reduced spore viability (86). The spore viability is approximately 50% of the wild-type level, with only 10% complete tetrads. Map distances derived from the complete tetrads are reduced by half from those obtained with isogenic wild-type strains, indicating that the srs2 mutants are recombination proficient. Not all of the srs2 alleles have a strong meiotic phenotype. Rather, the phenotype described above is restricted to those alleles that should eliminate all Srs2 DNA helicase activity. The null allele, a nonreverting frameshift mutant, and two hyper-rec mutants, one that maps to the conserved domain I and the other to the conserved domain V of DNA helicases, all have the strong meiotic phenotype. The remaining alleles display varying degrees of reduction in meiotic viability and some alleles are indistinguishable from wild type by this criterion (86). We have compared the progression through meiosis of the wild-type


[Figure 7 diagram. Panel A: STRAND TRANSFER followed by GENE CONVERSION. Panel B: graph with the axis labeled Fold Increase in Gene Conversion.]

FIG. 7. Model for biased gene conversion in the srs2-101 mutant. (A) The duplicated LEU2 gene is depicted as a double thin line and the second, beneath it, as a double thick line. The stippled box represents the Ty1-17 element. The downward arrowheads represent the mutant sites in the LEU2 genes. A nick is proposed to occur preferentially near the Ty1-17 element. Strand transfer initiated at the nick and extending along the homologous region of the LEU2 gene forms heteroduplex on the top copy of the LEU2 gene. Mismatch repair results in a gene conversion of the top LEU2 gene and is detected as a Leu+ segregant. (B) Graph of the fold increase in gene conversion over wild type versus position of the allele within the LEU2 gene. The region most proximal to the Ty1-17 element preferentially acts as the donor in a gene-conversion event, resulting in bias and a mitotic gene-conversion gradient.


[Figure 8 diagram: the Srs2 helicase acting on a heteroduplex between 5'/3' strands; outcomes are labeled STRAND REJECTION and STRAND (second label incomplete in the source).]

FIG. 8. Model for the Srs2 DNA helicase as an antirecombinase. The middle panel shows a region of heteroduplex formed between a + strand and a strand containing a lesion (m). The Srs2 DNA helicase is proposed to recognize the heteroduplex, probably by association with other proteins, and then to enter the region at the nicked or gapped end and to disrupt the heteroduplex through the 3'-to-5' DNA helicase activity. Failure to disrupt the heteroduplex allows the stable formation of an intermediate that is processed by the recombination repair pathway to yield a gene-conversion event.

diploid to the srs2-101 diploid. The srs2 mutant goes through the meiotic divisions with the same kinetics as wild type, but shows a delay in appearance of recombinants in return-to-growth experiments that measure the commitment to recombination (86). The delay in recombination does not


occur in the initial stages of recombination. Double-strand breaks, which appear early in meiosis prior to the appearance of recombinant molecules, are formed to approximately the same extent and with the same kinetics as in wild type (M. Aquino de Muro and H. Klein, unpublished). Recombinant molecules, detected by physical methods, are delayed in appearance (H. Klein, unpublished). This suggests that the srs2 mutant is blocked in the recombination process. One possibility is that the Srs2 DNA helicase is required for branch migration of the recombination intermediate. Why would elimination of such an activity result in meiotic inviability? We have shown that short gene-conversion tracts are preferentially resolved as noncrossovers in mitosis (36). If the same holds true for meiosis, the reduction in crossing-over could account for the meiotic inviability, as crossing-over is required for proper meiotic segregation of homolog pairs and hence meiotic product viability. Another possibility is that short heteroduplex regions are not engaged correctly within the synaptonemal complex. Cross-overs that do not occur within the context of synaptonemal complex are not sufficient to restore spore viability to defective strains (91).

b. Srs2 DNA Helicase Action as an Antirecombinase Function in Recombination. We have suggested (in Fig. 8) that the Srs2 helicase disrupts a heteroduplex intermediate by helicase action in the 3'-to-5' direction. This may be the primary role in mitosis, thus ensuring that repair substrates are acted on by the nonrecombinational DNA repair pathways. The Srs2 helicase could have a similar role in meiosis, acting to disrupt heteroduplexes that have a high degree of mismatches. This type of substrate could occur in a search for homology, prior to the initiation of meiotic recombination. In the absence of the Srs2 helicase function, nonhomologous recombination events could be initiated. These could result in abortive recombinations, leading to increased chromosome missegregation and reduced spore viability. If this were the case, then the search for homology does not have to be completed prior to the formation of meiotic double-strand breaks, as we have observed that these occur with wild-type kinetics in srs2-101 diploids (M. Aquino de Muro and H. Klein, unpublished). The Srs2 DNA helicase could also function in branch migration of a Holliday junction. It could act ahead of the invading strand to form more single-stranded DNA, which could then enter into heteroduplex through a strand-exchange reaction, or actually interact with the Holliday junction in a manner analogous to the RuvB or RecG proteins of E. coli (12). In the absence of the helicase function, heteroduplex regions will be short and could lead to cross-over and segregation problems as described above. The effect of multiple mismatches on the ability to form heteroduplex DNA in srs2 strains and the ability of the helicase to move a Holliday junction


through branch migration of the heteroduplex intermediate are currently being tested.

III. Concluding Statements

Mitotic hyper-recombination mutants include defects in functions required for DNA repair systems and functions required for DNA replication. Most of these mutants result in the accumulation of DNA lesions that are recombinogenic. These lesions are probably single-strand gaps, double-strand breaks, mismatches, and other discontinuities in the DNA duplex. Other mutants, such as the top mutants and the sir2 mutant, may make regions of the chromatin or the DNA duplex more accessible to the recombination and repair machineries, although why the recombination is restricted to certain sequences such as the rDNA or the δ repeats in these mutants is unknown. As transcription of a gene can enhance its recombination, this would argue for accessibility or a repair/recombination factor being associated with the transcription apparatus. Actively transcribed genes show preferential repair following UV damage. It seems likely that actively transcribed genes would show preferential recombination. It is not known if this recombination is occurring in response to DNA damage, but this is a strong possibility. Thus far, the srs2 mutants are unique in their action as hyper-rec mutants in that these mutants do not appear to create substrate for recombination through the induction of DNA damage. Rather, these mutants fail to channel a repair lesion into the error-prone repair pathway, and hence the lesion is repaired through the recombination-repair pathway. Thus events that are normally destined to be repair events become recombination events, with the ensuing hyper-rec phenotype. Could such an antirecombinase function also be compatible with a recombinase activity of the same Srs2 protein? We have evidence that gene conversion is altered in the srs2 mutant in two ways. First, gene-conversion tracts are shorter than those seen in SRS2 strains. If the sole role of the Srs2 protein is to disrupt heteroduplexes, shorter conversion tracts are not expected.
When the Srs2 protein fails to disrupt a heteroduplex, it is channeled into the RAD52 recombinational repair pathway and becomes a gene-conversion event. Unless the channeled intermediate is different from other substrates that are acted on by the RAD52 pathway, the same length of gene-conversion tract should be observed in srs2 and SRS2 strains. The Srs2 protein could interact with different proteins and thereby be directed to different substrates for helicase activity. It could complex with one factor from the error-prone repair pathway and have the activity of disrupting


heteroduplexes and complex with another factor in the RAD52 recombination pathway to recognize and promote branch migration of Holliday junctions. Such a dual activity, effected by different complexes, could explain the hyper-rec and short conversion-tract phenotypes of mutants in the Srs2 DNA helicase. However, this does not explain the meiotic-like polarity gradient of gene conversion seen in the srs2 mutant. This is the second type of alteration in gene conversion in the srs2 mutant. We have suggestive data that transcription of a nearby region is correlated with hyper-rec and biased gene conversion in the srs2 mutant. The Srs2 protein may be part of a repair complex associated with a transcription complex that ensures that repair events are not channeled into the recombination pathway. Lesions may occur more frequently in highly transcribed sequences because the chromatin template could be more accessible to agents that nick and otherwise damage DNA. The hpr1 and srs2 mutants stimulate two different types of mitotic recombination events, deletions between direct repeats and gene conversions between repeats and homolog alleles. This suggests that there is more than one pathway or substrate for mitotic recombination. This view is borne out by many experiments from many different laboratories. Deletions between repeats are elevated in many different strain backgrounds (see Table III) and may be initiated by more than one type of lesion. The observation that the rad51, rad54, rad55, and rad57 mutants show elevated levels of direct-repeat deletions suggests that spontaneous double-strand breaks that are repaired by the RAD52 pathway genes are one type of substrate. Deletion rates are elevated to a higher extent in some mutants, most notably cdc2 and hpr1.
Whether these mutants create the same spontaneous substrate for repeat deletions is not known, but these mutants, which have more substrate available for recombination, do not have the radiation-defective phenotypes of the rad mutants and do not reduce gene-conversion events, in contrast to the rad51, rad54, rad55, and rad57 mutants. What is clear from Table III is that all of the mitotic hyper-deletion mutants stimulate a recombination event that requires the RAD52 function, with the exception of the rpa1-D288Y mutant. This mutant defines a novel pathway for activating recombination in a recombination-defective strain (67a). The rad1 rad52 double mutants show a synergistic decrease in deletions between direct repeats (23, 37, 38). This is thought to result from blocks in the RAD52-dependent recombination pathway and the single-strand annealing pathway that requires RAD1, but is RAD52 independent (24, 25, 94). In rad52 mutants, some events may be shunted into the single-strand annealing pathway, but this is clearly not the case for most events in the hyper-deletion mutants, because the hyper-recombination is RAD52 dependent. This suggests that spontaneous events can be shunted while the mutant-induced


events cannot. Spontaneous lesions in a direct repeat can be repaired by the single-strand annealing mechanism, which requires that a double-strand break be flanked by directly repeated sequences. When the spontaneous lesion occurs in a region that is not flanked by repeats, it is either repaired in a nonrecombinogenic fashion or must use the sister chromatid for a silent recombination repair event. The genes involved in this process are not known, but the observation that viability is not significantly reduced in a rad52 mutant suggests that multiple double-strand-break repair mechanisms exist.

IV. Glossary

DSB   double-strand break
LOH   loss of heterozygosity
LTR   long terminal repeat
Ty    yeast retrotransposon; transposable element that has a cDNA intermediate
SSB   single-strand binding
δ     delta sequence, the LTR sequence of Ty elements

ACKNOWLEDGMENTS

This work was supported by Grant GM30439 from the National Institutes of Health and Grant NP-77186 from the American Cancer Society. I am grateful to H.-Y. Fan, R. Rothstein, and L. Symington for discussions and comments on the manuscript.

REFERENCES

1. T. D. Petes, R. E. Malone and L. S. Symington, in "The Molecular Biology of the Yeast Saccharomyces: Genome Dynamics, Protein Synthesis, and Energetics" (J. R. Broach, J. R. Pringle and E. W. Jones, eds.), p. 407. CSH Lab Press, Cold Spring Harbor, NY, 1991.
2. J. E. Haber, Curr. Opin. Cell Biol. 4, 401 (1992).
3. C. L. Atcheson and A. E. Esposito, Curr. Opin. Genet. Dev. 3, 736 (1993).
4. J. N. Strathern, Curr. Opin. Genet. Dev. 2, 691 (1992).
5. B. de Massey and A. Nicolas, EMBO J. 12, 1459 (1993).
6. P. Detloff, M. A. White and T. D. Petes, Genetics 132, 113 (1992).
7. K. K. Willis and H. L. Klein, Genetics 117, 633 (1987).
8. R. H. Borts and J. E. Haber, Science 237, 1459 (1987).
9. J. E. Golin and M. S. Esposito, Genetics 107, 355 (1984).
10. B. D. Bethke and J. E. Golin, Genetics 137, 439 (1994).
11. S. K. Mahajan, in "Genetic Recombination" (R. Kucherlapati and G. R. Smith, eds.), p. 87. American Society for Microbiology, Washington, DC, 1988.
12. S. C. West, Cell 76, 9 (1994).
13. S. C. Kowalczykowski, D. A. Dixon, A. K. Eggleston, S. D. Lauder and W. M. Rehrauer, Microbiol. Rev. 58, 401 (1994).
14. E. B. Konrad, J. Bact. 130, 167 (1977).
15. S. I. Feinstein and K. B. Low, Genetics 113, 13 (1986).
16. J. Zeig, V. F. Maples and S. R. Kushner, J. Bact. 134, 958 (1978).
17. M. A. Schofield, R. Agbunag, M. L. Michaels and J. H. Miller, J. Bact. 174, 5168 (1992).
18. G. R. Smith, Microbiol. Rev. 52, 1 (1988).
19. T. Saeki, I. Machida and S. Nakai, Mutat. Res. 73, 251 (1980).
20. J. E. Haber and M. Hearn, Genetics 111, 7 (1985).
21. R. E. Malone and R. E. Esposito, PNAS 77, 503 (1980).
22. J. A. Jackson and G. R. Fink, Nature 292, 306 (1981).
23. H. L. Klein, Genetics 120, 367 (1988).
24. B. A. Ozenberger and G. S. Roeder, MCBiol 11, 1222 (1991).
25. J. Fishman-Lobell, N. Rudin and J. E. Haber, MCBiol 12, 1292 (1992).
26. H. Fan and H. L. Klein, Genetics 137, 945 (1994).
27. J. P. McDonald and R. Rothstein, Genetics 137, 393 (1994).
28. J. C. Game, T. J. Zamb, R. J. Braun, M. A. Resnick and R. M. Roth, Genetics 94, 51 (1980).
29. T. M. Menees and G. S. Roeder, Genetics 123, 675 (1989).
30. M. Ajimura, S.-H. Leem and H. Ogawa, Genetics 133, 51 (1993).
31. R. E. Malone, S. Bullard, M. Hermiston, R. Rieger, M. Cool and A. Galbraith, Genetics 128, 79 (1991).
32. S. Klapholz, C. S. Waddell and R. E. Esposito, Genetics 110, 187 (1985).
33. J. Engebrecht and G. S. Roeder, Genetics 121, 237 (1989).
34. B. Rockmill and G. S. Roeder, PNAS 85, 6057 (1988).
35. N. M. Hollingsworth and B. Byers, Genetics 121, 445 (1989).
36. A. Aguilera and H. L. Klein, Genetics 123, 683 (1989).
37. R. H. Schiestl and S. Prakash, MCBiol 8, 3619 (1988).
38. B. J. Thomas and R. Rothstein, Genetics 123, 725 (1989).
39. B. R. Zehfus, A. D. McWilliams, Y.-H. Lin, M. F. Hoekstra and R. L. Keil, Genetics 126, 41 (1990).
40. R. H. Schiestl and S. Prakash, MCBiol 10, 2485 (1990).
41. K. N. Huang and L. S. Symington, MCBiol 14, 6039 (1994).
42. M. E. Esposito, J. Osoda, J. Golin, H. Moise, K. A. Bjornstadt and D. Maleas, CSHSQB 49, 41 (1984).
43. D. H. Maloney and S. Fogel, Genetics 94, 825 (1980).
44. A. Aguilera and H. L. Klein, Genetics 119, 779 (1988).
45. R. L. Keil and A. D. McWilliams, Genetics 135, 711 (1993).
46. A. Aguilera and H. L. Klein, MCBiol 10, 1439 (1990).
47. S. Gottlieb and R. E. Esposito, Cell 56, 771 (1989).
48. M. Braunstein, A. B. Rose, S. C. Holmes, C. D. Allis and J. R. Broach, Genes Dev. 7, 592 (1993).
49. S. K. Whoriskey, M. A. Schofield and J. H. Miller, Genetics 127, 21 (1991).
50. J. W. Chase and C. C. Richardson, J. Bact. 129, 934 (1977).
51. J. Clyman and R. C. Cunningham, J. Bact. 169, 4203 (1987).
52. S. T. Lovett, C. Luisi-DeLuca and R. Kolodner, Genetics 120, 37 (1988).
53. R. Kern and F. K. Zimmerman, Mol. Gen. Genet. 161, 81 (1978).
54. J. E. Golin and M. S. Esposito, Mol. Gen. Genet. 150, 127 (1977).
55. R. E. Malone and M. F. Hoekstra, Genetics 107, 33 (1984).
56. B. Montelone, S. Prakash and L. Prakash, MGG 184, 410 (1981).
57. W. R. Boram and H. Roman, PNAS 73, 2828 (1976).
58. S. Kowalski and W. Lakowski, MGG 136, 75 (1975).
59. B. E. Montelone, S. Prakash and L. Prakash, Curr. Genet. 4, 223 (1981).
60. A. Aboussekhra, R. Chanet, A. Adjiri and F. Fabre, MCBiol 12, 3224 (1992).
61. B. S. Montelone, S. Prakash and L. Prakash, J. Bact. 147, 517 (1981).
62. F. Fabre and H. Roman, PNAS 76, 4586 (1979).
63. J. C. Game, L. H. Johnston and R. C. von Borstel, PNAS 76, 4589 (1979).
64. L. H. Hartwell and D. Smith, Genetics 110, 381 (1985).
65. K. M. Hennessy, A. Lee, E. Chen and D. Botstein, Genes Dev. 5, 958 (1991).
66. E. A. Howell, M. A. McLear, D. Rose and C. Holm, MCBiol 14, 255 (1994).
67. M. P. Longhese, L. Jovine, P. Plevani and G. Lucchini, Genetics 133, 183 (1993).
67a. J. Smith and R. Rothstein, MCBiol 15, 1632 (1995).
67b. A. Firmenich, M. Elias-Arnanz and P. Berg, MCBiol 15, 1620 (1995).
68. M. S. Williamson, J. C. Game and S. Fogel, Genetics 110, 609 (1985).
69. W. Kramer, B. Kramer, M. S. Williamson and S. Fogel, J. Bact. 171, 5339 (1989).
70. A. Aboussekhra, R. Chanet, Z. Zgaga, C. Cassier-Chauvat, M. Heude and F. Fabre, NARes 17, 7211 (1989).
71. T. A. Prolla, Q. Pang, E. Alani, R. D. Kolodner and R. M. Liskay, Science 265, 1091 (1994).
72. S. Fogel, R. K. Mortimer and K. Lusmak, in "The Molecular Biology of the Yeast Saccharomyces" (E. W. Jones and J. R. Broach, eds.), p. 289. CSH Lab Press, Cold Spring Harbor, NY, 1981.
73. R. E. Malone, T. Ward, S. Lin and J. Waring, Curr. Genet. 18, 111 (1990).
74. E. L. Ivanov, V. G. Korolev and F. Fabre, Genetics 132, 635 (1992).
75. A. Shinohara, H. Ogawa and T. Ogawa, Cell 69, 457 (1992).
76. J. W. Wallis, G. Chrebet, G. Brodsky, M. Rolfe and R. Rothstein, Cell 58, 409 (1989).
77. R. A. Kim and J. C. Wang, JBC 267, 17178 (1992).
78. M. F. Christman, F. S. Dietrich and G. R. Fink, Cell 55, 413 (1988).
79. S. Gangloff, J. P. McDonald, C. Bendixen, L. Arthur and R. Rothstein, MCBiol 14, 8391 (1994).
80. R. Keil and G. S. Roeder, Cell 39, 377 (1984).
81. K. Voelkel-Meiman, R. L. Keil and G. S. Roeder, Cell 48, 1071 (1987).
82. A. Aguilera and H. L. Klein, Genetics 122, 503 (1989).
83. S. T. Lovett, P. T. Drapkin, V. A. Sutera, Jr. and T. J. Gluckman-Peskind, Genetics 135, 631 (1993).
84. L. Rong and H. L. Klein, JBC 268, 1252 (1993).
85. L. Rong, F. Palladino, A. Aguilera and H. L. Klein, Genetics 127, 75 (1991).
86. F. Palladino and H. L. Klein, Genetics 132, 23 (1992).
87. L. Rong, Ph.D. thesis. New York University, 1992.
88. B. J. Thomas and R. Rothstein, Cell 56, 619 (1989).
89. J. D. Boeke and S. B. Sandmeyer, in "The Molecular Biology of the Yeast Saccharomyces: Genome Dynamics, Protein Synthesis, and Energetics" (J. R. Broach, J. R. Pringle and E. W. Jones, eds.), p. 193. CSH Lab Press, Cold Spring Harbor, NY, 1991.
90. C. Rayssiguier, D. S. Thaler and M. Radman, Nature 342, 396 (1989).
91. J. Engebrecht, J. Hirsch and G. S. Roeder, Cell 62, 927 (1990).
92. D. A. Gordenin, A. L. Malkova, A. Peterzen, V. N. Kulikov, Y. I. Pavlov, E. Perkins and M. A. Resnick, PNAS 89, 3785 (1992).
93. N. Kouprina, E. Kroll, V. Bannikov, V. Bliskovsky, R. Gizatullin, A. Kirillov, V. Zakharyev, P. Hieter, F. Spencer and V. Larionov, MCBiol 12, 5736 (1992).
94. J. Fishman-Lobell and J. E. Haber, Science 258, 480 (1992).


Gene Structure at the Human UGT1 Locus Creates Diversity in Isozyme Structure, Substrate Specificity, and Regulation

IDA S. OWENS*,¹ AND JOSEPH K. RITTER†

*Section on Genetic Disorders of Drug Metabolism, Human Genetics Branch, National Institute of Child Health and Human Development, Bethesda, Maryland 20892-1830
†Department of Pharmacology and Toxicology, Virginia Commonwealth University, Richmond, Virginia 23298-0613

I. Function and Distribution of the UDP-Glucuronosyltransferase System
II. Chemistry and Biochemistry of Bilirubin and Phenolic Substrates
III. The UGT1 Gene Complex Locus
IV. Exons 1 Determine Structural Diversity of the Transferases
... rase with Important ...
... t Steroid Transferase ...
VIII. Regulation of the UGT1 Genes
IX. Significance of the Arr ... Direction of Research
References

This essay focuses on the extraordinary features of the novel UGT1² gene complex, which encodes, unexpectedly, a subfamily of UDP-glucuronosyltransferase (transferase) isozymes that metabolize either bilirubin or phenolic substrates. The relationship between substrates metabolized by the encoded enzymes and the mechanism by which diversity is achieved at the complex are remarkable. The subfamily contains members of a broad family of UDP-glucuronosyltransferase isozymes considered one of the most important detoxifying systems for clearing endogenous and exogenous lipophilic compounds (some of which are toxic) from the body. In the 5' region, the locus has undergone expansive duplication and divergent evolution of an apparent ancestral exon to create a series of unique exons of uniform size. Each unique exon encodes the peptide domain that determines the acceptor-substrate (aglycone) of a unique isoform. In the 3' region of the locus, four shared exons specify a common peptide structure important for the interaction with the common donor substrate, UDP-glucuronate. Based on exonic sequence data, exon-intron arrangement, consensus sequences for RNA splicing, and promoter elements for RNA polymerase II transcription initiation, the UGT1 locus provides for diversity in enzyme structure and, thus, different substrate-specific isoforms that are independently regulated by the respective proximal promoter element. The origin of such a locus necessarily emanates from its critical function in coding for the enzyme responsible for clearing the complex and toxic chemical, bilirubin. An efficient detoxification system is crucial, because there is a constant high rate of bilirubin replenishment from the normal and constant turnover of heme salvaged from senescent erythrocytes. The locus provides both flexibility of multiple isoforms and modes of regulation.

¹ To whom correspondence should be addressed.
² Abbreviations used in this essay: transferase, UDP-glucuronosyltransferase (EC 2.4.1.17); UGT1, gene complex locus encoding UDP-glucuronosyltransferase subfamily 1; UGT1A-UGT1G (UGT1A-UGT1G) refer to the individual genes (gene products) at the UGT1 locus; HUG-Br1 and HUG-Br2, major and minor bilirubin transferase cDNA, respectively; HLUG P1 and HLUG P4, planar and bulky phenol transferase cDNA, respectively; PAH, polycyclic aromatic hydrocarbon.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51

Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. Function and Distribution of the UDP-Glucuronosyltransferase System

This family of enzymes functions to detoxify lipophilic endo- and xenobiotics through the conversion of each to a water-soluble glucuronide by covalently linking the lipophile to glucuronate donated by UDP-glucuronate (1). Typically, a subfamily of isozymes is grouped according to substrates metabolized (planar phenols, bulky compounds, steroids) and mode of regulation (PAH-inducible, phenobarbital-inducible, constitutive expression). Glucuronidation detoxifies by blocking biological reactivity, as well as by enhancing water solubility and the rate of excretion of lipophiles from the body. Collectively, the endoplasmic reticulum-bound enzymes detoxify what are estimated to be hundreds of chemicals encountered by animals. Typical endogenous chemicals include bile acids, steroid hormones, thyroid hormones, and bilirubin, whereas exogenous chemicals include food additives, therapeutic drugs and their oxidized metabolites, and environmental hydrocarbons,

GENE STRUCTURE AT T H E HUMAN

UGTl

LOCUS

307

some of which are potential carcinogens. The transferase isozymes are located primarily in the liver, but some forms are also found in extrahepatic tissues. Evidence is emerging that steroid-metabolizing isoforms, for example, are also distributed in the target tissues of the steroid substrate (2). In the case of the UGT1-encoded isozymes, there is differential expression in tissues, as discussed in Section VIII. Based on the persistent unconjugated hyperbilirubinemia in response to totally inactive bilirubin transferase in Crigler-Najjar Type-I disease,3 this is the single efficient system in humans for detoxifying bilirubin. Although it is present in mammals and some fish and amphibians, it is absent from birds, reptiles, and from certain fish and amphibians (1).Oddly, the cat has an intermediate level of glucuronidating activity toward both bilirubin (3)and nonplanar nonsteroidal antiinflammatory drugs (phenols) compared to that in rat (4), but it has no or poor glucuronidating activity toward planar phenolic substrates (1,5,6). It appears that, in addition to bilirubin, both planar and nonplanar (bulky) phenols are metabolized by isozymes encoded at the UGT1 locus. Because bilirubin is the most critical transferase substrate, due to its lethal neurotoxicity when unconjugated hyperbilirubinemias become extreme and sustained, we provide details of its origin, complex chemistry/biochemistry, and toxicity. Further, we consider the phenols and their similarities to bilirubin IXa as likely factors at play in pressuring a phenolspecific domain of a transferase to give rise to a bilirubin-specific domain, or vice versa, in the evolution of catalytic domains adapted to glucuronidate these compounds.

³ Crigler-Najjar Type-I disease is a potentially fatal hereditary unconjugated hyperbilirubinemia characterized by a complete lack of bilirubin transferase activity. Crigler-Najjar Type-II syndrome is a nonfatal hereditary unconjugated hyperbilirubinemia reflecting 10% or less of normal bilirubin transferase activity, which is responsive to phenobarbital treatment. The Gunn rat has a hereditary unconjugated hyperbilirubinemia characterized by a complete lack of bilirubin transferase activity and is classically used as a model for the Type-I disease.

II. Chemistry and Biochemistry of Bilirubin and Phenolic Substrates

A. Bilirubin: The Molecule

Since antiquity, the heme metabolite, bilirubin, has been associated with jaundiced phenotypes. Heme, salvaged primarily from senescent erythrocytes and secondarily from other inactive heme-containing proteins, is the precursor of bilirubin. Heme oxygenase (EC 1.14.99.3) catalyzes oxidation to open the protoporphyrin IX ring with the loss of the α-methene carbon to generate biliverdin, and finally, biliverdin reductase (EC 1.3.1.24) reduces the γ-methene bridge to form bilirubin IXα. Water insolubility of the diacid form of bilirubin IXα occurs as a consequence of the rotation at the methane bridge between the dipyrroles, allowing for a trio of hydrogen bonds through the carboxyl group of the propionic acid side-chain to two amino groups and a carbonyl oxygen in each half of the molecule. From structural studies bilirubin IXα is regarded as a 2,2'-dipyrrolylmethane taking the form of a ridge-tile, with the ridge along the three carbons of the two carboxylic acid groups and the methane bridge. Rings A and B are in one plane, and rings C and D are in another (Fig. 1A and B) (7).

FIG. 1. (A) Simplified version of internally hydrogen-bonded bilirubin IXα. (B) Structure of bilirubin. Reprinted with permission from Nature (7). Copyright 1976 Macmillan Magazines Limited.

The structure is very hydrophobic and requires detoxification. The bonded structure, placing all hydrophobic groups at the outer aspects of the molecule, is a diacid with two pK values at 8.0 and two at 13; in alkaline solution, the carboxyl protons are lost and the molecule can form a soluble sodium or potassium salt (8). When solubilized in dimethyl sulfoxide, bilirubin IXα exhibits two pK values at 5.1 and two at 13 for the two carboxyl and two carbonyl groups, respectively, corrected to pK values of 4.4 and 13 under aqueous conditions by using analogs (9). Thus, the solubility properties indicate that bilirubin IXα is aromatic with a set of dianionic groups having pK values (4.4) close to those expected of propionic acid. If not for the hydrogen bonding, the ionic molecule would be predicted to be water soluble. Due to the acid groups, under acidic conditions, the dianionic bilirubin IXα will leave plasma albumin and deposit in tissue. As pH is elevated, the bilirubin will leave the tissue and return to the plasma to bind to albumin. Thus, the chemical properties of bilirubin can create a serious clinical challenge when its homeostasis is disrupted, especially in children whose plasma


albumin levels have not yet reached those of adults. The special and complex chemistry of bilirubin IXα, no doubt, led to the adaptation of an agent, albumin, for binding the potentially toxic, insoluble metabolite. At normal serum concentrations ranging between 5 and 7 µM, the dianionic form of bilirubin is essentially all bound to albumin (700 µM) and sequestered in a nontoxic form (10). Because the binding of bilirubin to albumin represents a specific high-affinity interaction that could have implications for the characteristics of its binding to the transferase molecule, we provide some of the properties of the proposed albumin site. A model of the albumin structure (11) accounts for much of the experimental data accumulated on bilirubin. The protein has six subdomains, each having three α-helices (X, Y, and Z) connected by peptide hinge regions and arranged in parallel forming a half-domain, a trough structure; each helix has about 22 residues, giving six turns. The pliable hinge regions between the six subdomains lead to the formation of three domains. Each of two sites, one in each of the outer domains, is well-suited for binding to an aromatic and polar dianionic bilirubin IXα (8, 11). The sites are helical surfaces with (a) carboxylates and hydroxyl groups (aspartic and glutamic acids and tyrosine) capable of forming hydrogen bonds with the amino groups in the dipyrroles of bilirubin IXα, (b) positively charged residues (lysine and arginine) capable of forming salt bridges with the carboxylate groups of bilirubin, and (c) aromatic residues (tyrosine and phenylalanine) to form a milieu for interacting with the aromatic aspect of bilirubin IXα (8, 11). The high-affinity (Ka = 1.4 × 10⁸ M⁻¹) and intermediate-affinity (Ka = 5 × 10⁵ M⁻¹) sites for bilirubin are adapted for bilirubin transport in the plasma.
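The statement that bilirubin is essentially all bound at normal serum concentrations can be checked with a rough equilibrium estimate. The sketch below is illustrative only: it assumes simple 1:1 binding to the high-affinity site alone, ignores the intermediate-affinity site and any competing ligands, and uses a hypothetical helper name (`free_ligand`) that does not come from the cited work.

```python
import math


def free_ligand(total_ligand, total_protein, k_assoc):
    """Free ligand concentration (M) for 1:1 binding, L + P <-> LP.

    Assumption: a single independent site with association constant k_assoc (M^-1).
    """
    s = total_ligand + total_protein + 1.0 / k_assoc
    disc = s * s - 4.0 * total_ligand * total_protein
    # Smaller root of C^2 - s*C + L*P = 0, written in a numerically stable form.
    complex_conc = 2.0 * total_ligand * total_protein / (s + math.sqrt(disc))
    return total_ligand - complex_conc


K_HIGH = 1.4e8     # M^-1, high-affinity site quoted in the text
ALBUMIN = 700e-6   # M, albumin concentration quoted in the text

for bilirubin in (5e-6, 7e-6):  # normal serum range quoted in the text
    free = free_ligand(bilirubin, ALBUMIN, K_HIGH)
    print(f"total {bilirubin * 1e6:.0f} uM -> free {free * 1e9:.3f} nM "
          f"({100 * free / bilirubin:.4f}% unbound)")
```

Under these assumptions the unbound fraction is far below 1%, consistent with the description of bilirubin as sequestered on albumin.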
Typically, the bound dianion is conveyed from its major site of production in the spleen to the liver, where it is taken up by a carrier-mediated transporter(s) (12, 13) and bound to ligandin for intracellular delivery (14) to the endoplasmic reticulum, the site of glucuronidation and detoxification. Albumin is adapted specifically to bind bilirubin in plasma, and ligandin is adapted for binding bilirubin in the cytosol (15) with a 1/10 binding constant (5 × 10⁷ M⁻¹) relative to that for albumin (15a), but with the specificity that enables the metabolite to enter, be retained, and be transported within the hepatocyte. Due to differential binding affinities (16), unconjugated bilirubin dissociates from albumin at a rate 10⁻³ of that from phosphatidylcholine. The relative affinities account for the capacity of albumin to sequester the pigment away from tissue phospholipid under physiological conditions. With bound bilirubin, some 60% of the albumin is actually contained in extravascular space, with certain implications that are described below. When unconjugated hyperbilirubinemias greater than 300 µM are encountered, the binding capacity of the serum albumin is exceeded. This


creates conditions in which the pigment is deposited in the brain, with the highest association constant toward sphingomyelin (3.7 × 10⁶ M⁻¹) and diphosphatidylcholine containing 5% cholesterol (2.6 × 10⁶ M⁻¹), particularly in mitochondrial membranes. The large cells of the brain are, in fact, rich in mitochondria; hence, this affinity could account for the intensity of the yellow color of brain mitochondria of kernicteric Crigler-Najjar Type-I children. The consequence of this affinity is that when an icteric child has 10 to 50 nM free bilirubin anion (a supersaturated solution), at equilibrium some 3 to 15% of brain sphingomyelin is saturated with bilirubin. The binding likely occurs through the formation of polar complexes of the bilirubin anion to positive "head" groups of phospholipids (8). Complexes forming in the brain can account for the uptake of large amounts of the heme derivative. In vitro, aggregation of bilirubin acid from a saturated solution is accelerated by phospholipid vesicles forming complexes. Whether bilirubin deposits in the brain (kernicterus) are diffuse, as in young adults, or specifically in the basal ganglia, as in infants, the kernicterus (8) creates the risk of serious bilirubin encephalopathy and death. Children with the Crigler-Najjar Type-I disease, having a complete loss of bilirubin transferase activity, generally suffer fatal encephalopathy at an early age. Many other chemical details of the sequestration of bilirubin by albumin that impact the potential to develop bilirubin encephalopathy have been considered by Brodersen (8). Although pH, temperature, ionic strength, fatty-acid binding, light absorption, perturbations of albumin tyrosines, and co-binding of other ligands are parameters that affect the affinity of binding, it is the finding that some anionic drugs competitively displace bilirubin from albumin, especially in children, that is of greatest interest to neonatologists and clinicians.
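The 3 to 15% saturation figure quoted above follows from the sphingomyelin association constant if one assumes a simple single-site (Langmuir-type) isotherm, which is an assumption of this sketch rather than a model stated in the text. The function name is hypothetical.

```python
def fractional_occupancy(free_ligand_molar, k_assoc):
    """Fraction of binding sites occupied, theta = K[L] / (1 + K[L]).

    Assumption: independent, identical sites and a fixed free-ligand concentration.
    """
    x = k_assoc * free_ligand_molar
    return x / (1.0 + x)


K_SPHINGOMYELIN = 3.7e6  # M^-1, association constant quoted in the text

for free_nm in (10, 50):  # free bilirubin anion range quoted in the text (nM)
    theta = fractional_occupancy(free_nm * 1e-9, K_SPHINGOMYELIN)
    print(f"{free_nm} nM free bilirubin -> {100 * theta:.1f}% of sites occupied")
```

With these inputs the occupancy comes out at roughly 3.6% for 10 nM and 15.6% for 50 nM, in line with the range given in the paragraph.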
The lower albumin concentration compared to that for adults creates increased risks when treating young children with pharmacological agents (8). Before considering mechanisms to clear the metabolite from the body, one is prompted to ponder the physiological significance or benefit derived from the second reaction in the heme catabolic pathway. It is catalyzed by the energy-requiring enzyme, biliverdin reductase, to convert the nontoxic, soluble, and excretable biliverdin to the toxic and presumably nonbeneficial and insoluble bilirubin IXα. Additional energy is required to detoxify the bilirubin IXα through glucuronidation to generate a water-soluble and excretable derivative. Physiologically, it appears that the energy expended to generate and detoxify bilirubin in mammals is squandered because it has no apparent beneficial role at any developmental stage. Either biliverdin or bilirubin derived from senescent fetal erythrocytes, for example, can move transplacentally to reach the maternal circulation for hepatic uptake and clearance.


The capacity to convert biliverdin to bilirubin evolved and is specific (17). Certain fish and amphibians contain biliverdin reductase for the production of bilirubin. Prior to the discovery of the reductase, the conversion was considered a "metabolic accident" whereby biliverdin underwent reduction by acting as a nonspecific electron acceptor from a variety of dehydrogenases. To the contrary, the biliverdin reductase in each species examined is soluble and very specific for biliverdin, requiring NADH or NADPH as cofactor, depending on the species (17). The species that excrete bilirubin have high levels of the reductase in spleen and, in some cases, in liver, and those that lack the enzyme produce and excrete biliverdin into bile. Recent reports indicate that bilirubin IXα is not simply a waste product but is an efficient antioxidant. Several models show that bilirubin IXα provides effective protection against peroxyl-radical damage (18, 19) comparable to α-tocopherol under a number of physiological and pathological conditions. The conformation of the albumin-bilirubin complex appears to enhance the antioxidant properties of bilirubin IXα. The complex fixes the two planar dipyrroles of the bilirubin in an out-of-plane configuration on the protein. The asymmetric position of the bilirubin in a trough allows for easy abstraction of the reactive C-10 hydrogen atom (8). A comparison of either albumin-bound or cytosolic-bound bilirubin or biliverdin revealed that biliverdin is one-tenth as potent as bilirubin in protecting microsomal membrane lipids against Fe3+-catalyzed lipid peroxidation. Because 60% of human albumin is extravascular, bilirubin could be serving as an antioxidant under a number of clinical conditions. Albumin carrying its bound molecules leaves the bloodstream and appears at inflammation sites where increased oxygen radicals are produced by phagocytic cells (19).
If the pigment serves as a physiological antioxidant, then its superior antioxidant effects over biliverdin could serve as a selective pressure favoring the conversion of biliverdin to bilirubin during the evolution of heme catabolism (19). But in the case of severe hyperbilirubinemias, the evidence supports the view that the binding of bilirubin IXα to phospholipids of membranes imparts antioxidant effects disrupting mitochondrial respiration by breaking the electron transport/free radical chain (19). These recent studies on antioxidation suggest that a previously unknown beneficial role of albumin-bound bilirubin IXα could be the basis of the mechanism of neurotoxicity of bilirubin when bound to mitochondrial-rich neuronal tissue.

B. Glucuronidation of Bilirubin and Phenols by UGT1-Encoded Isozymes

Carbons 8 and 12 of the propionic acid side groups of bilirubin IXα are esterified as glucuronides (reviewed in 10). Detoxification by bilirubin transferase utilizes UDP-glucuronate to produce either the IXα C8 mono- or IXα C12 mono-β-glucuronide ester and the di-β-glucuronide ester of bilirubin. This esterification disrupts the hydrogen bonds and creates a water-soluble and highly excretable derivative. The importance of this reaction and the potential for bilirubin toxicity in its absence is heightened by the constant high production of bilirubin (some 200 to 400 mg daily) from salvaged heme. This sustained rate of production represents considerable pressure for biological systems to evolve and maintain an efficient pathway(s) for clearing the metabolite. Furthermore, increases in the heme load in plasma, as occurs with hemolysis and serious liver disease, lead to a manifestation of severe jaundice resulting in even greater stress on the detoxification system. Transferase isozymes that glucuronidate phenolic acceptor (aglycone) substrates are encoded at the same genetic locus as that for bilirubin. These phenolic chemicals have, no doubt, entered our biosystem from plants. The flavonoids, coumarins, and terpenes are plant-derived phenols. Exogenously administered phenols are not readily recognized as toxic agents. It is known, however, that acetaminophen in large doses is toxic to certain individuals, presumably those with an impaired HLUG P1 (20), which metabolizes this planar substrate. Aromatic hydrocarbons that are metabolized by the arylhydrocarbon hydroxylase (cytochrome P450-IA1) enzyme system to phenol metabolites are toxic to animals (21). Whether the parent or hydroxylated aromatic hydrocarbon is the proximal toxin is not known. The need to detoxify phenolic chemicals to hasten excretion from the body probably began with the beginning of the ecosystem. Most mammals, some amphibians, birds, reptiles, and some fish species glucuronidate the planar phenols (discussed in 1). As pointed out above, cats glucuronidate bilirubin at an intermediate rate, but fail to glucuronidate planar phenols or very poorly glucuronidate a few (1). Fish species lacking the phenol transferase are subject to poisoning by exposure to phenols (1). The planar phenols (22, 23) no more than 4.0 Å thick were the typical substrates for the PAH-responsive transferases (described below). Hence, the discussions that follow relate primarily to this class of phenols.

C. Similarities of Phenols and Bilirubin as Substrates

It is of interest to examine the phenolic substrates for relationships and chemical analogies to bilirubin IXα. The planar phenolic substrates, such as 1-naphthol, 3-hydroxybenzo[a]pyrene, 4-nitrophenol, 4-methylumbelliferone, and acetaminophen, have been used routinely to characterize a subfamily of transferase isozymes typically up-regulated (induced) in rodents by PAHs (24). This induction is known to be mediated by the aryl hydrocarbon (AH) receptor (25). The P450-IA1 family of cytochrome P450-dependent monooxygenases that catalyze the formation of phenols of benzo[a]pyrene and 3-methylcholanthrene is also PAH responsive and is associated with chemical carcinogenesis in response to treatment with a PAH (25). The parallel in regulation of the phenol transferases and the P450-IA1 monooxygenase has been interpreted as signaling that coordinate regulation of transferase induction protects against the carcinogenic intermediates produced by the monooxygenase enzymes. Further, the parallels in regulation of oxidation of PAH by the cytochrome P450-IA1 system and glucuronidation of the hydroxylated PAHs by select transferases have been extended to the regulation of glucuronidation of bilirubin IXα (26). Time-course studies with both PAH-responsive and PAH-nonresponsive mice showed two peaks of bilirubin transferase activity, one genetically regulated in the responsive mice and a second peak of activity in all PAH-treated mice (26). All mice showed a single peak of activity in response to phenobarbital treatment (26). These results are the first indication (a) that phenol and bilirubin transferase isozymes are genetically linked, at least in mice, and (b) that multiple transferases or multiple mechanisms of regulation of a single isoform may be involved in bilirubin IXα glucuronidation. The Gunn rat shows a totally defective bilirubin transferase phenotype (26a, 26b). In addition, it shows severely defective transferase activity toward (a) a number of planar phenols, (b) at least one bulky phenol (>4.0 Å thick), (c) two estrogen-like derivatives, and (d) a digitoxigenin monodigitoxoside (reviewed in 27). Furthermore, the Gunn rat lacks both bilirubin transferase responsiveness to phenobarbital treatment and phenol transferase responsiveness to PAH treatment, compared to the normal Wistar rat (reviewed in 27). In spite of these observations on relatedness, bilirubin- and phenol-metabolizing transferases were not considered members of the same subfamily. The results in the Gunn rat were explained by the presence of more than one defect. Hence, it was not expected that the enzymes responsible for the glucuronidation of these two chemical types share a genetic locus. The question is what factors led to the amplification of an ancestral exon to encode bilirubin or phenol glucuronidating activity. A comparison of the crystalline structure of the rigid ridge-tile for bilirubin IXα (described in Section II,A) (Fig. 1B) with the chemical coordinates of the planar phenols suggests structural similarities between these two substrates. The two hydrophobic dipyrroles in bilirubin IXα generating the rigid hydrogen-bonded configuration (Fig. 1A and B) (7, 8) with aromaticity via the conjugated ring structures (Fig. 1A) compare with the conjugated aromatic ring structures of the phenols, which are generally planar and hydrophobic. In the case of the binding of bilirubin IXα to specific sites in albumin, the evidence is that the two planar dipyrroles of bilirubin IXα are rotated with respect to each other (8) and interact with a polar and aromatic


site with strong hydrogen-bonding characteristics. How the configuration of the rotated two-planar heme derivative compares with that of the single-planar phenol configuration in their binding to an active center of a transferase is not known. Whether the hydrogen-bonded or an anionic state of bilirubin IXα represents the appropriate model to consider for chemical similarities to phenolic substrates, the aromaticity of both molecules no doubt requires a hydrophobic protein site for interaction. Hence, it is possible that similarities in chemistry did give rise to adaptations of an ancestral gene for the substrate-selecting domains for the two chemical types. It is possible that fewer changes were required between these two isoforms than between other substrate isoforms to achieve the required adaptations for catalytic turnover. We have compared two different critical microregions of the major human bilirubin isoform with similar regions in the phenol and steroid isoforms in a discussion below (Section VI).

III. The UGT1 Gene Complex Locus

A. The Human UGT1 Locus

Initially, two human liver bilirubin transferase cDNA clones (HUG-Br1 and HUG-Br2) were isolated, characterized, and shown to encode proteins with 533 and 534 amino-acid residues, respectively (28). The expressed proteins catalyzed the formation of the three bilirubin glucuronides produced in vivo. At the nucleic-acid and deduced amino-acid sequence levels for the 5' end, the two clones are 58% identical and 67% similar (Table I), respectively. The clones are identical after codon 288, thereby encoding proteins with the same carboxyl terminus. A report (20) demonstrates that a cDNA encoding a planar phenol-metabolizing transferase (HLUG P1) also encodes a protein with this same common peptide sequence. The identity in the three cDNAs suggests that the genetic locus utilizes shared exons to generate the mature mRNAs. The unique ends of the HUG-Br1 and HUG-Br2 cDNAs have been used to select three overlapping clones from a human genomic cosmid library that spanned some 95 kb. Through restriction enzyme analysis, subcloning, Southern blot analysis, and DNA sequencing, we uncovered and determined the arrangement of the novel UGT1 gene complex locus (30), which contains four exons (shared) in the 3' region that encode the identical carboxyl-terminal region of the two bilirubin and the phenol transferase cDNAs. In the 5' region of the locus, six different exons 1, each with a 5' proximal promoter element and each encoding the unique amino-terminal region of a transferase, are arrayed in series with the four common exons. Figure 2 depicts

TABLE I
HOMOLOGY AMONG THE BILIRUBIN-TYPE ISOFORMSᵃ

Exons compared    Nucleic acid identity (%)    Protein similarity (%)    Protein identity (%)
1A vs 1BP         58
1A vs 1C          58                           68                        49
1A vs 1D          58                           67                        49
1A vs 1E          60                           69                        50
1C vs 1BP         94
1C vs 1D          93                           93                        88
1C vs 1E          94                           95                        90
1D vs 1BP         93
1D vs 1E          93                           93                        88

ᵃ As a percentage of total.


FIG. 2. Schematic representation showing the arrangement of the UGT1 locus and seven of the predicted UDP-glucuronosyltransferase RNAs and isozymes. (A) The exon organization (open boxes), consisting of a tandem array of unique exons 1 (1A-1G) in the 5' region, with variable coding to select different substrates, and four common exons (2-5) in the 3' region coding for the reactive center for the common substrate, UDP-glucuronic acid. (Other exons 1 inferred from ongoing mapping, subcloning, and sequencing data at this locus are not shown.) Each exon 1 is flanked by an independent promoter (arrow) and 5' regulatory sequences allowing for independent expression of primary transcripts (designated UGT1A-UGT1G). The complex, spanning 107 kb, is not drawn to scale. (B) The predicted overlapping primary transcripts synthesized, due to alternative transcription initiation at this locus. Hence, each primary transcript, except UGT1A, has one or more internal exons 1. Based on the location of donor splice sites at the 3' exon-intron junction of each exon 1, acceptor/donor splice sites in exons 2-4, and an acceptor site in exon 5, the processing of mRNA is best described as differential splicing of the most 5' exon 1 to the common exons for each transcript. (C) The unique and common aspects of the UDP-glucuronosyltransferases encoded by the UGT1A-UGT1G genes. The unique ends are 285-289 residues, specified alternatively by exons 1, and each common carboxyl terminus is 246 residues, specified by common exons 2-5.


the gene arrangement. The presence of a promoter element (Fig. 2A, arrows) upstream of each exon 1, a donor splice site at the 3' exon-intron junction of each, the presence of both acceptor and donor splice sites in exons 2-4, and, finally, the presence of only an acceptor site in exon 5 prompted the proposal for alternative transcription initiation at each promoter to generate a series of nested primary transcripts (Fig. 2B). For all UGT1 transcription units except UGT1A (the shortest one), the primary transcript is predicted to contain in its first intron one or more unique exons 1 from downstream UGT1 transcription units. Due to the presence of only a donor splice site at the 3' exon-intron junction of each unique exon, it was considered possible that a number of potential splicing products could form. Yet, when UGT1 transcripts are analyzed by Northern blot analysis using a specific probe for each UGT1 mRNA, only a single hybridizing band is observed in each case. The transcripts appear to correspond to products generated by efficient splicing of the 5'-most donor splice site (at the 3' exon-intron junction of the leader first exon) to the acceptor splice site of exon 2. With our present understanding of RNA splicing, there is no theoretical mechanism that predicts a viable mRNA when utilizing an internal exon(s) 1 of a primary transcript. There is no predictable acceptor site for the removal of upstream intron segments. Therefore, differential splicing may best describe the processing of UGT1 primary transcripts. The novel structure of the UGT1 locus, with its multiple exons 1 and its shared exons 2-5, has prompted debate over the classification and nomenclature of the gene. A central question has been whether UGT1 should be viewed as a single gene, a complex of multiple genes, or a complex of multiple alleles.
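The nested-transcript proposal can be illustrated with a small model. The sketch below is only a toy representation: the 5'-to-3' exon order is an assumption read from the Fig. 2 schematic, the function names are hypothetical, and real splicing is of course not a list operation. It shows that each primary transcript carries every downstream exon 1, while the mature mRNA joins only the leader exon 1 to common exons 2-5.

```python
# Assumed 5'->3' order of unique exons 1, as read from the Fig. 2 schematic.
EXONS_1 = ["1G", "1F", "1E", "1D", "1C", "1BP", "1A"]
COMMON_EXONS = ["2", "3", "4", "5"]


def primary_transcript(first_exon):
    """Exons contained in the primary transcript initiated upstream of first_exon."""
    start = EXONS_1.index(first_exon)  # raises ValueError for an unknown exon
    return EXONS_1[start:] + COMMON_EXONS


def internal_exons_1(first_exon):
    """Downstream exons 1 that lie within the first intron of the transcript."""
    return primary_transcript(first_exon)[1:-len(COMMON_EXONS)]


def mature_mrna(first_exon):
    """Leader exon 1 spliced directly to exon 2; internal exons 1 are removed."""
    if first_exon not in EXONS_1:
        raise ValueError(f"unknown exon 1: {first_exon}")
    return [first_exon] + COMMON_EXONS


for exon in ("1A", "1D", "1F"):
    print(exon, "internal exons 1:", internal_exons_1(exon),
          "mature mRNA:", "-".join(mature_mrna(exon)))
```

In this model the transcript from the 3'-most exon 1 (1A) has no internal exons 1, matching the statement that UGT1A is the shortest transcription unit.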
We consider that UGT1 represents a gene complex with as many genes as there are unique exons 1. The reasons are that (1) each of the genes gives rise to a transferase protein with a unique amino-acid sequence and unique substrate specificity (i.e., each has unique lipophile-detoxifying functions), (2) each gene is independently regulated and can be independently mutated, and (3) the genes are not separately inherited and, therefore, do not represent allelic variants. The promoter elements upstream of each exon 1 represent the beginning of each gene. Starting from the 3'-most exon 1 and proceeding 5', the genes are designated UGT1A through UGT1F. The exons 1 are designated exon 1A to exon 1F. The unique portions of HUG-Br1, HUG-Br2, and HLUG P1 are specified by exons 1A, 1D, and 1F, respectively. Exons 1BP, 1C, and 1E were uncovered for the first time on characterizing the gene complex. Exon 1BP has a deleted nucleotide and is, therefore, designated a pseudogene. The nucleotide and protein sequence data indicate that exons 1BP to 1E are 93-94% identical (Table I). Due to this identity of exons 1BP, 1C, and 1E to that of 1D, we designate these as bilirubin-like until such time as the constructs for expression of the two viable proteins are studied in transfection systems to establish substrate specificities. The UGT1 gene complex exists as a single copy on chromosome 2 (29). The extensive duplication and insertion of an exon 1 that took place at this locus suggests that multiple recombination sites exist(ed) for insertion in the intronic regions harboring the exons 1. In our search for intronic sequences to generate overlap of the cosmid clones containing the HLUG P4-type exons (discussed in Section III,B), we have identified, through computer database searches, two different 200-bp unique and nonrepetitive sequences with about 70% identity in different introns approximately 38 kb apart. Second, a 180-bp unique sequence identified in an intronic DNA segment shares 61% identity with an intronic sequence in the rat UGT2B2 gene (29a). These sequences will be extended and used as probes to search for similar sequences in other introns at this locus. This locus with the extensive insertion of duplicated exons offers an opportunity to search for recombination sites.

B. Extension of the UGT1 Locus

The UGT1 gene complex contains, as we suggested (30), additional exons 1 upstream of the version of the gene already described. In particular, one anticipates an exon 1 to encode a bulky-type phenol transferase cDNA (31), designated HLUG P4, which contains the sequence for the common carboxyl terminus encoded by the UGT1 locus. By using exon 1F (the most 5' exon) as a probe, we have extended the locus from 95 to 107 kb through mapping with restriction endonucleases and sequencing an exon 1G that has 77% identity to the unique portion of HLUG P4. Through mapping and sequencing three other cosmid clones selected by hybridization to exon 1G, at least five other exons 1 have been identified, but their positions in the locus have not been established. Partial sequence data indicate that one encodes the HLUG P4 cDNA and five are HLUG P4-like. Therefore, there are at least seven phenol transferases encoded at this locus. We continue the characterization of the cosmid clones containing HLUG P4-like exons by restriction endonuclease mapping, subcloning, and sequencing, which promises to at least double the size of the original locus. Due to the uncertainty of the number of exons that are 1D-like (30), it was of interest to determine whether all such members were identified in the original description of the locus. For this purpose, we carried out Southern blot analysis of genomic DNA with the complete unique sequence of exon 1D and compared the pattern with specific oligo sequences hybridizing to exons 1BP to 1E. The oligo-specific data accounted for all the DNA fragments observed with the full-length exon 1D probe (F. Chen, A. Pilon and I.S. Owens, unpublished). Hence, it is unlikely that other exons like exon 1D exist as speculated (30).

The evidence to date indicates that the locus has three distinct regions, starting from the most 5' and proceeding to the most 3'. In the first region, spanning more than 100 kb, there is a series of at least seven phenol-selecting exons, of which at least six, designated D, D', D'', D''', D'''', and D''''', form a group D (HLUG P4-type) followed by the single C (HLUG P1). Hence, the phenol-selecting domains have a pattern of diversity according to D, D', D'', D''', D'''', D''''', and C; D indicates that multiple HLUG P4-type exons 1 exist (with partial sequence data), and C denotes exon 1F (HLUG P1), which has no highly related homolog. Second, spanning approximately 65 kb, there are the five bilirubin-type exons 1, designated B, B', B'', B''', and A, with B representing a highly homologous group (exons 1BP to 1E or exon 1D-like or HUG-Br2-like) followed by the single A bilirubin-type (HUG-Br1), which has no closely related homolog. The divergency among the bilirubin-type exons is thus represented by B, B', B'', B''', and A. Third, there is the series of common exons 2-5 contained in 6 kb, as shown in Fig. 2. Thus, the composite for the diversity of the exons 1 starting 5' to 3' at this locus is: D, D', D'', D''', D'''', D''''', C, B, B', B'', B''', and A. Which exon 1 is the most ancestral, giving rise to this series, is of interest. Is it possible that the exon 1 series represents a rapid adaptive response to a momentous evolutionary or environmental pressure?
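The three-region description lends itself to a small bookkeeping sketch. The code below simply rebuilds the 5'-to-3' composite of exon 1 types from the counts given in the text and adds up the quoted region sizes; the variable names are hypothetical, and because the first region is described only as "more than 100 kb" and the second as "approximately 65 kb", the total should be read as a rough lower bound rather than a measured length.

```python
def primed_series(letter, count):
    """Return [X, X', X'', ...] with `count` members."""
    return [letter + "'" * i for i in range(count)]


# Exon 1 types in 5'->3' order, using the counts stated in the text.
PHENOL_TYPE = primed_series("D", 6) + ["C"]      # HLUG P4-type group, then HLUG P1
BILIRUBIN_TYPE = primed_series("B", 4) + ["A"]   # exon 1D-like group, then HUG-Br1
COMPOSITE = PHENOL_TYPE + BILIRUBIN_TYPE

# Region sizes in kb as quoted; the first two are approximate.
REGION_KB = {
    "phenol-type exons 1": 100,
    "bilirubin-type exons 1": 65,
    "common exons 2-5": 6,
}

print("composite 5'->3':", ", ".join(COMPOSITE))
print("exon 1 types:", len(COMPOSITE))
print("approximate minimum span (kb):", sum(REGION_KB.values()))
```

The composite contains twelve exon 1 labels (seven phenol-type, five bilirubin-type), and the quoted region sizes sum to roughly 171 kb.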

C. The UGT1 Locus of Other Species

Although no description of the UGT1 locus in other species has been reported, it was first recognized, in the rat, that both the 3-methylcholanthrene-inducible phenol and the bilirubin UDP-glucuronosyltransferases share a common carboxyl terminus that probably arose through exon sharing (32). A more recent study (33) showed that four other transferase isoforms from this species share the same common carboxyl end; substrate selectivity of the forms was not reported. In the mouse, both a putative phenol and a bilirubin isoform share a common carboxyl terminus (34). Thus, the common carboxyl-terminal region is the landmark for the UGT1 locus.

IV. Exons 1 Determine Structural Diversity of the Transferases

A. Diversity at the Human UGT1 Locus

The nine unique exons 1 (already sequenced) encoding the amino termini of nine transferase isoforms are essentially uniform in size, encoding 285-289 residues out of 530-534. In considering the homology between the bilirubin-type exons 1 (1A to 1E), the data show that 1BP through 1E have 93-95% nucleic-acid identity and deduced amino-acid similarity to each other. Exon 1A has 58-60% nucleic-acid identity and 67-69% deduced protein-sequence similarity to exons 1BP to 1E as shown in Table I, indicating that exons 1BP to 1E are quite divergent from exon 1A. This pattern of similarity suggests that the duplications involving the exon 1D-like series were more recent evolutionary events. The basis for the extensive duplication most likely relates to the biological function provided by the domain. Because exon 1D specifies bilirubin detoxification, one can conclude that evolutionary pressure exists (or existed) to expand this function by creating multiple bilirubin-specific isozymes and/or by developing a more optimal domain for glucuronidating bilirubin and/or by utilizing domains with independent modes of regulation. The deduced sequence analysis shows that the bilirubin-type isoforms encoded by exons 1BP to 1E, exon 1A, and that for rat (32) are approximately 66-76% similar, signifying that multiple primary amino-acid sequences can yield a tertiary structure capable of glucuronidating bilirubin. The comparisons of identity and similarity for the phenol-selecting domains characterized to date are shown in Table II. Exon 1G is 90% identical to the unique region of HLUG P4. Based on Southern blot analysis of recently isolated and mapped cosmid clones, this most 5' exon 1G (HLUG P5) cross-hybridizes with at least four other exons 1, including the one encoding the unique region of HLUG P4. Nucleic-acid sequence data show that another of these HLUG P4-like exons 1 has 88-90% identity to exon 1G and to the one specifying HLUG P4. The order at the locus beyond exon 1G has not been established.
Among the phenol-selecting domains, exon 1F is significantly different (51-54% nucleic-acid identity and 61-63% amino-acid similarity) from the HLUG P4-like series as shown in Table II. We do not yet

TABLE II
HOMOLOGY AMONG PHENOL-TYPE ISOFORMS(a)

Comparison               Nucleic acid    Protein          Protein
                         identity (%)    similarity (%)   identity (%)
1F vs. 1G and HLUG-P4    51-54           61-63            42
1G vs. HLUG-P4           90              85               77

(a) As a percentage of total.

know the substrate specificities encoded by the newly isolated HLUG P4-like exons 1. The depiction of the protein structures encoded at this locus is shown in Fig. 2C. An inter-exon 1 comparison of the bilirubin- and phenol-selecting domains is shown in Table III. The higher intra-exon 1 identity for the exon 1D-like and for the HLUG P4-like forms than exists between the bilirubin types and between the phenol types, respectively, suggests that both groups resulted from recent evolutionary events and are not likely closely related to a more distant ancestral exon 1. The exon 1F-encoded phenol isoform is more related to the exon 1A-encoded bilirubin form than to the phenolic HLUG P4-type. On this basis one could argue that a 1F-like exon gave rise both to an exon 1A-like exon (not vice versa) and to the HLUG P4-like series, and that exon 1A generated the exon 1D-like series. The descendancy of exon 1A from an exon 1F-like exon (rather than vice versa) is based on the presence of plant-derived phenols in biological systems before the appearance of bilirubin, and thus the need to detoxify the phenols arose before that for bilirubin IXα. We would argue, therefore, that an ancestral planar-phenol-selecting exon 1 gave rise to the exon 1A-type bilirubin-selecting domain. It is also possible that HLUG P4-type exons 1 (the more divergent of the two highly related series) are more ancient but maintained the homology due to unidentified selective pressure. The extensive duplication of the HLUG P4-like phenol isoforms may signify that phenolic toxicity in animal models exists (or existed) and is more critical than we appreciate. Certain fish that lack phenol transferase activity are poisoned by phenols (1). It is not appreciated that bulky phenols are a greater risk for toxicity than the planar phenols, and, hence, exert greater pressure to expand this function.
Bulky phenols with complex alkyl substituent groups are less water soluble and more dependent on glucuronidation for excretion than are the simple planar phenols. It is suggested that the lack of planar phenol transferase in the cat may be related to the generally carnivorous diet and/or the evolution of its efficient sulfation system to detoxify the planar phenols (5). Phenotypically nonjaundiced individuals who are sensitive to high doses of the planar phenolic acetaminophen have been identified. In the genome of these individuals, it remains to be seen if a mutation exists in the exon 1F encoding the unique region of the acetaminophen-metabolizing HLUG P1 isoform.
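The identity percentages quoted throughout this section are pairwise comparisons of aligned sequences. As a minimal sketch of how such a figure can be obtained, the function below counts matching positions in two sequences that have already been aligned to equal length, with "-" marking gaps. The function name and the choice of denominator (columns in which neither sequence has a gap) are assumptions made for illustration; the source does not state which program or convention was used, and "similarity" values would additionally require a definition of conservative substitutions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue          # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other denominators (alignment length, or length of the shorter sequence) give different percentages for the same alignment, which is one reason identity values from different reports are not always directly comparable.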

B. Diversity among Rat Isoforms Having Identical Carboxyl Termini

A comparison among species for similarities among isoforms encoded at the UGT1 loci will provide valuable insight about the timing and sequence of evolutionary events that shaped UGT1 into its modern form. The sequence similarities suggesting the evolution of four distinct subgroups (A, B, C, and

TABLE III
HOMOLOGY OF MAJOR AND MINOR BILIRUBIN AND PHENOL ISOFORMS(a)

Exons compared: 1F, 1A, 1D, HLUG-P4, 1G

Nucleic acid     Protein           Protein
identity (%)     similarity (%)    identity (%)
52, 51           67, 61            43, 41
52, 51           63, 57            39, 39
52, 52           61, 59            39, 40

(a) As a percentage of total.

D) of human UGT1 exons have prompted the question of whether the UGT1 locus of other species has a similar structure and pattern. To address this, we compared the protein sequences of the five UGT1-encoded enzymes known for the rat (33). Only two of the five were found to be highly related, the rat bilirubin transferase enzyme being 88% similar to the isozyme encoded by the B6 cDNA. The higher similarity of rat bilirubin transferase to the human UGT1D enzyme (76%) compared to the UGT1A enzyme (71%) suggests that both rats and humans have evolved a series of B-type exons. In contrast, the sequences of the remaining three proteins were much more divergent (similarities of 56-70% to each other and to the bilirubin transferase and B6 cDNAs). One of these is the form coded by the 3-methylcholanthrene-inducible 4-nitrophenol transferase cDNA (35), which is thought to represent the rat ortholog of the human UGT1F isoform based on its high similarity (84%). These observations suggest that rats may have at least five subgroups of related exons 1, two of which have a high degree of resemblance to the B and C groups of human.

C. Comparisons to Isoforms Encoded at Other Loci

The important feature of the UGT1 locus is the completely static nature of the 3' region encoding the common carboxyl termini of the isoforms, whereas the exons encoding the acceptor-substrate-selecting domains have undergone significant diversity (6-49%) consistent with transferase isoforms, i.e., those conjugating steroids, encoded at typical loci (29a, 36). Simple alignments of two isoforms (37-39) or multiple alignments show that the carboxyl terminus is remarkably constant, whereas diversity occurs in the amino terminus. It was first recognized, through chimeric constructs of two different isoforms, that the 298-residue amino-terminal portion selects the acceptor substrate (40). This is consistent with the 288-residue amino-terminal region providing substrate specificity by the UGT1 locus through duplication and divergent evolution, whereas the common exons have remained essentially static in the expansion of the glucuronidation function and, presumably, through selective pressure. At UGT1 loci, the similarity of this region between human and rat is 86.7%, between human and mouse is 86.0%, and between rat and mouse is 94.4%. Although there are many examples of other isozyme systems with alternative substrates or tissue distribution, alternative exon splicing of primary transcripts has often allowed different functions. It has not been previously reported that expansion of function of a protein system occurred through such extensive but discrete duplication/divergency of a single encoded domain organized to allow for independency of expression. One might conclude that the force of "need" for the function was pivotal in the development of the locus.

V. Defects Define Microregions in Bilirubin Transferase with Important Clues

Since uncovering this locus through selecting cosmid clones with the unique ends of HUG-Br1 (1A) and HUG-Br2 (1D), we detected (42-44), for the first time, defects in the genome of Crigler-Najjar Type-I and -II patients that account for the loss of bilirubin transferase activity. Mutations in two different Type-I patients led to the identification of two conserved microregions (A and B) in the unique amino terminus of the HUG-Br1 protein, the major bilirubin isoform. Both microregions are critical for bilirubin glucuronidating activity. Each microregion is conserved in the bilirubin-type (exons 1A-1E) isoforms. In microregion A (amino-acid residues 161-180), a Phe codon deletion at position 170 disrupted a diphenylalanine (43) in a conserved hydrophobic region and caused a pH-sensitive mutant with loss of activity at the major optimum of pH 6.4, but not at the minor one at pH 7.6. A comparison of microregion A of both UGT1-encoded bilirubin (43) and phenol transferases with that for all other transferases (generally steroid metabolizing) encoded at other loci shows remarkable sequence identity between members of the former two groups, in contrast to that between the bilirubin and the latter (steroid) isoforms. The sequences surrounding the diphenylalanine (residues 170/171) define a characteristic hydrophobic site for the three types of isoforms. The consensus sequences of six bilirubin/bilirubin-like (32, 34, 43), of four phenol (34, 35), and of ten steroid (2, 27, 44a) isoforms, respectively, contain Pro, Thr/Ala, and four consecutive hydrophobic residues with two consecutive aromatic residues; Pro, Ser, and four consecutive hydrophobic residues with one or two nonconsecutive aromatic residues; and Pro, three hydrophobic residues with one or two nonconsecutive aromatic residues, Ser, then the fourth hydrophobic residue (consensus sequences shown in Fig. 3A) (M. Ciotti, J. Cho and I. Owens, unpublished).
The bold residues are invariant in Fig. 3A. Evidently the Thr/Ser-168 in the bilirubin and phenol isoforms translocated to position 171, disrupting the string of four consecutive hydrophobic residues in the steroid isoforms, thereby reducing the hydrophobic properties of its microregion A. In the limited UGT1-encoded isoforms characterized for substrate preference, the phenol isoforms identified to date have no consecutive aromatic residues in microregion A. The microregion A in the bilirubin and phenol isoforms is more hydrophobic than that in the steroid forms and contains one critical difference (no

[Fig. 3. Panel A: consensus sequences of residues 167-174 for the bilirubin, phenol, and steroid isoforms. Panel B: rabbit UGT1.4 and UGT1.6; rat B6, A10, and A18.]

FIG. 3. (A) Consensus sequences of conserved microregion A of the UGT1-encoded bilirubin and phenol isoforms compared with that of steroid isoforms. The peptide regions between residues 167 and 174 were compiled from various isoforms as described in Section V. The sequences are in the one-letter symbols for the amino acids. The bold letters represent invariant positions. Of the isoforms assigned substrate preferences, the bilirubin isoforms contained two consecutive aromatic residues at 170/171; the phenol forms did not contain consecutive aromatic residues. No steroid isoforms contained consecutive aromatic residues. (B) Microregion A in transferases encoded at the UGT1 locus but not assigned a substrate preference.

consecutive aromatic residues) at this site; when contrasted, this region in the bilirubin form is far more hydrophobic than that in the steroid forms with at least two or more critical differences (disruption of four hydrophobic residues and nonconsecutive aromatic ones). Structure-function data (M. Ciotti, J. Cho and I. S. Owens, unpublished) indicate that Ile or Ala instead of Phe-170 causes a loss of bilirubin glucuronidation by the HUG-Br1 protein. Prediction of secondary structure of these membrane proteins by the computer program RAOARGOS (44b) suggests that both the HUG-Br1 and HLUG P1 (M. Ciotti, J. Cho and I. S. Owens, unpublished) isoforms have a buried (membrane-associated) helical structure in the microregion A, whereas a human prototype steroid isoform, UDPGTh-2 (45), does not contain such a structure in this region. All UDP-glucuronosyltransferase isoforms contain a hydrophobic signal-peptide insertion sequence within the first 20 residues and a membrane-anchoring domain between residues 475 and 494, both of which this program shows as buried (membrane associated). In order to make further comparisons we provide, in Fig. 3B, the sequence data between residues 167 and 174 of UGT1-encoded isoforms from rabbit (GenBank Accession numbers U09101, U09030) and from rat (33), which have not been assigned substrate preference(s). A second Crigler-Najjar Type-I patient contained a Gly-to-Arg mutation at codon 276, which disrupted a strictly conserved diglycine at position 276/277 in every UDP-glucuronosyltransferase isoform. This mutation defined a conserved microregion B (residues 269-280) that is identical, considering conservative substitutions, between UGT1-encoded bilirubin and phenol isoforms; all other isoforms, encoded at non-UGT1 loci, contain

[Fig. 4. Residues 269-280 (microregion B) of the UGT1-encoded and non-UGT1-encoded isoforms, with superscripts giving the number of isoforms containing each residue.]

FIG. 4. Microregion B in transferases encoded at the UGT1 locus or at non-UGT1 loci. The sequences are defined between residues 269 and 280; the letters are the standard abbreviations for amino acids. The superscripts represent the number of isoforms that contained that residue at that position.

a distinguishing Asp-, Glu-, or Gln-273 residue versus an aliphatic Val/Ile residue for the UGT1-encoded isoforms (Fig. 4, underlined) (44). In these two microanalyses on relatedness of critical structures in the three classes of isoforms, the results indicate that the bilirubin form is more akin to the phenol one than to the steroid isoforms. Hence, this close identity suggests that fewer substitutions were required to evolve a phenol-to-bilirubin selecting domain than a steroid-to-bilirubin one. The closer relationship between these proteins and necessarily catalytic requirements most likely drove the encoding of these two glucuronidating activities at the same genetic locus.
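The features used to contrast microregion A among the three isoform classes (adjacent aromatic residues at 170/171 and the length of the hydrophobic run) can be checked mechanically for any octapeptide spanning residues 167-174. The sketch below is a hedged illustration: the residue classes (aromatic F, W, Y; hydrophobic A, V, L, I, M, F, W, Y) are an assumption, since the text does not define them, and the two example peptides in the usage note are illustrative strings rather than data taken from Fig. 3.

```python
AROMATIC = set("FWY")            # assumed aromatic class
HYDROPHOBIC = set("AVLIMFWY")    # assumed hydrophobic class


def consecutive_aromatic_pairs(peptide: str, start: int = 167):
    """Residue numbers (i, i+1) of every adjacent pair of aromatic residues."""
    pep = peptide.upper()
    return [(start + i, start + i + 1)
            for i in range(len(pep) - 1)
            if pep[i] in AROMATIC and pep[i + 1] in AROMATIC]


def longest_hydrophobic_run(peptide: str) -> int:
    """Length of the longest uninterrupted stretch of hydrophobic residues."""
    best = run = 0
    for aa in peptide.upper():
        run = run + 1 if aa in HYDROPHOBIC else 0
        best = max(best, run)
    return best
```

For an illustrative bilirubin-like peptide such as PTVFFLRY, the first function reports the pair at 170/171 and the second a run of four; for an illustrative phenol-like peptide such as PSVFLLRG, the run is again four but no adjacent aromatic pair is found.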

VI. Comparisons of the UGT1 Gene Complex to Rat Steroid Transferase Genes

Although many of the isoforms described to date (see reviews 27, 44a) conjugate xenobiotics, each will also conjugate a steroid. Aspects of two rat UDP-glucuronosyltransferase genes, members of the UGT2B family, described in the literature (29a, 36) show remarkable similarity to that of the UGT1 gene. Exons 1 and 2 of the testosterone and etiocholanolone/androsterone transferase genes, encoding 293/291 amino acids, respectively, are the equivalent of exon 1, evidently resulting from exon fusion at the UGT1 locus coding for 288 residues. Exons 3 to 6 code for 44/44, 29/29, 74/74, and 91/92 residues in the rat genes, whereas exons 2 to 5 code for 44, 29, 74, and 99 amino acids for the human gene. Although the exon sizes are rigidly conserved, the combined intronic sizes harboring the equivalent of the evolutionarily static common exons (3-6) for the UGT2B1/UGT2B subfamilies are 2.1/9.5 kb, compared to 5.3 kb for the UGT1 locus.
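The exon-by-exon residue counts quoted above can be tabulated and summed as a quick consistency check. The sketch below simply adds the figures given in the text; the dictionary names are hypothetical, and the totals are arithmetic on the quoted values rather than independently verified protein lengths.

```python
# Residues encoded per exon, as quoted in the text.
HUMAN_UGT1 = {"exon 1": 288, "exon 2": 44, "exon 3": 29, "exon 4": 74, "exon 5": 99}
RAT_TESTOSTERONE = {"exons 1+2": 293, "exon 3": 44, "exon 4": 29,
                    "exon 5": 74, "exon 6": 91}
RAT_ETIOCHOLANOLONE_ANDROSTERONE = {"exons 1+2": 291, "exon 3": 44, "exon 4": 29,
                                    "exon 5": 74, "exon 6": 92}


def total_residues(gene: dict) -> int:
    """Sum of residues encoded by all listed exons."""
    return sum(gene.values())


def common_region_residues(gene: dict) -> int:
    """Residues encoded by everything after the first (substrate-selecting) entry."""
    return sum(list(gene.values())[1:])


if __name__ == "__main__":
    for label, gene in [("human UGT1", HUMAN_UGT1),
                        ("rat testosterone", RAT_TESTOSTERONE),
                        ("rat etiocholanolone/androsterone",
                         RAT_ETIOCHOLANOLONE_ANDROSTERONE)]:
        print(label, total_residues(gene), common_region_residues(gene))
```

The human sum of 534 falls within the 530-534 residue range given earlier for the UGT1-encoded isoforms.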

The conservation of exon sizes encoding the carboxyl-terminal region suggests that distinct functional domains exist in this portion of the molecule, most likely for interacting with the common substrate UDP-glucuronate. Because no other cases of identity at the cDNA level have been uncovered, the sharing of exons to generate mature mRNA species appears to be unique to the UGT1 locus. Thus, we consider the uniqueness of this occurrence and the similarity in overall gene structure to that of the rat UGT2B loci as evidence that biological pressure modified an existing locus to protect the species against the toxicity of the heme metabolite or, perhaps, phenolic compounds. It is of interest to determine the arrangement of the locus in the cat, which evidently does not possess planar phenol activity, but has intermediate levels of bilirubin transferase activity.

VII. Substrate Specificity of UGT1-encoded Isoforms

A characteristic of transferase enzymes ideally suited to their function in the detoxification of lipophilic substances is their broad substrate specificity. Most transferases can catalyze the glucuronidation of multiple acceptor substrates. These substrates may be structurally related or diverse. Ironically, it was their broad substrate specificity that confounded early attempts to characterize the substrate preferences of individual isozymes. This obstacle has been overcome through the cloning and expression of transferase cDNAs in cultured cells, either COS or Chinese hamster V79. The expression of functional isoforms in an isolated, pure state has allowed precise assessment of their individual substrate specificities. Only minimal progress has been made in characterizing the substrate specificity of individual UGT1 isozymes. Among the dozen or so isoforms predicted to exist from the structure of the UGT1 gene complex, only four have been expressed in cell culture (UGT1A, UGT1D, UGT1F, and the UGT1G-like gene product corresponding to the HLUG P4 cDNA), and only three have been tested toward substrates other than bilirubin. UGT1A (HUG-Br1) catalyzes the glucuronidation of bilirubin, 1-naphthol, estriol, and 17α-ethynylestradiol, but not 2,6-diisopropylphenol (46). This selectivity pattern contrasts with that of the UGT1F (HLUG P1) enzyme (active toward naphthol, but not bilirubin, 2,6-diisopropylphenol, estriol, or 17α-ethynylestradiol) and the HLUG P4 enzyme (active toward 1-naphthol, 2,6-diisopropylphenol, and 17α-ethynylestradiol, but not bilirubin or estriol) (46). In a more extensive study of the substrate specificities of the HLUG P1 and HLUG P4 enzymes (47), HLUG P1 exhibited a preference for small

planar phenols (for example, the classical substrates 4-nitrophenol, 1-naphthol, and 4-methylumbelliferone) and phenol derivatives with alkyl substitutions (methyl, ethyl, propyl, isopropyl, and butyl) in the 4-position. HLUG P4, in contrast, demonstrated much greater functional diversity than HLUG P1 (and most other isoforms). The enzyme catalyzes the glucuronidation of bulky compounds, including aliphatic alcohols, complex phenols, carboxylic acids, and amines. Its substrates include drugs with widely differing structures and therapeutic activities, natural substances found in the diet, such as anthraquinones and flavones, and also toxic xenobiotics.
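The selectivity pattern reported for the three expressed isoforms toward the five test substrates (46) amounts to a small activity matrix. The sketch below encodes it as a lookup table; the variable and function names are hypothetical, the Greek letter in 17α-ethynylestradiol is spelled out as "alpha", and the "naphthol" activity listed for UGT1F is entered under 1-naphthol, which is an assumption.

```python
SUBSTRATES = ("bilirubin", "1-naphthol", "estriol",
              "17alpha-ethynylestradiol", "2,6-diisopropylphenol")

# True = glucuronidation reported, False = reported inactive (ref. 46 in the text).
# Assumption: the "naphthol" activity of UGT1F is recorded under 1-naphthol.
ACTIVITY = {
    "UGT1A (HUG-Br1)": (True, True, True, True, False),
    "UGT1F (HLUG P1)": (False, True, False, False, False),
    "HLUG P4":         (False, True, False, True, True),
}


def is_active(isoform: str, substrate: str) -> bool:
    """Look up whether an isoform was reported active toward a substrate."""
    return ACTIVITY[isoform][SUBSTRATES.index(substrate)]


def isoforms_active_toward(substrate: str):
    """Isoforms reported active toward the given substrate, in table order."""
    return [iso for iso in ACTIVITY if is_active(iso, substrate)]


def discriminating_substrates():
    """Substrates on which the three isoforms do not all give the same result."""
    return [s for s in SUBSTRATES
            if len({is_active(iso, s) for iso in ACTIVITY}) > 1]
```

In this encoding every substrate except 1-naphthol separates at least one isoform from the others, which is consistent with the text's use of the panel to contrast the three enzymes.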

VIII. Regulation of the UGT1 Genes

The regulation of the UGT1 genes is an important aspect of the biology of this subfamily of isozymes and of the UDP-glucuronosyltransferases in general. Alterations in the amount of these enzymes resulting from altered gene expression have the potential to change the steady-state levels of a glucuronidation acceptor. Depending on the nature of the acceptor (i.e., whether it is a drug, environmental toxin, or an endogenous waste product), this may result in an altered biological, therapeutic, or toxic effect of the acceptor substance. This can be attributed to the markedly reduced biological activity (or toxicity) of the glucuronide conjugate compared to the unconjugated counterpart. An excellent example of how transferase enzyme regulation can alter the outcome associated with exposure to a toxic acceptor is afforded by the induction of bilirubin glucuronidating activity by phenobarbital (47a). Treatment with phenobarbital induces the synthesis of new bilirubin transferase enzyme, an effect thought to involve transcriptional activation of the gene. The increased enzyme activity leads to a reduction in the steady-state plasma level of unconjugated bilirubin. When individuals afflicted with the Crigler-Najjar Type-II syndrome are treated with phenobarbital, the resulting reduction in plasma bilirubin levels reduces the risk of kernicteric injury to the nervous system in response to high bilirubin concentrations. The increase in bilirubin transferase by phenobarbital constitutes a unique example of a therapeutic mechanism of action involving enzyme induction. The potential for gene regulation to influence the activities of the UDP-glucuronosyltransferases and, in turn, the detoxifying capacity of the glucuronidation system emphasizes the need to understand the regulation of UGT1 genes.
Like other transferases and biotransformation enzymes in general, the UGT1 enzymes are differentially regulated during growth and development by tissue-specific mechanisms and in response to exposure to drugs and other types of xenobiotic inducing substances (i.e., inducibility).

Although the exact mechanisms of regulation of the UGT1 enzymes remain obscure, experience with other biotransformation enzymes suggests that the answers will be found through an understanding of their gene expression. One of the major benefits of identifying and characterizing the UGT1 gene locus is that it provides basic tools to begin to resolve questions regarding its regulation.

A. Independent Regulation of UGT1 Enzymes

Clearly, the structure of the UGT1 gene complex implies that each of the transcription units is independently initiated and regulated (Fig. 2B). Each transcription unit has a flanking TATA-like element, characteristic of genes transcribed by RNA polymerase II, located between bases -31 and -25 relative to its transcription start site (determined by primer extension analyses). This is best illustrated for the UGT1A unit, which features an unusual string of seven consecutive, repeated TA residues, TATATATATATATAA, at positions -37 to -23 relative to its transcriptional start site. The data suggest that the unique amino termini of the proteins correspond to first exons of transcription units (with the possible exception of the UGT1F exon from the rat, as discussed in Section VIII,D). In agreement with this hypothesis, fragments encompassing the putative promoters from the UGT1A, UGT1D, and UGT1F genes were able to drive the transcription of a reporter gene (chloramphenicol acetyltransferase, or CAT) in transient transfection gene-reporter assays (I. S. Owens, J. K. Ritter, F. Chen and A. Pilon, unpublished).
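The description of the UGT1A TATA-like element can be verified with a few lines of code: the quoted 15-base string should begin with seven TA repeats and should exactly fill positions -37 to -23. The sketch below is an illustration under stated assumptions; the function names are hypothetical, and the span calculation assumes both coordinates lie upstream of the start site so that no position zero is crossed.

```python
import re

UGT1A_TATA_ELEMENT = "TATATATATATATAA"   # sequence as quoted in the text


def leading_ta_repeats(sequence: str) -> int:
    """Number of consecutive TA dinucleotides at the start of the sequence."""
    match = re.match(r"(?:TA)+", sequence.upper())
    return len(match.group(0)) // 2 if match else 0


def upstream_span_length(first: int, last: int) -> int:
    """Bases covered by an inclusive range of upstream (negative) positions."""
    if not (first <= last < 0):
        raise ValueError("expected first <= last < 0")
    return last - first + 1


if __name__ == "__main__":
    print(leading_ta_repeats(UGT1A_TATA_ELEMENT))
    print(upstream_span_length(-37, -23), len(UGT1A_TATA_ELEMENT))
```

Both checks agree with the text: seven TA repeats followed by a single A, occupying 15 positions.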

B. UGT1A Gene Regulation

The most definitive evidence for the independent expression of UGT1 genes comes from comparison of the patterns of constitutive, developmental, and tissue-specific expression and their inducibility by xenobiotics. These patterns emphasize diversity in the regulation of expression of individual UGT1 genes. The UGT1A gene is distinct among the UGT1 genes because it is expressed constitutively in liver at a higher level than any of the others so far examined. Estimates of mRNA expression from our analysis of two samples of human liver mRNA showed that the UGT1A mRNA is present at a 2.5-fold greater level than the UGT1D RNA (28, 30) and a 5-fold greater level than that of UGT1F (30). A greater abundance of UGT1A (HP3) mRNA relative to that of UGT1D (HP2) or UGT1F (HP1) was also observed in a study of RNA samples from seven human liver specimens (48). Transferase enzyme activity toward bilirubin is unique compared to the phenol isoforms in its developmental and tissue distribution patterns. Human bilirubin transferase activity first appears during the neonatal period

and develops to adult levels over the course of the first year (49). The onset in bilirubin transferase expression is often delayed in newborns, resulting in the accumulation of serum and tissue bilirubin and the manifestation of jaundice during the 48- to 72-hour period post-partum. These observations suggest that onset of UGT1A gene expression occurs between birth and 72 hours, although this has not yet been directly demonstrated. UGT1A gene expression is also distinct from the phenol forms in its restricted tissue distribution; it is present in human liver, but not kidney or skin, possibly representing a liver-specific isoform (30, 48). The expression of the phenol transferase genes (UGT1F and HLUG P4-like genes) in many extrahepatic tissues is evidence that the lack of UGT1A gene expression is not due to a closed chromatin structure. It most likely indicates that transcription of UGT1A in humans requires the participation of liver-specific transcription factors. It remains unclear whether UGT1A is regulated by exposure to phenobarbital or other chemical inducing agents. Induction of hepatic bilirubin transferase activity in humans following phenobarbital treatment suggests that at least one of the bilirubin transferase-encoding genes, UGT1A, UGT1D, or possibly another of the UGT1D-like genes, is responsive to phenobarbital. Northern analyses of liver RNA from a control monkey and a monkey that received phenobarbital in its drinking water showed no effect of phenobarbital on hepatic UGT1A mRNA expression, suggesting that UGT1A is not phenobarbital responsive and therefore represents a high constitutive bilirubin transferase. However, in another study (48), UGT1A (HP3) mRNA levels were two- to threefold higher in liver tissue from patients known to be treated with phenobarbital or the phenobarbital-type inducing agent, phenytoin, suggesting that UGT1A is phenobarbital responsive.
This is supported by the finding that the glucuronidation of 17α-ethynylestradiol, a substrate for the UGT1A-encoded transferase (46), is induced by phenobarbital in human liver (50).

C. UGT1D and UGT1D-like Gene Regulation

Analysis of expression of UGT1D-like genes has posed a challenging problem because of the high degree of sequence identity in their respective mRNAs. To overcome the problem of distinguishing between isoforms, a probe strategy was devised using 32P-labeled oligonucleotides corresponding to unique sequences of the exons 1BP, 1C, 1D, and 1E (F. Chen, M. Yeatman and I. S. Owens, unpublished). Hybridization conditions were established that permit a high (if not absolute) degree of binding specificity of each oligonucleotide to its respective UGT1 mRNA. The use of this strategy enabled the determination that among the UGT1D-like genes, only UGT1C

and UGT1D are expressed in normal liver of transplant-donor specimens. The level of UGT1D mRNA was approximately 5- to 10-fold that of UGT1C mRNA. The development of specific probes for UGT1D and its related genes will facilitate further studies focused on regulation of the individual genes. The lack of bilirubin transferase activity prior to the neonatal period indicates that UGT1D, like UGT1A, must also be programmed for a neonatal onset. In contrast to the UGT1A mRNA, the UGT1D mRNA was increased two- to threefold in the phenobarbital-treated monkey, suggesting that UGT1D represents a low constitutive, phenobarbital-inducible isoform (28). This result was supported, at least in part, by the study with human liver specimens; UGT1D (HP2) mRNA was elevated threefold in the liver sample of the patient treated with phenytoin. However, it did not appear to be affected in the patient treated with phenobarbital.

D. UGT1F Gene Regulation

In contrast to the bilirubin transferase genes, the regulation of UGT1F gene expression has attracted considerable interest, due to the inducibility of the ortholog of UGT1F (4-nitrophenol transferase, 4NP-GT) in rats treated with PAHs (35), its proposed involvement in the normal detoxification and elimination of phenols and amines, and its role in the toxin-resistant phenotype of initiated, preneoplastic hepatocytes (51). The independent regulation of UGT1F is emphasized by its distinct profile from that of the bilirubin transferases. The developmental onset of phenol glucuronidating activity occurs during the late fetal stage of gestation (49). In rats, the inducibility of the late-fetal phenol transferase by PAHs suggests its identity with the UGT1F homolog. Phenol transferase is also characterized by its broader tissue distribution compared to the bilirubin isoforms. 4-Nitrophenol glucuronidating activity is expressed in liver as well as in a broad array of extrahepatic tissues from rat, including kidney, lung, ovary, testes, and epididymis (52), brain (53), and lymphocytes (54). The wide distribution may signal that the phenol transferase plays a crucial role in the cellular detoxification of endogenous and xenobiotic substances. The feature that has most distinguished the regulation of the 4-nitrophenol transferase from that of other UGT1 isozymes is its inducibility in liver of rats and mice by a variety of xenobiotic-inducing agents. The responsiveness of phenol glucuronidating activity to PAHs (55) and phenolic antioxidants (56) has been known and studied for years (55). The identity of the PAH-inducible phenol transferase was established by cDNA cloning (35). Northern blot analyses showed that the mRNA was increased 15-fold after

treatment with 3-methylcholanthrene (35) and 10-fold in rat liver after treatment with the PAH-type inducing agent, 2,3,7,8-tetrachlorodibenzo-p-dioxin ("dioxin") (52). A phenol transferase activity thought to correspond with UGT1F is also up-regulated in liver cells with precancerous lesions induced by treatment with tumor-initiating and -promoting agents (51). The mechanism of rat liver 4-nitrophenol transferase inducibility by PAHs is considered to be related to the aryl hydrocarbon (AH) receptor-mediated mechanism (as noted earlier) of cytochrome P450-IA1 gene activation. In PAH-nonresponsive inbred strains of mice with a defective AH receptor, the induction of phenol transferase activity is muted or entirely absent (24). Another study showed that the median concentration of PAH-type inducing agents required to induce half of the maximum inducible enzyme activity was similar for both phenol transferase and cytochrome P450-IA1, the major PAH-inducible cytochrome P450 (57). These observations suggest that induction of phenol transferase gene expression involves the AH receptor and therefore predict that the 5' flanking sequence of the rat UGT1F gene contains one or more copies of the core xenobiotic-response element (58), TNGCGTG, which is recognized by the ligand-bound receptor. The regulation of the rat phenol transferase gene by PAHs exhibits a marked tissue specificity. The effect was first observed in kidney, where 4NP-GT mRNA was induced only 3-fold by PAHs compared to 15-fold in liver (35). Because the liver and kidney of induced animals exhibited the same amount of hybridizable RNA, the difference was attributed to the higher level of constitutive phenol transferase RNA expression in kidney compared to liver. In another study, similar findings for the 4NP-GT mRNA were reported, using dioxin as the inducing agent (52). Inducibility by dioxin was highest in liver, and moderate or low in kidney, epididymis, testis, and ovary.
Responsiveness to PAH-type inducing agents thus appears to correlate inversely with the level of constitutive expression. The basis for these tissue-dependent UGT1F induction responses is not understood. The studies showing rodent 4-nitrophenol transferase inducibility by PAHs raise the obvious question as to whether the human UGT1-encoded 4-nitrophenol transferase (UGT1F) is regulated in a similar manner. Primarily indirect evidence has been offered in support of this possibility. Clearance of acetaminophen, a substrate for the UGT1F isoform (59), is increased in cigarette smokers (60), cigarette smoke being a known source of PAHs. Furthermore, Northern blot analysis of a liver RNA sample from a heavy smoker showed a fivefold increase in the level of UGT1F mRNA, evidence supporting its inducibility by PAHs. However, other lines of evidence do not support this hypothesis. Administration of β-naphthoflavone had no effect on expression of liver UGT1F


RNA in monkeys, whereas the RNA for cytochrome P450-IA1 (or IA2) was elevated between 10- and 20-fold (R. Lubet, J. Ritter and I. Owens, unpublished). Gene regulation studies with a promoter fragment from the human UGT1F gene (bases -3000 to +4) failed to provide evidence of PAH responsiveness. No stimulation of CAT activity could be demonstrated in HepG2 (human hepatoma) cells transfected with the UGT1F gene-reporter construct and treated with either dioxin or β-naphthoflavone. In contrast, the PAH-type inducing agents stimulated CAT activity 10- to 20-fold in cells transfected with cytochrome P450-IA1-CAT fusion plasmids. Because the xenobiotic-response elements of other PAH-inducible genes are located in the -1500 to -800 region relative to the transcription start site, the data suggest that the human UGT1F gene does not contain functional xenobiotic-response elements and that a species difference most likely exists in UGT1F gene regulation. The genetic basis for this difference is not apparent, but can be revealed by further structural and sequence comparison of the human and rat UGT1F genes. A recent report suggests that the rat UGT1F gene may possess a fundamentally different exon organization (61). The unique 5' sequence of the rat UGT1F mRNA may be encoded not by one large exon (as for human UGT1F) but by two exons. The two exons represent a smaller noncoding first exon and a larger second exon containing the coding information for the unique amino terminus of the rat UGT1F isoform. Further work is required to confirm the different exon organizations of the rat and human UGT1F genes and their role in the species-specific response to PAHs. Phenol transferase activity in rats is also selectively stimulated by exposure to antioxidants (56, 62) or antioxidant-type inducing agents such as the chemopreventive agent oltipraz (63, 64) (J. K. Ritter and F. Kessler, unpublished).
Our data show that oltipraz administered orally to female Fischer 344 rats increased the level of liver 4NP-GT mRNA by over 50-fold. This observation suggests that, in addition to the xenobiotic-regulatory element controlling the expression of rat UGT1F, this same gene is also under the control of an electrophile-response element (65, 66) [antioxidant-response element (67)]. This transcriptional regulatory element has been shown to mediate the selective induction of other phase II enzymes in response to antioxidants or antioxidant-type inducing agents (68). It consists of two adjacent AP-1-like (activator protein) recognition motifs containing the core sequence, TGA(C/G)T(C/A)A (65). The sites are proposed to represent low-affinity AP-1 binding sites. Treatment with oltipraz or agents that elicit oxidative stress (such as phenolic antioxidants) results in increased de novo synthesis of AP-1 transcription factor, which binds in a cooperative manner to the adjacent AP-1-like binding sites to activate gene transcription (65).
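The two consensus sequences quoted in this section, the xenobiotic-response element core TNGCGTG and the AP-1-like core TGA(C/G)T(C/A)A, can be located in a flanking sequence by a simple pattern search. The Python sketch below is an illustration only. It assumes that N stands for any nucleotide, it scans both strands, and it uses an invented 20-base test sequence rather than any real UGT1F promoter sequence; the function names are hypothetical.

```python
import re

# Core motifs as quoted in the text. Assumption: "N" means any of A, C, G, T.
XRE_CORE = "TNGCGTG"        # xenobiotic-response element core
AP1_LIKE = "TGA[CG]T[CA]A"  # AP-1-like core, TGA(C/G)T(C/A)A

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def motif_to_regex(motif):
    """Expand the single ambiguity symbol used here (N) into a character class."""
    return motif.replace("N", "[ACGT]")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (position, strand) hits; positions are 0-based on the forward strand."""
    seq = seq.upper()
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    for m in pattern.finditer(rc):
        width = len(m.group(1))
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((len(seq) - m.start() - width, "-"))
    return sorted(hits)


demo = "AATTGCGTGCCTGACTCATT"  # invented sequence, for illustration only
print(find_motif(demo, XRE_CORE))
print(find_motif(demo, AP1_LIKE))
```

A match to a core sequence does not by itself show that an element is functional; as the reporter-gene experiments described above indicate, responsiveness has to be tested experimentally.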


E. Regulation of HLUG P4-encoding and HLUG P4-like UGT1 Genes

Only limited information currently exists regarding the regulation of the HLUG P4-encoding phenol transferase gene during development, in tissues, and in response to chemical inducing agents. Northern analysis of human liver and kidney RNA with a probe corresponding to the unique end of the HLUG P4-type mRNA (HP4 RNA) (48) revealed a pattern of expression resembling that of the UGT1F mRNA, low in liver and higher in kidney. Because a high degree of sequence similarity exists in the 5' unique exons of the HLUG P4-like genes, it is not clear which of these genes is expressed in liver or kidney. Thus, a specific oligonucleotide-based probe strategy similar to the one developed for the UGT1D-like genes is essential in order to distinguish between individual UGT1G mRNAs.

IX. Significance of the Arrangement of the UGT1 Locus and Future Direction of Research

The identity in the carboxyl termini of isoforms encoded at the UGT1 locus dramatically highlights the general observation that all transferase isoforms are highly related to each other in the carboxyl-terminal region. Furthermore, the arrangement at this locus, the comparisons of the common ends at this locus across species, and the studies with transferase chimeras (40) argue that alignments designed to establish similarities between isoforms and subfamilies should consider only the first 300 residues of isoforms, because this represents the most evolutionarily sensitive variable (69). That is, the amino-terminal portion of the molecule has undergone a far more rapid rate of divergence than the carboxyl region. This strategy would differ from that for the cytochrome P450 superfamily, where complete sequence data for each protein are necessarily required (70). The diversity at the amino termini and constancy at the carboxyl termini of the proteins indicate that interactions of the two portions of the molecule to create an active center for transfer of glucuronate are not particularly rigid. This flexibility in structure, no doubt, has allowed the locus to respond more directly and rapidly to selective pressure(s) favoring the development of the encoded functions. The locus shows evidence that only one region of the protein molecule need be altered to create an enzyme with altered glucuronidating specificity. It is possible that a number of optimal conditions converged to make this locus possible: (a) the capacity of multiple unique amino termini of isoforms to interact productively with a constant carboxyl terminus in a catalytic reaction; (b) the presence or adaptation of intronic


recombination sites for the insertion of duplicated exons 1; and (c) the evolutionary ease of adapting an acceptor substrate-selecting domain for the chemical types to be metabolized. The results of the analyses of the two conserved microregions in the HUG-Br1 protein and the similarity in structure of the UGT1 locus and the rat UGT2B genes suggest that a simple gene was modified through the amplification of an existing domain in response to biochemical/physiological pressure(s), creating an extended 5' region with a series of independently regulated exons 1. The locus that has evolved is flawed, however, by the potential for each of the encoded isozymes to be inactivated by a single deleterious mutation in one of the common exons, as seen in the genome of Crigler-Najjar Type-I patients (42, 71, 72) and of the Gunn rat (33). Other patients have bilirubin-transferase-specific mutations in exon 1A (43, 44). Hence, the locus is necessarily under pressure to maintain the integrity of the common region of the encoded proteins. Gene conversion for conservation of sequences is favored by the clustering of exons 2-5 in a 5-kb region (73), unlike the unique exons that are spaced, on average, 10 to more than 20 kb apart. The evolution of an arrangement that allows for differential regulation of the exact genetic unit necessary to generate a new function represents a remarkable development. Because the hepatocyte expresses several of the isoforms, one wonders what controls the overlapping transcriptional traffic with the potential to obtain different rates of initiation and elongation. It does not necessarily relate to the relative position of the exon 1 to that of the common exons (1C mRNA level is less than 1D). Based on the mRNA levels uncovered to date, exon 1A with its unusual string of TAs in the promoter element has the highest transcriptional rate.
Does an increased rate of UGT1F transcription (e.g., following PAH induction) influence the rate of constitutive UGT1A transcription? Is there a different set of regulatory structures/factors operating at this locus to allow for the successful expression of the encoded functions? Are some exons 1 relics representing nonfunctional duplications? The future direction of research surrounding this novel gene locus, in the short term, is to complete the description of the exon 1 series, including the determination of the substrate preferences specified by these exons. Because bilirubin represents a critical substrate, we would like to determine the regulation of these isoforms by using the gene-reporter fusion systems to establish whether certain bilirubin-specifying UGT1 genes are selectively transcribed in response to certain agents or conditions giving an expanded detoxifying system with multiple modes of regulation. Because phenolic therapeutic drugs, based on acetaminophen data, have the potential to generate toxicity, will mutations be found in the appropriate


unique exon 1 at the UGT1 locus of suspect individuals to explain idiosyncratic drug reactions? The finding that intronic sequences in transferase genes are conserved between rat and humans is very interesting. Their potential role in the recombination events leading to exon duplication and expansion of UGT1 will require further analysis. If these sequences represent recombination sites, it should be possible to demonstrate this directly via a recombination assay (74). Structure-function relationships of amino-acid residues in the HUG-Br1 protein will continue as we probe its secondary and tertiary structure utilizing critical amino acids uncovered in the genome of Crigler-Najjar patients as well as selected residues. These studies will be aided by crystallographic studies of a surrogate soluble protein with adaptations of its structure by computer modeling to fit structure-function data accumulated on the HUG-Br1 protein.

REFERENCES

1. G. J. Dutton, in “Glucuronic Acid, Free and Combined” (G. J. Dutton, ed.), pp. 185-299. Academic Press, New York, 1966.
2. F. Chen, J. K. Ritter, M. G. Wang, O. W. McBride, R. A. Lubet and I. S. Owens, Bchem 32, 10648 (1993).
3. C. E. Cornelius, K. C. Kelley and J. A. Himes, Cornell Vet. 65, 90 (1975).
4. J. Magdalou, V. Chajes, C. Lafaurie and G. Siest, Drug Metab. Dispos. 18, 692 (1990).
5. E. Hietanen and H. Vainio, Acta Pharmacol. Toxicol. 33, 57 (1973).
6. J. E. A. Leakey, BJ 175, 1119 (1978).
7. R. Bonnett, J. E. Davies and M. B. Hursthouse, Nature 262, 326 (1976).
8. R. Brodersen, CRC Crit. Rev. Clin. Lab. Sci. 11, 304 (1980).
9. P. E. Hansen, H. Thiessen and R. Brodersen, Acta Chem. Scand. B33, 281 (1979).
10. J. Roy Chowdhury, A. W. Wolkoff and I. M. Arias, in “The Metabolic Basis of Inherited Disease” (C. R. Scriver, A. L. Beaudet, W. S. Sly and D. Valle, eds.), 6th ed., pp. 1367-1408. McGraw-Hill, New York, 1989.
11. J. R. Brown, in “Albumin, Structure, Biosynthesis, Function” (T. Peters and I. Sjoholm, eds.), FEBS, 11th Meet. Pergamon, Oxford and Copenhagen, 1978.
12. G. Baumgartner and J. Reichen, Clin. Sci. Mol. Med. 51, 169 (1976).
13. C. A. Goresky, Can. Med. Assoc. J. 92, 851 (1965).
14. A. J. Levi, Z. Gatmaitan and I. M. Arias, J. Clin. Invest. 48, 2156 (1969).
15. I. Listowsky, Z. Gatmaitan and I. M. Arias, PNAS 75, 1214 (1978).
15a. K. Kamisaka, I. Listowsky, Z. Gatmaitan and I. M. Arias, Bchem 14, 2175 (1975).
16. D. I. Whitmer, P. E. Russell and J. L. Gollan, BJ 244, 41 (1987).
17. E. Colleran and P. O'Carra, in “Chemistry and Physiology of Bile Pigments” (P. D. Berk and N. I. Berlin, eds.), pp. 69-80. U. S. Government Printing Office, Washington, DC, 1977.
18. R. Stocker, Y. Yamamoto, A. F. McDonagh, A. N. Glazer and B. N. Ames, Science 235, 1043 (1987).


19. R. Stocker, A. N. Glazer and B. N. Ames, PNAS 84, 5918 (1987).
20. D. Harding, S. Fournel-Gigleux, M. R. Jackson and B. Burchell, PNAS 85, 8381 (1988).
21. J. R. Robinson, J. S. Felton, R. C. Levitt, S. S. Thorgeirsson and D. W. Nebert, Mol. Pharmacol. 11, 850 (1975).
22. G. J. Wishart, M. T. Campbell and G. J. Dutton, in “Conjugation Reactions in Drug Biotransformation” (A. Aitio, ed.), pp. 179-187. Elsevier/North-Holland Biomedical Press, Amsterdam, 1978.
23. I. Okulicz-Kozaryn, M. Schaefer, A.-M. Batt, G. Siest and V. Loppinet, Biochem. Pharmacol. 30, 1457 (1981).
24. I. S. Owens, JBC 252, 2827 (1977).
25. D. W. Nebert and N. M. Jensen, Crit. Rev. Biochem. 6, 401 (1979).
26. N. Malik and I. S. Owens, JBC 256, 9599 (1981).
26a. C. H. Gunn, J. Hered. 29, 137 (1938).
26b. R. Schmid, J. Axelrod, L. Hammaker and R. L. Swarm, J. Clin. Invest. 37, 1123 (1958).
27. I. S. Owens and J. K. Ritter, Pharmacogenetics 2, 93 (1992).
28. J. K. Ritter, J. M. Crawford and I. S. Owens, JBC 266, 1043 (1991).
29. D. Harding, S. Fournel-Gigleux, S. J. Jeremiah, S. Povey and B. Burchell, Ann. Hum. Genet. 54, 17 (1990).
29a. S. J. Haque, D. D. Petersen, D. W. Nebert and P. I. Mackenzie, DNA Cell Biol. 10, 515 (1991).
30. J. K. Ritter, F. Chen, Y. Y. Sheen, H. M. Tran, S. Kimura, M. T. Yeatman and I. S. Owens, JBC 267, 3257 (1992).
31. R. Wooster, L. Sutherland, T. Ebner, D. Clarke, O. Da Cruz E Silva and B. Burchell, BJ 278, 465 (1991).
32. H. Sato, O. Koiwai, K. Tanabe and S. Kashiwamata, BBRC 169, 260 (1990).
33. T. Iyanagi, JBC 266, 24048 (1991).
34. T. Ah-Ng Kong, M. Ma, D. Tao and L. Yang, Pharm. Res. 10(3), 461 (1993).
35. T. Iyanagi, M. Haniu, K. Sogawa, Y. Fujii-Kuriyama, S. Watanabe, J. E. Shively and K. F. Anan, JBC 261, 15607 (1986).
36. P. I. Mackenzie and L. Rodbourn, JBC 265, 11328 (1990).
37. J. K. Ritter, F. Chen, Y. Y. Sheen, R. A. Lubet and I. S. Owens, Bchem 31, 3409 (1992).
38. F. Chen, J. K. Ritter, M. G. Wang, O. W. McBride, R. A. Lubet and I. S. Owens, Bchem 32, 10648 (1993).
39. P. I. Mackenzie, JBC 261, 14112 (1986).
40. P. I. Mackenzie, JBC 265, 3432 (1990).
41. J. K. Ritter, M. T. Yeatman, P. Ferreira and I. S. Owens, J. Clin. Invest. 90, 150 (1992).
42. L. T. Erps, J. K. Ritter, J. H. Hersh, D. Blossom, N. C. Martin and I. S. Owens, J. Clin. Invest. 93, 564 (1994).
43. J. K. Ritter, M. T. Yeatman, C. Kaiser, B. Gridelli and I. S. Owens, JBC 268, 23573 (1993).
44. M. Ciotti, M. T. Yeatman, R. J. Sokol and I. S. Owens, JBC 270, 3284 (1995).
44a. J. O. Miners and P. I. Mackenzie, Pharmacol. Ther. 51, 347 (1991).
44b. J. K. M. Rao and P. Argos, BBA 869, 197 (1986).
45. J. K. Ritter, Y. Y. Sheen and I. S. Owens, JBC 265, 7900 (1990).
46. T. Ebner, R. P. Remmel and B. Burchell, Mol. Pharmacol. 43, 649 (1993).
47. T. Ebner and B. Burchell, Drug Metab. Dispos. 21, 50 (1993).
47a. S. J. Yaffe, G. Levy, T. Matsuzawa and T. Baliah, N. Engl. J. Med. 275, 1461 (1966).
48. L. Sutherland, T. Ebner and B. Burchell, Biochem. Pharmacol. 45, 295 (1993).
49. M. W. H. Coughtrie, B. Burchell, J. E. A. Leakey and R. Hume, Mol. Pharmacol. 34, 729 (1988).
50. G. M. Pacifici and D. J. Back, J. Steroid Biochem. 31, 345 (1988).


51. K. W. Bock, Crit. Rev. Biochem. Mol. Biol. 26, 129 (1991).
52. P. A. Münzel, M. Brück and K. W. Bock, Biochem. Pharmacol. 47, 1445 (1994).
53. J. F. Ghersi-Egea, A. Minn and G. Siest, Life Sci. 42, 2515 (1988).
54. H. C. Li, N. Porter, G. Holmes and T. Gessner, Xenobiotica 11, 647 (1981).
55. K. W. Bock, W. Frohling, H. Remmer and B. Rexer, BBA 327, 46 (1973).
56. K. W. Bock, R. Kahl and W. Lilienblum, Arch. Pharmacol. 310, 249 (1980).
57. D. Schrenk, H.-P. Lipp, T. Wiesmüller, H. Hagenmaier and K. W. Bock, Arch. Toxicol. 65, 114 (1991).
58. J. P. Whitlock, Annu. Rev. Pharmacol. Toxicol. 30, 251 (1990).
59. K. W. Bock, A. Forster, H. Gschaidmeier, M. Brück, P. A. Münzel, W. Schareck, S. Fournel-Gigleux and B. Burchell, Biochem. Pharmacol. 45, 1809 (1993).
60. K. W. Bock, J. Wiltfang, R. Blume, D. Ullrich and J. Bircher, Eur. J. Clin. Pharmacol. 31, 677 (1987).
61. T. Iyanagi, Abstr. 7th Int. Glucuronidation Workshop, Pitlochry, Scotland, September 12-15, 1993, 23.
62. J. B. Watkins, Z. Gregus, T. N. Thompson and C. D. Klaassen, Toxicol. Appl. Pharmacol. 64, 439 (1982).
63. T. W. Kensler, P. A. Egner, M. A. Trush, E. Bueding and J. D. Groopman, Carcinogenesis 6, 759 (1985).
64. M. H. Davies and R. C. Schnell, Toxicol. Appl. Pharmacol. 109, 29 (1991).
65. S. Bergelson, R. Pinkus and V. Daniel, Oncogene 9, 565 (1994).
66. S. Bergelson, R. Pinkus and V. Daniel, Cancer Res. 54, 36 (1994).
67. T. H. Rushmore and C. B. Pickett, JBC 265, 14648 (1990).
68. P. A. Egner, T. W. Kensler, T. Prestera, P. Talalay, A. H. Libby, H. H. Joyner and T. J. Curphey, Carcinogenesis 15, 177 (1994).
69. B. Burchell, D. W. Nebert, D. R. Nelson, K. W. Bock, T. Iyanagi, P. L. M. Jansen, D. Lancet, G. J. Mulder, J. Roy Chowdhury, G. Siest, T. R. Tephly and P. I. Mackenzie, DNA Cell Biol. 10, 487 (1991).
70. D. R. Nelson, T. Kamataki, D. J. Waxman, F. P. Guengerich, R. W. Estabrook, R. Feyereisen, F. J. Gonzalez, M. J. Coon, I. C. Gunsalus and O. Gotoh, DNA Cell Biol. 12, 1 (1993).
71. N. Moghrabi, D. J. Clarke, B. Burchell and M. Boxer, Am. J. Hum. Genet. 53, 722 (1993).
72. P. J. Bosma, N. Roy Chowdhury, B. G. Goldhoorn, M. H. Hofker, R. P. J. Oude Elferink, P. L. M. Jansen and J. Roy Chowdhury, Hepatology 15, 941 (1992).
73. D. Baltimore, Cell 24, 592 (1981).
74. H. Gu, J. D. Marth, P. C. Orban, H. Mossmann and K. Rajewsky, Science 265, 103 (1994).

Growth Control of Translation in Mammalian Cells

DAVID R. MORRIS
Department of Biochemistry
University of Washington
Seattle, Washington 98195

I. Cellular Studies
II. Regulation of Translational Initiation Factors
III. Regulation by mRNA Binding Proteins
IV. Regulation by Open Reading Frames within the 5' Leader
V. Translation and Oncogenic Transformation
VI. Conclusions: Physiological Roles of Translational Control
References

Most cells in a mammalian organism are not proliferating, but are in a quiescent, resting state referred to as “G0.” Cells in the G0 state may be activated to enter into the cell cycle by external stimuli such as growth factors and hormones, or in the case of cells of the immune system, by a specific antigen. The immediate response of a cell to a growth stimulus is the activation of various cytosolic signaling cascades consisting of soluble second messengers and sequential protein phosphorylations (for reviews, see 1-5). A central target of these signal transduction cascades is the activation of expression of numerous genes, the products of which are critical in moving a cell out of G0 and through the cell cycle (reviewed in 1-3, 6, 7). These “early-response” genes encode a variety of products that are important for ultimately advancing a cell through DNA replication and division; these products include transcription factors, certain metabolic enzymes, proteins involved in the metabolism of the extracellular matrix, cyclins, and the cyclin-dependent protein kinases. The mechanisms by which the levels of the early-response gene products are regulated and the modulation of these mechanisms by signal transduction cascades are issues central to our understanding of cell growth. Production of many of the early-response proteins is regulated by the level of their respective mRNA molecules (3, 6, 7). Although in the best understood examples, mRNA level is controlled by the rate of transcription, it is clear that post-transcriptional stabilization mechanisms may also come into play (for example, see 8-11). Although mRNA levels are involved in a major way in growth control, there are also a number of older observations in the literature that argue compellingly for a substantial role for regulation of the translation of preformed mRNA molecules. The purpose of this essay is to review these older observations and place them in the context of our rapidly expanding knowledge of the mechanisms of translational control.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 51. Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. Cellular Studies

A. Global Regulation of Protein Synthesis in Response to Mitogenic Activation

The overall rate of translation is elevated quite early after activation of a variety of resting cell types (the older literature is reviewed in 12). For example, the cellular rate of protein synthesis is increased significantly over the basal rate by 15 minutes after treatment of T lymphocytes with mitogen (13). This early translational activation is reflected in a rapid recruitment of ribosomal subunits into polysomes (14-17). These early changes in protein synthetic rate seem to result both from elevated translational initiation activity and increased availability of mRNA. The latter apparently arises both from new synthesis and from mobilization of pre-existing cellular mRNA. At later times the production of new ribosomes also significantly influences the general rate of translation (discussed in 12). There is no detectable change in the rate of elongation of nascent polypeptide chains coincident with these changes in synthetic activity arising from mitogenic activation (15, 18, 19). An important fraction of this new protein synthesis in activated cells, particularly the products of proto-oncogenes and other early-response genes, results from transcription of new mRNA species (reviewed in 3, 7, 20). However, considerable evidence has accumulated since the 1970s that implicates regulation at the translational level as well. For example, only a fraction of the early translational activation in T lymphocytes requires new RNA synthesis (13) and the net increases in poly(A)+ RNA at early times after activation of several cell types fall short of the general elevation in translation rate (for example, see 19, 21-24). Representative results are summarized in Table I for T lymphocytes. Over the first 6 hours after activation, there is a threefold increase in the rate of protein synthesis, with no increase in ribosome content and only a minor elevation of total cellular mRNA [poly(A)+ RNA].
The fraction of cellular ribosomes associated with mRNA parallels the increase in protein synthesis over the early time points. Similar observations from a number of laboratories suggest that a significant fraction of enhanced protein synthesis is due not to new mRNA species, but to molecules already present in resting cells prior to activation. The excess mRNA in resting cells has been found in mRNP¹ particles (21, 22, 25, 26) and in small polysomes (19). The RNP-associated mRNA species seem to be mobilized into large, actively translating polysomes after mitogenic activation.

TABLE I
PROTEIN SYNTHESIS IN ACTIVATED T LYMPHOCYTES(a)

Time after        Rate of      mRNA-associated   mRNA       Ribosome
activation (hr)   synthesis    ribosomes         content    content
                  (fold)       (% of total)      (fold)     (fold)

 0                 1.0         12                1.0        1.0
 1                 1.3         15                1.0        1.0
 6                 3.0         42                1.25       1.0
20                10           70                1.8        2.0

(a) Results are summarized from reference 24 and unpublished data of T. E. Martin and D. R. Morris.
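The argument drawn from Table I can be checked with simple arithmetic. The Python sketch below normalizes the rate of synthesis to mRNA content and to the relative number of mRNA-associated ribosomes (ribosome content multiplied by the fraction associated with mRNA). The values are taken from Table I as transcribed here, assuming the mRNA content column reads 1.0, 1.0, 1.25, and 1.8; the variable and function names are illustrative only.

```python
# Values from Table I: fold changes relative to resting cells, except the
# percentage of ribosomes associated with mRNA.
# Columns: hours, synthesis (fold), ribosomes on mRNA (%), mRNA (fold), ribosomes (fold)
TABLE_I = [
    (0, 1.0, 12, 1.0, 1.0),
    (1, 1.3, 15, 1.0, 1.0),
    (6, 3.0, 42, 1.25, 1.0),
    (20, 10.0, 70, 1.8, 2.0),
]


def normalize(table):
    """Express synthesis per unit mRNA and per mRNA-associated ribosome."""
    _, _, pct0, _, ribo0 = table[0]
    resting_engaged = ribo0 * pct0
    rows = []
    for hours, synth, pct, mrna, ribo in table:
        # Relative number of ribosomes engaged with mRNA (resting cells = 1.0).
        engaged = ribo * pct / resting_engaged
        rows.append({
            "hours": hours,
            "synthesis_per_mrna": synth / mrna,
            "engaged_ribosomes": engaged,
            "synthesis_per_engaged_ribosome": synth / engaged,
        })
    return rows


rows = normalize(TABLE_I)
for row in rows:
    print(row)
```

On this rough reading, synthesis per unit of mRNA rises several-fold, whereas synthesis per mRNA-associated ribosome stays close to the resting value. That pattern is in line with the statements above that the increase reflects greater use of existing mRNA and that the elongation rate does not detectably change.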

¹ The following abbreviations are used in this essay: AdoMetDC, S-adenosylmethionine decarboxylase; CD4, cluster of differentiation 4; eEF, eukaryotic elongation factor; eIF, eukaryotic initiation factor; HRI, heme regulated inhibitor; IκB, inhibitor of transcription factor NF-κB; IRES, internal ribosome entry site; LAP, liver-enriched activating protein; LIP, liver-enriched inhibitory protein; mRNP, messenger ribonucleoprotein; NF-AT, nuclear factor of activated T cells; ODC, ornithine decarboxylase; PKR, RNA-activated protein kinase; TGF-β, transforming growth factor β; Th2, T cell subset producing interleukin-4; TIMP, tissue inhibitor of metalloproteases; uORF, upstream open reading frame; UTR, untranslated region.

B. Translational Control of Specific Genes

Early results suggested that translational control observed in quiescent cells is gene specific; that is to say, a subset of the assortment of cellular mRNAs is sequestered into mRNP particles during growth arrest and recruited back into polysomes on mitogenic activation. A study of Swiss mouse 3T3 cells by two-dimensional gel electrophoresis (27) detected 16 proteins that increased in rate of synthesis after cellular activation; 7 of these were unaffected by blocking RNA synthesis with actinomycin D. The mRNA sequences present in mRNP particles and polysomes were compared by hybridization kinetics (28) and by in vitro translation (18, 29). The conclusion in all cases was that only a fraction of the total cellular mRNA species is localized in the untranslated particles.

These early conclusions concerning gene-specific translational control have now been confirmed using cloned probes. For example, the synthesis of cytoskeletal actin (30) is controlled only by mRNA abundance in mitogen-activated T lymphocytes, showing no element of translational control. In resting and activated T cells, actin mRNA was fully loaded with translating ribosomes and the rate of actin synthesis paralleled the cellular level of its mRNA. In contrast, in T cells and in other cell types, the synthesis of several proteins, including S-adenosylmethionine decarboxylase (AdoMetDC) (31), ornithine decarboxylase (ODC) (32, 33), and the structural proteins of the ribosome (34-37), all demonstrate elements of translational control after mitogenic activation. Interestingly, however, the mechanisms of regulation of translation of these proteins seem to be quite different.

Generally, translational control could be imposed at one of two stages of mRNA utilization (discussed in 38). One level of control could be exerted on the rate of addition of ribosomes to existing polysomes, which is defined as regulation of initiation. The other site of control is at the recruitment of mRNA from mRNP particles into polysomes, which is defined as mobilization. Experimentally, these two situations can be easily distinguished by examining the intracellular localization of the mRNA molecules of interest using sucrose gradient centrifugation. As illustrated in Fig. 1, regulation of initiation results in a change in number of ribosomes associated with an mRNA species, whereas regulation of mobilization results in a recruitment of the mRNA into polysomes from mRNP particles, with no change in polysome size. By this definition, ODC (33), eEF-1α (39-41) and the ribosomal proteins (34-37) are all regulated by mRNA mobilization, because growth stimulation moves these mRNAs from mRNP particles into polysomes. On the other hand, AdoMetDC seems to be regulated purely at the initiation level, because in resting T cells this mRNA is located primarily on monosomes, with little or none present in mRNP particles, and it moves into larger polysomes after mitogenic activation (31).

[Figure 1: schematic comparing RESTING and ACTIVATED cells for three levels of control: Increased mRNA, Translational Initiation, and mRNA Mobilization.]

FIG. 1. Influence of the mechanism of gene regulation on intracellular mRNA distribution in resting and activated cells. The mRNA is represented by the heavy horizontal line and the small and large ribosomal subunits are depicted by the stippled circles.

It appears that the class of mRNAs regulated by mobilization can be further subdivided based on the response of these mRNAs to treatment of the cells with concentrations of cycloheximide that selectively inhibit elongation, but permit initiation, thus recruiting ribosomes onto polysomes (42). Treatment of resting T lymphocytes in this way caused a complete shift of the ODC mRNA into polysomes within 15 minutes (33). In contrast, the identical treatment had no effect on the mRNP localization of the mRNAs encoding eEF-1α or ribosomal protein L32 (R. L. Kaspar and M. W. White, unpublished), suggesting that these mRNAs are sequestered in a manner that makes them inaccessible to the translation apparatus, even under these extreme conditions.
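The operational distinction between regulation of initiation and mobilization can be written as a small decision rule. The Python sketch below is only an illustration of that logic. The two summary measurements, the 10% threshold, and the example numbers are all assumptions made for the sketch and are not data from the studies cited.

```python
def classify_translational_control(resting, activated, tol=0.10):
    """Classify the shift in one mRNA's sucrose-gradient distribution.

    Each argument is a dict with two hypothetical summary measurements:
      'mrnp_fraction'      - fraction of the mRNA in untranslated mRNP particles (0 to 1)
      'ribosomes_per_mrna' - mean ribosomes on the ribosome-bound mRNA (must be > 0)
    `tol` is an arbitrary cutoff for calling a change meaningful.
    """
    # Mobilization: mRNA leaves mRNP particles and enters polysomes.
    released = resting["mrnp_fraction"] - activated["mrnp_fraction"] > tol
    # Initiation: more ribosomes are loaded onto each translated mRNA.
    size_change = (
        activated["ribosomes_per_mrna"] - resting["ribosomes_per_mrna"]
    ) / resting["ribosomes_per_mrna"]
    larger = size_change > tol

    if released and larger:
        return "mobilization and initiation"
    if released:
        return "mobilization"
    if larger:
        return "initiation"
    return "no translational control detected"


# Invented numbers chosen to mimic the three patterns described in the text.
odc_like = classify_translational_control(
    {"mrnp_fraction": 0.70, "ribosomes_per_mrna": 6.0},
    {"mrnp_fraction": 0.20, "ribosomes_per_mrna": 6.2},
)
adometdc_like = classify_translational_control(
    {"mrnp_fraction": 0.05, "ribosomes_per_mrna": 1.0},
    {"mrnp_fraction": 0.05, "ribosomes_per_mrna": 4.0},
)
actin_like = classify_translational_control(
    {"mrnp_fraction": 0.05, "ribosomes_per_mrna": 8.0},
    {"mrnp_fraction": 0.05, "ribosomes_per_mrna": 8.0},
)
print(odc_like, adometdc_like, actin_like, sep="\n")
```

A rule of this kind cannot separate the two subclasses of mobilized mRNAs described above, which were distinguished by their response to cycloheximide rather than by the gradient profile alone.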

II. Regulation of Translational Initiation Factors

The level and/or activity of translational initiation factors can influence the rates of both protein synthesis and cell proliferation. For example, depletion of eIF-4E through antisense technology (43) or through mutation (44) inhibits progression through the cell cycle. In yeast, a mutation in the eIF-4E gene produces a cdc phenotype, characterized by specific arrest in the G1 phase of the cycle (44). Conversely, overexpression of the same factor, eIF-4E, can cause neoplastic transformation of mammalian cells (see 45 for a review). One would surmise that these apparently specific effects on cell cycle progression must arise from modulation of translation of particular genes. Indeed, there is strong evidence that the activities of general translational initiation factors can influence translation in a gene-specific manner (see Section II,C).

A. Regulation of Factor Phosphorylation

On mitogenic activation of mammalian cells, there is simultaneous phosphorylation of several components of the translational apparatus, including eIF-4F, eIF-4B, eIF-3, and ribosomal protein S6, along with dephosphorylation of eIF-2α (38, 46-49). Those instances where there is compelling evidence indicating that changes in phosphorylation state play a physiological role in growth control of translation are discussed in the following paragraphs. eIF-4E is the 25-kDa α-peptide of eIF-4F and is the component responsible for the interaction of this initiation factor with the 5' cap structure of


mRNA molecules (reviewed in 38, 50). This factor is of interest as a potential site of regulation, because it seems to be rate limiting for translational initiation generally, and may also play a role in discriminating among different mRNA species, depending on the structure of the 5' leader (discussed in 45). Overexpression of eIF-4E results in oncogenic transformation of mammalian fibroblasts (discussed in Section V). This factor exists in vivo as both phosphorylated and dephosphorylated species, and the degree of phosphorylation correlates with the cellular protein synthetic activity (38, 45, 48, 51). For example, mitogenic activation of fibroblasts and B and T lymphocytes elevates phosphorylation of eIF-4E over the same time period when translation is activated (35, 52-54). Not only does phosphorylation correlate with activation of protein synthesis, but the phosphorylated form binds more tightly to the 5' cap structure (55), which suggests a possible biochemical role for this modification. Recently, two related small proteins that interact with eIF-4E, 4E-BP1 [previously identified as PHAS-I (56)] and 4E-BP2, have been cloned (57). Both proteins block cap-dependent translation in vitro and in vivo (57). Treatment of fat cells with insulin causes the phosphorylation of 4E-BP1/PHAS-I and disrupts its complex with eIF-4E (57, 58). 4E-BP1/PHAS-I is phosphorylated by MAP kinase (58, 59), which is activated by insulin and a variety of growth factors. Thus, it appears that mitogenic activation of a resting cell results in phosphorylation of both 4E-BP1/PHAS-I and eIF-4E. The former phosphorylation would be expected to release eIF-4E from the inactive complex with the binding protein and the latter would enhance the activity of eIF-4E by increasing its affinity for mRNA molecules.
Possible involvement of phosphorylation of the α-subunit of eIF-2 in growth control is suggested by a relative decrease in the level of phosphorylation of this factor in HeLa cells on supplementation with fresh serum (60). During the process of translational initiation, eIF-2 brings the initiator Met-tRNA to the 43-S preinitiation complex in the form of a ternary complex, Met-tRNAi·GTP·eIF-2. On recognition of the AUG start codon and joining of the 60-S subunit to the complex, the bound GTP is hydrolyzed, the Met-tRNAi remains in the P site of the ribosome, and the binary complex eIF-2·GDP is released. A specific guanine-nucleotide-exchange factor (eIF-2B), present in limiting amounts, then catalyzes the exchange of GTP for bound GDP, regenerating the eIF-2·GTP complex for subsequent binding of another molecule of Met-tRNAi (see 51 for a review of this process). Phosphorylation of eIF-2α on a specific serine residue (Ser-51 in the mammalian factor) leads to inhibition of protein synthesis, due to sequestration of eIF-2B in a stable complex with phosphorylated eIF-2α (51).

There have been three eIF-2α kinases characterized to date (see 47 for a review): yeast GCN2 (see Section IV), a hemin-inhibited kinase from reticulocytes (HRI), and an RNA-dependent kinase (PKR). Phosphorylation of eIF-2α by HRI shuts off globin synthesis in reticulocytes in the absence of a source of hemin. Although a role for HRI has been clearly defined in reticulocytes, recent evidence for a wider tissue distribution of the mRNA encoding this enzyme (61) opens the possibility of broader participation by it in translational control.

PKR is a widely expressed eIF-2α kinase induced by interferon and activated by double-stranded RNA. The regulation of this enzyme is complex and of considerable interest in the context of growth control, because interference with its activity can lead to oncogenic transformation (see Section V). Double-stranded RNA activates PKR by inducing autophosphorylation (reviewed in 62). The possibility also exists that the enzyme might be phosphorylated as well at the distal end of a signal transduction cascade, although this has not yet been demonstrated. Because activation can be accomplished by polyanions other than double-stranded RNA, it also seems possible that there may be other intracellular regulators of this enzyme that interact with the same regulatory site (discussed in 63). In addition to activation of PKR by phosphorylation, intracellular inhibitors of 58 kDa (64), 160 kDa (65), and 15 kDa (66) have been reported. There is evidence suggesting that the 15-kDa inhibitor may participate in downregulating PKR activity in growing 3T3-F442A cells (66). Compelling evidence that the 58-kDa inhibitor can participate in growth control came from studies in which overexpression of this protein in NIH 3T3 cells led to faster, anchorage-independent growth in culture and the formation of tumors in nude mice (67). What is not clear at this point is whether the apparent tumor-suppressor activity of PKR is exerted through phosphorylation of eIF-2α, or if there are other, as yet unidentified, substrates for PKR that mediate its growth-regulatory effects.
As one example, the transcription factor inhibitor IκB is a substrate for PKR (68). Finally, in the interest of completeness, one should note the evidence for the existence of additional eIF-2α kinases that may be unrelated to PKR and HRI (69). These enzymes could also be candidates for regulation of the phosphorylation of eIF-2α in the context of cell-growth control. In addition, there are reports of a 67-kDa glycoprotein that is associated with eIF-2 in reticulocyte lysates and inhibits the phosphorylation of eIF-2α by both PKR and HRI (70). The generality of the occurrence of this glycoprotein and its biological significance have yet to be defined.

Phosphorylation of ribosomal protein S6 and activation of the involved protein kinase(s) are early events in the mitogenic activation of a variety of animal cells (for reviews, see 71, 72). S6 is located in the mRNA binding site of the small ribosomal subunit and is phosphorylated at five serine residues rapidly after cell activation. The location of S6 in the ribosomal structure suggests that its modification could play a role in the interaction of mRNA

with the 40-S subunit and, indeed, the degree of phosphorylation of this protein correlates with enhanced protein synthetic rate. There are several additional lines of evidence suggesting that 40-S subunits containing phosphorylated S6 are more active in protein synthesis (summarized in 71), although this has yet to be proved with certainty. Interestingly, rapamycin, which has as one of its actions the inhibition of S6 phosphorylation, partially inhibits the mobilization of mRNA molecules containing a polypyrimidine tract from mRNP particles into polysomes in mitogen-activated fibroblasts (see Section III), but has no effect on other mRNAs that are not under translational control (73).

As noted at the start of this section, the phosphorylation state of other translational components, besides the three well-studied examples just discussed, has been observed to change on mitogenic activation (38, 46). It is conceivable that there is no single pivotal phosphorylation event and that changes in the phosphorylation state of several key components of the translational apparatus may act synergistically to regulate translation rate. Alternatively, because there are clearly multiple mechanisms by which individual mRNAs are regulated, different phosphorylation events may be involved in regulation of different regulatory classes of mRNAs.

B. Regulation of Factor Levels

In principle, translation rate could be regulated by the levels of key translation factors, and there are several examples of potential interest. The expression of eIF-4E and eIF-2α seems to be regulated in fibroblasts by the product of the early response gene, c-myc (74), linking the expression of these two translational initiation factors to the pleiotropic cellular response to mitogenic activation. Activation of T lymphocytes resulted in 2- to &fold increases in eIF-4E and eIF-2β, 2- to 5-fold increases in eIF-2α, and 8- to 10-fold increases in eIF-4A over the first 24 hours (54, 75). Because these increases are of the same magnitude as, or less than, the global elevation of protein synthesis, it is important to distinguish a true regulatory role from a general increase in the level of the protein synthesis machinery as the cells grow in size. In this context, it is interesting to note, in the case of T lymphocytes, that the level of eIF-4E may limit the expression of the T-cell-specific transcription factor NF-AT in the CD4+ Th2 subset (76), suggesting the possibility of a specific regulatory role in this instance. As well, the fact that deregulation of several translation factors leads to oncogenic transformation (see Section V) suggests that at least a subset of these factors is limiting, and perhaps regulatory, for growth.

Another example of initiation-factor expression that is perhaps of regulatory significance is the existence of isoforms of eIF-4A arising from two related genes, Eif4a1 and Eif4a2, that are located on different chromosomes

in the mouse (77). These isoforms are expressed in a tissue-specific manner (78), suggesting the possibility of modified translational initiation mechanisms in different tissues. However, one should note that, at this point, there have been no reports of differences in activity of these two isoforms.

C. Gene-specific Regulation through General Translation Factors

Modulation of the activity or level of translational initiation factors clearly should alter the general rate of protein synthesis in a cell, assuming other components, such as mRNA, are not limiting. Although it is not obvious a priori that there should be a differential influence of general translation factors on the expression of particular mRNA molecules, there is compelling evidence that alterations in the cellular activity of at least one initiation factor, eIF-4E, can produce mRNA-specific changes in translational activity.

The eIF-4 group of initiation factors is of interest in the context of gene-specific regulation, because these factors interact directly with mRNA and these interactions could be influenced by structural features of an mRNA molecule. The multisubunit eIF-4F is probably the first general initiation factor to contact an mRNA molecule, and its binding likely influences subsequent interactions of other components of the translational apparatus (discussed in 45). The interaction of eIF-4F with an mRNA would be expected to be affected by particular structural attributes of the mRNA molecule, especially the accessibility of the 5' cap structure. Furthermore, the helicase activity of eIF-4F, in conjunction with eIF-4B, could have gene-specific influences, depending on the degree of secondary structure within the 5' leader of a particular mRNA molecule.

The eIF-4E component of eIF-4F has received particular attention, because it seems to be present at limiting intracellular levels. The number of eIF-4E molecules per cell limits the assembly of eIF-4F and is considerably lower than the number of mRNA molecules in a cell. This situation is likely to lead to translational selectivity through competition between mRNA molecules for fully assembled eIF-4F (see 45).
Indeed, systematic introduction of structure into a 5' leader of the mRNA progressively inhibited translation (79, 80), and overexpression of eIF-4E seemed to overcome the inhibitory influence of extensive secondary structure (80). Interestingly, many proto-oncogenes and other growth-related genes have long 5' leaders with extensive potential secondary structure (80), making them potential targets for translational control through eIF-4E.

One fascinating example is cyclin D1; the intracellular level of cyclin D1 protein does not follow the level of its mRNA, suggesting post-transcriptional regulation (81, 82). Overexpression of eIF-4E elevates the level of cyclin D1 protein, with no effect on the levels of several other proteins (82). Although the mechanism by which eIF-4E influences cyclin D1 expression has not been definitely established, it seems likely that it acts by modulating translation of this key agent of cell cycle control.

Another intriguing example is that of ornithine decarboxylase (ODC), a protein that shows a significant level of translational control (33, 83). The ODC mRNA has a long and potentially structured 5' leader that inhibits translation both in vivo and in vitro (84, 85). Translation of a reporter construct containing the 5' leader of ODC was strongly up-regulated in fibroblasts by insulin. This enhancement of ODC translation correlated with stimulation of phosphorylation of both eIF-4E and eIF-4B (86), suggesting, but not proving, regulation through factor phosphorylation. Expression of endogenous ODC, as well as a construct containing the 5' leader of the ODC mRNA, is up-regulated in cells overexpressing eIF-4E (83), again consistent with a role for this initiation factor in translational control of ODC.

There are other suggestions in the literature of gene-specific effects of general translation factors. In addition to the effects of eIF-4F noted in the preceding paragraphs, this factor has been shown to influence initiation at cap-proximal initiation codons in bicistronic mRNAs (88), which may be of interest in the context of the many growth-related genes containing open reading frames within their 5' leaders (see Section IV). The immunosuppressive drug rapamycin is an inhibitor of ribosomal protein S6 phosphorylation and also of translation of mRNAs containing polypyrimidine tracts at their 5' ends (73); however, it has yet to be proved that these two actions of the drug are related.
Overexpression of mutant forms of eIF-2α seems to produce mRNA-specific alterations in translation (89), and there is certainly precedent for gene-specific regulation by eIF-2α in the case of the yeast GCN4 gene (see Section IV).

III. Regulation by mRNA Binding Proteins

The translation of mRNA molecules can be regulated through the specific binding of translational repressor proteins. The classical example is the regulation of expression of the intracellular iron-binding protein, ferritin, by the availability of iron (90, 91). When cells encounter an environment depleted of iron, the translation of ferritin is inhibited and the mRNA moves into mRNP particles. Key to this process is the iron-regulated binding of a repressor protein to a stem-loop structure, termed the iron-response element, which is located in the 5' leader of the ferritin mRNA. The Caenorhabditis elegans lin-14 and mammalian prm-1 and 15-lipoxygenase genes, which are developmentally regulated, seem to be regulated through a similar

mechanism (92-94). A protein has been cloned that interacts with a region of prm-1 mRNA that is sufficient for translational regulation in transgenic mice (K. Lee, M. Fajardo, S. Edelhoff, C. M. Disteche and R. E. Braun, unpublished) and, in the case of lin-14, an RNA molecule derived from the lin-4 gene seems to be involved in interactions at the critical cis element (92). In all four cases, the repressed form of the mRNA is located in mRNP particles. In the latter three instances, the binding sites seem to be located in the 3' UTRs, in contrast to the location of the iron-response element in the leader of the ferritin mRNA.

There is a class of growth-regulated mRNAs whose translation is controlled through interaction of a polypyrimidine [poly(Y)] binding protein with a poly(Y) tract at the 5' end of the leader (95). The mRNAs encoding the structural proteins of the ribosomal subunits are members of this class, and their translation is under mitogenic control in many cell types (96). All of the characterized ribosomal mRNAs in vertebrates contain a poly(Y) tract of 5 to 14 nucleotides in length, located immediately adjacent to the cap structure at the 5' end. Mutation of this tract, which requires only consecutive pyrimidine nucleosides with no apparent consensus sequence, results in constitutive translation of these mRNAs (36, 97). This result establishes that the poly(Y) tract contributes, at least in part, to the translational control of these mRNAs. A 56-kDa protein has been identified in the cytosol of mouse and Xenopus cells that binds to wild-type 5' leaders of ribosomal proteins L32 and L1, but not to mutant leaders that have lost their regulatory properties (36, 98). This protein binds strongly to poly(U), but appears to be different in several other characteristics from a previously described nuclear polypyrimidine tract binding protein (95).
In addition to the poly(Y) tract, it seems that other sequences are also necessary for high-affinity binding of this protein to the L32 5' leader (M. W. White, personal communication) and also for regulation in vivo (41), suggesting that the interaction is more complex than simply binding of the protein to the tract. Other growth-regulated mRNAs contain this distinctive 5' tract, including those encoding eEF-1α (40) and a protein of unknown function (99, 100), and probably interact with the same binding protein (R. L. Kaspar and M. W. White, personal communication).

Most of our knowledge of the structure of cytoplasmic mRNP particles comes from studies of oocytes of marine invertebrates and Xenopus (101, 102). Although some of these oocyte mRNAs seem to contain cis elements necessary for mRNP formation (reviewed in 103), these oocyte-specific mechanisms for general masking of maternal messages may be different from the mechanisms involved in reversible sequestration of the poly(Y)-containing mRNAs in somatic tissues in response to growth status. The mRNP particles in resting cells that contain the sequestered poly(Y)-containing mRNAs sediment at 35 to 45 S, so they clearly contain many more components than just the mRNA and the 56-kDa poly(Y)-binding protein. In the simplest model, the poly(Y)-binding protein would remain with the mRNA in translationally repressed particles and perhaps serve as a nucleation site for assembly of the inactive particles (discussed in 95); this has yet to be tested. In contrast to iron regulation of ferritin expression, the regulated step in this assembly process may not be the interaction of the binding protein with the mRNA, because the binding in vitro of this protein to the 5' leader of the mRNA encoding ribosomal protein L32 is unchanged in extracts from quiescent or activated lymphocytes (36). Therefore, a model incorporating constitutive binding of the protein and regulated assembly of the mRNP particle seems possible in this instance (95).

In mitogenically activated cells, the shift of the poly(Y)-containing mRNAs from mRNP particles into polysomes correlates with early phosphorylation events (35). Although a strong correlation exists, a causative relationship between phosphorylation of a protein component of the translational machinery and translational activation has not been established (discussed in 35). It is enticing that rapamycin, an inhibitor of ribosomal protein S6 phosphorylation, selectively inhibits translation of the poly(Y) tract family of mRNAs (73). However, interpretation of this interesting result awaits demonstration that the action of the drug is specific to S6 phosphorylation and that it is not influencing some other aspect of regulation of the poly(Y)-containing mRNAs.

Ornithine decarboxylase (ODC) is an important growth-regulated enzyme encoded by a gene that shows an element of translational control (discussed in Section II,C). The ODC leader is long and complex, with extensive regions of potential secondary structure and an open reading frame.
Truncation of the ODC 5' leader, removing inhibitory sequences, led to elevated translation both in vitro and in vivo (84, 85). An evolutionarily conserved region of the ODC leader forms specific complexes with a protein of ~58 kDa (104). Because of the involvement of mRNP particles in the translational control of ODC (33), the potential role of this protein is interesting. However, although a series of mutations has been created that disrupt protein binding, no effect of these on translational control in vivo has been observed (104). Hence, at this point, the biological role of this protein in translational regulation of ODC is in question.

IV. Regulation by Open Reading Frames within the 5' Leader

The presence of initiator AUG codons and upstream open reading frames (uORFs) within the 5' leader of an mRNA can dramatically inhibit translation of the downstream cistron, probably by interfering with progress of the scanning 43-S pre-initiation complex (105, 106). Interestingly, less than 10% of 699 vertebrate mRNA sequences surveyed (107) contained AUG codons upstream of the major cistron, and the small subset that did was strongly biased towards proto-oncogenes, growth factor genes, and growth factor receptor genes. In fact, two-thirds of the proto-oncogene transcripts surveyed contained upstream initiator codons. Based on these observations, one suspects that this feature of mRNA structure may turn out to play an important role in the regulation of proliferation of mammalian cells.

FIG. 2. Model for regulation of translation of a downstream cistron by uORFs. The block to ribosome scanning introduced by the uORF is depicted by the vertical lines (1). Various mechanisms for circumventing the block are collectively represented by the curved arrow (2).

In order for a uORF to play a regulatory role, two conditions must be met (Fig. 2; discussed in 106): (1) initiation of translation at the uORF must result in a strong block to further scanning and resultant expression of the downstream cistron, and (2) a regulated mechanism must be available to circumvent or relieve the suppressive influence of the uORF. If the block to scanning introduced by the uORF is strong, small changes in the degree of suppression can exert quite profound influence on the rate of translation of the downstream cistron (discussed in 108). With the rudimentary knowledge presently available, it is impossible to predict the full range of detailed mechanisms available in nature for achieving these two conditions. Below, three well-characterized examples are discussed that illustrate the general principles of this mode of regulation: tissue inhibitor of metalloproteinases (TIMP), AdoMetDC, and the yeast gene GCN4.

TIMP is a glycoprotein that regulates the breakdown of extracellular matrix by metalloproteinases such as collagenase and stromelysin. In resting fibroblasts, a long form of the mRNA is produced (Fig. 3), which contains in its 5' leader a uORF that overlaps the TIMP cistron by 65 nucleotides in the -1 reading frame (109, 110).
This uORF suppressed translation of TIMP by a factor of three in reticulocyte lysates (111). Mitogenic activation of resting fibroblasts induced expression of a short form of TIMP mRNA through the use of an alternative transcriptional start site (109). The growth-induced short form of the mRNA lacks the initiation codon of the uORF, thus enhancing the translatability of the message. Referring to the general model (Fig. 2), the strong block to translation of the TIMP cistron is generated by the out-of-frame overlap of the uORF. This block is alleviated on mitogenic activation by elimination of the uORF from the induced transcripts through use of the alternate promoter.

FIG. 3. Three representative mRNAs regulated by uORFs. The 5' leaders are represented by the heavy horizontal lines, the uORFs by the cross-hatched rectangles, and the major cistrons by the stippled rectangles that are labeled with the name of the gene. The two forms of the tissue inhibitor of metalloproteinases (TIMP) mRNA that are generated by alternative promoters are shown.

AdoMetDC catalyzes a key regulated step in the synthesis of the polyamines spermidine and spermine. Translation of this mRNA is strongly suppressed in T-cells and T-cell lines, but not in nonlymphoid cells (31, 112). Translational suppression of AdoMetDC is reduced on mitogenic activation of resting T-cells (31). The cell-specific translational behavior of this mRNA is mediated by a small six-codon uORF (Fig. 3) located 14 nucleotides from the 5' cap (112). In this case, the tight block to downstream translation is dependent on the structure of the nascent peptide encoded by the uORF; missense mutation of any one of the three carboxy-terminal codons completely abolishes the inhibitory influence of the uORF, whereas mutations that retain the wild-type coding capacity are without effect (113). The strong, structure-dependent block by the nascent peptide is thought to be due to ribosome stalling, induced by interaction of the C-terminus of the peptidyl tRNA with either the ribosome channel, the peptidyl transferase, or a release factor (discussed in 106). Thus, although initiation at the uORF is relatively infrequent in both T-cells and nonlymphoid cells, as

expected due to its proximity to the 5' cap (108), a ribosome initiating at the uORF and subsequently stalling at termination acts as a block to prevent other ribosomes from entering at the 5' cap and scanning further on the message. The rate of ribosome release from the termination block seems not to be the regulated event; rather, the efficiency of initiation at the uORF is cell-type specific, occurring more frequently in T lymphocytes than in nonlymphoid cells. The mechanism of this difference in recognition frequency between cell types is not yet known. It has been argued that minor, cell-specific variations in the activity of a translation initiation factor or factors could produce these differences (108).

The GCN4 protein of Saccharomyces cerevisiae is a transcription factor that regulates expression of more than 30 genes involved in the biosynthesis of 11 different amino acids (114). Synthesis of GCN4 protein is up-regulated at the translational level in cells that encounter amino-acid limitation. Translational control of GCN4 expression involves four short uORFs present in the 5' leader of its mRNA (Fig. 3). These uORFs seem to have different regulatory functions (115, 116). Translation of any of uORFs 2, 3, or 4 leads to strong repression of translation of the downstream GCN4 cistron. The mechanism of this strong blockade has been studied in detail with uORF 4 (117); the high (G + C) content of the final codon and the 10-nucleotide sequence immediately downstream of the termination site seem to prevent further scanning and reinitiation at the GCN4 cistron. The termination region of uORF 1, on the other hand, is (A + U) rich and allows reinitiation further downstream, either at one of the other uORFs, blocking further translation, or at the GCN4 cistron to produce GCN4 protein. In the absence of amino-acid starvation, scanning ribosomes reinitiate at one of the downstream uORFs, thereby suppressing translation of GCN4.
When cells are starved for amino acids, the downstream uORFs are passed over at elevated frequency, leading to enhanced GCN4 translation by the reinitiating ribosomes. It has been suggested that the ratio of reinitiation at the GCN4 cistron to that at the downstream uORFs is controlled by the rate at which the scanning ribosome destined to reinitiate reacquires the ternary complex [eIF-2·GTP·Met-tRNAMet] (118). In amino-acid-starved cells, the α-subunit of eIF-2 is phosphorylated, depressing its activity (see Section II,A), in turn allowing more time for the scanning ribosomes to get past the suppressive uORFs before acquiring the ability to reinitiate. This circumstance enhances translation of GCN4. Thus, uORFs 2, 3, and 4 in the GCN4 mRNA constitute the tight block illustrated in Fig. 2, whereas the role of uORF 1 is to create a situation wherein reinitiation is required for downstream translation. The suppressive influence of the downstream uORFs is thus modulated by the rate of reinitiation relative to the rate of ribosome scanning. If a ribosome scans past the three inhibitory uORFs before acquiring the ability to reinitiate (a situation created by phosphorylation of eIF-2α during amino-acid starvation), their suppressive influence is avoided and GCN4 translation is stimulated.

Although it is dangerous to extrapolate from only three examples, presently it seems that the well-studied instances of regulation by uORFs can be encompassed by the general scheme of Fig. 2. There is one important point about this model that should be emphasized. If the blockade by the uORF is strong, only a small alteration in the frequency of its recognition results in a large change in the translation of the downstream cistron (discussed in 108). To illustrate this amplification effect, if a uORF blocked scanning 99%, only 1% of the scanning ribosomes would get through to the major cistron. If recognition of the uORF were decreased to 80%, which represents only about a 20% change in the rate of initiation at the uORF, 20% of the scanning ribosomes would reach the downstream cistron, producing a 20-fold increase in its rate of translation. Hence, if a system is poised appropriately, a small change in recognition of a uORF can result in profound alteration in the rate of translation of the downstream cistron. Therefore, in a situation such as GCN4 or AdoMetDC, small changes in the activity of a general translation factor could produce large effects on synthesis of a gene product in the absence of a noticeable influence on the general rate of protein synthesis.

One last circumstance wherein uORFs could play an important role might be in preventing cap-dependent scanning, thereby directing initiation to an internal ribosome entry site (IRES). In the case of the Antennapedia mRNA of Drosophila, the IRES region of the 5' leader is preceded by either 8 or 15 AUG codons, depending on the promoter used (119).
At this time, nothing is known concerning regulation of initiation at IRES structures, although one suspects that this may turn out to be an important mechanism of translational control in higher cells.

Finally, there are instances where upstream initiation codons, which are in-frame with the major cistron, can produce alternative protein products with different amino termini (106). For example, two liver-specific DNA-binding proteins, LIP and LAP, are produced from the same mRNA molecule. These two proteins have different transcriptional activities, LAP being an activator and LIP an inhibitor, and their ratio is regulated over a fivefold range during rat development (120). Translation also initiates at more than one initiation codon in the transcripts from the oncogenes c-myc (121) and erbA (122), giving rise to multiple protein products. The mechanisms regulating employment of alternate in-frame initiation codons, yielding proteins of different activities, could be of considerable biological significance.
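
The amplification effect described earlier in this section for a strongly blocking uORF (99% recognition falling to 80%) can be checked with a few lines of arithmetic. The sketch below is only an illustration under simplifying assumptions that are not part of the original analysis: every ribosome that recognizes the uORF is treated as lost to the downstream cistron, and translation of the downstream cistron is taken to be proportional to the fraction of ribosomes that scan past the uORF. The function names are hypothetical.

```python
def fraction_reaching_cistron(uorf_recognition: float) -> float:
    """Fraction of scanning ribosomes that bypass the uORF.

    Assumption (illustrative only): a ribosome that recognizes the uORF
    never reaches the downstream cistron.
    """
    if not 0.0 <= uorf_recognition <= 1.0:
        raise ValueError("uorf_recognition must lie between 0 and 1")
    return 1.0 - uorf_recognition


def fold_increase(recognition_before: float, recognition_after: float) -> float:
    """Fold change in downstream translation when uORF recognition changes.

    Assumes downstream translation is proportional to the fraction of
    ribosomes that scan past the uORF.
    """
    before = fraction_reaching_cistron(recognition_before)
    after = fraction_reaching_cistron(recognition_after)
    if before == 0.0:
        raise ValueError("a complete block gives no baseline to compare against")
    return after / before


if __name__ == "__main__":
    # The case used in the text: recognition drops from 99% to 80%.
    change = fold_increase(0.99, 0.80)
    print(f"Downstream translation changes about {change:.1f}-fold")
```

Under the same assumptions, a weakly blocking uORF gives far less leverage: a drop in recognition from 50% to 31% is the same absolute change, yet raises downstream translation by less than 1.4-fold.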

V. Translation and Oncogenic Transformation

In recent years, several key proteins involved in translation have been shown to behave like proto-oncogene products, in that their deregulation resulted in oncogenic transformation of mammalian cells (see 123 and 124 for reviews). These observations seem to be dramatic demonstrations of the key role of translational control in cell growth regulation.

The first observation of oncogenesis resulting from overexpression of a translation factor was with the cap-binding protein, eIF-4E. Lazaris-Karatzas and co-workers (125) showed that overexpression of eIF-4E in either NIH 3T3 or Rat-1 fibroblasts led to growth in soft agar and formation of tumors in nude mice, which constitutes strong evidence for oncogenic transformation. Overexpression of eIF-4E in HeLa cells also led to rapid cell growth with aberrant morphology (125a). In these experiments, a mutant construct, in which the putative site of phosphorylation, Ser-53, was converted to alanine, was found not to have biological activity. However, the original interpretation of this control experiment, suggesting that phosphorylation is necessary for transformation, is now altered by the finding that the actual site of phosphorylation is Ser-209 (B. Joshi, A.-L. Cai, B. D. Keiper, W. B. Minich, R. Mendez, C. M. Beach, J. Stepinski, R. Stolarski, E. Darzynkiewicz and R. E. Rhoads, unpublished). This, however, does not detract from the fact that overexpression of eIF-4E does result in a neoplastic phenotype, but only questions the role of phosphorylation in this process. The transformed phenotype of the eIF-4E-overexpressing NIH 3T3 cells is explained by the fact that Ras is constitutively activated in these cells (126). It was subsequently shown that eIF-4E, like Ras, would cooperate with immortalizing genes such as v-myc or the gene encoding adenovirus protein E1a, to transform normal rat embryo fibroblasts (127).
Modulation of the activity or level of eIF-4E may also occur in response to other oncogenes. For example, in cells transformed with src (128) or ras (129), the degree of eIF-4E phosphorylation is increased, and a dominant-negative mutant form of Ras blocked phosphorylation of eIF-4E induced by nerve growth factor in PC-12 cells (130). Also, overexpression of c-myc led to elevated expression of eIF-4E protein (74).

The most straightforward interpretation of the results with eIF-4E is that overexpression of this factor results in deregulation of translation of mRNAs with complex 5' leaders that encode growth regulatory products. Overexpression of eIF-4E elevates the cellular levels of cyclin D1 and ODC, two key growth-related proteins (discussed in Section II,C). It was previously shown that overexpression of ODC produced a neoplastic phenotype (131, 132). Interestingly, growth of cells overexpressing eIF-4E in soft agar was

356

DAVID R. MORRIS

inhibited by α-difluoromethylornithine, a specific inhibitor of ODC (83), suggesting that overproduction of polyamines could be an important component in the neoplastic phenotype induced by eIF-4E. It is noteworthy, however, that since a significant fraction of the cellular eIF-4E is located in the nucleus (134), the transforming activity of this protein could be related to a nuclear function and not to its role in translation. This alternative interpretation cannot be ruled out at present.

Disruption of the complex regulation of PKR, the RNA-dependent protein kinase that phosphorylates eIF-2α (see Section II,A), can also elicit a tumor phenotype in cells. Expression of certain mutant, catalytically inactive forms of PKR in NIH 3T3 cells caused malignant transformation, presumably by acting in a dominant-negative manner to inhibit endogenous PKR (134, 135). This trans-dominant inhibition of wild-type PKR activity has been demonstrated in vitro in reticulocyte lysates (136). More recently, it has been found that overexpression of the naturally occurring inhibitor of PKR, p58, also gives rise in NIH 3T3 cells to anchorage-independent growth and tumor formation in nude mice (67). In this context, it is also interesting to note that a specific inhibitor of PKR has been found in cells transformed by v-ras (137), raising the possibility that this inhibitor may be mediating the action of the oncogene. Although these results, taken together, argue strongly for a role for PKR as a tumor suppressor gene, the kinase substrate critical for this activity has not yet been identified. Obviously, from the standpoint of this essay, phosphorylation of eIF-2α is of interest and one could argue that deregulation of this factor might specifically alter expression of proteins critical to growth control (Section II,C).
The influence of expressing the mutant form of eIF-2α, in which the site of phosphorylation, Ser-51, has been converted to alanine (138), would be of interest in testing this hypothesis. However, in the absence of experiments of this sort, it should be emphasized that the biological effects of PKR inhibition could be the consequence of modifying the phosphorylation state of some substrate other than eIF-2α. Of particular interest in this context is the recent observation that PKR can activate the transcription factor NFκB through phosphorylation of the inhibitory factor IκB (68, 139). NFκB regulates expression of a variety of cytokines, including interferon-β, and thus its activation is likely to modulate cell growth. Alternatively, other regulatory factors could also be substrates for PKR, which in turn might control transcription of key proto-oncogenes and growth factors, as has been previously suggested (140, 141).

One last example of the potential involvement of a translation factor in oncogenesis is that of eEF-1α. Constitutive expression of eEF-1α in several fibroblast cell lines led to elevated sensitivity to oncogenic transformation by ultraviolet light and 3-methylcholanthrene (142). Because the level of this
factor in mammalian fibroblasts decreases with aging (143), it is possible that the cellular level of eEF-1α could regulate translation of a protein or proteins involved in defining the life span of cells. On the other hand, it is interesting to note that eEF-1α has been suggested to be involved in a number of other functions in cells, ranging from protein degradation to interaction with the mitotic apparatus (reviewed in 51), and one of these other activities could be related to the complex processes of oncogenesis and life span. For example, eEF-1α has been identified as a microtubule-severing activity in Xenopus eggs that is thought to function in the rearrangement of microtubules during the cell cycle (144).

VI. Conclusions: Physiological Roles of Translational Control

The conclusion of this essay is that there is significant regulation of some mRNA species at the translational level in mitogenically activated cells and that there are multiple, gene-specific mechanisms that can achieve this end. In this final section, I address the question of why these elaborate mechanisms of translational control should be superimposed on the well-established transcriptional regulation seen with some proto-oncogenes and other genes involved in growth control. There are several explanations for the prominent occurrence of translational regulation in the mechanisms governing cell growth: the speed and possibilities for coordination of the response, an additional regulatory checkpoint for genes with powerful physiological influences, and the ability to target this rapid response to areas within the cell through localization of the mRNA. Each of these aspects of translational control is discussed below.

Regulation of the translation of a preformed mRNA in response to a stimulus enables a cell to alter the rate of synthesis of a protein product considerably more rapidly than would be possible by a change in transcription rate. Considering a hypothetical early-response gene, the lag period before appearance of new mature mRNA in the cytosol would be the sum of the time required for transcription, processing, and transport, which can amount to tens of minutes for a large and complex gene. After new molecules begin to emerge from the nucleus, the half-time to a new steady-state level of mature message in the cytosol, and hence to the new steady-state rate of synthesis of the protein product, would be equal to the half-life of the mature mRNA (see 145 for a discussion of the influence of half-life on the rate of accumulation of a macromolecule).
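
The kinetic argument can be made concrete with a short numerical sketch. It assumes the simplest model, constant synthesis and first-order decay of the mRNA, so that after a step change in transcription the cytosolic level approaches its new steady state exponentially, with a half-time equal to the mRNA half-life. The function names, the 90% criterion, and the 20-minute lag standing in for transcription, processing, and transport are illustrative assumptions and are not taken from the studies cited here.

```python
import math


def fraction_of_shift_completed(t_min: float, half_life_min: float) -> float:
    """Fraction of the move from the old to the new steady-state mRNA level
    completed t_min minutes after the synthesis rate changes.

    Assumes constant (zero-order) synthesis and first-order decay, so the
    approach to the new steady state depends only on the mRNA half-life.
    """
    decay_constant = math.log(2) / half_life_min
    return 1.0 - math.exp(-decay_constant * t_min)


def time_to_fraction(fraction: float, half_life_min: float, lag_min: float = 0.0) -> float:
    """Minutes needed to complete the given fraction of the shift, including
    an initial lag for transcription, processing, and transport."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    decay_constant = math.log(2) / half_life_min
    return lag_min - math.log(1.0 - fraction) / decay_constant


if __name__ == "__main__":
    ASSUMED_LAG_MIN = 20.0  # hypothetical value standing in for "tens of minutes"
    for half_life in (50.0, 100.0):  # half-life range quoted in the text
        minutes = time_to_fraction(0.9, half_life, lag_min=ASSUMED_LAG_MIN)
        print(f"half-life {half_life:.0f} min: about {minutes / 60:.1f} h to reach 90%")
```

With the illustrative 20-minute lag, the two printed times fall between roughly 3 and 6 hours; a stricter criterion than 90% would lengthen them further.
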
Thus, summing these individual steps for a typical early-response mRNA with a half-life of 50-100 minutes, one arrives at a time on the order of hours to elevate the rate of synthesis of its
product to a new steady-state level. On the other hand, if the rate of translational initiation is increased for a particular mRNA, the new steady-state rate of synthesis of the protein product could be achieved in the length of time it takes one ribosome to transit the message, which is on the order of a few minutes (see 38 for a discussion of ribosome elongation rates and transit times). Therefore, simply on first principles, one would anticipate translational control to respond much more rapidly to mitogenic activation than transcriptional regulation. This might not be an important consideration for cellular components that change in proportion to cell mass and therefore need to double only once per generation. A good example of this class is cytoskeletal actin, which shows no detectable element of translational control. On the other hand, for key proteins whose functions are needed immediately after mitogenic activation of a resting cell (for example, certain of the proto-oncogenes, the cyclins, or components of the translation apparatus), one might expect a requirement for translational control. This is often observed.

It is not unexpected that the expression of some classes of growth-controlled genes must be coordinated. An example is the genes encoding the translation apparatus, the components of which are important for the general increase in protein synthesis that occurs on mitogenic activation. The synthesis of these components must be coordinated, because many of them need to be produced in a definite stoichiometry (see 96 for a review). Many of the mRNAs encoding these proteins, including those for all of the structural proteins of the ribosomal subunits and eEF-1α, show strong, coordinated regulation at the translational level. Translation of these mRNAs is controlled via a regulatory system that interacts in part with the poly(Y) tract located at the 5’ end of these mRNAs (see Section III).
Control of the expression of these proteins by a single regulatory system ensures the required coordination. How far this sort of coordinate control, either by the poly(Y) system or by other regulatory systems, extends beyond this class of mRNAs is not yet known. However, another candidate class for coordinate control certainly would be those growth-related mRNAs that contain uORFs (Section IV).

Many of the genes that show an element of translational control encode products with profound physiological influences, such as proto-oncogenes. The possibility of translational regulation of these mRNAs provides for secure repression of their expression at inappropriate times and for fine-tuning their expression at proper times in the cell cycle and during development. One example is the ODC gene, overexpression of which can lead to oncogenesis (87, 131, 132). This mRNA has a long, complex 5’ leader that is strongly suppressive to translation (84, 85). Another instance is the prominent occurrence of uORFs in the 5’ leaders of mRNAs from proto-oncogenes and genes encoding growth factors. The presence of these uORFs is almost
always repressive, although the degree of translational suppression varies with the particular uORF (see Section IV). In those instances where it has been tested, for example lck (146) and TGF-β (147), removal of upstream AUG codons in 5’ leaders derepresses translation of downstream cistrons (reviewed in 105, 106). Therefore, even if suppression is not regulated, the presence of these leaders assures low levels of expression of the protein products when the mRNAs are at basal levels in the cell.

Last, cis elements within the structure of an mRNA molecule can direct its temporal expression to specific areas of a cell. The role of mRNA localization in early development has been recognized for some time. A good example is the role of the 3’ UTR of the nanos mRNA in localization within the early Drosophila embryo and in repression of translation of the unlocalized mRNA (148). Intracellular targeting of mRNA molecules is also important in nonembryonic cells (reviewed in 149). For example, the mRNA encoding β-actin is localized to the leading lamellae of fibroblasts as a response to growth factors (150, 151). Also, the 3’ UTR of the c-myc mRNA directs it to cytoskeletal-associated polysomes (152). The physiological significance of this intracellular localization of c-myc is not understood at this time, but this specific association suggests that it may be important in the expression of this proto-oncogene and perhaps of other growth-related genes as well.

In summary, many cellular studies strongly suggest that a significant fraction of the new protein synthesis after mitogenic stimulation is due to activation of translation of pre-existing mRNA molecules. It is now becoming clear that there are multiple mechanisms of translational activation that are operative under these conditions.
As we begin to gain more molecular insights into the details of these mechanisms, we can anticipate a deeper understanding of the significance of control at the translational level and how it fits into the pleiotropic mitogenic program.

ACKNOWLEDGMENTS

I thank my friends and colleagues John Hershey, Nahum Sonenberg, Bob Rhoads, Adam Geballe, Charles Moehle and Michael Katze for their helpful criticisms of a late draft of this essay. I am also very grateful to Bob Rhoads for discussing with me the unpublished results from his laboratory on phosphorylation of eIF-4E. The studies cited from my own laboratory were supported by a research grant from the NIH (CA39053).

REFERENCES

1. A. B. Pardee, Science 246, 603 (1989).
2. H. R. Herschman, TIBS 14, 455 (1989).
3. R. Muller, D. Mumberg and F. C. Lucibello, BBA 1155, 151 (1993).
4. K. Seuwen and J. Pouyssegur, Adv. Cancer Res. 58, 58 (1992).
5. M. Whitman and L. Cantley, BBA 948, 327 (1989).
6. H. R. Herschman, ARB 60, 281 (1991).
7. B. J. Rollins and C. D. Stiles, Adv. Cancer Res. 53, 1 (1989).
8. J.-M. Blanchard, M. Piechaczyk, C. Dani, J.-C. Chambard, A. Franchi et al., Nature 317, 443 (1985).
9. G. Manfioletti, M. E. Ruaro, G. D. Sal, L. Philipson and C. Schneider, MCBiol 10, 2924 (1990).
10. M. S. Abrahamsen and D. R. Morris, MCBiol 10, 5525 (1990).
11. C. D. Chang, L. Ottavio, S. Travali, K. E. Lipson and R. Baserga, MCBiol 10, 3289 (1990).
12. P. S. Rudland and C. Jimenez de Asua, BBA 560, 91 (1979).
13. R. E. H. Wettenhall and D. R. London, BBA 349, 214 (1974).
14. S. Cohen and M. Stastny, BBA 166, 427 (1968).
15. C. P. Stanners and H. J. Becker, J. Cell. Physiol. 77, 31 (1971).
16. J. E. Kay, T. Ahern and M. Atkins, BBA 247, 332 (1971).
17. T. Ahern and J. E. Kay, Exp. Cell Res. 92, 513 (1975).
18. J. E. Kay, T. Ahern, V. J. Lindsay and J. Sampson, BBA 378, 241 (1975).
19. T. H. Meedel and E. M. Levine, J. Cell. Physiol. 94, 229 (1978).
20. D. T. Denhardt, D. R. Edwards and C. L. J. Parfett, BBA 865, 83 (1986).
21. P. S. Rudland, PNAS 71, 750 (1974).
22. P. S. Rudland, S. Weil and A. R. Hunter, JMB 96, 745 (1975).
23. B. Benecke, A. Ben-Ze’ev and S. Penman, Cell 14, 931 (1978).
24. R. Jagus and J. E. Kay, EJB 100, 503 (1979).
25. E. Bandman and T. Gurney, Exp. Cell Res. 90, 159 (1975).
26. G. T.-Y. Lee and D. L. Engelhardt, JCB 79, 85 (1978).
27. G. Thomas and H. Luther, PNAS 78, 5712 (1981).
28. A. J. Kinniburgh, M. D. McMullen and T. Martin, JMB 132, 695 (1979).
29. G. T.-Y. Lee and D. L. Engelhardt, JMB 129, 221 (1979).
30. J. L. Degen, M. G. Neubauer, S. J. Friezner-Degen, C. E. Seyfried and D. R. Morris, JBC 258, 12153 (1983).
31. M. Mach, M. W. White, M. Neubauer, J. L. Degen and D. R. Morris, JBC 261, 11697 (1986).
32. J. G. Hovis, D. J. Stumpo, D. L. Halsey and P. J. Blackshear, JBC 261, 10380 (1986).
33. M. W. White, T. Kameji, A. E. Pegg and D. R. Morris, EJB 170, 87 (1987).
34. P. K. Geyer, O. Meyuhas, R. P. Perry and L. F. Johnson, MCBiol 2, 685 (1982).
35. R. L. Kaspar, W. Rychlik, M. W. White, R. E. Rhoads and D. R. Morris, JBC 265, 3619 (1990).
36. R. L. Kaspar, T. Kakegawa, H. Cranston, D. R. Morris and M. W. White, JBC 267, 508 (1992).
37. R. J. Tushinski and J. R. Warner, J. Cell. Physiol. 112, 128 (1982).
38. J. W. B. Hershey, ARB 60, 717 (1991).
39. T. R. Rao and L. I. Slobin, MCBiol 7, 687 (1987).
40. H. B. J. Jefferies and G. Thomas, JBC 269, 4367 (1994).
41. D. Avni, S. Shama, F. Loreni and O. Meyuhas, MCBiol 14, 3822 (1994).
42. H. F. Lodish, JBC 246, 7131 (1971).
43. A. De Benedetti, S. Joshi-Barve, C. Rinker-Schaeffer and R. E. Rhoads, MCBiol 11, 5435 (1991).
44. C. Brenner, N. Nakayama, M. Goebl, K. Tanaka, A. Toh-e and K. Matsumoto, MCBiol 8, 3556 (1988).
45. R. M. Frederickson and N. Sonenberg, in “Translational Regulation of Gene Expression 2” (J. Ilan, ed.), p. 143. Plenum, New York, 1993.
46. J. W. B. Hershey, JBC 264, 20823 (1989).
47. C. E. Samuel, JBC 268, 7603 (1993).
48. R. E. Rhoads, JBC 268, 3017 (1993).
49. N. T. Redpath and C. G. Proud, BBA 1220, 147 (1994).
50. R. E. Thach, Cell 68, 177 (1992).
51. W. C. Merrick, Microbiol. Rev. 56, 291 (1992).
52. W. Rychlik, J. S. Rush, R. E. Rhoads and C. J. Waechter, JBC 265, 19467 (1990).
53. S. J. Morley, M. Rau, J. E. Kay and V. M. Pain, EJB 218, 39 (1993).
54. T. R. Boal, J. A. Chiorini, R. B. Cohen, S. Miyamoto, R. M. Frederickson et al., BBA 1176, 257 (1993).
55. W. B. Minich, M. L. Balasta, D. J. Goss and R. E. Rhoads, PNAS 91, 7668 (1994).
56. C. Hu, S. Pang, X. Kong, M. Velleca and J. C. Lawrence, PNAS 91, 3730 (1994).
57. A. Pause, G. J. Belsham, A.-C. Gingras, O. Donze, T.-A. Lin et al., Nature 371, 762 (1994).
58. T.-A. Lin, X. Kong, T. A. J. Haystead, A. Pause, G. Belsham et al., Science 266, 653 (1994).
59. T. A. J. Haystead, C. M. M. Haystead, C. Hu, T.-A. Lin and J. C. Lawrence, JBC 269, 23185 (1994).
60. R. Duncan and J. W. B. Hershey, JBC 260, 5493 (1985).
61. H. Mellor, K. M. Flowers, S. R. Kimball and L. S. Jefferson, JBC 269, 10201 (1994).
62. S. Pestka, J. A. Langer, K. C. Zoon and C. E. Samuel, ARB 56, 727 (1987).
63. A. G. Hovanessian, in “Translational Regulation of Gene Expression 2” (J. Ilan, ed.), p. 163. Plenum, New York, 1993.
64. T. G. Lee, J. Tomita, A. G. Hovanessian and M. G. Katze, JBC 267, 14238 (1992).
65. S. Saito and M. Kawakita, Microbiol. Immunol. 35, 1105 (1991).
66. R. Judware and R. Petryshyn, MCBiol 11, 3259 (1991).
67. G. N. Barber, S. Thompson, T. G. Lee, T. Strom, R. Jagus et al., PNAS 91, 4278 (1994).
68. A. Kumar, J. Haque, J. Lacoste, J. Hiscott and B. R. G. Williams, PNAS 91, 6288 (1994).
69. E. A. Olmsted, L. Obrien, E. C. Henshaw and R. Panniers, JBC 268, 12552 (1993).
70. A. Chakraborty, D. Saha, A. Bose, M. Chatterjee and N. K. Gupta, Bchem 33, 6700 (1994).
71. S. J. Morley and G. Thomas, Pharmacol. Ther. 50, 291 (1991).
72. S. C. Kozma and G. Thomas, Rev. Physiol. Biochem. Pharmacol. 119, 123 (1992).
73. H. B. J. Jefferies, C. Reinhard, S. C. Kozma and G. Thomas, PNAS 91, 4441 (1994).
74. I. B. Rosenwald, D. B. Rhoads, L. D. Callanan, K. J. Isselbacher and E. V. Schmidt, PNAS 90, 6175 (1993).
75. X. H. Mao, J. M. Green, B. Safer, T. Lindsten, R. M. Frederickson et al., JBC 267, 20444 (1992).
76. S. S. Barve, D. A. Cohen, A. Debenedetti, R. E. Rhoads and A. M. Kaplan, J. Immunol. 152, 1171 (1994).
77. P. J. Nielsen, J. M. Rochelle and M. F. Seldin, Mamm. Genome 4, 185 (1993).
78. P. J. Nielsen and H. Trachsel, EMBO J. 7, 2097 (1988).
79. J. Pelletier and N. Sonenberg, Cell 40, 515 (1985).
80. A. E. Koromilas, A. Lazaris-Karatzas and N. Sonenberg, EMBO J. 11, 4153 (1992).
81. H. Matsushime, M. F. Roussel, R. A. Ashmun and C. J. Sherr, Cell 65, 701 (1991).
82. I. B. Rosenwald, A. Lazaris-Karatzas, N. Sonenberg and E. V. Schmidt, MCBiol 13, 7358 (1993).
83. P. J. Blackshear, R. A. Nemenoff, J. G. Hovis, D. L. Halsey, D. J. Stumpo and J. K. Huang, Mol. Endocrinol. 1, 44 (1987).
84. A. Grens and I. E. Scheffler, JBC 265, 11810 (1990).
85. J. M. Manzella and P. J. Blackshear, JBC 265, 11817 (1990).
86. J. M. Manzella, W. Rychlik, R. E. Rhoads, J. W. B. Hershey and P. J. Blackshear, JBC 266, 2383 (1991).
87. L. M. Shantz and A. E. Pegg, Cancer Res. 54, 2313 (1994).
88. S. M. Tahara, T. A. Dietlin, T. E. Dever, W. C. Merrick and L. M. Worrilow, JBC 266, 3594 (1991).
89. R. J. Kaufman, M. V. Davies, V. K. Pathak and J. W. B. Hershey, MCBiol 9, 946 (1989).
90. R. D. Klausner, T. A. Rouault and J. B. Harford, Cell 72, 19 (1993).
91. O. Melefors and M. W. Hentze, BioEssays 15, 85 (1993).
92. B. Wightman, I. Ha and G. Ruvkun, Cell 75, 855 (1993).
93. M. A. Fajardo, K. A. Butner, K. Lee and R. E. Braun, Dev. Biol. 166, 643 (1994).
94. A. Ostareck-Lederer, D. H. Ostareck, N. Standart and B. J. Thiele, EMBO J. 13, 1476 (1994).
95. D. R. Morris, T. Kakegawa, R. L. Kaspar and M. W. White, Bchem 32, 2931 (1993).
96. R. L. Kaspar, D. R. Morris and M. W. White, in “Translational Regulation of Gene Expression 2” (J. Ilan, ed.), p. 335. Plenum, New York, 1993.
97. S. Levy, D. Avni, N. Hariharan, R. P. Perry and O. Meyuhas, PNAS 88, 3319 (1991).
98. B. Cardinali, M. Dicristina and P. Pierandrei-Amaldi, NARes 21, 2301 (1993).
99. R. Yenofsky, S. Cereghini, A. Krowczynska and G. Brawerman, MCBiol 3, 1197 (1983).
100. S. T. Chitpatima, S. Makrides, R. Bandyopadhyay and G. Brawerman, NARes 16, 2350 (1988).
101. J. D. Richter, Dev. Genet. 14, 407 (1993).
102. N. Standart, Mol. Biol. Rep. 18, 135 (1993).
103. J. Richter, BioEssays 13, 179 (1991).
104. J. M. Manzella and P. J. Blackshear, JBC 267, 7077 (1992).
105. M. Kozak, Annu. Rev. Cell Biol. 8, 197 (1992).
106. A. P. Geballe and D. R. Morris, TIBS 19, 159 (1994).
107. M. Kozak, NARes 15, 8125 (1987).
108. H. J. Ruan, J. R. Hill, S. Fatemie-Nainie and D. R. Morris, JBC 269, 17905 (1994).
109. D. R. Edwards, P. Waterhouse, M. L. Holman and D. Denhardt, NARes 14, 886 (1986).
110. B. Coulombe, A. Ponton, L. Daigneault, B. R. G. Williams and D. Skup, MCBiol 8, 3227 (1988).
111. P. Waterhouse, R. Khokha and D. T. Denhardt, JBC 265, 5585 (1990).
112. J. R. Hill and D. R. Morris, JBC 267, 21886 (1992).
113. J. R. Hill and D. R. Morris, JBC 268, 726 (1993).
114. A. G. Hinnebusch, Microbiol. Rev. 52, 248 (1988).
115. A. G. Hinnebusch, TIBS 15, 148 (1990).
116. A. G. Hinnebusch, R. C. Wek, T. E. Dever, A. M. Cigan, L. Feng and T. F. Donahue, in “Translational Regulation of Gene Expression 2” (J. Ilan, ed.), p. 87. Plenum, New York, 1993.
117. C. M. Grant and A. G. Hinnebusch, MCBiol 14, 606 (1994).
118. T. E. Dever, L. Feng, R. C. Wek, A. M. Cigan, T. F. Donahue and A. G. Hinnebusch, Cell 68, 585 (1992).
119. S.-K. Oh, M. P. Scott and P. Sarnow, Genes Dev. 6, 1643 (1992).
120. P. Descombes and U. Schibler, Cell 67, 569 (1991).
121. J. Bigler, W. Hokanson and R. N. Eisenman, MCBiol 12, 2406 (1992).
122. S. R. Hann, M. W. King, D. L. Bentley, C. W. Anderson and R. N. Eisenman, Cell 52, 185 (1988).
123. R. E. Rhoads, Curr. Opin. Cell Biol. 3, 1019 (1991).
124. N. Sonenberg, Curr. Opin. Cell Biol. 5, 955 (1993).
125. A. Lazaris-Karatzas, K. S. Montine and N. Sonenberg, Nature 345, 544 (1990).
125a. A. De Benedetti and R. E. Rhoads, PNAS 87, 8212 (1990).
126. A. Lazaris-Karatzas, M. R. Smith, R. M. Frederickson, M. L. Jaramillo, Y. L. Liu et al., Genes Dev. 6, 1631 (1992).
127. A. Lazaris-Karatzas and N. Sonenberg, MCBiol 12, 1234 (1992).
128. R. M. Frederickson, K. S. Montine and N. Sonenberg, MCBiol 11, 2896 (1991).
129. C. W. Rinker-Schaeffer, V. Austin, S. Zimmer and R. E. Rhoads, JBC 267, 10659 (1992).
130. R. M. Frederickson, W. E. Mushynski and N. Sonenberg, MCBiol 12, 1239 (1992).
131. M. Auvinen, A. Paasinen, L. C. Anderson and E. Holtta, Nature 360, 355 (1992).
132. J. A. Moshier, J. Dosescu, M. Skunca and G. D. Luk, Cancer Res. 53, 2618 (1993).
133. F. Lejbkowicz, C. Goyer, A. Darveau, S. Neron, R. Lemieux and N. Sonenberg, PNAS 89, 9612 (1992).
134. A. E. Koromilas, S. Roy, G. N. Barber, M. G. Katze and N. Sonenberg, Science 257, 1685 (1992).
135. E. F. Meurs, J. Galabru, G. N. Barber, M. Katze and A. G. Hovanessian, PNAS 90, 232 (1993).
136. T. V. Sharp, Q. R. Xiao, I. Jeffrey, D. R. Gewert and M. J. Clemens, EJB 214, 945 (1993).
137. L. J. Mundschau and D. V. Faller, JBC 267, 23092 (1992).
138. V. K. Pathak, D. Schindler and J. W. B. Hershey, MCBiol 8, 993 (1988).
139. A. Maran, R. K. Maitra, A. Kumar, B. H. Dong, W. Xiao et al., Science 265, 789 (1994).
140. K. Zinn, A. Keller, L. A. Whittemore and T. Maniatis, Science 240, 210 (1988).
141. R. Tiwari, J. Kusari, R. Kumar and G. C. Sen, MCBiol 8, 4289 (1988).
142. M. Tatsuka, H. Mitsui, M. Wada, A. Nagata, H. Nojima and H. Okayama, Nature 359, 333 (1992).
143. J. Cavalius, S. I. Rattan and B. F. C. Clark, Exp. Gerontol. 21, 149 (1986).
144. N. Shiina, Y. Gotoh, N. Kubomura, A. Iwamatsu and E. Nishida, Science 266, 282 (1994).
145. R. T. Schimke, Adv. Enzymol. 37, 135 (1973).
146. J. D. Marth, R. W. Overall, K. T. Meier, E. G. Krebs and R. M. Perlmutter, Nature 332, 171 (1988).
147. B. A. Arrick, A. L. Lee, R. L. Grendell and R. Derynck, MCBiol 11, 4306 (1991).
148. E. R. Gavis and R. Lehmann, Nature 369, 315 (1994).
149. J. E. Wilhelm and R. D. Vale, JCB 123, 269 (1993).
150. V. M. Latham, E. H. Kislauskis, R. H. Singer and A. F. Ross, JCB 126, 1211 (1994).
151. M. A. Hill, L. Schedlich and P. Gunning, JCB 126, 1221 (1994).
152. J. Hesketh, G. Campbell, M. Piechaczyk and J. M. Blanchard, BJ 298, 143 (1994).


