PROGRESS IN
Nucleic Acid Research and Molecular Biology
Volume 56
edited by
WALDO E. COHN
Biology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
KIVIE MOLDAVE
Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California
Volume 56
ACADEMIC PRESS
San Diego London Boston New York Sydney Tokyo Toronto
This book is printed on acid-free paper.
Copyright © 1997 by ACADEMIC PRESS
All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher.
The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-1997 chapters are as shown on the title pages; if no fee code appears on the title page, the copy fee is the same as for current chapters. 0079-6603/97 $25.00
Academic Press
a division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com
Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
International Standard Book Number: 0-12-540056-X
PRINTED IN THE UNITED STATES OF AMERICA
97 98 99 00 01 02 BB 9 8 7 6 5 4 3 2 1
Contents

ABBREVIATIONS AND SYMBOLS
SOME ARTICLES PLANNED FOR FUTURE VOLUMES

Developmental Genome Reorganization in Ciliated Protozoa: The Transposon Link
Lawrence A. Klobutcher and Glenn Herrick
I. Genome Organization and Reorganization in Ciliates
II. Organization of Eliminated DNA Sequences
III. Mechanisms of Internal Eliminated Sequence Excision
IV. Possible Functions of Internal Eliminated Sequences
V. Evolution of Ciliate Internal Eliminated Sequences by the Invasion, Bloom, Abdication, and Fading of Transposons
References

DNA Excision Repair Assays
David Mu and Aziz Sancar
I. In Vitro Assays
II. In Vivo Assays
III. Conclusion
References

The Mitochondrial Uncoupling Protein: Structural and Genetic Studies
Daniel Ricquier and Frédéric Bouillaud
I. The Uncoupling Protein
II. The Uncoupling Protein Gene
III. Conclusions and Perspectives
References

Molecular Regulation of Cytokine Gene Expression: Interferon-γ as a Model System
Howard A. Young and Paritosh Ghosh
I. Extracellular Signals That Modulate IFN-γ Production
II. The Role of DNA Methylation
III. IFN-γ Promoter Structure and Regulatory Elements
IV. Summary
References

RecA Protein: Structure, Function, and Role in Recombinational DNA Repair
Alberto I. Roca and Michael M. Cox
I. On the Function of Homologous Genetic Recombination in Bacteria
II. The Structure of RecA Protein
III. RecA Protein Interactions with Its Ligands in Vitro; Biochemical Approaches
IV. RecA Protein-mediated DNA Strand Exchange
V. Interaction of RecA Protein with Other Proteins
VI. Other Functions of RecA Protein in Vivo
VII. Epilogue: Relating RecA Biochemistry to DNA Repair
References

Molecular Biology of Axon-Glia Interactions in the Peripheral Nervous System
Verdon Taylor and Ueli Suter
I. Axon-Glial Interactions during Neural Crest Development
II. Regulation of Schwann Cell Proliferation and Differentiation by Growth Factors and Their Receptors
III. Role of the Extracellular Matrix in PNS Development
IV. Myelination as a Speciality of Axon-Schwann Cell Interactions
V. Transcriptional Regulation of Axon-Schwann Cell Interactions
VI. Degeneration and Regeneration in the Nervous System
VII. Axon-Schwann Cell Interactions as a Bilateral Communication
VIII. Mechanisms of Membrane Sorting in Myelinating Schwann Cells
IX. Future Perspectives
References

Regulation of Eukaryotic Messenger RNA Turnover
Lakshman E. Rajagopalan and James S. Malter
I. Measurement of mRNA Decay Rates
II. Cis Elements
III. Trans Factors
IV. Overproduction of Cytokines in Cells and Intact Animals: Application to Gene Therapy
V. Summary
References

New and Atypical Families of Type I Interferons in Mammals: Comparative Functions, Structures, and Evolutionary Relationships
R. Michael Roberts, Limin Liu and Andrei Alexenko
I. Interferon-ω
II. Interferon-τ
III. Comparison of Structures of IFN-ω and IFN-τ with Other Type I Interferons
IV. Evolution of IFNW and IFNT
V. Chromosomal Location and Linkage of IFNW and IFNT
VI. Other Atypical Type I Interferons
VII. Is There a Human IFN-τ?
VIII. Concluding Remarks
References

General Transcription Factors for RNA Polymerase II
Ronald C. Conaway and Joan Weliky Conaway
I. TFIID and Formation of the First Stable Intermediate in Assembly of the Preinitiation Complex
II. TFIIB and Selective Binding of RNA Polymerase II to the TFIID-Core Promoter Complex
III. TFIIF and Assembly of the Active Preinitiation Complex
IV. Roles of TFIIE and TFIIH in Formation and Activation of the Fully Assembled Preinitiation Complex
V. Overview of RNA Polymerase II General Elongation Factors
VI. SII and Nascent Transcript Cleavage
VII. The Elongation Activity of TFIIF
VIII. The Elongin (SIII) Complex and von Hippel-Lindau Disease
IX. ELL and Acute Myeloid Leukemia
References

Biochemistry and Molecular Genetics of Cobalamin Biosynthesis
Michelle R. Rondon, Jodi R. Trzebiatowski and Jorge C. Escalante-Semerena
I. Nomenclature of Corrinoids
II. Diversity of Corrinoids
III. Cobamide-producing Organisms
IV. Cobalamin-dependent Reactions
V. Biochemistry of Cobalamin Synthesis
VI. Molecular Genetics of Cobalamin Synthesis
VII. Regulation of Cobalamin Synthesis
VIII. Concluding Remarks
References

INDEX
Abbreviations and Symbols
All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editors endeavor to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3); they are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form in the front matter of Volumes 9-33 of this series. A recent expansion of the one-letter system (5) follows.

SINGLE-LETTER CODE RECOMMENDATIONS(a) (5)

Symbol   Meaning                 Origin of symbol
G        G                       Guanosine
A        A                       Adenosine
T(U)     T(U)                    (ribo)Thymidine (Uridine)
C        C                       Cytidine
R        G or A                  puRine
Y        T(U) or C               pYrimidine
M        A or C                  aMino
K        G or T(U)               Keto
S        G or C                  Strong interaction (3 H-bonds)
W(b)     A or T(U)               Weak interaction (2 H-bonds)
H        A or C or T(U)          not G; H follows G in the alphabet
B        G or T(U) or C          not A; B follows A
V        G or C or A             not T (not U); V follows U
D(c)     G or A or T(U)          not C; D follows C
N        G or A or T(U) or C     aNy nucleoside (i.e., unspecified)
Q        Q                       Queuosine (nucleoside of queuine)

(a) Modified from Proc. Natl. Acad. Sci. U.S.A. 83, 4 (1986).
(b) W has been used for wyosine, the nucleoside of “base Y” (wye).
(c) D has been used for dihydrouridine (hU or H2Urd).

Enzymes
In naming enzymes, the 1984 recommendations of the IUB Commission on Biochemical Nomenclature (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).
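As an illustration only, the single-letter code recommendations (5) given earlier in this section can be written as a lookup table. The Python sketch below is not part of the Recommendations: the names IUPAC, matches, and expand are hypothetical, U is treated as equivalent to T, and Q (queuosine) is left out because it denotes a modified nucleoside rather than an ambiguity.

```python
from itertools import product

# Each symbol maps to the set of bases it may stand for (U is folded into T).
IUPAC = {
    "G": {"G"}, "A": {"A"}, "T": {"T"}, "C": {"C"},
    "R": {"G", "A"}, "Y": {"T", "C"}, "M": {"A", "C"}, "K": {"G", "T"},
    "S": {"G", "C"}, "W": {"A", "T"},
    "H": {"A", "C", "T"}, "B": {"G", "T", "C"},
    "V": {"G", "C", "A"}, "D": {"G", "A", "T"},
    "N": {"G", "A", "T", "C"},
}


def _normalize(text):
    """Upper-case the text and treat U as T, following the T(U) entries."""
    return text.upper().replace("U", "T")


def matches(pattern, sequence):
    """Return True if each base of `sequence` is allowed by the symbol
    at the same position of `pattern`."""
    pattern, sequence = _normalize(pattern), _normalize(sequence)
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, sequence))


def expand(pattern):
    """Return, in sorted order, every unambiguous sequence that `pattern` encodes."""
    choices = [sorted(IUPAC[code]) for code in _normalize(pattern)]
    return sorted("".join(combo) for combo in product(*choices))
```

For instance, matches("RY", "GC") is true because R admits G and Y admits C, whereas expand("AR") lists the two sequences AA and AG.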
REFERENCES
1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere. General.
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.
3. “Handbook of Biochemistry” (G. Fasman, ed.), 3rd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, 1975, Nucleic Acids, Vols. I and II, pp. 3-59. Nucleic acids.
4. “Enzyme Nomenclature” [Recommendations (1984) of the Nomenclature Committee of the IUB]. Academic Press, New York, 1984.
5. EJB 150, 1 (1985). Nucleic Acids (One-letter system).

Abbreviations of Journal Titles
Journals                                   Abbreviations used
Annu. Rev. Biochem.                        ARB
Annu. Rev. Genet.                          ARGen
Arch. Biochem. Biophys.                    ABB
Biochem. Biophys. Res. Commun.             BBRC
Biochemistry                               Bchem
Biochem. J.                                BJ
Biochim. Biophys. Acta                     BBA
Cold Spring Harbor                         CSH
Cold Spring Harbor Lab                     CSHLab
Cold Spring Harbor Symp. Quant. Biol.      CSHSQB
Eur. J. Biochem.                           EJB
Fed. Proc.                                 FP
Hoppe-Seyler's Z. Physiol. Chem.           ZpChem
J. Am. Chem. Soc.                          JACS
J. Bacteriol.                              J. Bact.
J. Biol. Chem.                             JBC
J. Chem. Soc.                              JCS
J. Mol. Biol.                              JMB
J. Nat. Cancer Inst.                       JNCI
Mol. Cell. Biol.                           MCBiol
Mol. Cell. Biochem.                        MCBchem
Mol. Gen. Genet.                           MGG
Nature, New Biology                        Nature NB
Nucleic Acid Research                      NARes
Proc. Natl. Acad. Sci. U.S.A.              PNAS
Proc. Soc. Exp. Biol. Med.                 PSEBM
Progr. Nucl. Acid. Res. Mol. Biol.         This Series
Some Articles Planned for Future Volumes

Structure and Transcription Regulation of Nuclear Genes for the Mouse Mitochondrial Cytochrome c Oxidase
    NARAYAN G. AVADHANI, A. BASU, C. SUCHAROV AND N. LENKA
Regulation of Translational Initiation during Cellular Responses to Stress
    MARGARET A. BROSTROM
Replication Control of Plasmid P1 and Its Host Chromosome: The Common Ground
    DHRUBA K. CHATTORAJ AND THOMAS D. SCHNEIDER
The Internal Structure of the Ribosome
    BARRY S. COOPERMAN
Tissue Transglutaminase-Retinoid Regulation and Gene Expression
    PETER J. A. DAVIES AND SHAKID MIAN
Genetic Approaches to Structural Analysis of Membrane Transport Systems
    WOLFGANG EPSTEIN
Intron-encoded snRNAs
    MAURILLE J. FOURNIER AND E. STUART MAXWELL
Mechanisms for the Selectivity of the Cell's Proteolytic Machinery
    ALFRED GOLDBERG, MICHAEL SHERMAN AND OLIVER COUX
Mechanisms of RNA Editing
    STEPHEN L. HAJDUK AND SUSAN MADISON-ANTENUCCI
The Hairpin Ribozyme: Discovery, Development, and Applications for Regulation of Gene Expression
    ARNOLD HAMPEL
Molecular Biology of Trehalose and the Trehalases in the Yeast S. cerevisiae
    HELMUT HOLZER AND SOLOMON NWAKA
Structure/Function Relationships of Phosphoribulokinase and Ribulosebisphosphate Carboxylase/Oxygenase
    FRED C. HARTMAN AND HILLEL K. BRANDES
The Nature of DNA Replication Origins in Higher Eukaryotic Organisms
    JOEL A. HUBERMAN AND WILLIAM C. BURHANS
Changes in Gene Structure and Regulation of E-cadherin during Epithelial Development, Differentiation, and Disease
    JANUSZ J. JANKOWSKI, FIONA K. BEDFORD AND YOUNG S. KIM
Function and Regulatory Properties of the MEK Kinase Family
    GARY L. JOHNSON et al.
The Formation of DNA Methylation Patterns and the Silencing of Genes
    JEAN-PIERRE JOST AND ALAIN BRUHAT
Mammalian DNA Polymerase Delta: Structure and Function
    MARIETTA Y. W. T. LEE
The Role of mRNA Stability in the Control of Globin Gene Expression
    J. ERIC RUSSELL, JULIA MORALES AND STEPHEN A. LIEBHABER
Mismatch Base Pairs in RNA
    STEFAN LIMMER
DNA Helicases: Roles in DNA Metabolism
    STEVEN W. MATSON AND DANIEL W. BEAM
Lactose Repressor Protein: Perspectives on Structure and Function
    KATHLEEN SHIVE MATTHEWS AND JEFFRY NICHOLS
Molecular Genetics of Yeast TCA Cycle Isozymes
    LEE MCALISTER-HENN AND W. CURTIS SMALL
Stimulation of Kinase Cascades by Growth Hormone: A Paradigm for Cytokine Signaling
    TIMOTHY J. J. WOOD, LARS-ARNE HALDOSÉN, DANIEL SLIVA, MICHAEL SUNDSTRÖM AND GUNNAR NORSTEDT
Immunoanalysis of DNA Damage and Repair Using Monoclonal Antibodies
    MANFRED F. RAJEWSKY
Bacterial and Eukaryotic DNA Methyltransferases
    NORBERT O. REICH
Self-glucosylating Initiator Proteins and Their Role in Glycogen Biosynthesis
    PETER J. ROACH AND ALEXANDER V. SKURAT
Transcriptional Regulation of Small Nuclear RNA Genes
    WILLIAM E. STUMPH
Bacillus subtilis as I Know It
    NOBORU SUEOKA
Oligonucleotides and Polynucleotides as Biologically Active Compounds
    V. V. VLASSOV, I. E. VLASSOVA AND L. V. PAUTOVA
The Mechanism of 3'-Cleavage and Polyadenylation of Eukaryotic pre-mRNA
    ELMAR WAHLE AND UWE KUHN
Molecular Genetic Approaches to Understanding Drug Resistance in Protozoan Parasites
    DYANN WIRTH et al.
Developmental Genome Reorganization in Ciliated Protozoa: The Transposon Link
LAWRENCE A. KLOBUTCHER*,1 AND GLENN HERRICK†

*Department of Biochemistry, University of Connecticut Health Center, Farmington, Connecticut 06030
†Oncological Sciences Department, Division of Molecular Biology and Genetics, University of Utah Medical Center, Salt Lake City, Utah 84132

I. Genome Organization and Reorganization in Ciliates
   A. Nuclear Dualism in Ciliates
   B. Conjugation, Macronuclear Development, and Genome Reorganization
   C. Types of DNA Rearrangement
II. Organization of Eliminated DNA Sequences
   A. Deleted DNA in Hypotrichous Ciliates
   B. Deleted DNA in Tetrahymena
   C. Deleted DNA in Paramecium
III. Mechanisms of Internal Eliminated Sequence Excision
   A. Analysis of Excision Products and Intermediates; Models of Excision
   B. Cis-acting Sequences
   C. Trans-acting Factors
   D. Relationship of Internal Eliminated Sequence Excision to DNA Rearrangement Processes
IV. Possible Functions of Internal Eliminated Sequences
V. Evolution of Ciliate Internal Eliminated Sequences by the Invasion, Bloom, Abdication, and Fading of Transposons
   A. Transposon Invasion in Ciliates
   B. Bloom; TBE Transposons
   D. Tetrahymena and IBAF Progression
   E. Phylogenetic Distribution of Internal Eliminated Sequences
   F. Future Directions
References
1 To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 56. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
Programmed somatic excision of interstitial segments of DNA, with the rejoining of flanking sequences, occurs in a variety of organisms ranging from bacteria to humans. For example, in the bacteria Anabaena and Bacillus subtilis, specific eliminations of large DNA interruptions are required to produce functional genes in the heterocyst and spore-mother, respectively (1-5) (see Section V,A). Similarly, in vertebrates the generation of immunoglobulin and T-cell receptor genes, and indeed the generation of immunological diversity, depend on specific DNA excision events (reviewed in 6). The origin of such rearrangement systems is unclear, but it has frequently been speculated that the excised DNA segments derive from either viruses or transposons that have evolved to coexist with their host organisms such that they are coordinated with normal differentiation pathways (2, 4, 7-9b). Such hypotheses have been bolstered by numerous observations of similarities among viruses, transposons, and site-specific recombination systems, and suggest that there is evolutionary fluidity between these classes of dynamic DNA elements. For example, the structural similarities, as well as mechanisms of replication and integration, of retroviruses and retrotransposons strongly suggest a relationship between these classes of elements. More generally, retroviral integration and transposition utilize related reaction mechanisms (reviewed in 10, 11), and at least some of the proteins that catalyze these events are homologous and structurally related (reviewed in 12, 13). Indeed, some of the proteins involved in the developmental DNA excision processes noted herein are related to proteins involved in site-specific DNA inversion and to the resolvases encoded by transposons (14, 15). Despite these intriguing links, the relationship between programmed DNA deletion events and viruses or transposons remains speculative. In this review we focus on programmed DNA deletion events in the ciliated protozoa.
In contrast to other species, for which programmed DNA deletion is limited to small numbers of genes, the ciliates typically undergo thousands of deletions as part of their normal development. In some ciliate species, large families of transposons comprise part of the DNA that is specifically deleted. The structures of such transposons are reviewed, along with the organization of other DNA deletion sequences, and the current understanding of the mechanisms of these processes. Finally, a hypothesis for the transposon origin of developmental DNA deletion in ciliates is discussed. Although this review concentrates primarily on DNA excision events during macronuclear development (previously reviewed in 16-19), a number of other DNA rearrangement processes that are of general interest occur in these organisms, including chromosome fragmentation, telomere addition, and DNA scrambling. Review articles covering these other aspects of ciliate DNA rearrangement have been published in recent years (16-18, 20, 21). In addition, the book by Gall, “The Molecular Biology of Ciliated Protozoa” (22), provides an excellent source of background information on the genetics and molecular biology of many of the ciliates considered here.
I. Genome Organization and Reorganization in Ciliates
A. Nuclear Dualism in Ciliates
Developmental DNA deletion has been demonstrated in all ciliate species that have been well-characterized at the molecular level, including the oligohymenophorans (Class Oligohymenophorea) Tetrahymena thermophila and Paramecium and the hypotrichs (Class Hypotrichea) Oxytricha nova, Oxytricha fallax, Oxytricha trifallax, Stylonychia lemnae (previously Stylonychia mytilus) (23, 24), Stylonychia pustulata, and Euplotes crassus. It should be noted that although all these organisms fall within the group referred to as ciliated protozoa, the group is ancient, having diverged from the main eukaryotic lineage approximately one billion years ago (25, 26) (see Section V,D and Fig. 9). Deep divisions exist within the ciliate evolutionary tree and this must be kept in mind when considering the differences in the DNA deletion processes among the organisms, because there has clearly been a substantial amount of time for diversity to develop, as well as for convergence to occur. Nonetheless, one of the common and defining features of the ciliate group is nuclear dimorphism, and it is this feature that allows for the extensive DNA rearrangement processes in these organisms (reviewed in 16-18, 20). Each cell has one or more micronuclei and macronuclei that play distinct roles during the life cycles of these organisms (Fig. 1). The micronuclei are small in size and in most aspects are similar to the nuclei of other eukaryotic organisms. That is, the micronucleus has its DNA organized into conventional chromosomes associated with histone proteins, and the micronucleus divides by mitosis at each asexual or vegetative cell division. Despite its relatively small size, the micronuclear genome in some species can be extremely complex, approaching that of humans. The one unusual aspect of the micronucleus is that it is transcriptionally silent during asexual growth of the organism.
As we will discuss, it plays its major role during sexual reproduction, and it is responsible for the genetic continuity of the organism. For these reasons, the micronucleus has often been viewed as a “germ-line nucleus.”
The second nucleus, the macronucleus, is responsible for nuclear transcription during asexual growth. It also replicates during asexual reproduction, but it is destroyed and formed anew during sexual reproduction. Because it does not transmit genetic information to sexual offspring, it is often viewed as an analog of nuclei in the somatic cells of multicellular eukaryotes.

FIG. 1. Simplified summary of nuclear events during asexual and sexual reproduction in ciliates. The diagram shows asexual cell division and, for the sexual cycle, three stages: meiosis, nuclear exchange, and fusion to form the zygotic micronucleus; division of the zygotic micronucleus while the macronucleus degenerates; and development of a new macronucleus from one of the zygotic micronuclei. See text for details.

The macronuclear genome comprises but a subset of the sequences present in the micronucleus, arranged in an unconventional manner. The hypotrichous ciliates represent an extreme situation in that all of the macronuclear DNA is present in the form of short, linear pieces ranging in size from about 500 bp to 20 kbp, and with an average size of about 2 kbp. Each of these macronuclear DNA molecules appears to carry a single coding region, along with the information required for its expression. These molecules also carry all the sequence information required for replication, and terminate with repeats of the sequence CCCCAAAA (C4A4 repeats), which serve as telomeres. For simplicity, we refer to macronuclear DNA molecules as “macronuclear chromosomes,” although they do not appear to possess centromeric functions. The large size of the macronucleus derives from the fact that individual macronuclear chromosomes are present in multiple copies, ranging from about 1000 to 15,000 per cell, depending on the species.
The macronuclear chromosomes of Tetrahymena and Paramecium are also smaller than the micronuclear chromosomes from which they derive. However, in these organisms the macronuclear chromosomes are much larger than those of the hypotrichs, and they carry many genes. The average sizes of macronuclear chromosomes in Tetrahymena and Paramecium are about 600 and 300 kbp, respectively. Each chromosome is present in about 45 copies per genome in Tetrahymena, and in about 1000 copies for Paramecium. The macronuclear chromosomes are again bounded by telomeres, which consist of C4A2 repeats in Tetrahymena, and C4A2 plus C3A3 repeats in Paramecium.
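The terminal repeat structure described above lends itself to a simple check. The following Python sketch is an illustration under stated assumptions, not a published method: the function name count_terminal_repeats is hypothetical, the sequence is assumed to be written 5' to 3', and only exact, in-phase copies of the repeat unit are counted (the unit itself at the 5' end, and its reverse complement at the 3' end of the same strand).

```python
import re

# Translation table used to complement a DNA string (upper case only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_terminal_repeats(seq, unit="CCCCAAAA"):
    """Count exact tandem copies of `unit` at the 5' end of `seq`, and of the
    reverse complement of `unit` at the 3' end. Returns (five_prime, three_prime)."""
    seq = seq.upper()
    unit = unit.upper()
    rc_unit = unit.translate(_COMPLEMENT)[::-1]  # CCCCAAAA -> TTTTGGGG

    # Leading run: as many whole units as possible, anchored at the start.
    leading = re.match(f"(?:{unit})*", seq).group(0)
    # Trailing run: the longest run of whole units that reaches the end.
    trailing = re.search(f"(?:{rc_unit})*$", seq).group(0)

    return len(leading) // len(unit), len(trailing) // len(rc_unit)


# Hypothetical molecule: two repeats, a short made-up insert, three repeats.
example = "CCCCAAAA" * 2 + "ATGCATGC" + "TTTTGGGG" * 3
print(count_terminal_repeats(example))
```

Passing a different unit (for example "CCCCAA") applies the same check to the other telomeric repeats mentioned in the text; repeats that end out of phase with the unit are not counted by this sketch.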
B. Conjugation, Macronuclear Development, and Genome Reorganization
Both the micronuclei and the macronuclei replicate their genomes and divide during each vegetative, or asexual, cell division (Fig. 1), which can occur for hundreds of generations. More rarely, cells undergo sexual reproduction. It is during this period that a new macronucleus is produced via rearrangement of the micronuclear genome. In response to starvation, ciliate cells mate or conjugate (Fig. 1), which induces a complex series of nuclear events. The details of the nuclear events vary with the species; a simplified description of the process is shown in Fig. 1. Once the cells pair, the micronucleus undergoes meiosis to generate four haploid nuclei. One of the haploid micronuclei mitotically replicates to generate two identical haploid gametic nuclei. For each of the cells, one of these haploid nuclei (the migratory pronucleus) is transferred to the other member of the cell pair, where it fuses with a resident haploid nucleus (the stationary pronucleus) to generate a new diploid nucleus termed the zygotic micronucleus. The cells then separate, and the zygotic micronucleus replicates its genome and divides. One of the two daughter nuclei serves as the new micronucleus for the cell, while the other undergoes DNA rearrangement to form a new macronucleus (the old macronucleus in the cell fragments, and the pieces become pycnotic and are ultimately lost).
The development of a new macronucleus is a multistep procedure. Again, the details vary with the species. For illustrative purposes, we describe the process of macronuclear development in hypotrichous ciliates. The developing macronucleus, or anlage, in these organisms displays distinctive cytological features. Although these cytological stages are not typical of other ciliate groups, they provide a visual indication of the molecular events that appear to be common in all groups. Macronuclear development in the hypotrichs begins with multiple rounds of DNA replication that result in the formation of polytene chromosomes. The level of polytenization varies with the species, but it can reach a ploidy as high as 64 (27). Once formed, the polytene chromosomes undergo fragmentation, and, concomitantly, vesicle-like structures form within the anlage that encase the individual chromosome fragments. It is during this period that chromosome fragmentation generates the short macronuclear chromosomes (28, 29). While the vesicles persist, large amounts of DNA are eliminated from the anlage (ultimately, >90% of the micronuclear genome complexity is eliminated in some hypotrich species; 30, 31). Finally, the vesicular structures break down, and the anlage again undergoes multiple rounds of DNA replication to give rise to the final ploidy level of the mature macronucleus. The entire process of macronuclear development occurs over a period of about 4 days in the hypotrich species.
Although the developing macronucleus of ciliates such as Tetrahymena does not display the unique cytology of the hypotrichs, analogous events appear to occur. For instance, although Tetrahymena does not produce visible polytene chromosomes, macronuclear development begins with multiple rounds of DNA replication (reviewed in 19, 20). During the period when the nucleus reaches a DNA content of 4-8C, chromosome fragmentation occurs. This is followed by additional DNA replication such that the mature macronucleus has about 45 copies of each macronuclear chromosome.
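The relationship between rounds of replication and copy number mentioned above can be made explicit with a little arithmetic. The Python sketch below is only a back-of-the-envelope aid and rests on two assumptions that the text does not state: every round of replication exactly doubles the DNA content, and the starting copy number is supplied by the user (1 by default). The function name doubling_rounds is hypothetical.

```python
import math


def doubling_rounds(final_copies, initial_copies=1):
    """Number of exact doublings that take `initial_copies` to `final_copies`.

    A non-integer result means the final value is not a power-of-two
    multiple of the starting value under the doubling assumption."""
    return math.log2(final_copies / initial_copies)


# A polytene level of 64 corresponds to six doublings from a single copy,
# or five doublings if one starts from two copies.
print(doubling_rounds(64))      # 6.0
print(doubling_rounds(64, 2))   # 5.0
```

Applied to a copy number such as 45, the function returns a non-integer value, which under these assumptions simply indicates that the figure is not a power of two.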
C. Types of DNA Rearrangement
Three main types of DNA rearrangement occur during macronuclear development (Fig. 2). One class of rearrangement is chromosome fragmentation. The magnitude of the chromosome fragmentation process varies with ciliate species and correlates with the ultimate sizes of the macronuclear chromosomes. About 40,000 chromosome fragmentation sites exist in the micronuclear genomes of the hypotrichs, whereas only about 50-200 such sites exist in the Tetrahymena micronuclear genome. The ends generated by chromosome fragmentation lack the short telomeric repeat sequences characteristic of the organism. These repeats are added to the newly generated
7
DNA DELETION IN CILIATES
A Micronucleus:
---
B Micronucleus:
Macronucleus: (C4Aq)nI
1
1
2
1
3
HC4A4)n
F16.2. Diagrams of the three major types of DNA rearrangement occurring during macronuclear development in ciliates. Macronuclear-destinedDNA sequences are indicated as open rectangles, and micronuclear-specificsequences as lines or black rectangles. (A) The excision of an IES (black rectangle) is illustrated, along with DNA fragmentation to generate the ends of two macronuclear chromosomes. Species-specific telomeric repeats (C3&& are added to the newly formed chromosome ends. (B) Representation of the “gene scrambling” process observed in some oxytrichids. Segments of micronuclear DNA are reordered, and in some cases inverted, to generate a macronuclear chromosome.
LAWRENCE A. KLOBUTCHER AND GLENN HERRICK
ends by the enzyme telomerase (32) in concert with, or very soon after, fragmentation (28, 33). Sequences directing the chromosome fragmentation process have been well defined in T. thermophila (34), and a candidate sequence that specifies fragmentation has been suggested in E. crassus (35). The second type of DNA rearrangement, interstitial DNA deletion, is the subject of this review (Fig. 2A). We define this strictly as the process of removing an interstitial segment of DNA (micronuclear-limited DNA), followed by the rejoining of immediately flanking sequences that are ultimately retained in the macronuclear genome (i.e., macronuclear-destined sequences). In the hypotrichs, the excised DNA sequences have historically been referred to as internal eliminated sequences (IESs), and we will employ this terminology, although it has not been applied uniformly in the ciliate literature. More recently, evidence for a third type of rearrangement has been obtained in the hypotrichs. The micronuclear copies of some genes appear not only to be interrupted, but also to carry their macronuclear-destined sequences in an order that differs from their ultimate arrangement in the macronuclear chromosome (Fig. 2B) (36, 37). The interruptions in such scrambled genes superficially resemble IESs, but the unscrambling process involves more than simply excising these sequences and rejoining the flanking regions. As a result, gene scrambling is viewed as a distinct process from IES excision and will not be covered in detail (but see Section III,D for a discussion of how IESs might be related to scrambling).
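As a purely illustrative aid, the reordering and inversion of macronuclear-destined segments described above for scrambled genes can be sketched in Python. The segment sequences, the `order` list, and the function names are hypothetical and are not taken from any of the cited studies; the sketch also ignores whatever sequence signals actually guide unscrambling in the cell.

```python
# Toy model of gene unscrambling: micronuclear segments are rejoined in a
# new order, and some are inverted (reverse-complemented) on the way.
# All sequences and names here are hypothetical.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def unscramble(micronuclear_segments, order):
    """Assemble a macronuclear sequence from micronuclear segments.

    order is a list of (segment_index, inverted) pairs given in the final
    macronuclear order; inverted segments are reverse-complemented.
    """
    parts = []
    for index, inverted in order:
        segment = micronuclear_segments[index]
        parts.append(reverse_complement(segment) if inverted else segment)
    return "".join(parts)

# Three segments as they appear in a hypothetical micronuclear locus.
segments = ["GGCC", "AAAT", "CTCT"]
# Macronuclear arrangement: segment 1, then segment 0, then segment 2 inverted.
order = [(1, False), (0, False), (2, True)]
print(unscramble(segments, order))
```

Simple IES excision corresponds to the special case in which `order` lists the segments in their micronuclear order with no inversions, which is one way to see why scrambling is treated as a distinct process.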
II. Organization of Eliminated DNA Sequences
IESs have been identified in a small number of distantly related ciliate species. General themes for the organization of IESs have emerged, but major differences in IES structure are evident both within and between species. In this section we focus on the organization of the two major types of IESs: (1) what we term “short” IESs, which represent unique, or low-copy-number, sequences in the micronuclear genome, and (2) much longer IESs that are found in the hypotrichs and are members of highly repetitive transposon families.
A. Deleted DNA in Hypotrichous Ciliates
1. SHORT INTERNAL ELIMINATED SEQUENCES
Short IESs ranging in size from 10 to 539 bp have been identified in six different species of hypotrichous ciliated protozoa: O. nova (36-39), O. fallax (40), O. trifallax (40a; A. Seegmiller and G. Herrick, unpublished), S. lemnae (41, 42), S. pustulata (43), and E. crassus (44-46) (C. Tebeau and C. Jahn, unpublished). Indeed, with one possible exception (49), the precursor of every macronuclear chromosome examined to date contains at least one, and usually multiple, IESs. Based on the analysis of small numbers of precursors of macronuclear chromosomes, extrapolations have been made to estimate the overall frequency of IESs in the micronuclear genome. These calculations indicate that more than 60,000 IESs are excised during macronuclear development in O. nova (39), and that on the order of 40,000 IESs exist in E. crassus (45) (L. A. Klobutcher, unpublished) and O. fallax (A. Seegmiller and G. Herrick, unpublished). Short hypotrich IESs are usually AT rich (generally >70% AT base pairs) and show no significant open reading frames. They are for the most part dissimilar at the primary sequence level, and thus appear to be part of the unique sequence DNA that is eliminated during macronuclear development. This is supported by hybridization analyses involving three O. nova IESs, which failed to identify any closely related sequences in the micronuclear genome (39). Although dissimilar in primary sequence, the hypotrich IESs share similar organizational features (Fig. 3a). With one exception (44), all are bounded by short direct repeat sequences of 2-7 bp, and one copy of this direct repeat is maintained in the macronuclear DNA molecule following IES excision. The sequence of the direct repeats varies from IES to IES in most of the hypotrich species. The sole exception is E. crassus, where the direct repeats of all characterized IESs can be viewed as 5′TA3′ (Fig. 3b). In addition, a number of IESs have short (<15 bp), and often imperfect, inverted repeats near their ends (Fig. 3a) (38, 39, 41, 42, 45). The overall dissimilarity in the sequences of the short IESs has led to suggestions that different classes of IESs exist both within and among organisms. In O. nova, at least two classes of IESs have been proposed to exist on the basis of short sequence similarities near their ends (39). Similarly, a third class of IES is suggested by terminal sequence similarities of an S. lemnae and an O. nova IES (42). For at least two of the above classes of IESs, the length and degree of sequence similarity is compelling (Class I in 39, 42), but additional IESs need to be sequenced to determine if all the proposed classes are significant. Moreover, not all of the known IESs can be placed in the existing classes (e.g., see 39, 40a). For example, there are at least five IESs in O. fallax that show no similarity to the proposed classes, nor to each other (40a). There also appear to be distinct classes of IESs in E. crassus, despite the fact that all of the IESs in this organism share the same TA terminal direct repeat.
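The excision geometry described above for short hypotrich IESs (a deleted segment bounded by short direct repeats, with one repeat copy retained in the macronuclear molecule) can be made concrete with a minimal Python sketch. The sequences, coordinates, and function name are hypothetical, and the choice of which repeat copy is counted as "retained" is arbitrary here, since the two outcomes are indistinguishable at the sequence level.

```python
# Minimal sketch of precise IES excision between direct repeats.
# Sequences and coordinates are invented for illustration only.

def excise_ies(seq, left_repeat_start, right_repeat_start, repeat_len):
    """Remove an IES bounded by direct repeats, keeping one repeat copy.

    The deleted span runs from the end of the left repeat copy to the end
    of the right repeat copy, so exactly one copy remains at the junction.
    """
    left = seq[left_repeat_start:left_repeat_start + repeat_len]
    right = seq[right_repeat_start:right_repeat_start + repeat_len]
    if left != right:
        raise ValueError("flanking direct repeats do not match")
    return seq[:left_repeat_start + repeat_len] + seq[right_repeat_start + repeat_len:]

# Hypothetical micronuclear locus: flank + repeat + IES body + repeat + flank.
left_flank, repeat, ies_body, right_flank = "GGCAT", "TAG", "AATTTATAAAT", "CCGTA"
micronuclear = left_flank + repeat + ies_body + repeat + right_flank
right_start = len(left_flank) + len(repeat) + len(ies_body)

macronuclear = excise_ies(micronuclear, len(left_flank), right_start, len(repeat))
print(macronuclear)
```

For E. crassus the same sketch applies with `repeat` fixed to `"TA"`, whereas for the other hypotrichs the repeat would differ from IES to IES.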
The two smallest Euplotes IESs (31 and 42 bp) differ from the other short IESs in that their sequences are more GC rich and resemble the telomeric repeat sequences of the organism (44). More importantly, although the larger IESs are excised late in the polytene chromosome stage of macronuclear development (29, 47), the 31- and 42-bp IESs are removed even later, at the start of the vesicle phase during the period when chromosome fragmentation and telomere addition are occurring (44). The similarity of these small IESs to the telomeric sequences, and the fact that they are removed during a period when telomere formation is occurring, led to the designation of this subgroup of Euplotes IESs as “TelIESs.”
[Figure 3 panel labels: a, hypotrich short IESs; b, Euplotes/Paramecium short IESs (28-882 bp; TA direct repeats); c, Oxytricha TBE1 transposon (4.1 kbp; ORFs of 42 kDa (transposase), 57 kDa, and 22 kDa); d, Euplotes crassus Tec1 element (5.3 kbp; ORF 1, transposase); e, Tetrahymena thermophila M region (0.9 kbp; A5G5 tracts); f, Tetrahymena thermophila Tlr1 deletion (19A and 19B repeats).]
FIG. 3. Organization of developmentally deleted DNA segments in various ciliate species. The deleted DNA segments are generally indicated as rectangles; terminal direct repeats are indicated in some instances. Brackets, when shown, denote excision boundaries. Thick arrows indicate inverted repeats. (a) Short hypotrich IES, with XYZ denoting the 2- to 7-bp terminal direct repeats that vary for different IESs. (b) Paramecium and Euplotes crassus short IESs with terminal TA direct repeats. (c) The Oxytricha TBE1 transposon. Arrows below the transposon denote three conserved open reading frames. (d) The Euplotes crassus Tec1 transposon. Arrows below the transposon again denote the major open reading frames. (e) The Tetrahymena thermophila M-region. The boundaries of the alternative 0.6- and 0.9-kbp deletions are indicated, as are the positions of the A5G5 sequence motifs. (f) The Tetrahymena thermophila Tlr1 deletion. The positions of the two repeated 19-bp sequence elements (19A and 19B) are indicated within the inverted repeats. Note that the Tlr1 deletion boundaries do not precisely coincide with the termini of the inverted repeats.
2. TRANSPOSON INTERNAL ELIMINATED SEQUENCES
In addition to the short IESs discussed thus far, widely divergent hypotrich species contain transposable elements that are IESs. The first such transposons were identified in O. fallax and were named TBE1-fal (telomere-bearing element 1 of O. fallax) elements (Fig. 3c) (48). TBE1-fal elements are 4.1 kbp in length. They are bounded by a 3-bp direct repeat of the target sequence ANT and 77- or 78-bp inverted terminal repeats. The elements derive their name from the fact that the distal 17 bp of the inverted terminal repeats consist of the hypotrich telomeric repeat sequence C4A4. Approximately 2000 TBE1s exist within the micronuclear genome, and all are eliminated during macronuclear development. About 2000 copies of a closely related transposon, termed TBE1-tri, exist in the haploid micronuclear genome of O. trifallax (49). The TBE1-tri elements cross-hybridize with the TBE1-fal elements. The similarity of the two elements is also supported by the fact that TBE1-tri elements are 4.1 kbp in length with 78-bp inverted repeats that are 88% identical to those of TBE1-fal elements, and that they also are bounded by an ANT direct repeat. More recently, the two Oxytricha species have been shown to harbor two additional related families of micronuclear-limited TBE elements, termed TBE2 and TBE3 (K. R. Williams, T. G. Doak and G. Herrick, unpublished). About 1000 copies of each of these 4.1-kbp elements exist in the micronucleus, but they do not cross-hybridize with the TBE1 elements. Although no TBE1 transposition event has been detected in the lab (transposition is normally detected by the phenotype of the mutation generated, but TBE1 transpositions are expected to be phenotypically silent because they are limited to the micronucleus), other types of evidence indicate that TBE1s are transposons. Their high copy number in micronuclear DNA is consistent with transpositional proliferation. Moreover, TBE1 structure is characteristic of a large class of transposon types that transpose by a nonreplicative “cut-and-paste” mechanism; i.e., they possess inverted terminal repeat sequences, and these are highly conserved (48, 49). Most transposons also generate target site duplications at their sites of insertion (reviewed in 50), and the terminal ANT direct repeats of TBE1 elements can be viewed as such target site duplications. Also, analyses of the sequences flanking TBE1 elements, and of sequences of “empty” sites allelic to insertions, show that TBE1s frequently interrupt the sequence CANTG, with the central ANT always being duplicated (49).
Such insertional specificity is not uncommon among transposons [e.g., Tn10 (51) and the Tc1/mariner/IS630 family (13)]. Additional evidence that TBE1s are transposons derives from the analysis of micronuclear alleles of the 81-locus (40a). Alternative processing of the 81-locus generates a nested set of three macronuclear chromosomes, bearing one or two of three expressed protein-coding genes, including one encoding a mitochondrial solute carrier protein (40; A. Seegmiller, K. R. Williams and G. Herrick, unpublished). Multiple alleles of this locus have been identified in both O. fallax and O. trifallax. All the alleles in both species are interrupted by five short IESs at invariant positions, indicating that the short IESs arose, and were fixed, prior to the divergence of the alleles. In contrast, only one allele in each species is interrupted by TBE1 elements: one O. fallax allele bears two TBE1-fal elements interrupting a gene, and one O. trifallax allele bears one TBE1-tri element, also interrupting the same mitochondrial solute carrier gene (48, 49). This situation can be interpreted as resulting from either the loss of TBE1 elements from most alleles or the recent insertion of TBE1 elements into some alleles. Phylogenetic analysis of the different alleles is consistent with the latter scenario (40a), so that the situation clearly resembles what one would expect for a transposon. A further strong indication that TBE1s are transposons has emerged from the sequence analysis of several TBE1 elements (D. Witherspoon, T. G. Doak, K. R. Williams and G. Herrick, unpublished). The sequence of TBE1-fal-1 shows five large open reading frames, of which three have been evolving under selection for function of the encoded protein (described later; Fig. 3c). One of these conserved ORFs encodes a 22-kDa protein of unknown function. The other two ORFs encode proteins with strong similarities to known proteins. One encodes an apparent protein kinase with N-terminal Cys-Cys-His-His-type zinc fingers, and the other encodes a protein homologous to the transposases encoded by a number of established transposons found in a wide range of eukaryotic and prokaryotic hosts. These transposases include those of the families of the Tc1s of nematodes, the mariners of insects, and IS630-related transposons of bacteria, as well as the retroviral integrase proteins and the IS3 transposase family (13). These transposases show a conserved motif with a cluster of acidic residues (the so-called Asp-Asp-35-Glu, or D,D35E, motif) known to coordinate Mg2+ in the transposase active site (52). Subsequently, a number of other well-characterized transposases have been shown to carry homologous D,D35E motifs: muA of bacteriophage Mu (53), the Tn10 transposase (54), and the tnsA and tnsB proteins of Tn7 (55). Not only do these findings indicate that TBE1s are transposons, but they also provide critical insights into the mechanisms of action of their transposases, because characterized D,D35E transposases often function by a cut-and-paste strand-transfer reaction (reviewed in 10, 11).
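The target-site behavior attributed to TBE1 elements earlier in this section (insertion into CANTG with duplication of the central ANT) is the kind of footprint expected from a cut-and-paste insertion, and it can be illustrated with a short Python sketch. The target sequence, the miniature "element", and both function names are hypothetical; the sketch models only the sequence outcome, not the strand-transfer chemistry.

```python
import re

# Toy model of insertion with a 3-bp target site duplication (TSD).
# The target sequence and the miniature element are invented for illustration.

def find_target_sites(seq):
    """Return start indices of the central ANT in every CANTG match."""
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() + 1 for m in re.finditer(r"(?=CA[ACGT]TG)", seq)]

def insert_with_duplication(seq, ant_start, element):
    """Insert element after the ANT at ant_start, duplicating that ANT."""
    ant = seq[ant_start:ant_start + 3]
    return seq[:ant_start + 3] + element + ant + seq[ant_start + 3:]

target = "GGTCAGTGAAC"            # contains one CANTG site (CAGTG)
element = "CCCCAAAAGATCTTTTGGGG"  # stand-in with C4A4 at its outer ends

sites = find_target_sites(target)
inserted = insert_with_duplication(target, sites[0], element)
print(sites, inserted)
```

In the resulting sequence the element is flanked on both sides by AGT, which is the arrangement described in the text as an ANT direct repeat; removing the element together with one AGT copy would regenerate `target`.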
Additional studies indicate that the TBE1s are not only transposons, but that they are transpositionally active (D. Witherspoon, T. G. Doak, K. R. Williams and G. Herrick, unpublished). The three major open reading frames (Fig. 3c), including the transposase gene, have been sequenced in 10 cloned TBE1s, and the data subjected to a divergence analysis. The latter involves calculating the realized fractions of potential synonymous (Ds) and nonsynonymous (Dn) codon mutations between each pair of matching sequences. High values of Ds/Dn imply that an open reading frame (ORF) has been under selection for function of its encoded protein. Ds/Dn ratios obtained from the TBE1 analysis are high (~15-17), indicating that selection has been acting on these genes during the time since these elements were created and have been diverging from one another. Further support for the conservation of the TBE1 ORFs comes from polymerase chain reaction (PCR) studies (D. Witherspoon, T. G. Doak, K. R. Williams and G. Herrick, unpublished). PCR amplifications of segments of TBE1 elements were performed on genomic DNA under conditions where ~70% of the elements were sampled. Homogeneously sized PCR products resulted, arguing that TBE1s are conserved in length. More importantly, bulk sequencing of the PCR products indicates that they are conserved in sequence. Some sequence ambiguities were present, but these were primarily synonymous. Thus, the data again indicate that TBE1s have diverged under selection for the function of their three genes. Because one of the ORFs encodes a transposase, this implies that a large fraction of the elements is transpositionally active. Although TBE1 elements show clear similarities to the Tc1/mariner/IS630 group of transposons, they differ in two respects. First, the other members of this group create TA target site duplications (reviewed in 13), as compared to the ANT duplications of the TBE1s. Second, the telomeric repeats present at the ends of TBE1 elements are not seen in any other member of the group. The origin of the repeats is unclear, but it has been suggested that a free linear form of an ancestral ciliate transposon acquired the repeats by the same telomere addition process that operates on the ends of macronuclear chromosomes during development, and subsequently reinserted into the micronuclear genome (48). That the telomeric repeats are conserved within TBE1s (49), and between TBE1s, TBE2s, and TBE3s (T. G. Doak, K. R. Williams and G. Herrick, unpublished), implies that the repeats serve some function for the elements (e.g., binding host telomere-binding proteins during transposition or excision). At least some TBE elements are not only transposons but also are IESs. All TBE elements are eliminated during macronuclear development, but in a few cases it has been demonstrated that individual elements are precisely excised as a unit from the macronuclear-destined sequences they interrupt, along with one copy of the ANT direct repeat. This results in the exact somatic reversion of the respective germ-line insertion mutation.
The first key evidence in this regard came from the analysis of the two TBE1-fal elements (TBE1-fal-1 and TBE1-fal-2) that interrupt one of the micronuclear alleles of the gene encoding the mitochondrial solute carrier protein in O. fallax. This micronuclear allele was shown to produce an actively transcribed macronuclear gene from which the TBE1 elements were precisely excised (56, 57). Similarly, the micronuclear allele of the same gene that carries a TBE1-tri transposon in O. trifallax also produces a functional macronuclear chromosome with full efficiency and fidelity (49). More recently, TBE2 and TBE3 elements have been identified that interrupt macronuclear-destined sequences, and these have been shown to produce macronuclear chromosomes from which the elements are precisely excised (G. Herrick, K. W. Williams and T. J. Doak, unpublished). Finally, relatively abundant free circular forms of unit-length TBE1, TBE2, and TBE3 transposons have been observed during O. trifallax macronuclear development (49; K. W. Williams, unpublished results). These circles represent a significant subset of the element family, arguing that many TBE1s are excised as discrete units, and thus are IESs. The other hypotrichous ciliate known to harbor transposon IESs is E. crassus. The initial transposon family identified was termed Tec1 (transposon of E. crassus 1). Tec1s are present in 5000-7000 copies per haploid micronuclear genome (45, 58, 59). They are 5.3 kbp in length and have inverted terminal repeats of about 700 bp (Fig. 3d) (59-61). Like the short IESs in E. crassus, the Tec1 elements are bounded by TA direct repeats. The estimate of 5000-7000 copies of Tec1 elements per haploid genome is based on elements that conform closely to a consensus restriction map; perhaps an equal number of copies that have apparently diverged from the consensus map also exist within the micronuclear genome (45, 62). Sequencing has revealed that each Tec1 element carries three long open reading frames (Fig. 3d) (60). ORF 1 encodes a protein that is homologous to the transposase encoded by the Oxytricha TBE1 elements and to the transposases of the Tc1/mariner/IS630 family of transposons (13), all members of which create TA target site duplications on insertion. The other two ORFs encode basic proteins of unknown function. As is the case for the TBE1 elements, transposition of Tec1 elements has not been observed in the laboratory. However, the structure of Tec1 elements, their high copy number, and the fact that they encode a protein homologous to known transposases all argue that Tec1 elements are indeed transposons (although not necessarily presently active; see Section V,C). Subsequent to the discovery of the Tec1 transposons, a second related group of elements (Tec2) was identified in E. crassus (63). Like Tec1 elements, Tec2s are 5.3 kbp long and are present at a similar copy number in the micronuclear genome.
Tec2 elements are bounded by TA direct repeats and have inverted terminal repeats that are again about 700 bp long, but only the first 300 bp show significant sequence similarity to the Tec1 elements (60). Internal regions of the Tec2 elements show only weak cross-hybridization to corresponding regions of the Tec1 elements, but sequence analysis indicates that they are clearly related. Tec2 elements contain open reading frames capable of encoding proteins similar to those of ORF1 and ORF3 of the Tec1 element (60). A region corresponding to the Tec1 ORF2 is also present, but it appears to have been divided into two adjacent ORFs by a single base insertion. It has been suggested (60) that a programmed frameshift similar to the types found in retroviruses (64) could be responsible for producing a single ORF2 protein in Tec2 that would be equivalent to the corresponding Tec1 protein. At least some members of the Tec families are IESs. Three examples of Tec elements interrupting macronuclear-destined sequences have been described, and all have been shown to excise such that one copy of the terminal TA direct repeat is retained in the resulting macronuclear DNA (45, 63). More globally, large numbers of free, unit-length, extrachromosomal circular and linear copies of Tec elements are observed in the DNA of cells undergoing macronuclear development (59, 61, 63), arguing that Tec elements are excised as units from many positions in the micronuclear genome. Whether every Tec element is excised as a unit is less clear. Recent work indicates that some Tec elements are eliminated during the polytene chromosome stage of macronuclear development, whereas others are eliminated later during the onset of the vesicle stage (65, 66). The latter elements reside within other sequences that are subject to developmental elimination, raising the possibility that they are not excised as units. The terminal direct repeats, and in some cases inverted terminal repeats, of the short hypotrich IESs resemble the organization of many transposons, and this has led to suggestions that they are related to transposons (17, 38, 39, 48, 49, 67). This topic is discussed in detail in Section V, but briefly, the notion is that transposons capable of developmental excision have populated the micronuclear genomes of ciliates, and that over the course of evolution, some transposon copies have undergone mutation and deletion events to form the short IESs. The identification of hypotrich transposon IESs has strengthened this notion. The situation is particularly striking in E. crassus, where both the short IESs and Tec elements possess terminal direct repeats of TA. Indeed, this similarity does not appear to be coincidental, in that a recent analysis of E. crassus short IESs indicates that six of the first seven terminal bases of the IESs are nonrandom in sequence, and that the deduced consensus sequence of this conserved region is similar to the sequence of Tec element ends (68) (see Section III,B,2 and Fig. 7).
Some indication that the Tec and IES excision processes might be distinct came from studies of the timing of excision during macronuclear development. Both types of elements are excised during the polytene chromosome stage of macronuclear development, but early studies indicated that Tec elements were removed 10-15 hr prior to the short IESs (29, 47, 59). Even this apparent discrepancy has been minimized, however, as more recent work indicates that two distinct periods of Tec element excision occur during the polytene chromosome stage, with the later period coinciding with the period of excision for the small IESs (66).
B. Deleted DNA in Tetrahymena
The first clear evidence for DNA breakage and rejoining events in ciliates was obtained from studies on the oligohymenophoran T. thermophila in 1984. Yao et al. (69) isolated an anonymous 9.5-kbp cloned segment of micronuclear DNA and compared it to the corresponding region of the macronuclear genome by a combination of restriction mapping and Southern hybridization analyses. The micronuclear DNA contained three segments of DNA referred to as the R, M, and L regions (1.1, 0.9 or 0.6, and >2 kbp in size, respectively) that were absent from the corresponding region of the macronuclear genome. Similarly, Callahan et al. (70) used Southern hybridization to examine the micronuclear and macronuclear organizations of the α-tubulin gene in Tetrahymena and found that segments of about 1 kbp were excised from both the 3′ and 5′ flanking regions of the gene. Well-characterized deletion segments have since been identified near the calmodulin gene (71), within a gene of unknown function (the mse2.9 IES) (72), and in an anonymous micronuclear region (73). One of the unusual, and perhaps distinguishing, features of DNA deletion in Tetrahymena is that no IESs have been found that interrupt the coding regions of genes, although the mse2.9 IES is located within an intron (72). Although only a handful of DNA deletion segments have been well characterized in this system, such segments appear to be quite common. Surveys employing random cloned segments of micronuclear or macronuclear DNA in hybridization analyses lead to estimates of approximately 6000 DNA deletion events occurring per genome (69, 74). If the average IES is about 2 kbp, which is not unreasonable given the IESs characterized to date, DNA deletion events can account for the majority of the 15% of micronuclear sequences that are eliminated during macronuclear development in Tetrahymena (75). All of the sequenced Tetrahymena IESs are AT rich and contain no significant open reading frames, but otherwise show little sequence similarity and few universally conserved organizational motifs. To illustrate this point, the M-region IES (Fig. 3e) is considered first, because it represents the most thoroughly studied IES.
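The estimate given above (about 6000 deletion events with an average IES of about 2 kbp accounting for most of the 15% of sequences eliminated) can be checked with a few lines of arithmetic. The Python sketch below is only a worked calculation: the function names are hypothetical, "the majority" is taken here to mean at least half, and no genome size is assumed because none is stated in the text; instead the sketch reports the largest genome size for which the statement would hold.

```python
# Worked arithmetic for the Tetrahymena deletion estimate.
# Inputs (6000 events, ~2 kbp each, 15% eliminated) come from the text;
# interpreting "majority" as at least one half is an assumption.

def total_deleted_bp(n_events, mean_ies_bp):
    """Total DNA removed if each deletion event removes mean_ies_bp."""
    return n_events * mean_ies_bp

def max_genome_for_majority(deleted_bp, eliminated_fraction=0.15):
    """Largest genome size (bp) at which deleted_bp is at least half of
    the DNA eliminated during macronuclear development."""
    return deleted_bp / (0.5 * eliminated_fraction)

deleted = total_deleted_bp(6000, 2000)
limit = max_genome_for_majority(deleted)
print(f"{deleted / 1e6:.0f} Mbp deleted by IES excision")
print(f"majority of eliminated DNA for genomes up to ~{limit / 1e6:.0f} Mbp")
```

The calculation is sensitive to the assumed mean IES size: doubling `mean_ies_bp` doubles both the deleted total and the genome size at which the statement still holds.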
The M-region can undergo two alternative forms of rearrangement such that either 0.9 or 0.6 kbp of DNA is eliminated (76) (Fig. 3e). The two alternative deletions differ in their left boundaries, but share a common right boundary. Sequencing of the M-region (77) revealed that short direct repeats of 8 and 5 bp are located near the boundaries of the 0.9- and 0.6-kbp deletions, respectively. An additional feature noted in the analysis was copies of the sequence AAAAAGGGGG (A5G5) located about 45 bp 5′ from each of the three M-region deletion boundaries (i.e., within the sequences flanking the IES that will be retained in the macronucleus) (Fig. 3e). As discussed in more detail later (Section III,B), these flanking sequences are essential for IES excision and serve to define the end points of the deletion (78, 79). For three of the other four sequenced IESs, short direct repeats (4-6 bp) are seen in the vicinity of the deletion boundaries (71, 72, 80). While all of the direct repeats are composed primarily of AT base pairs, they each differ in primary sequence. More importantly, none of the other Tetrahymena IESs possesses, in its flanking sequences, the A5G5 sequence element that is essential for the M-region deletion. Short palindromic regions have been noted that flank the R-region IES (80), the IES that flanks the calmodulin gene (71), and the mse2.9 IES (72), but there is as yet no evidence that these sequence elements play a role in the excision process. In contrast to the hypotrich systems, there is as yet no clear evidence in Tetrahymena for high-copy-number transposon-like elements that are subject to developmental excision. Southern hybridization analyses indicate that the L, R, and calmodulin gene-associated IESs are related to small families of sequences in the micronuclear genome, and that most members of these families are eliminated during the course of macronuclear development (71, 76). However, none of these additional family members has been analyzed in detail, so that it is unclear if these sequences constitute similar units of excision or simply reside within larger segments of DNA that are subject to elimination. This same situation applies to a number of repetitive sequence families that have been identified within the micronuclear genome (e.g., 81-84). One repetitive sequence family, the Tel-1 elements, is a candidate for a transposon family (85). The family was identified based on isolating micronuclear clones that contain internal blocks of the C4A2 telomeric repeat sequence characteristic of the organism. Each clone analyzed had at least 20 copies of the telomeric repeat, plus an adjacent conserved 30-bp sequence. One clone had two copies of the telomeric repeat and associated sequence arranged in inverted orientation. This suggests that the Tel-1 elements have a transposon-like structure and are in this way analogous to the TBE elements in O. fallax and O. trifallax.
Other analyses indicate that the Tel-1 repetitive sequence family consists of members that are primarily 9.7 or 13.2 kbp long, and that at least some members of this family are unstable, and thus possibly transpositionally active, in the micronuclear genome (86). Most of the family members are eliminated during macronuclear development, but, again, it is not clear whether the Tel-1 elements are the unit of excision or simply reside within larger stretches of developmentally eliminated DNA. The one sequenced rearranged segment of DNA in Tetrahymena that most closely resembles a transposon in organization is Tlr1 (Tetrahymena long repeat element), which is at least 13 kbp long (Fig. 3f) (73). The two termini of Tlr1 have been cloned and sequenced separately, so that it is not yet clear that it undergoes a simple deletion as an IES. However, both ends of Tlr1 reside on the same micronuclear chromosome, so its removal by a simple deletion process seems likely. Inverted repeats (825 bp) lie near the two boundaries of the Tlr1 deletion, and two distinct 19-bp sequences are repeated multiple times within the inverted repeat. This terminal sequence organization is reminiscent of the structure of TU transposons first identified in sea urchins (reviewed in 87). The Tlr1 element undergoes alternate forms of rearrangement. The major class of deletion has boundaries that do not coincide with the ends of the inverted repeat (indeed one deletion boundary is such that more than 200 bp of one inverted terminal repeat is retained in the macronucleus), but some of the minor deletion products have boundaries that coincide more closely with the ends of the repeats (73). Hybridization experiments indicate that the two 19-bp repeated sequences are associated with each other at 6 or 7 additional sites within the micronuclear genome and that each of these is subject to developmental elimination. Part of one additional member of the family has been cloned (Tlr2), and it also appears to reside adjacent to a site of developmental DNA rearrangement. The current view is that Tlr1 is a representative of a small family of related IESs that possess similar termini. Additional studies are needed to determine if these IESs share conserved internal regions, if they possess other features of typical transposons, and if the rearrangement events associated with the other family members more closely coincide with the repeated sequence boundaries.
C. Deleted DNA in Paramecium Despite the fact that the oligohymenophoran Paramecium has historically been one of the most intensively studied ciliated protozoa, the analysis of DNA deletion during macronuclear development in this organism is just beginning. Work in this area had been hampered by the inability to isolate micronuclei, and hence to obtain micronuclear DNA in pure form. The recent development of a method to accomplish this task (88) has allowed the micronuclear counterparts of a number of macronuclear genes to be cloned and has revealed numerous IESs. Indeed, Paramecium now represents the ciliate species for which the largest number of IESs have been sequenced. All of the IESs identified to date exist in and around members of a multigene familywhose members encode alternatively expressed major cell surface proteins referred to as immobilization antigens (i-antigens). The micronuclear copies of all or part of six different i-antigen genes have been analyzed and 21 IESs have been identified (89-92; H. Schmidt, unpublished). If the i-antigen genes are representative of the general distribution of IESs in the micronuclear genome, this would imply that approximately 65,000 IESs undergo developmental excision during macronuclear development (93). The Paramecium IESs are again AT-rich sequences that range in size from 28 to 882 bp. Moreover, 5 of the 21 known IESs are 28 bp long, and it has been suggested that this may represent the minimum size necessary for effcient developmental excision (94). As in E . crassus, all of the Paramecium IESs can be viewed as possessing TA direct repeats at their termini (Fig. 3b),
and this suggests that the DNA excision processes in the two organisms are closely related. Further support for this notion derives from a statistical analysis of the sequences at the ends of 20 of the Paramecium IESs (94). The results indicate that the first 8 bp of the IES ends are well conserved, and imply that the ends of the IESs are short inverted repeats. The consensus sequence derived from this analysis is TAYAGYNR, which is quite similar to the termini of the E. crassus Tec transposons, and more generally to the Tc1-related family of transposons [the implications of this similarity are more fully discussed in Sections III,B (see Fig. 7) and V]. No transposon-like elements have been identified in Paramecium, but there has not yet been a systematic attempt to identify such species. There is also no information on the timing of IES excision events during macronuclear development in Paramecium. The precision of excision has not been broadly addressed, but based on Southern hybridization analyses (88-93), and the sequence analysis of single macronuclear and micronuclear clones, excision appears to be precise, with one copy of the TA direct repeat retained in the macronuclear DNA molecule. Precise excision is expected, because many of the identified IESs interrupt coding regions. However, a recent analysis has identified a region near the D surface antigen gene in the macronucleus that is highly variable in size (95). Sequencing of this region from multiple macronuclear chromosomes suggests that it is derived by the variable excision of multiple DNA segments that are bounded by TA repeats. Isolation and sequencing of the corresponding region of the micronuclear genome will be required to demonstrate that variable IES excision is the basis of the phenomenon, but the results suggest that there may be greater variability in the efficiency, and perhaps fidelity, of excision in noncoding DNA.
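The comparison of IES ends with the TAYAGYNR consensus can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`revcomp`, `consensus_regex`, `ends_match`) and the example sequences are hypothetical and are not taken from the cited analysis (94), which was statistical rather than a simple pattern match. The sketch expands the IUPAC ambiguity codes (Y = C or T, R = A or G, N = any base) and tests the left end directly and the right end as its reverse complement, which is what an inverted terminal repeat implies.

```python
import re

# IUPAC ambiguity codes needed for the consensus TAYAGYNR
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def consensus_regex(consensus):
    """Compile a degenerate consensus into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def ends_match(ies, consensus="TAYAGYNR"):
    """Return (left_ok, right_ok) for an IES given with both TA repeats.

    The right end is reverse-complemented first, so a True/True result
    means the two ends form an inverted repeat matching the consensus.
    """
    pattern = consensus_regex(consensus)
    n = len(consensus)
    left_ok = bool(pattern.fullmatch(ies[:n]))
    right_ok = bool(pattern.fullmatch(revcomp(ies[-n:])))
    return left_ok, right_ok


# Hypothetical IES: both ends conform to the consensus
example = "TATAGTCA" + "ATTTTAAT" + "CTGCTGTA"
print(ends_match(example))
```

A real analysis would score each position across many IESs rather than accept or reject single sequences; the pattern match above only shows what conforming to the consensus means.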
III. Mechanisms of Internal Eliminated Sequence Excision
An understanding of the molecular mechanisms underlying DNA deletion in ciliates is clearly of interest, in part because it may also provide information on the origin of IESs. A detailed understanding of deletion mechanisms will require the reconstruction of such reactions in vitro, but none of the ciliate systems has yet progressed to this point. Nevertheless, insights into the mechanisms of deletion have been obtained through the analysis of the deletion products and intermediate forms. Some information concerning the DNA sequence requirements for excision events, and potential trans-acting factors, has also been obtained. These results are considered in this section, along with a discussion of how the DNA deletion process is potentially related to other DNA rearrangement events in ciliates.
A. Analysis of Excision Products and Intermediates; Models of Excision
Detailed studies of deletion products and excision intermediates have been performed only on a few ciliate species: E. crassus, O. trifallax, and T. thermophila. These species have been amenable to such types of analyses because mating can be controlled in the laboratory. This allows production of large populations of staged cells that can serve as a source of biochemically significant amounts of DNA or protein, as well as the correlation of rearrangement events with other cellular processes. The studies on deletion products and intermediates have yielded similar results in some instances, but significant differences have also been observed that suggest mechanistic differences.
1. DNA Excision Products in the Hypotrichs
a. Macronuclear Junctions. The hypotrich studies have focused on characterizing the two products of excision: the macronuclear DNA and the liberated IES. As noted in Section II,A, hypotrich short IESs, Tec elements, and TBE elements are all excised such that one copy of the direct repeat flanking these IESs is retained in the macronuclear sequence. In most cases, this conclusion is based on the comparison of sequences from one, or at most a few, macronuclear DNA clones. There have, however, been three studies that have more globally examined macronuclear DNA products (these will be referred to as “macronuclear junctions” or “excision sites”), and in each case the conclusion is that excision is precise and efficient, leaving behind one copy of the terminal direct repeat. The approach has been to use PCR to amplify macronuclear junctions and sequence the resulting PCR products to assess variability in excision. In E. crassus, this approach has been used to analyze excision sites in DNA isolated from developing macronuclei soon after excision occurs, and in DNAs isolated from multiple clonal cell lines arising from independent episodes of macronuclear development (96).
In the latter case, even though clonal cell lines were examined, multiple excisions were in effect analyzed, because excision occurs during the polytene chromosome stage of macronuclear development when multiple chromatids are present (29, 59, 66). Three short IESs and two Tec elements were examined, and in all cases the results indicated that single TA direct repeats were retained at the macronuclear junctions with no evidence of variability. The same type of approach has been used to examine the macronuclear excision products resulting from the removal of an O. fallax short IES (57), the O. fallax TBE1-fal-2 element (57), and the O. trifallax TBE1-tri-1 element (49). Again, only excision products retaining one direct repeat were observed. Although it is not possible to rule out completely a low level of variability in
the hypotrich excision process, there is not a single example where anything but a single copy of the direct repeat has been observed at an excision site. Thus, all evidence points to a high degree of fidelity in the hypotrich excision processes.
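The logic of the junction surveys described above (amplify macronuclear junctions, sequence them, and ask whether anything other than a single direct repeat is present) can be sketched as follows. This is a minimal sketch with invented data: the flank sequences, the reads, and the names `classify_junction` and `tally_junctions` are hypothetical and are not from the cited studies.

```python
from collections import Counter


def classify_junction(read, left_flank, right_flank, repeat="TA"):
    """Classify one sequenced junction by what lies between the flanks."""
    if len(read) < len(left_flank) + len(right_flank):
        return "other"
    if not (read.startswith(left_flank) and read.endswith(right_flank)):
        return "other"
    middle = read[len(left_flank):len(read) - len(right_flank)]
    if middle == repeat:
        return "one repeat"      # precise excision
    if middle == "":
        return "no repeat"
    if middle == repeat * 2:
        return "two repeats"
    return "other"


def tally_junctions(reads, left_flank, right_flank, repeat="TA"):
    """Count junction classes across a set of sequenced PCR products."""
    return Counter(
        classify_junction(r, left_flank, right_flank, repeat) for r in reads
    )


# Hypothetical reads: a uniform set, as reported for the hypotrichs
reads = ["GGCCTACCGG"] * 5
print(tally_junctions(reads, "GGCC", "CCGG"))
```

In this toy setting a result containing only the "one repeat" class corresponds to the reported outcome; flanks that themselves end in T or begin with A would make the classification ambiguous and would need more careful handling.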
b. Tec Transposon Circles. The nature of the IESs following excision has been examined in E. crassus and O. trifallax. The studies on Euplotes indicate that both the short IESs and Tec transposons assume free, circular forms following excision. In the case of the short IESs, evidence for circular forms has been obtained indirectly through the use of PCR strategies that are designed to produce an IES-derived product only if the IES assumes a circular form (47). PCR products of the expected sizes were obtained only from DNA of cells undergoing macronuclear development at times following the removal of short IESs, arguing that the circular forms are generated only in concert with developmental excision events. For the Euplotes Tec elements, the high copy number of these elements allowed free circular forms to be directly visualized by ethidium bromide staining of agarose gels of DNA isolated from developing macronuclei, as well as by Southern hybridization (59, 63). Such forms first appear at the onset of the polytene chromosome phase and persist throughout much of the remainder of macronuclear development. At this stage, there are insufficient data to demonstrate that DNA circles are the direct product of the excision reaction. However, one of the predictions for a primary excision product is that it should be produced in stoichiometric amounts relative to the reaction substrate. This prediction appears to be satisfied for the Tec1 elements. Quantitative analyses indicate that approximately 2 × 10^4 free forms per genome appear during macronuclear development (59). Even allowing for some DNA replication prior to excision, this large number of free elements indicates that a substantial portion of the Tec1 transposons contribute to the population of free circles. The small size of the short IESs, and the fact that they represent unique sequences in the genome, has made it difficult to quantitate directly the level of free circular forms.
Nonetheless, they are readily detectable by standard PCR procedures, arguing that the circular forms are not exceedingly rare, and that they could represent the primary excision product. The structures of the junctions of Tec and IES circles are similar but highly unusual. Given that the macronuclear junctions retain one of the TA direct repeats flanking these elements, it seemed quite likely that the junctions of the free circular short IESs and Tec transposons might also contain a single TA repeat. Instead, both the Tec and short IES circles contain two TA direct repeats at their junctions, separated by 10 bp of non-IES DNA (47, 59, 61, 97). For the Tec elements, this junction structure has been deduced by the
sequence analysis of individual cloned Tec1 and Tec2 junctions and by bulk sequencing of purified Tec circular forms (59, 61). In the latter case, the 10 bp separating the pair of direct repeats at the junction give the appearance of being random (i.e., no sequence is readable). Insight into the origin of the extra nucleotides at the junction has come from analyzing the circular forms of the short IESs. Because these sequences are unique in the micronuclear genome, it has been possible to design PCR strategies that amplify the junction from a single IES. Sequencing of such PCR products directly suggested that the 10-bp junction sequence is derived from bases flanking the IES, but some ambiguity at the central 6 bp of the junction was evident (47). Cloning and sequencing of the PCR products revealed that two types of junction sequence were present: one consisting of a TA direct repeat followed by 2 bp from the right side of the IES joined to 8 bp from the left side of the IES and the other TA repeat, and a second with 8 bp from the right side of the IES joined to 2 bp from the left side of the IES (97). One explanation for these two types of PCR products is that the free IES circles have junctions with 6-bp heteroduplexes at their centers, such that one strand of the heteroduplex is derived from left-flanking sequences, and the other is derived from right-flanking sequences (see Fig. 4). Evidence supporting the existence of a heteroduplex, as opposed to two distinct populations of free circles in the cell with different junctions, was obtained by employing “strand-biased PCR” (97). More direct evidence for a heteroduplex at the junction of the free Tec elements was obtained by Jaraczewski and Jahn (64), who demonstrated that the junctions of the Tec circles were not nicked or gapped, but were sensitive to digestion by S1, mung bean, and Bal31 nucleases, all of which have endonuclease activity for single-stranded DNA.
Based on the nature of the two excision products, models for the short IES and Tec element excision process have been proposed (61, 97). The two models are quite similar, but the Tec element model (61) is more general owing to the fact that the length of the heteroduplex at the junction could not be directly determined for these elements. Therefore, we present the model based on the short IESs (97), for which the structure and derivation of the junctional nucleotides is more clear (Fig. 4). The model accounts for the shared nucleotides between the macronuclear DNA molecule and the free circular short IES or Tec element by proposing that staggered cuts initiate the excision process and that subsequent filling in of the overhangs results in sequence duplication. At each element end, one DNA strand is cleaved internal to the TA direct repeat, while the other strand is cleaved 8 bp outside the element, generating DNA ends with 10-base 5' overhangs (5'-overhanging ends are predicted on the basis of the strand-biased PCR experiments) (97). Next, for the pair of ends destined to form the macronuclear junction,
[Figure 4 diagram. Labels: Short IES/Tec; Fill-in and Ligation; Macronuclear DNA Molecule; Free Short IES/Tec.]
FIG. 4. Model of Euplotes crassus short IES and Tec transposon excision (97). A short IES or Tec transposon within micronuclear DNA is shown at the top. The terminal TA direct repeats are shown along with the complementary left- and right-flanking micronuclear DNA sequences (B/b and D/d, respectively). Bases that will form the heteroduplex region in the free circular short IES or Tec transposon are indicated by lines. Arrowheads indicate the positions of DNA strand cleavage at each end of the element. Bases that result from new DNA synthesis are indicated in italics. Additional details of the model are described in the text.
their terminal TA residues base-pair, and then the gaps are filled in and ligated. This forms a macronuclear junction with a single TA repeat. The other pair of ends is processed differently. It is proposed that the excision machinery aligns these ends such that the terminal 6 bases of the overhangs overlap, even though they are for the most part incapable of base pairing. Fill-in of the gaps and ligation then generate the circular forms with two TA repeats separated by 10 bp, and a central 6-base heteroduplex. Overall, this model is novel and shows little similarity to other site-specific recombination processes, with the possible exception of the bacterial transposon Tn916. Free circular forms of Tn916 that have been identified appear to be intermediates in transposition, and these circles have a heteroduplex junction that derives from sequences flanking the integrated form (reviewed in 98). However, in contrast to the Euplotes IES junctions, the heteroduplex of the Tn916 circles spans the entire region between the transposon ends. Further studies on excision intermediates in Euplotes will be required to substantiate and refine this unusual model of excision.
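The bookkeeping of the staggered-cut model can be followed with a short Python sketch. It is an illustration only, built on one reading of the text and Fig. 4: the function name `excision_products`, the dictionary keys, and the flanking sequences are hypothetical. Given the sequence to the left and right of an element (excluding the TA repeats), the sketch returns the macronuclear junction with its single TA, and the two strands of the circle junction, each written 5' to 3' in the sense of the micronuclear top strand, reading from the right end of the element across the junction into its left end.

```python
def excision_products(left_flank, right_flank, repeat="TA"):
    """Predict junctions under the staggered-cut model (a sketch).

    left_flank / right_flank: micronuclear sequence immediately outside
    the left and right TA repeats (at least 8 bases each).
    """
    if len(left_flank) < 8 or len(right_flank) < 8:
        raise ValueError("need at least 8 bases of each flank")

    # Macronuclear junction: the flanks joined through one TA copy.
    macronuclear = left_flank + repeat + right_flank

    # Circle junction, strand 1: TA, 2 bp of right flank,
    # 8 bp of left flank, TA.
    strand_1 = repeat + right_flank[:2] + left_flank[-8:] + repeat

    # Circle junction, strand 2 (shown as its top-sense equivalent):
    # TA, 8 bp of right flank, 2 bp of left flank, TA.
    strand_2 = repeat + right_flank[:8] + left_flank[-2:] + repeat

    # Central 6 positions where the two strands derive from different
    # flanks and so are generally not complementary.
    heteroduplex = (left_flank[-8:-2], right_flank[2:8])

    return {
        "macronuclear_junction": macronuclear,
        "circle_strand_1": strand_1,
        "circle_strand_2": strand_2,
        "heteroduplex": heteroduplex,
    }


# Hypothetical flanking sequences
products = excision_products("GGCCAGGTTCAG", "CTGACCTTGGAA")
for name, value in products.items():
    print(name, value)
```

Both circle strands carry two TA repeats separated by 10 bases, and they differ only in the central 6 positions, which is the pattern that cloning of the junction PCR products would be expected to reveal as two sequence classes.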
c. TBE1-tri Transposon Circles. The fate of the excised DNA has also been examined for the TBE1-tri elements in O. trifallax (49). Again, free supercoiled TBE1 circles (4.1 kbp) have been identified by Southern hybridization to DNA isolated from cells undergoing macronuclear development. PCR amplification was then used to examine the nature of the TBE1-tri circle junctions. Bulk sequencing of the PCR product showed that the circle junctions consist of the two transposon inverted terminal repeats joined by the sequence ANT, where the second position is fully degenerate. Given that TBE1s are found to interrupt the sequence GANTC, with the central ANT duplicated at each element end, it seems that precise excision (removal of the element and one ANT duplication) cyclizes the element and the circle carries away one ANT duplication at its junction. A similar situation has been observed for the bacterial transposon IS911. Polard and co-workers (99) studied the effects of overproduction of IS911 transposase in trans on a plasmid carrying a copy of IS911. Two products resulted: first, the plasmid with the element removed and a single copy of the target duplication remaining (analogous to the macronuclear junction resulting from TBE1-tri excision), and second, IS911 circles with a single copy of the target duplication joining the two element ends (analogous to the TBE1-tri circles). Given that IS911 transposase was responsible for this reaction, and that IS911 transposase is a D,D35E homolog of TBE1 transposase (13), it was suggested that a modified TBE1 transposition reaction is responsible for TBE1 developmental excision (49). The proposed model for TBE1 transposon excision (Fig. 5) (49) is a variation on the known mechanism employed by cut-and-paste transposons (10, 11). A modified transposase produced from the TBE1-tri element is viewed as carrying out the reaction.
The transposase may be physically modified by the other two conserved transposon-encoded proteins (e.g., it might be phosphorylated by the zinc finger protein kinase), or its activity may be altered by interaction with these proteins. Excision is initiated by a staggered double-strand break at one end of the TBE1, producing two ends with underhanging 3'-OH groups and 5'-protruding ANT tails. The two 3'-OH groups then attack the corresponding phosphodiester bonds at the far end of the element (in essence being transposed to that position), because this sequence resembles the transposase target. The macronuclear DNA joint is covalently formed by the attack of the flanking 3'-OH at the bond 5' of the far ANT in the same strand, and one covalent TBE1 circle strand is formed by the attack of the element 3'-OH on the bond at the 5' end of the ANT duplication at the far end, liberating a 5'-P ANT protruding from the other strand. The macronuclear DNA joint is completed as this protrusion pairs with its complement, forming a ligatable 5'-P, 3'-OH nick. Similarly, the second strand of the element circle also forms an ANT duplex and another ligatable 5'-P,
[Figure 5 diagram. Labels: Micronucleus; Cleavage; Transesterification.]
FIG. 5. Model of TBE1 excision (49). A double-strand break at the borders of the ANT direct repeat (arrowheads) at one end of the TBE1 transposon initiates excision. The two resulting 3' hydroxyl groups participate in transesterification reactions with phosphodiester bonds of the direct repeat at the opposite end of the transposon, which resembles the transposition target sequence (highlighted in black). This yields a free TBE1 circle with one covalently joined DNA strand, along with the macronuclear DNA, also joined on one strand. Carets denote nicks that must be ligated to form the final products. See text for additional details.
3'-OH nick. Ligation then seals the nicks to generate the macronuclear DNA and the free circular TBE1-tri transposon, each carrying one copy of the target site duplication. One of the pleasing features of this model is that the free circular TBE1s can be viewed as a product of a transposition-like reaction, rather than as a transposition intermediate. Thus, the circular forms would not be expected to reinsert elsewhere. Excision of the element, only to have it reinsert elsewhere, would accomplish little in regard to generating functional macronuclear genes (49).
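The end result of the TBE1 model, in which the macronuclear DNA and the circle each retain one copy of the ANT target duplication, can be written out as a small sketch. The function name `tbe1_products`, the parameter `k`, and all sequences are hypothetical placeholders (a real TBE1 is about 4.1 kbp with inverted terminal repeats); only the sequence bookkeeping is shown, not the strand-transfer chemistry.

```python
import re


def tbe1_products(left, target, element, right, k=4):
    """Return (micronuclear, macronuclear, circle_junction) strings.

    target: the 3-bp ANT duplication flanking the element.
    k: number of element bases shown on each side of the circle junction.
    """
    if not re.fullmatch(r"A[ACGT]T", target):
        raise ValueError("target duplication must have the form ANT")

    # Integrated element flanked by the duplicated target
    micronuclear = left + target + element + target + right

    # Precise excision leaves one target copy behind
    macronuclear = left + target + right

    # The two element ends joined through the other target copy
    circle_junction = element[-k:] + target + element[:k]
    return micronuclear, macronuclear, circle_junction


# Hypothetical sequences chosen so the empty site reads GANTC
mic, mac, junction = tbe1_products("CCG", "AGT", "CAGGTTTTCCTG", "CGG")
print(mic)
print(mac)
print(junction)
```

In this toy case the macronuclear product regenerates a GANTC site, matching the description of the sequence that TBE1s interrupt.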
2. EXCISION INTERMEDIATES AND PRODUCTS IN Tetrahymena thermophila
a. Macronuclear Junctions. Detailed studies on deletion products and excision intermediates in Tetrahymena have been performed only on the M- and R-region deletions. The macronuclear products of excision differ significantly from those observed in the hypotrichs. As discussed previously, some of the Tetrahymena deletion sequences vary greatly in regard to the boundaries of excision (73, 74, 76). For example, the Tetrahymena M-region deletion can occur such that either 0.6 or 0.9 kbp of DNA is excised (Fig. 3e) (76). In addition to this macroheterogeneity in deletion boundaries, microheterogeneity also is observed. The experimental approaches used to address this question have involved direct sequencing of multiple macronuclear empty sites or PCR products derived from such regions (100-102), as well as Southern hybridization analyses employing oligonucleotide probes corresponding to known macronuclear junction sequences (80, 102). For the 1.1-kbp R-region deletion, two alternative sets of deletion boundaries have been defined (101, 102). One form of the deletion predominates, occurring >90% of the time. The minor form of the deletion has left and right boundaries that are each shifted 4 bp in the same direction, such that the same length of DNA is eliminated but the macronuclear junctions contain four different bases. The 0.6-kbp deletion of the M-region also displays two different boundaries (100). In this case the two types of deletions occur with equal frequency, and only the right deletion boundary is shifted, such that 13 bp is variably retained in the macronuclear DNA. For the 0.9-kbp M-region deletion, three different sets of closely spaced deletion boundaries have been observed at roughly equal frequencies, and these involve variability in both the left and right boundaries (100-102).
All of these R- and M-region deletions can be viewed as occurring with short direct repeats at their boundaries, with one repeat being retained at the macronuclear junction. However, for three of the macronuclear junctions, the repeat would only be a single base in length, which makes the significance of a terminal direct repeat in Tetrahymena questionable. Overall, the results on the M- and R-regions indicate that the deletions are not entirely precise, but are limited to a small number of outcomes. The macronuclear junctional microheterogeneity that results may explain why DNA deletion elements interrupting coding regions in this species have not been found.
b. The Fate of Excised Elements. Analysis of the second product of excision, the liberated IES, has been difficult in Tetrahymena. Initial attempts to detect free excised forms by hybridization approaches were unsuccessful (76, 103), arguing that the eliminated DNA sequences were rapidly degraded following excision. Two more recent attempts to detect excised IESs have
relied on PCR approaches designed to detect free circular products (100, 101), as was done in the hypotrichs. In contrast to the hypotrichs, single rounds of PCR amplification did not readily yield the expected PCR products for M- or R-region free circular forms. However, products of the expected size could be detected if the PCR products were subjected to Southern hybridization analysis (101), or directly if a second round of PCR was performed using a nested set of primers (100). These results indicate that circular forms of IESs are present within the cell, but that, in contrast to the hypotrichs, they are extremely rare. The products are, however, specific to the deletion process. PCR products are observed only when the substrate DNA is isolated from cells undergoing macronuclear development after the time of deletion events. Individual PCR products have been cloned and sequenced to define the circle junctions, and in many cases they correspond to what might be predicted from the known macronuclear junction sequences. That is, the circle junctions correspond to the expected product of a simple reciprocal recombination event, or to a transposase-based reaction as proposed for the TBE transposons (Fig. 5). However, not all of the expected circle junctions were observed, and others were difficult to explain, including some that contain substantial lengths of sequence from outside the known deletion boundaries (100). The overall results, particularly the scarcity of circular forms, led to the view that free circles are not the primary excision product in Tetrahymena. It seems more likely that the Tetrahymena IESs are initially excised as free linear forms and rapidly degraded. Occasionally, the linear forms undergo secondary processing to yield the observed circular IESs.
c. Analysis of Cleavage Intermediates. In lieu of clear-cut information on the nature of the IES following excision, studies examining potential in vivo excision intermediates have provided the best insight into the Tetrahymena excision mechanism. Saveliev and Cox (104, 105) have used anchored-PCR approaches to detect and characterize DNA breaks that occur at the ends of the M and R IESs during the course of development. The initial study employed a ligation-mediated PCR (LMPCR) strategy designed to detect phosphorylated 5' DNA ends near the boundaries of the R-region, the 0.6-kbp M-region, and the 0.9-kbp M-region. LMPCR products of the expected sizes were observed in DNA from cells undergoing macronuclear development at the known time of excision events, but not in DNAs from cells at other times of development or from vegetatively growing cells. Sequence analysis of the LMPCR products allowed the positions of the breaks to be determined. With two exceptions, breaks were detected at positions that were consistent with all the known macronuclear junction products derived from the M- and R-regions. That is, by joining a break observed at one end of a
deleted region with a break observed at the opposite end, most of the observed macronuclear junctions could be generated. Moreover, two general rules emerged concerning the positions of DNA breaks. First, the positions of the breaks were such that A residues were present at the predicted 3' ends (in one case a G residue was present). Second, for each break detected on one DNA strand, a second break was detected on the complementary DNA strand 4 bp away. In other words, these two observations suggest that double-stranded cleavages are occurring at sites with the sequence ANNNNT to generate two products with 4-base 5' overhangs and 3'-terminal A residues. A subsequent study has supported this hypothesis (105). An anchored-PCR strategy was designed to detect 3'-OH termini, and such termini were observed at all of the positions adjacent to the 5' ends detected in the original analysis. Furthermore, PCR-based analyses revealed two additional important features of the cleavage intermediates. First, the cleavage events generate double-stranded breaks in the DNA such that the predicted termini with 4-base 5' overhangs are produced. Second, double-stranded DNA breaks can, in at least some instances, occur at one end of the IES without a corresponding break at the other end of the IES. Based on the known macronuclear products of M- and R-region excision, and the observed breakage intermediates, a model of the excision process has been presented (Fig. 6) (105). In the model, deletion is initiated by a double-stranded cleavage targeted to the ANNNNT sequence present at one end of the IES. The cleavage event is such that a 4-base 5' overhang is generated, and each 3'-OH end terminates with an A residue. The hydroxyl group of the A residue present at the 3' end of the macronuclear-destined DNA then serves as a nucleophile to attack the phosphodiester bond between the A and N residue in an ANNNNT sequence present at the other end of the IES.
This transesterification reaction serves to link covalently one strand of the macronuclear-destined sequences, producing a macronuclear junction. Additional processing steps are required to join the other strand of the macronuclear-destined sequences, and to liberate the IES. These latter steps have not been specified, but the IES is thought to be liberated as a linear form that is processed occasionally to yield the rare circular forms that have been observed. There are a number of gratifying aspects to the Tetrahymena model. First, it suggests a transposon-like mechanism for IES excision, and it is thus quite similar to the excision model for the O. trifallax TBE1-tri transposons (Fig. 5) (49). The major difference between the two models is that there is a second proposed transesterification step for TBE1-tri transposons that results in the circularization of the excised DNA. Second, the initial cleavage (initiation) can generally be viewed as occurring on either side of the IES, and this helps explain some of the macronuclear junctional diversity. In some cases
[Figure 6 diagram. Label: chromosomal junction.]
FIG. 6. Model of DNA deletion in Tetrahymena thermophila. An unexcised IES is shown at the top of the figure (black rectangle). An initiating cleavage event occurs at one end of the IES. Cleavage occurs at specific sites (arrows) and generates two DNA ends with 4-bp 5' overhangs and 3' A residues. The 3' hydroxyl group of the A residue on the macronuclear-destined end serves as a nucleophile in a transesterification reaction with a corresponding site on the opposite side of the IES. This creates a macronuclear junction on one DNA strand. Additional processing steps are required to join covalently the opposite strand of the macronuclear DNA. See text for additional details. [Figure reproduced from Saveliev and Cox (105) with permission.]
there may be a bias in which end of the IES is chosen for initiation, and this would result in biases in the observed macronuclear junctions (e.g., the fact that 90% of the R-region junctions are of one type) (101). Third, it is particularly pleasing that both the M- and R-region deletions conform to the model. Indeed, it has been pointed out (104) that the boundaries of the mse2.9 IES (72) and the IES near the calmodulin gene (71) can be viewed as conforming to the model. Thus, although these various IESs show little organizational or sequence similarity, it may be that at least some aspects of the excision process, and hence the excision machinery, are shared.
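The cleavage rules and the first transesterification step of this model lend themselves to a short sketch. It is one reading of the model, not the analysis of Saveliev and Cox: the toy sequence and the function names (`find_cleavage_sites`, `junction_left_initiated`, `junction_right_initiated`) are hypothetical. A site is recorded by the 0-based index of its A on the top strand. Cleavage on the top strand falls after that A, and on the bottom strand 4 bp further along, which gives 4-base 5' overhangs and 3'-terminal A residues on both products.

```python
import re


def find_cleavage_sites(seq):
    """Indices of the A in every (possibly overlapping) ANNNNT site."""
    return [m.start() for m in re.finditer(r"(?=A[ACGT]{4}T)", seq)]


def junction_left_initiated(seq, i, j):
    """Macronuclear junction when cleavage starts at the left site.

    The top-strand 3'-OH of the A at index i attacks the bond between
    A and N of the right-hand site (A at index j) on the same strand.
    """
    return seq[:i + 1] + seq[j + 1:]


def junction_right_initiated(seq, i, j):
    """Macronuclear junction when cleavage starts at the right site.

    The same reaction on the bottom strand; in top-strand coordinates
    both boundaries lie 4 bp further to the right.
    """
    return seq[:i + 5] + seq[j + 5:]


# Toy sequence: flank, left site, IES interior, right site, flank
seq = "CCGC" + "ACAGCT" + "GGGGGGGG" + "ATTCGT" + "CCGC"
left_site, right_site = find_cleavage_sites(seq)
print(junction_left_initiated(seq, left_site, right_site))
print(junction_right_initiated(seq, left_site, right_site))
```

In this sketch the two junctions have the same length but differ in the four bases between the A and the T, which is the kind of 4-bp boundary shift described above for the two R-region junction forms.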
B. Cis-acting Sequences
1. SEQUENCE REQUIREMENTS FOR THE Tetrahymena M-REGION DELETION
The ability to transform T. thermophila has made it possible to examine the sequence requirements for IES excision. The studies to date have focused primarily on the M-region deletion. The Tetrahymena transformation system used employs a vector based on the Tetrahymena ribosomal RNA (rRNA) gene (106). More specifically, the vector contains a micronuclear form of an rRNA gene that confers a drug-resistance phenotype that allows for the selection of transformants when expressed in the macronucleus. When introduced into the developing macronucleus of cells at the appropriate developmental stage, the rRNA gene is correctly processed to yield a normal macronuclear chromosome with palindromic rRNA genes imparting drug resistance. To identify cis-acting sequences involved in M-region excision, this IES plus 0.24 kbp of left-flanking and 0.9 kbp of right-flanking sequences was initially engineered into a nontranscribed region of the rRNA gene (78). When this construct was injected into developing cells, transformants resulted that contained rRNA genes in which the 0.6- and/or the 0.9-kbp M-region deletions frequently had occurred. This indicates that the M-region, plus a limited amount of flanking sequence, suffices to specify excision. Indeed, a construct containing the M-region with only 65 and 70 bp of left- and right-flanking sequences, respectively, was able to undergo both deletions. However, when the left-flanking region was reduced to 33 bp, 0.9-kbp deletion events were no longer observed. This truncation removes the A5G5 sequence that is present approximately 45 bp outside of all three M-region deletion boundaries (see Fig. 3e). The role of this sequence in specifying deletion was further substantiated by using site-directed mutagenesis to alter three of the bases in the copy of this sequence motif that resides adjacent to the left boundary of the 0.9-kbp deletion. This abolished 0.9-kbp deletions, but had no effect on the 0.6-kbp deletion.
Further insight into the role of the A5G5 tracts was obtained by generating constructs containing short inserts (20-103 bp) between the right end of the M-region and the right-flanking A5G5 tract (79). When such constructs were injected into cells, deletions occurred, but the right boundaries were shifted such that they were from 41 to 54 bp from the repositioned A5G5 tract. In other studies, A5G5 tracts were introduced within the M-region. These insertions created new deletion boundaries, which were again located
about 40-50 bp from the inserted A5G5 tracts. Overall, the results indicate that the A5G5 tract is necessary for excision and that it plays a major role in specifying the boundaries of deletion in an orientation- and distance-dependent manner. The roles of sequences at the ends and within IESs are currently less clear. In the M-region studies, altering the position of the A5G5 tracts resulted in new deletion boundaries. That is, the direct repeats that reside at the normal boundaries of the M-region were not used. This suggests that these direct repeats are not essential. However, novel direct repeats of 1 to 4 bp were usually present at the new deletion boundaries, suggesting that at least some direct repeat may be a requirement of deletion. There is also one perplexing aspect to the new deletion boundaries. As discussed in Section III,A, there is evidence that the deletion boundaries are created by cleavage at the sequence ANNNNT, and a model of deletion has been proposed based on this type of cleavage (Fig. 6). Some of the novel boundary sites do not conform to the model. A limited number of studies have investigated the role of internal sequences. It is clear that an intact M-region is not required for excision. Deletion of 395 bp internal to the 0.6-kbp M-region, such that only 16 bp of the sequence adjacent to the right boundary is retained, did not disrupt correct excision, although the frequency of such events was reduced (78). Moreover, in some of the studies discussed above, placing an A5G5 tract internal to the element allowed deletion events, even though some M-region sequences were in effect repositioned outside the deletion boundaries (79). Nevertheless, there is some indication that internal sequences may be required for deletion, because substituting the entire 0.6-kbp M-region with macronuclear DNA or Escherichia coli DNA abolishes deletion (see 19).
The current view for the M-region is that the external A5G5 tracts play a major role in defining the deletion boundaries, and that internal sequences, perhaps consisting of multiple elements, serve to promote the process (19). It is clear, however, that the external cis-acting sequences defined for the M-region cannot be general, because none of the other Tetrahymena IESs possesses flanking A5G5 tracts. Flanking sequences may, however, be a general feature of the excision process in this organism, in that preliminary studies indicate that external sequences are required for R-region deletion (see 19, 79). As a way of integrating the current data, it has been suggested that there are a number of classes of Tetrahymena IESs that differ in their cis-acting sequences (19, 79). Each different IES group would be associated with a different type of flanking sequence that interacts with a protein involved in specifying excision. Such proteins might not act directly to catalyze excision, but could be responsible for interacting with a second general protein that carries out this task. In this way, the various groups of IESs could still
have their own specific factors, yet converge on a common excision mechanism.
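The distance rule reported for the M-region can be made concrete with a short sketch. The Python below is illustrative only: the function names are hypothetical, the 41 to 54 bp range is the one reported for the repositioned tract, and the rule that a top-strand tract points rightward while a bottom-strand tract points leftward is an assumption adopted here to model orientation dependence, not a result taken from the studies cited.

```python
import re

TRACT = "AAAAAGGGGG"          # A5G5 tract, written 5' to 3'
_COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMP)[::-1]

def find_tracts(seq):
    """Return sorted (start, end, strand) tuples for every A5G5 tract.

    Coordinates are 0-based on the top strand and end is exclusive.
    """
    hits = []
    for strand, motif in (("+", TRACT), ("-", revcomp(TRACT))):
        # A lookahead lets overlapping occurrences be reported too.
        for m in re.finditer("(?=" + motif + ")", seq):
            hits.append((m.start(), m.start() + len(motif), strand))
    return sorted(hits)

def boundary_window(tract, min_dist=41, max_dist=54):
    """Inclusive range of top-strand offsets where a boundary is expected.

    ASSUMPTION (illustrative): a "+" tract directs a boundary to its right,
    a "-" tract to its left, with distance counted from the nearer tract edge.
    """
    start, end, strand = tract
    if strand == "+":
        return (end + min_dist, end + max_dist)
    return (start - max_dist, start - min_dist)
```

Applied to a construct in which the tract has been moved, such a function would report a window that moves with the tract, which is the behavior described for the insertion constructs.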
2. RESEMBLANCE OF Paramecium AND Euplotes SHORT IES TERMINI TO Tec TRANSPOSONS
The in vitro mutagenesis and transformation approach has not been applied to hypotrich and Paramecium IESs. Transformation of the vegetative macronucleus is routine in Paramecium, but introduction of DNA into the developing macronucleus has not yet been reported. For the hypotrichs, transformation systems have generally not been developed. However, it has been shown that micronuclear DNA can be microinjected into the developing macronucleus of Stylonychia lemnae and that it undergoes correct fragmentation (106a). This may make it possible to use transformation to study the sequence requirements of IES excision in this organism.
Despite the absence of transformation, some insights into sequences potentially involved in IES excision have been obtained through comparative analyses. As discussed in Section II,C, 21 Paramecium IESs have been sequenced, and each terminates in TA direct repeats. This large number of IESs made it possible to apply statistical methods to look for nonrandom base composition at each position of the ends of the IES, beginning with the direct repeats (94). The results indicate that at least seven of the first eight positions (including the TA direct repeat) at the ends of the IESs are nonrandom in base composition, and the similarity may extend up to 14 bp. The derived consensus sequence is 5'-TAYAGYNR-3' (the T and A of the direct repeat are invariant among the IES ends analyzed), and this sequence is arranged such that it forms inverted terminal repeats. The deduced consensus sequence is strikingly similar (Fig. 7) to the sequences of the ends of the E. crassus Tec transposons (61), and more generally to the ends of members of the Tc1-related family of transposable elements (7, 107-109).
A similar analysis has been performed on 14 short IES ends from E. crassus (68) (the TelIESs described in Section II,A were not included in the analysis, because there is good evidence that they represent a distinct class). The results again indicate that the short IESs have nonrandom base composition primarily at their ends, and the deduced consensus sequence is 5'-TATrGCR-3' (Fig. 7). Like the Paramecium short IES consensus, this is quite similar to the ends of the Tec transposons (61) and the Tc1-related transposon family (7, 107-109). In addition to the terminal sequence conservation, the Euplotes short IESs also show a cluster of statistically significant positions beginning 17 bp from the end, with a sequence of TNNNGAA. This short patch is noteworthy because it corresponds in both sequence and position to a near universally conserved region of the Tec element inverted repeats (61).
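A per-position test for nonrandom base composition of the kind described can be sketched as follows. This is a generic chi-square goodness-of-fit calculation, not necessarily the statistical method used in the cited analyses (94, 68). The function names and the `background` argument (the expected base frequencies, which must be supplied by the user and matter a great deal in AT-rich genomes) are assumptions of this example, and the chi-square approximation is rough for samples as small as 14 or 21 IESs.

```python
from collections import Counter

CHI2_CRIT_DF3_P05 = 7.815  # chi-square critical value, 3 degrees of freedom, p = 0.05

def position_chi2(ends, background):
    """Chi-square statistic at each position of aligned, equal-length IES ends.

    ends:       list of sequences, all oriented outside-in from the TA repeat
    background: dict of expected frequency per base, summing to 1 (assumed input)
    """
    n = len(ends)
    stats = []
    for i in range(len(ends[0])):
        counts = Counter(seq[i] for seq in ends)
        chi2 = sum((counts.get(base, 0) - n * p) ** 2 / (n * p)
                   for base, p in background.items())
        stats.append(chi2)
    return stats

def nonrandom_positions(ends, background):
    """1-based positions whose composition departs from background at p < 0.05."""
    return [i for i, chi2 in enumerate(position_chi2(ends, background), start=1)
            if chi2 > CHI2_CRIT_DF3_P05]
```

With real data, the positions returned would be compared across the two IES ends and against internal positions, since the claim in the text is that nonrandomness is concentrated at the termini.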
FIG. 7. Comparisons of the deduced terminal consensus sequences of the Paramecium (94) and Euplotes crassus (68) short IESs with the termini of the Euplotes Tec1 and Tec2 transposons (61) and the terminal consensus sequence for the Tc1-related transposons (7, 107-109). Rows, each shown 5' to 3': E. crassus short IESs; E. crassus Tec1 and Tec2; Paramecium short IESs; Tc1/mariner transposons. In each case, the first two bases (TA) represent the terminal direct repeat. Identical bases are highlighted with a black background, and similar positions (e.g., G versus R) are highlighted with a stippled background. [Reproduced from Jacobs and Klobutcher (68) with permission.]
There are at least two possible models that explain the terminal sequence similarities of Tec transposons and the Paramecium and Euplotes short TA repeat IESs. First, they may have attained similar terminal sequences through evolutionary convergence. More specifically, the ability to remove what are now viewed as short IESs, as well as the transposons, from the micronuclear copies of genes may have conferred a selective advantage on the organisms through the production of superior gene products. In the context of an existing or evolving DNA excision system, one would expect adaptive mutations that enhance excision to become fixed. Therefore, if terminal sequences are important for excision, the ends of short IESs and transposons may have come to resemble each other by virtue of mutations that allowed them to be more efficiently excised.
The second and simpler interpretation of the data is that the short IESs derive from transposons. That is, the data can be viewed as supporting the hypothesis that IESs are the mutated and deleted remnants of transposons that have been active in the micronuclear genome.
Assuming that the latter hypothesis is correct, two possible roles for the conserved short IES termini have been suggested (94). The terminal sequences of many transposons are known to be required for transposition (reviewed in 50). As such, the conserved termini of the short IES might allow for its continued mobility in the micronuclear genome. It is difficult to envision, however, a selection for the continued mobility of short IESs, because there is little reason to believe that such a selection exists even for intact transposons (this issue is more fully explored in Section V). Thus, the more likely role of the terminal sequences is in specifying excision. In this case, a selective force for the retention of sequences involved in excision can easily be envisioned, in that failure to remove IESs would often result in nonfunctional genes.
Although it is likely that the terminal sequences of the short IESs function in specifying excision, additional sequence information appears to be required. A computer-generated, weighted consensus sequence, or profile, derived from the Paramecium short IES termini, was used to search long stretches of sequenced micronuclear DNA to determine if known IESs could be identified by their similarities to the consensus (94). Known IES termini were among the top matches when Paramecium micronuclear DNAs were analyzed, and the consensus was also somewhat effective in identifying short IES and Tec element ends in Euplotes micronuclear sequences. However, not all IES ends were identified in these studies, and many non-IES sequences turned up among the top matches. Thus, it seems the terminal sequences could play a significant role in specifying an IES for excision, but that additional sequence information is still required. It may be that two terminal sequences in inverted orientation and at the appropriate distance are sufficient to specify excision. Alternatively, additional sequence elements could be located within the IES, or within flanking regions, as has been observed in Tetrahymena. The additional conserved internal sequence noted in the E. crassus short IESs is a candidate for such an element. Overall, although some insight into potential cis-acting sequences has been obtained in these organisms, biochemical and/or transformation-based approaches are still needed to confirm the proposed role of the terminal sequences and to define any additional sequence requirements for excision.
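A profile search of the kind described above can be sketched as a log-odds position weight matrix slid along a sequence. This is a minimal illustration under stated assumptions, not the program used in the cited study (94): the function names, the pseudocount, and the `background` frequencies are all choices made for this example, and only the top strand is scanned.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_profile(sites, background, pseudocount=1.0):
    """Log2-odds weight for each base at each position of aligned termini."""
    n = len(sites)
    profile = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        column = {}
        for base in BASES:
            freq = (counts.get(base, 0) + pseudocount) / (n + 4 * pseudocount)
            column[base] = math.log2(freq / background[base])
        profile.append(column)
    return profile

def scan(seq, profile, top=5):
    """Return the `top` best (score, start, window) hits on the top strand."""
    width = len(profile)
    scored = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous bases
        score = sum(column[ch] for column, ch in zip(profile, window))
        scored.append((score, start, window))
    scored.sort(key=lambda hit: (-hit[0], hit[1]))
    return scored[:top]
```

Because an eight-position profile carries little information, many unrelated windows in a long AT-rich sequence will score nearly as well as true termini, which is consistent with the observation that non-IES sequences appeared among the top matches.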
3. CONSERVED SEQUENCES OF IESs IN Oxytricha fallax AND Oxytricha trifallax
The short IESs in other hypotrich species cannot generally be viewed as conforming to the terminal consensus sequence derived from the Euplotes and Paramecium IESs. Indeed, they tend to vary in both the length and the sequence of even their short terminal direct repeats. This indicates that they are a diverse group. For instance, the five known short IESs of O. fallax and O. trifallax (all interrupting the 81 locus) (110; 40a; A. Seegmiller and G. Herrick, unpublished) do not resemble one another in primary sequence, and presumably each represents a set of short IESs excised by an excision machinery that is, at a minimum, not entirely identical to that employed for the other sets.
In lieu of a transformation system to permit interventional, structure-function analyses, the essential nature of IES sequence features has been assessed in O. fallax and O. trifallax by taking advantage of evolutionary divergence. Multiple, widely diverged alleles of the 81 locus from both species have been sequenced, and their conserved sequence features have been identified (40a; A. Seegmiller and G. Herrick, unpublished). All the alleles contain the five identified IESs at corresponding positions. All of the IESs interrupt highly conserved macronucleus-destined sequences (four interrupt protein-coding regions; one interrupts a conserved region 5' of a gene that might consist of promoter elements or a macronuclear DNA replication origin), so they have been constrained to retain fully efficient and precise excision-required sequences. Four of five IESs show strong conservation of terminal sequences, but no conservation of central sequences other than length. This suggests that the termini of these IESs have evolved under purifying selection against mutations that hamper excision, and that internal sequences are likely unnecessary for excision. The sequences of the conserved termini differ from IES to IES, again consistent with multiple classes and excision mechanisms. However, one of the Oxytricha short IESs shows a conserved consensus of inverted terminal repeats that resembles that of the Paramecium TA IESs, and that of the Euplotes short TA IESs, suggesting that it is related to the IESs in these other species. The fifth IES, like the Tetrahymena M segment, appears to have critical internal sequences and nonessential terminal sequences.
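The allele comparison described above amounts to asking whether the terminal columns of an IES are more conserved than the central ones. The sketch below shows one simple way to score that. It assumes the alleles of one IES have already been aligned to equal length, and the function names and the choice of `k` terminal columns are hypothetical, not taken from the cited work.

```python
def column_identity(alleles):
    """True for each aligned column in which every allele has the same base."""
    return [len(set(column)) == 1 for column in zip(*alleles)]

def _fraction(flags):
    """Fraction of True values, or NaN for an empty list."""
    return sum(flags) / len(flags) if flags else float("nan")

def terminal_vs_central(alleles, k=10):
    """Fraction of invariant columns in the k terminal columns at each end
    versus the remaining central columns of one aligned IES."""
    identical = column_identity(alleles)
    ends = identical[:k] + identical[-k:]
    center = identical[k:-k]
    return _fraction(ends), _fraction(center)
```

An IES of the first kind described in the text would give a high terminal fraction and a low central fraction; one resembling the fifth IES would show the reverse pattern.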
C. Trans-acting Factors
Currently, none of the proteins that mediate IES excision has been definitively identified. A number of different approaches are being applied toward this end. This section summarizes work in this area, along with other experiments that implicate particular macromolecules as influencing excision in trans.
1. IDENTIFICATION OF A CONJUGATION-SPECIFIC PROTEIN THAT INTERACTS WITH THE Tetrahymena Tlr1 DELETION
The extensive nature of the ciliate DNA excision processes suggests that biochemical approaches would be effective in identifying trans-acting factors that mediate excision in vitro, cleave DNA at excision boundaries, or simply bind to IESs. Approaches of this type are, or have been, pursued in a number of laboratories, but little success has generally been reported. Nonetheless, recent work has identified a protein that may participate in the excision of the Tetrahymena Tlr1 IES (J. L. E. Ellingson, I. M. Kalve, E. E. Capowski and K. M. Karrer, unpublished). Tlr1 (73) (Fig. 3f) is the large Tetrahymena IES associated with 825-bp inverted repeats that include two 19-bp tandemly repeated sequence motifs (19A and 19B). A short restriction fragment including the 19A repeats was used in a DNA mobility shift assay. A protein in extracts prepared specifically from cells 10-12 hr after the initiation of mating, which represents the period of Tlr1 excision, interacted with the restriction fragment. A similarly sized restriction fragment including the 19B repeats displayed no specific interactions with the developmental crude cell extracts. The developmental-specific nature of the binding activity suggests its involvement in the Tlr1 excision process, and implicates the 19A repeats as potential cis-acting signals. Additional studies employed fractionated developing cell extracts in
DNase I footprinting analyses on a restriction fragment containing both the 19A and 19B repeats. Hypersensitive sites were found within both the perfect and degenerate copies of both repeats, and significant protection was observed in the 19B repeat region on one DNA strand. The observation of footprints in the 19B repeats was surprising, given that the 19B repeats failed to show an interaction in the DNA mobility shift assay. Therefore, a DNase I footprinting analysis was performed on the 19B repeat region alone. No footprinting was observed, indicating that the protein interactions with the 19B repeats are dependent on the 19A repeats and their associated binding activity. The role of the binding protein in DNA excision is clearly speculative at this point, but the developmental-specific nature of the binding activity is suggestive of a role in rearrangement. Further analyses, perhaps with the purified protein, will be required to assess its function. However, even if this protein is only indirectly involved in excision (e.g., as an accessory factor that interacts with excisase), it may well provide a means of identifying other components of the excision system.
2. IDENTIFICATION OF GENES AND GENE PRODUCTS INVOLVED IN CONJUGATION
Several approaches have been taken to isolate genes or proteins whose expression is limited to conjugation and macronuclear development, and that are potential candidates for involvement in IES excision. The extensive nature of the excision processes in ciliates argues that excision proteins, and hence the mRNAs encoding them, should be abundant. One type of approach to isolate conjugation-specific genes has involved either the construction of cDNA subtraction libraries or differential hybridization screening of recombinant libraries. Numerous conjugation-specific genes have been isolated in both T. thermophila (111, 112) and E. crassus (113; Z. Ling, S. Ghosh and L. A. Klobutcher, unpublished) by these procedures. The sequence of only one of these genes has been reported (114), and the encoded gene product is likely involved in transcriptional control rather than excision. Sequencing of the other genes, coupled with studies on the temporal and spatial localization of their gene products in developing cells, should help to determine if any of the other conjugation-specific genes might function in excision.
A second approach has been to purify directly proteins that appear during macronuclear development. Madireddi et al. (115) identified a number of conjugation proteins, and purified one 65-kDa phosphoprotein (p65) that is abundant during the early period of macronuclear development. Antibodies against the p65 protein primarily stain the developing macronucleus, but also the old macronucleus. In the developing macronucleus, the staining is
uniform early in development, but, during the period when DNA rearrangement events are occurring, a limited number of vesicles are visualized within the nucleus. These vesicles encase DNA (115), and more recent studies indicate that this DNA is micronuclear limited (116). These observations have led to the suggestion that the p65 protein forms part of a structure that is analogous to the vesicles observed during macronuclear development in the hypotrichous ciliates, and that these are involved in the developmental elimination of DNA (115, 116). Further studies are needed to substantiate the role of the p65 protein. Nonetheless, its developmental pattern of appearance, and localization to the developing macronucleus, make it a strong candidate for playing some role in the DNA rearrangement process.
Finally, genetic approaches may contribute to the identification of proteins involved in DNA excision. The genetic tools available for T. thermophila have been used to isolate a number of mutations that result in the arrest of cells at various stages of conjugation (117, 118; E. Cole, unpublished). These conjugal-block mutants have been characterized primarily in regard to what nuclear stage is affected, but future studies aimed at determining whether particular DNA rearrangement processes are affected may determine which genes are candidates for involvement in excision. Ultimately, cloning of the genes will be required, but there is currently no simple means of going from mutation to gene in Tetrahymena. Advances in DNA transformation may make cloning by complementation possible in the future, and, likewise, further development of the genetic map should make positional cloning feasible.
3. ROLE OF TRANSPOSON-ENCODED PROTEINS IN EXCISION
The identification of transposons, some of which are IESs, in hypotrich species raised the possibility that the transposons might encode excision functions. This possibility is bolstered by the analyses of excision products and/or intermediates in O. trifallax and T. thermophila that have led to models of excision with transposase-like reaction mechanisms (see Section III). As a result, studies have been performed to assess conjugation-specific gene expression from both the E. crassus Tec elements and the O. trifallax TBE1-tri transposons. Northern blotting analyses failed to detect transcripts from Tec elements either during vegetative growth or during conjugation and macronuclear development (119). However, using a highly sensitive reverse transcriptase PCR (RT-PCR) procedure, low levels of Tec transcripts were observed early in development, at times prior to Tec excision. These transcripts were extremely rare, with their maximum abundance estimated to be 0.0004% of poly(A)+ RNA. The low levels of transcripts are interpreted as being more consistent with their function in low-frequency transposition of the Tec elements, rather than the massive excision process involved in Tec transposon and short IES
removal (119). Transposition events would by necessity occur in the micronucleus, and the timing of expression is consistent with this notion. The rare transcripts appear prior to micronuclear meiosis, a time when the micronucleus is known to be transcriptionally active in Tetrahymena (120, 121).
Similar unpublished studies on the TBE-tri transposons have produced very different results (K. R. Williams and G. Herrick, unpublished). Copious TBE1-tri (and TBE2-tri and TBE3-tri) transcripts are readily detected by Northern blot hybridization in RNA isolated from exconjugants. The detected transcripts are not polyadenylylated, they are heterogeneous in size, and they appear to result from transcription into TBE1-tri elements from flanking sequences. Transcripts derived from both TBE1-tri DNA strands can be detected. This would potentially allow for the expression of the transposase and the zinc finger/kinase genes that reside on one DNA strand, and the 22-kDa gene that resides on the other (Fig. 3c). For the TBE1-tri element interrupting the mitochondrial solute carrier protein gene, RT-PCR detects RNA entering TBE1-tri-1 from its flanking macronuclear-destined sequences. This suggests that the mitochondrial solute carrier gene promoter, which has been mapped for vegetative RNAs (56), might be responsible for the initiation of the TBE1-tri-1 transcripts. The nature of the TBE-tri transcripts clearly needs to be investigated further. However, the copious transcripts observed raise the possibility that large amounts of TBE-tri gene products are produced during macronuclear development. Coupled with the analyses indicating that the TBE1-tri ORFs have been under selection for function (see Section II,A), it is not unreasonable that element-encoded functions, and particularly transposase, might act in trans in excision as proposed in the model of TBE1 excision (Fig. 5).
4. INFLUENCE OF THE OLD MACRONUCLEUS ON IES EXCISION
A combination of genetic and molecular genetic analyses on Paramecium provides ample evidence that the old macronucleus influences the formation of the new macronucleus in the same cell during sexual reproduction, including effects on DNA rearrangement (reviewed in 16). Recent studies on both Paramecium and T. thermophila indicate that the sequence composition of the old macronucleus also influences IES excision, and it is likely that this influence is exerted by trans-acting factors.
The Paramecium work stems from analyses of the mtFE mutation that affects mating type determination (122). The two possible Paramecium mating types (O and E) are normally determined by maternal inheritance following conjugation, such that the exconjugant derived from the mating type O parent remains O, and the exconjugant from the mating type E parent remains E. Cells homozygous for the mtFE mutation are constitutively mating type E, irrespective of the mating type of the parent. Once
the E mating type is established, cells retain this mating type through multiple rounds of sexual reproduction, even when an mtF+ allele is introduced. Analysis of the macronuclear G surface antigen gene in the mtFE mutant strain revealed that it had retained a 222-bp IES in its coding region (89). Cells homozygous for the mtFE mutation failed to excise the IES. Interestingly, once IES excision failure occurred, it was stably propagated during subsequent rounds of sexual reproduction, even in the absence of the mtFE mutation. The effect appears to be specific, because a number of IESs in other genes were examined, and no excision abnormalities were evident.
These results suggested that the presence of the IES in the old macronucleus inhibits excision of the corresponding IES in the developing macronucleus. This was substantiated in experiments involving the injection of the cloned IES into the macronucleus of vegetative cells, and then carrying the transformants through sexual reproduction (93). Injection of the cloned 222-bp IES with a small amount of flanking sequences inhibited IES excision during the next round of sexual reproduction, with the severity of the excision defect being correlated to the number of copies of the construct in the old macronucleus. Constructs consisting only of the IES had similar effects, but a construct in which the majority of the 222-bp IES was deleted showed no effect. Thus, it appears that the presence of the IES sequence in the old macronucleus is sufficient to inhibit excision. As before, once a high level of IES+ molecules are generated in the macronucleus, the effect is perpetuated during subsequent rounds of conjugation, producing a stable epigenetic state.
Very similar results have been obtained in T. thermophila (123). The studies have involved injection of either the cloned M-region or R-region into the macronucleus. In each case, excision of the corresponding IES was inhibited during the next round of macronuclear development.
The results were also quite similar to those in Paramecium in a number of other aspects. First, the effects were for the most part specific, such that cells harboring the M-region in the macronucleus displayed defects in excising the M-region, but not the R-region or another unlinked IES. Second, once the IES was retained in the macronucleus the effect was self-perpetuating in a subsequent round of macronuclear development. Third, the M and R elements (i.e., without their normal flanking DNA) were sufficient to disrupt excision. One difference between the two systems is that the presence of the IES in the old macronucleus of one member of the mating pair inhibited IES excision not only in the exconjugant derived from that cell, but also in the other exconjugant even when it contained no IES+ forms in its old macronucleus. It was suggested that this difference derives from the greater cytoplasmic mixing that occurs during Tetrahymena conjugation as compared to Paramecium.
These results imply that there is some interaction between the old
macronucleus and the new developing macronucleus, via the cytoplasm, that influences excision. Two types of models have been proposed to explain the results (93, 123). The first type of model suggests that the IESs present in the old macronucleus sequester a trans-acting factor required for IES excision. Sequestration would be efficient because of the high copy numbers of the IES present in the old macronucleus relative to the developing macronucleus. However, the trans-acting factor involved could not be one that is generally involved in IES excision, because this would affect the excision of all IESs, in contrast to the relatively specific effects observed (complete inhibition of IES excision would also probably be a lethal event, at least for Paramecium). On the other hand, the large number of IESs makes it unreasonable to propose that each IES has its own trans-acting factor. Thus, it is envisioned that IESs interact with multiple and variable combinations of a small number of factors. The presence of one IES in the old macronucleus would then be expected to sequester a limited subset of factors, and thus affect the excision of only a subset of IESs. In the instances examined, retention of the particular subgroup of IESs in the mature macronucleus would presumably be compatible with cell viability.
The alternative model proposes that the IES in the old macronucleus produces a trans-acting factor that is directed to the developing macronucleus, where it specifically influences excision. Because the IESs do not appear to encode proteins, it has been postulated that the trans-acting factor is a nucleic acid (either DNA or RNA) that serves as a guide or template for the processing events in the developing macronucleus. If a sequence in the old macronucleus retained an IES, the template or guide produced would be altered such that the normal processing pattern is altered or inhibited.
This model is particularly attractive because it accounts for the high degree of specificity observed. One version of this model has been tested in Paramecium (93). Specifically, it was envisioned that IES excision might proceed via a mechanism similar to the cut-and-repair process that has been observed for some transposons (124, 125). When a cut-and-paste transposon (e.g., Drosophila P or nematode Tc1 transposons) excises, the resulting chromosomal gap appears to be repaired using homologous sequences, typically from the same locus on the sister chromatid or the homologous chromosome. In the context of IES excision, a template copy produced from the old macronucleus would be used to repair a gap created by IES excision. If the template were generated by a normal IES- macronucleus, the repaired gap would lack the IES, but if the template came from an IES+ macronucleus the gap would be repaired such that the IES is regenerated. To test this version of the model, the Paramecium 222-bp IES was modified by insertion of a novel restriction site,
and then the construct was introduced into the macronucleus by microinjection. Following the next round of sexual reproduction, IES excision was again inhibited, but the retained IESs did not contain the novel restriction site. Thus, the IES in the old macronucleus does not appear to function directly in gap repair. It is still possible, however, that a nucleic acid form derived from it could serve as a guide for excision.
Additional studies are needed to differentiate between these two models. Nonetheless, these observations provide new means of identifying both cis-acting and trans-acting factors for the IES excision process. It is worth noting that the Paramecium mtF gene product is a strong candidate for a trans-acting factor. The mtFE mutation affects not only mating type determination and the excision of the 222-bp IES, but also a number of other cellular phenotypes. It has been suggested that the pleiotropic effects of the mutation might be the result of failure to excise a subset of IESs so that multiple genes would be affected (89). Moreover, based on the current understanding of the mating type determination system in Paramecium, a model has been proposed that explains the effects of the mtFE mutation on mating type via an IES excision defect (see 89). Again, it is unlikely that the gene encodes a product that is generally involved in IES excision, but it is a candidate for encoding a trans-acting factor required for the excision of a subset of IESs.
D. Relationship of Internal Eliminated Sequence Excision to Other Ciliate DNA Rearrangement Processes
1. CHROMOSOME FRAGMENTATION
Blackburn and Karrer (126) initially suggested that the chromosome fragmentation and DNA excision events of macronuclear development could be related. They noted that chromosome fragmentation could be viewed as a defective DNA excision event. That is, the chromosome is broken in the same manner as for IES excision, but flanking sequences do not rejoin, and hence are processed as new chromosome ends. This scenario was suggested in part by the observation that both chromosome fragmentation and IES excision events occur during the same period of macronuclear development in Tetrahymena (20, 76; J. L. E. Ellingson, I. M. Kalve, E. E. Capowski and K. M. Karrer, unpublished).
However, a number of subsequent observations indicate that the proposed relationship between the two processes cannot be as simple as originally proposed. For example, studies on E. crassus indicated that many of the transposon and short IESs are excised well before the period of chromosome fragmentation (28, 29, 47, 59, 66), so that the two processes are not temporally linked in this species. Moreover, a well-conserved sequence element termed the chromosome breakage sequence (Cbs) has been shown to be required for chromosome fragmentation in T. thermophila (34). Cbs sequences have not been found in the vicinity of Tetrahymena IESs, arguing that they are not involved in the DNA breakage step of IES excision.
Nonetheless, a number of observations continue to suggest relationships between chromosome fragmentation and IES excision, though perhaps less directly than originally envisioned. The first relates to the Euplotes TelIESs, whose sequences resemble telomeric repeats (44), and the O. fallax and O. trifallax TBE transposons, which have perfect telomeric repeats at their ends (48, 49). The TelIESs are also known to be excised during the period when chromosome fragmentation and telomere addition are occurring (44), suggesting some type of linkage with the chromosome fragmentation process. One possible means for this linkage to occur is for the IES excision processes to share some protein factor with the chromosome fragmentation/telomere addition process. For instance, the telomere binding protein gene is transcriptionally active during the period of chromosome fragmentation/telomere addition (127), and it is conceivable that when the protein is synthesized, it interacts with the telomeric repeat sequences of TBE transposons and TelIESs and serves as a cofactor for the excision process. Alternatively, other proteins that interact with telomeres might play a similar role.
Second, an intriguing sequence similarity has been noted in E. crassus. Chromosome fragmentation in this species is highly precise, and a consensus sequence has been deduced that is thought to play a role in specifying the positions of chromosome fragmentation/telomere addition (35). It is found either within or flanking the macronuclear-destined sequence, and it has a well-conserved core of TTGAA.
Analysis of multiple Tec elements (61) revealed that the same sequence is highly conserved within their inverted repeat termini, and it resides at a position relative to the Tec transposon termini that is very similar to the placement of this sequence relative to chromosome fragmentation/telomere addition sites. Moreover, the recent analysis of E. crassus short IESs has revealed that they possess a similar conserved sequence at the same position as the Tec elements (68) (see Section III,B). It is possible that this sequence element is responsible for the binding and positioning of a protein with endonucleolytic activity that serves to cleave the DNA for either IES excision or the formation of macronuclear chromosome ends. Alternatively, a common protein factor might interact with such sites and serve to enhance the formation of DNA excision or chromosome fragmentation complexes, depending on the particular stage of macronuclear development. Finally, studies of Paramecium provide an indication that alternative patterns of chromosome fragmentation/telomere addition and IES excision are
coupled. Both micro- and macroheterogeneity have been noted in the position of chromosome fragmentation/telomere addition sites in Paramecium (92, 128-130). The microheterogeneity is such that the position where telomeric repeats are added can vary over a 200- to 800-bp region. This has given rise to the notion that there are “chromosome fragmentation domains.” The macroheterogeneity, on the other hand, is reflected in the alternative use of chromosome fragmentation domains that are located from 2 to 13 kbp away from each other. This gives rise to macronuclear chromosomes that differ significantly in size. Three separate studies have reported cases where the presence of a sequence element that is quite possibly an IES is correlated with the use of adjacent chromosome fragmentation/telomere addition sites (92, 128, 129; F. Caron, A. Butler, A. Le Mouel and E. Meyer, unpublished). The most thoroughly understood case concerns chromosome fragmentation near the Paramecium primaurelia G surface antigen gene (92), because both the micronuclear and macronuclear regions were analyzed. The results indicate that some macronuclear chromosomes lack a 76-bp IES and have telomere addition sites located about 2.2 kbp downstream, whereas other macronuclear chromosomes have retained the IES and have telomere addition sites located within 200 bp downstream. That is, the presence or absence of the IES correlates with the alternative use of proximal or distal telomere addition sites. It should be emphasized that in these Paramecium studies, no IES boundary has yet been shown to be directly used as a telomere addition site, so that a common cleavage event for the two processes remains speculative. Nonetheless, in the cases studied to date there is a clear indication that one process influences the other.
The situation is perhaps analogous to mRNA processing, where alternative intron splicing patterns are in some cases correlated with alternative use of poly(A) addition sites (reviewed in 131). How alternative IES excision influences the choice of chromosome fragmentation/telomere addition sites (or vice versa) is not known, but it has been suggested that the processes may compete for one or more common factors (92). Alternatively, failure to excise an IES may alter the chromatin structure of a region of the micronuclear genome such that chromosome fragmentation sites are either exposed or shielded.
2. DNA SCRAMBLING
It is also possible that the DNA scrambling process observed for some hypotrich genes (Fig. 2b) is related to IESs. A scrambled micronuclear DNA arrangement has been observed for two Oxytricha nova genes. The macronuclear chromosome bearing the actin I gene is split into nine segments in the
micronucleus, and these are arrayed in an unorthodox order, with some segments inverted relative to the others (37, 132). Similarly, the sequences that will form the macronuclear chromosome bearing the α-telomere-binding protein gene are split into 14 disordered segments in the micronucleus (36). Macronuclear-destined sequences that ultimately adjoin one another are bordered by repeats of 9-13 bp; these are either direct or inverted repeats, depending on whether the macronuclear-destined sequences are in the same orientation or inverted. Recombination between the repeats would serve to assemble the macronuclear-destined sequences into the correct order, while simultaneously liberating the DNA that exists between the macronuclear-destined sequences, or reordering it such that it ends up flanking the macronuclear-destined sequences (chromosome fragmentation would then form the macronuclear chromosome). The DNA separating macronuclear-destined sequences in scrambled genes is reminiscent of IESs. However, the rearrangement process is clearly more complicated than IES removal. The repeats that appear to guide unscrambling also are larger than the direct repeats associated with the standard short hypotrich IESs. These differences have led to the proposal (36) that the sequence interruptions in scrambled genes be referred to as Type-II IESs to distinguish them from the Type-I IESs that are excised with joining of the immediately flanking sequences. Despite the differences in the two processes, they may be evolutionarily related. Mitcham et al. (36) proposed a model for the origin of scrambled genes. They envision that for scrambling to occur, a micronuclear gene must first be interrupted by standard IESs. These IESs provide the repeat sequences at the adjacent macronuclear-destined sequence blocks that guide their reunion during macronuclear development.
Once this situation arises, recombination events can occur within different IESs to scramble the order of the macronuclear-destined sequences. Further analyses of the organization of the O. nova scrambled genes in other ciliate species will be useful in assessing the validity of this scenario. In addition, determining whether unscrambling and IES excision occur concomitantly would help elucidate the relationship of the two processes.
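The repeat-guided assembly described above can be illustrated with a toy sketch. It assumes each macronuclear-destined segment is stored with its macronuclear position and an inversion flag, and that neighboring segments overlap by one shared repeat; the segment sequences, the 4-bp repeat length (the repeats described in the text are 9-13 bp), and the function names are hypothetical and chosen only for illustration. It does not model the recombination enzymology, only the bookkeeping of reordering, re-inverting, and retaining one repeat copy per junction.

```python
# Toy model of unscrambling; all sequences and names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def unscramble(segments, repeat_len):
    """Assemble scrambled segments into macronuclear order.

    segments: list of (macronuclear_position, inverted, sequence) tuples,
    given in micronuclear order. Inverted segments are flipped first.
    Adjacent segments must share a repeat of length repeat_len; only one
    copy of each repeat is kept in the assembled product.
    """
    ordered = sorted(segments, key=lambda s: s[0])
    pieces = [reverse_complement(seq) if inv else seq for _, inv, seq in ordered]
    assembled = pieces[0]
    for piece in pieces[1:]:
        if assembled[-repeat_len:] != piece[:repeat_len]:
            raise ValueError("adjacent segments do not share a repeat")
        assembled += piece[repeat_len:]
    return assembled


# Micronuclear order is 3, 1, 2, with segment 2 inverted.
micronuclear = [
    (3, False, "GGATTTGA"),
    (1, False, "ATGCCGTA"),
    (2, True, "ATCCTACG"),
]
assembled = unscramble(micronuclear, repeat_len=4)
print(assembled)
```

The check inside the loop reflects the point made in the text that the repeats, not the segment order in the micronucleus, determine which segments are joined.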
IV. Possible Functions of Internal Eliminated Sequences
In the following discussion we develop the proposition that ciliate short IESs are copies of transposons that have degenerated under selection to the point that they retain only those few sequences required in cis for precise excision from the developing somatic nucleus of the host. Others have suggested that short IESs serve host functions, instead of being simply the residua of invading selfish DNA parasites that serve only their own fitness. IESs have been suggested to provide the bases of the several genetic differences between the micronucleus and macronucleus (19, 70, 73, 75). Specifically, IESs might contribute to micronuclear chromosome condensation and mitotic chromatid disjunction, the different timing of DNA replication between the two nuclei, or might be the basis of a molecular mechanism to heterochromatize or otherwise transcriptionally silence the micronucleus (see footnote 2). In contrast, the limited data currently available suggest that IESs do not serve micronuclear-specific functions. First, there are the previously discussed situations in O. fallax and O. trifallax, whereby alleles of the same locus differ in regard to the presence of TBE1 transposons, yet both types of alleles are maintained in the micronuclear genome and give rise to functional macronuclear genes (48, 49, 57). Similarly, in a phylogenetic survey of various Tetrahymena species, Huvos (133) found that many loci containing IESs in T. thermophila are not involved in rearrangements in related species. These data suggest that IESs are not essential. Second, studies involving the introduction of IESs into the macronucleus by transformation have been discussed (93, 123) (see Section III,C). If IESs served micronuclear-specific functions, their ectopic presence in the macronucleus might be expected to be deleterious. In contrast, no deleterious effects were observed in the transformants.
Thus, although these programmed rearrangements of genetic molecules might be, and have been, viewed as innovations for regulation of development of the organisms in which they occur, a more parsimonious view is that these elements are simply parasites. On this view they have evolved behaviors that permit their vertical propagation through the germ line (the micronucleus) of the host, while minimizing the impact of their presence by judicious precise excision from the secondary, working copy of the genomic information (the macronucleus) of the host. However, even this view does not preclude the possibility that some IESs might ultimately be shown to possess some function. For example, spliceosomal introns are thought to have arisen from transposition of group II introns (134). Although the original mobile group II introns may have spread through genomes simply because of their abilities to replicate and excise themselves from mRNA, their spliceosomal descendants have in at least some cases evolved functions beneficial to the host, such as generating protein diversity through their role in alternative mRNA splicing (reviewed in 131).
Footnote 2. One of us has proposed (16) that the silencing of the micronucleus was driven by the need to inactivate dominant transposon-induced mutations, which would override transposon-cleansed, reverted macronuclear genes.
V. Evolution of Ciliate Internal Eliminated Sequences by the Invasion, Bloom, Abdication, and Fading of Transposons
An evolutionary perspective is necessary for understanding the workings of any extant biological process. This final section presents speculative evolutionary scenarios explaining the origin of transposons and short IESs in ciliates. The scenarios are built on present knowledge of transposons in hypotrichs, on the patterns of distribution and similarities of IESs in various ciliates, and on known phylogenetic relationships. The speculations are no doubt controversial and are intended to suggest explicit tests and counterhypotheses, and to stimulate investigation. The central propositions are that transposon invasions have generated the ciliate DNA excision phenomena, and that short IESs are ancient transposon IESs that have shrunk by loss of internal sequences unnecessary in cis for the developmental excision. We have mentioned in passing the various lines of evidence that suggest the involvement of transposons: (1) some IESs possess features typical of transposons; (2) studies of excision products and intermediates in some ciliates are suggestive of involvement of a transposase in the reaction; and (3) some short IESs fall into similarity families, suggesting that the elements of a family are paralogous homologs created by transposition of one kind of element. Figure 8 represents a proposed set of transitions, which we refer to as the invasion, bloom, abdicate, and fade (IBAF) progression, leading to the generation of various unrelated families of transposons and short IESs. Briefly, this model proposes that a transposon first invades the micronuclear genome.
Transposons that encode functions allowing for their developmental excision are able to proliferate to high levels (bloom) due to the opportunity afforded them by the nuclear dimorphism of their host, which allows the elements to “get out of the way” during macronuclear development, thereby ridding the host of transposon mutations in its “expression nucleus.” Initially, the transposons depend on each other for excision functions through a pool of “communal excisase.” This, however, leads to a selection for the transfer of excision responsibility to the host (abdication). Once the host becomes responsible for excisase production, transposon mutation and internal deletion are allowed (fade), with the exception of sequences required in cis for excision, such that short, and ultimately unrecognizable, elements result. Each of these steps is discussed in the following sections, along with supportive data from the various ciliate systems. An attempt is then made to explain the current IES status of the various ciliate groups in terms of a series of IBAF progressions.
FIG. 8. Summary of the invasion, bloom, abdicate, and fade (IBAF) progression. Micronuclear DNA is shown as a line, and transposons and short IESs as black rectangles. In the abdicate step, a part of a transposon encoding excisase function(s) is represented as coming under the control of a strong host promoter. See text for additional details. [Figure not reproduced. Its stages are labeled Invasion, Bloom, Abdicate, and Fade, with excisase shown first as “communal excisase” supplied by the elements and later as host excisase.]
A. Transposon Invasion in Ciliates
The initial step in the IBAF progression involves invasion of a transposon into the micronuclear or germ-line genome. Horizontal transfers have been central to the evolutionary history of various transposons (135, 136, 153), but the physical route by which the DNA element is moved from the genome of one host to that of another remains unknown. A mite with peripatetic feeding habits has been suggested as the vector for Drosophila P element transfer from insect to insect (137). Alternatively, vectors might be viruses or plasmids. How transposons initially arrived in the micronuclear genome of ciliates is equally unclear. However, it is worth noting that ciliates routinely engulf live microbes (including bacteria, smaller ciliates, and other protozoa) into food vacuoles. The route from food vacuole to micronucleus seems sufficiently short to permit the rare transfer of DNA from an engulfed microbe. Bacteria harbor members of the IS630 and IS3 families of transposons,
which are all D,D35E relatives of the two known ciliate transposons (13). It is thus quite possible that bacteria, or other microorganisms, that serve as a food source for the ciliates have sporadically been the source of transposon invasions. One feature of ciliate genetics that might be a hindrance to transposon invasions is that many species employ variations of the universal genetic code (reviewed in 138). The consistent difference between ciliate codes and the universal code is that some universal stop codons are translated as an amino acid (Gln or Cys). It has been suggested that the unusual ciliate codes constitute defense mechanisms against the incorporation of foreign DNA (18). Any transposon invader would be forced to use the host code, which would presumably generate C-terminal tails on at least some of its proteins. The detailed consequences of this are difficult to anticipate, but might well be sufficient to hamper the initial proliferation of at least some transposons. The current data indicate that at least some transposons have been able to overcome this potential hurdle. This may have been the result of fortuitous use of termination codons for the typically limited number of transposon ORFs, or perhaps the transposon emerged from a period of acclimation during which partially active proteins were produced. Note that once a transposon has adapted to the ciliate genetic codes, it might well be transferred to another species by one ciliate feeding on another. In the following discussion, several horizontal transfers are invoked to explain the current distribution of short IESs in extant ciliates, and ciliate-to-ciliate transfers help make this a reasonable proposition.
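The effect of a variant code on an invading ORF can be shown with a minimal sketch. It assumes a code in which TAA and TAG are read as Gln while TGA remains a stop, which is one of the variants consistent with the description above; the ORF is a hypothetical sequence invented for illustration, and the function and variable names are likewise not from the chapter.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumed variant code: TAA and TAG encode Gln (Q); TGA is still a stop.
STOP_AS_GLN = dict(UNIVERSAL, TAA="Q", TAG="Q")


def translate(orf, code):
    """Translate from the first codon until the first stop under `code`."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        residue = code[orf[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical ORF that ends at TAA under the universal code.
orf = "ATGAAATAAGGTTGA"
universal_protein = translate(orf, UNIVERSAL)
ciliate_protein = translate(orf, STOP_AS_GLN)
print(universal_protein, ciliate_protein)
```

Under the variant code the original stop is read through, and translation continues to the next in-frame TGA, which is the kind of C-terminal tail the text anticipates.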
B. Bloom; TBE Transposons
1. LIMITS TO TRANSPOSON MULTIPLICATION
Having entered the micronuclear genome, a transposon would have the opportunity to produce additional copies of itself. The genetic organization of ciliates presents an unusual situation for a transposon. Before discussing this, it is perhaps useful to consider the progression of a transposon invasion in other eukaryotic organisms. Once a transposon enters a eukaryotic host, proliferation does occur, but with limits. For example, the rate of transposition, and ultimately the total copy number of an element, are limited by the resultant rate of generation of deleterious insertion mutations. Another limit is imposed by the secondary consequences of large numbers of homologous DNA elements spread across the germ-line genome. For instance, large numbers of transposons can lead to ectopic recombination that disrupts chromosome organization. Surprisingly, population genetics modeling indicates that this ectopic recombination damage limits the proliferation of transposons
even in conventional hosts, where insertion mutations are not phenotypically silent (140). Another limit to unbridled proliferation of transposons in eukaryotic hosts is the inevitable accumulation of transposons with mutant transposase genes. Such mutants can continue to proliferate by, in effect, parasitizing their wild-type sibs, which provide transposase in trans. In prokaryotes, where transcription and translation are coupled, complementation of transposase mutants in trans is generally ineffective (i.e., the transposase protein tends to interact only with the particular transposon that encoded it) (144), such that mutants do not proliferate and mostly “live” (i.e., encoding functional transposase) elements are encountered. In eukaryotes, however, population dynamics modeling predicts that following the introduction of a live element into a new genome (gene pool), it proliferates rapidly, but inexorably mutant “dead” elements accumulate until transposition eventually stops completely for lack of a source of competent transposase (142). Experimental support for this proposition derives from the analysis of mariner transposons in insects (135, 153). Pairwise sequence comparisons of mariner transposase genes cloned from insect populations show that no selection has been operating to remove nonsynonymous mutations, which build up to levels commensurate with those of synonymous mutations. (Selection for transposase function is, however, imposed at the time of horizontal transfer of an element into a new genome: if the transposon is mutant it will be lost; if it is competent it will bloom) (135, 153). Thus, by our terminology, eukaryotic transposons typically undergo an invasion, bloom, and fade (IBF) progression.
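The bloom-and-fade behavior predicted for eukaryotic transposons can be sketched with a deliberately simple deterministic model. This is an illustrative assumption, not the published population model cited above: transposase is made only by live copies but acts in trans on every copy, so each copy is duplicated at a rate proportional to the live fraction, and live copies are inactivated by mutation at a constant rate. The rate values and all names are hypothetical.

```python
def simulate_bloom(live=1.0, dead=0.0, transposition=0.1,
                   inactivation=0.01, generations=1000):
    """Track expected copy numbers of live and dead elements.

    Returns a list of (live, dead, live_fraction) per generation.
    Hypothetical toy model: duplication of any copy depends on the
    fraction of copies still able to supply transposase in trans.
    """
    history = []
    for _ in range(generations):
        live_fraction = live / (live + dead)
        history.append((live, dead, live_fraction))
        per_copy_rate = transposition * live_fraction
        new_live = live * per_copy_rate
        new_dead = dead * per_copy_rate
        mutated = live * inactivation
        live = live + new_live - mutated
        dead = dead + new_dead + mutated
    return history


history = simulate_bloom()
fractions = [fraction for _, _, fraction in history]
totals = [live + dead for live, dead, _ in history]
print(totals[-1], fractions[-1])
```

In this sketch the total copy number rises and then levels off as the live fraction, and with it the per-copy transposition rate, falls toward zero, which mirrors the invasion, bloom, and fade sequence described in the text.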
2. OPPORTUNITY FOR TRANSPOSONS TO GET OUT OF THE WAY PROVIDED BY THE CILIATE GERM-LINE/SOMA SYSTEM
We suggest that the adoption of nuclear dimorphism by ciliates has relieved some of the usual constraints on transposition, so that they have become ideal hosts for transposons that can precisely remove themselves, or be removed, from the developing macronucleus (somatic nucleus) prior to dependence on that nucleus for gene expression. Such elements depend on vertical or sexual transmission through the germ-line micronuclei for their continued presence in the host, but serve themselves no good by remaining in the somatic macronucleus, a dead-end nucleus, the genes of which are not transmitted to sexual offspring. The optimal strategy for an element in such a setting is to remove itself from the somatic nucleus, affording this nucleus full functionality and affording the host, and the element copies in the germ line, unencumbered fitness. Examples of employment of such a strategy are seen in other organisms
with life cycles that, by analogy, employ a germ-line/soma division of labor. In Bacillus subtilis, starvation induces sporulation, generating a persistent spore that will carry its genes through adverse conditions until a time when proliferation is again possible. Induction of sporulation leads to the differentiation of two daughter chromosomes and their surrounding cytoplasm into the spore and the spore-mother cell. Once the spore is assembled, the spore-mother cell lyses, hence serving a dead-end somatic role in support of the germ-line spore (see 143). A large, ~42-kbp DNA element interrupts the gene for developmental transcription factor σK, which functions only in the spore-mother cell. At the time of σK expression, the interrupting element recombines out of the σK gene in the spore-mother genome, but not out of the forespore genome, by expressing a site-specific recombinase gene carried on the element (4, 5, 14). In this way, the germ-line mutation represented by insertion of the element into the σK gene is somatically reverted, restoring fitness to the host and to the element itself. An analogous situation occurs in the blue-green, nitrogen-fixing bacterium Anabaena, which grows in chains of cells. Most cells remain proliferative (germ-line analogs), but an occasional cell terminally differentiates into a heterocyst incapable of further proliferation (a somatic cell analog) but able to fix nitrogen and pass the products along to its sibling cells in the chain. At least three genes in the heterocyst cell are converted to functional forms by the excision of large DNA elements (3, 15, 144). Mobile introns and inteins provide further illustrations of this get-out-of-the-way strategy (145, 146). Mobile introns remove themselves not from the DNA (germ-line analog) but from the set of genomic transcripts (soma analog) prior to translation.
Mobile inteins remove themselves from the set of genome-encoded proteins after translation and before the proteins function, in which case the genes are the germ-line analog and the protein collection is the somatic analog. By these self-splicing eliminations these elements restore functionality to the information they interrupt, increasing their own fitness by restoring the fitness of the DNA that carries them. The ciliate transposons (and short IESs) can be viewed in this light. They are able to get out of the way of the ciliate host genes before they are expressed. A clear consequence for a transposon is the ability to generate more copies of itself by transposition into further soma-specifically expressed genes, the resulting mutations being phenotypically silent, because they are somatically reverted during macronuclear development. That most ciliate genes are not expressed directly from the germ line means that the number of “safe” transpositional targets is essentially unlimited. However, to the extent that genes are directly expressed from the germ line (from the micronucleus in vegetative cells, or from the zygotically derived nuclei prior to maturation of the macronuclear anlage), the “license” to transpose indiscriminately would be somewhat limited.
3. TBE TRANSPOSONS: MAINTENANCE OF TRANSPOSON FUNCTIONS IN CILIATE BLOOMS
As discussed previously, one of the other limits on eukaryotic transposon proliferation is the inevitable accumulation of mutations in transposon family members. We propose that the accumulation of such mutations is delayed for the ciliate transposons. This is suggested by the features of the TBE elements present in O. fallax and O. trifallax. These elements are actively transcribed during macronuclear development (K. R. Williams and G. Herrick, unpublished). Moreover, as discussed in Section II,A, divergence analysis of TBE1s indicates that they have been diverging from one another since their creation (by transposition) under a purifying selection against mutations that compromise the functions of their three encoded proteins: transposase, zinc finger/kinase, and 22-kDa protein (D. Witherspoon, T. G. Doak, K. R. Williams and G. Herrick, unpublished). What might be the source of this selection? The most obvious first guess might be that selection is for maintenance of transposition function. However, note that TBE1s are transposons that function in a eukaryotic host, and they must “share” their proteins because proteins are synthesized away from the site of their genes and do not function in cis. Hence, as seen with mariner elements (see above), no selection is expected to maintain transposition function. A proposed source for the selection of TBE1 genes is that they encode excision proteins and that sufficient developmental excision power must be maintained to remove all TBE1s within a developing macronucleus. Under this communal excisase model, all transposons in an organism contribute to a pool of excisase proteins responsible for the developmental excision of all family members.
Selection would then be directed not at each individual TBE1, but instead at the Oxytricha host and the aggregate competence of its complement of TBE1s to provide sufficient doses of “excisase” to assure that the cell emerges into vegetative life with a new macronucleus that has been fully purged of all TBE1s during a limited time of development. Recall that most macronuclear DNA is genic, and a macronuclear gene interrupted by a TBE1 would probably not function, so the fitness of the cell relying on that gene would be reduced. The population of TBE1s to be excised is a mixture of excisase providers and excisase mutants. Cells that inherit an inadequate mix of TBE1s will develop incompletely purified macronuclei, compromising their abilities to proliferate and transmit their (inadequate) mixes of TBE1s to sexual progeny. Thus selection leads to the preferential transmission of excisase providers, and tends to eliminate excisase-defective mutants.
Intuition may suggest that the strength of such a selection is inadequate to keep TBE1 genes functional. This issue was explored using a standard population genetics treatment of the implied dynamics (J. Seger and G. Herrick, unpublished), making necessary but sensible assumptions about the mutation rate and the number of critical nucleotides per element. The results show that even a quite weak cost to host fitness caused by each such TBE1 mutation is sufficient to maintain a large fraction of TBE1s in an excisase-competent state. The strength of the selection relies inversely on how many TBE1s one excisase can excise, and on how many elements must be excised. If excisase is transposase, one functional unit of transposase may not function more than once, given that transposases have evolved only to act once, because transposition events are separated by intervals dramatically exceeding the lifetime of any protein molecule. TBE1 transcription may be quite inefficient, so that each element can only direct the synthesis of a limited number of molecules before it is excised. Finally, the total excision load for TBE1 excisase might include either or both of the other TBE families, TBE2s and TBE3s. Thus it seems entirely plausible that the selection acting to eliminate excisase-defective TBE1s might be strong enough to keep defectives at low frequencies, even in the face of a fairly high mutation rate.
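The claim that a weak fitness cost can hold defective elements at low frequency can be illustrated with the textbook haploid mutation-selection recursion. This is not the unpublished treatment cited above; it is a generic sketch in which each element is treated as a single haploid locus, a defective element reduces fitness by a fixed cost, and competent elements become defective at a fixed per-generation rate. The parameter values and names are hypothetical.

```python
def defective_fraction(mutation_rate, cost, start=0.5, generations=5000):
    """Iterate selection followed by mutation on the defective fraction q.

    Competent elements have relative fitness 1 and defective elements
    1 - cost; competent survivors become defective at mutation_rate.
    Hypothetical textbook model, used only to illustrate the argument.
    """
    q = start
    for _ in range(generations):
        mean_fitness = 1.0 - cost * q
        q = (q * (1.0 - cost) + (1.0 - q) * mutation_rate) / mean_fitness
    return q


# A 1% fitness cost against a mutation rate of 1e-4 per element per generation.
equilibrium = defective_fraction(mutation_rate=1e-4, cost=0.01)
print(equilibrium)
```

In this recursion the defective fraction settles at the ratio of mutation rate to cost, so defectives stay rare whenever the cost, however small, is well above the rate at which new defectives arise.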
C. Abdication and Fading; TA Internal Eliminated Sequences
1. A SELECTION FOR THE HOST TO ASSUME EXCISION FUNCTIONS
In the “bloom” TBE1 phase just described, the host suffers lost fitness due to the presence of the TBE1s, despite their regular excision during macronuclear development. The selection that keeps TBE1s functional is exercised by the occasional fitness failure or death of a host. This would in effect create a selection for a host genetic innovation that would wrest the responsibility of excision from the elements. One obvious such innovation would be for the host to capture the transposon genes specifying excision functions and place them under the control of a strong host promoter, such that excisase would not be limiting. Once this innovation went to fixation, the pressure on the transposons to keep their own genes functional would be relieved, and one would expect mutant elements to accumulate in the family, leading eventually to the full shrinkage of all members to a minimum size that retains those few sequences required in cis for developmental excision. That is, the adoption by the host of the role of excisase provider would lead rapidly to the degeneration of the family of full-length, excisase-competent elements into a family of short IESs, all excised by the host function. Thus, ciliate transposons like TBE1s are predicted to be poised to undergo the final two stages of the IBAF progression, to abdicate and to fade.
The situation described is analogous to that proposed for the evolutionary progression of group II self-splicing introns to spliceosomal introns (134). It has been suggested that the internal sequences of the group II introns that are involved in self-splicing have been broken into segments and transformed into host genes encoding the snRNAs that function in the spliceosome. Once the “host” became able to provide for intron excision in trans, the group II introns were no longer under selection for maintenance of splicing functions, and degenerated into the current spliceosomal introns. In the ciliate case, it should also be noted that besides the capture of excisase genes from a new transposon, the host might gain excisase function by the mutational modification of a transposase gene from some other resident transposon, such that the transposase now has specificity for the new transposons, and performs developmental excision instead of transposition. Alternatively, at the time of the initial invasion, the host might already have an excisase gene, as the result of a previous IBAF progression, generated to deal with a previous invasion of a related transposon. The expression of this gene need only be increased to handle the load of the new transposon as that family expands. In other words, under these types of situations, the new transposon invader would not go through the proposed IBAF progression, but instead would go through the IBF (invasion, bloom, and fade) progression that is more typical of eukaryotic transposons.
2. ABDICATION AND FADE IN Euplotes
The E. crassus Tec elements can be viewed as a system where the transposon is no longer responsible for its own excision. Unlike TBE1s, the Tecs clearly cannot encode their own excisase, because the necessary transcripts cannot be detected (119). Also, Tec genes from various elements have been sequenced, and in many cases the genes show mutational damage (stops, short insertions and deletions) inconsistent with gene expression (M. Krikau and C. Jahn, unpublished). Consistent with this, divergence analyses (D. Witherspoon, unpublished) show the genes have diverged from each other under only some selection for function. Comparison of the divergence values for the Tec transposase and TBE1 transposase, which are homologous (13), illustrates this: the average Ds/Dn ratio (see Section II,A) for TBE1 transposase pairs is ~17, but for Tec1 transposase pairs is ~5 (D. Witherspoon, unpublished). The lowered Ds/Dn values for Tec1 transposase might be interpreted as reflecting an early period in which selection against nonsynonymous mutants was in force, followed by a recent period in which no selection was imposed. What the nature of the implied earlier selection might have been is a subject for speculation. One additional implication is that most Tec1s may no longer be transpositionally active. Besides the Tec transposons, Euplotes has short TA IESs with termini that
resemble the Tec termini, and which appear to be excised by a similar mechanism. One interpretation of this is that the Tecs have not only abdicated excision functions, but some have also significantly faded to form the short TA IESs. This scenario is probably too simple. One would expect to see some Tecs with internal deletions, headed toward degeneration into short TA IESs, but no shortened Tecs have been reported. This suggests that Tecs have just recently entered a “fade” phase. (The pace of accumulation of neutral, short deletions in pseudogenes has been measured recently in primates (147), but no such information is available for ciliates.) If Tecs do not seem to have progressed that far into the “fade” process, a more attractive explanation is that the current short IESs are the remnants of a previous IBAF progression involving a TA repeat transposon. As noted above, this early IBAF progression would have left the host with a TA IES excision system that would have facilitated the more recent bloom of Tec elements. That is, the initial TA transposon would have proceeded to the point where the host has taken over the excision function, allowing the initial elements to degenerate to the current short TA IESs. With the excision system now in place, the establishment of secondary TA transposon invaders would be facilitated.
D. Tetrahymena and IBAF Progression
The hypothesis that IBAF progressions have generated Tetrahymena IESs leads to several difficulties. Several IES sequences in Tetrahymena have been described, but these deletion segments differ in many ways from IESs in other ciliates, even in the sister oligohymenophoran Paramecium. These IESs also show very little similarity to each other, and generally do not resemble transposons, although some appear to be members of repetitive element families. There is also imprecision, or heterogeneity, in the IES excision process that apparently limits IESs to noncoding regions. These features might be viewed as an indication that there have been no recent transposon invasions in Tetrahymena and that the currently observed IESs represent a small subset of heavily weathered and faded relics of ancient transposon invaders. While attractive, there are still some difficult aspects to this suggestion. For example, if Tetrahymena has been able to rid its coding regions of IESs, and no new transposon invasions have occurred, one expects that there would be no selection for the maintenance of the excision system and that it would rapidly degenerate. In light of this, the extant excision system in Tetrahymena might be explained in a number of ways. First, the IESs may serve some as yet undefined function in the micronuclear genome. Second, the excision machinery may have evolved to serve a dual function, perhaps playing a role in a cellular process such as DNA repair. Third, at least some
DNA DELETION IN CILIATES
IESs may reside in functional regions of the genome, so that their continued excision is a requirement for host viability. Such a subset of IESs could be present in coding regions, but also might interrupt regulatory regions that function in the macronucleus. An additional question that arises is why Tetrahymena has not been subject to more recent transposon invasions. It has been suggested (16) that Tetrahymena might be devoid of recently IBF/IBAF-generated IESs, because it evolved a reliance on early expression of anlage genes needed to complete conjugation and macronuclear development (148, 149). This would mean that such genes are not “safe havens” for new insertions, which would limit the blooms of excision-capable elements. This explanation is not entirely satisfying, as another oligohymenophoran species, Paramecium, appears to have numerous IESs within coding regions, yet also appears to rely on early zygotic gene expression (reviewed in 150). It has also been noted that many of the characterized Tetrahymena IESs carry sequences that are members of small micronuclear repetitive element families (see Section II,B). This might be viewed as evidence that these IESs still retain significant sequence similarity to their founding transposon invader, rather than being heavily weathered remnants. However, this aspect of the Tetrahymena IESs may be misleading. If IESs are indeed nonessential sequences, they can be expected to accumulate copies of mobile DNA elements during the course of evolution (20). That is, transposition events into IESs could have generated the association of repetitive sequences with IESs. The transposons responsible might have been conventional, lacking excision capability. Their insertions into macronuclear-destined sequences would have been selected against, but insertion into IESs would lead to no phenotype.
Thus, members of such nonexcising transposon families would have survived only in the shelter of IESs (or in nonfunctional regions of macronuclear-destined DNA), and the repetitive sequences that are now parts of Tetrahymena IESs would have played no role in the generation of the IESs. Such a view may also help explain some confusing aspects of the association of repetitive DNA with DNA deletion segments in Tetrahymena. For example, only some members of the repetitive sequence families associated with the Tel-1 elements are eliminated during macronuclear development (85). These repetitive sequences may have simply become adventitiously associated with IES sequences as just described, and might not represent sequences capable of being independently excised. Finally, one objection to the transposon origin of IESs in Tetrahymena arises from the demonstration that sequences external to the M IES are necessary for its excision. It might be expected that all sequences necessary for excision of a transposon IES would lie within the element (19, 78). It has been noted, however (49), that if IES excision is mechanistically related to the reverse of transposition, then flanking sequences might indeed play a role in excision. Sequence-specific target sites are seen for TA transposons, including the Tec IES elements (reviewed in 13), for the TBE1 IES elements (49), and for a handful of other elements, including Tn10 (51) and Tn7 (55). On interruption by a new element the target sequences become “split.” The specific target site bound by the Tn7 TnsD protein, attTn7, is an especially relevant example, being a discrete sequence lying somewhat distant from the insertion point (55). If the IES excision machinery is transposase related, it is not unreasonable to assume that some of the same sequence interactions required for insertion might also be involved in excision. Indeed, such interactions are desirable for an excision process, because they provide a means of “holding” the flanking macronuclear-destined sequences in the excision complex, so that they ultimately may be joined (49). Thus, sequences such as the A5G5 tracts flanking the Tetrahymena M IES may represent the original transposition target, and they now play a significant role in specifying the sites of excision.
E. Phylogenetic Distribution of Internal Eliminated Sequences in Ciliates
Figure 9 summarizes the current state of IESs in Tetrahymena, Paramecium, Euplotes, and oxytrichids. Also shown is the phylogeny of the extant ciliates with a series of proposed IBAF or IBF progressions along earlier branches, in an attempt to explain the present distribution of IESs. The most broadly distributed IES types are the various TA IESs, including the Tec transposons. Such elements represent the major types of IESs present in E. crassus and Paramecium, and appear to represent a subset of the short IESs present in at least some oxytrichids. We explain this broad distribution by first postulating an early invasion of a TA transposon into a common ciliate ancestor and a subsequent IBAF progression. This provides the ciliate ancestor with a machinery that facilitates secondary invasions of other TA transposons. Secondary invasions of TA transposons are required to explain the fact that some extant lineages contain numerous short TA IESs, others contain few TA IESs, and still others contain relatively intact TA transposons. Thus, we postulate that relatively late secondary TA transposon invasions in the Euplotes lineage are responsible for the current Tec1 and Tec2 elements, whereas more distant secondary invasions of TA transposons are responsible for the short IESs in both Euplotes and Paramecium. The oxytrichid and Tetrahymena lineages are viewed as having few or no TA IESs, respectively. The few TA IESs remaining in the oxytrichids could be the remnants of the primary invasion. An alternative view for the origin of the TA IESs in the various lineages is that multiple IBAF cycles have occurred independently and at various
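[Fig. 9 graphic not reproduced: a phylogenetic tree of Tetrahymena, Paramecium, Euplotes, and the oxytrichids, above a chart with rows for short TA IESs, other short IESs, Tecs, and TBEs; the chart entries (few, many, various, nd, -) cannot be reliably assigned to clades from the extracted text.]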
FIG. 9. Phylogenetic distribution of ciliate IESs and hypotheses for their origins. The tree is derived from the 28S rRNA gene sequence tree of Tourancheau et al. (138); relative branch lengths have been preserved (the scale bar represents 1% divergence). Tetrahymena was represented by the rDNA of Tetrahymena thermophila; Paramecium, by Paramecium primaurelia; Euplotes, by Euplotes aediculatus; oxytrichids, by Stylonychia lemnae (the evolutionary events represent primarily those leading to extant IESs in O. fallax and O. trifallax). Proposed invasions by transposons (transposable elements, Tpn's) are indicated by the arrows. The chart below the tree summarizes the types of IESs known in the four clades indicated (nd, not determined; -, not observed in hybridization analyses). Note that the absence of TBEs in Euplotes crassus is based on the absence of hybridization with only TBE1-derived probes (K. R. Williams, L. A. Klobutcher and G. Herrick, unpublished). C. L. Jahn (unpublished) has examined oxytrichid DNAs for cross-hybridization with Tec element probes.
times in the different lineages. One factor suggesting this alternative scenario is the ease with which TA transposons may be able to independently invade and bloom in ciliate hosts. The TA transposons, like Tec, are related to the widely distributed set of transposons in the Tc1/mariner/IS630 superfamily, many with TA target site specificity (reviewed in 13). Mariner has been successfully horizontally transferred repeatedly (135, 153), and mariner and Tc1 show a propensity for sporadic excision in somatic tissues of their eukaryotic hosts (151, 152). Such an ability would be of obvious benefit on transfer into a ciliate host.
The various Oxytricha TBE elements are indicated as resulting from recent invasions (Fig. 9), consistent with the intactness of the elements and the proposal that excision functions have not yet been abdicated to the host in the case of TBE1. In addition, the divergence distances between individual TBE1s are consistent with the implication that the founding TBE1 invaded a recent ancestor of O. fallax and O. trifallax (D. Witherspoon and T. G. Doak, unpublished). Independent invasions of TBE1, TBE2, and TBE3 are shown (Fig. 9), as opposed to a single TBE element invasion followed by diversification within the ciliate genome, because the former scenario is more consistent with the observed degrees of sequence diversity between TBE families. The current data also suggest that multiple classes of smaller IESs exist within the oxytrichids and Tetrahymena that are apparently unrelated to either TA IESs or TBE transposons. These are envisioned as having resulted from earlier invasions of a number of distinct transposons within these lineages, which have subsequently degraded into the short IESs of the extant species.
F. Future Directions
The origin of ciliate DNA deletion systems via repeated transposon IBAF and IBF progressions is consistent with much of the current data, but alternative scenarios have also been suggested (73, US). Clearly, further research is required. For example, the proposal that DNA deletions are catalyzed by transposon-encoded or transposon-derived proteins requires further testing. Thus, further studies on the mechanism of DNA deletion, and the isolation and characterization of the excision machinery, are highly desirable in this regard. Such studies may also serve to identify novel mechanisms of site-specific recombination, or significant variations on known mechanisms (e.g., transposition). Further studies on the diversity of the DNA deletion phenomena in ciliates would also be valuable. Only a few distantly related species have been examined at the molecular level, making it difficult to draw firm conclusions regarding the origins of various IESs. Examination of additional species would serve to both refine the proposed phylogeny and test the IBAF progression hypothesis. Also, a thorough search for orthologous IESs in different species would help define precisely when the various proposed transposon invasions occurred. Whether or not IBAF progressions are responsible for the origin of ciliate DNA deletion, transposons have clearly become involved in the process. As such, the ciliates provide an excellent opportunity to study the coevolution of transposons and their hosts. Moreover, further studies of the systems may provide insights into the origin of the more limited developmental DNA deletion processes observed in a wide range of species.
ACKNOWLEDGMENTS
This work was supported by NSF Grant MCB-9414416 to LK, and NIH Grant GM25203 to GH. The authors thank the many workers in the field who provided reprints and preprints and communicated unpublished results. We also thank Ann Cowan, Carolyn Jahn, and Mary Ellen Jacobs for their comments on the manuscript, as well as Jon Seger, David Witherspoon, and Tom Doak, who also made significant contributions to the development and refinement of the presented ideas and hypotheses about the evolution of transposons and short IESs.
REFERENCES 1 . A. V. Matveyev, E. Rutgers, E. Soderback and B. Bergman, FEMS Microbiol. Lett. 116,201 (1994). 2. R. Haselkom, in “Mobile DNA” (D. E. Berg and M. M. Howe, eds.),p. 735. American Society for Microbiology,Washington, D.C., 1989. 3. C. D. Carrasco, J. A. Buettner and J. W. Golden, PNAS 92,791 (1995). 4. B. Kunkel, R. Losick and P. Stragier, Genes Deu. 4,525 (1990). 5. P. Stragier, B. Kunkel, L. Kroos and R. Losick, Science 243,507 (1989). 6. M. R. Lieber, FASEBJ. 5,2934 (1991). 7 . D. H. Dreyfus, Mol. lmmunol. 29,807 (1992). 8. H. Sakano, K. Huppi, G. Heinrich and S. Tonegawa, Nature (London)280,288 (1979). 9. A. E. Gorbalenya, Rot. S d . 3, 1117 (1994). 9a. C. B. Thompson, lmmunity 3 , 5 3 1 (1995). 9b. D. C. van Gent, K. Mizuuchi and M. Gellert, Sdence 271,1592 (1996). 10. K. Mizuuchi, ARB 61, 1011 (1992). 11. N. L. Craig, Science 270,253 (1995). 12. N. D. F. Grindley and A. E. Leschziner, Cell 83,1063 (1995). 13. T. G. Doak, F. P. Doerder, C. L. Jahn and G. Hemck, PNAS 91,942 (1994). 14. T. Sato, Y. Samori and Y. Kobayashi, J. Bact. 172,1092 (1990). 15. C. D. Carrasco, K. S. Ramaswamy, T. S. Ramasubnnanian and J. W. Golden, Genes Dev. 8, 74 (1994). 16. G. Hemck, Seminars Deo. Biol. 5 , 3 (1994). 17. L. A. Klobutcher and C. L. Jahn, Cum. @in. Genet. Deo. 1,397 (1991). 18. D. M. Prescott, Microbiol. Rev. 58,233 (1994). 19. M.-C. Yao, Trends Genet. l2,26 (1996). 20. M.-C. Yao, in “Mobile DNA” (D. E. Berg and M. M. Howe, eds.), p. 715. American Society for Microbiology,Washington, D.C., 1989. 21. E. H. Blackburn and C. W. Greider, eds., “Telomeres.” Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1995. 22. J. G. Gall, ed., “The Molecular Biology of Ciliated Protozoa.” Academic Press, Orlando, 1986. 23. G . Steinbruck and M. Schlege1,J. Protozool. 30,294 (1983). 24. D. Ammermann and M. Schlegel,J. Protozool. 30,290 (1983). 25. A. H. Knoll, Science 256,622 (1992). 26. A.-D. G. Wright, MS Thesis, University of Guelph. 
Guelph, Ontario, Canada, 1993. 27. D. Ammermann, G. Steinbruck, L. von Berger and W. Hennig, Chromosomu 45, 401 (1974). 28. M. Roth and D. M. Prescott, Cell 41,411 (1985).
29. S. L. Tausta and L. A. Klobutcher, NARes 18,845 (1990). 30. H. Kraut and H. J. Lipps, in “Advances in Invertebrate Reproduction 3” ON.Engels, ed.), p. 533. Elsevier, Amsterdam, 1984. 31. M. R. Lauth, B. B. Spear, J. Heumann and D. M. Prescott, Cell 7 , 6 7 (1976). 32. G. L. Yu, J. D. Bradley, L. D. Attardi and E. H. Blackburn, Nature (London)344,126 (1990). 33. J. R. Vermeesch and C. M. Price, MCBioll4,554 (1994). 34. M. C. Yao, C. H. Yao and B. Monks, Cell 63,763 (1990). 35. S. E. Baird and L. A. Klobutcher, Genes Dew. 3,585 (1989). 36. J. L. Mitcham, A. J. Lynn and D. M. Prescott, Genes Dev. 6,788 (1992). 37. A. F. Greslin, D. M. Prescott, Y. Oka, S. H. Loukin and J. C. Chappel, PNAS 86,6264 (1989). 38. L. A. Klobutcher, C. L. Jahn and D. M. Prescott, Cell 36, 1045 (1984). 39. R. M. Ribas-Aparicio,J. J. Sparkowski,A. E. Proulx, J. D. Mitchell and L. A. Klobutcher, Genes Deu. 1,323 (1987). 40. G. Hemck, D. Hunter, K. Williams and K. Kotter, Genes Dew. 1,1047 (1987). 40a. A. Seegmiller,K. R. Williams, R. L. Hammersmith, T.G. Doak, D. Witherspoon, T. Messick, L. L. Storjohann and G. Hemck, Mol. Biol. Euol., in press (1996). 41. C. Eder, C. Maercker, J. Meyer and H. J. Lipps, Int. J. Dew. Biol. 37,473 (1993). 42. P. Bierbaum, T. Donhoff and A. Klein, Mol. Microbiol. 5, 1567 (1991). 43. Y. Oka and T. Honjo, NARes 11, 4325 (1983). 44. L. A. Klobutcher, PNAS 92,1979 (1995). 45. S. E. Baird, G. M. Fino, S. L. Tausta and L. A. Klobutcher, MCBiol9,3793 (1989). 46. C. A. Hale, M. E. Jacobs, H. G. Estes, S. Ghosh and L. A. Klobutcher, J. Euk. Miwobiol. 43,389 (1996). 47. S. L. Tausta and L. A. Klobutcher, Cell 59,1019 (1989). 48. G. Hemck, S. Cartinhour, D. Dawson, D. Ang, R. Sheets, A. Lee and K. Williams, Cell 43, 759 (1985). 49. K. Williams, T. G. Doak and G. Herrick, EMBOJ. 12,4593 (1993). 50. D. E. Berg and M. M. Howe, eds., “Mobile DNA.” American Society for Microbiology, Washington, D.C., 1989. 51. S. M. Halling and N. Kleckner, Cell 28,155 (1982). 52. F. Dyda, A. 
B. Hickman, T. M. Jenkins, A. Engelman, R. Craigie and D. R. Davies, Science 266, 1981 (1994). 53. T. A. Baker and L. Luo, PNAS 91,6654 (1994). 54. R. Rezsohazy, B. Hallet, J. Delcour and J. Mahillon, Mol. Microbiol. 9,1283 (1993). 55. N. L. Craig, Cuw. Topics Microbiol. Immunol. 204,27 (1996). 56. K. R. Williams and G. Herrick, NARes 19,4717 (1991). 57. D. J. Hunter, K. Williams, S. Cartinhour and G. Hemck, Genes Dm. 3,2101 (1989). 58. K. Knecht and L. A. Klobutcher, Eur. J. Psotistol. 31,201 (1995). 59. C. L. Jahn, M. F. Krikau and S. Shyman, Cell 59,1009 (1989). 60. C. L. Jahn, S. Z. Doktor, J. S. Frels,J. W. Jaraczewskiand M. F. Krikau, Gene 133,71(1993). 61. J. W. Jaraczewski and C . L. Jahn, Genes Dev. 7,95 (1993). 62. C. L. Jahn, L. A. Nilles and M. F. Krikau, J. Protozool. 35,590 (1988). 63. M. F. Krikau and C. L. Jahn, MCBioZ 1 1 4751 (1991). 64. M. Chandler and 0. Fayet, Mol. Microbiol. 7,497 (1993). 65. J. S. Frels, C. M. Tebeau, S. Z. Doktor and C. L. Jahn, MoZ. Biol. CelE 7,755 (1996). 66. J. S. Frels and C. L. Jahn, MCBioll5, 6488 (1995). 67. C. L. Jahn, Protozool. 38,252 (1991). 68. M. E. Jacobs and L. A. Klobutcher, J. Euk. Microbiol. 43,442 (1996). 69. M. C. Yao, J. Choi, S. Yokoyama, C. l? Austerberry and C. H. Yao, Cell 36,433 (1984). 70. R. C. Callahan, G. Shake and M. A. Gorovsky, Cell 36,441 (1984).
71. M. Katoh, M. Hirono, T. Takemasa, M. Kimura and Y. Watanabe, NARes 21,2409 (1993). 72. T. Y. K. Heinonen and R. E. Pearlman,]. Biochern. 269,17428 (1994). 73. J. M. Wells, J. L. Ellingson, D. M. Catt, P. J. Berger and K. M. Karrer, MCBiol 14, 5939 (1994). 74. E. A. Howard and E. H. Blackburn, MCBiol5,2039 (1985). 75. M.-C. Yao and M. A. Gorovsky, Chrornosorna 4 8 , l (1974). 76. C. F. Austerbeny, C. D. Allis and M. C. Yao, PNAS 81,7383 (1984). 77. C. F. Austerbeny and M. C. Yao, MCBiol 8,3947 (1988). 78. R. Godiska and M. C. Yao, Cell 61,1237 (1990). 79. R. Godiska, C. James and M. C. Yao, Genes Dev. 7,2357 (1993). 80. C. F. Austerbeny and M. C. Yao, MCBiol7, 435 (1987). 81. C. F. Brunk, S. G . S. Tsao, C. H. Diamond, P. S. Ohashi, N. N. G. Tsao and R. E. Pearlman, Can]. Biochem. 60,847 (1982). 82. A. H. Tschunko, R. H. Loechel, N. C. McLaren and S. L. Men, Genetics 117,451 (1987). 83. M. C. Yao,]. Cell Biol. 92, 783 (1982). 84. T. C. White, G . M. El-Genely and S. L. Allen, MGG 201, 65 (1985). 85. J. M. Cherry and E. H. Blackburn, Cell 43, 747 (1985). 86. C. Wyman and E. H. Blackbum, Genetics l29,57 (1991). 87. B. Hoffman-Liebermann, D. Liebermann and S. N. Cohen, in “Mobile DNA” (D. E. Berg and M. M. Howe, eds.), p. 575. Ameiican Society for Microbiology, Washington, D.C., 1989. 88. L. B. Preer, G. Hamilton and J. J. Preer,]. Protozool. 39,678 (1992). 89. E. Meyer and A.-M. Keller, Genetics 143, 191 (1996). 90. J. Scott, C. Leeck and J. Fomey, NARes 22,5079 (1994). 91. C. J. Steele, G . G . Barkocy, L. B. Preer and J. J. Preer, PNAS 91, 2255 (1994). 92. L. Amar,]MB 236,421 (1994). 93. S. Duharcourt, A. Butler and E. Meyer, Gmes Deu. 9,2065 (1995). 94. L. A. Klobutcher and G. Hemck, NARes 23,2006 (1995). 95. E Bourgain-Guglielmetti and F. Caron,J. Euk. MicrobioZ. 43,303 (1996). 96. S. L. Tausta, L. R. Turner, L. K. Buckley and L. A. Klobutcher, NARes 19,3229 (1991). 97. L. A. Klobutcher, L. R. Turner and J. LaPlante, Genes Dev. 7,84 (1993). 98. J. R. 
Scott and G. G. Churchward, Annu. Rev. Mierobiol. 49,367 (1995). 99. P. Polard, M. F. PrBre, 0.Fayet and M. Chandler, EMBO]. 11,5079 (1992). 100. S. V. Saveliev and M. M . Cox, NARes 22, 5695 (1994). 101. M. C. Yao and C. H. Yao, NARes 22,5702 (1994). 102. C. F. Austerbeny, R. 0. Snyder and M. C. Yao, NARes 17,7263 (1989). 103. R. W. Yokoyama and M. C. Yao, Claaniosonuz 8 5 , l l (1982). 104. S. V. Saveliev and M. M. Cox, Genes Dev. 9,248 (1995). 105. S. V. Saveliev and M. M. Cox, E M B O J. 15,2858 (1996). 106. M. C. Yao and C. H. Yao, MCBioZ 9,1092 (1989). 106a. J.-P. Wen, C. Eder and H. J. Lipps, NARes 23, 1704 (1995). 107. A. D. Radice, 8. Bugaj, D. H. A. Fitch and S. W. Emmons, MGG 244,606 (1994). 108. S. Henikoff, New Biol. 4,382 (1992). 109. J. Collins, E. Forbes and P. Anderson, Genetics l21,47 (1989). 110. G. Herrick, S. W. Cartinho~~, K. R. Williams and K. P. Kotter,]. Protozool. 34,429 (1987). 111. D. W. Martindale and P. J. Bnins, MCBiol 3, 1857 (1983). 112. M. B. Rogers and K. M. Karrer, Dec. B i d . 131,261 (1989). 113. S. Ghosh, NARes 24, 795 (1996). 114. D. W. Martindale, NARes 18,2953 (1990). 115. M. T. Madireddi, M. C. Davis and C. D. Allis, Dev. Biol. 165,418 (1994).
116. M. T. Madireddi,J. F. Smothers and C . D. U s , Sem Deu. Bid. 6,305 (1995). 117. E. S. Cole, Deu. Bid. 148,403 (1991). 118. E. S. Cole and J. Frankel, Deu. BWZ. 148,420 (1991). 119. J. W. Jaraczewski,J. S. Frels and C. L. Jahn, NARes 22,4535 (1994). 120. T. Sugai and K. Hiwatashi,J. Protozoal. 21,542 (1974). 121. D. W. Martindale, C. D. Allis and P. J. Bruns,]. Psotowol. 32,644 (1985). 122. Y. Brygov and A.-M. Keller, Deu.Genet. 2,13 (1981). 123. D. L. Chalker and M.-C. Yav, MCBioll6,3658 (1996). 124. W. R. Engels, D. M. Johnson-Schlitz,W. B. Eggleston and J. Sved, Cell 62,515 (1990). 125. R. H. A. Plasterk, EMBOJ. 10,1919 (1991). 126. E. H. Blackburn and K. M. Karrer, Annu. Reu. Genet. 20,501 (1986). 127. C. M. Price, A. K. Adams and J. R. Venneesch,J. Euk. Microbial. 41,267 (1994). 128. F. Caron,JMB 225,661 (1992). 129. J. D. Forney and E. H. Blackbum, MCBiol 8,251 (1988). 130. A. Baroin, A. Prat and F. Caron, NARes 15,1717 (1987). 131. M. McKevwn, Annu, Reo. Cell BWE. 8,133 (1992). 132. D. M. Prescott and A. F. Greslin, Dev. Genet. l3,66 (1992). 133. P. Huvos, Genetics 141, 925 (1995). 134. P. A. Sharp, Science 254,663 (1991). 135. H. M. Robertson and D. J. Lampe, Mol. Bid. Euol. 12,850 (1995). 136. M. G. Kidwell, Cum. Opin. Genet. Deu. 2,868 (1992). 137. M. A. Houck, J. B. Clark K. R. Peterson and M. G. Kidwell, Science253,1125 (1991). 138. A. B. Tourancheau, N. Tsao, L. A. Klobutcher, R. E. Pearlman and A. Adoutte, EMBOJ. 14,3262 (1995). 140. C. H. Langley, E. Montgomery, R. Hudson, N. Kaplan and B. Charlesworth, Genet. Res. 52,223 (1988). 141. K. M. Derbyshire, M. Kramer and N. D. Grindley, PNAS 87,4048 (1990). 142. N. Kaplan, T. Darden and C.H. Langley, GeneEics 109,459 (1985). 143. M. B. Yarmolinsky, Science 267,836 (1995). 144. J. W. Golden, S. J. Robinson and R. Haselkom, Nutwe 3 14,419 (1985). 145. R. Saldanha, G. Mohr, M. Belfort and A. M. Lambowitz, FASEBJ. 7,15 (1993). 146. M. Belfort, M. E. Reaban, T.Coetzee and J.Z. Dalgaar4J. Bat. 
177,3897 (1995). 147. N. Saitou and S. Ueda, Mol. Biol. Evol. 11, 514 (1994). 148. J. G. Ward and G. Henick, Deu. Biol. 173,174 (1996). 149. J. G. Ward, M. C. Davis, C. D. Allis and G. Herrick, Genetics 140,989 (1995). 150. S. F. Ng, Bwl. Rm. 65,19 0990). 151, D. G. Moerman and R. H. Waterston, in “MobileDNA” (D. E. Berg andM. M. Howe, eds.), p. 537. American Society for Microbiology, Washington, D.C., 1989. 152. D. S . Haymer and J. L. Marsh, Deu. Genet. 6,281 (1986). 153. A. R. Lohe, E. N. Moriyama, D. A. Lidholm and D. L. H a d , Mol. Bid. E d . l2,62 (1995).
DNA Excision Repair Assays
DAVID MU AND AZIZ SANCAR
Department of Biochemistry and Biophysics
University of North Carolina School of Medicine
Chapel Hill, North Carolina 27599

I. In Vitro Assays
A. Nicking/Incision Assay
B. Excision Assay
D. Repair Synthesis Assay
E. Restriction Enzyme Sens…
F. Biological Activity-Trans…
II. In Vivo Assays
B. Mapping of UV Photolesions by the Ligation-mediated Polymerase Chain Reaction
C. Postlabeling
D. Immunological Detection of Photolesions
E. Unscheduled DNA Synthesis and Equilibrium Sedimentation
F. Host Cell Reactivation
III. Conclusion
DNA lesions at specific sites in the genome can cause mutation or induce recombination, and may result in other DNA rearrangement reactions. These changes can ultimately lead to cancers. It has been estimated that 70-80% of cancers are caused by endogenous or exogenous agents that damage DNA (1). Similarly, many drugs used in cancer chemotherapy are DNA-damaging chemicals. Some patients and tumors are not responsive to these drugs, whereas others, after an initial favorable response, become refractory. It has been suggested that elevated DNA repair activity contributes to drug resistance (2). Thus, it is of clinical and scientific significance to understand the molecular mechanisms that repair DNA. Of all the DNA repair mechanisms, nucleotide excision repair is probably the most important in view of the wide variety of DNA lesions that can be acted on by excision repair. In nucleotide excision repair, the damage is removed from DNA in the form of 12-13 nucleotides (prokaryotes) or 24-32 nucleotides (eukaryotes), by dual incisions of the damaged strand through an
Progress in Nucleic Acid Research and Molecular Biology, Vol. 56. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
ATP-dependent multisubunit enzyme system we refer to as excision nuclease (or excinuclease). Defective nucleotide excision repair gives rise to an autosomal recessive hereditary disorder called xeroderma pigmentosum (XP) (3). From cell fusion studies, this disease was found to be genetically heterogeneous and classified into complementation groups A through G (4, 5). Proteins defined by these seven complementation groups are a part of the excision nuclease, which is the operational definition for dual incision activity that requires all seven subunits (6). In recent years, important advances in excision repair have significantly increased our understanding of DNA repair. Detailed accounts of these advances have been documented (5, 7, 74). It is the purpose of this article to review the various tools, i.e., repair assays, used to study excision repair. Some of these assays have been in use for many years and others have been developed recently and have been instrumental in the rapid progress in the enzymology/molecular biology of excision repair. All of the repair assays are broadly classified into two categories, in vitro and in vivo, although in some cases the line between them is blurry. We discuss the theoretical principles of the various assays, their specific use, and their advantages and disadvantages. Because this review is not intended to be a laboratory manual, no attempt is made to describe the technical details of various assays.
I. In Vitro Assays
A. Nicking/Incision Assay
This assay measures the damage-dependent incisions of DNA. The earliest and still widely used version of this method (endonuclease-sensitive site assay) measures the average size distribution of DNA in alkaline sucrose gradients, following treatment with T4 endonuclease V, which incises at the sites of pyrimidine dimers (8). A popular version of the nicking assay is based on the conversion of covalently closed circular, supercoiled plasmid DNA into a nicked, relaxed form. The conversion is commonly monitored by three methods: alkaline sucrose gradient (9), nitrocellulose filter binding (10), and agarose gel electrophoresis (11). Although this nicking assay can be carried out with relative ease, it does not detect incisions at the nucleotide level, nor can it distinguish a repair endonuclease such as T4 endonuclease V from a repair excision nuclease such as the Uvr(A)BC excinuclease of Escherichia coli (12). To circumvent this problem, linear DNA fragments, containing damage randomly distributed throughout the DNA or at a specific position, are labeled only at either the 5' or the 3' end and subjected to the action of repair proteins. The incised products are then analyzed using denaturing
acrylamide gels to visualize the precise incision sites. Although linear DNAs containing damaged nucleotides at random sites [obtained by exposing the DNA to irradiation or to various model carcinogens such as psoralen, 2-(N-acetoxyacetylamino)fluorene (AAAF), (+)anti-benzo(a)pyrene-7,8-dihydrodiol-9,10-epoxide (BPDE), or cisplatin] have been used as substrates (13), uniquely modified DNAs are a better choice because they offer more defined analysis of the excision reaction, such as the order of incision, and provide less ambiguous data.
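As a rough illustration of how the plasmid-relaxation version of the nicking assay can be quantified, the sketch below converts the fraction of plasmid that remains supercoiled into an average number of nicks per molecule. It assumes that nicks are distributed among plasmid molecules according to a Poisson distribution, which is an assumption of this sketch and is not stated in the text; the function names and the example fractions are hypothetical.

```python
import math


def mean_nicks_per_plasmid(fraction_supercoiled: float) -> float:
    """Average nicks per plasmid, assuming Poisson-distributed nicks.

    Under a Poisson model the un-nicked (supercoiled) fraction is exp(-m),
    so the mean number of nicks m is -ln(fraction_supercoiled).
    """
    if not 0.0 < fraction_supercoiled <= 1.0:
        raise ValueError("fraction_supercoiled must be in the interval (0, 1]")
    return -math.log(fraction_supercoiled)


def damage_dependent_nicks(f_damaged: float, f_undamaged: float) -> float:
    """Nicks attributable to damage: damaged substrate minus undamaged control."""
    return mean_nicks_per_plasmid(f_damaged) - mean_nicks_per_plasmid(f_undamaged)


if __name__ == "__main__":
    # Hypothetical gel quantitation: 50% supercoiled for damaged DNA,
    # 90% supercoiled for the undamaged control.
    print(round(damage_dependent_nicks(0.5, 0.9), 3))
```

Subtracting the undamaged control is one simple way to separate damage-dependent incision from background nicking; as the text notes, the readout still cannot locate incisions at the nucleotide level.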
B. Excision Assay
As shown in Fig. 1, nucleotide excision repair is generally considered a two-stage event: damage-guided dual incision (excision) and repair synthesis. Excision assay refers to the method of detecting the damage-carrying oligonucleotide as a result of the first stage. At least three isotopic labeling schemes have been put forth to detect the excised, lesion-containing oligomer: (1) A radiolabel is incorporated in the vicinity of the lesion in a synthetic substrate, such that the released oligomer carries the label and can be resolved on a sequencing gel (14). (2) The substrate is not radiolabeled. However, following the repair reaction, the excised oligomer is radiolabeled by deoxynucleotidyl terminal transferase before separating on a sequencing gel (13, 15). (3) The substrate is not radiolabeled. Following excision, the products are separated on a sequencing gel and the excised fragment is located by Southern hybridization (16). Method 1, employing an internally isotopically labeled DNA substrate, is commonly used in the authors’ laboratory because it is superior to others in
FIG. 1. Overview of nucleotide excision repair. Nucleotide excision repair is conveniently envisioned as a two-step reaction. Step 1 is the dual incision (excision) flanking the lesion (shown as a triangle), resulting in an oligonucleotide carrying the lesion. In step 2, the resynthesis and ligation reactions fill the intermediate containing a single-stranded gap, giving the repaired product.
terms of simplicity, sensitivity, and specificity. Furthermore, this is the only assay that allows one to carry out rigorous quantitative analysis of the excision nuclease in cell-free extracts or reconstituted systems. When the products of the excision nuclease are examined by denaturing polyacrylamide gels, radiolabeled oligomers containing the lesion will appear as a result of the two nicks, 5‘ and 3’ to the damage. In essence, excision assay is identical to incision assay in terms of the experimental procedure. The only difference lies in the radiolabel position in the substrate DNA. The incision assay requires terminally labeled DNA whereas the excision assay employs a substrate that is internally radiolabeled at a phosphodiester bond in the vicinity (5’ or 3’) of the lesion, such that the dual incision will release the damage and the radiolabel in the same fragment, which can be analyzed using denaturing electrophoresis. The nature of the damage used to synthesize the substrate DNA for either incision or excision assay generally does not affect the assay because the excision nuclease excises practically any type of damage (7). The choice is normally governed by the availability of the lesion in the precursor form ready for phosphoramidite chemistry so that the particular damage can be incorporated into an oligonucleotide by a commercial oligonucleotide synthesizer (17). Subsequently, this lesion-carrying oligomer is assembled into a longer, double-stranded DNA through annealing and ligation with other oligomers (18, 19). Cholesterol-DNA (19), biotin-DNA adducts (J. Reardon, personal communication), and UV photoproducts such as cis-syn-cyclobutane thymine dimer (17) are among those that are routinely incorporated into oligomers using phosphoramidite chemistry.
The other method of making a short oligonucleotide containing a defined lesion is to damage an oligomer of a special nucleotide sequence that contains a site hypersensitive to a particular compound (20). Following the reaction, the desired oligomer product is isolated by gel electrophoresis or high-performance liquid chromatography. A good example is platinated oligonucleotides, which are usually generated by reacting DNA with the anticancer drug cisplatin, forming an intrastrand adduct, usually at (GpG), (ApG), and to a lesser extent (GpTpG) sites (21).
A second consideration regarding substrate preparation is the substrate length. For the bacterial excision nuclease, a DNA fragment as short as 40 nucleotides is sufficient for excision to take place (22), whereas the minimal length for the human enzyme is 100 nucleotides (23). In addition to the linear substrate, covalently closed circular plasmid DNA of several kilobases is also utilized as internally labeled substrate, despite the fact that it is more laborious to make such a substrate. In fact, the excision assay for the human excision nuclease was originally developed using a plasmid substrate containing four cyclobutane thymine dimers at predetermined positions (24). The radiolabeled plasmid substrate was prepared
DNA EXCISION REPAIR ASSAYS
by annealing a 5'-terminally labeled thymine-dimer-containing oligonucleotide to a single-stranded closed circular template (which contained sequences complementary to the damaged oligomer at four sites), followed by second-strand DNA synthesis and ligation to give the final product. Because linear DNA substrates of 120-150 nucleotides are easily prepared from several shorter, complementary oligomers, closed circular substrates are used less often for biochemical studies of the excision nuclease mechanism. However, for investigations such as transcription-coupled nucleotide excision repair, in which a more versatile substrate (e.g., one carrying a promoter) is needed, a plasmid substrate is preferable and may be essential (25). In fact, bacterial transcription-coupled excision repair was first reconstituted in vitro using a covalently closed circular DNA as substrate (25).
The incision/excision assay is a powerful tool for studying the mechanism of the excision nucleases because it directly detects the products of the enzyme action (7, 26). However, the signal may be obscured when cell extracts or partially purified proteins are used, because of DNA degradation by nonspecific nucleases. This is a serious problem, especially with the incision assay. Nonspecific nucleases degrade DNA up to the vicinity of the lesion and stop at a position near it, giving rise to a unique band that could be falsely interpreted as the result of damage-specific nicking because it is not observed with undamaged DNA (19). To eliminate such artifacts, control experiments, such as cross-complementation using extracts from XP mutant cell strains of different complementation groups, are essential to ascertain that the signal observed is the product of the excision repair nuclease (19, 27).
Using the excision assay, it was shown (24) that human excision nuclease removes thymine dimers by incising the 22nd-24th phosphodiester bond 5' to the lesion and the 4th-6th phosphodiester bond 3' to the lesion, which produces damage-containing oligomers of 27 to 29 nucleotides. In contrast to the rather fixed cutting pattern (the 8th phosphodiester bond 5' and the 4th or 5th phosphodiester bond 3' to the lesion) of the Uvr(A)BC excision nuclease of E. coli (11, 7), the actual incision sites of the human enzyme are more influenced by adduct type and sequence context. As an example of the "adduct effect," Fig. 2 shows an incision assay using human excision nuclease reconstituted from purified repair factors, and a 140-mer duplex DNA 5'-terminally labeled in the damaged strand, containing a cholesterol adduct at position 70 (26, 28). Although the cutting is not evenly distributed between the largest (53 nt) and smallest (42 nt) fragments for the 5' incision sites (lane 3), a total of 12 bands can be accounted for, ranging from 42 to 53 nucleotides in size. These correspond to incisions in the range of the 18th to the 28th phosphodiester bond 5' to the lesion. When the XPF-ERCC1 repair factor, the protein responsible for the 5' nick (19, 33), is omitted from the fully reconstituted nuclease system (lane 5), 3'-incised fragments are observed in the range of 72 to 81 nucleotides (26).
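The relation between incision positions and the size of the excised oligomer can be sketched with a few lines of arithmetic. This is only an illustration under a stated assumption: the 1st phosphodiester bond is taken to be the one immediately adjacent to the lesion, so that an incision at the nth bond leaves n - 1 undamaged nucleotides on that side. The function names and this counting convention are hypothetical and are not taken from the cited work.

```python
def excised_length(bond_5p, bond_3p, lesion_nt=2):
    """Length (nt) of the oligomer released by a dual incision.

    Assumed convention (hypothetical): bond 1 is the phosphodiester bond
    immediately next to the lesion, so cutting the n-th bond leaves n - 1
    undamaged nucleotides on that side of the lesion.
    """
    if bond_5p < 1 or bond_3p < 1 or lesion_nt < 1:
        raise ValueError("bond positions and lesion size must be >= 1")
    return (bond_5p - 1) + lesion_nt + (bond_3p - 1)


def length_range(bonds_5p, bonds_3p, lesion_nt=2):
    """Smallest and largest product if any 5' nick could pair with any 3' nick."""
    sizes = [excised_length(a, b, lesion_nt) for a in bonds_5p for b in bonds_3p]
    return min(sizes), max(sizes)


# E. coli Uvr(A)BC on a thymine dimer (2 nt): 8th bond 5', 4th or 5th bond 3'
ecoli = length_range([8], [4, 5])
# Human excision nuclease on a thymine dimer: 22nd-24th bond 5', 4th-6th bond 3'
human = length_range(range(22, 25), range(4, 7))
print(ecoli, human)
```

Under this convention the E. coli pattern gives 12 to 13 nucleotides, and unrestricted pairing of the human nicks would span 26 to 30 nucleotides, slightly wider than the 27 to 29 nucleotides reported for the thymine dimer (24).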
FIG. 2. Incision assay with reconstituted human excision repair nuclease in vitro. The 5'-terminally labeled 140-mer substrate contains a cholesterol-A adduct (26) at the 70th nucleotide. The incision reaction was conducted by incubating DNA with human excision nuclease reconstituted from six purified repair factors (26, 28). The products were analyzed on 8% denaturing polyacrylamide gels. Lane 1 contains radioactive DNA size markers prepared from HinfI-digested φX174 DNA. Lanes 2 and 4 are the substrate DNA alone, whereas lane 3 shows the incision reaction in the presence of all six basal repair factors (XPA, TFIIH, XPC, XPG, XPF-ERCC1, and RPA). XPA and RPA are recombinant proteins made in E. coli (29, 30); the XPG factor is expressed in and purified from the baculovirus/insect cell expression system (31, 32). TFIIH, XPC, and XPF-ERCC1 are purified from HeLa cells (26, 28). Lane 5 shows an incision reaction containing all basal excision repair factors of the excision nuclease except XPF-ERCC1. The size of each incised fragment is indicated to the right. A schematic drawing showing the distributions of the two repair nicks in relation to the adduct is presented at top. The symbol marking the lesion in the drawing stands for the cholesterol-A adduct (26). The cholesterol adduct-containing oligonucleotide was obtained from Midland Certified Reagents (Midland, Texas) and was synthesized using the cholesteryl-triethyleneglycol phosphoramidite precursor via a conventional oligonucleotide synthesizer.
These correspond to the incisions from the 2nd to the 10th phosphodiester bond 3' to the cholesterol adduct. The incision sites of human excinuclease thus range from the 18th to the 28th phosphodiester bond 5' to the damage and from the 2nd to the 10th phosphodiester bond 3' to the damage. In other words, both incisions can cover up to about 10 nucleotides, i.e., a full turn of a double helix, as illustrated in Fig. 3, depending on the lesion structure. This flexibility is intriguing, especially when compared with the much more fixed cutting sites of the bacterial excision nuclease. However, the two incisions are not randomly coupled, because the size range of the most prominent excised products (24- to 30-mer) is narrower than that predicted by simply assuming that any 3' nick can be combined with any 5' nick. Exactly how the excision nuclease senses the distance between the incisions is not fully understood. With other lesions, a narrower distribution of the incision sites is observed, which may change with sequence context. As an example, a thymine dimer in one particular sequence was excised mainly by hydrolysis of the 5th phosphodiester bond 3' and the 22nd-24th phosphodiester bond 5' to the dimer (34). By contrast, a thymine dimer in a different sequence context was removed mainly by hydrolysis of the 6th-7th phosphodiester bond 3' and the 24th-25th phosphodiester bond 5' to the dimer (26). Despite the effect of damage type and sequence context on the precise locations of the two nicking sites, it is important to bear in mind that the distance between the two incisions always falls in the range of 24 to 32 nucleotides.
C. Analytical Chemistry
Many analytical chemical methods have been used to detect the release of damaged nucleotides from DNA by excision nuclease. The classic example is the one for quantitating UV photolesions in DNA (35-38). Cells are incubated with [3H]thymine, irradiated with UV, and, following incubation, the DNA is isolated and hydrolyzed to bases by acid/heat treatment. The unhydrolyzable pyrimidine dimers and [6-4]photoproducts are then separated from pyrimidine monomers by chromatography. Paper chromatography has been used to analyze the hydrolysis products. More recently, high-performance liquid chromatography (HPLC) has also been used to resolve the hydrolysates. When the hydrolysates are separated by reversed-phase HPLC, different forms of pyrimidine dimers (T<>T, T<>C as T<>U, and C<>C as U<>U) as well as [6-4]photoproducts are resolved (39). Quantitation of each pyrimidine dimer is obtained by integrating the area of each peak in the chromatogram or by analysis of the isotopic label incorporated into the pyrimidines.
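Quantitation by peak integration can be illustrated with a short sketch. It assumes a chromatogram sampled as paired lists of retention time and detector signal, uses the trapezoidal rule, and takes the integration windows for the dimer and monomer peaks as given; the function names, windows, and data are hypothetical and serve only to show the calculation.

```python
def trapezoid_area(times, signal, start, end):
    """Integrate signal over the intervals lying inside [start, end] (trapezoidal rule)."""
    area = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 >= start and t1 <= end:
            area += 0.5 * (signal[i] + signal[i + 1]) * (t1 - t0)
    return area


def dimer_fraction(times, signal, dimer_window, monomer_window):
    """Fraction of the integrated signal found in the dimer peak."""
    dimer = trapezoid_area(times, signal, *dimer_window)
    monomer = trapezoid_area(times, signal, *monomer_window)
    return dimer / (dimer + monomer)


# Hypothetical chromatogram: a small dimer peak at 2 min, a large monomer peak at 6 min
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
signal = [0, 0, 2, 0, 0, 0, 98, 0, 0]
fraction = dimer_fraction(times, signal, (1, 3), (5, 7))
print(fraction)
```

Repeating the calculation on samples taken at increasing times after irradiation gives the fraction of label remaining in dimers, and hence the extent of removal.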
FIG. 3. Schematic drawing of the incision sites of the E. coli and human excision repair nucleases. A double-helix DNA containing a damaged nucleotide (the lesion) is shown. The rather invariant incision sites of the E. coli Uvr(A)BC excinuclease and the more variable nicking sites of the human excinuclease are indicated.
D. Repair Synthesis Assay
Nucleotide excision repair can be conceptually envisioned as a two-step reaction, i.e., damage-guided dual incision and gap-filling to give the repaired product (Fig. 1) (7, 40, 41). The preceding discussion concentrated on the methodologies that directly detect the damage-guided dual incision and adduct removal. However, following the dual incision, the resulting single-stranded gap must be filled by replication proteins and sealed by ligase to complete the repair. When [α-32P]dNTPs are included in the repair reaction containing unlabeled damaged substrate, the substrate becomes radioactive because of the incorporation of labeled nucleotides by excision gap-filling DNA synthesis. This constitutes the mechanistic basis for another commonly employed in vitro repair assay, termed the repair synthesis assay (42-44). The substrate for repair synthesis is often damaged by UV irradiation at 254 nm, or by treatment with 2-(N-acetoxyacetylamino)fluorene, psoralen, or cisplatin. On incubation with wild-type cell extracts or purified repair factors (45), radiolabeled nucleotides are incorporated preferentially into damaged DNA compared to undamaged DNA, which is either included in the same reaction in the form of a plasmid of different size (43) or is used in a parallel, control reaction (44). Protocols have been developed for restoring the repair activity of an XP mutant cell-free extract (CFE) by mixing two CFEs from different complementation groups (43).
An important distinction between the excision assay and the repair synthesis assay, in addition to their mechanistic bases, is the background signal. In the excision assay with either cell-free extracts or purified proteins, when a single repair factor is missing, the signal (i.e., the excised fragment, 12-13 nt in prokaryotes and 24-32 nt in eukaryotes) is completely abolished.
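The preferential incorporation measured in the repair synthesis assay is usually expressed as a ratio, and a brief sketch makes the normalization explicit. It assumes that radioactivity (cpm) and DNA mass (ng) are known for the damaged and undamaged plasmid bands of one reaction; the function name and all numbers are hypothetical.

```python
def damage_specific_ratio(cpm_damaged, ng_damaged, cpm_undamaged, ng_undamaged):
    """Label incorporated per ng of damaged DNA divided by that per ng of undamaged DNA."""
    if min(ng_damaged, ng_undamaged) <= 0 or cpm_undamaged <= 0:
        raise ValueError("DNA amounts and control counts must be positive")
    return (cpm_damaged / ng_damaged) / (cpm_undamaged / ng_undamaged)


# Hypothetical counts (cpm) for 300 ng of each plasmid in the same reaction
wild_type = damage_specific_ratio(9000, 300, 1500, 300)
mutant = damage_specific_ratio(4500, 300, 1500, 300)
print(wild_type, mutant, mutant / wild_type)
```

Dividing the ratio obtained with a mutant extract by the wild-type ratio gives the residual damage-dependent synthesis of that extract; in this invented example it is 0.5.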
In contrast, with the repair synthesis assay, cell-free extracts from cell lines known to be absolutely defective in excision, such as the human XP-A null mutants, incorporate radiolabel into damaged DNA in preference to undamaged DNA. The ratio of radiolabel incorporation into damaged DNA over undamaged DNA can be as high as 50% of the ratio obtained with wild-type cells (46). There is no satisfactory explanation at present for this damage-stimulated DNA synthesis that is not of repair origin. The recent finding of a UV-endonuclease in Schizosaccharomyces pombe (47, 48) and in the filamentous fungus Neurospora crassa (49), which incises the phosphodiester bond immediately 5' to the UV photoproducts, raises the possibility that such an activity in mammalian cells might be responsible for the damage-stimulated DNA synthesis observed in cell-free extracts of XP mutants. However, no such activity has yet been detected in human cells. The lesion-provoked DNA synthesis in these mutants may be attributed to nonspecific nucleases that preferentially nick at the damage site because of partial single-strandedness caused by damage. Subsequently, nick translation by the polymerase/exonuclease combination will result in damage-dependent DNA synthesis that does not originate from repair; that is, it is not a result of damage removal by a repair enzyme.
Although the repair synthesis assay has certain shortcomings, it provides some information that cannot be obtained by the excision assay, e.g., the size of the repair patch. Using dNTP(αS) in repair synthesis, phosphorothioate linkages were introduced into the repair patch in both E. coli (44) and human (24) repair synthesis systems. Following the repair synthesis reaction, the terminally labeled fragment with the repair patch was treated with iodine, which specifically cleaves phosphorothioate linkages. On resolution using denaturing polyacrylamide gels, the repair synthesis patch was visualized as a sequencing ladder. Based on this method, it was demonstrated that in both systems the excision gap was precisely filled in without enlargement at either the 5' or the 3' side of the lesion.
E. Restriction Enzyme Sensitivity
This assay involves making a photolesion in a sequence recognized by a type II restriction enzyme, such as TTAA by MseI or GGTACC by KpnI. After the repair of DNA containing a single T<>TAA site, the DNA is isolated and subjected to MseI digestion. The repair reaction renders the damage site sensitive to MseI digestion in proportion to the level of repair synthesis. Similarly, a psoralen adduct at the T residue of the KpnI site (GGTACC) has been used to measure repair by both E. coli and human excision nucleases (24, 44). This assay is versatile in that it is applicable to nucleotide excision repair as well as to the light-dependent repair mediated by photolyase. Indeed, a T<>T or [6-4]photoproduct at the TTAA site in a linear DNA was used to measure photolyase activity, and to characterize a novel direct repair activity specific for the [6-4]photoproduct in cell-free extracts made from Drosophila, Xenopus, and Texas rattlesnake (50, 51).
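The readout of the restriction enzyme sensitivity assay can be sketched as follows. The example assumes a linear substrate given as a top-strand sequence, uses the known MseI cleavage position (T^TAA), and takes the repaired fraction as the share of signal in the digested band; the sequence, band intensities, and function names are hypothetical.

```python
def msei_fragments(seq):
    """Top-strand fragment lengths after complete MseI digestion (cuts T^TAA)."""
    site = "TTAA"
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + 1)  # cleavage falls after the first T of the site
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def fraction_repaired(cut_signal, uncut_signal):
    """Share of substrate that became digestible, taken as the repaired fraction."""
    return cut_signal / (cut_signal + uncut_signal)


# Hypothetical 36-nt substrate with a single TTAA site
substrate = "GCGC" * 5 + "TTAA" + "GCGC" * 3
print(msei_fragments(substrate))
print(fraction_repaired(30.0, 70.0))
```

A lesion within the site blocks cleavage, so before repair the full-length band dominates; the appearance of the two digestion fragments tracks restoration of the intact site.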
F. Biological Activity: Transformation Assay
This assay measures the restoration of the biological function (replication or transcription) of damaged DNA by treatment with excision nuclease. The method was originally developed to measure photolyase activity and has been adapted to study repair by excision nucleases. The transformation assay consists of the following steps: (1) generation of damaged DNA (e.g., a plasmid) by either UV irradiation or DNA-damaging agents such as cisplatin; (2) in vitro repair of the damaged plasmids; (3) uptake of the repaired plasmids into host cells (i.e., transformation/transfection); and (4) selection of transformed host cells for a drug-resistance gene carried by the plasmid. Because unrepaired lesions in the plasmid can block replication and hence the survival and colony formation of the transformed host cells, the number of transformants indicates the extent of the in vitro repair reaction. Using cisplatin-treated plasmid pBR322 as a probe, it was demonstrated that nucleotide excision repair is the major pathway for repairing platinum adducts in transforming plasmid DNA (52). When damaged pBR322 is subjected to the bacterial (A)BC excision nuclease in vitro prior to transformation, a fraction of the adducts is excised, resulting in a proportional increase in transformation efficiency (52). A more recent application of the transformation assay involves a UV-irradiated pBR322-derived plasmid (pOC2) carrying an indicator gene (53). After treatment with a reconstituted E. coli nucleotide excision repair system in vitro, the mutated plasmids are identified by transforming an indicator host strain, which gives rise to colored colonies when transformed with mutated plasmids and white colonies when transformed with wild-type plasmids. From these experiments, it was concluded that E. coli DNA polymerase III is responsible for the excision repair gap-directed mutagenesis caused by misincorporation during resynthesis, due to the presence of UV photoproducts in the single-stranded excision gap.
II. In Vivo Assays
A. Nicking Assay and T4 Endonuclease V-Sensitive Site Assay
The first step in excision repair is the nicking of DNA. Following DNA damage, the DNA is isolated from repair-competent cells and analyzed on an alkaline sucrose gradient or on an alkaline agarose gel. The average size of the fragments is indicative of the number of nicks made that remain to be sealed by repair synthesis and ligation. Alternatively, DNA is isolated from UV-irradiated cells after a period of post-UV incubation, and then treated with T4 endonuclease V. The nicking products are analyzed by alkaline sucrose gradients or alkaline gel electrophoresis. T4 endonuclease V nicks at pyrimidine dimers, and hence the decrease in the T4 endonuclease-sensitive sites (ESSs) is a measure of excision repair in vivo.
In the mid-1980s, a technique was introduced to measure DNA repair for a specific gene in the genome (54-57). Using this assay and T4 endonuclease V to probe for pyrimidine dimers, it was discovered that the removal of pyrimidine dimers in the actively transcribed dihydrofolate reductase (DHFR) gene is much more efficient than in the overall genome (55). The basic technique involves restriction digestion of density-labeled DNA (see Section II,E for a more detailed description of density-labeled DNA) isolated from UV-irradiated cultured cells at various time periods. Low-density, nonreplicated DNA that has been repaired is then subjected to digestion with T4 endonuclease V. Subsequently, the T4 endonuclease-treated DNA is resolved in alkaline agarose gels and subjected to Southern hybridization. The intensity of the full-length fragment of interest (the zero class of the Poisson distribution) gives the number of T4 ESSs, from which the repair (thymine dimer removal) can be determined (58). Using the E. coli (A)BC excinuclease in place of T4 endonuclease V, this method has been generalized to detect all lesions as the (A)BC excinuclease-sensitive site (ASS) assay (59, 60).
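The zero-class calculation mentioned above follows from the Poisson distribution: if endonuclease-sensitive sites are distributed randomly, the fraction of fragments with no site is exp(-n), where n is the mean number of sites per fragment. A minimal sketch, assuming band intensities of the full-length fragment with and without enzyme treatment are available (the function names and numbers are hypothetical):

```python
import math


def ess_per_fragment(full_length_treated, full_length_untreated):
    """Mean sensitive sites per fragment from the Poisson zero class, n = -ln(P0)."""
    p0 = full_length_treated / full_length_untreated
    if not 0 < p0 <= 1:
        raise ValueError("zero-class fraction must lie in (0, 1]")
    return -math.log(p0)


def percent_repair(ess_initial, ess_later):
    """Percentage of sites removed relative to the level right after irradiation."""
    return 100.0 * (1.0 - ess_later / ess_initial)


# Hypothetical band intensities (enzyme-treated, untreated) at 0 h and 8 h
ess_0h = ess_per_fragment(25.0, 100.0)
ess_8h = ess_per_fragment(50.0, 100.0)
print(ess_0h, ess_8h, percent_repair(ess_0h, ess_8h))
```

In this invented example the full-length band rises from 25% to 50% of the untreated control, which corresponds to a fall from about 1.39 to 0.69 sites per fragment, i.e., 50% repair.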
B. Mapping of UV Photolesions by the Ligation-Mediated Polymerase Chain Reaction
A variation of the gene-specific repair assays can be extended to resolving the repair of individual lesions in a DNA sequence of interest. Ligation-mediated polymerase chain reaction (LMPCR)-based mapping of lesions at nucleotide resolution was developed for this purpose (61). The invention of LMPCR adduct detection relied on the observation that [6-4]photoproducts can be cleaved at the adduct site by heating in 1 M piperidine (i.e., alkaline hydrolysis) (62). Subsequent to the alkaline hydrolysis, the cleaved, denatured single-stranded fragments are annealed with a gene-specific primer for primer extension. The resulting double-stranded fragments are then blunt-end ligated to a common linker. To amplify these DNAs, PCR is carried out using both the primer for primer extension and the primer to the linker. Subsequently, the amplified DNA is separated on denaturing polyacrylamide gels and transferred to filters. The size distribution of the amplified DNA, which in turn reflects the distribution of [6-4]photoproducts along the gene of interest, is visualized by hybridizing the filters with a radiolabeled complementary oligonucleotide shared by all amplified DNA, followed by autoradiography.
Since the introduction of this method in 1991, two important technical improvements have been made to increase its versatility: mapping of cyclobutane thymine dimers, and measurement of the repair rate for each UV photoproduct along a particular gene (63). The original protocol (61) was designed for [6-4]photoproducts based on their instability in alkaline hydrolysis. Cyclobutane thymine dimers (T<>T), on the other hand, are not particularly susceptible to alkaline hydrolysis, making them unsuitable for analysis by this method. An alternative is to nick the DNA at pyrimidine dimer sites using T4 endonuclease V.
However, this treatment leaves T<>T (actually thyminyl thymidine cyclobutane dimer) at the 5' end. This terminus cannot be blunt-end ligated to a common linker containing a PCR primer site.
The problem was solved by the use of photolyase (7) to split T<>T after the T4 endonuclease V digestion, removing the block for subsequent primer extension. Thus, the most frequent photolesion, T<>T, can be detected using this method as well. To study the repair rate, LMPCR is conducted at different time points subsequent to UV irradiation, and thus the rate of adduct disappearance (i.e., repair) is measured for each lesion along a particular gene. Using this method, it was found that the repair rate of T<>Ts along the p53 gene in UV-irradiated human fibroblasts is highly variable and sequence dependent (64). Slow repair was seen at seven of eight sites often mutated in skin cancer, suggesting a link between repair efficiency and mutation frequency. In addition, a second gene, the human X-linked phosphoglycerate kinase gene, was subjected to the same analysis; the transcribed strand beginning downstream of the transcription start site, at nucleotide +140 in exon 1, was preferentially repaired (63). A more recent study found that in the human JUN gene, even though the transcription-factor-binding sites in the promoter region were repaired slowly, very fast repair was seen in both strands between nucleotides -40 and +100 (65).
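One way to turn the LMPCR time course into a per-site repair rate is sketched below. It assumes, purely for illustration, that the band intensity at a given lesion position decays with first-order kinetics, and fits ln(I/I0) = -kt through the origin by least squares; neither the kinetic model nor the function name comes from the cited studies, and the intensities are invented.

```python
import math


def first_order_half_time(times_h, intensities):
    """Half-time (h) of lesion removal from a least-squares fit of ln(I/I0) = -k*t.

    Assumes the first time point is t = 0 and all intensities are positive.
    """
    i0 = intensities[0]
    num = sum(t * math.log(i / i0) for t, i in zip(times_h, intensities))
    den = sum(t * t for t in times_h)
    k = -num / den
    if k <= 0:
        raise ValueError("no net decay of signal; half-time is undefined")
    return math.log(2) / k


# Hypothetical band intensities at one dimer site, 0 to 8 h after UV
times_h = [0, 2, 4, 8]
intensities = [100.0, 50.0, 25.0, 6.25]
print(first_order_half_time(times_h, intensities))
```

Applying the same fit to each band of the sequencing-type ladder would give a half-time for every lesion position, which is the kind of site-to-site comparison described above.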
C. Postlabeling
Postlabeling is a general assay that measures the levels and the rates of disappearance of abnormal (damaged) nucleotides from DNA. It can be used to quantify a given DNA adduct or base adducts of unknown nature (66, 67). As implied by the name of the procedure, the assay begins with the isolation of nonradioactive damaged DNA. Subsequently, the DNA is digested exhaustively with a mixture of nucleases and a phosphatase, and radiolabeled with T4 polynucleotide kinase and [γ-32P]ATP. To separate unmodified from adducted mononucleotides, the labeled mixture is resolved by two-dimensional thin-layer chromatography or HPLC. The adducted mononucleotides are detected and identified by comparison with a control containing DNA unexposed to DNA-damaging agents. Various procedures for improving the detection signal and identifying the abnormal nucleotides have been introduced as refinements to the assay (67).
D. Immunological Detection of Photolesions
Immunological detection utilizes the high-affinity binding between antigen and antibody to visualize photolesions in cells, using antibodies directed against UV photoproducts or other DNA lesions such as cisplatin adducts (68). Polyclonal (69-71) or monoclonal antibodies have been used. Monoclonal antibodies that specifically recognize cyclobutane thymine dimers or [6-4]photoproducts were generated (72, 73), raising the specificity and utility of the assay significantly. To study the repair of photoproducts, cells were damaged with UV light and incubated in growth media. At different time points, genomic DNA was extracted and the amount of either T<>T or [6-4]photoproduct was quantitated using specific antibodies and standard immunological detection procedures, for example, the enzyme-linked immunosorbent assay (ELISA) (74). Using this method with mouse cell lines, it was demonstrated that more than 50% of the [6-4]photoproducts in the genome are removed by 6 hr after UV irradiation, whereas only 10% of thymine dimers are repaired in the same period (75).
The cause of this drastic difference in the repair rates of these two major photolesions is not known. However, the following in vitro experiments provide some clues to a possible answer. Solution nuclear magnetic resonance studies of T<>T- and [6-4]photoproduct-containing decamer duplex DNA reveal a potential loss of hydrogen bonding at the 3' side of the [6-4]photolesion, and a larger helical distortion caused by the [6-4]photoproduct relative to T<>T (76). This is likely to be the structural basis underlying the different repair rates for the two lesions. To further test this proposal, the adenine nucleotide base-paired with either thymine of the T<>T was mutated into guanine in a 140-mer duplex DNA containing a single T<>T, so that a slowly repaired photolesion was given a further structural "stress." This was expected to turn T<>T into a better substrate for the excision repair nuclease, because a mismatch would bring about some unwinding at the T<>T site. Indeed, it was found that the T<>T was excised at a faster rate when placed in the context of a mismatch (77). A putative harmful effect of a "compound lesion" (Fig. 4), such as a thymine dimer plus mismatch, is "mutation fixation," which causes a point mutation of T to C in the case of a T<>T:AG compound lesion.
FIG. 4. An example of a compound lesion, as it would occur in a cyclobutane thymine dimer (T<>T) mispaired with AG.
The contribution of compound lesions to mutation frequency in vivo remains to be determined. Studies with antibodies to platinum-DNA adducts have suggested that patients with a favorable response to the chemotherapeutic effect of cisplatin have higher levels of intrastrand diadducts (78). However, these preliminary reports need further confirmation before a correlation can be made between the DNA repair capacity of a cell or tissue and its (or the host's) resistance to chemotherapy by cisplatin.
E. Unscheduled DNA Synthesis and Equilibrium Sedimentation
Unscheduled DNA synthesis was the first method used to detect excision repair in humans. When [3H]thymine is included in a culture medium containing UV-irradiated cells, it is found that, unlike unirradiated cells, which incorporate [3H]thymine into DNA only in S phase, the DNA of UV-irradiated cells is tritiated throughout the cell cycle [hence "unscheduled DNA synthesis" (UDS)] in a nonconservative manner (79). UDS is usually detected by autoradiography (80). Since its introduction, UDS has become one of the most widely used in vivo repair assays, and it was used to discover that the UV-sensitive XP syndrome is caused by a defect in excision repair (3). Even though UV lesions were the first found to evoke UDS, it is now known that many other carcinogenic compounds, such as methylnitrosourea and 4-nitroquinoline 1-oxide, elicit UDS as well. The assay can be applied to almost all cellular systems and can be particularly valuable in mixed cell populations in which cells cannot be physically separated but can be visually differentiated (81).
A second method to measure repair synthesis in vivo involves density labeling, using incorporation of 5-bromouracil into newly synthesized DNA (82). In this repair synthesis assay, tritiated thymine and 5-bromouracil are added to the growth medium of UV-irradiated cells; the radiolabel and the density label are incorporated into newly synthesized DNA. The nonreplicated (light-light) and the replicated (light-heavy or heavy-heavy) DNAs are then separated by equilibrium density centrifugation in a CsCl gradient. Radioactivity in these DNAs is then determined. Any radiolabel present in the light-light DNA is a measure of UDS resulting from repair synthesis in the parental duplex.
The repair synthesis patches in both prokaryotes (12-13 nucleotides) and eukaryotes (24-30 nucleotides) are small compared to the average fragment size (300-400 base pairs), and hence do not cause a significant shift in the position of the repaired DNA, which is readily separated from the replicated DNA containing a high level of radiolabel and hybrid density.
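The reason a repair patch barely shifts the buoyant density can be seen from a rough count of substituted nucleotides. The sketch below assumes one patch per fragment and compares it with semiconservatively replicated (hybrid) DNA, in which one entire strand is newly synthesized; the function name is hypothetical and the calculation ignores base composition.

```python
def patch_fraction(patch_nt, fragment_bp):
    """Fraction of all nucleotides in a duplex fragment that lie in one repair patch."""
    if patch_nt < 0 or fragment_bp <= 0 or patch_nt > fragment_bp:
        raise ValueError("patch must fit within one strand of the fragment")
    return patch_nt / (2 * fragment_bp)


# Largest eukaryotic patch in the smallest fragment, smallest prokaryotic patch in the largest
print(patch_fraction(30, 300), patch_fraction(12, 400))
# Replicated (hybrid) DNA: one whole strand is new
print(patch_fraction(300, 300))
```

Even in the most favorable case only about 5% of the nucleotides are newly synthesized, against 50% in hybrid DNA, so repaired fragments remain at essentially the light-light position.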
F. Host Cell Reactivation
Host cell reactivation measures the in vivo restoration of biological activity to in vitro-damaged DNA. The ability of UV-damaged viruses to replicate in infected cells hinges on the genetic makeup of the host cells. The use of damaged phage or plasmid DNA provides certain advantages over direct treatment of cells with a DNA-damaging agent in studying the cellular DNA repair mechanism. In this assay the physiology of the cell is not perturbed by
the DNA-damaging treatment and, as a consequence, the fate of the transfecting DNA is solely dependent on the capacity of the host cell to process DNA damage. Most viruses use host cell proteins for repair and replication. Taking advantage of this, a repair assay was designed that uses the ability of UV-damaged viruses or plasmids to replicate in host cells as an indicator of the host repair capacity. This forms the basis for the host replication-dependent host cell reactivation assay (83-88). More recently, the chloramphenicol acetyltransferase (CAT) assay, originally developed to study transcriptional control in mammalian cells, was adapted to study DNA repair and mutagenesis (89). Instead of viruses, a UV-damaged vector DNA carrying a gene with a readily detectable phenotype (e.g., chloramphenicol acetyltransferase) is used to transfect host cells. In the absence of host repair, UV lesions block transcription (90), leading to reduced production of CAT and hence a reduced level of CAT activity, and vice versa. Alternatively, with an appropriate plasmid/host system, after a round of replication the plasmid is isolated from the mammalian cells and transfected into indicator bacterial cells to detect mutations (91).
III. Conclusion
Recent developments in DNA repair have led to an increased interest in the field. Thus, scientists with no prior background in DNA repair are starting to carry out repair experiments in order to develop an integrated view of DNA repair, replication, transcription, and cell-cycle regulation. Because each excision repair assay has its particular use, advantage, and shortcoming, it is important that the appropriate assay be used for experiments designed to address a particular question. It is hoped that this overview of the theoretical basis and of the specific applications of the excision repair assays will aid researchers in conducting excision repair experiments in their systems.
ACKNOWLEDGMENT
David Mu is supported by Grant DRC-1319 from the Cancer Research Fund of the Damon Runyon-Walter Winchell Foundation.
REFERENCES
1. R. Doll and R. Peto, "The Causes of Cancer." Oxford Univ. Press, London, 1981.
2. A. Eastman and N. Schulte, Bchem 27, 4730 (1988).
3. J. E. Cleaver, Nature (London) 218, 652 (1968).
4. J. E. Cleaver and K. H. Kraemer, in "The Metabolic Basis of Inherited Disease" (C. R. Scriver, A. L. Beaudet, W. S. Sly and E. Valle, eds.), Vol. 2, p. 2949. McGraw-Hill, New York, 1989.
5. E. C. Friedberg, G. C. Walker and W. Siede, "DNA Repair and Mutagenesis." American Society for Microbiology, Washington, D.C., 1995.
6. A. Sancar, Science 266, 1954 (1994).
7. A. Sancar, ARB 65, 43 (1996).
7a. R. D. Wood, ARB 65, 135 (1996).
8. A. K. Ganesan, C. A. Smith and A. A. van Zeeland, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 1, p. 89. Dekker, Inc., New York, 1980.
9. E. Seeberg, J. Nissen-Meyer and P. Strike, Nature (London) 263, 524 (1976).
10. A. T. Yeung, W. B. Mattes, E. Y. Oh and L. Grossman, PNAS 80, 6157 (1983).
11. A. Sancar and W. D. Rupp, Cell 33, 249 (1983).
12. J. J. Lin and A. Sancar, Mol. Microbiol. 6, 2219 (1992).
13. A. Sancar, D. C. Thomas, B. Van Houten, I. Husain and M. Levy, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 3, p. 479. Dekker, Inc., New York, 1988.
14. B. Van Houten, H. Gamper, J. E. Hearst and A. Sancar, JBC 261, 14135 (1986).
15. S. N. Guzder, Y. Habraken, P. Sung, L. Prakash and S. Prakash, JBC 270, 12973 (1995).
16. J. Moggs, K. J. Yarema, J. M. Essigmann and R. D. Wood, JBC 271, 7177 (1996).
17. C. A. Smith and J.-S. Taylor, JBC 268, 11143 (1993).
18. J. C. Huang, D. S. Hsu, A. Kazantsev and A. Sancar, PNAS 91, 12213 (1994).
19. T. Matsunaga, D. Mu, C.-H. Park, J. T. Reardon and A. Sancar, JBC 270, 20862 (1995).
20. S. F. Bellon, J. H. Coleman and S. J. Lippard, Bchem 30, 8026 (1991).
21. S. J. Lippard, "The Robert A. Welch Foundation 37th Conference on Chemical Research," p. 49. The Robert A. Welch Foundation, 1993.
22. B. Van Houten, H. Gamper, S. R. Holbrook, J. E. Hearst and A. Sancar, PNAS 83, 8077 (1986).
23. J. C. Huang and A. Sancar, JBC 269, 19034 (1994).
24. J. C. Huang, D. L. Svoboda, J. T. Reardon and A. Sancar, PNAS 89, 3664 (1992).
25. C. P. Selby and A. Sancar, Science 260, 53 (1993).
26. D. Mu, D. S. Hsu and A. Sancar, JBC 271, 8285 (1996).
27. J. T. Reardon, L. H. Thompson and A. Sancar, CSHSQB 58, 605 (1993).
28. D. Mu, C.-H. Park, T. Matsunaga, J. T. Reardon and A. Sancar, JBC 270, 2415 (1995).
29. C. J. Jones and R. D. Wood, Bchem 33, 14197 (1993).
30. L. A. Henricksen, C. B. Umbricht and M. S. Wold, JBC 269, 11121 (1994).
31. T. Matsunaga, C.-H. Park, T. Bessho, D. Mu and A. Sancar, JBC 271, 11047 (1996).
32. A. O'Donovan, D. Scherly, S. G. Clarkson and R. D. Wood, JBC 269, 15965 (1994).
33. A. J. Bardwell, L. Bardwell, A. E. Tomkinson and E. C. Friedberg, Science 265, 2082 (1994).
34. D. L. Svoboda, J.-S. Taylor, J. E. Hearst and A. Sancar, JBC 268, 1931 (1993).
35. W. L. Carrier, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 1, Part A, p. 3. Dekker, Inc., New York, 1981.
36. R. J. Reynolds, K. H. Cook and E. C. Friedberg, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 1, Part A, p. 11. Dekker, Inc., New York, 1981.
37. M. Sekiguchi and K. Shimizu, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 1, Part A, p. 23. Dekker, Inc., New York, 1981.
38. J. T. Cornelis and M. Errera, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 1, Part A, p. 31. Dekker, Inc., New York, 1981.
39. J. D. Love and E. C. Friedberg, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 2, p. 87. Dekker, Inc., New York, 1983.
40. R. B. Setlow and W. L. Carrier, PNAS 51, 226 (1964).
41. R. Boyce and P. Howard-Flanders, PNAS 51, 293 (1964).
42. P. R. Caron, S. R. Kushner and L. Grossman, PNAS 82, 4925 (1985).
43. R. D. Wood, P. Robins and T. Lindahl, Cell 53, 97 (1988).
44. Sibghat-Ullah, I. Husain, W. Carlton and A. Sancar, NARes 17, 4471 (1989).
45. A. Aboussekhra, M. Biggerstaff, M. K. K. Shivji, J. A. Vilpo, V. Moncollin, V. N. Podust, M. Protic, U. Hubscher, J.-M. Egly and R. Wood, Cell 80, 859 (1995).
46. J. Hansson, S. M. Keyse, T. Lindahl and R. D. Wood, Cancer Res. 51, 3384 (1991).
47. K. K. Bowman, K. Sidik, C. A. Smith, J. S. Taylor, P. W. Doetsch and G. A. Freyer, NARes 22, 3026 (1994).
48. G. Freyer, S. Davey, J. V. Ferrer, A. M. Martin, D. Beach and P. W. Doetsch, MCBiol 15, 4572 (1995).
49. H. Yajima, M. Takao, S. Yasuhira, J. H. Zhao, C. Ishii, H. Inoue and A. Yasui, EMBO J. 14, 2393 (1995).
50. S. T. Kim, K. Malhotra, C. A. Smith, J. S. Taylor and A. Sancar, JBC 269, 8534 (1994).
51. S. T. Kim, K. Malhotra, J. S. Taylor and A. Sancar, Photochem. Photobiol. 63, 292 (1996).
52. I. Husain, S. G. Chaney and A. Sancar, J. Bact. 163, 827 (1985).
53. G. Tomer, O. Cohen-Fix, M. O'Donnell, M. Goodman and Z. Livneh, PNAS 93, 1376 (1996).
54. K. Nose and O. Nikaido, BBA 781, 273 (1984).
55. V. A. Bohr, C. A. Smith, D. S. Okumoto and P. C. Hanawalt, Cell 40, 359 (1985).
56. I. Mellon, V. A. Bohr, C. A. Smith and P. C. Hanawalt, PNAS 83, 8878 (1986).
57. V. A. Bohr, D. S. Okumoto and P. C. Hanawalt, PNAS 83, 3830 (1986).
58. V. A. Bohr and D. S. Okumoto, in "DNA Repair: A Laboratory Manual of Research Procedures" (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 3, p. 347. Dekker, Inc., New York, 1988.
59. D. C. Thomas, D. S. Okumoto, A. Sancar and V. Bohr, JBC 264, 18005 (1989).
60. J. G. R. de Cock, A. van Hoffen, J. 
Wignands, G. Molenaar, P. H. M. Lohman and J. C. J. Eeken, NARes 20,4789 (1992). 61. G. P. Pfeifer, R. Drouin, A. D. Riggs and G. P. Holmquist, PNAS 88,1374 (1991). 62. J. A. Lippke, L. K. Gordon, D. E. Brash and W.A. Haseltine, PNAS 78,3388 (1981). 63. S. Gao, R. Drouin and G. P. Holmquist, Science 263,1438 (1994). 64. S. Tomaletti and G. P. Pfeifer, Science 263,1436 (1994). 65. Y.Tu, S. Tomaletti and G. P. Pfeifer, EMBOJ. 15,675 (1996). 66. K. Randerath, M. V. Reddy and R. C. Gupta, PNAS 78,6162 (1981). 67. R. C. Gupta and K. Randerath, in “DNA Repair: A Laboratory Manual of Research Procedures” (E. C. Friedberg and P. C. Hanawalt, eds.), Vol. 3, p. 399. Dekker, hc., New York, 1988. 68. A,-M. J. Fichtinger-Schepman. A. T. van Oosterom, P. H. M. Lohman and F. Berends, Cuncer Res. 47,3000 (1987). 69. J. J. Cornelis and M. Errera, in “DNA Repair: A Laboratory Manual of Research Procedures” (E. C. Friedberg and P. C. Hanawalt,eds.),Vol. 1, Part A, p. 3 1. Dekker, Inc., New York, 1981. 70. D. L. Mitchell, Photochem.Photobiol. 48,51 (1988). 71. X. Zhao and J.-S. Taylor,JACS 116,8870 (1994). 72. T. Mizuno, T. Matsunaga, M. Ihara and 0. Nikaido, Mutation Res. 254,175 (1991). 73. T. Mori, T. Matsunaga, T. Hirose and 0. Nikaido, Mutation Res. 194,263 (1988). 74. E. Engvd and P. 0.Perlmann, ImmunochmisCry 8,871 (1971). 75. K. Ishizaki, Y. Ejima, T. Matsunaga, R. Hara, A. Sakamoto, M. Ikenaga, Y. Ikawa and S.-I. Aizawa, Int. J. Canca: 58,254 (1994).
DNA EXCISION REPAIR ASSAYS
81
76. J.-K. Kim, D. Patel and B.3. Choi, Photochem. Photobiol. 62,44 (1995). 77. D. Mu, M. Tursun, D. R. Duckett, J. T. Drurnmond, P. Modrich and A. Sancar, MCBioZ. (1997). In press 78. E. Reed, R. I. Ozols, R. Tarone, S. H. Yuspo and M. Poirier, PNAS 84,5024 (1987). 79. R. E. Rasmussen and R. B. Painter, Nature (London)203,1360 (1964). 80. B. Djordjevic and L. J. Tolmach, Radiat. Res. 32,327 (1967). 81. J. E. Cleaver and G. H. Thomas, in “DNA Repair: A Laboratory Manual of Research Procedures” (E. C. Friedberg and P. C. Hanawdt, eds.), Vol. 1,Part B, p. 277. Dekker, Inc., New York, 1981. 82. D. E. Pettijohn and P. C. Hanawalt,JMB 9,395-410 (1964). 83. Z. Zavadova, Nature N B 233,123 (1971). 84. C. D. Lytle, S. A. Aaronson andE. Harvey, hat. j . Rudiut. B i d . 22,159 (1972). 85. A. S. Rabson, S. A. Tprell and F. Y.Legatlais, PSEBM 132,802 (1969). 86. R. S. Day 111, Cancer Res. 34, 1965 (1974). 87. R. S. Day 111, Photochem. Photobiol. 19,9 (1974). 88. S. A. Aaronson and C. D. Lytle, Nature (London)228,359 (1970). 89. M. Protic-Sabljic and K. H. Kraemer, PNAS 82,6622 (1985). 90. B. A. Donahue, S. Yin, J.4.Taylor, D. Reines and P. C. Hanawalt, PNAS 91,8502 (1994). 91. 0. Cohen-Fix and Z. Livneh,JBC 269,4953 (1994).
This Page Intentionally Left Blank
The Mitochondrial Uncoupling Protein: Structural and Genetic Studies¹
DANIEL RICQUIER AND FRÉDÉRIC BOUILLAUD
Centre de Recherches sur l'Endocrinologie Moléculaire et le Développement, Centre National de la Recherche Scientifique, 92190 Meudon, France
I. The Uncoupling Protein
  A. The Uncoupling Pathway of Brown Adipose Tissue Mitochondria
  B. The UCP, a Mitochondrial Carrier
  C. UCP Sequence and Similarities
  D. UCP Topology
  E. Expression of UCP in Mammalian Cells
  F. Expression of UCP in Yeasts
II. The Uncoupling Protein Gene
  A. Organization of the Rat UCP Gene
  B. Comparison of the UCP Gene from
  C. Comparison of the UCP Gene with Genes of Other Mitochondrial Carriers
  D. Control of UCP Gene Transcription
  E. Polymorphism of the UCP Gene in Humans
III. Conclusions and Perspectives
References
¹ Abbreviations: UCP, uncoupling protein; BAT, brown adipose tissue; CAT, chloramphenicol acetyltransferase.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 56. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
Mammals have two types of adipose tissue, distinguishable by their color. White adipose tissue consists of unilocular lipid-storing adipocytes, referred to as white adipocytes. Brown adipose tissue (BAT) is made of multilocular lipid-storing cells, referred to as brown adipocytes. Morphological studies have revealed the presence of abundant mitochondria in the brown adipocytes. Biochemical studies have demonstrated that brown adipocyte mitochondria contain a unique membranous protein not found in any other
type of cell. This protein, referred to as the uncoupling protein (UCP), allows brown adipocyte mitochondria to oxidize substrates rapidly without ADP phosphorylation, thus promoting the dissipation of oxidation energy as heat (1-4).
BAT has long been regarded as a distinct tissue, present only in mammals. In contrast to white adipose tissue, which is located in subcutaneous, mesenteric, inguinal, retroperitoneal, parametrial, or epididymal regions, BAT is located in interscapular, axillary, cervico-intramuscular, intercostal, periaortic, and perirenal regions. The main function of white adipose tissue is energy storage, and an increase in the number of white adipocytes leads to obesity; BAT, however, has a very active metabolism, resulting in thermogenesis. BAT is active in newborn mammals (including babies), cold-adapted rodents, and hibernators on arousal. Besides cold-induced thermogenesis, BAT can also be activated following food intake and contributes to diet-induced thermogenesis. The function of BAT is particularly obvious in small mammals, which have an elevated metabolism. In large mammals, such as bovines, ovines, or humans, BAT is abundant at birth, but white adipose tissue rapidly becomes the dominant form of adipose tissue during development.
Both white and brown adipocytes have a mesodermic origin. Although the existence of a common precursor, upstream of the brown and white adipose precursor cells, cannot be ruled out, it is believed that brown and white adipocytes derive from distinct fibroblastic precursor cells (1-3).
Physiologically, brown adipocytes are controlled by sympathetic fibers that directly innervate the cells (1, 3, 5). Treatment of rodents with norepinephrine activates brown fat thermogenesis. The effect of norepinephrine in BAT is mediated by α1-, β1-, and β3-adrenoceptors. In fact, norepinephrine has a dual effect on cells comprising BAT. First, when norepinephrine is delivered to the surface of mature brown adipocytes, its interaction with adrenoceptors causes a rapid activation of both the UCP and thermogenesis. This activation cascade includes an increase in cAMP and free fatty acid levels. The second activating effect of norepinephrine on BAT is a rapid stimulation of UCP gene transcription in mature cells; moreover, if the sympathetic stimulation of BAT is prolonged beyond 24 hr, norepinephrine recruits dormant brown adipocytes and precursor cells, which convert into mature brown adipocytes with many mitochondria and a high UCP content.
Since the pioneering work of Nicholls and others on the loose coupling of thermogenic BAT mitochondria (1, 4, 6, 7), and since the first observation of a 32-kDa membranous protein induced in BAT mitochondria of cold-adapted rats (8), our research, presented herein, has been dedicated to analysis of both the functional organization of the UCP and the mechanisms that strictly control UCP gene transcription.
I. The Uncoupling Protein
A. The Uncoupling Pathway of Brown Adipose Tissue Mitochondria
The discovery of a natural uncoupling mechanism in brown adipose tissue mitochondria elicited great interest in the field of bioenergetics. In fact, most of the main characteristics of this uncoupling pathway had been described (reviewed in 6) prior to the identification of a specific uncoupling protein in the inner membrane of brown adipocyte mitochondria (7, 8). It was reported that brown adipose tissue mitochondria are exceptionally permeable to protons, and that this permeability could prevent mitochondrial respiratory control and the build-up of a membrane potential compatible with ATP synthesis. Moreover, it was possible to restore respiratory control by addition of purine di- or triphosphate nucleotides and after removal of endogenous free fatty acids (6). Brown adipose tissue mitochondria are also permeable to various anions, such as chloride and bromide (1, 4, 6, and references therein). Because this anion permeability is inhibited by nucleotides in the same concentration range and with the same molecular specificity as the proton permeability, it was proposed that the same pathway was responsible for both types of permeability.
Regulation of this pathway was studied in detail by Nicholls and co-workers and other groups (6). They concluded that removal of free fatty acids alone is not sufficient to generate a membrane potential high enough to restrain the respiration rate. They also demonstrated that the presence of nucleotides alone allowed higher values of transmembrane potential. However, under these conditions, the value of the potential was still below the value required for coupled respiration; clearly, the recoupling of brown adipose tissue mitochondria studied in vitro required both removal of free fatty acids and the presence of purine nucleotides. Several experiments suggested that these two effectors act through different mechanisms: first, fatty acids and nucleotides do not compete for occupancy of the same site, and fatty acids do not change the affinity of nucleotides for UCP; second, contrary to what is known about proton permeability, the anion permeability of the UCP is not affected by removal of fatty acids. It was also shown that the UCP in mitochondria is not sensitive to endogenous nucleotides and is exclusively regulated by external nucleotides. It was reported that, in the absence of nucleotides and the presence of free fatty acids, the uncoupling protein pathway could generate a proton conductance far beyond the value necessary for full uncoupling of respiration (9).
In order to explain the control of UCP activity by ligands, most experiments were carried out using isolated brown fat mitochondria. However, these experiments did not define the actual conditions under
which the UCP works inside brown adipocytes. A critical point is that the intracellular concentration of nucleotides is sufficient to maintain UCP inhibition. The concentration of nucleotides actually able to interact with the UCP may, however, be lowered by complexation of nucleotides with magnesium ions, by a compartmentation mechanism (10), or by a putative activating mediator. Interestingly, a series of experiments showed that a supply of fatty acids to isolated mitochondria induced respiration uncoupling, even in the presence of 3 mM ATP (11). Moreover, using isolated brown adipocytes, it was observed that norepinephrine stimulation or fatty acid addition induced uncoupled respiration (12). Experiments with brown adipocytes isolated from warm- or cold-adapted animals demonstrated that the intensity of uncoupled respiration correlated strongly with UCP levels.
It had been observed that a high membrane potential could drive anions across the inner membrane of brown fat mitochondria even in the presence of inhibitory concentrations of nucleotides (13). Accordingly, it was proposed that free fatty acids act by lowering the transmembrane potential necessary for transport to occur in the presence of nucleotides (11). Such a mechanism could be rather inefficient, which may explain why the UCP is expressed in large excess relative to the amount that would suffice for uncoupled respiration if the UCP were fully activated. The exceptionally high transcriptional activity of the UCP gene could be understood as a mechanism compensating for the relatively weak molecular activity of the UCP.
The present models for the UCP proton transport mechanism show the importance of free fatty acids. Experiments in which UCP activity was reconstituted into liposomes led to two models. According to the first model, the fatty acid anion is transported by the UCP and the protonated form of the fatty acid diffuses across the lipid phase of the membrane (14); in that case, the anionic conductance of the UCP is indirectly responsible for the uncoupling by allowing free fatty acids to behave as protonophores. According to the second model, free fatty acids bind the UCP, and their carboxyl group is the proton donor; in that case, fatty acids are not translocated through the membrane (15). The two models share the idea that the protonations/deprotonations of the carboxyl groups of fatty acids participate in proton transport.
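The two-effector rule described above for isolated mitochondria (recoupling requires both the removal of free fatty acids and the presence of purine nucleotides) can be written as a small truth table. The following is only a qualitative sketch: the function and variable names are hypothetical, and the all-or-none "coupled"/"uncoupled" labels are a simplifying assumption that ignores the intermediate membrane potentials mentioned in the text.

```python
from itertools import product


def respiration_state(free_fatty_acids_present: bool,
                      purine_nucleotides_present: bool) -> str:
    """Qualitative state of isolated BAT mitochondria (illustrative rule only).

    Assumption: coupling is treated as all-or-none. It is reached only when
    purine nucleotides are present AND free fatty acids have been removed.
    """
    if purine_nucleotides_present and not free_fatty_acids_present:
        return "coupled"
    return "uncoupled"


def truth_table() -> dict:
    """Enumerate the four combinations of the two effectors."""
    return {
        (ffa, nuc): respiration_state(ffa, nuc)
        for ffa, nuc in product((False, True), repeat=2)
    }


if __name__ == "__main__":
    for (ffa, nuc), state in truth_table().items():
        print(f"fatty acids={ffa!s:5}  nucleotides={nuc!s:5}  ->  {state}")
```

Under this rule, adding fatty acids uncouples respiration even when nucleotides are present, which matches the observation reported with 3 mM ATP (11).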
B. The UCP, a Mitochondrial Carrier
The discovery of the UCP was the result of two different approaches. Considering that nucleotides bind to the inner membrane of brown fat mitochondria and inhibit the uncoupling pathway, [32P]azido-ATP was used to label a 32-kDa protein, distinct from the ADP/ATP carrier (7). It was known that cold adaptation of rats increases brown fat thermogenic capacity. An analysis of possible changes in the protein composition of brown fat during cold adaptation demonstrated that the relative and total amounts of a 32-kDa
membranous protein are strongly increased in the brown fat mitochondria of cold-adapted rats, relative to animals kept at room temperature (8). The molecular mass (33 kDa) of this protein, termed the uncoupling protein, as well as its ability to bind nucleotides, suggested that it is related to the ADP/ATP carrier. In agreement with this hypothesis, a modification of the procedure used to purify the ADP/ATP translocator allowed Klingenberg and his collaborators to purify the 32-kDa UCP (16). This purification was followed by the production of antibodies by several groups, including our own (17). Antibodies against the UCP confirmed that it is unique to brown adipose tissue mitochondria. Reconstitution experiments confirmed the proton translocating activity of the UCP (18-20).
The amino acid sequence of the UCP was determined separately in two laboratories. Aquila and Klingenberg and their colleagues sequenced the purified hamster UCP (21). We first cloned the rat UCP cDNA (22) and used it to derive an amino acid sequence (23). Then rat, mouse, and rabbit cDNAs were cloned in other laboratories. From the first two studies (21, 23), it was immediately recognized that the closest protein sequence was that of the bovine ADP/ATP translocator, which was the first mitochondrial carrier to be sequenced. It was proposed that the two proteins belong to a family of mitochondrial carriers.
C. UCP Sequence and Similarities
New members of the family of mitochondrial carriers have been identified (24, 25). This family presently has more than 20 members, including the mitochondrial phosphate carrier, the oxoglutarate carrier, and the citrate carrier; several members of this family have been identified based on sequence similarity, but their biological activity has not yet been elucidated. The mitochondrial carriers share several characteristics. They are all about 300 amino acids long and have molecular masses of about 30 kDa. Analysis of their sequences reveals a triplicated structure built from a 100-amino acid domain. Each domain contains the following motif: P-x-(DE)-x-(LIVAT)-(RK)-(LIFMV). Such a motif (deposited in the PROSITE data base, accession number PS00215) can be used to identify potential mitochondrial carriers from amino acid sequences.
Figure 1 shows the alignment of the different UCP sequences presently available. The partial sequence of the Etruscan shrew Suncus etruscus (this small homeothermic mammal has a body weight of 2 g and a very elevated thermogenesis) was determined in collaboration with Dr. S. Klaus (Marburg University). Figure 2 presents an intriguing characteristic of the UCP sequence. Alignment of the UCP and the ADP/ATP carrier reveals that the two proteins are partially homologous and exhibit a stronger similarity in their C-terminal
third (23). We have reported that a group of amino acids, EG-AFFKG, present in the UCP and the ADP/ATP carrier, is also present in the estrogen receptor and in other members of the large family of DNA-binding proteins known as nuclear receptors, which are involved in gene transactivation (26). Moreover, in these receptors, this region participates in DNA recognition (27). However, it remains difficult to speculate on properties common to mitochondrial carriers and nuclear receptors. The homology between the UCP and the ADP/ATP carrier strengthened our conviction that the EG-AFFKG region is involved in nucleotide binding to both carriers, and stimulated us to start a program of recombinant expression of the UCP in order to investigate this hypothesis. Three different mutations or deletions were made in the rat UCP to analyze the contribution of the EG-AFFKG motif to nucleotide binding to UCP (Fig. 2). Data corresponding to UCP mutants are given in Section I,E.
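The carrier signature quoted in this section lends itself to a simple pattern search. The sketch below translates the motif exactly as it is written in the text, P-x-(DE)-x-(LIVAT)-(RK)-(LIFMV), into a regular expression and checks whether a sequence carries at least one hit in each of three consecutive 100-residue windows. The function names and the fixed 100-residue window are assumptions made for illustration, and the test sequence is an invented toy (alanine filler around the fragment PLDTAKV), not a real carrier.

```python
import re

# Motif as quoted in the text: P-x-(DE)-x-(LIVAT)-(RK)-(LIFMV)
CARRIER_MOTIF = re.compile(r"P.[DE].[LIVAT][RK][LIFMV]")


def find_carrier_motifs(sequence):
    """Return (1-based start position, matched residues) for each motif hit."""
    return [(m.start() + 1, m.group())
            for m in CARRIER_MOTIF.finditer(sequence.upper())]


def looks_triplicated(sequence, repeat_length=100):
    """True if each of three consecutive windows contains a motif start.

    Assumption: repeats are treated as fixed-length windows, which is only a
    rough stand-in for the roughly 100-amino acid domains described above.
    """
    starts = [pos for pos, _ in find_carrier_motifs(sequence)]
    return all(
        any(i * repeat_length < pos <= (i + 1) * repeat_length for pos in starts)
        for i in range(3)
    )


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    toy_repeat = "A" * 20 + "PLDTAKV" + "A" * 73  # 100 residues, one motif
    toy_sequence = toy_repeat * 3
    print(find_carrier_motifs(toy_sequence))
    print(looks_triplicated(toy_sequence))
```

A hit in every window is only suggestive; a real search would normally use the full PROSITE entry rather than this shortened form.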
FIG. 1. Alignment of sequences of the UCP from different animal species. Sequences are from the Swiss-Prot data base except for the partial sequence of the shrew, Suncus etruscus (unpublished data obtained in collaboration with Dr. S. Klaus, Marburg University). The Clustal alignment program was used. Asterisks indicate identity, whereas dots correspond to amino acids sharing similar properties. Prediction line: transmembranous α-helices were predicted (H), or simply suggested (h). Antigenicity and topology lines indicate domains recognized by anti-UCP sera and their location relative to the inner mitochondrial membrane (see also Fig. 3).
D. UCP Topology
We tried to obtain soluble forms of the rat UCP by expressing various fragments of UCP cDNA in frame with the sequence encoding the malE protein, which is a periplasmic component of Escherichia coli (28). Several fusion proteins made of short UCP moieties or of 103 amino acids were soluble and efficiently targeted to the periplasmic space (28). Inclusion bodies were generated when the whole UCP attached to malE was expressed. In preliminary experiments, we observed that a marginal amount of fusion protein prepared from inclusion bodies could be solubilized, renatured, and purified on an amylose column (B. Miroux, unpublished data). The fusion proteins were not useful for assaying nucleotide binding. Polyclonal antibodies against purified UCP (17) were used to check the level of expression of fusion proteins in E. coli. It was decided to use the small fusion proteins to map antigenic sites present in the UCP. Antibodies directed toward a specific region of other mitochondrial carriers have been successfully used to investigate their orientation inside the membrane (29). However, in that case, antibodies were raised against synthetic peptides (29). In our studies, we purified fusion proteins that were subsequently used to purify corresponding antibody subsets from total antiserum. Examination of the reactivity of purified antibodies toward different types of fusion proteins with
partially overlapping sequences, and toward a fusion protein lacking FKG residues at positions 267 to 269 (Δ3 mutant in Fig. 2), precisely delineated an antigenic site containing these three amino acids. Purified antibodies were used to determine the topological situation of the antigenic site. The weak reactivity of such antibodies toward freeze-thawed mitochondria contrasted with their high reactivity toward vesicles from sonicated mitochondria. This experiment demonstrated that the antigenic site containing F, K, and G residues is oriented toward the matrix side of mitochondria (28). Because it had been shown that trypsin was able to cleave the C-terminal extremity of the UCP in freeze-thawed mitochondria (30) (Fig. 2), our data allowed us to propose the existence of a transmembranous segment in the UCP between amino acids 253 and 292 (Fig. 3).
[Figure 2: sequence alignment of the estrogen receptor zinc finger region (ESTR), the bovine ADP/ATP carrier (AAC bov, from residue 253), and the rat UCP (UCP rat, from residue 251) around the shared EG-AFFKG motif, with the mutant proteins UCP F/Y, UCPΔ3, and UCPΔ9 shown below.]
FIG. 2. Alignment of the C-terminal domain of the rat UCP and bovine ADP/ATP carrier; similarity with the zinc finger domain of the human estrogen receptor. In the estrogen receptor (ESTR) sequence, z indicates cysteines coordinated to Zn ion; H shows amino acids forming helical fragments present in the DNA binding domain (27). In sequences of the two mitochondrial carriers, + identifies amino acids labeled by azido derivatives of ATP or ADP in the UCP (44, 45) or in the ADP/ATP carrier (46); a dash indicates an arginine residue of which the mutation in the UCP (42) or the ADP/ATP carrier (43) led to loss of activity. The bottom part of the figure shows three types of UCP mutants analyzed (41).
This type of experiment, together with evidence from freeze-thawed mitochondria and sonicated particles that polyclonal antibodies against UCP can react with either the external face or the inner face of the inner membrane, encouraged us to set up a strategy for the mapping of several antigenic sites (31). An expression library of chimeric proteins, containing different short domains of UCP fused to malE, was made in E. coli and was screened with polyclonal antibodies to detect clones corresponding to antigenic sites. Positive clones were recovered and their plasmids were sequenced. Then the
reactivity of antibodies toward a fusion protein containing an antigenic site was tested against freeze-thawed mitochondria or sonicated particles. This approach led us to determine the orientation of five different sites in the UCP (31) (Fig. 3). To date, this study provides the most complete topological description of a mitochondrial carrier (31). It supports the model proposed for mitochondrial carriers from computerized predictions, with six transmembranous α-helices linked by polar loops (25). Each repeat of 100 amino
acids present in mitochondrial carriers is made of two transmembranous α-helices linked by a hydrophilic loop (Fig. 3). Remarkably, all delineated antigenic regions of UCP are located at the N-terminal extremity of α-helices. The apparent weak immunogenicity of hydrophilic loops may suggest that they are not exposed toward the hydrophilic intermembrane space, but rather are folded into the membrane. The methodology we developed may be used for any other membranous protein.
[Figure 3: model of the UCP in the inner mitochondrial membrane with boxed antigenic sites, and antibody titration curves obtained with mitoplasts (Mito) and submitochondrial particles (Submito).]
FIG. 3. Topology of UCP deduced from epitope analysis. Five antigenic sites were identified (28, 31) corresponding to amino acids 1-11, 61-79, 105-118, 164-184, and 255-273. These sites are boxed on the model proposed for the UCP and other mitochondrial carriers. Antibodies specific for each antigenic region were tested toward mitoplasts or sonicated submitochondrial particles; the corresponding titration curves are shown above or below each boxed antigenic region. No antigenic domain was found at the C-terminal end of the predicted helix number 4 or at both extremities of the putative helix number 5, although there is little doubt about their existence. These two transmembranous helices are depicted shaded grey to indicate that this part of the model is not yet supported by experimental data.
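The reasoning used to orient an antigenic site can be summarized as a comparison of two readings: antibody reactivity toward freeze-thawed mitochondria, where only the external face of the inner membrane is accessible, and toward sonicated particles, where the matrix face is exposed. The sketch below is a minimal illustration of that comparison. The function name, the threshold ratio of 3, and the numerical readings are all hypothetical, and treating the converse pattern as evidence for an external site is a simplifying assumption rather than the published procedure.

```python
def classify_site(reactivity_freeze_thawed, reactivity_sonicated, ratio=3.0):
    """Assign an antigenic site to one face of the inner membrane.

    reactivity_freeze_thawed: antibody signal with freeze-thawed mitochondria
    reactivity_sonicated:     antibody signal with sonicated particles
    ratio:                    hypothetical fold difference required for a call
    """
    if reactivity_freeze_thawed <= 0 and reactivity_sonicated <= 0:
        return "undetermined"
    if reactivity_sonicated >= ratio * reactivity_freeze_thawed:
        # Weak signal on freeze-thawed mitochondria, strong on sonicated particles
        return "matrix side"
    if reactivity_freeze_thawed >= ratio * reactivity_sonicated:
        return "external side"
    return "undetermined"


if __name__ == "__main__":
    # Invented readings, for illustration only.
    print(classify_site(0.1, 1.0))
    print(classify_site(1.0, 0.1))
    print(classify_site(0.5, 0.6))
```

The first call mirrors the pattern reported for the site containing the F, K, and G residues: weak reactivity toward freeze-thawed mitochondria and high reactivity toward sonicated vesicles.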
E. Expression of UCP in Mammalian Cells
In parallel experiments, we explored different systems of expression of functional UCP. The rat UCP was first expressed in Xenopus laevis oocytes injected with UCP mRNA synthesized in vitro from a transcription plasmid (32). UCP was expressed and present in Xenopus oocyte mitochondria, but its uncoupling activity and its ability to bind nucleotides could not be assayed. The rat UCP cDNA was then placed under the control of the SV40 promoter in an expression vector active in mammalian cells (33). Transient and stable expression of UCP in CHO cells were obtained. Mitochondria containing UCP were isolated from stable cell lines. Respiration and membrane potential measurements showed a weakly coupled respiration and a decreased membrane potential, in comparison with mitochondria isolated from wild-type CHO cells. Addition of GDP to mitochondria containing UCP improved the level of coupling of oxidative phosphorylation and increased the membrane potential, as expected from the presence of a functional UCP (33).
We then tried to express the Δ3 and Δ9 UCP mutants in CHO cells. These mutants correspond to a short or long deletion of a putative nucleotide binding site in the UCP (Fig. 2). Curiously, whereas transient expression of these mutants was observed, we never succeeded in cloning CHO cell lines expressing UCP mutants. One interpretation of these observations was that the UCP mutants were no longer inhibited by nucleotides and provoked cell death. However, we were unable to set up suitable, truly inducible expression vectors in mammalian cells and could not test this hypothesis.
F. Expression of UCP in Yeasts
1. EXPRESSION OF WILD-TYPE UCP IN Saccharomyces cerevisiae
Since 1989, we have been developing a research program on the recombinant expression of rat UCP in Saccharomyces cerevisiae in close association with Dr. E. Rial (Madrid). We used expression vectors that can be strongly inhibited by glucose or activated by galactose. To better express UCP in yeasts, all nontranslated regions of the cDNA were removed. An unexpected problem was the existence of a very high electrophoretic conductance activated by CDP and ATP in yeast mitochondria; this conductance completely obscured UCP activity.
THE MITOCHONDRIAL UNCOUPLING PROTEIN
Study of this phenomenon, which was initially considered an artifact, led us to propose a physiological significance for it in yeasts (34). Moreover, we observed that this conductance was inhibited by phosphate (34). In the presence of phosphate, UCP activity (GDP-inhibitable proton or chloride conductance), estimated from osmotic swelling measurements, could be assayed in yeast mitochondria. We also used respiring mitochondria in order to assay their membrane potential and respiratory activity. UCP expressed in yeast was functional: yeast mitochondria were sensitive to free fatty acids, which markedly decreased mitochondrial membrane potential and increased the respiratory rate; GDP addition decreased respiration rate and restored the high membrane potential (35) (Fig. 3). More recently, a careful analysis of the respiratory rate of yeast mitochondria containing UCP showed that the uncoupling of respiration mediated by UCP existed prior to fatty acid addition, and that GDP was initially able to decrease the respiratory rate; such a phenomenon was not observed in control mitochondria (36). Two other groups studied UCP expression in yeast. Bathgate et al. showed that UCP impeded yeast growth (37). The aim of these authors was to validate a model of expression of proteins deleterious to plant cell mitochondria, in order to manipulate pollen fertility. This idea arose from the fact that maize male sterility is due to activation of a pathway allowing a specific conductance through the T-URF13 protein of pollinic cells (38). Garlid and colleagues also obtained UCP expression in yeast; they purified UCP from yeast and reconstituted its activity into liposomes (39).
2. FLOW CYTOMETRY OF YEAST EXPRESSING UCP
Mitochondria exhibit a high membrane potential and constitute a very electronegative intracellular compartment that can easily accumulate cations; thus, fluorescent and lipophilic cationic probes have been designed to label mitochondria specifically. In particular, Petit et al. described the conditions required to label yeast mitochondria using a fluorescent 3,3′-dihexyloxacarbocyanine iodide [DiOC(6)3] probe (40). The presence or absence of UCP expression by yeast mitochondria has been analyzed by flow cytometry (41). Expression of wild-type UCP had a small but detectable effect on the accumulation of DiOC(6)3 by yeast mitochondria (C. Fleury, unpublished data). UCPF/Y, UCP Δ3, and UCP Δ9 mutants had a gradual effect on mitochondrial activity (Fig. 4). UCP Δ9 behaved as a very potent mitochondrial uncoupler, collapsing the mitochondrial membrane potential in almost all yeasts 3 hr after induction of the expression (Fig. 4). When the incubation time was prolonged, cells exhibiting a normal mitochondrial potential were obtained. Sorting of these cells revealed two populations of cells: cells with “uncoupled” mitochondria containing several copies of the expression vector and cells with “coupled” mitochondria having a very low number of copies of the expression vector (41).
DANIEL RICQUIER AND FREDERIC BOUILLAUD
[Figure 4: distribution curves labeled UCPΔ9 and UCPΔ3, among others; x-axis, fluorescence (log scale on 256 channels).]
FIG. 4. Flow cytometry analysis of yeast cells expressing UCP. The different curves show the distribution of yeast cells according to DiOC(6)3 fluorescence intensity, which is related to mitochondrial membrane potential. The thin curves correspond to control yeast (transfected by the UCP cDNA in the wrong orientation), treated or not treated with the synthetic protonophore carbonyl cyanide m-chlorophenylhydrazone (CCCP) that uncouples respiration and collapses mitochondrial membrane potential. The thick curves correspond to yeast expressing wild-type UCP (UCP+) or UCP mutants (see also Fig. 2).
3. UCP MUTANTS AFFECTING NUCLEOTIDE SENSITIVITY
Analysis of mitochondria isolated from yeast indicated that the UCP Δ3 mutant (Fig. 2) was still activated by fatty acids but did not respond to nucleotide addition (Fig. 5). Therefore, it was concluded that residues 267-269 are essential to UCP nucleotide binding. The activity of this UCP Δ3 mutant was consistent with the prediction made from sequence alignments (41). Another group reported that replacement of arginine 276 by leucine resulted in a UCP that was insensitive to nucleotide inhibition (42). Mutagenesis of the ADP/ATP carrier expressed in yeast also pointed to the importance of arginine residues (43). Previously, photoaffinity labeling experiments using azido-ATP or ADP identified residues close to this region in the UCP (44, 45) as well as in the ADP/ATP carrier (Fig. 2; 46). Thus, the C-terminal third of UCP and, more precisely, residues forming the N-terminal end of helix 6 are implicated in nucleotide binding. Taking into account both the fact that inhibitory nucleotides come from the cytosolic compartment and the topological model of the UCP (Fig. 3), we propose that the nucleotide binding site of UCP is a structure open toward the cytosol and tightly closed to the matrix side (Fig. 6).
[Figure 5: oxygen consumption and membrane potential traces for control, UCP+, and UCPΔ3 mitochondria; time bar, 2 minutes.]
FIG. 5. Respiration and membrane potential measurements of yeast mitochondria containing wild-type or mutant UCP. Top traces show oxygen consumption by mitochondria isolated from yeast (oxygen electrode recording). The membrane potential (lower traces) was assayed simultaneously from the same mitochondria labeled with a fluorescent probe (40, 41). Control, yeast mitochondria not expressing UCP; UCP+, yeast mitochondria expressing wild-type UCP; UCPΔ3, yeast mitochondria expressing a UCP mutant in which amino acids 267, 268, and 269 were deleted (see Fig. 2). N, NADH addition; P, palmitate addition; G, GDP addition.
FIG. 6. Schematic representation of interactions between UCP and its ligands, deduced from experiments. Amino acids shown form the C-terminal end of the UCP (positions 253 to 306). Residues common to the UCP and the ADP/ATP carrier are shaded. Boxed residues are involved in nucleotide binding (42, 44, 45). The cysteine at position 304 that can influence fatty acid sensitivity is in the encircled group of amino acids.
4. CYSTEINE MUTATIONS AND FATTY ACID SENSITIVITY
In order to define whether any of the cysteine residues of the UCP are necessary for its activity, each cysteine residue was separately mutated into a serine residue. Functional analysis of mutations from isolated mitochondria revealed that none of the seven cysteine residues present in the UCP is critical for its activity, although quantitative differences were observed. This study (35) disagreed with the theory supporting participation of cysteine residues in UCP proton transport. We monitored growth of yeast strains after addition of bromopalmitate, a nonmetabolized form of palmitate. Growth of yeast expressing UCP was significantly slower in the presence of bromopalmitate. In particular, we observed that the growth of a yeast strain expressing the UCP mutant Cys304/Ser was strongly impaired (36). These data pointed to cysteine-304 as a residue implicated in fatty acid activation of the UCP. This cysteine residue was changed to glycine, alanine, threonine, isoleucine, or tryptophan. In all cases, the sensitivity to fatty acid was modified. A good correlation was found between the growth rate in the presence of bromopalmitate and the fatty acid effect on respiring mitochondria (36). The C-terminal end of UCP contains a hydrophilic domain of amino acids 296 to 306, which is not present in the ADP/ATP carrier. It was tempting to propose that this polar tail was a “fatty acid sensitizer.” In fact, this is not true, because a truncated UCP lacking residues 296 to 306 was still activated by fatty acids (C. Fleury and F. Bouillaud, unpublished data).
II. The Uncoupling Protein Gene
A. Organization of the Rat UCP Gene
Southern analysis of rat genomic DNA, using the rat UCP cDNA, shows that this gene is unique and is present only in mammals (22). Using the same probe, the rat UCP gene was cloned from a rat genomic library (47). An 18-kb DNA fragment was isolated that included the full-length transcribed region, 5 kb upstream of the cDNA 5′ end and 5 kb downstream of the cDNA 3′ end (48). This 18-kb DNA fragment was entirely sequenced (48). The position of the unique transcription start site was determined using both primer extension and S1 nuclease mapping. S1 nuclease analysis of the 3′ extremities revealed two extremities separated by 366 nucleotides; sequencing of this region indicated that they correspond to two polyadenylylation sites. Alignment with the cDNA sequence revealed 6 exons whose extremities were characterized by consensus sequences of GT/AG splicing sites (48).
[Figure 7: map of the rat UCP gene on a scale marked at 0, 5, 13, and 18 kb, with labels HSS -2120, HSS -150, -4551, -2494, -2283, and two ATAAA labels.]
FIG. 7. Organization of the rat UCP gene. The transcription unit contains six exons and two polyadenylylation sites (48). Human and bovine UCP genes have lost the first polyadenylylation site. In the DNA located 5' upstream of the transcriptional start site, two hypersensitive sites (HSS) were observed (48). The 5' flanking region is characterized by a minimal promoter (MP) (at bp -157) and a 211-bp enhancer element at bp -2494 (64).
The organization of the rat UCP gene is shown in Fig. 7. A TATA box is present at position -28, very close to a putative CAAT box at -31. As also noted by Kozak et al. for the mouse gene (49), every exon encodes a particular transmembranous domain of the UCP.
B. Comparison of the UCP Gene from Different Species
In all species studied, there is a single gene encoding UCP. We also cloned human (47, 50) and bovine (51) UCP genes, whereas the mouse gene was cloned by Kozak and colleagues (49). In the case of the human gene, the whole transcription unit was isolated, preceded by 2 kb of DNA upstream of the putative TATA box (50); this 5' flanking region was recently extended to -7 kb (52). In fact, the genomic organization of the UCP gene (at least the transcription unit) is well conserved among animal species. In rat, mouse, and human, the UCP gene is made of 6 exons and the intron positions are almost entirely conserved. Human and bovine genes have only one polyadenylylation site. The human UCP gene was assigned to the long arm of chromosome 4 in q31 (50), whereas the mouse gene was assigned to chromosome 8 (53).
C. Comparison of the UCP Gene with Genes of Other Mitochondrial Carriers
The homology between UCP and other mitochondrial carriers that share a triplicated structure (each repeated domain being encoded by 2 exons) has been discussed in the first part of this essay. It implies that genes of several
mitochondrial carriers have a more or less similar organization and derive from a common ancestor. The triplicated structure of maize ADP/ATP carrier genes is also obvious. This is not true for human ADP/ATP carrier 1 and Neurospora crassa ADP/ATP carrier genes, which contain 4 exons; however, in these genes, several exonic limits are similar to those found in the UCP gene. The human ADP/ATP carrier 1 gene has been assigned to chromosome 4, as was done for the human UCP gene (50). A major difference between the UCP gene and genes of other carriers is that the UCP gene is the only one to be expressed in a single cell type and to be strongly inducible by physiological factors.
D. Control of UCP Gene Transcription
1. UCP GENE TRANSCRIPTION IS CELL-SPECIFIC AND POSITIVELY CONTROLLED BY NOREPINEPHRINE, THYROID HORMONES, AND RETINOIC ACID
As far as is known, no UCP has been detected, even in a very low amount, in tissues other than BAT. Therefore, it is presently believed that the UCP gene is uniquely transcribed in brown adipocytes. Several physiological or pharmacological studies have demonstrated that norepinephrine is a strong activator of UCP synthesis (1-3, 54). Run-on transcription in nuclei isolated from rats either exposed to 5°C for 15 min or treated with an adrenergic agonist demonstrated a transcriptional control of the UCP gene (54). This conclusion was confirmed using cultured brown adipocytes (55-59). The unique transcription of the UCP gene in brown adipocytes, as well as its rapid and marked activation by norepinephrine, encouraged us to analyze the mechanisms controlling UCP gene transcription. In addition, thyroid hormones positively regulate the transcription of the UCP gene (60, 61) and retinoic acid activates it strongly (62, 63).
2. ESSENTIAL cis-ACTING ELEMENTS ARE IN DNA UPSTREAM OF THE TRANSCRIPTIONAL START SITE
In order to delineate cis-acting elements controlling UCP gene expression, we fused 4551 bp of DNA upstream of the transcriptional start site of the rat UCP gene to the DNA encoding chloramphenicol acetyltransferase (CAT). Transgenic mice bearing this transgene were created (64). Assays of CAT activity revealed that the transgene is uniquely expressed in the BAT of transgenic animals; moreover, exposure to cold markedly induced CAT activity in the BAT of these mice. In parallel experiments, the 4551-bp/CAT plasmid was introduced into primary cultures of brown adipocytes or other cells: a strong CAT activity was detected in brown adipocytes, and addition of norepinephrine or cAMP increased the CAT activity (64, 65). Taken together, data from transgenic animals and transfected cells demonstrated that essential cis-acting elements were present in the 4.5-kb piece of DNA used.
3. A 200-bp ENHANCER ELEMENT IS PRESENT AT -2.4 kb
In order to delineate regions involved in the regulation of UCP gene transcription, 5′ and internal deletions were made in the 4551-bp/CAT plasmid, and these new plasmids were used to transfect in vitro-differentiated brown adipocytes or other types of cells. These studies led to the identification of a strong 211-bp activating element, located between base pairs -2494 and -2283 (64). When this short element was fused, in a sense or antisense orientation, to the minimal promoter of the UCP gene (at bp -157) or to the promoter of Herpes simplex virus thymidine kinase, it behaved as an enhancer element (64). The importance of this enhancer was also noticed in the mouse UCP gene (66) and confirmed in the rat gene (63, 67).
4. A SHORT PIECE OF DNA CONTAINING THE ENHANCER DIRECTS SPECIFIC AND INDUCIBLE EXPRESSION OF A REPORTER GENE IN BROWN ADIPOCYTES OF TRANSGENIC MICE
In order to map cis-acting elements functionally, we developed a strategy based on the creation of transgenic mice. Because we had observed that 4551 bp of 5′ flanking DNA was able to direct, in a specific and regulated manner, the expression of a reporter gene in the brown adipocytes of transgenic mice (vide supra), we started a program to create transgenics from different types of CAT constructs. Recently, eight positive founder mice bearing the 211-bp enhancer and the first 400 bp of the 5′ flanking region attached to the CAT DNA were outbred to generate heterozygous lines (65). Four lines of transgenic mice, out of six analyzed, expressed CAT activity. In the four lines, a low CAT activity was detected in interscapular brown adipose tissue, but was undetectable in liver, heart, or brain (Fig. 8). Exposure to 5°C, or injection of these mice with norepinephrine, a β3-adrenoceptor agonist, or all-trans-retinoic acid, stimulated CAT activity in brown adipose tissue and did not induce any CAT activity in other tissues (Fig. 8). Therefore, data obtained from these transgenic mice demonstrated that a DNA fragment made of the 211-bp enhancer fused to the 400 bp of the proximal promoter contains sequences that can confer both specific transcription in brown fat and activation by cold, adrenergic agents, or retinoic acid. This short DNA construct is the smallest fragment known to be able to drive expression of a reporter gene specifically in brown adipose tissue (65). Using transgenic mice, Boyer and Kozak (68) proposed that a cis-acting regulatory sequence between -3 and -1.2 kb of the 5′ flanking region of the mouse gene is required for control of ucp gene expression. Although we have
[Figure 8: upper part, diagram of the construct with the enhancer (-2494 to -2283), the +1 start site, and CAT; lower part, CAT activity in Bat, H, L, W, and Br.]
FIG. 8. Organ specificity and regulation of the expression of the -400-enhancer-CAT reporter gene in transgenic mice. A schematic drawing of this construct, which was used to generate transgenic mice, is shown in the upper part of the figure. This transgene was made of the 211-bp enhancer of the rat ucp gene (bp -2494 to bp -2283) attached to the proximal region of the ucp promoter (bp -400 to bp +111) in front of the CAT gene (65). The lower part of the figure shows the CAT activity (arbitrary units) measured in brown adipose tissue (Bat), heart (H), liver (L), white adipose tissue (W), or brain (Br) of the -400-enhancer-CAT transgenic mice kept at 25°C (open columns) or exposed to 5°C for 16 hr (solid columns).
no data yet from transgenic mice bearing only the first 400 bp of DNA of the 5' flanking region, our recent data (65), and those obtained in Kozak's laboratory (66, 68), demonstrate that the 211-bp enhancer located at -2.4 kb plays a critical role in control of the UCP gene transcription.
5. MUTAGENESIS OF UCP ENHANCER DELINEATES A COMPLEX 20-bp REGULATORY ELEMENT, PARTIALLY RELATED TO THE AP-1 BINDING SITE AND THE RETINOIC ACID RESPONSE ELEMENT
We undertook the dissection of the rat UCP enhancer. In fact, although rat and mouse UCP enhancers share closely related sequences, cis-acting elements and trans-activators seem to differ to some extent (62, 64-66, 68). In the case of the mouse enhancer, a cyclic-AMP response element and two
TTCC motifs are essential to expression, and it was proposed that the activity of the enhancer results from cooperation between two elements separated by 110 nucleotides (66). In the case of the rat enhancer, a search for trans-activators was undertaken using an in vitro DNase I protection analysis and electromobility-shift assay (62). Two footprinted regions (FP1 and FP2) were delimited inside the enhancer (62); these footprints contain the two regions proposed to cooperate in the mouse enhancer (66). In vitro, the FP1 footprinted region in the rat enhancer was able to bind factors related to nuclear factor 1 and Ets1. Electromobility-shift assays in the presence of antibodies showed that the FP2-footprinted region can bind in vitro factors close to triiodothyronine receptors and the retinoid X receptor (see Fig. 9). This last observation stimulated us to look at the effect of retinoids on UCP gene transcription and to demonstrate that retinoic acid was in fact a strong activator of UCP gene transcription (62, 65). An activating effect of retinoic acid on UCP gene transcription has also been reported by others (63). Deletion of the enhancer completely abolished retinoic acid activation of rat UCP gene transcription (63, 65). A retinoic acid response element was tentatively localized at positions -2358 bp to -2334 bp in a footprint of the enhancer (62) (Fig. 9).
[Figure 9: map of the 5' flanking region with labels ENHANCER and PROMOTER; positions -2494, -2283, -509, and +1; factor labels NF1, Ets1, Jun, TR, RXR, RAR, UARE, CACCC box, C/EBP, Sp1, and CREB.]
FIG. 9. Organization of the 5' flanking region of the rat UCP gene and identity of trans-activators. The minimal promoter of the UCP gene can bind NF1, CREB, and Sp1. The distal part of the promoter can bind C/EBP and other unidentified factors at the level of a CACCC box (62). A positive role of C/EBPα and C/EBPβ on UCP gene transcription was reported by Yubero et al. (72). In the enhancer, binding sites for TR (triiodothyronine receptor), RXR, RAR, Jun, Ets1, and NF1 were proposed (63, 64, 65, 67). UARE is an element made of an AP-1-type binding site associated with an atypical retinoic acid response element. Mutations in UARE strongly impaired response to retinoic acid and norepinephrine (65; Larose and Ricquier, unpublished data). The participation of TR in UCP gene transcription was demonstrated by Rabelo et al. (67). The mouse UCP enhancer contains two CREs (66). The transcriptional start site is indicated (+1).
In a second step, a functional analysis, based on mutagenesis of the rat UCP enhancer, was carried out. These experiments were made with 1B8 cells, which are immortalized mouse brown adipocytes similar to HIB 1B cells (69, 70). These cells do not significantly transcribe the UCP gene, but addition of norepinephrine, cyclic AMP, or retinoic acid rapidly and markedly activates UCP transcription (62, 65). Transfection experiments with deleted or mutated CAT constructs demonstrated that neither the putative retinoic acid-responsive element nor the FP2 footprint mediates the retinoic acid effect. Other putative retinoic acid-responsive elements, such as the thyroid hormone response element identified by Silva and colleagues (67), were not confirmed (65). Because no retinoic acid response element was identified in the rat UCP enhancer using point mutations or short deletions, we made two large deletions in the enhancer (65). Each deletion strongly inhibited responsiveness of the enhancer to retinoic acid or norepinephrine. The two deletions split a TGAATCA motif, a sequence resembling the consensus AP-1 binding site (TGAC/GTCA). To investigate the possible role of this putative AP-1 binding site, we made two types of mutations, preventing Jun and Fos binding. These mutations strongly decreased the response to retinoic acid of the 4551-bp/CAT plasmid transfected in 1B8 cells. These experiments demonstrated that integrity of the putative AP-1 binding site, located between bp -2422 and bp -2416, is required for enhancer activity in the presence of retinoic acid (65). Because norepinephrine is a physiological activator of UCP gene transcription (54-59), we also tried to map cyclic-AMP-responsive elements (CREs) in the 5' flanking region of the UCP gene. A CAT construct made of the minimal promoter responded to norepinephrine addition in 1B8 cells. A better response was observed with the 4551-bp/CAT plasmid. Deletion of the whole enhancer lowered its response, implying the presence of one or several CRE(s) inside the enhancer. Within the enhancer, two putative cyclic-AMP-responsive elements were mutated, but this did not alter the response to cyclic-AMP.
Surprisingly, mutagenesis of the AP-1-type domain lowered the activity of the 4551-bp/CAT DNA in the presence of norepinephrine, demonstrating that the AP-1-type domain is not only involved in the activation of UCP gene transcription by retinoic acid but is also involved in the activation by catecholamines (65; Larose and Ricquier, unpublished data). A direct contribution of an AP-1 binding site to the activation of transcription by retinoids has not been reported. An explanation of the function of AP-1 in the activation of UCP transcription by retinoic acid could be that the retinoic acid effect is mediated by an unidentified retinoic acid-responsive element that is inhibited when the AP-1 site is mutated. This element could be the sequence immediately downstream of the AP-1-type domain. This sequence is similar to a sequence important in tat-induced activation of the HIV long terminal repeat and is contained in an inverted repeat of type 2. Interestingly, we mutated this element and observed a strong decrease in enhancer
activity in the presence of retinoic acid or cyclic-AMP. Moreover, a double mutation of the AP-1 element and this type-2 inverted repeat abolished almost 90% of the enhancer activity in transfected cells (65; unpublished data). In gel-shift experiments, the AP-1-type element can bind proteins related to Jun and Fos, whereas the type-2 inverted repeat can bind RARβ and RXRα (65; Larose and Ricquier, unpublished data). In conclusion, these experiments delineated a complex 20-bp element in the enhancer, termed UCP gene activation regulatory element (UARE), which plays a major role in activation of UCP gene transcription by retinoids and norepinephrine (Fig. 9).
6. OTHER trans-FACTORS CAN BIND THE 5′ FLANKING REGION OF THE UCP GENE
Using DNase I footprint and gel-shift analyses, hypersensitive regions and binding sites for the CACCC-box binding protein were identified at position -500 (62, 71). Two sites that can bind the CCAAT/enhancer-binding protein were also identified at positions -457 and -325 (72). Three other footprinted boxes were also identified just ahead of and in the minimal promoter. The putative factors able to bind these three boxes were nuclear factor 1, cyclic-AMP response element-binding protein, and Sp1 (62) (Fig. 9). The functional importance of this cyclic-AMP response element in the mouse gene was proved by mutagenesis and cell transfection (66).
7. PUTATIVE INHIBITORY REGIONS CAN ALSO CONTROL UCP GENE TRANSCRIPTION
Deletions in the 3′ part of the 4551-bp DNA attached to CAT DNA increased CAT activity both in transfected brown adipocytes and CHO cells, suggesting the presence of an inhibitory element in the proximal promoter (64). A silencer region was also delineated in the mouse promoter between -900 and -272 bp (66). We have also observed that, whereas the rat enhancer was active in brown adipocytes, the addition of the first 400 bp of the region upstream of the transcriptional start site inhibited the enhancer in CHO cells, but did not do so in brown adipocytes (64). In conclusion, our studies on the control of rat UCP gene transcription, experiments with transgenic mice, and in vitro analysis of the effect of mutations in the 5′ flanking DNA have established the importance of the enhancer located at -2.4 kb. The search for cis-acting elements mediating retinoic acid activation led to the unexpected discovery that mutations abolishing the response to retinoic acid and norepinephrine cluster in an element located at the 3′ boundary of the FP1 footprint; this element, UARE, is made of an AP-1-type element linked to an atypical retinoic acid-responsive element.
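The resemblance between the rat TGAATCA motif and the consensus AP-1 site (TGAC/GTCA) discussed in Section 5 can be made explicit with a short comparison. The following Python sketch is only an illustration: the function names are hypothetical, the consensus is written with the IUPAC code S (C or G), and the coordinates -2422 and -2416 are those quoted in the text, assumed here to be inclusive.

```python
# Illustrative sketch only; names are hypothetical and not taken from the cited studies.
# Degenerate-base codes used in the consensus; S means C or G (IUPAC convention).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG"}


def mismatch_positions(site, consensus):
    """Return the 1-based positions where `site` departs from `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [
        i + 1
        for i, (base, code) in enumerate(zip(site, consensus))
        if base not in IUPAC[code]
    ]


def span_length(start, end):
    """Number of positions from `start` to `end`, both ends included (assumption)."""
    return abs(end - start) + 1


rat_motif = "TGAATCA"      # motif quoted in the text for the rat enhancer
ap1_consensus = "TGASTCA"  # TGAC/GTCA written with the IUPAC code S

print(mismatch_positions(rat_motif, ap1_consensus))  # [4]
print(span_length(-2422, -2416))                     # 7
```

Under these assumptions the rat motif differs from the consensus at a single central position, and the quoted coordinates span 7 bp, the length of the motif.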
E. Polymorphism of the UCP Gene in Humans
Many studies, mostly made in rodents, have indicated a significant contribution of BAT thermogenesis to regulation of body weight and body fat content (3, 5). BAT is generally poorly active in genetically obese animals, whereas physiological or pharmacological activation of this organ facilitates energy expenditure and a decrease in body fat content. The UCP or UCP mRNA level or the level of UCP gene transcription is lowered in obese rats (54). Adrenergic treatment of rodents or dogs (73) reduces their body fat content. Genetic ablation of BAT in mice provokes a decrease of UCP by 96% and subsequent obesity (74). Expression of UCP in white adipose tissue of transgenic mice decreased adiposity in obese mice (75). All these data support a role for UCP and BAT energy expenditure in body energy equilibrium in animals. In humans, although recent studies based on UCP or UCP mRNA analyses have confirmed the presence of typical brown adipocytes (76, 77), the situation is unclear because it is not possible to assay the activity of BAT. However, because the role of inheritance in individual differences in body fat in humans has been recognized (78), the search for genetic defects contributing to obesity has stimulated research on several candidate genes, including the UCP gene. A study was undertaken to identify sequence variation in the human UCP gene and to investigate its relationship with parameters such as body weight, body mass index, or body fat content. This study allowed Oppert et al. to identify, for the first time, the presence of DNA polymorphism in the human UCP gene (79). When DNA was digested with the enzyme BclI and hybridized to the UCP probe, a 4.5-kb band was seen in most subjects and an 8.3-kb band was detected in some subjects. Allelic frequencies were 0.72 and 0.28, respectively. Three genotypes were found: BclI+/BclI+, BclI+/BclI-, and BclI-/BclI-, with respective genotype frequencies of 0.52, 0.40, and 0.08.
These frequencies were in Hardy-Weinberg equilibrium, and Mendelian inheritance was demonstrated by the segregation pattern in the families. We used this BclI restriction fragment length polymorphism to carry out association studies on 216 subjects from 64 families of the Québec family study, designed to investigate the role of genetics and DNA sequence variations in obesity and its complications (79). In fact, no differences were found in body mass index, percent body fat, subcutaneous fat, and resting metabolic rate among the three genotypes. However, when comparing low and high fat gainers for percent body fat during a 12-year period, a higher frequency of the 8.3-kb allele was found in the group of high fat gainers (79). This parameter correlated significantly with DNA sequence variations in the UCP gene. In collaboration with other groups, the same UCP gene polymorphism
was investigated in 238 morbidly obese Caucasian subjects. The analysis revealed that the presence of the UCP mutation may have deleterious effects on the progression of obesity during adulthood (80); it was concluded that the 8.3-kb allele of the UCP gene is a predictive factor associated with high weight gain. In collaboration with another medical group analyzing obese patients and restricting their food intake over 6 months, it was demonstrated that the 8.3-kb allele of the UCP gene is associated with a low decrease in body mass index, whereas obese patients having the BclI site in their UCP gene lost more weight (81). Recently, the polymorphic BclI site was mapped to the 5′ flanking region in the UCP gene (52), but no functional analysis of this region in the human gene has yet been undertaken.
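The statement that the BclI genotype frequencies are in Hardy-Weinberg equilibrium can be checked directly from the numbers quoted above. The short Python sketch below is an illustration, not the analysis performed in the cited study (79); the function names are hypothetical, and the only inputs are the allele frequencies (0.72 and 0.28) and genotype frequencies (0.52, 0.40, 0.08) given in the text.

```python
# Illustrative sketch only; function names are hypothetical.
def hardy_weinberg(p):
    """Expected genotype frequencies for a two-allele locus with allele frequency p."""
    q = 1.0 - p
    return {"+/+": p * p, "+/-": 2.0 * p * q, "-/-": q * q}


def allele_frequency(f_homozygous, f_heterozygous):
    """Frequency of an allele from the frequencies of its homozygotes and heterozygotes."""
    return f_homozygous + f_heterozygous / 2.0


# Values quoted in the text for the BclI polymorphism.
observed = {"+/+": 0.52, "+/-": 0.40, "-/-": 0.08}
expected = hardy_weinberg(0.72)

for genotype in observed:
    print(f"{genotype}: observed {observed[genotype]:.2f}, "
          f"expected {expected[genotype]:.4f}")

# Allele frequency recovered from the observed genotype frequencies.
print(f"{allele_frequency(observed['+/+'], observed['+/-']):.2f}")
```

With p = 0.72, the expected frequencies are p² = 0.5184, 2pq = 0.4032, and q² = 0.0784, which round to the reported 0.52, 0.40, and 0.08. A formal test of equilibrium would require the genotype counts, which are not given here.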
III. Conclusions and Perspectives
We have summarized in this review our contribution to research on UCP and the UCP gene. In order to better understand the functional organization of the UCP, recombinant expression of UCP or UCP mutants in yeast is a powerful system. The flow cytometry of yeast can be used to sort out cells according to their mitochondrial potential. Accordingly, we are presently extending this approach to analysis of a library of random mutants. In addition, the recombinant expression of various subdomains of the UCP in yeast and bacteria is in progress. We anticipate that such experiments will contribute to a further understanding of the structure and function of UCP. The specific transcription of the UCP gene in brown adipocytes remains a challenging question. Studies of the rodent UCP gene in transgenic mice strongly support an important role for the 200-bp enhancer at -2.4 kb in the ability of the rat UCP gene to be transcribed. Analysis of this enhancer points to a region containing an unexpected element, the mutation of which impairs transcriptional activation by retinoids or norepinephrine. This element, UARE, is made of an AP-1-type element attached to an unconventional retinoic acid response element. Elucidation of the molecular mechanisms at the UARE site will require further investigations and identification of proteins binding to this site. The hypothesis of specific factors controlling UCP gene transcription through binding to UARE will be explored.
ACKNOWLEDGMENTS
We express our gratitude to our collaborators, who contributed to most of the studies reported here and to those undertaken since 1984: A. M. Cassard-Doulcier, L. Casteilla,
106
DANIEL RICQUIER AND FREDERIC BOUILLAUD
0. Champigny, C. Fleury, C. Forest, G. G m t i , C. Gelly, M. Larose J. C. Matamda, C. LeviMeyrueis, G. Mory, S. Raimbault, J. P. Revelli, F. Serra, D. Vacher, and F. Villarroya. Our research is supported by Centre National de la Recherche Scientifique, Direction des Recherches, Etudes et Techniuqes, Fondation pour la Recherche Mkdicale, and Association de Recherches sur le Cancer. DR is established CNRS investigator; FB is established INSERM investigator.
REFERENCES

1. B. Cannon and J. Nedergaard, Essays Biochem. 20, 111 (1985).
2. S. Klaus, L. Casteilla, F. Bouillaud and D. Ricquier, Int. J. Biochem. 23, 791 (1991).
3. J. Himms-Hagen, Prog. Lipid Res. 28, 67 (1989).
4. D. G. Nicholls and R. Locke, Physiol. Rev. 64, 1 (1984).
5. N. Rothwell and M. Stock, Nature (London) 281, 235 (1979).
6. D. G. Nicholls, BBA 549, 1 (1978).
7. G. M. Heaton, R. J. Wagenvoord, A. Kemp and D. G. Nicholls, EJB 82, 515 (1978).
8. D. Ricquier and J. C. Kader, BBRC 73, 577 (1976).
9. D. G. Nicholls and E. Rial, Trends Biochem. Sci. 19, 489 (1984).
10. K. F. Lanoue, T. Strzelecki, D. Strzelecka and C. Koch, JBC 261, 298 (1986).
11. E. Rial, A. Poustie and D. G. Nicholls, EJB 137, 197 (1983).
12. S. A. Cunningham, H. Wiesinger and D. G. Nicholls, EJB 157, 415 (1986).
13. D. G. Nicholls, EJB 49, 585 (1974).
14. K. D. Garlid, D. E. Orosz, M. Modriansky, S. Vassanelli and P. Jezek, JBC 271, 2615 (1996).
15. E. Winkler and M. Klingenberg, JBC 269, 2508 (1994).
16. C. S. Lin and M. Klingenberg, Bchem 21, 2950 (1982).
17. D. Ricquier, J. P. Barlet, J. M. Garel, M. Combes-Georges and M. Dubois, BJ 210, 859 (1983).
18. P. J. Strieleman, K. L. Schalinske and E. Shrago, JBC 260, 13402 (1985).
19. M. Klingenberg and E. Winkler, EMBO J. 4, 3087 (1985).
20. P. Jezek, D. E. Orosz and K. D. Garlid, JBC 265, 19296 (1990).
21. H. Aquila, T. A. Link and M. Klingenberg, EMBO J. 4, 2369 (1985).
22. F. Bouillaud, D. Ricquier, J. Thibault and J. Weissenbach, PNAS 82, 445 (1985).
23. F. Bouillaud, J. Weissenbach and D. Ricquier, JBC 261, 1487 (1986).
24. R. Krämer and F. Palmieri, in "Molecular Mechanism in Bioenergetics" (L. Ernster, ed.), p. 359. Elsevier, Amsterdam, 1992.
25. J. E. Walker and M. J. Runswick, J. Bioenerg. Biomembr. 25, 435 (1993).
26. F. Bouillaud, L. Casteilla and D. Ricquier, Mol. Biol. Evol. 9, 970 (1992).
27. J. W. R. Schwabe, L. Chapman, J. T. Finch and D. Rhodes, Cell 75, 567 (1993).
28. B. Miroux, L. Casteilla, S. Klaus, S. Raimbault, S. Grandin, J. M. Clement, D. Ricquier and F. Bouillaud, JBC 267, 13603 (1992).
29. L. Capobianco, G. Brandolin and F. Palmieri, Bchem 30, 4963 (1990).
30. C. Eckerskorn and M. Klingenberg, FEBS Lett. 226, 166 (1987).
31. B. Miroux, V. Frossard, S. Raimbault, D. Ricquier and F. Bouillaud, EMBO J. 12, 3739 (1993).
32. S. Klaus, L. Casteilla, F. Bouillaud, S. Raimbault and D. Ricquier, BBRC 167, 784 (1990).
33. L. Casteilla, O. Blondel, S. Klaus, S. Raimbault, P. Diolez, F. Moreau, F. Bouillaud and D. Ricquier, PNAS 87, 5124 (1990).
34. S. Prieto, F. Bouillaud, D. Ricquier and E. Rial, EJB 208, 487 (1992).
35. I. Arechaga, S. Raimbault, S. Prieto, C. Levi-Meyrueis, P. Zaragoza, B. Miroux, D. Ricquier, F. Bouillaud and E. Rial, BJ 296, 693 (1993).
36. M. M. Gonzalez-Barroso, C. Fleury, I. Arechaga, P. Zaragoza, C. Levi-Meyrueis, S. Raimbault, D. Ricquier, F. Bouillaud and E. Rial, Eur. J. Biochem. 239, 445 (1996).
37. B. Bathgate, E. M. Freebairn, A. J. Greenland and G. A. Reid, Mol. Microbiol. 6, 363 (1992).
38. N. Glab, P. X. Petit and P. P. Slonimski, MGG 236, 299 (1993).
39. D. L. Murdza-Inglis, H. V. Patel, K. B. Freeman, P. Jezek, D. E. Orosz and K. D. Garlid, JBC 260, 11871 (1991).
40. P. X. Petit, N. Glab, D. Marie, H. Keffer and P. Metezeau, Cytometry 23, 28 (1996).
41. F. Bouillaud, I. Arechaga, P. X. Petit, S. Raimbault, C. Levi-Meyrueis, L. Casteilla, M. Laurent, E. Rial and D. Ricquier, EMBO J. 13, 1990 (1994).
42. D. L. Murdza-Inglis, M. Modriansky, H. V. Patel, G. Woldegiorgis, K. B. Freeman and K. D. Garlid, JBC 269, 7435 (1994).
43. D. R. Nelson, J. E. Lawson, M. Klingenberg and M. G. Douglas, JMB 230, 1159 (1993).
44. P. Mayinger and M. Klingenberg, Bchem 31, 10536 (1992).
45. E. Winkler and M. Klingenberg, EJB 203, 295 (1992).
46. P. Dalbon, G. Brandolin, F. Boulay, J. Hoppe and P. V. Vignais, Bchem 27, 5141 (1988).
47. F. Bouillaud, F. Villarroya, E. Hentz, S. Raimbault, A. M. Cassard and D. Ricquier, Clin. Sci. 75, 21 (1988).
48. F. Bouillaud, S. Raimbault and D. Ricquier, BBRC 157, 783 (1988).
49. L. P. Kozak, J. H. Britton, U. C. Kozak and J. M. Wells, JBC 263, 12274 (1988).
50. A. M. Cassard, F. Bouillaud, M. G. Mattei, E. Hentz, S. Raimbault, M. Thomas and D. Ricquier, J. Cell. Biochem. 43, 255 (1990).
51. L. Casteilla, O. Champigny, F. Bouillaud, J. Robelin and D. Ricquier, BJ 257, 665 (1989).
52. A. M. Cassard-Doulcier, F. Bouillaud, M. Chagnon, C. Gelly, F. T. Dionne, J. M. Oppert, C. Bouchard, Y. Chagnon and D. Ricquier, Int. J. Obesity 20, 278 (1996).
53. A. Jacobson, U. Stadler, M. A. Glotzer and L. P. Kozak, JBC 260, 16250 (1985).
54. D. Ricquier, F. Bouillaud, P. Toumelin, G. Mory, R. Bazin, J. Arch and L. Penicaud, JBC 261, 13905 (1986).
55. S. Rehnmark, J. Kopecky, A. Jacobsson, M. Nechad, D. Herron, B. D. Nelson, M. J. Obregon, J. Nedergaard and B. Cannon, Exp. Cell. Res. 182, 75 (1989).
56. J. Kopecky, M. Baudysova, F. Zanotti, D. Janikova, S. Pavelka and J. Houstek, JBC 265, 22204 (1990).
57. S. Klaus, A. M. Cassard-Doulcier and D. Ricquier, J. Cell Biol. 115, 1783 (1991).
58. O. Champigny, B. R. Holloway and D. Ricquier, Mol. Cell. Endocrinol. 86, 73 (1992).
59. U. C. Kozak, W. Held, D. Kreutter and L. P. Kozak, Mol. Endocrinol. 6, 763 (1992).
60. A. C. Bianco, X. Sheng and J. E. Silva, JBC 263, 18168 (1988).
61. J. E. Silva, Mol. Endocrinol. 2, 706 (1988).
62. A. M. Cassard-Doulcier, M. Larose, J. C. Matamala, O. Champigny, F. Bouillaud and D. Ricquier, JBC 269, 24335 (1994).
63. R. Alvarez, J. Deandres, P. Yubero, O. Vinas, T. Mampel, P. Iglesias, M. Giralt and F. Villarroya, JBC 270, 5666 (1995).
64. A. M. Cassard-Doulcier, C. Gelly, N. Fox, J. Schrementi, S. Raimbault, S. Klaus, C. Forest, F. Bouillaud and D. Ricquier, Mol. Endocrinol. 7, 497 (1993).
65. M. Larose, A. M. Cassard-Doulcier, C. Fleury, F. Serra, O. Champigny, F. Bouillaud and D. Ricquier, JBC 271 (1996). In press.
66. U. C. Kozak, J. Kopecky, J. Teisinger, S. Enerback, B. Boyer and L. P. Kozak, MC Biol. 14, 59 (1994).
67. R. Rabelo, A. Schifman, A. Rubio, X. Y. Sheng and J. E. Silva, Endocrinology 136, 1003 (1995).
68. B. B. Boyer and L. P. Kozak, MC Biol. 11, 4147 (1991).
69. S. R. Ross, L. Choy, R. A. Graves, N. Fox, V. Solevja, S. Klaus, D. Ricquier and B. M. Spiegelman, PNAS 89, 7561 (1992).
70. S. Klaus, L. Choy, O. Champigny, A. M. Cassard-Doulcier, S. Ross, B. Spiegelman and D. Ricquier, J. Cell Sci. 107, 313 (1994).
71. P. Yubero, O. Vinas, R. Iglesias, T. Mampel, F. Villarroya and M. Giralt, BBRC 204, 867 (1994).
72. P. Yubero, C. Manchado, A. M. Cassard-Doulcier, T. Mampel, O. Vinas, R. Iglesias, M. Giralt and F. Villarroya, BBRC 198, 653 (1994).
73. O. Champigny, D. Ricquier, O. Blondel, R. M. Mayers, M. G. Briscoe and B. R. Holloway, PNAS 88, 10774 (1991).
74. B. B. Lowell, V. Ssusulic, A. Hamann, J. A. Lawitts, J. Himms-Hagen, B. B. Boyer, L. P. Kozak and J. S. Flier, Nature (London) 366, 740 (1993).
75. J. Kopecky, G. Clarke, S. Enerback, B. Spiegelman and L. P. Kozak, J. Clin. Invest. 96, 2914 (1995).
76. G. Garutti and D. Ricquier, Int. J. Obesity 16, 383 (1993).
77. M. L. Kortelainen, G. Pelletier, D. Ricquier and L. J. Bukowiecki, J. Histochem. Cytochem. 41, 759 (1993).
78. C. Bouchard and L. Pérusse, Obesity Res. 4, 81 (1996).
79. J. M. Oppert, M. C. Vohl, M. Chagnon, F. T. Dionne, A. M. Cassard-Doulcier, D. Ricquier, L. Pérusse and C. Bouchard, Int. J. Obesity 18, 526 (1994).
80. K. Clément, J. Ruiz, A. M. Cassard-Doulcier, F. Bouillaud, D. Ricquier, A. Basdevant, B. Guy-Grand and P. Froguel, in press.
81. F. Fumeron, D. Betoulle, F. Bouillaud, J. C. Melchior, D. Ricquier and M. Apfelbaum, in press.
Molecular Regulation of Cytokine Gene Expression: Interferon-γ as a Model System¹

HOWARD A. YOUNG² AND PARITOSH GHOSH

Cellular and Molecular Immunology Section
Laboratory of Experimental Immunology
Division of Basic Sciences
NCI-FCRDC
Frederick, Maryland 21702

I. Extracellular Signals That Modulate IFN-γ Production
II. The Role of DNA Methylation
III. IFN-γ Promoter Structure and Regulatory Elements
IV. Summary
References
Interferon-γ is a multifunctional cytokine that plays an important role in most aspects of immune development, maturation, and function. Also known as type II interferon, interferon-γ (IFN-γ)¹ is a single-copy gene found in all mammalian species as well as in chickens. The human gene is located on chromosome 12 and consists of four exons and three introns (1, 2). This exon-intron structure is also highly conserved among species, and sequence analysis of the coding region suggests that IFN-γ may represent an ancient gene duplication (3). This hypothesis also may be reflected in a duplication of enhancer elements, as is discussed below.
¹ Abbreviations: IFN-γ, interferon-gamma; IL, interleukin; NK, natural killer; Th, T helper; LGL, large granular lymphocyte; PCR, polymerase chain reaction; PMA, phorbol myristate acetate; GVHD, graft-versus-host disease; CsA, cyclosporin A; NFAT, nuclear factor-activated T cells; PBMC, peripheral blood mononuclear cells; PHA, phytohemagglutinin; APC, antigen-presenting cells; STAT, signal transducer and activator of transcription; TGF-β, transforming growth factor beta; HIV, human immunodeficiency virus; GM-CSF, granulocyte-macrophage colony-stimulating factor; MIP, macrophage inflammatory protein; Oct, octamer; HTLV, human T-cell leukemia virus; CREB, cyclic-AMP response element binding protein; ATF, activating transcription factor; AP-1, activator protein-1; EMSA, electrophoretic mobility-shift assay; Con A, concanavalin A; YY-1, Yin-Yang-1.
² To whom correspondence may be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 56. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
IFN-γ is predominantly produced by two cell types, T cells and large granular lymphocytes (NK cells) (4, 5). In the T-cell populations, both CD4+ and CD8+ T cells express IFN-γ in response to numerous stimuli. The major IFN-γ inducers in the T-cell population include antigen in the context of the major histocompatibility complex (MHC) and interleukins (ILs), including IL-2 and IL-12, although many other IFN-γ inducers have been reported (for review, see 6). It has been thought that the CD8+ population is the predominant IFN-γ producer in peripheral blood T cells, but this may merely reflect the difference in the memory/naive cell ratio generally seen in peripheral blood (7-9). Memory T cells do produce more IFN-γ than do naive T cells, but when CD4+ and CD8+ memory (CD45RO+) cells are compared on an equal cell basis, IFN-γ production by the two populations is similar (9).

The CD4+ population can be subdivided into two distinct subsets, T helper 1 (Th1) and T helper 2 (Th2). The Th1 population is defined as producing IFN-γ, IL-2, and tumor necrosis factor (TNF), whereas the Th2 population is defined as producing IL-4, IL-5, and IL-10 (10, 11). These subsets cross-regulate each other, and the in vivo ratio of these populations is critical to disease resistance/progression as analyzed in murine models and human disease (reviewed in 11, 12). Using murine cells as experimental model systems, the divergence between the two populations is distinct, because clonal cell lines that produce either IFN-γ or IL-4 can be obtained. An additional population, designated Th0, also can be isolated and this population produces both IFN-γ and IL-4 (13). Th0 cells are thought to be a precursor of both Th1 and Th2 cells. Equivalent human populations are more difficult to generate, because many Th2 cell lines produce low levels of IFN-γ.
The difference in obtaining clearly defined cell lines between the human and murine cells may be attributed, at least in part, to differences in the extent of methylation of the IFN-γ genomic DNA (see Section II). The other major IFN-γ producer in the circulation is the large granular lymphocyte (LGL), or natural killer (NK) cell. This cell is thought to be one of the first lines of defense against viral infection, bacterial infection, and possibly cancer, although its role in cancer surveillance is in question. Although the LGL population is not generally thought of as existing in subsets, cells that express high levels of CD56 are thought of as the more immature cells and generally produce lower levels of IFN-γ than the more mature populations (CD56dim) (reviewed in 14). In contrast to T cells, LGLs require only one soluble signal (e.g., IL-2 or IL-12) for IFN-γ production (15), although an additional cell population, which has characteristics of dendritic cells, has been reported to be required for LGL IFN-γ production and function (16, 17). Through the use of the polymerase chain reaction (PCR), additional cell
populations have now been reported to express IFN-γ mRNA, including B cell lines, fibroblasts, and macrophages (18-22). With the exception of a report demonstrating IFN-γ mRNA by Northern blot and IFN-γ protein expression in human B cell lines (19), the expression of IFN-γ mRNA in other cell types can be detected only by PCR, and secreted protein is generally not detected. Thus, the physiological relevance of these observations is unclear and little work has been done on characterizing the control of transcription in these cell types. However, it is of interest that when the human IFN-γ genomic DNA is transfected into NIH 3T3 fibroblasts, strong constitutive expression is seen when nuclear RNA is analyzed (23). Furthermore, the RNA levels are unaffected by treatment of the cells with phorbol myristate acetate (PMA), a strong inducer of IFN-γ mRNA in T cells and LGLs. Cytoplasmic IFN-γ mRNA is detected when cells are treated with cycloheximide to inhibit protein synthesis. The mechanism behind this nuclear-to-cytoplasm block is not known and has not been further investigated. However, the observation of transcription of the transfected human genomic DNA in the fibroblast cell line [and the report of expression of the endogenous murine IFN-γ gene in mouse L-929 cells (22)] raises the possibility that specific lymphoid transcription factors are not required for basal or constitutive IFN-γ transcription. Thus, the lack of endogenous IFN-γ expression in many cell types may be predominantly controlled by DNA methylation, the expression of silencer DNA binding proteins, or T-cell/NK cell-specific nuclear DNA binding proteins that are induced by specific extracellular signals (see Sections II and III).
I. Extracellular Signals That Modulate IFN-γ Production

Transcription of the IFN-γ gene can be modulated by a wide variety of extracellular signals. However, the molecular mechanisms underlying the effect of these different signals on IFN-γ transcription, both positive and negative, have not been completely elucidated. The core promoter region (-108 to +64) can integrate antigenic stimulatory signals in a manner that mimics the pattern of endogenous gene expression (24, 25). This is the same site that is responsible for the glucocorticoid-mediated suppression of IFN-γ gene transcription (26). As discussed in Section III, there are multiple NFκB and nuclear factor-activated T cell (NFAT) sites in the IFN-γ promoter and intronic regions that are also functionally responsive to antigenic stimulations. In addition, multiple responsive elements to CD28 costimulatory signals also have been tentatively identified in the promoter and in the third intron (unpublished observations). These different regulatory regions are likely to be directly involved in the transcriptional activation of the IFN-γ gene because, depending on the stimulation used to activate peripheral blood T cells, IFN-γ can be produced in a cyclosporin A (CsA)-sensitive or -resistant manner. It has been shown that activation of T cells by PMA and αCD28 results in the production of IL-2 and IFN-γ in the presence of CsA (27; unpublished observation). This finding may explain the failure of CsA treatment to prevent graft-versus-host disease (GVHD) after allogenic bone marrow transplantation (28). The cytotoxic T cells thought to be responsible for GVHD express CD28 on their surface, and can produce cytokines responsible for an immune response in the presence of CsA. Reports show that the source of the costimulatory ligand (B7.1 versus B7.2) seems to dictate the differentiation pathway, either to IFN-γ- or IL-4-producing T cells (30). Although the signaling events corresponding to these coreceptor-ligand interactions have not been fully elucidated, the membrane-proximal events appear to be the same in both cases (31, 32). An attractive hypothesis for one mechanism by which this differential expression may be controlled is differential methylation of the IFN-γ promoter, depending on the source of the coreceptor ligand, because methylation of the promoter appears to play a role in the transcription of the IFN-γ gene (see Section II).

Among the inducers of IFN-γ, cytokines play a major role in vivo. One of the most potent inducers of IFN-γ gene expression is IL-12, and IL-12 can synergize with a variety of extracellular signals, including cytokines such as IL-2 (33, 34) and IL-7 (35), and other cell surface molecules such as CD2 (36) and CD28 (37), to further enhance IFN-γ expression. The effect of IL-12 on the accumulation of IFN-γ mRNA occurs at both the transcriptional and posttranscriptional levels (33), whereas the synergy between IL-12 and IL-2 was observed only at the level of mRNA stability (33). The regulation of IL-12 responsiveness in activated T cells by CD2 (36) has been reported.
Antisera recognizing the adhesion domain of CD2 and CD2R inhibit the proliferation and IFN-γ production by activated T cells on IL-12 stimulation, whereas expression of IL-2 is unaffected. Also, CD2R plus CD2 antisera synergize strongly with IL-12 in inducing proliferation and IFN-γ production by phytohemagglutinin (PHA)-activated T cells. In that study, an important role of CD2 has been proposed where there is an IL-12/IFN-γ positive-feedback loop between activated T cells and antigen-presenting cells (APCs). Kubin et al. demonstrated the synergy between IL-12 and CD28 signaling (37). Interestingly, this synergistic signaling pathway was insensitive to CsA and was mostly independent of endogenous IL-2. They also demonstrated the inhibitory effect of IL-10 on the IFN-γ production by peripheral blood mononuclear cells (PBMCs) in response to Staphylococcus aureus in the presence or absence of αCD28 monoclonal antibody (mAb). This inhibition was due to the fact that IL-10 can inhibit both IL-12 production and B7 expression on APCs (37, 38).

A number of laboratories have reported studies elucidating the cytokine signaling pathways at the biochemical level (reviewed in 39, 40). Different members of the Janus family kinases (Jaks) and signal transducer and activator of transcription (STAT) family proteins are the principal players in cytokine signaling (39, 40). In the case of IL-2 signaling, Jak1 and Jak3 tyrosine kinases are activated on receptor engagement (41). These activated Jaks in turn activate STAT proteins, STAT5 in fresh peripheral blood lymphocytes (PBLs) and STAT3 plus STAT5 in preactivated PBLs (42, 43). An IL-2-responsive STAT binding site has been identified in the IL-2 receptor β chain gene (42), and recently STAT binding sites have been identified in the human IFN-γ promoter and first intron (82). However, other DNA binding proteins also may play a role in IL-2 signaling: in an NK cell line (NK 3.3) that expresses increased IFN-γ mRNA and protein in response to IL-2, IL-2 treatment results in increased NFκB proteins in the nucleus (44). Given the multiple NFκB binding sites in the IFN-γ genomic DNA, it is likely that these proteins contribute to the increased IFN-γ transcription. In the case of IL-12 signaling, Jak2 and Tyk2 (another Jak family kinase) have been shown to be the major tyrosine kinases activated by IL-12 (45). With respect to STAT proteins, STAT3 and STAT4 are involved in the IL-12 signaling pathway (46). Although the IL-12 signaling pathway has been characterized at the biochemical level, nothing is known about how STAT proteins induced by IL-12 may be involved in the molecular mechanisms regulating IFN-γ gene transcription.
A potential STAT4 binding site has been identified in the human IFN-γ first intron, and a complex was observed in gel-shift experiments using this region as a probe with nuclear extracts from IL-12-treated human peripheral blood T cells (unpublished observations). Interestingly, in the analysis of IL-12 induction of IFN-γ in the NK 3.3 cell line, no nuclear NFκB proteins were induced by IL-12 (44), thus suggesting that other proteins, possibly STATs, are involved in the transcriptional response of these cells to IL-12.

We have shown that there is a loss of IFN-γ production by splenic T cells isolated from tumor-bearing mice (47, 48). A similar result also has been observed when peripheral blood T cells from both cancer patients and patients with infectious diseases such as AIDS were analyzed (unpublished observation). Analysis of the nuclear transcription factors in splenic T cells from tumor-bearing mice and in cancer patients revealed the loss of the p65/p50 (NFκB) heterodimer and the presence of a predominantly p50/p50 homodimer (47, 49). Nuclear factors that bind the IFN-γ
core promoter region also were absent when this region (-70 to -40, AAAACTGTGAAAATACGTMTCCTCAGGAGA) was used as a probe in electrophoretic mobility-shift assays (see Section III) (48). Although the mechanism responsible for the loss of these transcription factors is still unknown, it appears that tumor-derived TGF-β and IL-10 play a major role in the immune suppression observed in a tumor-bearing host (50, 51). Consistent with this hypothesis, a recent report demonstrates the role of tumor-derived TGF-β in the shift of splenic T-cell populations from Th1 to Th2 responses via direct and IL-10-mediated pathways (52). Thus, maintaining a T-cell population (i.e., Th1) that has the capacity to express IFN-γ strongly may be critical in the host response to cancer, similar to what has been seen in infectious diseases (53, 54).
II. The Role of DNA Methylation

The methylation of CpG dinucleotides in mammalian cells is a powerful mechanism for the regulation of gene expression. Methylation is regulated by levels of DNA methyltransferase, and changes in the level of DNA methyltransferase can profoundly change the phenotype of cells (for review, see 55). The methylation status of genes is routinely measured by Southern blot analysis utilizing enzymes that differ in their activity depending on the methylation state of the DNA. The most common pair of enzymes used for this type of analysis is MspI and HpaII, both of which recognize the sequence CCGG. MspI cuts the DNA regardless of the methylation state of the internal C, whereas HpaII cuts the DNA only when the site is not methylated. The MspI/HpaII sites in both the human and the murine IFN-γ genomic DNAs are shown in Fig. 1. In addition to these enzymes, further analysis of the methylation state of the IFN-γ gene is possible through the use of the restriction endonuclease SnaBI (recognition site, TACGTA). This enzyme cuts DNA only when the internal C is not methylated (56). As shown in Fig. 1, there is a SnaBI site in the human and murine IFN-γ promoters (-73 to -68), and this site is conserved in all species where genomic DNA sequences have been determined.

Early evidence for a role of DNA methylation in the control of IFN-γ expression was demonstrated through the use of the drug 5-azacytidine, which inhibits DNA methylation. As reported in 1986 (57), treatment of a murine T-cell line with 5-azacytidine restored the ability of this cell line to express IFN-γ in response to IL-2. However, in this report, no changes were noted in Southern blot analysis of the murine IFN-γ gene utilizing MspI/HpaII. Activation of IFN-γ expression in the human T-cell line, Jurkat (19, 58), and in murine Th2 clones (59), was also observed following azacytidine treatment.
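
The enzyme logic described above (MspI insensitive, HpaII and SnaBI blocked by methylation of the internal C) can be sketched in a few lines of Python. This is an illustrative model only, not a method from the studies cited: the `Enzyme` record, the `cleaved_sites` helper, and the toy sequence are hypothetical, and the sequence is not taken from the IFN-γ gene. Methylation is simplified to a set of 0-based positions of methylated cytosines on one strand and is assumed to be symmetric on the complementary strand; because CCGG and TACGTA are palindromic, scanning one strand finds every site.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Enzyme:
    """Hypothetical record for a restriction enzyme (illustration only)."""
    name: str
    site: str               # palindromic recognition sequence, 5'->3'
    c_offset: int           # offset of the CpG cytosine within the site
    methyl_sensitive: bool  # True if methylation of that C blocks cutting


MSPI = Enzyme("MspI", "CCGG", 1, False)    # cuts regardless of methylation
HPAII = Enzyme("HpaII", "CCGG", 1, True)   # blocked by internal 5-methyl-C
SNABI = Enzyme("SnaBI", "TACGTA", 2, True)  # blocked by internal 5-methyl-C


def cleaved_sites(seq: str, enzyme: Enzyme, methylated: Set[int]) -> List[int]:
    """Return 0-based start positions of recognition sites that would be cut.

    `methylated` holds 0-based positions of methylated cytosines (assumption:
    symmetric CpG methylation, so one strand is enough).
    """
    seq = seq.upper()
    hits = []
    start = seq.find(enzyme.site)
    while start != -1:
        c_pos = start + enzyme.c_offset
        blocked = enzyme.methyl_sensitive and c_pos in methylated
        if not blocked:
            hits.append(start)
        start = seq.find(enzyme.site, start + 1)
    return hits


# Toy sequence (hypothetical): one CCGG site at 4 and one TACGTA site at 16.
toy = "GATTCCGGATTAGAAATACGTAATCC"
fully_methylated = {5, 18}  # the internal C of each site

for enz in (MSPI, HPAII, SNABI):
    print(enz.name,
          "methylated:", cleaved_sites(toy, enz, fully_methylated),
          "unmethylated:", cleaved_sites(toy, enz, set()))
```

In this toy model, a difference between the MspI and HpaII results at the same CCGG site, or loss of SnaBI cutting at TACGTA, is what would be read on a Southern blot as evidence of methylation at that site.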
[Figure 1: restriction maps with scale bars in base pairs; panels HUMAN GENOME and MOUSE GENOME]

FIG. 1. Positions of methylation-sensitive SnaBI/HpaII sites in the human and murine IFN-γ genomic DNA. Enzyme sites are indicated. Boxes represent exons and scales below map represent relative size in base pairs.
A more thorough investigation of IFN-γ gene methylation in human T cells and cell lines (60) indicated that increased IFN-γ production in human T-cell lines correlated with hypomethylation of the MspI/HpaII sites, especially a site 5' to the first exon and one in the first intron. However, the authors also reported that in two epithelial cell lines that did not produce IFN-γ, the IFN-γ gene was also hypomethylated, thus suggesting that hypomethylation may be necessary but not sufficient for gene expression. The use of SnaBI to analyze the methylation status of the IFN-γ promoter was first reported by Pang and colleagues, who demonstrated that hypomethylation of the promoter correlated with gene expression in human B cell lines (19). This study was then extended to the analysis of murine CD4+ T helper clones (59). In this report, there was a clear correlation between methylation of the promoter and IFN-γ gene expression. In murine Th1 clones, the promoter was not methylated whereas in murine Th2 clones, the promoter was >95% methylated. An example of the Southern blot hybridization pattern can be seen in Fig. 2 (lanes 4 and 5). Also shown in this figure are results obtained with a murine Th0 clone. In this cell line (lane 6), both methylated (upper band) and nonmethylated (lower band) DNA are seen. This is what might be predicted if Th0 cells are true precursors of both Th1 and Th2 cells (13). A similar correlation to gene expression of the methylation state of the promoter SnaBI site and the HpaII site in the first intron was observed when human primary T-lineage cells were analyzed. In this report, methylation of these sites was observed in thymocytes, neonatal T cells, and adult CD4+ naive T cells (which do not express IFN-γ), whereas substantial hypomethylation was seen in adult CD8+ T cells and adult CD4+ memory T cells, both of which have high capacities to produce IFN-γ (61).
Similar correlations between methylation and gene expression can be made when bulk human lymphocyte populations are analyzed. As shown in Fig. 3, DNA from peripheral blood LGLs shows almost complete hypomethylation at the SnaBI site whereas DNA from monocytes is highly (>95%) methylated. In contrast, total T-cell DNA is approximately 60% hypomethylated. Surprisingly, purified B-cell DNA is almost 40% hypomethylated, despite the fact that human peripheral blood B cells have never been reported to express IFN-γ. This observation further supports the hypothesis that hypomethylation of the promoter is necessary but not sufficient for gene expression.

FIG. 2. Southern blot analysis of murine T helper clone DNA. Genomic DNA was extracted from murine T helper clones (obtained from Dr. Dennis Taub, NCI-FCRDC), digested with either BamHI alone (lanes 1-3) or BamHI and SnaBI (lanes 4-6) and transferred to a nylon membrane after electrophoresis on a 0.8% agarose gel. The blot was hybridized with radiolabeled murine IFN-γ cDNA (kindly provided by DNAX, Inc., Palo Alto, CA). The top band in lanes 4 and 6 represents the intact (methylated) 10-kb murine IFN-γ genomic DNA; the bottom band (lanes 4-6) is the hypomethylated DNA that has been cleaved in the promoter region by SnaBI.

FIG. 3. Southern blot analysis of human peripheral blood lymphocyte DNA. Human peripheral blood lymphocyte subsets were kindly provided by Dr. John Ortaldo (NCI-FCRDC). DNA was extracted, digested with BamHI and SnaBI, transferred to a nylon membrane after electrophoresis, and hybridized with a radiolabeled human IFN-γ cDNA. The top band represents methylated DNA and the bottom band represents hypomethylated DNA.

One explanation of the impact of DNA methylation on IFN-γ gene expression may be that if the promoter is hypomethylated, the chromatin structure is in an "open" conformation, as supported by early DNase I hypersensitivity studies (58, 62), thus permitting the binding of distinct DNA binding protein family members, precluding methylation and promoting transcription. Alternatively, if the promoter is already methylated, transcription may not occur due to the binding of proteins known to bind to methylated DNA. How the methylation of specific sites occurs or is maintained in T cells is unknown, but likely relies on accessibility of these sites to the DNA methyltransferase. Because DNA methyltransferase RNA increases in peripheral blood T cells following mitogenic stimulation (H. A. Young, unpublished observations), this accessibility may depend on the functional state and activity of the DNA binding proteins. Along this line, it is of interest that tissue culture cell lines, including the human T lymphoblastoid cell line Jurkat, eventually lose their ability to transcribe the IFN-γ gene and, as stated above, IFN-γ transcription can be reactivated with 5-azacytidine treatment. It is possible that these cell lines lack the expression of specific DNA binding proteins that inhibit accessibility of DNA methyltransferase. Alternatively, these cell lines may express higher constitutive levels of DNA methyltransferase than normally observed. This latter possibility is supported by the observation that introduction of antisense methyltransferase cDNA into human T cell lines results in hypomethylation of the IFN-γ promoter and increased IFN-γ gene expression (J. Mikovits and H. A. Young, unpublished observations).

The results discussed thus far relate to the impact of DNA methylation on IFN-γ gene expression during development. By this it is suggested that the methylation pattern of the IFN-γ gene in cells that have the capacity to express this gene is acquired during development and differentiation. Factors other than these also may play a role in controlling DNA methylation, and studies by Mikovits and co-workers have found that infection of T-cell lines by either human T-cell leukemia virus (HTLV-1) or human immunodeficiency virus (HIV) results in increased DNA methyltransferase RNA levels and increased DNA methyltransferase activity (J. Mikovits, F. Ruscetti and H. A. Young, unpublished observation). Furthermore, with increased time in culture, these infected cell lines are found to have a decreased capacity to express IFN-γ, and Southern blot analysis has shown that decreased protein expression correlates with increased promoter methylation. It is an intriguing possibility that in cell populations where HIV infection is high, increased DNA methylation shuts down IFN-γ production and permits expansion of a Th2-like population that produces IL-4. The increased IL-4 expression further inhibits IFN-γ expression, eventually leading to a more Th2-like T-cell population in the periphery and progression of disease, as has been hypothesized (63).
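
As a rough illustration of how a figure such as "approximately 60% hypomethylated" could be derived from a two-band BamHI + SnaBI Southern blot, the sketch below assumes (this is not stated in the text) that the estimate is the lower-band signal expressed as a fraction of the total signal in the lane. The function name and the intensity values are hypothetical; real densitometry would also need background subtraction and a correction for how much of the probe hybridizes to each fragment.

```python
def percent_hypomethylated(upper_band: float, lower_band: float) -> float:
    """Percent of lane signal in the SnaBI-cleaved (hypomethylated) band.

    upper_band: intensity of the intact, methylated band (arbitrary units)
    lower_band: intensity of the cleaved, hypomethylated band
    Assumes both intensities are background-corrected and comparable.
    """
    if upper_band < 0 or lower_band < 0:
        raise ValueError("band intensities must be non-negative")
    total = upper_band + lower_band
    if total == 0:
        raise ValueError("no signal in either band")
    return 100.0 * lower_band / total


# Hypothetical densitometry readings, not data from the chapter.
print(percent_hypomethylated(upper_band=40.0, lower_band=60.0))
```

Under this simplification, a lane with only the upper band scores 0% and a lane with only the lower band scores 100%, matching the qualitative reading of Figs. 2 and 3.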
III. IFN-γ Promoter Structure and Regulatory Elements
The core (-100 to -1) IFN-γ promoter structure has been highly conserved through evolution. The human and mouse sequences differ at only 11 positions in this region, and over the region -265 to -1 the human and mouse promoters are almost 77% identical (1, 2, 64, 65). Furthermore, a 17-nucleotide stretch containing the SnaBI site previously discussed is identical in the human, mouse, rat, and canine promoters (1, 2, 64-67). Although this is the only site for potential methylation in the human and canine promoters, the mouse promoter contains two additional CpG dinucleotides in this region, with the rat promoter sharing one of these sites. However, the methylation status of these additional sites has not been analyzed.
The initial deletion analysis studies on the human IFN-γ promoter, utilizing human peripheral blood T cells (24) or Jurkat T cells (25, 68) in transient transfection experiments, identified the core promoter region (-100 to -30) as being essential for promoter activity. Penix and co-workers identified two regions within the core promoter, -96 to -80 (distal) and -73 to -48 (proximal, containing the SnaBI site), as capable of forming DNA-protein complexes before and after cell activation (25). The proximal element has strong homology to the IL-2 promoter NF-IL2 element, whereas the distal element is homologous to regulatory elements in the granulocyte-macrophage colony-stimulating factor (GM-CSF) and macrophage inflammatory protein (MIP) promoters and bound GATA-3 as well (25). Other proteins binding to these elements have been analyzed extensively by Penix et al. (69) and Cippitelli et al. (26). A constitutive complex formed with the proximal element was found to contain Oct-1 (26, 69), but the role of this DNA binding protein in regulating IFN-γ gene expression remains unclear.
More importantly, both studies have concluded that the activator protein, cAMP response element binding protein, and activating transcription factor (AP-1-CREB-ATF) family members are important in regulating the function of this region. Penix et al. have fully characterized the complexes binding to the proximal element (69) and determined that CREB/ATF-1, ATF-2, and c-Jun all interact with this region. Furthermore, they suggested that ATF-2-c-Jun heterodimers or c-Jun homodimers are involved in the activation of transcription, whereas CREB may inhibit transcription. In addition, AP-1 binds to the proximal and distal elements following PMA/ionomycin treatment of Jurkat cells (26) and is required for glucocorticoid inhibition of IFN-γ promoter activity. Promoters carrying mutations that abolished AP-1 binding were no longer sensitive to glucocorticoid inhibition, and dominant negative c-Jun mutants could inhibit promoter activity. Thus, it appears that this core binding region is essential for the interaction of critical transcriptional activators with the basal transcriptional complexes.
The proximal region also has been used as a tool to examine differences in the DNA binding protein profile among cell types. As shown in Fig. 4 and as previously reported (59), when nuclear extracts from murine Th1 and Th2 cell lines are compared, qualitative differences are observed in the complexes formed. Although a number of different complexes are seen, certain complexes are more predominant in Th1 nuclear extracts and other complexes are more prevalent in Th2 nuclear extracts. These results suggest that the regulation of gene expression in these cell types also may be influenced by the types of DNA binding proteins present in the nucleus. Even more interesting are the results obtained when splenic T cell nuclear extracts from healthy and tumor-bearing mice are compared. As shown in Fig. 5, there is a striking absence of DNA-protein complexes that interact with this region in the T-cell extracts from tumor-bearing mice, which is consistent with a significantly decreased ability of these cells to express IFN-γ (as discussed in Section I).
The results suggest that T cells in tumor-bearing animals exist in a state of "transcriptional anergy" with respect to their ability to express IFN-γ, and this state is reflected in significant alterations in the DNA binding protein repertoire in the nucleus of these cells.
In addition to the studies on the core promoter, a number of other studies have identified elements further upstream that appear to play a role in regulating the strength of the IFN-γ transcriptional response to specific extracellular and intracellular signals. Early studies identified DNase I-hypersensitive sites (approximately 200 bp and 3 kb upstream of the promoter) that appeared in Jurkat cells in response to PMA/PHA stimulation (58, 62). Other studies, utilizing heterologous promoters and promoter deletions, identified multiple putative enhancer elements in the human IFN-γ promoter (24, 70). Later reports identified sites in the human promoter responsive to PMA/PHA and the HTLV-1 tax gene (68) and the formation of specific DNA-protein complexes with different regions of the promoter (71), although in these studies the specific DNA binding proteins were not identified.
FIG. 4. Gel-shift analysis of nuclear extracts from murine T helper 1 and 2 clones with the human IFN-γ core promoter oligonucleotide. Left lane (TH1), Th1 nuclear extract; middle lane (SC), specific competition of Th1 nuclear extract complex formation with unlabeled human IFN-γ promoter oligonucleotide; right lane (TH2), Th2 nuclear extract. The complexes that are more intense with each nuclear extract are indicated by arrows.
Sica and co-workers were the first to identify specific DNA binding proteins that interact with the human IFN-γ genomic DNA (72). Based on DNase I hypersensitivity studies by Hardy et al. and on transfection studies by Ciccarone et al. that demonstrated the presence of enhancer activity in the first intron, Sica et al. utilized in vitro footprinting to identify a specific region, GAATTTTCC, that could enhance IFN-γ promoter activity. They then observed that bacterially derived c-Rel protein could bind strongly to this element.
FIG. 5. Gel-shift analysis of nuclear extracts from splenic T cells isolated from normal or tumor-bearing mice with the human IFN-γ core promoter. Splenic T cells from tumor-bearing mice were obtained from mice that had a progressively growing renal carcinoma. N, normal spleen T cell nuclear extract; TB, tumor-bearing spleen T cell nuclear extract.
In additional studies, they also identified numerous NFκB binding sites in the IFN-γ genomic DNA, including the promoter and introns 1, 2, and 3 (A. Sica and H. A. Young, unpublished observations). It is of interest that one of these sites, lying in the promoter region from -284 to -260, is the same region that was previously identified as containing a PMA/PHA-inducible element (68). These sites can enhance promoter activity when placed downstream of an IFN-γ promoter-reporter construct. Consistent with these results, when genomic DNA constructs lacking most intronic sequences are stably transfected into a murine T lymphoblastoid cell line, activity of the genomic DNA decreases by 90% (H. A. Young, unpublished observations). Most but not all of the NFκB sites interact with the NFAT family of proteins in a cyclosporin A-sensitive manner, and calcineurin expression vectors have been found to up-regulate the IFN-γ promoter. It is of particular interest that the IFN-γ intronic site (TATGAATTTTCC) is almost identical to the NFAT binding site in the murine IL-2 promoter (TATGAAACAAATTTTCC). The additional AAACA nucleotides in the IL-2 promoter are required for AP-1 binding, and their absence in the IFN-γ sequence is consistent with the presence of NFAT but not AP-1 in the protein complexes formed with the IFN-γ intronic DNA elements (A. Sica and H. A. Young, unpublished observations).
An additional overlapping NFκB/NFAT binding site exists in the human promoter (-284 to -260) and appears to be required for full promoter activity in response to PMA/ionomycin stimulation in Jurkat cells. This enhancer site binds NFκB/NFAT proteins in EMSA experiments, and when binding of these proteins to this site is eliminated by a point mutation, promoter activity after PMA/PHA treatment decreases by about 50%. Thus it appears that the NFκB proteins may act to enhance the transcriptional response to specific extracellular and intracellular signals but by themselves are not required for promoter activity. The NFκB complexes likely interact with other elements of the promoter as well. Based on sequence homology to the CD28-responsive element in the IL-2 promoter, CD28-responsive elements have been identified in the IFN-γ promoter and third intron. Thus it may be that multiple elements cooperate in enhancing transcription in response to CD28 stimulation and, given the report that the IL-2 CD28-responsive element can interact with NFκB proteins (74), these sites also may be targets for NFκB binding.
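The relationship between the two NFAT-related sequences quoted above can be checked directly. The following is a minimal sketch in Python, using only the two sequences and the AAACA insert given in the text; the variable and function names are illustrative assumptions, not part of any published analysis.

```python
# Sequences as quoted in the text (5' to 3'); all names here are illustrative.
IFNG_INTRON_SITE = "TATGAATTTTCC"      # IFN-gamma intronic site
IL2_NFAT_SITE = "TATGAAACAAATTTTCC"    # NFAT site in the murine IL-2 promoter
INSERT = "AAACA"                       # extra nucleotides named in the text


def deletion_positions(longer, shorter, insert):
    """Return every 0-based offset at which deleting `insert` from
    `longer` yields exactly `shorter`."""
    hits = []
    for i in range(len(longer) - len(insert) + 1):
        if longer[i:i + len(insert)] == insert:
            if longer[:i] + longer[i + len(insert):] == shorter:
                hits.append(i)
    return hits


positions = deletion_positions(IL2_NFAT_SITE, IFNG_INTRON_SITE, INSERT)
print(positions)
print(len(IL2_NFAT_SITE) - len(IFNG_INTRON_SITE))
```

Under these assumptions the two sites differ only by the five-nucleotide AAACA block, which is consistent with the statement that the IFN-γ site lacks the nucleotides needed for AP-1 binding while retaining the flanking sequence.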
Although it is not yet clear if NFκB/NFAT proteins cooperate to enhance IFN-γ promoter activity or compete and interfere with transcription, as has been suggested for the IL-2 and IL-4 promoters (75), it is likely that the role of these proteins in influencing IFN-γ transcription may depend on the types of signals that affect their nuclear localization and function.
Other studies, using the mouse promoter, identified four consensus estrogen-responsive elements in the promoter (65). Furthermore, although estrogen by itself did not induce either IFN-γ expression or promoter activity, estrogen was shown to augment the response to concanavalin A (Con A) (65). This report strongly supports, at a molecular level, studies demonstrating that female mice produce more IFN-γ than do male mice (76), and this increased IFN-γ expression in females may eventually prove to be one important determinant of why autoimmune disease is more prevalent in women.
Another interesting region of the promoter was initially identified in the deletion analysis performed by Chrivia et al. (24). These transfection studies, performed in human peripheral blood T cells, demonstrated that a silencer activity is present in the region -251 to -215. This region, which is also highly conserved in the mouse promoter, has been studied more intensively by Ye et al. (77). The initial report identified two multiprotein complexes that interact with this region. One of the complexes contained a protein that was AP-2-like but was not AP-2, and the other complex contained the ubiquitous DNA binding protein Yin-Yang-1 (YY-1). This protein (see 78, 79) has both enhancer and silencer activities. This element can inhibit the IFN-γ core promoter but not a heterologous promoter in transient transfection experiments, and cotransfection of a YY-1 expression vector with the IFN-γ promoter also resulted in decreased promoter activity. Further studies have identified at least one other functional YY-1 binding site in the IFN-γ promoter. This site is of particular interest in that it overlaps with an additional AP-1 binding site. These authors have proposed a model whereby IFN-γ transcription is suppressed by YY-1 in the absence of specific activation signals. On activation, positive activators, such as AP-1, may displace YY-1 and permit transcription initiation to occur. Furthermore, after the nuclear levels of activated enhancer binding proteins (such as AP-1) decay, YY-1 rebinds to the enhancer elements to block further transcription. This model would be consistent with the kinetics of IFN-γ transcription, because most initiation appears to occur 2-9 hr following stimulation, and is consistent with the earlier report that deletion of this region resulted in a much higher level of basal promoter activity (24). It is also noteworthy that other lymphokine genes, including GM-CSF and IL-3, contain silencer elements of similar sequence, and YY-1 has been found to interact with the element in the GM-CSF promoter (80).
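The promoter coordinates cited in this section can be gathered into a small interval table to make their spatial relationships explicit. The sketch below is illustrative only: the coordinates are those quoted in the text for the human promoter, while the `Element` type, the labels, and the helper functions are assumptions introduced for the example.

```python
from typing import NamedTuple


class Element(NamedTuple):
    name: str
    start: int  # 5' boundary, relative to the transcription start site
    end: int    # 3' boundary (negative values are upstream)


# Core promoter as defined by the deletion analyses cited in the text.
CORE = Element("core promoter", -100, -30)

# Regulatory regions quoted in this section (labels are illustrative).
ELEMENTS = [
    Element("distal element", -96, -80),
    Element("proximal element", -73, -48),
    Element("NFkB/NFAT site", -284, -260),
    Element("silencer region (YY-1)", -251, -215),
]


def within(inner, outer):
    """True if `inner` lies entirely inside `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


def overlaps(a, b):
    """True if the two closed intervals share at least one position."""
    return a.start <= b.end and b.start <= a.end


in_core = [e.name for e in ELEMENTS if within(e, CORE)]
upstream = [e.name for e in ELEMENTS if e.end < CORE.start]
print("inside core promoter:", in_core)
print("upstream of core promoter:", upstream)
print("NFkB/NFAT site overlaps silencer:", overlaps(ELEMENTS[2], ELEMENTS[3]))
```

On these coordinates the distal and proximal elements both fall inside the -100 to -30 core region, whereas the NFκB/NFAT site and the silencer region lie upstream of it and do not overlap each other.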
IV. Summary
The regulation of IFN-γ transcription appears to be quite complex. In addition to the interaction of numerous regions of the genomic DNA with multiple DNA binding protein family members, DNA methylation may act as an early determinant of the capacity of a cell to initiate transcription. Transcriptional activation occurs in response to both soluble extracellular signals and cell contact, and it appears quite likely that this activation may result from the interaction of different families of DNA binding proteins with different enhancer elements. Furthermore, because chronic IFN-γ transcription and subsequent expression would likely be detrimental to the host (see 81), mechanisms have evolved to quench expression at both transcriptional and posttranscriptional levels. Given the complexity of cell-to-cell interactions in the immune system, it is reasonable to expect that additional mechanisms regulating IFN-γ transcription, involving previously identified or as yet unidentified DNA binding proteins, remain to be defined.
ACKNOWLEDGMENTS
We thank Valentina Ciccarone, Marco Cippitelli, Linda Dorman, Antonio Sica, Vincenzo Viggiano and Jianping Ye for their contributions in elucidating the mechanisms involved in IFN-γ transcription, and Susan Charbonneau and Joyce Vincent for typing and editing this manuscript. We also thank Christopher B. Wilson and Laurie Penix for permission to cite unpublished data.
REFERENCES
1. P. W. Gray and D. V. Goeddel, Nature (London) 298, 859 (1982).
2. Y. Taya, R. Devos, J. Tavernier, H. Cheroutre, G. Engler and W. Fiers, EMBO J. 8, 953 (1982).
3. P. B. Sehgal, L. T. May and F. R. Landsberger, J. Interferon Res. 6, 39 (1986).
4. T. Kasahara, J. J. Hooks, S. F. Dougherty and J. J. Oppenheim, J. Immunol. 130, 1784 (1983).
5. T. Kasahara, J. Y. Djeu, S. F. Dougherty and J. J. Oppenheim, J. Immunol. 131, 2379 (1983).
6. H. A. Young and K. J. Hardy, Pharmacol. Ther. 45, 137 (1990).
7. M. E. Sanders, M. W. Makgoba, S. O. Sharrow, D. Stephany, T. A. Springer, H. A. Young and S. A. Shaw, J. Immunol. 140, 1401 (1988).
8. C. B. Wilson, J. Westall, L. Johnston, D. B. Lewis, S. K. Dower and A. R. Alpert, J. Clin. Invest. 77, 860 (1986).
9. K. Conlon, J. Osborne, C. Morimoto, J. Ortaldo and H. Young, Eur. J. Immunol. 25, 644 (1995).
10. T. R. Mosmann, H. Cherwinski, M. W. Bond, M. A. Giedlin and R. L. Coffman, J. Immunol. 136, 2348 (1986).
11. T. R. Mosmann, Immunol. Res. 10, 183 (1991).
12. W. E. Paul and R. A. Seder, Cell 78, 241 (1994).
13. G. S. Firestein, W. D. Roeder, J. A. Laxer, K. S. Townsend, C. T. Weaver, J. T. Hom, J. Linton, B. E. Torbett and A. C. Glasebrook, J. Immunol. 143, 518 (1989).
14. H. Spits, L. L. Lanier and J. H. Phillips, Blood 85, 2654 (1995).
15. H. A. Young and J. R. Ortaldo, J. Immunol. 139, 724 (1987).
16. S. Bandyopadhyay, B. Perussia, G. Trinchieri, D. S. Miller and S. E. Starr, J. Exp. Med. 164, 180 (1986).
17. A. B. Wilson, J. M. Harris, and R. R. A. Coombs, Cell. Immunol. 113, 130 (1988).
18. D. Benjamin, D. P. Hartmann, L. S. Bazar and R. J. Jacobson, Am. J. Hematol. 22, 169 (1986).
19. Y. Pang, Y. Norihisa, D. Benjamin, R. R. S. Kantor and H. A. Young, Blood 80, 724 (1992).
20. M. J. Fultz, S. A. Barber, C. W. Dieffenbach and S. N. Vogel, Int. Immunol. 5, 1383 (1993).
21. P. Di Marzio, P. Puddu, L. Conti, F. Belardelli and S. Gessani, J. Exp. Med. 179, 1731 (1994).
22. P. L. Rady, P. Cadet, T. K. Bui, S. K. Tyring, S. Baron, G. J. Santon and T. K. Hughes, Cytokine 7, 793 (1995).
23. H. A. Young, L. Varesio and P. Hwu, MCBiol 6, 2253 (1986).
24. J. C. Chrivia, T. Wedrychowicz, H. A. Young and K. J. Hardy, J. Exp. Med. 172, 661 (1990).
25. L. Penix, W. M. Weaver, Y. Pang, H. A. Young and C. B. Wilson, J. Exp. Med. 178, 1483 (1993).
26. M. Cippitelli, A. Sica, V. Viggiano, J. Ye, P. Ghosh, M. J. Birrer and H. A. Young, J. Biol. Chem. 270, 12548 (1995).
27. P. Ghosh, A. Sica, M. Cippitelli, J. Subleski, R. Lahesmaa, H. A. Young and N. R. Rice, JBC 276, 7700 (1996).
28. S.-Y. Pai, D. A. Fruman, T. Leong, D. Neuberg, T. G. Rosano, C. McGarigle, J. H. Antin and B. E. Bierer, Blood 84, 3974 (1994).
29. V. K. Kuchroo, M. P. Das, J. A. Brown, A. M. Ranger, S. S. Zamvil, R. A. Sobel, H. L. Weiner, N. Nabavi and L. H. Glimcher, Cell 80, 707 (1995).
30. G. J. Freeman, V. A. Boussiotis, A. Anumanthan, G. M. Bernstein, X. Y. Ke, P. D. Rennert, G. S. Gray, J. G. Gribben and L. M. Nadler, Immunity 2, 523 (1995).
31. M. Ghiotto-Ragueneau, M. Battifora, A. Truneh, M. D. Waterfield and D. Olive, Eur. J. Immunol. 26, 34 (1996).
32. J. A. Nunes, A. Truneh, D. Olive and D. A. Cantrell, JBC 271, 1591 (1996).
33. S. H. Chan, M. Kobayashi, D. Santoli, B. Perussia and G. Trinchieri, J. Immunol. 148, 92 (1992).
34. S. H. Chan, B. Perussia, J. W. Gupta, M. Kobayashi, M. Pospisil, H. A. Young, S. F. Wolf, D. Young, S. C. Clark and G. Trinchieri, J. Exp. Med. 173, 869 (1991).
35. P. T. Mehrotra, A. J. Grant and J. P. Siegel, J. Immunol. 154, 5093 (1995).
36. J. A. Gollob, J. Li, E. L. Reinherz, and J. Ritz, J. Exp. Med. 182, 721 (1995).
37. M. Kubin, M. Kamoun and G. Trinchieri, J. Exp. Med. 180, 211 (1994).
38. A. M. D'Andrea, M. Aste-Amezaga, N. M. Valiante, X. Ma, M. Kubin and G. Trinchieri, J. Exp. Med. 178, 1041 (1993).
39. N. G. Copeland, D. J. Gilbert, C. Schindler, Z. Zhong, Z. Wen, J. E. Darnell Jr., A. L.-F. Mui, A. Miyajima, F. W. Quelle, J. N. Ihle and J. A. Jenkins, Genomics 29, 225 (1995).
40. J. N. Ihle, Cell 84, 331 (1996).
41. T. Miyazaki, A. Kawahara, H. Fujii, Y. Nakagawa, Y. Minami, Z.-J. Liu, I. Oishi, I. Silvennoinen, B. A. Witthuhn, J. N. Ihle and T. Taniguchi, Science 266, 1045 (1994).
42. J.-X. Lin, T.-S. Migone, M. Tsang, M. Friedman, J. A. Weatherbee, L. Zhou, A. Yamauchi, E. T. Bloom, J. Mietz, S. John and W. J. Leonard, Immunity 2, 331 (1995).
43. J. Hou, U. Schindler, W. J. Henzel, S. C. Wong and S. L. McKnight, Immunity 2, 321 (1995).
44. J. Ye, J. R. Ortaldo, K. Conlon, R. Winkler-Pickett and H. A. Young, J. Leukoc. Biol. 58, 225 (1995).
45. C. M. Bacon, D. W. McVicar, J. R. Ortaldo, R. C. Rees, J. J. O'Shea and J. A. Johnston, J. Exp. Med. 181, 399 (1995).
46. N. G. Jacobson, S. J. Szabo, R. M. Weber-Nordt, Z. Zhong, R. D. Schreiber, J. E. Darnell, Jr. and K. M. Murphy, J. Exp. Med. 181, 1755 (1995).
47. P. Ghosh, A. Sica, H. A. Young, J. Ye, J. L. Franco, R. H. Wiltrout, D. L. Longo, N. R. Rice, and K. L. Komschlies, Cancer Res. 54, 2969 (1994).
48. P. Ghosh, K. L. Komschlies, M. Cippitelli, D. L. Longo, J. Subleski, J. Ye, A. Sica, H. A. Young, R. H. Wiltrout and A. C. Ochoa, J. Natl. Cancer Inst. 87, 1478 (1995).
49. X. Li, J. Liu, J.-K. Park, T. A. Hamilton, P. Rayman, E. Klein, M. Edinger, R. Tubbs, R. Bukowski and J. Finke, Cancer Res. 54, 5424 (1994).
50. L. M. Weiskirch, Y. Bar-Dagan and M. B. Mokyr, Cancer Immunol. Immunother. 38, 215 (1994).
51. L. Gorelik, A. Prokhorova and M. B. Mokyr, Cancer Immunol. Immunother. 39, 117 (1994).
52. H. Maeda and A. Shiraishi, J. Immunol. 156, 73 (1996).
53. D. D. Schoof, Y. Terashima, G. E. Peoples, P. S. Goedegebuure, J. V. Andrews, J. P. Riche and T. J. Eberlein, Cell. Immunol. 150, 114 (1993).
54. M. Clerici, F. T. Hakim, D. J. Vernon, S. Blatt, C. W. Hendrix, T. A. Wynn and G. M. Shearer, J. Clin. Invest. 91, 759 (1993).
55. P. W. Laird and R. Jaenisch, Genetics 3, 1487 (1994).
56. Y. Yang and Q. Li, NARes 18, 3083 (1990).
57. W. L. Farrar, F. W. Ruscetti and H. A. Young, J. Immunol. 138, 1551 (1985).
58. K. J. Hardy, B. M. Peterlin, R. E. Atchison and J. D. Stobo, PNAS 82, 8173 (1985).
59. H. A. Young, P. Ghosh, J. Ye, J. Lederer, A. Lichtman, J. Gerard, L. Penix, C. B. Wilson, A. J. Melvin, M. E. McGurn, D. B. Lewis and D. D. Taub, J. Immunol. 153, 3603 (1994).
60. R. Fukunaga, M. Matsuyama, H. Okamura, K. Nagata, S. Nagata and Y. Sokawa, NARes 14, 4421 (1986).
61. M. J. Melvin, M. E. McGurn, S. J. Bort, C. Gibson and D. B. Lewis, Eur. J. Immunol. 25, 426 (1995).
62. K. J. Hardy, B. Manger, M. Newton and J. D. Stobo, J. Immunol. 138, 2353 (1987).
63. M. Clerici and G. M. Shearer, Immunol. Today 15, 575 (1994).
64. P. W. Gray and D. V. Goeddel, Lymphokines 13, 151 (1987).
65. H. S. Fox, B. L. Bond and T. G. Parslow, J. Immunol. 146, 4362 (1991).
66. H. Dijkema, P. H. van der Melde, P. H. Pouwels, M. Caspers, M. Dubbeld and H. Schellekens, EMBO J. 4, 761 (1985).
67. K. Devos, F. Duerinck, K. Van Audenhove and W. Fiers, J. Interferon Res. 12, 95 (1992).
68. D. A. Brown, F. B. Nelson, E. L. Reinherz and D. J. Diamond, Eur. J. Immunol. 21, 1879 (1991).
69. L. A. Penix, M. T. Sweetser, W. M. Weaver, J. P. Hoeffler, T. K. Kerppola and C. B. Wilson, JBC, in press.
70. V. C. Ciccarone, J. Chrivia, K. J. Hardy, and H. A. Young, J. Immunol. 144, 125 (1990).
71. D. A. Brown, K. L. Kondo, S.-W. Wong and D. J. Diamond, Eur. J. Immunol. 22, 2419 (1992).
72. A. Sica, T.-H. Tan, N. Rice, M. Kretzschmar, P. Ghosh and H. A. Young, PNAS 89, 1740 (1992).
73. J. D. Fraser and A. Weiss, MCBiol 12, 4357 (1992).
74. P. Ghosh, T.-H. Tan, N. R. Rice, A. Sica and H. A. Young, PNAS 90, 1696 (1993).
75. V. Casolaro, S. N. Georas, Z. Song, I. D. Zubkoff, S. A. Abdulkadir, D. Thanos and S. J. Ono, PNAS 92, 11623 (1995).
76. N. Sarvetnick and H. S. Fox, Mol. Biol. Med. 7, 323 (1990).
77. J. Ye, P. Ghosh, M. Cippitelli, J. Subleski, K. J. Hardy, J. R. Ortaldo and H. A. Young, JBC 269, 25728 (1994).
78. S. Hahn, Curr. Biol. 2, 152 (1992).
79. A. Shrivastava and K. Calame, NARes 22, 5151 (1994).
80. J. Ye, H. A. Young, J. R. Ortaldo and P. Ghosh, NARes 22, 5672 (1994).
81. H. A. Young and K. J. Hardy, J. Leukoc. Biol. 58, 373 (1995).
82. X. Xu, Y.-L. Sun and T. Hoey, Science 273, 794 (1996).
RecA Protein: Structure, Function, and Role in Recombinational DNA Repair¹
ALBERTO I. ROCA AND MICHAEL M. COX
Department of Biochemistry, College of Agriculture and Life Sciences, University of Wisconsin, Madison, Wisconsin 53706

I. On the Function of Homologous Genetic Recombination in Bacteria
   A. The Function of Homologous Genetic Recombination in Bacteria Is Recombinational DNA Repair
   B. The Initiation of Recombination in an Escherichia coli Cell Normally Requires DNA Damage
II. The Structure of RecA Protein
   A. General Properties
   B. Sequence Alignments
   C. Expanded Discussion of Structure-Function Relationships
III. RecA Protein Interactions with Its Ligands in Vitro; Biochemical Approaches
   A. DNA Binding
   B. Polar Assembly and Disassembly and the Importance of 3' DNA Ends
   C. ATP Hydrolysis
IV. RecA Protein-Mediated DNA Strand Exchange
   A. Overview of the Reaction
   B. DNA Pairing
   C. Unidirectional Extension of the Hybrid DNA and the Role of ATP Hydrolysis
   D. Exchange Reactions with Four DNA Strands
V. Interaction of RecA Protein with Other Proteins
   A. The Single-Strand DNA Binding Protein
   B. The RecF, RecO, and RecR Proteins
   C. The RuvA and RuvB Proteins
   D. Other Proteins Affecting R ...
VI. Other Functions of RecA Protein in Vivo
   A. The RecA Coprotease Function
   B. SOS Mutagenesis
   C. Chromosome Partitioning
   D. Induced Stable DNA Replication
VII. Epilogue: Relating RecA Biochemistry to DNA Repair
References

¹ Abbreviations: ATPγS, adenosine 5'-O-(3-thiotriphosphate); dsDNA, double-stranded DNA; pI, isoelectric point; pK, negative logarithm of dissociation constant; ssDNA, single-stranded DNA; UV, ultraviolet light.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 56. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
The RecA protein plays a central role in recombinational processes in Escherichia coli. In this capacity, it binds to two DNA molecules and aligns homologous sequences within them. It then promotes a DNA strand-exchange reaction that creates branched DNA recombination intermediates. The protein also has a regulatory function, serving as part of the system that induces the SOS response to DNA damage. In this regulatory role, RecA protein facilitates the autocatalytic cleavage of the LexA repressor, the λ repressor, and a few other proteins. Because RecA does not participate as a classical protease in this reaction, the effect of RecA on repressor cleavage is usually referred to as the coprotease function. RecA also plays a direct role in a process called SOS mutagenesis. This review focuses largely on the recombinational activities of RecA protein.
The RecA protein has been found in all bacteria in which it has been carefully sought, including Mycoplasma with its minimal genome (1). The recA gene has been sequenced in over 60 bacterial species. The wide distribution of bacterial recA genes, in classes of bacteria that took separate evolutionary paths during Precambrian times, indicates that the protein evolved very early. Structural homologs of RecA protein have been found in Archaeans² (2) and in eukaryotes from yeast to humans (3-5).
The RecA protein of E. coli is a 352-amino-acid polypeptide of Mr 37,842. This modestly sized polypeptide has separate binding sites for at least three strands of DNA, ATP, the LexA and λ repressors, and other RecA monomers. The active species of RecA protein implicated in most of its activities is a structure in which RecA protein monomers are assembled into a right-handed helical filament on DNA. Its multitude of functions is reflected in a complex enzymology that still has significant chapters unwritten.
² Our description of molecular taxonomy follows the recommendations of Woese et al. (2a).

I. On the Function of Homologous Genetic Recombination in Bacteria
In the summer of 1984 at the Cold Spring Harbor Laboratory, there was a symposium entitled "Recombination at the DNA Level." It focused broadly on recombination mechanisms and covered what was then known about the mechanisms of homologous genetic recombination, site-specific recombination, and transposition in a marathon of over 90 presentations. One of the most memorable presentations was the final one, in which Alan Campbell was asked to summarize the entire meeting. A paper outlining his remarks was later published in the symposium volume (6). Following an inexorable tide of papers on recombination mechanisms, the following comments on general recombination stood out:
"The function of general recombination has hardly surfaced in this Symposium. Perhaps there was nothing constructive to say on the subject. I am left uncertain as to how many investigators in that area consider that the function is too obvious to require discussion, how many think that general recombination serves no useful function, and how many consider the question uninteresting or intractable. But there is a real question that needs to be answered sometime. Our immediate interest ... [is] ... in recombinational mechanisms. But we all know that the immediate selective value of those [recombination] genes and products does not depend on their role in reshuffling genes in natural populations."
The passage reflects a persistent temptation to study the mechanism of recombination without adequately considering its biological functions. The interest in mechanism is easily explained. Homologous genetic recombination is the cellular process underlying most classical and many modern methods of molecular genetics. A sexual union of two individuals of any species is accompanied by genetic recombination, resulting in new combinations of genetic alleles in the offspring. In the laboratory, introduction of DNA into a cell is also often accompanied by recombination, leading to an insertion of the DNA into the cellular genome. The recombination associated with genetic crosses allowed Gregor Mendel to ask and answer new questions about inheritance. Today it allows gene hunters to isolate the genes responsible for human genetic diseases. Recombination generates genetic diversity, and the generation of diversity allows us to study genomes. Without the inherent capacity of homologous chromosomes to come together and exchange genetic information via recombination, it is hard to imagine where our understanding of chromosomes, genes, and their functions might be. As we learn more about genomes, we seek to change them by genetic engineering and human gene therapy. Changing a genome requires recombination. Requiring further pretext or context for pursuing an understanding of recombination mechanisms seems almost gratuitous.
However, Campbell's point is not irrelevant or even esoteric. A real understanding of the recombination mechanism is impossible without an appreciation for recombination function. The function most important to a cell is not necessarily the function most important to geneticists. The absence of talks on recombination function at the 1984 symposium (6) did not reflect a dearth of ideas on the subject. Key ideas about recombination function were simply not represented.
A. The Function of Homologous Genetic Recombination in Bacteria Is Recombinational DNA Repair
In addition to its role in the generation of genetic diversity, homologous genetic recombination is required for recombinational DNA repair and is involved in the successful segregation of chromosomes at cell division in all organisms. Which of these functions can account for the maintenance of recombination systems in all cells? One place to look for a discussion of this question is in the literature addressing one of the most intractable questions in evolutionary biology: the evolution of sex. "Sex" is commonly defined as either genital union, gender, or mixis. Mixis is more or less synonymous with recombination, and this is the context for our discussion.
In higher eukaryotes, recombination is closely linked to reproduction. A number of researchers have argued that the creation of genetic diversity is the primary function of recombination in eukaryotes, providing the selection pressure for the maintenance of recombination systems (7-10). Viewed in this context, there are several major advantages to recombination (7, 10). Recombination allows rapid adjustment of phenotypes to meet a changing environment. Beneficial mutations from different lineages can be readily combined, and deleterious mutations can be eliminated. All of these factors may help account for the limited distribution of asexual eukaryotes, in which genomic change is largely limited by the rate of mutation. Without sex, the combination of two beneficial new mutations would require two independent mutational events. Most mutations are bad, and an inability to eliminate deleterious mutations by recombination may be especially important in restricting the distribution of asexual species. A small sexual population lacking a mutant-free phenotype can always generate one by recombination. A similar asexual population can generate a mutant-free phenotype only by back mutation.
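The one-way accumulation of deleterious mutations in a small asexual population, a process often called Muller's ratchet, can be illustrated with a toy simulation. The sketch below is a deliberately simplified model and not taken from the cited literature: the population size, mutation probability, selection coefficient, and function name are all illustrative assumptions, each offspring copies a single parent (no recombination), and back mutation is ignored.

```python
import random


def simulate_ratchet(pop_size=200, mutation_prob=0.3, selection=0.02,
                     generations=300, seed=1):
    """Track the mutation count of the least-loaded individual in an
    asexual population. All parameter values are illustrative."""
    rng = random.Random(seed)
    loads = [0] * pop_size  # deleterious mutations carried by each individual
    least_loaded = []
    for _ in range(generations):
        # Each mutation multiplies relative fitness by (1 - selection).
        weights = [(1.0 - selection) ** k for k in loads]
        # Asexual reproduction: every offspring copies one parent's genome.
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Offspring may gain one new mutation; none are ever removed.
        loads = [p + (1 if rng.random() < mutation_prob else 0)
                 for p in parents]
        least_loaded.append(min(loads))
    return least_loaded


history = simulate_ratchet()
print("least-loaded class at start:", history[0])
print("least-loaded class at end:", history[-1])
```

Because offspring can never carry fewer mutations than their parent in this model, the minimum mutation load can only stay the same or increase from one generation to the next; a sexual population could in principle regenerate a less-loaded genotype by recombination.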
A detailed theory of the consequences of deleterious mutations on asexual populations was presented by Muller (11), and has been given the name “Muller’s ratchet” (7, 10, 12). Mutation loads may even limit the size of the genome of asexual organisms (10). However, recombination comes at a price. For every 100 offspring, asexual organisms will contribute 100 individuals that can reproduce in the next generation. Sexual females will contribute only 50. The maintenance of males, along with the processes of meiosis and syngamy, represents a substantial energetic cost that must be borne by sexual species. Recombination may break up the same favorable combinations of mutant alleles that it creates. Recombination also does not provide a means for two or more rare genes, individually deleterious but beneficial when grouped, to spread
RecA PROTEIN
through a population together. The costs of recombination can lead to a short-term advantage for asexual populations over sexual ones. For these and other reasons, there has been a search for additional functions of recombination. One view holds that mechanisms for sexual genetic exchange arose at least in part from parasitic genetic elements (e.g., transposons) (13). A number of other researchers have argued that the primary function of eukaryotic recombination is to repair DNA (14-18). A significant advantage of this idea is that it provides a clear selective pressure for the maintenance of recombination systems at the level of individual organisms. The recent demonstration that the gene responsible for Bloom’s syndrome in humans is a homolog of the bacterial recQ gene (19) [encoding a helicase involved in homologous recombination (20, 21)] provides an interesting link between recombination and DNA repair in eukaryotes. The gene for Werner’s syndrome also exhibits significant similarity to recQ (22). The function of recombination and the selective advantage of sex in eukaryotes remain controversial. The major theses outlined above are not mutually exclusive, and several or all of the factors cited may have played a significant role in the evolution of eukaryotic sex. The situation in bacteria is quite different. There is a broad consensus that recombination originated in bacteria or their progenitors as a DNA repair process (7-10, 14, 16, 18, 23-27). The requirement for DNA repair probably remains the major selection pressure for the universal maintenance of homologous recombination systems in bacteria (26). Sex is not linked to reproduction in bacteria. Sexual genetic exchanges are generally quite rare (28-30), and it is harder to invoke the creation of genetic diversity as a factor in the origin or maintenance of bacterial recombination systems. On the other hand, potentially mutagenic DNA lesions are virtually omnipresent (25, 26).
Each bacterial cell growing aerobically in a rich laboratory broth suffers on average several thousand DNA lesions of all types in every generation (26, 31). When a bacterial cell lacks functional copies of key genes involved in recombination, the prototypical phenotype is an extreme sensitivity to DNA-damaging agents. Many cells mutant for recombination functions are very sick or dead (26, 32). Recombinational DNA repair is a specialized process directed at DNA damage only when it occurs in certain contexts. Much DNA repair is made possible because DNA is double stranded. If one strand contains a lesion, a segment of the strand containing the lesion can be removed. The second strand can then be used as a template for replacement of the damaged strand segment. No recombination is required. However, there are important situations in which a second DNA strand is not available to direct repair. In some cases, cross-links or double-strand breaks produce damage in both strands. Alternatively, a DNA lesion can be located in a single-stranded region of the
[FIG. 1 panels: A. Postreplication repair; B. Double-strand cross-link repair; C. Double-strand break repair. Step labels in the diagram include: Replication; Homologous DNA; RecA (+ RecFOR); Nuclease, helicase, RecA; Helicase, nuclease, RecA (+ RecFOR); RecBCD; UvrA,B,C; Nuclease, RecA, replication; Repair, replication; Replication, ligase; RuvA, RuvB, RuvC.]
DNA. In these situations, the information for faithful DNA repair must come from a different homologous DNA molecule via recombination. Some current models for recombinational DNA repair are presented in Fig. 1. Each of the pathways is directed at a different type of damage, but there are a number of common themes. First, the pairing of two DNA molecules involves one single-stranded DNA and one duplex DNA. If single-stranded DNA is not already present, early stages of recombination are dominated by enzymes that produce it (helicases and nucleases). The pairing is likely to involve the transient formation of a novel triplex DNA intermediate. Second, pairing is followed by a strand exchange in which the single strand is paired with its complement in the duplex, and the like strand in the duplex is displaced. The product duplex is referred to as heteroduplex DNA if it contains one or more mismatches or lesions, or hybrid DNA if it does not. Third, the strand exchange usually results in the formation of branched DNA intermediates. Because they are formed from homologous DNA molecules, the branches can move in a process known as branch migration. Branch migration can continue into regions where both DNA molecules are duplex, forming the branched recombination intermediate first proposed by Holliday (33), and thus called the Holliday intermediate. Finally, repair is completed by enzymes that process the branched DNA intermediates, including specialized helicases and nucleolytic activities, DNA polymerases, and DNA ligase.¹ Consider in more detail what is likely to happen when a replication fork encounters an unrepaired DNA lesion (Fig. 1A). DNA polymerases will not insert nucleotides opposite the lesion, so replication halts. Whether or not replication resumes at some point further along the template, the lesion is left in a single-stranded DNA gap and must be repaired in a process known as postreplication DNA repair.
FIG. 1. Models for recombinational DNA repair. The postreplication repair model is adapted from West et al. (34). The double-strand cross-link repair pathway is based on a model by Cheng et al. (414). The double-strand break repair model shown is based on models presented by Szostak et al. (35) and Smith (29). See text for details.
¹ In Chapter 1 of his recent book (33a), Kuzminov has provided a detailed and lucid description of the evolution of thought about models for recombinational repair.
The repair pathway shown is based on that proposed by Howard-Flanders and colleagues (34). The single-stranded DNA in the gap is paired with homologous duplex DNA derived from the other side of the replication fork. Nucleolytic processing of the donor duplex leads to formation of the branched recombination intermediate, and branch migration across the DNA lesion provides an undamaged complementary strand at that location. Once the DNA surrounding the lesion has been rendered double-stranded, the lesion can be repaired by the excision repair pathways. Although the
ALBERTO I. ROCA AND MICHAEL M. COX
details differ for the repair of double-strand breaks and cross-links, the same general sequence of steps is evident. The double-strand break-repair pathway (Fig. 1B) is also similar to the pathways proposed for most meiotic recombination in eukaryotes (35) and conjugational recombination in bacteria (29). The central recombination activity in bacteria is the RecA protein. RecA is found in all bacteria, and clear structural and functional homologs have been identified in eukaryotes, from yeast to humans. RecA is broadly responsible for facilitating recombination steps involving DNA pairing and branch migration. To initiate, regulate, and complete recombinational exchanges, RecA is augmented by over a dozen additional proteins. The overall system is complex, sophisticated, adaptable to different repair and recombination scenarios, and critical for cell survival. Escherichia coli cells lacking a functional recA gene grow poorly. About 50% of the cells in a typical culture are dead and 10% contain no DNA (32). Cells containing both a recA null mutation and an additional mutation tending to increase DNA strand breaks (lig, dam, xth, ung, and polA) are inviable (36-38). Recombination and recombinational repair are all but eliminated in a recA cell. A dramatic example of the relationship between recombination and repair can be seen in recent work on the bacterium Deinococcus radiodurans, which can survive a dose of radiation several thousand times that required to kill other organisms. This extraordinary radiation resistance depends completely on an intact recA gene (39-41). Homologous chromosomes in these bacteria may be arranged in pairs, with each pair permanently linked by many Holliday junctions to facilitate rapid repair of DNA breaks (41). This same mechanism may explain the resistance of D. radiodurans to DNA strand breaks caused by prolonged desiccation (42).
B. The Initiation of Recombination in an Escherichia coli Cell Normally Requires DNA Damage
Intracellular recombination in bacteria exhibits a demonstrable dependence on DNA damage. Some of the literature on which this statement is based has been reviewed (25, 26). New work continues to reinforce the theme. During infection of homoimmune E. coli lysogens, undamaged nonreplicating λ phage DNA circles undergo very little recombination. Prior ultraviolet (UV) irradiation of phages dramatically elevates recombinant frequencies (43-45). The initiation of λ recombination is dependent to a large extent on the exonuclease functions associated with methyl-directed mismatch repair, suggesting that it is the generation of single-strand gaps in the DNA that leads to the initiation of recombination (45). Other observations correlate a requirement for RecA protein with oxidative DNA damage. Oxidative damage accounts for the major portion of the
thousands of genomic DNA lesions suffered by every E. coli cell in every generation when grown aerobically (26, 31). Introduction of a mutation in the fur gene produces an iron overload, leading to much increased oxidative stress and DNA damage, including lethal and mutagenic lesions (46). Double fur recA mutants die after a shift from anaerobic to aerobic conditions. Functional recombinational repair is therefore necessary for protection from the killing effects of aerobic environments, but SOS induction is not (46). The prevalence of oxidative damage can also be correlated directly with the frequency of intracellular recombination. A duplication strain carrying a lac+ allele between two direct-order flanking sequence repeats generates lac- clones by recombination. Among 40 mutants isolated by J. Roth and colleagues that did not produce lac- recombinants, 3 had mutations in the recA gene as expected. Unexpected was the finding that many mutations that eliminate recombination also cause phenotypes suggesting a block in oxidative metabolism (e.g., the citrate cycle and/or electron transport). These results suggest that intermediates or by-products (e.g., oxygen radicals) might be important in the generation of initiating substrates (nicks, breaks), which are a prerequisite for recombination (46a). It is not the DNA damage per se that leads to the initiation of recombination, but the double-stranded DNA ends and single-stranded gaps that appear as a result of the damage. For example, the RecBCD enzyme is a key activity in bacterial double-strand break repair. RecBCD does not bind to DNA lesions; it binds to DNA ends created by DNA damage. Similarly, RecA filament formation is not stimulated by DNA lesions. Instead, RecA filaments form on single-stranded DNA gaps made available by other processes. The importance of the DNA gaps and breaks associated with damage as opposed to the damage itself can be seen in another RecA activity, the induction of the SOS response.
Filamentous phages defective in minus-strand synthesis induce the SOS response when used to infect E. coli cells, whereas the wild-type phage does not (47). The single-stranded phage DNA, in the absence of DNA damage, was sufficient to induce the SOS response in vivo. Recombination that occurs during the rare sexual exchange can be viewed as a by-product of the inherent recombinational repair processes. However, initiation of recombination during conjugation or transduction does not require DNA damage. During conjugation or transduction, the double-strand break-repair path (RecBCD) is simply appropriated to bring about an efficient genetic exchange. This appropriation comes about because the processes of conjugation and transduction present the cell with the substrates, DNA ends and single-stranded DNA gaps, required to initiate recombination efficiently. The RecF pathway may be appropriated the same way in certain conjugational contexts. The process of conjugational recombination is as efficient as it is because its evolution was shaped by the biological imperative to repair DNA.
II. The Structure of RecA Protein
A. General Properties
Bacterial strains engineered to produce the RecA protein at very high levels are widely available. The RecA protein is easily purified in large quantities (48-53), although care must be taken to remove persistent contamination with exonuclease I (54). The protein is quite soluble in aqueous buffers and is stable for many months if stored at -70°C. The isoelectric point of the purified protein has been measured in at least six studies, giving pI values from 5.0 (55) to 6.2 (56). Averaging all the measurements (53, 55-59) gives a pI of 5.6. In solution, the protein has a tendency to aggregate into oligomers, filaments, and bundles of filaments (60-63). The aggregation state of a given solution of the protein is affected by pH, ionic strength, and temperature. Concentrated solutions of the protein can appear almost opalescent. The structure of RecA protein has been determined at 2.3 Å resolution (64, 65). There is a major central domain flanked by two smaller subdomains at the N and C termini (Fig. 2). Monomers in the crystal are packed to form a continuous spiral filament, with six monomers per right-handed helical turn (Fig. 3). These results are consistent with other studies showing that RecA protein forms a helical nucleoprotein filament on DNA. The filament exhibits a deep helical groove. A variety of physical studies (detailed below) indicate that the groove can accommodate up to three DNA strands. The structure was solved in two forms, one with bound ADP (65) and the other with no nucleotide (64). Both are highly informative, although some work indicates that they represent an inactive (ADP-bound) form of the protein (3, 66-68).
B. Sequence Alignments
Sequence alignments of related proteins from different species are commonly used to identify functionally important segments of a protein’s primary structure (69). In a previous review, we published a multiple sequence alignment of 16 bacterial RecA proteins (70). This and other alignments helped to identify and elucidate the functions of several conserved regions of the RecA protein. For example, the Walker A motif defined by the sequence analysis of nucleotide binding proteins (including RecA) (71) has subsequently been shown to bind the phosphates of NTPs in the crystal structures of proteins such as Ras p21 (72), EF-Tu (73), and RecA (65). RecA alignments have also helped identify RecA homologs in other classes of organisms (4).
FIG. 2. A RecA monomer, based on the structure determined by Story et al. (64). The α-helical regions are labeled A through J; the β strands are labeled 0 through 10. Dashed lines denote regions that are disordered in the crystal structure.
FIG. 3. A RecA filament, based on the structure published by Steitz and colleagues (64, 65). Four turns of a helical filament are shown in a space-filling representation. There are 6 RecA monomers per turn, or 24 altogether in the structure presented. Two monomers are shaded dark.
Several recent advances recommend a renewed examination of RecA sequences as an avenue for generating insights into structure-function relationships in this protein. First, the number of available bacterial RecA sequences has increased more than fourfold since the original alignment was published. The data set is now sufficient to use for phylogenetic analysis (74, 75). Second, the availability of the structure of the RecA protein, including an ADP-bound form (64), provides an enhanced context for sequence analysis. Third, the Rad51 protein of yeast has been identified as a structural (3) and functional (76) homolog of RecA. Finally, several additional eukaryotic RecA homologs have been advanced based on sequence identities, notably Dmc1 of yeast (77) and Rec2 of Ustilago (78). Together, these sequences have the potential to eliminate noise in the alignments and focus attention on structural motifs that are essential and perhaps unique to RecA function. An expanded set of alignments is presented here, along with an analysis of some of the information to be derived from it. The results focus attention on a segment of the RecA primary structure that has been little explored to date, and that is likely to play a role in coupling ATP hydrolysis to a protein conformational change. The results also suggest the existence of a RecA homologue in E. coli.
1. BACTERIAL RecA ALIGNMENT
The alignment of 64 full-length bacterial RecA homologs is depicted in Fig. 4. The sequences are from the GenBank and EMBL data bases and the respective accession numbers are given at the end of each sequence. Four of the sequences, indicated by dates, are personal communications. There were 21 sequence errors corrected during the preparation of this alignment (A. I. Roca, unpublished data). The summary lines at the top of Fig. 4 provide an overview of the conserved features of the alignment.
For example, the Walker A box (involved in ATP binding) from residues 66 to 73 is highly conserved with six invariant residues (pointed out in the “iden” summary line) and two positions where nonconservative amino acid substitutions occur. This region is flanked by positions where chemically conservative changes occur, such as aromatic side chains at position 65 and hydrophobic residues at position 75, shown in the “chm1” and “chm2” summary lines. The alignment is placed within the context of the secondary structural elements (“stru” summary line) defined by the RecA crystal structure (64). For instance, the Walker A box is a loop that occurs between β-strand 1 and α-helix C. Finally, the fifth summary line (“prof”) is a consensus sequence for bacterial RecA proteins calculated as the highest scoring residue at that position of the weighted alignment in the RecA profile (see Fig. 4 legend). In general, the RecA protein is very well conserved. The percent identity
with respect to the E. coli sequence ranges from 49% for Mycoplasma pulmonis to 100% for Shigella flexneri. Proteins with identity as low as 30% can be structurally related (79), so it is likely that the structure of the bacterial homologs will resemble that of the E. coli RecA crystal structure (64). Indeed, electron microscopy shows that the RecA nucleoprotein filament from Thermus aquaticus is very similar in structure to that produced by the E. coli RecA protein (80). There are 59 residues that are invariant among the 64 bacterial sequences. Based on two slightly different sets of assumptions (chm1 versus chm2 summary lines), either 100 or 106 residues were identified wherein changes are restricted to chemically conservative substitutions.
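The two summary statistics quoted above, percent identity to a reference sequence and the number of invariant positions, can be sketched in a few lines. This is a minimal illustration and not the authors' frmtMSA program: the function names and the toy sequences are hypothetical, and the choice to skip gapped positions when computing identity is an assumption, since the text does not state which denominator was used.

```python
def percent_identity(ref, other):
    """Percent of aligned position pairs, neither one a gap, where residues match.

    Assumption: gapped positions are excluded from the denominator.
    """
    pairs = [(a, b) for a, b in zip(ref, other) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

def invariant_columns(alignment):
    """Indices of columns in which every sequence carries the same residue and no gap."""
    return [
        i for i, col in enumerate(zip(*alignment))
        if "-" not in col and len(set(col)) == 1
    ]

# Hypothetical toy alignment (not RecA data); the first row is the reference.
toy = ["MKLVDAGE", "MKLVDAGE", "MKIVDAGD", "MK-VDAGE"]
identities = [percent_identity(toy[0], seq) for seq in toy[1:]]
invariant = invariant_columns(toy)
```

In the toy alignment the third sequence differs from the reference at two of eight positions, and the column containing a gap is not counted as invariant.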
FIG. 4. Multiple sequence alignment of 64 bacterial RecA proteins. In the alignment, a dash (-) represents a gap introduced in a sequence to optimize the alignment with the other RecA proteins. A period (.) indicates a residue identical to the amino acid found in the E. coli RecA sequence at the top of the alignment. An asterisk (*) marks the position of a protein intron found in the RecA proteins of Mycobacterium tuberculosis and Mycobacterium leprae (91). The GenBank accession numbers for the different published sequences are given at the end of the alignment (GenBank release 86 and EMBL release 41). Sequences with dates listed are personal communications. Above the alignment are several summary lines, calculated with a C program (frmtMSA) written by A. I. Roca. Here stru provides information on the secondary structural elements from the E. coli RecA crystal structure, where “a” denotes α-helices, “b” denotes β-strands, “l” denotes functional loops, and “?” denotes disordered regions (64). Each secondary structural element is uniquely designated by a letter or number placed at the second character in the string of characters defining the element. The α helices are labeled A to J, the β strands are labeled 0 to 10, and the loops are labeled 1 and 2 (64). Next, iden represents the 58 residues identical in all the sequences of the alignment; chm1 lists 105 chemically conservative residue substitutions based on the following classification: acidic (DE), small aliphatic (AG), amides (NQ), aromatic (FWY), basic (HKR), cysteine (C), hydrophilic (ST), hydrophobic (ILMV), and proline (P) (M. Gribskov, personal communication). Then chm2 enumerates 99 chemically conservative residue substitutions based on a different classification: acidic (DE), aliphatic (AGILV), amides (NQ), aromatic (FWY), basic (HKR), sulfur (CM), hydroxyl (ST), and imino (P) (415). In the chm1 and chm2 summary lines, the corresponding residue of the E. coli RecA protein sequence is the one listed.
Also, invariant residues in these same summary lines are indicated by a period (.) for clarity. Finally, prof is a weighted consensus sequence of bacterial RecA proteins, generated by the PROFILEMAKE program of the GCG package using the BLOSUM62 symbol comparison table for scoring different amino acid substitutions (416). PROFILEMAKE is an implementation of profile analysis (95). The alignment of bacterial RecA sequences was generated with the PILEUP program of the GCG Wisconsin Sequence Analysis Package (Release 8.0, Genetics Computer Group, Inc.) (417) using default parameter values. PILEUP produces a progressive, pairwise alignment of sequences (418). A few manual adjustments of small gaps were performed after visual inspection of the alignment. The numbering of residues begins at alanine, the N-terminal residue found in the mature E. coli K-12 RecA protein (419). Amino acid positions cited throughout the text refer to the sequence of E. coli RecA unless stated otherwise. The bacterial alignment was weighted using an unpublished program (M. Gribskov, personal communication) that is based on a modified version of the Felsenstein algorithm (420).
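The notation and the two residue classifications described in this legend can be made concrete with a small sketch. It is not the frmtMSA program: the function names are hypothetical, the class lists are copied from the legend, and treating any column that contains a gap as variable is an assumption.

```python
# Residue classes as listed in the legend for the chm1 and chm2 summary lines.
CHM1 = ["DE", "AG", "NQ", "FWY", "HKR", "C", "ST", "ILMV", "P"]
CHM2 = ["DE", "AGILV", "NQ", "FWY", "HKR", "CM", "ST", "P"]

def expand_dots(reference, row):
    """Replace each '.' in an aligned row with the residue from the reference row."""
    return "".join(r if c == "." else c for r, c in zip(reference, row))

def classify_column(column, classes):
    """Label one alignment column as 'invariant', 'conservative', or 'variable'.

    Assumption: a column containing a gap ('-') is treated as variable.
    """
    residues = set(column)
    if "-" in residues:
        return "variable"
    if len(residues) == 1:
        return "invariant"
    # Conservative: every residue in the column falls within a single class.
    if any(residues <= set(group) for group in classes):
        return "conservative"
    return "variable"

# Hypothetical example column: leucine, alanine, valine.
label_chm1 = classify_column("LAV", CHM1)
label_chm2 = classify_column("LAV", CHM2)
```

A column of L, A, and V is variable under the chm1 classes but conservative under chm2, which illustrates how the two classifications can yield different counts of conservatively substituted positions.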
[FIG. 4 (alignment, residues 1-200): summary lines stru, iden, chm1, chm2, and prof above the aligned bacterial RecA sequences, headed by E. coli (Ecol).]
200 k l a .A.M.V s. VI. .R . L .L T. .I.RT .v 200 nmt .a..A .NVP E. . I
....... ... .. ...... ...... ........ .. .... . ... .. . ... .. ......... .. . ..... .. .a ......... ... .. ........... ........ .............. .... . ... .a............ ... ....... .... ................. .n ..... .. .... . . ....... .. ........... ....... . ........ ..... ... . . .... ... . .....a .. ...... ... ... . ..... .... . .a. .... ..... .. ...... .... ....... ... .... ... .. . .... .. ... ...... ............... .. ... .. . ... ........... ... ..I.. .... ........ .a. .a. .... ... ........... . ..... ....... ...... .... .a. ... .. .. ..... ... .... ....... ................. .... ... .. ... .. ............ ... .. .. .. ...... ....... ... ..a.. .... .. . .. ........... ... ............ . ...a . . ...... ... .. ....... .. .......... ..... ....... a. ... .. . .. ... ....... ....... .............. ..a. . ... .. . .... .......... ...... . ..... ...... .. . ... ........... . .. ........ ...... ... . ... .. ....... .. ........... ... .. ............. .... .. .... . ... .... ........................... .... .......................... ....... ......................................... .... .... ... .......... ... .. .... ...........on.. ... ..a. .... . ... .. ... .. ..... ...... .... ...... . ............ . ... ........... ... .............. .... . ... .. . ... .. ........... .. .. ................ ...... .. .... .. .a. . ........... ... ........ ... ....... ... ... ... ........... ... ......... ......n.......a. ... ...
.. .. ... ... .. ......
..... .......... .... ..
...........
...........
.. ....... ... ........... ...... ..... ..
...........
.. ...... .. ...........
m. ........... ...... ... . ....
....
...
.............. .......... .... ... ........... .. ........... ......... .. ...........
. .....T. S. .V ........... .A.M. I....L .1 V . I ......V.R ..L ...M ...Y V. .Q. .L ....L ..MT.A.SN.G.TA .... .L.E ... 201 . .... .T. S. .V.. ......... .A.M. I....L .I V . 1 ......V. R ..L ...M.. ..V. .Q ..L ....L ..MT.A.NN.G.TA .....L.O ... 201 ..I.TNK ..V....N.....O.LEM.IN.NSI.L.........V..T.LD..NS.QSI..Q.....K.L...~LIAK...TV.....L.E... 215 .V. .KN ..I.... ..I.. ..S .......V .I..K. . S I . L .........V.E. .LN. .MK.QSI ..(I.. L . .K. L. .IT.S.SKNK.SV .... .V.E ... 207 Mxal VS .......WEE ..V ............TEH .V ...... L ......... V .R ......M. .A. ..V Q. .L ....L ...T .AVSR.G.CI. .......... 202 203 Mxa2 VG .......RT.D ..L ............ AEM.V ....I..L........V....L...M..A...VQ..L....L...T.TIAK.Q.CV........... Ngon .V. ......KVEE .Y L. .............T. V .. . G I .M V. .......V ...... .DM ....V. .Q. .L ....L ...T. HI.KT. ..V V. ......... 200 Mlep Mtub Mmyc Mpul
Pmir Pvul Paer Pcep PflU Pput Rleg Rmel Rpha Rcap Rsph Rpro Smar Sfle Saur Spne Samb Sliv Sven Sy70 Sy79 Tmar Taqu Tthe Tfer Vang Vcho Xory Ypes
E. .K 0. .K .K.CQT
... .Q.......................... .S............................. .V ...................N ................201 ....Q ........................... S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . V . . . . . . . . . . . . . . . . . . . N . . . . . . . . . . . . . . . . 201 .0. .G ....NV.D ..V ............T.M. V . .N .....I.......V ........M. .A. V. .a. .L ....L. . I T . .I.NA.C.V ........... 199 VQ . .A ....NVP E. .I ........... .T ...V ...SI.M ..I ...... V ....... .M .. .L P. .Q..L ....L ...T.TI.RT.C.V ........... 200 .E. .G ....NV.D ..V ............T.M. V. .N. I.......... .V ........M. .M. V. .Q..L ....L. .IT. .I.NA.C.V ........... 199 .E. .G ....NV. D. .V ........... .T.M. V. .N. ... .I..... ..V.. ......M. .M.V ..a. .L ....L. .I T. .I.NA.C.V ........... 199 .V ........ LQ... I ............1.T.V ...R ...L ..........R ......M ..TVP..Q.. L....L ...TASISK... NV ........... 198 . V ........LE ...I ............T.T.V .... I . I L . I .... ..V.R.. ....M .. .LP.MQ ..L ... .L ...TASIS K. .CnV ........... 211 .V ........LO...I . . . . . . . . . . . . T . T . V . . . . . . . L . . . . . . . . . . R . . . . . . M . . . L P . . Q . . L . . . . L . . . T A S I S K . . . M V . . . . . . . . . . . 211 .Y. .K. ...SLED ..I............V .T. V .....SL V. ............... M..ATV.AQ. .L. .......TASIGR ..CnV ........... 212 .Q ..K. ...NL.E ..I............V.T.V.....NL............S....on..na..SQ..L........TASICR..CnV........... 198 .A ..K ....N..E.11 ............A.T.1 ...GI.M.11 ...... V ..S.....M ..AQ.ASQ.. L....L ...TASINRT.CITV .......... 200 ....K. ......................... .T ....... .I........................................ .NA ............... 201 .................................................................................................... 201 .E. .QA ....... .YL ....H ...G ...AE.FV ..... . I V ................. M. .T. V. .a. .L .... L ...S.AISK ...TA .......E. V . 199 .A. .AA ...N. .E ..L ....S .. .G ...AG K. I D.....LV ........V .R .. .D.D .....V. .Q.......... 
.GASINKTK.IA .....L. E. V . 213 .E. .a ........ .IL ....N ...... .V.M.V ....L.L. .I......V.R ..... .M ....V. .9..L ....L ..ITSA. N. .K.TA .... .L.E ... 200 .E. .K ........ . I L ....N .......V.M.V ....L.L ..I......V. R ......M ....V. .Q ..L ....L ..ITSA.N ..K.TA.. ...L.E ... 200 .E. .K ......... I L ....N ...... .V. M. V ....L .L ..I ......V .R ......K ....V. .Q ..L ....L ..1TSA.N. .K.TA .....L.E ... 200 .T. SAA .....E. ..VA ...N. .S ....A.Q. V. .A. ..L ..I......V.R ......H. .V QV. .a ..L ..K. L. .I .. .MGR.GCT V. .L. .L.Q ... 203 .V ..TSV ........ I C ......M ....V.Q.V ..A . ..I V . 1 ......V.R ......M ..AQV..Q.. 1.......I T ..1GK.GCTV ..L ..L.Q. .. 204 .V. .KN ....LKS ..I.. ..H.......V.E.V ...V. .L ........ .V.R .....AM ..NQV. .Q..L ....L ..I.SVNK.KAV . V. .T ........ 202 .L ..K . . . . . V Q E . . V . . . . . . . . . . . . V E L . . . . . . . . . . . . . . . . . . V . . . . . . . . M . . Q . V . . Q . . L . . . . L . . . T A V . S K . . . A A . . . . . V . E . V. 199 .L ..QR ...QVED.. V ............ VEL .................. V.R ......M ..Q.V..P.. L ....L . ..TAV.AK.. .AA .....V.E.V. 199 .6. .H .....LE ...I........... .A.M.V ......L ..I...............M ....V. .Q. .L.. ..L.N.TA.ISR ....V ........... 200 ... .K ....N. .E. .V ................... ..I.. ..I.. ............ .M ...... .Q...L ...... .T ...... .CnC ........... 199 .V. .K ....N. .E. .V ......................................... .M .......Q.. .L ...... .T .......cMC ........... 199 .V. .A ....NV. 0. .L ............A.M. V. .SS.. I V . 1 ...............M. .QL P. .a ..L., ..L ...T ..I.R ... .VV ....L ..... 199 ....K ........................... T........I......................................................... 201 FIG. 4. (continued)
.
.
.
220
.
.
..
o?
.
.
240
.
.
250
.
.
260
.
.
270
.
.
280
.
.
2 90
.
300
.
...
-
.... . . .
... . . . .
..
.
E C O ~VMFG-NPETTTGGNALKFYAVRLDIR-RI~AVKEGENVVGSETRVKVVKNKI-MPFKQAEFQILYEGINFYGELVOLGVKEKLIEKAGAWYSYKGEK-IM Apol S- . S ,M -SI DKDE T tQ M. P R V .D. M. SKV I AGIV S F CDSQRAlai P. .R .FS E. AE. I .Q.SEM I. IKSN .S.V..P.L.T.SID. M. .T. .SR S. .V L. .S.ELN.VN.S .NIGE -L. Afac Y R. , I.AMK.SATKSYDWS V. P R LA Y .YRG S. P LENV S P RRAcal S. ...+I MI1 K P.M ..R € . I K.V.QL A.0NIVQ Q.N . R.E NQ Atum -S M-.P V D.M VSKT AGIV S F ..NSQR-L.. . Avar . TI. S. Q-TL.K.TDEF .NRVK .A V. P R I D. I F .K.VSTL. C. .AEETGILLR K. N.D K - Amg .ME. V.-. IDRDE W L. P VVD.D.M SR I .ANVV K. S F NSTR, P R .FSD M. EV.-. L.-D. GEKK YRVK.R. L. P PE DVI CRICDII .TAANLGV. T. S S EKR-L Apyr Avin ..A T.. .SDE.I V .P . R K. .YR N. .I1 .PLG. L. .S .Q.S.-. P R S EV.--.AE-QL.Q.NO.M. NK.KI , .V P RT VD.M. SKE I1 TELD1VQ.S. S EE RLBsub Bfra , --GSQ-QI.D ..E.I.KQ.K.. V-.P RK D. MF .SHS II...ADLGI.K.S.S.. ..NDT.-L.. Bper S I. K DE N V. P 0 M. S. .SRE I1 QANVVD S S NRBbur S L EV.KIEQVT-RSGSSDD. I NKI .I .V P RKV L I .YF. K. .SREAGI L.AAI .HN. .Q.T. S. , .LWN.-L Babe S. S I RDE W L. P V D.M. .A. VSKV .AGVV S F NSQR-L V.--KVA-TL.QN.EPI.NRVK V P.. ..R.. DVMF LSRE 1.Y .LDIVO.S F...DK.-L.. C j e j A.-.YGT Ctra S. R S I I SIKGGENFDI .NRIK .A L-.P. RT 0. .F N. SSA.CI1. A.EKNI .D. K. S.FN.QDR.-L. P R M D-SI.Q.DG1 T.NR I V-.P. D.M.N. ..SKE.NI.. V N1VP.S f ..GOIR-L.. Cper I Cglu S. .K C Q-TL.O.QDA1. NR L V S.P .I D.M. SRESSVI .A. DN61VK.S.S. FT E *L Orad Y R , V.+. QPT.V. NEA.ANAVK1 .T V. .EV. LALV .K.FOQLS D. G.AADMD1 .K SF G0.R-. Eagg I OV..... ., IN I H N Ecar , .T.. ID. .E , V. ,I H .H .N.D.---. Hinf ,, .S T.. S. D. ..II N.. .L. .R.VD .SKA .LE .H ..V.. S. .N Llac S. P R V GSTK.EEGSGDNKTQ1 K I . K I V. P V. LVD.HF SST LNIA E.61. K S FA.ND. . Lpne S. .SI. K. .EI L. V .P . .MT .D .N .SRES.IIN .QLN 5. Q. -.T...I.K.D E.T V-.P L O.......SRE..IIE...NL................ Mfla Mcla S T. I .K.DE.I .K .I V.. P .D.M .SR L. .I1E. .TN L. V. .S .N Mmt .S .T.. I .K.OE.I .K .IV. P .D.H .SRL I I E .TN L. V. S. .N
.... .. ............ . ... . ... ..... . . . .. ....... .............. .... ..................... ..... .... ............................. ...................... ...................... . . . . ........ . .. ... . ... .......... ................ .. .......... .. ..... .... ......... .................. ................... ............ . ................... . .. . ..... ........................ .. ................... . .......... ..... . . .. ......... .. ......... .. .... ...... ......... .. .. .......... ......... . . . ................................ ... ........ .................... . ......... ....... ......... . .... .... .. ........... .
..........
c. 4
230
aHaaaaaaaaU9bbJJlDb--s t r u 1111-11 11l - a G a a a a - b 6 b b b b d _ - ~ b 7 b b b b - - b e b b b b b GT_A P .E.. LKF R R. K K. .P 6 6 -Q iden chl . . 8. ..y.. di vk-.-.---.-id-. i_.-WY--L. . C .M V L J.. di Vk-v.-. 1 y. -.iwY-. p r o f VMFGYGNPETTTGGNALKFYASVRLBIRGNTRIGSSIKBGBEVIGNRTRVKVVKNKVEAPPFKQAEFBI~GEGISREGELfBLGVKLGIVEKSGAlJYSYNGEKGEEVSIGQ
-
.
210
..... ...................... ............................ ................... ..... .................. .........
. . ......... .... ......... .... ....... ... .....,,.. .. ... ... ......... . ....
300
. .. . . .... ... ..... .. ... . ... 303 ... . .. 300 . .. ... . .... ...... .. ...... . ... 3 0 1 ...... ...... ........ .- ... 299 .... .. .... ......... .. ... 311 . .. ... .. ..... .S . 303 . ... ..... ... .... ... .. ... 297 . .. ... ..... . .... .. 305 ........... .. ......... ... ..... .. 299 ... .. . .. .. .... .. ... .... . .. 297 ..... .. ... ... .. 287 ... .......... . ....... . . .. .... . ...... . ... 307 . .. .... . .. . .. 318 ... ......... . .... .. ........ .. ... .. .. 310 ....... .. ... ... .. ... 300 .. ... . ... .. . . 302 .... .... ...... .... ... 302 .. ...... .. ... .... . . .. .. 312 .... .... . ... ...... ............ ... ... .. 313 ........................... ...- ... 301 ...... ...... ................ ........ ........... .. 3 0 1 ....... .. ........ .. ... .... ...-. .. 3 0 1 . ...... . ... .... ... . . ... .- ... 318 .......... .. . .. .. .. .... ...... .--... 300 ............ ... ... ... 300 ... .. .. ...... .... . ..... ...- ... 300 ... .. .. ...... .... .. . . ..... ...- ... 300
...... .K .........M.V.-. .E-TL.D.VD A. .NR .. . . I . . ..V-SP.. .... .D ....K. SRE.S.1.M. .EQGFVR.S.S.FT. E. +.L .. 3 0 2 ...... .K ........ .M.V.-.YE-TL.D.T. A. .NR ........ .C+SP .......D ....K. .ME. S. I.M. .WG. .R.S .. .FT. E. .*L .. 3 0 2 .- ........K ....FS.I ..EV.--KAE-NILNNYEII.NKIK I . .....T-.I ...TTTISL ..NK..DKL.. ... .L.SYEI ...S.V.. ..QN. .- ... 315 . ..-. ....P ..R .......I...V.-KST-NIMLNNDIS. NQI........ L- .P .. .I..TE .IFSK ...KF ..VA ..ALVHDVL Q. K...F ..N.N--.A. 307 ....-................. .ME. .-. T.-NI .D. DA ....KA ........V- .P ..oE ...DL )I. .S. .HRV ..VL ....ATG ....S.SYF .L R. .R... 3 0 2 ....-. .............. .Q.... .-. ..-.I.N.D .... .R .........V-.P ...E V. .D. M. .T. .SRE.D. I . .ASN.NI V. .S.S.F.F N. .R-. .. 303 ... .............S ...... .-.T.S I .K. .E.L.N ..... . I .. .V-.P ..R ... .D ...... .SWE ... I . I .. .NDI.N. S. .... .N.A.-. .. 300 ........................... -...-S ..N...I............V-...........M......T....I.....H..V.........N...... 3 0 1 ....-. ....................... .-S. .N.DE .............. .-...........M ..... .TF ...I.....H. .V .........N ...- ... 3 0 1 ... .-. ..................... -- .T .-.....DE .............. V-SP ..R .........K. .YRT ..I 1....QL G. V ..S ......Q. S.-. .. 299 ... .-. ............ . S .......--. . .-S I .KND€ . I .N ......... .V-SP ..RE.1 .D .......SRQ ..I 1....QA.IVD ........N ...- ... 300 ... .................... .-.T.-. ....DE .............. V- .P ..R .........K. .YL N. . M I ....LHGFV .. S ....A. N. S.-. .. 299 .... ..................... -.T...... DE ......... I....V-SP ..R .........K ..YRN.. I1....SQG.V ..S ....A.Q.N.... 299 -- ...-....R.E.1.M .........M-.P. ...V ..D.M ....VS K l .........AGIV ..S ...F .NSPR-L .. 298 .... ..................... -....- ......................- .-S. ..R.E ...No ........ .M-. P....V . .O. M. ...VSKT .. .I. ....AGIV ..S ...F .NSQR-L .. 3 1 1 ....- ...................... .-S. ..R .E. I.NO ......... M- .P ....V . .D. M. ...VSKT ........ .AGIV ..S ...F .NSQR-L .. 
311 ... .- ...5. ...............-. 1.-. I.DRDE.I.NO .........V- .P..REV ..D .......SKY ........ .AGVV A. S . . . . .GD .R... 312 ... .................... .-. T .-.I .DRD..I. NT.K .......V- .P ..REV ..D .M .....SKT ........ .AGVV ..S .S . . .GD .-R ... 298 ... .................I.. .-. ..-S1.DK.E. I. .P.K .......V-SP .. .T.D.D. M. .5. .SKE ..I1.... .LEI ...S. S.F .NKIR---. .. 300 Smr ... .-. ....................... .-.I ...DE .............. .-.......... .M ......SR ........ .H.M .........N .. .... 3 0 1 Sfle ....-............................................... -............................................. ... 301 ....ENDIVD .S . . . . .N..R - - - M .. 299 Saur ... .-. ....P ..R .....S ....EV .-. AE-QL.Q.QE1 ..NR.KI. .....V- .P ..RV ..VD.M. .Q. .SKE .. .I ..... V- .P...E. VVE.M. ....SKT ...LKIASDLD1.K. .......D. ... 317 Spne ... .-. ....P ..R ...........V .GNTO .KGTGWK.TN. .K . .K I. - E-TL.D.TDA. .NR ......... V- .P .......D. .. .Q ..SRE .G. 1.H ..EHGFVR .......E .DL.. 300 Samb ... .- S. ......R ............... SliV ....-S....... R ............... - E-TL.D.TDA ..NR .........V-.P .......D ....P..SRE.G.I.M..ENGFVR...... . E . D W L .. 300 -- E-TL.D.TDA. .NR .........V- .P .......D. ...Q. .SRE.G. 1.M ..EHGFVR.......E .DQL.. 3 0 0 Sven ... .- S . ......R ............... -- QTLK.GS.GEF.IRAK ...A...V-.P..RI.. .D.IF.K ..SRV.CML..AEQTGV.TRK ......E . D l t - - . A. 304 Sy7D ISY .- ...V ....T ............... - QTLK.GT.-EY. TRAK .......V- .P ..R I ...0. .F . K ..STL .C ....AEETGV. LGK......N .D... 304 Sy79 .N.-S. ......a ............... ...KD .I.NVIS ..I....V- .P ...T .QT Y. I..K. .DRE Y. .FNIA.N.GIVDRK. S ..Y .TTL.GEEVS L. . 307 Tmar ... .- S. ...... L ......TM.MEV.-.GE-PI .. 300 Taqu ..Y .-. ....P..R .....S .....V .---KS.QPI. V.NEA. .IKVK .......L- .P ..RE ..LE .YF .R .LDPVMD . .NV A. AAGV .....S .F ..GEHR-L .. 300 T t h e .TY .-. ....P ..R ........... V .- - K S . Q P I .V. NEA ..VKV ........ L- .P ..RE ..LE.YF. R .LDPVAD ..NV A. 
AAGV .....S .F ..GELR-L -.-.1.KSDE ...ND .........V-.P ..RE ...A.Y .....SRLS ........FDIV.. S ......P.HR-... 300 T f e r ..Y.- S ....................... Vang ....-. .................... .-.T.-SI ...DE A. .N ...I.......-...... .DT .....Q. F. RE., ...... .H ..V .........N. D... 299 .-.T.-.I ....E .. .N ...I...... .-.....E.N T. .M. .Q.F.RE ...I.....H.M V. .S......N.D.-. .. 299 Vcho ... .-. .................... . W E A. .V ......... GD .R... 3 0 1 X o r y . .MPM S. .V . ................... .-.I .K.DEI1 .NQ.KI ......L- .P ....VVTE .......SRE .. .I D .DV ...............-. .................I N .........L ............Y .D.-. .. 3 0 1 Ypes ... .-. ...........................
...
.-*S Mlep .-S Mtub MWC .I. Mpul V Mxal Hxa2 .-S Ngon Pmir Pvul Paer Pcep Pflu .-S Pput -S Rleg -S S. RTIE~ Rpha S. S. Rcap .-. Rsph .-S Rpro
...
.-
FIG. 4 . (continued)
stru iden chl e m prof
.
310
.
.
320
.
.
330
.
.
340
.
.
350
.
~aIaaaaaaa~aJaaaaaaaaaaaaa????????????????????????
stru Iden 1 chml V chm2 . 1 GRENAKQYLKENPELAEEIEKKIREKLGLSSSAAASETBEBSEEEEEAEEBNEEEAPPVPAPBBLEVEVEKAAAAKSprof G
.
ECOl GKANATAWLKDNPETAKEIEKKVRELLLSNPNSTPDFSVDDSEGVAETNEDF Apol Alai Afac Acal Atum Avar Arnag Apyr Avln Bsub Bfra Bper Bbur
.RE..KQF.R.H..M.AD..RR...QAGVVAEAMLVGPDE.GAEH
.RO. .KQY .E.K. .LLN. L.. . ..THFKLTK ....N.VIRY.EE. , . .RQY .RVK. .FPGIF.QGI .GAMAAPHPLGFGERR.VQC€S6.PYaNG .PIST.. .AVI. .a. .TKASDQTAAHDETE.EPDLLES .RE. .KTF .R.. .D. .N.. .LAL.QNAGL1ABRFLQNGGP .AGEGDDGSDEG ..RE.RD.. .KIQFK Y ..R..EEK..An.A. .F. EQ.KW. ..K. DKGAVVSANSVAKAN.EDE. DVDLDEEE ..G A I .QNAGLISEAIAAVPDL.GTP. .. .REQ.KKY. LEH. ALE. ..R. ...VS6LVRPD.ENSVGEK.. .....AKF.E.. ..V.AAV. .SI .W. .AA.A.ARPAALA.EPAD.DLDY
.RE. .KQF.. E .KDIMLM.QEQI. .HYGLDN. GVVWPAEETMEL. FE. .RDA.KQCIA.. .L.E.L.GLIF.K. REHK .D. VREY. .EHK.M. I.. N. NQG1VSRAATFPASEAED.E RESVIEY. SKEV. L. MLD. RL, KIIFN. FDQEN. NFIEFK.DES. BabO .RE .Kay.. , .V. R.. .TTL.QNAGLIAEQFL.DGGPEEDAAWWM C j e j .RE.SK.F..E...I.D..T.AIQNSMGIEWISGSEDDEGE. Ctra .REAVREE. .R.K.LFH.L. RRIY.SVQASQAPA4ACVDSE.RE.. .AAK Cper .RE..KQY..E..AV.LO..HOI..KYSLPLAKAVESTSVEENTEESVES CglU .EKVRLS.. E.. .LTD. L.D. IFKK. GVGKYAAASDELT.DPVELVP. V. .DDEADTEADAED Drad .EKTI .YIAER. .MEQ. .RDR.MAAIRAGNAGEAPALAPAPAAPEAAEA SCNY E KV A. LD L .DM. .G-TGELSVATTA.DADDM.TSEEF Eagg Ecar .CNF.. E.SLVKATKNFNGC .SMK. .NE.I.KSD.L.ARL.AE.VA. .EQALMDIEQ. .NNT.SES. .E Hinf Llac AEK.KNY.. EHODVFD. .DH. ,.AAHGLLDD. EVKTEEETTAFKN Lpne .E. VRLY. .E. .QV. A.L. W I .TE. .EKKL.VLAS.SE.LFETIDD Mfla .D. .REF.REH.. I . N. .OA. I, .HSNLANAANTTAPDEE.DE Mcl a .E KEF RE.. A1 .A. .A. I.DNSNVLAD.MTAARSE.D Mmet .E. .KEF.RE.. AX A.. .A. I .DNSNVLAD.WTAARSE .D
.
. .
I
. ...
..
. . .... .. .. . .. . .. . ... . . . . .. . . . .
.
352 348 33 1 354 349 363 358 344 348 349 347 318 352 365 360 343 352 352 376 363 354 325 354 365 348 344 342 342
Key to FIG. 4.

Row labels:
stru   structural elements from the RecA crystal structure
iden   identical residues in the alignment
chm1   chemically conserved residues using classification 1
chm2   chemically conserved residues using classification 2
prof   weighted profile consensus of 64 bacterial RecA proteins

Sequence abbreviations:
Ecol   Escherichia coli
Apol   Acetobacter polyoxogenes
Alai   Acholeplasma laidlawii
Afac   Acidiphilium facilis
Acal   Acinetobacter calcoaceticus
Atum   Agrobacterium tumefaciens
Avar   Anabaena variabilis
Amag   Aquaspirillum magnetotacticum
Apyr   Aquifex pyrophilus
Avin   Azotobacter vinelandii
Bsub   Bacillus subtilis
Bfra   Bacteroides fragilis
Bper   Bordetella pertussis
Bbur   Borrelia burgdorferi
Babo   Brucella abortus
Cjej   Campylobacter jejuni
Ctra   Chlamydia trachomatis
Cper   Clostridium perfringens
Cglu   Corynebacterium glutamicum
Drad   Deinococcus radiodurans
Eagg   Enterobacter agglomerans
Ecar   Erwinia carotovora
Hinf   Haemophilus influenzae
Llac   Lactococcus lactis
Lpne   Legionella pneumophila
Mfla   Methylobacillus flagellatum
Mcla   Methylomonas clara
Mmet   Methylophilus methylotrophus
Mlep   Mycobacterium leprae
Mtub   Mycobacterium tuberculosis
Mmyc   Mycoplasma mycoides
Mpul   Mycoplasma pulmonis
Mxa1   Myxococcus xanthus RecA1
Mxa2   Myxococcus xanthus RecA2
Ngon   Neisseria gonorrhoeae
Pmir   Proteus mirabilis
Pvul   Proteus vulgaris
Paer   Pseudomonas aeruginosa
Pcep   Pseudomonas cepacia
Pflu   Pseudomonas fluorescens
Pput   Pseudomonas putida
Rleg   Rhizobium leguminosarum
Rmel   Rhizobium meliloti
Rpha   Rhizobium phaseoli
Rcap   Rhodobacter capsulatus
Rsph   Rhodobacter sphaeroides
Rpro   Rickettsia prowazekii
Smar   Serratia marcescens
Sfle   Shigella flexneri
Saur   Staphylococcus aureus
Spne   Streptococcus pneumoniae
Samb   Streptomyces ambofaciens
Sliv   Streptomyces lividans
Sven   Streptomyces venezuelae
Sy70   Synechococcus sp. PCC7002
Sy79   Synechococcus sp. PCC7942
Tmar   Thermotoga maritima
Taqu   Thermus aquaticus
Tthe   Thermus thermophilus
Tfer   Thiobacillus ferrooxidans
Vang   Vibrio anguillarum
Vcho   Vibrio cholerae
Xory   Xanthomonas oryzae
Ypes   Yersinia pestis

Sequence sources, in printed order (accession numbers and personal communications; the pairing with individual sequences is not recoverable in this copy, and several entries are printed as they appear in the scan):
Ecol to Mmet: 501672, D13183, ME1465, D16538, L26100, M36776, M296BO, X17371, L23135, 596898, X52132, M63029, X53457, U23457, L00679, U03121, U15281, U16739, Johnston Sloan Rood 12-21-94, X77384, U14965, U01876, 219517, X55554, L07529, H881D6, X55453, M35325, X59514, Emerson 12-17-92
Mlep to Ypes: X73822 (* protein intron), X58485 (* protein intron), L22073, L22074, L40367, L40368, X17374, X64842, X14870, X55555, X05691, X52261, D90120, M96558, L12684, X59956, X59957, X62479, X82183, X72705, U01959, M22935, X55553, L25893, M94061, 217307, 230324, X76076, U04837, M29495, Coleman 9-20-94, L23425, L20095, L20680, 1317392, U03058, M26933, M80525, U10162, X71969, X61384, Mongkolsuk 2-4-94, X75336

FIG. 4. (continued)
ALBERTO I. ROCA AND MICHAEL M. COX
Figure 5 shows the invariant and chemically conserved residues mapped onto the crystal structure of the RecA protein with ADP bound (64).The conserved residues cluster in three areas corresponding to the major central domain and flanking two smaller subdomains (64). In Fig. 5, the N-terminal subdomain corresponds to the gray residues near the top left comer of monomer a, whereas the C-terminal subdomain is in the bottom left comer of the same monomer. Complementary information is presented in Fig. 6. Figure 6A outlines the major regions of secondary structure in RecA, along with two loop regions (L1 and L2) that are undefined in the crystal structure. A similarity curve that highlights regions of sequence conservation among bacterial RecA proteins is presented in Fig. 6B. Figure 6C shows the trends in the charge conservation in the RecA family and may provide insight into those regions of the crystal structure that are disordered and hence hinder the calculation of electrostatic potential surfaces. As expected, wherever there is low sequence conservation (Fig. 6B) there is high charge variability, e.g., the carboxyl-terminus region (Fig. 6C). Interestingly, there are a few places where there is strong sequence conservationyet also some charge variability, such as around position 290. Finally, Fig. 6D is a difference plot showing the change in molecular surface area per residue in the RecA monomer crystal structure (64)in the presence of different ligands. These values are defined by the calculated interactions of residues with an imaginary spherical probe having a radius of 1 A. These calculations can provide a general sense of the ligand interactions with respect to the RecA protein sequence. If a Iigand blocks or impedes the probe’s interaction with a given amino acid residue, there is a decrease in the molecular surface area that registers as a peak in Fig. 6D. There is a limitation to this approach. 
These calculations can overestimate interactions, i.e., a peak does not necessarily represent a van der Wads contact. For example, Tyr-264 does not make substantial contacts with ADP in the crystid s t r u c ture. However, the cavity between Tyr-264 and ADP cannot accommodate the probe and therefore a signal is generated in the plot. Nevertheless, most of the peaks in Fig. 4D reflect identifiable interactions seen in the crystal structure. Because key regions in the interior of the RecA filament are disordered (e.g., loops L1 and L2), it is impossible to calculate an electrostatic potential surface (81)in much of the regon where DNA binding is known to occur. Keeping this caveat in mind, it was still interesting to visualize the distribution of positive and negative surfaces of the RecA filament as seen in the crystal structure. Figure 7 shows that there is an asymmetric distribution of charge with respect to the long axis of the filament. Namely, the 5’ end of the filament with respect to bound single-stranded DNA (ssDNA) (smooth face) is predominantly positive, whereas the 3’ end (lobes) is predominantly neg-
a
C
FIG.5 . Conserved residues in the bacterial alignment mapped onto the RecA crystal structure (Brookhavenenby 2REB). An l&subunit right-handed RecA filament (left) has three symmetry-related subunits in gray. Each of these monomers (labeled a, b, and c) is enlarged on the right. If one is looking down from the top of the filament, monomer b is rotated 60" in a clockwise direction from monomer a. Monomer c is rotated 180" from monomer a. Hence, monomer a is the inside face of RecA with respect to the filament axis. The rodlike protrusion at the top left part of monomer a is a-helix A (see Figs. 2 , 4 , and 6 4 . Note that = 15%of the RecA protein is disordered in the crystal structure and therefore is not visible. The identical residues in the alignment are in dark gray, the chemically conserved residues are in light gray, and the ADP molecule is black. Monomers a and c show all of the identical and conserved residues that are visible on the protein faces shown. Monomer b shows only the conserved residues in class I (ADP binding) (4). Conserved residues in this figure are based on an analysis of 58 bacterial RecA sequences, excluding Bbur, Ctra, Cper, Rcap, Samb, and Sy79, which were not availablewhen the figure was generated. This representation of the RecA proteins was generated with the help of David Goodsell (Scripps Research Institute) using a black and white rendering technique that he developed (421).The fdament is oriented with the end nearest the S end of a bound single DNA strand oriented toward the top.
A
aAB0
B
1
C 2 0 3 E 4
L1 F
5 L 2 G8
8
H910I
250
300
7
J
0.6 0.5
0.4
.E &
.- 0.3 d
izE
0.2 0.1
0
D
-
N
s
5g
6o
50 40
30
0
I3
20
-20 0
50
100
150 200 E. coli position
350
FIG.6. Plots of similarity, charge density, and molecular surface with respect to the bacterial RecA alignment. (A) Secondary structural elements from the RecA crystal structure (64) are schematidy shown with a helices in gray and p strands in black. The L1 and L2 loops are de-
RecA PROTEIN
153
ative. Support for this calculation comes from the measurement of a large permanent dipole in a RecA-dsDNA complex (82).The positively charged amino terminus (see Fig. 6C) contributes to the basic surface seen in Fig. 7. The primary role of the amino terminus is in forming part of the
picted as horizontal lines. (B) A similarity plot showing the conserved regions in the bacterial RecA (Fig. 4)and the ReckRadS 1-likealignments (Fig.9B). Interesting regions of the RecA protein are marked as follows: (a) MAW motif; @) Walker A box (€-loop);(c) Glu-96; (dj Walker B box; (e)Asn-193; (f) Gly-211and Gly-212;(8)RecA signature; (h)proposed DNA-bindingwing; (i) Tyr-264; (jj C-terminal end of Rad51 and Dnicl. The similarity plots were generated using the PLOTSIMILARIlYprogram of GCG. For the unweighted alignment of bacterial RecA proteins in Fig. 4, a window size of 10 residues was used. This is approximately the average length of secondary structural elements in the RecA crystal shucture. The BLOSUM62 table was used to calculate sequence similarities.Values were scaled from 0 to 1,where 1is the maximum possible score. Only that part of the curve corresponding to the E. coli RecA sequence is shown, i.e., 350 residues out of the approximately 400 residues (columns) in the multiple sequence alignment. The PLOTSIMILARITY program is not sensitive to weights such that the entire ReckRadSl-like alignment could not be used to generate a similarity curve. To alleviate the problem caused by the overrepresentation of bacterial RecA sequences, a subset of the alignment, using only the following sequences, was used: Ecol, RadS1, and Dmcl of S. cereuisiue, and Sms, Rad57, Rad55, Mei3, Rec2, and UvsX. The same window size, symbol comparison table, and scaling were used as described above. Also, only the values corresponding to the E. coli RecA sequence are shown, is., the values corresponding to the gap insertions in the E. coli RecA protein (Fig. 9B) were omitted in the plot. (C) Charge density plot of the RecA alignment. The average of all 64 bacterial sequences is graphed. The most basic (Afac, full length calculated pl = 9.5) and the most acidic (Cglu, pI = 4.9)RecA proteins are also portrayed. 
Instead of plotting the data from the other 62 RecA sequences, the variance in the data is shown below the charge density curves. As before, only those parts of the curve corresponding to the E. coli RecA protein are shown. The Protean program (Version 3.04, DNAStar, Inc.) was used to calculate charge. This method sums charge at pH 7 over a defined window size using standard pK tables (422). A window size of 19 residues was used. This larger window size was used to reduce noise in the data. (D) Molecular surface area changes due to ADP, adjacent subunits, or a neighboring filament interacting with a RecA monomer in the crystal structure. A positive peak reflects a decrease in molecular surface area, and therefore potential contacts between the RecA monomer and the respective ligand. Note that there is no signal for the Walker B box, because the γ-phosphate of ATP is not present in the RecA crystal structure (65). The solvent-accessible molecular surface (423) was calculated for one RecA monomer in the crystal structure (65) using the DMS program of the MIDAS 2.0 system (424). In short, an imaginary spherical probe is rolled over the entire surface of a macromolecule, thereby generating a molecular surface. This molecular surface is smoother than a van der Waals surface because small crevices cannot accommodate the rolling probe. The smallest probe possible (radius = 1 Å) was used to generate a surface revealing the most features. The default values for atomic radii were used. Values for the surface area (in Å²) for each nondisordered residue were obtained. Calculations were repeated in the presence of each of the following interacting ligands: ADP bound by the RecA monomer, subunits abutting the RecA monomer within the filament, and monomers in a neighboring filament making interfilament contacts. The decrease in molecular surface area due to the presence of the ligand is plotted.
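The windowed charge calculation described in (C) can be sketched as follows. This is a minimal illustration, not the Protean implementation: the side-chain pK values are common textbook approximations chosen here as an assumption, chain termini are ignored, and the function names are hypothetical.

```python
# Approximate side-chain pK values. These are textbook figures chosen for
# illustration (an assumption), not the table used by the Protean program.
PK_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PK_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def residue_charge(residue: str, ph: float = 7.0) -> float:
    """Henderson-Hasselbalch estimate of one side chain's charge at the given pH."""
    if residue in PK_POSITIVE:
        # Fraction of the basic group that is protonated (charge +1).
        return 1.0 / (1.0 + 10 ** (ph - PK_POSITIVE[residue]))
    if residue in PK_NEGATIVE:
        # Fraction of the acidic group that is deprotonated (charge -1).
        return -1.0 / (1.0 + 10 ** (PK_NEGATIVE[residue] - ph))
    return 0.0


def windowed_charge(sequence: str, window: int = 19, ph: float = 7.0) -> list[float]:
    """Sum side-chain charge over every full window; one value per window start."""
    charges = [residue_charge(r, ph) for r in sequence.upper()]
    return [sum(charges[i:i + window]) for i in range(len(charges) - window + 1)]
```

Calling `windowed_charge(sequence)` with the default window of 19 residues returns one summed charge per window position; a sequence shorter than the window yields an empty list.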
FIG. 7. Electrostatic potential surface of an 18-monomer RecA filament calculated using the program GRASP with default settings (425). The gradient of electrostatic potential values ranges from -4 to +4 kT and is represented from black to white, respectively. The orientation of the filament is the same as that in Fig. 5. On the right are two views looking down the long axis of the filament from the 5' end (top) and 3' end (bottom). A color version of this figure is available using the following URL: http://atahualpa.biochem.wisc.edu/.
monomer-monomer interface within a RecA filament (see Fig. 6D and 83), but part of it is exposed in the filament groove. The overall dipole is a significant feature of the helical entrance to the central DNA binding pocket of the RecA filament. It is interesting to speculate that the dipole may play an indirect role in DNA binding or in the mechanism by which secondary DNA molecules enter and exit a nucleoprotein filament formed on ssDNA. Conserved regions may be best described in the context of the crystal structure. Four classes of conserved residues have been defined (4). Class I residues are involved in making ADP contacts in the crystal structure (Fig. 6D), and possibly in catalyzing ATP hydrolysis and mediating the conformational switch induced by ATP hydrolysis (4). Class II residues are
involved in the hydrophobic packing of the RecA monomer. Many of the conserved but not necessarily invariant residues are evident in the central domain of monomers a and c in Fig. 5. Class III residues lie at the subunit-subunit interface (Fig. 6D). The Class IV residues constitute a “structurally unusual” part of the RecA protein (4), described in more detail below. No conserved residues have been uniquely associated with DNA binding. The amino-terminal subdomain has conserved residues surrounding β-strand 0 (Figs. 4 and 6B) that fall into class III. The carboxyl-terminal subdomain has conserved residues near β-strands 9 and 10 and the beginning of α-helix I. Figure 6D shows that this region is involved in interfilament contacts in the crystal structure (64). Most of the conserved residues are in the major central domain. This domain comprises the inside face of the filament, which is where important RecA activities such as DNA strand exchange are thought to take place. Monomer a in Fig. 5 shows the inside face of the RecA protein. Many of the conserved residues on the inside face are involved in ADP binding, and monomer b in Fig. 5 shows only the class I residues. They surround the bound ADP molecule (shown in black). Monomer c in Fig. 5 shows that the outside face of the RecA monomer also has conserved residues. There are other conserved residues, but they are not visible in these illustrations because they are buried in the crystal structure. Additional information about many of the conserved residues can be found in several recent reviews (70, 84, 85). In some cases, sequence similarities are found in clusters (Fig. 6B) that correspond to known motifs such as the Walker A box (71), now called the P-loop motif (86).
In the RecA alignment, the motif is 66-GpESsGKT-73 (invariant residues are capitalized), which most resembles the motifs found in Ras p21 proteins, elongation factors, myosins, and thymidine kinases (GxxxxGK[ST], where x represents any residue and the residues in brackets refer to mutually exclusive choices) (86). This loop binds the phosphates of nucleotides. Another ATP-binding motif that is less well defined among different protein families is the Walker B box (71). The consensus sequence is four hydrophobic residues followed by an aspartate, which in the crystal structures of Ras p21 (72) and EF-Tu (73) is a conserved aspartate at the end of a β-strand. In the RecA alignment this occurs at β-strand 4, whose sequence is 140-vivvD-144. It has been proposed that the conserved aspartate interacts with the γ-phosphate of ATP, although this moiety is not present in the crystal structure of the RecA protein (65). The PROSITE data base includes a unique RecA signature sequence based on a smaller RecA alignment (87). The signature occurs at 214-ALKFyasvR-222, and the new signature based on the expanded alignment in Fig. 4 is 214-A-L-K-F-[FY]-[AS]-[DST]-[ILMQV]-R-222. From the crystal structure, this site makes contacts with an adjacent monomer in the RecA filament (64), as seen in Fig. 6D. Another substantial region of conservation is found to the N-terminal side of the Walker A box. This region, spanning residues 42-65, is described in more detail below.
Regions of substantial divergence in the alignment occur in loops (residues 30-40 and 230-240), at the ends of secondary structural elements (residues 85-90, 110-113, 130-135, 180-190, 280-285, and 295-300), and at the amino and carboxyl termini (Fig. 6B). Two of these divergent regions (residues 110-113 and 130-135) correspond to parts of the monomer-monomer interface in the RecA filament (64). Four of these areas (residues 30-40, 180-190, 280-285, and 295-300) form part of the interfilament contact regions seen in the crystal structure (Fig. 6D) (64, 88). Divergence at these sites indicates either that the interfilament contacts are serendipitous interactions unique to the crystal, or that there is species specialization in forming such interfaces. Unique insertions relative to the E. coli RecA protein occur in loops such as those made up of residues 295-300 and 195-209 (loop L2) as well as at the end of β-strand 6. Especially notable insertions are the unrelated protein introns [called “inteins” (89)] of M. tuberculosis and M. leprae found in the turn between β-strands 7 and 8 and in loop L2, respectively (Fig. 4) (90, 91). Interestingly, it has been found that an unprocessed Mycobacterium tuberculosis RecA protein (containing a 440-residue insertion) will negatively complement processed, i.e., functional, M. tuberculosis RecA protein. This suggests that the unprocessed M. tuberculosis RecA protein folds properly so that it can form mixed filaments with wild-type RecA protein and then inactivate RecA function (90, 92).
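The motif definitions above translate directly into pattern searches. The following is a minimal sketch: the P-loop and RecA signature patterns are transcribed from the text, whereas the set of residues counted as "hydrophobic" for the Walker B box is an assumption made for this example, and `find_motifs` is a hypothetical helper name.

```python
import re

# P-loop and RecA signature patterns follow the text. The hydrophobic residue
# set used for the Walker B box is an assumption made for this sketch.
P_LOOP = re.compile(r"G.{4}GK[ST]")
WALKER_B = re.compile(r"[AVILMFWC]{4}D")
RECA_SIGNATURE = re.compile(r"ALKF[FY][AS][DST][ILMQV]R")


def find_motifs(sequence: str) -> dict:
    """Return {motif name: (1-based start, matched text)} for the first hit of each motif."""
    patterns = {
        "P-loop": P_LOOP,
        "Walker B": WALKER_B,
        "RecA signature": RECA_SIGNATURE,
    }
    hits = {}
    for name, pattern in patterns.items():
        match = pattern.search(sequence.upper())
        if match:
            hits[name] = (match.start() + 1, match.group())
    return hits
```

Applied to a short fragment such as `"IYGPESSGKTTL"`, the search reports the P-loop match together with its position within the fragment. Positions are relative to the string supplied, so they correspond to E. coli RecA numbering only if the full-length E. coli sequence is used.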
The first two residues of the amino terminus and the last 24 residues of the carboxyl terminus are disordered in the crystal structure and presumably are solvent-exposed. The amino terminus varies in length and is not well conserved (Fig. 6B), although almost all of the amino termini of the bacterial proteins are long enough to span α-helix A of the E. coli RecA crystal structure. The sequence of the amino terminus of a mature RecA protein is not always certain due to the presence of alternative start codons at the 5' end of recA genes (93). The carboxyl terminus is also poorly conserved in the alignment (Fig. 6B). Though the sequences may not be similar, there is a general pattern of negatively charged residues (Fig. 6C), with some interesting exceptions. Figure 8 shows a plot of the length of the carboxyl termini versus their calculated isoelectric point. In general, the carboxyl termini average 50 residues in length and have an acidic character (pI = 5).
[Figure 8: scatter plot of carboxyl-terminal length (vertical axis, 20 to 80 residues) against calculated pI. Labeled points include Sy79, Mxa2, Xory, Afac, Sliv, Samb, Tfer, Mxa1, Alai, and Ecar.]
FIG. 8. Distribution of bacterial RecA carboxyl-terminal length versus calculated pI. For this analysis, the carboxyl terminus of the RecA protein is defined as starting at the invariant glycine, which occurs at position 301. Isoelectric points (pI) were calculated using the GCG ISOELECTRIC program with default parameters. The average carboxyl-terminal length and calculated pI are 49 ± 9.5 and 4.9 ± 1.9, respectively. The pI values reported are for comparison purposes only and are not intended to reflect an experimentally determined pI for an equivalent polypeptide.
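The summary statistics in the legend can be reproduced along the following lines. This is a sketch under stated assumptions: the position of the invariant glycine within each individual sequence must be taken from the alignment (it is 301 only in E. coli numbering), the use of the sample standard deviation is an assumption because the legend does not say which form was used, and the function names are hypothetical.

```python
from statistics import mean, stdev


def carboxyl_terminus(sequence: str, glycine_position: int) -> str:
    """Return the segment starting at the invariant glycine (1-based position)."""
    start = glycine_position - 1
    if sequence[start] != "G":
        raise ValueError("expected the invariant glycine at the given position")
    return sequence[start:]


def summarize(values):
    """Mean and sample standard deviation (the choice of sample SD is an assumption)."""
    return mean(values), stdev(values)
```

For a set of RecA sequences, collecting `len(carboxyl_terminus(seq, pos))` for each sequence and passing the list to `summarize` gives a length distribution of the kind summarized in the legend.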
2. RecA ALIGNMENTS INCLUDING BACTERIOPHAGE AND EUKARYOTIC RecA HOMOLOGS
An alignment of bacterial RecA proteins and known structural (3, 4) and functional (76, 94) homologs is presented in Fig. 9A. This expanded alignment focuses attention on a subset of the residues conserved in the bacterial alignment. In general, these residues are involved in ATP interactions, monomer-monomer contacts, and the hydrophobic core of RecA (64). Note that the summary lines represent conserved residues from all of the RecA/Rad51-like sequences. Bacterial RecA protein sequences other than E. coli RecA are not shown in the listing of sequences, to conserve space.
The bacterial RecA alignment was used as a basis to search for additional RecA homologs in the data base. As described in the Fig. 9 legend, proteins exhibiting sequence similarities confined to the ATP binding motifs were discarded in the analysis. Proteins found with the Profile (95) and BLAST (96) searches are aligned with the bacterial RecA and eukaryotic Rad51 proteins in Fig. 9B. The similarity curve for this alignment is shown
FIG. 9. Alignment of 64 bacterial RecA proteins with RecA/Rad51-like homologs. (A) Bacterial RecA aligned with Dmc1 and Rad51 homologs. (B) RecA, Dmc1, and Rad51 proteins aligned with RecA/Rad51-like candidate homologs. See Fig. 4 legend for explanation of alignment format. Due to the larger number of gaps in these alignments, spaces represent gaps for clarity. The dash (-) identifies gaps in the summary lines with respect to the E. coli RecA sequence. The locations of the MAW motif and Walker A and B boxes are indicated. Note that the other 63 bacterial RecA proteins are not shown due to space constraints. However, the summary lines were calculated using those sequences. Therefore, some residues do not appear in the summary lines, e.g., Leu-51 is not conserved due to Thr (Apyr, Tmar) and Met (Sy70) substitutions in sequences not shown. Also, nowhere in the Rec2 insertion could a reasonable catalytic glutamate (corresponding to E. coli position 96) be found that is in the same context as in the other Rad51-like proteins. A weighted RecA profile was first generated with the PROFILEMAKE program using the BLOSUM62 table. The RecA profile was used to search the SwissProt data base (Release 30) for RecA-like sequences using the PROFILESEARCH program and the following optimal parameters: no averaging, gap penalty = 17, and gap length weight penalty = 1.7.
The strength of the structural similarity is reported as a Z score (standard normal deviate, where unrelated sequences have a mean of 0 and a standard deviation of 1). Z scores above 6 are an indication of significant structural similarity (112). In an analysis of small structural features such as the helix-turn-helix motif, it was found that significant Z scores were around 10 (426). Using the RecA profile, it was found that full-length “normal” bacterial RecA sequences, i.e., sequences that are neither partial nor interrupted by a protein intron, had standardized Z scores greater than 120. The Dmc1 and Rad51 homologs had Z scores around 40. Except for Rad55, all the search hits with Z scores between 6 and 9.5 were ATP-binding proteins with the highly conserved A-box (P-loop) motif (86). Those sequences were ignored in the following analysis. Search hits with Z scores above 9.5 (except Rad55, Z = 8.2) were used to generate an alignment of RecA/Rad51-like proteins. Many of these proteins were known to have similarity to RecA (5, 78, 109). Pairwise comparisons of the profile consensus and profile search data base hits were generated with the PROFILEGAP program. BLAST searches (96) were also performed with the profile consensus and various individual RecA or Rad51 sequences, but this did not yield any new hits. It is difficult to automatically align sequences with low sequence identity (79). For example, two different alignments of bacteriophage T4 UvsX protein with the E. coli RecA have been reported (70, 427). Therefore, to generate the alignments of the Dmc1 and Rad51 proteins, these sequences were first automatically aligned using PILEUP. This alignment was manually merged with the previously generated RecA alignment. The structural alignment of Dmc1 with the RecA crystal structure was used as a guide during this process (4). The alignment in A differs slightly from the structural alignment in the L2 loop region.
This region is disordered in the crystal structure and poorly conserved among RecA/Rad51-like proteins. Each of the remaining RecA/Rad51-like sequences was manually added using the BLAST and the pairwise PROFILEGAP alignments as guides. The Rec2 protein (78) has a very large insertion between the Walker A and B boxes that is not found in the other RecA/Rad51-like proteins. This insertion is not shown here for the sake of clarity.
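The Z-score screening described in this legend can be illustrated with a short sketch. It uses the generic definition of a standard score and the two thresholds quoted in the text (6 and 9.5). It is not the PROFILESEARCH normalization, whose details are not given here. The function names and category labels are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(raw_scores, background):
    """Express raw scores in standard deviations above the background mean."""
    mu = mean(background)
    sigma = pstdev(background)
    return [(score - mu) / sigma for score in raw_scores]


def classify_hit(z, keep_threshold=9.5, significance_threshold=6.0):
    """Sort a hit into the three bands implied by the thresholds in the text."""
    if z > keep_threshold:
        return "retained"
    if z > significance_threshold:
        return "significant, below cutoff"
    return "not significant"
```

With these thresholds a hit at Z = 40 is retained, whereas a hit at Z = 8.2 falls in the middle band. Rad55 was kept despite its score of 8.2, which reflects a decision outside the numerical cutoff and is not modeled here.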
[Fig. 9, panels A and B: multiple sequence alignment with the MAW motif, A box, and B box marked above the sequences. Summary rows: stru, secondary structural elements from the RecA crystal structure; iden, identical residues in the alignment; chm1 and chm2, chemically conserved residues using classifications 1 and 2.]

Sequences in panel A (abbreviation, organism, protein, accession number):
Ecol   Escherichia coli            RecA    J01672
ScDM   Saccharomyces cerevisiae    Dmc1    M87549
Sc51   Saccharomyces cerevisiae    Rad51   D10023
Sp51   Schizosaccharomyces pombe   Rad51   D13805
Ll51   Lilium longiflorum          Rad51   D21821
Gg51   Gallus gallus               Rad51   L09655
Mm51   Mus musculus                Rad51   D13803
Hs51   Homo sapiens                Rad51   D14134

Additional sequences in panel B (abbreviation, organism, protein, accession number):
AtRe   Arabidopsis thaliana        RecA-like   L15229
EcSm   Escherichia coli            Sms         X63155
Sc57   Saccharomyces cerevisiae    Rad57       M65061
Sc55   Saccharomyces cerevisiae    Rad55       U01144
Mei3   Neurospora crassa           Mei3        D29638
Rec2   Ustilago maydis             Rec2        L18882
RadA   Sulfolobus solfataricus     RadA        U45310
UvsX   T4 bacteriophage            UvsX        K03099
in Fig. 6B. Four conserved structural elements are highlighted by the similarity curve for this alignment. Three of these areas are involved in ATP interactions: the A box, Glu-96, and the B box. The fourth region involves residues 42-65, discussed further below. Proteins identified in the search for RecA homologs include a number of proteins previously shown to have some similarity to RecA and to have some role in recombination processes. One is the UvsX protein of bacteriophage T4, a protein with a well-studied structural (4, 97, 98) and functional (99-101) relationship to RecA. The RecA-like protein of Arabidopsis thaliana chloroplasts is very similar to bacterial RecA proteins (over 50% identity) (102). This reflects the widely held view that chloroplasts originated from cyanobacterial ancestors (103). Mei3 and Rec2 are believed to be Rad51 homologs from Neurospora crassa (104) and Ustilago maydis (78), respectively. In Saccharomyces cerevisiae, Rad55 and Rad57 are in the same genetic pathway for radiation repair as Rad51 (105, 106). Both proteins act downstream of Rad51 in the RAD50-RAD57 epistasis group (107). Recently, it has been shown that Rad51 and Dmc1 colocalize in prophase nuclei, and it has been suggested that both proteins may interact to form components of recombination nodules (108). Last, mutations in the sms gene of E. coli result in sensitivity to the DNA-damaging agent methylmethane sulfonate (109). We also analyzed the RadA protein sequence of Sulfolobus solfataricus (110), a member of the Archaea domain (111). This protein is the first demonstration of a RecA homolog in this domain (110), whereas homologs have been characterized in both the Bacteria and Eucarya domains. The major ATP interaction motifs of RecA protein are present in the RadA protein. The only region with substantial sequence conservation in Fig. 9B that is not clearly involved with ATP binding and/or hydrolysis occurs on the N-terminal side of the Walker A box.
This region is largely buried in the crystal structure (Fig. 6D). Some class II residues and all of the class IV residues fall into this region, spanning residues 42-65 and encompassing α-helix B and most of β-strand 1. The sequence in this region is not similar to any other known motifs found in data bases such as PROSITE (87) or validated Profiles (112). Sequence conservation in this region has been noted previously in virtually all studies making use of RecA sequence alignments, but no function has been assigned.
C. Expanded Discussion of Structure-Function Relationships
Structure-function analysis with respect to the primary structure is incomplete, and some conflicts exist with respect to reported functional assignments for individual amino acid residues. The overall effort to relate structure to function is bolstered by an array of over 400 available mutants, over 35 of which have been purified and studied in vitro (70, 85). The discussion here is limited to activities with a role in the recombinogenic function of RecA.
1. ATP BINDING
The single binding site for ATP or ADP within the central domain of the RecA structure represents the most highly conserved and clearly identified substructure in the RecA protein. At least three of the structural elements making up this binding site are generally present in bacterial RecA proteins and in the eukaryotic homologs. These elements include the P-loop (86), spanning residues Gly-66 through Thr-73 in E. coli RecA, with the RecA consensus sequence being GpESsGKT; the Walker B box, spanning residues 140-vivvD-144; and the single residue Glu-96. The interaction of these three structural elements with ATP and ADP is also generally well supported by the published structure (65). The structure provides additional information. The following class I residues mediate additional interactions with ADP. Asp-94 is believed to stabilize the loop containing Glu-96 by a hydrogen bond to the backbone NH of residues 96 and 97. Asp-144 is thought to interact with the γ-phosphate of ATP and the bound Mg2+ ion via a water molecule. This residue has an unusual cis peptide bond between it and Ser-145. Gln-194 has also been proposed to interact with the γ-phosphate of ATP and to mediate a conformational change on ATP hydrolysis between ATP and α-helix G and loop L2. Additional contacts between RecA protein and bound ADP occur at Asp-100, Tyr-103, and Gly-265. There is limited amino acid residue conservation at these positions. The P-loop has been analyzed extensively by mutagenesis. Essentially no amino acid substitutions are permitted at positions 66 and 68, and only a few are permitted at position 69, whereas permitted changes at position 70 are restricted to amino acids with small side chains (113). At positions 71-73, limited mutagenesis indicates that there is little tolerance for changes, because all changes introduced to date produce a rec- phenotype. A notable mutation is K72R, which creates a RecA protein that binds but does not hydrolyze ATP (114).
In vitro, this mutant RecA protein forms nucleoprotein filaments on DNA, carries out a limited DNA strand-exchange reaction, and facilitates cleavage of the LexA protein (114, 115). The protein is therefore at least minimally functional in standard RecA activities in vitro. However, the mutant gene confers a rec- phenotype equivalent to a recA deletion (113). The properties of this mutant in vivo and in vitro have become
important in efforts to sort out the function of RecA-mediated ATP hydrolysis. Other interesting P-loop mutations occur at position 67, because some of the changes at this position permit a mutagenic separation of RecA functions (116). A Pro to Trp change at position 67 creates a RecA protein that has a high constitutive coprotease activity, but affects recombination functions only modestly. Glu and Asp substitutions dramatically reduce recombination functions but leave a low-level constitutive coprotease activity. Changes to Lys or Arg, in contrast, eliminate the coprotease activity, but generally have minimal effects on recombination functions.
2. DNA BINDING
To organize the discussion, we point out at the outset that RecA filaments can bind up to three DNA strands within the filament groove, with three distinct DNA strand binding sites (117-120). Ample data suggest that four DNA strands cannot be bound in the center of the filament (121). A variety of evidence serving to define the three sites will be presented in a number of sections below. The sites are usually called simply I, II, and III (117-120). Site I has the highest affinity for a DNA strand and is generally the site at which ssDNA is bound to RecA proteins. Site II is the site generally occupied by a strand complementary to that in site I, although heterologous sequences are tolerated. Site III would then be the site occupied by a DNA strand displaced during DNA strand exchange. The numbering of these sites varies from group to group. When strands of identical ssDNAs are added successively to a RecA filament, site III actually binds to the DNA with the second highest affinity. In some schemes, site III is therefore called site II, and vice versa (117-119). Because there is no RecA-DNA cocrystal, the DNA binding sites of RecA protein have been difficult to define structurally. On the basis of the crystallographic studies, Steitz and co-workers proposed that two loops that are undefined in the crystal structure, L1 (residues 157-164) and L2 (residues 195-209), are involved in DNA binding (64). They further suggested that L2 might be part of the single-strand DNA binding site (site I), whereas L1 might be involved in the binding of duplex DNA (site II) (64). There is some sequence conservation in these regions among the bacterial RecA proteins. The conservation breaks down when the alignments are extended beyond bacteria. Loop L1 displays a fairly well-conserved net negative charge among the bacterial proteins (Fig. 6C). The charge density in loop L2 is less conserved.
It has been suggested that the polypeptide backbone of these loops might contact DNA, obviating a need for amino acid conservation (4, 122). If the L1 and L2 loops are involved in DNA binding, the crystallographic studies provide no information to indicate whether they are involved in DNA interactions at sites I, II, or III, or in more than one of these sites. The loop L1 region has also been targeted for mutagenesis, and has exhibited significant mutational flexibility (122). Pro-151 and Glu-154 are upstream of the L1 loop. Pro-151 can accept a Ser substitution and maintain activity. Glu-154 is a critical residue and all changes except a conservative aspartate substitution result in a recA- phenotype. A number of substitutions can be made at Gly-157 (at the beginning of the L1 loop) and display partial wild-type activity. Camerini-Otero and co-workers have carried out studies that tend to support the idea that the loop L2 region is involved in DNA binding at DNA strand-binding site I. These include proteolysis mapping (123) and cross-linking studies (124), as well as studies of the weak but measurable DNA binding and DNA pairing properties of a 20-amino acid peptide that includes L2 (125). The recent protein-DNA cross-linking studies have been intriguing, but have not yet clarified the picture. One study demonstrated cross-linking between bound DNA containing the photoreactive analog 5-iodouracil (IdU) and amino acid residues in the L1 and L2 loops (124), generally supporting the proposal of Story et al. (64). A more detailed examination by Wang and Adzuma indicated that the L1 and L2 loops were involved in binding to different DNA strands at sites I and II, respectively (126). Binding of IdU-containing ssDNA to RecA protein, followed by irradiation, resulted in cross-linking to Met-164, in loop L1. When RecA was first bound to a nonphotoreactive oligonucleotide, then chased with an IdU-containing oligo to place it in site II, cross-linking occurred at Met-202 or Phe-203, in loop L2. This work supports the general proposal of Steitz and co-workers (64) for DNA binding motifs, but it conflicts with the idea that loop L2 is part of site I
(125). Two studies involving direct UV cross-linking of RecA protein to bound DNA yielded different results. In one, Tyr-103 was cross-linked to DNA in the absence of nucleotide cofactor (127). When ATPγS was added to the reaction mixture, Tyr-103 was again cross-linked, along with another residue in the region between residues 178 and 183 (127). Neither of these cross-links occurs within loops L1 or L2. In another study, cross-links were found in the regions of residues 61-72, 178-183 (probably Lys-183), and 233-243 (128). The cross-link at residue 183 is intriguing because it is located near the exterior of the RecA filament, far from the filament axis region where DNA is presumed to bind. In evaluating the various results, it must be kept in mind that RecA protein must have binding sites for three DNA strands, and may have additional DNA interaction sites on the filament exterior to facilitate
the movement of DNA into and out of the filament core. In effect, the various DNA binding studies and proposals need not be mutually exclusive, and all of them may reflect a protein-DNA interaction important in some context of RecA action. Even if they are all informative, we may still have an incomplete outline of the protein-DNA interaction scheme in a RecA filament. Roles in DNA binding have also been proposed for the N and C termini (129-131). These proposals have not yet been substantiated by further studies, and are presently difficult to evaluate. Circular dichroism spectroscopy was used to show that a synthetic peptide representing the first 24 residues of the RecA protein amino terminus does interact electrostatically with ssDNA (129). However, the lack of sequence or charge conservation (Fig. 6) in the amino terminus suggests that this is not a general DNA binding mechanism used by RecA proteins. The carboxyl terminus has been implicated in modulating dsDNA binding through the hypothesized mechanism of electrostatic repulsion with the double helix (130, 131). However, the existence of basic carboxyl termini in some RecA proteins challenges the universality of this model. It is interesting to note that Myxococcus xanthus has two RecA proteins, one with an acidic C terminus (Mxa1) and one with a basic C terminus (Mxa2). Also, all of the Streptomyces RecA proteins have long basic carboxyl termini (Fig. 8). Finally, a DNA binding role has been assigned to various aromatic and charged amino acids between residues 240 and 310 on the basis of sequence similarities among ssDNA-binding proteins, such as E.
coli SSB and GVP of bacteriophage IKe (132). Based on NMR studies (133), a structural model for a "DNA-binding wing" of GVP has been proposed, which consists of an antiparallel β sheet made of two β strands separated by a turn. It has been proposed that an analogous DNA-binding wing may exist from residue 243 to 257 of the RecA protein (133). Although this region of the bacterial RecA proteins is conserved and positively charged (Fig. 6, B and C), the eukaryotic RecA-like proteins do not retain this motif. This motif was dismissed as having a role in DNA binding because of its location in the structure relative to the filament axis and its involvement in the monomer-monomer interface (Fig. 6D) (64). An aspect of the DNA binding problem not addressed by any of these proposals is the extension and underwinding of DNA that accompanies binding (134-136). Intercalators promote the binding of RecA to DNA (137-139), and intercalation of aromatic amino acid side chains could provide a mechanism for disrupting base-stacking interactions. It has been suggested that some ssDNA-binding proteins use aromatic residues for intercalation (140). However, neither the tryptophans nor tyrosines of RecA protein are involved in DNA
intercalation (141, 142). It remains a formal possibility that phenylalanine residues are involved in DNA binding. Phe residues at positions 191 and 217 are invariant in the bacterial alignment, but the former is buried in the monomer crystal structure and the latter is involved in monomer-monomer contacts in the filament (Fig. 3D) (64). Other phenylalanines are chemically well conserved in the bacterial alignment (positions 21, 92, 203, and 255), but only position 92 is conserved among all of the eukaryotic RecA-like homologs. This residue is buried in the crystal structure (64). The phenylalanine at position 203 is part of loop L2, but it is not intercalated into the DNA (143). Binding to multiple DNA strands is the essence of RecA function, and the lack of conserved residues that can be associated with DNA binding is somewhat surprising. Any discussion of the molecular mechanism of RecA binding to DNA currently entails a high degree of uncertainty. In general, the DNA binding proposals made to date are not well supported by the alignments presented above. The best proposals so far presented are likely to define only a portion of the overall DNA binding problem.
3. THE MONOMER-MONOMER INTERFACE
Residues Leu-114, Ile-128, and Ala-148 are involved in monomer-monomer contacts (see Figures 2 and 6D). Specifically, Leu-114 belongs to the class III residues as defined by Story et al. (4). Residues Ser-172, Arg-176, Lys-250, and Pro-254 are the invariant residues making monomer-monomer contacts at the amino-terminal side of the monomer in the RecA filament. Residue 248 also forms part of the polymerization interface at the amino terminus of the RecA monomer. A RecA mutant (K248A) (144) shows no activity for ssDNA binding, ssDNA-dependent ATPase, filament formation, and strand exchange. In vivo, this mutant shows a UV-sensitive phenotype. Other data have demonstrated the involvement of parts of the N-terminal 33 amino acids in monomer-monomer interactions (83). Skiba and Knight have reported on an extensive mutagenic analysis of part of the monomer-monomer interface within the RecA filament (145). The region they chose is coincident with the RecA signature described above. Weak partial activity is maintained with a number of substitutions at Lys-216, Phe-217, and Arg-222. Arg-222 is in the amino-terminal polymerization interface of the filament, and is hydrogen bonded to Glu-63. Lys-216 is hydrogen bonded to the main chain carbonyl of Ala-95 on the adjacent monomer (145). Phe-217 makes van der Waals contacts with Thr-150 and Ile-155 of the adjacent monomer. Arg-222 is hydrogen bonded to His-97 of the adjacent monomer. They proposed that the effect of ATP is propagated through the filament by the interaction of Glu-63 and Arg-222.
4. THE MAW MOTIF
The structural motif between residues 42 and 65 in the E. coli RecA sequence is conserved among all RecA homologs. Only two mutants have been isolated in this region: RecA13 (L51F) and RecA56 (R60C). In vivo, both mutants display null phenotypes. E. coli strains harboring recA13 are sensitive to UV light, are recombination defective (146-149), and cannot induce λ prophage (147) or participate in the autocatalytic cleavage of the LexA repressor (150). The recA56 strains have similar phenotypes with respect to recombination deficiency, UV sensitivity, and prophage induction (148, 151, 152), along with the inability to induce the SOS response (153, 154) or promote SOS mutagenesis (155). In vitro, both mutants can bind ssDNA in the absence or presence of ADP. However, neither can bind ssDNA in the presence of ATP. In addition, neither mutant protein can hydrolyze ATP. These mutants do not display the allosteric effect enabled by ATP (or ATPγS) binding, i.e., they cannot adopt an active conformation for strand exchange in the presence of ATP (156).
The "structurally unusual" class IV residues (4) occur within this region at positions 42, 48, 54, and 55. Thr-42 makes a hydrogen bond with Asp-48, which is completely buried (Fig. 6D). In the alignment, both are conservatively substituted by Ser and Asn, respectively. The invariant Gly-54 and Gly-55 have unusual φ and ψ torsion angles and stack on top of Asp-48. Consequently, there is no room for side chains at these two positions, explaining the lack of residue substitutions in the alignment. Finally, the amide backbone of Gly-54 makes a hydrogen bond with the carbonyl of Asp-48. Additional residues conserved in this region include Ala-50, Gly-52, Pro-57, and Glu-63, which are invariant. Glu-63 is in a tract of conserved hydrophobic residues (Ile-61, Val-62, Ile-64, and Tyr-65). The Leu residues at positions 47 and 51 are also notable. Leu-47 is chemically conserved in the analysis at the top of Fig. 4. Leu-51 is found in 61 of 64 bacterial RecA sequences, and does not appear to be chemically conserved only because of one Met and two Thr substitutions in the alignment. Curiously, both of the Thr substitutions are in the RecA proteins of thermophilic bacteria, where an overall decrease in the content of polar residues is observed (157). A leucine residue corresponding to Leu-51 occurs in all of the RecA homologs examined to date (Fig. 9B). Based on the crystal structure (4), residues Leu-47, Leu-51, Ile-61, and Ile-64 were assigned to class II (part of the hydrophobic core of the RecA protein). Leu-47 and Ile-64 make van der Waals contacts with both Leu-75 and Ile-225 (158). Thus, residues in this region interact with different parts of the RecA protein on either side of the ATP binding site, at least in the conformation illustrated by the crystal structure.
This area of the RecA protein is therefore likely to be involved in an ATP-induced conformational switch, consistent with one of several functional suggestions advanced by Kowalczykowski (84). We suggest that the region encompassing residues 42 to 65 be defined as a unique RecA structural motif with the acronym MAW, for Makes ATP Work. The MAW motif is summarized in Fig. 10 (42-tgxxxldxalxxGGlxxgxivEiy-65) and its structure is depicted in Fig. 11. Conservation of a protein sequence motif across all bacterial and eukaryotic domains often correlates with a ligand-binding function, as seen with the RecA motifs associated with ATP binding. No ligand-binding function has been suggested for the MAW motif, in part because most of the motif is buried in the RecA structure.
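The consensus quoted above lends itself to a simple position-by-position check. The following is a minimal sketch, not a published tool: the function name `check_maw` and its return format are our own, and the treatment of lowercase positions (any difference is merely reported, not scored) is an assumption, since the figure legend does not list which substitutions count as semiconservative. The E. coli segment is the one shown in Fig. 10.

```python
# MAW consensus for residues 42-65, as quoted in the text.
# Uppercase = invariant, lowercase = semiconservative, x = unconstrained.
MAW_CONSENSUS = "tgxxxldxalxxGGlxxgxivEiy"
MAW_START = 42  # residue number of the first consensus position

# E. coli RecA residues 42-65, as shown in the Fig. 10 alignment.
ECOLI_MAW = "TGSLSLDIALGAGGLPMGRIVEIY"


def check_maw(segment, consensus=MAW_CONSENSUS, start=MAW_START):
    """Compare a 24-residue segment with the MAW consensus.

    Returns two lists of (residue_number, consensus_residue, found_residue):
    differences at invariant (uppercase) positions, and differences at
    lowercase positions. The second list is only a report; whether a given
    change is an acceptable semiconservative substitution is not decided here.
    """
    if len(segment) != len(consensus):
        raise ValueError("segment must have %d residues" % len(consensus))
    invariant_diffs, lowercase_diffs = [], []
    for offset, (want, got) in enumerate(zip(consensus, segment.upper())):
        if want == "x":
            continue  # no constraint at this position
        position = start + offset
        if want.isupper():
            if got != want:
                invariant_diffs.append((position, want, got))
        elif got != want.upper():
            lowercase_diffs.append((position, want.upper(), got))
    return invariant_diffs, lowercase_diffs


def substitute(segment, position, residue, start=MAW_START):
    """Return a copy of segment with one residue replaced (1-based numbering)."""
    index = position - start
    return segment[:index] + residue + segment[index + 1:]
```

Applied to the two mutants discussed above, `check_maw(substitute(ECOLI_MAW, 51, "F"))` flags the RecA13 change at the lowercase position 51, whereas the RecA56 change at position 60 falls on an unconstrained position of the consensus and is not flagged, even though both mutants are null in vivo. The consensus alone is therefore a coarse filter.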
5. MISCELLANEOUS CONSERVED RESIDUES IN THE RecA PROTEIN
To our knowledge, no information exists for residues Gly-22, Ala-50, Gly-52, Pro-57, Glu-63, Gly-108, Glu-123, Ala-147, Leu-149, Ala-168, Ala-174, Pro-206, Arg-226, Gly-267, and Gly-288. With respect to the functional significance of other conserved amino acid residues in the RecA structure, we focus primarily on work carried out since our previous review (70). The crystallographic studies shed light on the function of a number of other residues (4, 64, 65, 141, 142). Gly-122 is at the beginning of α-helix E. Val-146 is a class II residue involved in the hydrophobic core of the RecA monomer. All of these residues are excellent candidates for site-directed mutagenesis studies.
6. INCORPORATION OF TRYPTOPHAN REPORTER GROUPS INTO THE RecA PROTEIN
This allows the use of fluorescence assays to monitor the conformational changes that occur on nucleotide and/or DNA binding. Morimatsu et al. (159) replaced Tyr-103 and Tyr-264 with tryptophan. Previous studies (65, 160, 161) showed that these residues interact with ATP and/or ADP. In vivo, both mutants retain wild-type levels of UV resistance. In vitro, the mutant proteins exhibit a modest reduction in DNA strand exchange activity. The fluorescence of each tryptophan is quenched in the presence of ADP or ATP. However, the quenching of Y103W in the presence of ADP is greater than that of Y264W. This can be rationalized by noting that the Tyr side chain stacks next to the adenosine base of ADP in the crystal structure (65). Interestingly, complete quenching was not observed when the Y103W was bound to ATP, suggesting that there might be different binding modes for ADP and ATP. A quenching of the fluorescence signals was observed for both mutants
FIG. 10. Sequence of the MAW motif and its distribution in the bacterial and eukaryotic RecA homologs. MAW motif, positions 42-65: t g x x x l d x a l x x G G l x x g x i v E i y. E. coli RecA, positions 42-65: T G S L S L D I A L G A G G L P M G R I V E I Y. Rows: E. coli RecA, bacterial RecA, Rad51 and Dmc1. Uppercase letters represent invariant residues. Lowercase letters represent semiconservative changes. In some instances, lowercase letters are used when the frequency of nonconservative substitutions is low, e.g., at position 43, Arg occurs only once ( h a g ). A triangle (Δ) represents a one-residue deletion. The positions of the secondary structural elements (α helix B, β strand 1) are indicated.
in the presence of different DNA substrates. This suggests that these residues form part of the binding site or are involved in a conformational change associated with DNA binding. As controls, alanine substitutions at both positions were also studied. In vivo, these mutants were UV sensitive, whereas in vitro they were not competent for strand-exchange activity. In a similar study, His-163 was changed to Trp; this mutant could be used as a diagnostic tool to monitor the conformational state of the RecA protein (162). This work is considered in the following discussion of the RecA ATPase activity. Finally, Phe-203 of the L2 loop was replaced by a tryptophan (143).
7. THE Sms PROTEIN
All of the universally conserved RecA sequence motifs occur in this protein, advancing the interesting possibility that a second protein with RecA-like function exists in E. coli. This notion is supported by the sensitivity of sms null mutants to alkylating agents, suggesting that Sms has some role in DNA repair. The sms gene is cotranscribed with the serB gene, which encodes a component of the serine biosynthesis pathway. It has been suggested that the Sms protein may be involved in the repair of endogenous alkylation damage (109). The C-terminal region of the Sms protein, not shown in Fig. 9B, also has sequences similar to the Lon protease of E. coli. The sms gene is identical to radA, mutations in which confer sensitivity to UV- and X-irradiation when logarithmic phase cells are grown in rich media (163, 163a). Because Sms lacks significant sequence similarity to RecA protein in regions beyond RecA residue 150, an in vitro analysis of Sms activities could help elucidate the function of the MAW motif, as well as the function of structural motifs in RecA beyond residue 150, such as the L1 and L2 loops. The sequence similarity between Sms and RecA is noted in another review (163b).
III. RecA Protein Interactions with Its Ligands in Vitro: Biochemical Approaches
A. DNA Binding
In the context of recombinational DNA repair, RecA protein brings two homologous DNA molecules together and promotes a DNA strand-exchange reaction between them. Binding of the first DNA can be thought of as primary binding, involving either ssDNA to strand-binding site I or dsDNA to sites I and II. Binding of the second DNA can be called secondary binding. When ssDNA is bound at site I, secondary binding can involve dsDNA binding at strand binding sites II and III or the binding of a second ssDNA at site
FIG. 11. (A) Schematic representation of the crystal structure of the RecA monomer in the ADP-bound conformation. This view shows the inside face of the RecA monomer. The α helices and β strands are indicated by coils and arrows, respectively. The disordered loops L1 and L2 are indicated by dashes. The ADP molecule is displayed as a ball-and-stick figure. The MAW motif (black) is in the center of the structure. The Walker A box is at the turn that immediately follows β-strand 1. The Walker B box is located at β-strand 4, which is behind the Walker A box. (B) A close-up of the structure of the MAW motif. The conserved residues from Fig. 10 are shown. The hydrogen bond between Thr-42 and Asp-48 is represented by a dashed line. The figures were generated using the program MOLSCRIPT (428).
II (117-120). This discussion covers only the primary binding of RecA protein to single-stranded, gapped, and double-stranded DNA substrates.
1. BINDING TO SINGLE-STRANDED DNA
Early studies detected direct binding of RecA protein only to single-stranded DNA (ssDNA) (164, 165). In the absence of nucleotide cofactor or with ADP, RecA protein binds to ssDNA to form a "collapsed" filament. The resulting structure has a helical pitch of 64 Å and an axial rise of 2.1 Å per nucleotide (166, 167). In the presence of ATP, dATP, ATPγS, or ADP-AlF4,
an extended filament is formed. This is a right-handed helical form with six RecA monomers per turn, a pitch of 95 Å, and a diameter of 100 Å (66, 168-170). There are three nucleotides of ssDNA per RecA monomer in this filament. The DNA is bound along the ribose-phosphate backbone, with the DNA bases displayed in the filament groove (137, 171, 172). This single strand of DNA binds to a specific site within the filament, deep within the groove and near the filament axis (173), which has been called strand-binding site I (117, 118, 120). RecA binding to ssDNA is generally nonspecific with respect to sequence, although the protein exhibits an enhanced affinity to certain homopolymers such as poly(dT) (174, 175a). Optimal binding of wild-type RecA protein to poly(dT) generally occurs only with polymers more than 50 nucleotides long (176). Although most measurements indicate a RecA to ssDNA stoichiometry of one monomer per three nucleotides, some titration experiments provided a stoichiometry twice as high (177-180). These results could be explained by an intrinsic site size of six nucleotides per RecA monomer, with the lower site size resulting from the binding of a second set of RecA monomers to the first. Alternatively, the intrinsic site size could be three nucleotides per monomer, and the higher site size could reflect the binding of a second DNA strand to site II within the filament. The larger site size is observed only when the titration involves a DNA-based signal, such as a fluorescence change arising from
RecA binding to a suitably modified ssDNA. The issue has been resolved in favor of the latter interpretation by Brenner and colleagues (181). Their results are also in broad agreement with a range of observations from electron microscopy (3, 173, 182, 183).
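The geometric and stoichiometric figures quoted in this section can be tied together with a few lines of arithmetic. The sketch below is illustrative only: the function names are ours, and the assumption that the apparent site size scales linearly with the number of strands bound per filament is simply a restatement of the second interpretation described above.

```python
import math

# Values quoted in the text for the extended filament on ssDNA.
MONOMERS_PER_TURN = 6       # RecA monomers per helical turn
NT_PER_MONOMER = 3          # nucleotides of ssDNA bound per monomer (site I)
EXTENDED_PITCH_A = 95.0     # helical pitch, in angstroms
COLLAPSED_RISE_A = 2.1      # axial rise per nucleotide, collapsed filament


def rise_per_nucleotide(pitch=EXTENDED_PITCH_A,
                        monomers_per_turn=MONOMERS_PER_TURN,
                        nt_per_monomer=NT_PER_MONOMER):
    """Axial rise per nucleotide implied by the pitch and stoichiometry."""
    return pitch / (monomers_per_turn * nt_per_monomer)


def monomers_to_coat(n_nucleotides, nt_per_monomer=NT_PER_MONOMER):
    """Number of RecA monomers needed to coat an ssDNA of the given length."""
    return math.ceil(n_nucleotides / nt_per_monomer)


def apparent_site_size(strands_bound, nt_per_monomer=NT_PER_MONOMER):
    """Nucleotides per monomer seen by a DNA-based signal.

    Assumes each additional strand bound within the filament contributes the
    same number of nucleotides per monomer as the strand in site I.
    """
    return strands_bound * nt_per_monomer
```

With these numbers the extended filament has 18 nucleotides per turn and a rise of about 5.3 Å per nucleotide, roughly 2.5 times the value for the collapsed filament, and a second strand in site II doubles the apparent site size from three to six nucleotides per monomer.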
2. BINDING TO DOUBLE-STRANDED DNA
The barrier for RecA binding to double-stranded DNA (dsDNA) turned out to be kinetic rather than thermodynamic. At pH values above neutrality, the rate-limiting step in dsDNA binding is the nucleation of filament formation (184-186). Once nucleated, filament extension is rapid and the resulting filaments are stable as long as ATP is regenerated. The nucleation step is highly pH dependent. It occurs rapidly at pH 6.0, making this the preferred pH for many studies of RecA interaction with dsDNA. Kinetic analysis has divided the nucleation step into two parts (185). A preequilibrium has been detected in which RecA protein binds weakly to native DNA. This phase involves the net release of one proton, and exhibits little dependence on temperature or DNA length. The rate-limiting nucleation then occurs, accompanied by partial DNA underwinding (185). This phase is accompanied by the uptake of three protons (or a net uptake of two protons for the overall process), explaining the sharp decrease in nucleation rates as pH is increased (55, 185). The nucleation phase is also very temperature dependent, exhibiting an Arrhenius activation energy of 39 kcal mol⁻¹ (185). Larger DNAs provide bigger targets for nucleation, and thus stimulate the overall binding process. Also, nucleation is accompanied by DNA underwinding, so any alteration in DNA structure that tends to unwind the DNA [negative superhelicity, single-strand gaps, DNA lesions, increased temperature (70)] enhances the rate of nucleation (185). Reports of enhanced RecA binding to DNAs containing DNA lesions (187-189a) thus do not reflect a special RecA affinity for DNA lesions. Instead, they are readily accommodated into a general model for RecA binding to dsDNA in which structural perturbations of the DNA by the lesions enhance the rate of the nucleation step in the binding pathway (185).
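To give a sense of how steep a temperature dependence an activation energy of 39 kcal mol⁻¹ implies, the ratio of nucleation rates at two temperatures can be estimated from the Arrhenius relation. This is a rough sketch under stated assumptions: simple Arrhenius behavior with a temperature-independent prefactor, and two illustrative temperatures (25 and 37°C) chosen by us rather than taken from the cited work. The function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3        # gas constant, kcal mol^-1 K^-1
EA_NUCLEATION = 39.0     # Arrhenius activation energy from the text, kcal mol^-1


def arrhenius_rate_ratio(t_low_c, t_high_c, ea=EA_NUCLEATION):
    """Ratio k(t_high) / k(t_low) for temperatures given in degrees Celsius.

    Assumes k = A * exp(-Ea / (R * T)) with the same prefactor A at both
    temperatures, so that A cancels in the ratio.
    """
    t_low = t_low_c + 273.15
    t_high = t_high_c + 273.15
    return math.exp(ea / R_KCAL * (1.0 / t_low - 1.0 / t_high))


# Illustrative comparison (temperatures are our choice, not from the source).
ratio_25_to_37 = arrhenius_rate_ratio(25.0, 37.0)
```

Under these assumptions the nucleation rate increases roughly 13-fold between 25 and 37°C, which is consistent with the description of the nucleation phase as very temperature dependent.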
Reports that RecA protein binds more readily to Z-form DNA (190, 191) have been explained by a similar kinetic perturbation in the general DNA binding pathway (191, 192). The filaments formed after RecA binds to Z DNA are still right-handed in their helicity (191, 192). Unlike the case with ssDNA, binding to dsDNA occurs only in the presence of ATP or ATP analogs such as ATPγS (184, 185). Therefore, only the extended form of the filament is observed when RecA is bound to dsDNA. Within the RecA filament, the dsDNA is also extended and underwound. The approximately 1.5-fold extension gives a pitch of 100 Å (the same as for binding to ssDNA), and an axial rise per base pair of 5.1 Å. The DNA is underwound by 43% in the presence of ATPγS (133), with a small reduction to 39.6% underwinding when ATP is used and is being hydrolyzed (136). This level of underwinding corresponds to about 18 bp per turn of the nucleoprotein filament helix. The degree of dsDNA underwinding observed within filaments actively hydrolyzing ATP varies very little from one filament to another, implying a broad structural homogeneity in these complexes (136). Notably, the structural limit of DNA extension is 70% (193, 194). Like ssDNA, dsDNA is bound along the phosphate-ribose backbone and the minor groove (137, 141, 171, 172). The major groove is evidently displayed in the filament groove. The binding does not involve the intercalation of Trp or Tyr residues (141, 195). The mechanism by which the DNA is extended within RecA filaments has not been elucidated. The bases of the DNA are oriented perpendicular to the filament axis (195, 196). It has been argued, on the basis of a variety of experiments, that the two strands are still base-paired to each other (172, 195-198), i.e., underwinding does not translate into strand separation within the filament. Two studies using oligonucleotides modified with the fluorescent chromophore (+)-anti-benzo(a)pyrene-7,8-dihydrodiol-9,10-epoxide (BPDE) provided evidence that all three strands that can bind to a RecA filament interact (199, 200). There is a clear homology dependence seen in the signal changes evoked when additional DNA strands are added to filaments with a DNA strand already bound at site I. However, the work also indicates that the base pairing in duplex DNA is not as tight as in B-form DNA (199, 200). The dsDNA is located deep within the filament groove (204), at a site that should include DNA strand binding sites I and II.
3. BINDING TO dsDNA WITH ssDNA GAPS
A single-strand gap of 50 nucleotides or more is the best way to circumvent the slow nucleation step in RecA binding to dsDNA at neutral pH values and above. Nucleation occurs in the gap, and the filament is rapidly extended to encompass the adjacent dsDNA. The entire filament exhibits a structural polarity that reflects the orientation of the strand in the gap (183). The two strands of the dsDNA are also bound asymmetrically. The strand in the gap apparently is bound in the site normally occupied by ssDNA, and thus can be considered bound at site I. The strand bound at site I is better protected from nuclease digestion than its complement (bound in site II) by factors of 2-3 (202, 203), providing one experimental distinction helping to define sites I and II. This reflects the different fates of each strand. During strand exchange reactions, the strand in site I remains in the filament throughout the process, while its complement is exchanged out of the filament.
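The underwinding and extension values quoted above for dsDNA within the filament can be related to one another by simple arithmetic. The sketch below rests on two reference values for B-form DNA that are our assumptions and do not come from this section: 10.5 bp per helical turn and an axial rise of 3.4 Å per base pair. The function names are likewise our own.

```python
# Reference values for B-form DNA (assumptions, not taken from the text).
B_DNA_BP_PER_TURN = 10.5
B_DNA_RISE_A = 3.4          # angstroms per base pair

# Values quoted in the text for dsDNA bound within the RecA filament.
FILAMENT_RISE_A = 5.1       # angstroms per base pair
UNDERWINDING_ATPGS = 0.43   # fractional underwinding with ATPgammaS
UNDERWINDING_ATP = 0.396    # fractional underwinding during ATP hydrolysis


def bp_per_turn(underwinding, b_form=B_DNA_BP_PER_TURN):
    """Base pairs per helical turn after reducing the twist by a fraction.

    Underwinding by a fraction u leaves (1 - u) of the original twist per
    base pair, so a full turn takes 1 / (1 - u) times as many base pairs.
    """
    return b_form / (1.0 - underwinding)


def extension_factor(rise=FILAMENT_RISE_A, b_rise=B_DNA_RISE_A):
    """Fold extension of the DNA relative to B-form."""
    return rise / b_rise
```

Under these assumptions, 43% underwinding gives about 18.4 bp per turn and 39.6% gives about 17.4, in line with the figure of about 18 bp per turn, and a rise of 5.1 Å per base pair corresponds to a 1.5-fold extension.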
B. Polar Assembly and Disassembly of RecA Filaments, and the Importance of 3' DNA Ends
Distinct nucleation and extension phases of RecA filament assembly on DNA can be discerned when either ssDNA or dsDNA is the lattice on which assembly occurs. On ssDNA, both phases are rapid and the kinetics have not been analyzed in sufficient detail to provide rates for individual steps. On ssDNA circles over 8000 nucleotides long, filament formation is completed in less than 2 min at 37°C. If nucleation is sufficiently rate limiting as to restrict the process to only one nucleation event per circle, we can infer a rate of filament extension of over 1100 RecA monomers per minute. The active binding unit or protomer (monomer, dimer, or hexamer?) has not been defined. Rods of RecA protein are a predominant structural form under conditions commonly used to monitor RecA reactions in vitro (60).
The single-stranded DNA binding protein (SSB) of E. coli plays an important role in the assembly process. RecA protein does not readily bind to regions of secondary structure in ssDNA. When RecA binds to ssDNA molecules with significant secondary structure, such as the DNA derived from bacteriophage M13, binding does not lead to full coating of the DNA circle. Added SSB greatly stimulates binding and facilitates the creation of a contiguous and stable filament that coats the entire DNA molecule (204-209). The effects of SSB are best explained by a mechanism in which SSB binds to and melts out the secondary structure in the ssDNA, and is then displaced by RecA protein (210, 211). In effect, the RecA protein and SSB compete for binding sites on the ssDNA. When bound to ATP, RecA protein can displace SSB from the DNA. However, SSB affects the steps of filament formation differently. SSB strongly inhibits the nucleation of RecA filament formation, while it is readily displaced from the ssDNA during the filament extension phase. This leads to an order-of-addition effect.
When RecA is added first to a solution containing ssDNA, it binds to regions lacking secondary structure. The best stimulation of RecA filament formation occurs when SSB is added after RecA protein, where it removes secondary structure in the DNA and is displaced by extension of the RecA filaments already bound (207, 210-212). However, when SSB is added before RecA protein it completely covers the ssDNA and there is a substantial lag in RecA protein filament formation (207, 212), explained by the inhibition of filament nucleation.
The properties of the RecA filament assembly process have played a prominent role in the development of models to explain the function of RecA and RecBCD proteins in the initiation of recombinational processes. Many recent efforts to reconstitute portions of homologous recombination reactions in vitro have stressed the importance of prepared regions of single-stranded DNA with free 3' ends in the initiation of genetic exchanges (213-215). In bacteria, the in vivo evidence that 3' ends are more recombinogenic or invasive is limited. The observation that sbcB mutations, which affect the 3' to 5' exonuclease I, restore recombination levels in recBC mutants (27, 216-218) has often been cited as evidence that the generation and protection of 3' ends is important. However, some mutations or deletions that eliminate exoI function do not restore recombination in recBC mutants (219), calling the importance of 3' ends into question.
The key in vitro observation relevant to the role of 3' ends in recombination has been the polarity of the RecA filament assembly reaction. On ssDNA, filament extension after nucleation is polar and proceeds 5' to 3' (220). On circular ssDNA, the process continues until the DNA is completely encompassed. On linear ssDNA, nucleation is presumed to occur virtually anywhere and to proceed to the 3' end. The 3' end is therefore much more likely than the 5' end to be coated with RecA protein. The observation is readily incorporated into models in which 3' ends exposed by the action of nucleases or helicases are more recombinogenic or invasive than 5' ends. A greater propensity for RecA filaments on ssDNA to promote DNA pairing at the 3' end of a linear ssDNA has been directly demonstrated in vitro (213, 221-224). However, as detailed below, the final chapter on the molecular origin of these effects and their implications for recombination in vivo has not yet been written.
Filament assembly is also polar on dsDNA. With linear dsDNA, the slow nucleation step is readily circumvented with 5' single-strand extensions but not by 3' extensions (203, 225, 226). The single-stranded DNA in a 5' single-strand extension or a single-strand gap becomes the initiating strand in filament formation. The filament extension then encompasses the adjacent dsDNA, proceeding 5' to 3' along the initiating strand.
The major pathway for disassembly of RecA filaments is also unidirectional and end dependent. On dsDNA, RecA monomers dissociate from the end opposite to that at which assembly occurs (226). Filament disassembly therefore proceeds with the same polarity as assembly, 5' to 3' relative to the DNA strand in site I. The rate of disassembly is highly pH dependent. Below pH 6.75, linear RecA filaments exhibit little or no net disassembly. Above pH 6.75, end-dependent disassembly is observed, increasing in rate with increasing pH until a maximum is observed above pH 8.1 (203). The maximum rate may approach 200 monomers min⁻¹ (203). Filament assembly must have equivalent equilibrium constants for subunit addition at either end, unless assembly or disassembly at one end is coupled in some manner to an input of chemical energy. Inasmuch as assembly and disassembly of RecA filaments can occur at different filament ends in one test tube under a single set of conditions, one process or both
must be facilitated by ATP hydrolysis. Assembly occurs readily with ATP analogs that are not hydrolyzed, but disassembly does not, indicating that end-dependent filament disassembly is coupled to ATP hydrolysis. The disassembly process accounts for only a small fraction of the ATP hydrolyzed by a filament. In the filament interior, ATP is hydrolyzed by each monomer but protein-protein interactions on both sides of the monomer apparently prevent dissociation most of the time. On the disassembly end, the probability that the monomer will dissociate as a result of ATP hydrolysis is a function of pH. More recently, a 5′ to 3′ disassembly of RecA filaments from linear ssDNA has also been documented (227). Net disassembly is observed only in the presence of SSB. As RecA monomers or protomers dissociate, they are replaced by SSB, which serves to inhibit renucleation of new filaments. Because the filament extension process displaces SSB but proceeds 5′ to 3′, the SSB bound to growing segments of the DNA at the disassembly end is not readily displaced by RecA (227). This disassembly process must contribute to the observed bias for RecA-mediated DNA pairing at the 3′ end of the linear ssDNA. Not only can filaments nucleate at points away from the 5′ end, but the disassembly process will actively remove RecA protein from the 5′ end. The rate of disassembly from ssDNA and its pH dependence have not yet been analyzed in detail, but at pH 7.5 the rates are comparable to those observed from dsDNA under the same conditions. It should be noted that disassembly is prevented if dATP replaces ATP as nucleotide cofactor, or if the E. coli RecO and RecR proteins (described in Section V,B) are present (227). In the presence of the RecO and RecR proteins, RecA filaments stably coat linear ssDNAs contiguously from the 5′ to the 3′ ends.
It can be argued, therefore, that free 5′ DNA ends should participate in RecA protein-mediated DNA pairing processes as readily as 3′ ends in any cell that contains the RecO and RecR proteins. RecA monomers in the interior of filaments formed on ssDNA exhibit only a very modest exchange reaction with free RecA monomers, and no exchange at all when dATP replaces ATP (228). Some direct transfer of RecA monomers or protomers between adjacent RecA nucleoprotein filaments is observed (229), but ATP hydrolysis rarely results in simple dissociation of a RecA monomer from the filament interior. When filaments are formed on dsDNA, however, a relatively robust exchange between bound and free RecA monomers in the filament interior is observed (228). This indicates that there is a change in the state or conformation of the filament when a second DNA strand is added at site II. Experiments probing the interaction of DNA strands within the RecA filament with BPDE-labeled oligonucleotides also suggest a change in filament state when strand-binding sites II and/or III are occupied (199, 200).
The mechanism of the exchange observed for monomers in the RecA filament interior is not yet known. It could involve simple dissociation and replacement by other monomers. Alternatively, it could represent an exchange between RecA nucleoprotein filaments and unbound aggregates or filaments of RecA protein with which they come into transient contact. The latter process could be mechanistically related to the direct transfer of RecA protomers between RecA-ssDNA filaments (229), or from nucleoprotein filaments to unbound DNA (177, 230). Conformation changes observed within bundles of RecA filaments associated with slow hydrolysis of ATPγS might be interpreted as a side-by-side exchange of RecA monomers between filaments (231).
C. ATP Hydrolysis
The RecA protein is a DNA-dependent ATPase. Rates of ATP hydrolysis are modest. The maximum reported value for kcat/Km at 37°C is about 2 × 10⁴ sec⁻¹ M⁻¹, a small fraction of the diffusion-controlled limit. The activity is affected by the structure of the bound DNA and by monomer-monomer interactions within the filament. The kinetics are complicated by cooperativity, and the Km is more accurately reported as S0.5. Two methods are commonly used to monitor ATP hydrolysis by RecA. One employs labeled ATP and thin-layer chromatography to separate ATP, ADP, and Pi (232, 233). The other is a spectrophotometric assay in which ATP hydrolysis is enzymatically coupled to an NAD-NADH interconversion (203, 212, 233-235).
1. DNA-INDEPENDENT ATP HYDROLYSIS
RecA-mediated ATP hydrolysis in the absence of DNA is slow but detectable (232). Under conditions used to monitor many RecA-mediated processes in vitro, with the aggregate concentration of salt and buffers at less than 150 mM, the DNA-independent ATPase activity peaks at about pH 6.0 with a kcat of 0.1 min⁻¹. At pH 7.5, the kcat drops to 0.015 min⁻¹. The rate of ATP hydrolysis increases as a third- to fourth-order function of salt concentration. At very high (1.5-2 M) concentrations of a wide variety of salts, the rates of ATP hydrolysis become comparable to rates observed in the presence of DNA (236, 237). The nature of the salt dependence and observed changes in circular dichroism spectra at high salt concentrations (238, 238a) suggest that salt can facilitate a conformation change that to some degree mimics that induced by the binding of DNA to RecA protein.
2. DNA-DEPENDENT ATP HYDROLYSIS
Accurate measurement of RecA-mediated ATP hydrolysis on ssDNA depends either on the use of DNA cofactors that lack secondary structure
[such as poly(dT)] or on the presence of SSB, which serves to eliminate secondary structure and permit uniform DNA binding by RecA (as already described). When bound to ssDNA, RecA protein hydrolyzes ATP with a kcat of up to 30 min⁻¹ at 37°C. The ATPase activity exhibits no dependence on pH between 5.5 and 9.0 (232). The effects of temperature between 25 and 45°C produce a linear Arrhenius plot, from which an Arrhenius activation energy of 11.8 ± 0.3 kcal mol⁻¹ can be derived for the reaction (233). The ATP hydrolysis occurs uniformly throughout the RecA filament, with no enhancement at filament ends (176, 212, 239). As ssDNA is titrated into a reaction mixture, the observed rate of ATP hydrolysis generally increases linearly until sufficient DNA is present (three nucleotides per RecA monomer) to bind all of the RecA protein present. This direct relationship between ATP hydrolysis and DNA binding is observed under most, but not all, conditions. The rate of ATP hydrolysis can be a convenient measure of DNA binding if appropriate controls and corroborating methods are included in the data set. ATP is hydrolyzed to ADP and Pi. ADP acts as a competitive inhibitor of ATP hydrolysis. Another common inhibitor is the ATP analog ATPγS [adenosine 5′-O-(3-thio)triphosphate], which is bound but not appreciably hydrolyzed by RecA protein and acts as a potent competitive inhibitor. The inhibition patterns in both cases are greatly complicated by additional effects of these two nucleotides (240, 241). As already described, each tends to stabilize cooperatively different forms of a RecA filament. A variety of rNTPs and dNTPs are hydrolyzed by RecA protein (232, 242, 243). Among them are dATP, UTP, PTP (purine ribonucleoside triphosphate), and dUTP, which serve as cofactors in at least some RecA-mediated reactions (178, 232, 243). Others, including ITP, CTP, dCTP, and GTP, are hydrolyzed with a nearly normal Vmax but do not support DNA strand exchange.
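The Arrhenius activation energy quoted above can be turned into a rough temperature correction for the ssDNA-dependent ATPase rate. The following is a minimal sketch, not a published procedure: it assumes a reference kcat of 30 min⁻¹ at 37°C and simple Arrhenius behavior over the 25-45°C range described in the text. The function name and its parameters are hypothetical, introduced only for illustration.

```python
import math

# Gas constant in kcal mol^-1 K^-1, to match the units of the activation energy.
R_KCAL = 1.987e-3


def arrhenius_scale(k_ref, t_ref_c, t_c, ea_kcal=11.8):
    """Scale a rate constant from t_ref_c to t_c (both in deg C).

    Assumes simple Arrhenius behavior, k = A * exp(-Ea / (R * T)),
    with Ea in kcal/mol (11.8 is the value quoted in the text).
    """
    t_ref = t_ref_c + 273.15
    t = t_c + 273.15
    return k_ref * math.exp(-ea_kcal / R_KCAL * (1.0 / t - 1.0 / t_ref))


if __name__ == "__main__":
    # Assumed reference point: kcat = 30 min^-1 at 37 deg C.
    for temp_c in (25, 30, 37, 45):
        print(temp_c, round(arrhenius_scale(30.0, 37.0, temp_c), 1))
```

Under these assumptions the predicted turnover falls to roughly half its 37°C value at 25°C. The estimate should not be extrapolated outside the temperature range over which the Arrhenius plot was reported to be linear.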
Bryant and colleagues have pointed out that although a number of nucleoside triphosphates are hydrolyzed, only those with a measured S0.5 below 100 μM serve as cofactors in the DNA strand-exchange reaction (243, 244). The reactions with dATP are particularly notable. With dATP, hydrolytic rates are increased about 20% relative to ATP hydrolysis. RecA filament stability and the rates of DNA strand exchange are also enhanced (115, 178, 228). The RecA mutant, RecA K72R, which binds but does not hydrolyze ATP, will promote a limited DNA strand-exchange reaction only if dATP is used as the nucleotide cofactor (114, 115). When RecA protein is bound to dsDNA, the kcat for ATP hydrolysis is reduced to 20-22 min⁻¹. The rates again exhibit no dependence on pH between 5.5 and 9, as long as RecA filaments are fully formed on the DNA (184).
3. MECHANISTIC ASPECTS OF ATP HYDROLYSIS: CONFORMATION CHANGES AND COOPERATIVITY
As already described, three different forms of RecA filaments have been defined structurally, principally by electron microscopy. When bound to ATP or to certain ATP analogs, the resulting nucleoprotein filament on ssDNA is extended. With either ADP or no nucleotide cofactor, the filament forms are described as collapsed, although they are distinct. The structure derived from crystallographic studies probably most closely approximates the ADP-bound form. The same three filament forms can also be defined on the basis of DNA binding studies. The affinity of RecA protein for ssDNA is increased by ATP and decreased by ADP, with a form exhibiting intermediate affinity present in the absence of nucleotide cofactor (245, 246). The ADP-bound and no-nucleotide filament forms are inactive in DNA pairing and strand-exchange activities. Based on the extrapolation that these RecA species represent conformations occurring during the normal ATP hydrolytic cycle, these observations have inspired a number of models for ATP hydrolysis. One proposal suggested that ATP hydrolysis would be coordinated in the monomers along the filament, leading to an accordion-like extension and retraction of the filament structure (166, 167). However, a detailed examination of conformations observed during the slow hydrolysis of ATPγS (66) indicated that hydrolysis resulted in no collapse to anything resembling the ADP-bound filament form. The degree of DNA underwinding within a filament actively hydrolyzing ATP is quite uniform from one filament to the next, contributing to a picture in which there is a broad structural homogeneity between filaments (136, 203, 227). Complementary studies on the RecA ATPase inhibition patterns observed with ADP and ATPγS demonstrated that conversion of the ADP filament form to the ATP form proceeded with a transient phase during which the ADP filaments were disassembled and reassembled in the ATP form.
The two forms are mutually exclusive. The observed ATP and ADP filament forms are therefore not directly interconvertible in the sense of the entire filament alternating between them, and the accordion model has been eliminated from consideration. The possibility remains that within a filament actively hydrolyzing ATP, a subset of the RecA monomers might exist at any moment in a conformation equivalent to the ADP filament conformation defined by the studies carried out with ADP alone. ATP hydrolysis in individual monomers could result in cycling between the ATP and ADP filament forms, with the filament as a whole representing a dynamic mixture of these and perhaps other conformations. As suggested (5, 85), this could lead to a cycling between conformations with high and low affinities for ssDNA, effectively coupling the
ATP hydrolysis to an association/dissociation cycle. This would solve the molecular dilemma posed by the competing requirements of a protein like RecA to bind tightly to DNA to promote some reaction, yet dissociate when the reaction was completed (5, 85). In support, studies on the end-dependent disassembly of RecA filaments provide evidence that ATP hydrolysis is involved in the dissociation of RecA monomers (or larger protomers) from one filament end (5, 85, 226, 227). However, there is no obligate coupling of ATP hydrolysis to RecA dissociation, and dissociative processes rarely account for more than a minute fraction of the ATP hydrolyzed. When RecA filaments are bound to ssDNA circles, there is very little exchange between free and bound RecA, and none at all if dATP is used as the nucleotide cofactor (228). On linear ssDNA, ATP-mediated dissociation is observed at the filament end nearest the 5′ end of the DNA, but ATP hydrolysis in the filament interior does not result in dissociation. Even the end-dependent dissociation is suppressed on ssDNA when dATP is used (227). A picture emerges in which dissociation in the filament interior is limited by interactions with RecA monomers on either side, while dissociation at the "disassembly" end would occur with a probability influenced by the nucleotide cofactor used and other factors (203, 227). As of this writing, it is unclear whether the inactive RecA filament conformation observed in the presence of ADP alone represents a conformation that occurs even transiently in individual monomers during the normal ATP hydrolytic cycle. New probes of RecA conformation changes are clearly needed. A particularly promising approach involves the RecA H163W mutant protein, with the change located in loop L1 (162, 247). The substitution does not appear to affect appreciably any aspect of RecA function. However, the new tryptophan residue provides a fluorescence signal that is quenched on the binding of nucleotide cofactors.
ADP addition to a RecA-ssDNA complex leads to a small quenching of the fluorescence signal, providing new evidence that the ADP filament form is conformationally distinct from the form present in the absence of nucleotide (162). ATP addition leads to a much greater quenching of the fluorescence signal (162, 247). The change in signal reflects a first-order process that occurs with a rate constant equivalent to the kcat for ATP hydrolysis (E. Stole and F. R. Bryant, personal communication). The signal is also closely associated with the conformation change to an extended filament form that accompanies ATP binding. ITP is hydrolyzed by RecA protein with a kcat equivalent to that for ATP, but it does not support DNA strand exchange, it does not promote the conformation change to the extended filament, and its binding does not cause the quenching of the fluorescence signal brought about by ATP (247). The signal change observed with ATP is similar under at least some conditions to that brought about by ATPγS binding (E. Stole and F. R. Bryant, personal communication).
These observations are leading to a dissection of steps in the ATP hydrolytic cycle. The rate-limiting step is a first-order process and thus follows ATP binding. The absence of a burst of ATP (or ATPγS) hydrolysis indicates that the rate-limiting step precedes the hydrolytic step. Filament extension occurs with a first-order rate constant identical to the kcat for ATP hydrolysis. However, the rate-limiting step and filament extension are not equivalent, because ITP hydrolysis appears to share the same rate-limiting step with ATP hydrolysis, yet does not support filament extension. The result is a scheme in which NTP binding is followed by a slow transition, possibly a conformation change that does not affect the fluorescence properties of the H163W mutant protein. If ATP is the nucleotide cofactor, the slow transition is accompanied by or rapidly followed by filament extension. A relatively fast hydrolytic step then follows (F. R. Bryant, personal communication). The steps involved in dissociation of ADP and Pi are not yet defined, but do not limit the rate of the reaction. RecA protein does not promote a detectable exchange of [³H]ADP ⇌ ATP, HPO₄ ⇌ H₂¹⁸O, or HP¹⁸O₄ ⇌ H₂O (248). Thus, ATP hydrolysis is neither macroscopically nor microscopically reversible on the enzyme. Once extended, the filament remains in this state as long as ATP is regenerated. Filament collapse and disassembly occur only when the ADP/ATP ratio exceeds 1.0 (240, 241, 246, 248). As previously indicated, the degree of extension and associated DNA underwinding is essentially the same whether the extension is brought about by ATPγS, or within a filament actively hydrolyzing ATP. The RecA monomer conformation observed in all monomers in filaments bound to ADP must be populated for a very small fraction of the time required for the entire ATP hydrolytic cycle. Alternatively, this conformation may not occur at all during the hydrolytic cycle.
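The argument that an ADP-bound state can be populated for only a small fraction of the cycle can be illustrated with a toy calculation. The sketch below treats the hydrolytic cycle as a one-way loop of first-order steps, in which the steady-state occupancy of each state is proportional to its mean dwell time (1/k). Only the ordering of the steps (a slow transition, then faster hydrolysis and product release) comes from the text. The state names, the specific rate constants, and the function name are hypothetical values chosen for illustration.

```python
def cycle_occupancy(exit_rates):
    """Steady-state occupancy and turnover for an irreversible cyclic scheme.

    exit_rates maps each state name to the first-order rate constant
    (per minute) for leaving that state. In a one-way cycle, the fraction
    of time spent in a state is its dwell time divided by the cycle time.
    """
    dwell = {state: 1.0 / k for state, k in exit_rates.items()}
    cycle_time = sum(dwell.values())
    occupancy = {state: d / cycle_time for state, d in dwell.items()}
    turnover = 1.0 / cycle_time
    return occupancy, turnover


if __name__ == "__main__":
    # Hypothetical rate constants: one slow step near the observed kcat,
    # followed by two much faster steps.
    rates = {
        "ATP bound, before slow transition": 21.0,
        "extended, before hydrolysis": 2000.0,
        "ADP/Pi bound, before release": 2000.0,
    }
    occupancy, turnover = cycle_occupancy(rates)
    for state, fraction in occupancy.items():
        print(f"{state}: {fraction:.3f}")
    print(f"turnover: {turnover:.1f} per min")
```

With these assumed numbers the product-bound state accounts for about 1% of the cycle while the overall turnover stays close to the rate of the slow step. The sketch ignores reversibility and cooperativity between neighboring monomers.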
Monomer-monomer interactions within the filament may constrain each monomer to maintain an extended conformation even when ADP is transiently bound. Individual monomers might go through a series of conformation changes during the hydrolytic cycle, but the conformation present when ADP was transiently bound would be distinct from that observed when the entire filament was formed in the presence of ADP. Inhibition studies show that ATPγS and ADP are antagonistic inhibitors of ATP hydrolysis, and that the ATP and ADP filament forms are structurally incompatible (240). Whereas filaments form on ssDNA in the presence of either ATP or ADP, the incompatibility of the two filament forms results in nearly total dissociation when a 50/50 mixture of ATP and ADP is present, with each nucleotide antagonizing the filament form stabilized by the other (241). A reasonable hypothesis is that the binding of ATP to a sufficiently large subfraction of the monomers in a filament results in the formation of a static and extended structural core conformation maintained by certain segments of the RecA monomer structure and reinforced cooperatively through monomer-monomer interactions. Outside of this core structure, other elements of the structure of each monomer may undergo conformation changes in concert with the ATP hydrolytic cycle to effect the late stages of DNA strand exchange. When ATP hydrolysis occurs in a monomer or a segment of monomers at the disassembly end of a filament, a momentary collapse of the core structure to the ADP form could lead to dissociation at that point. These ideas incorporate some aspects of the active cluster hypothesis proposed by Kowalczykowski (242, 246). These ideas and observations imply a high degree of cooperativity between monomers within a RecA filament. The cooperativity is manifested by Hill coefficients greater than 1.0 for ATP hydrolysis observed under many conditions (249), and a cooperativity parameter, ω, of 50 ± 10 for binding to ssDNA (245). It is also manifested by unusual kinetic effects. For example, RecA-mediated GTP hydrolysis is greatly stimulated by low concentrations of ATP (250). Also, very low concentrations of ATPγS actually stimulate the hydrolysis of ATP under some conditions (240). Cooperativity must play a role in the filament disassembly reaction. Within RecA filaments formed on dsDNA, individual monomers hydrolyze ATP with a turnover of about 20 min⁻¹, or about 1 ATP every 3 sec. If the ATP hydrolytic cycles of adjacent RecA monomers were entirely independent, ATP hydrolysis would occur in the monomer at the disassembly end about 1.5 sec (on average) after the previous monomer dissociated. If ATP hydrolysis and end-dependent dissociation are coupled, the maximum rate of dissociation would be one every 1.5 sec, or 40 monomers per minute. However, the maximum observed rates of filament disassembly are at least three times this.
This implies that the ATP hydrolytic cycles of adjacent monomers are coupled and temporally offset from one another, leading to waves of hydrolysis proceeding down the filament from the disassembly to the assembly ends (70, 226). To reconcile the observed rates of filament disassembly with the ATPase turnover rates, the waves must be coordinated and spaced every 6-12 monomers (70, 226). In the filament interior, a new wave must reach a given monomer every 3 sec. More accurate measurements of end-dependent filament disassembly rates are needed to refine this picture.
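The arithmetic behind this argument can be laid out explicitly. The sketch below is one reading of the reasoning, not a calculation taken from the cited studies: it assumes that a wave of hydrolysis releases the monomers it passes at the disassembly end, so that a wave spacing of n monomers and one wave per ATPase cycle give a disassembly rate of n times the turnover. The function names are hypothetical.

```python
def independent_max_rate(turnover_per_min):
    """Maximum dissociation rate (monomers/min) if adjacent cycles are independent.

    A newly exposed end monomer is, on average, halfway through its own
    hydrolytic cycle, so the mean wait is half the cycle time.
    """
    cycle_s = 60.0 / turnover_per_min
    mean_wait_s = cycle_s / 2.0
    return 60.0 / mean_wait_s


def wave_rate(turnover_per_min, spacing_monomers):
    """Disassembly rate (monomers/min) for coordinated waves.

    Assumes each wave removes `spacing_monomers` monomers and that one
    wave arrives per hydrolytic cycle.
    """
    return turnover_per_min * spacing_monomers


def spacing_needed(observed_rate_per_min, turnover_per_min):
    """Wave spacing (monomers) needed to account for an observed rate."""
    return observed_rate_per_min / turnover_per_min


if __name__ == "__main__":
    turnover = 20.0  # ATP per monomer per minute on dsDNA, from the text
    print(independent_max_rate(turnover))   # independent-cycle ceiling
    print(wave_rate(turnover, 6), wave_rate(turnover, 12))
    print(spacing_needed(200.0, turnover))  # for a rate near 200 monomers/min
```

Under this reading, a spacing of 6-12 monomers corresponds to 120-240 monomers per minute, which brackets the maximum disassembly rate of about 200 monomers per minute cited earlier in the chapter.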
IV. RecA Protein-mediated DNA Strand Exchange
A. Overview of the Reaction
Typical DNA strand-exchange reactions used to study RecA function in vitro are illustrated in Fig. 12. The reactions shown are arranged for convenience, because it is easy to distinguish products from substrates with a wide
FIG. 12. RecA protein-mediated DNA strand-exchange reactions. Panels: three strands; four strands.
range of assays. DNA substrates are usually derived from bacteriophages, often M13 or its derivatives. The reaction can involve either three or four DNA strands, as shown, and it nicely mimics several of the putative steps in recombinational DNA repair. A Holliday intermediate is formed transiently in the four-strand reaction. The discussion remains organized around the concept of three DNA strand-binding sites in the RecA filament, in spite of the capacity of RecA filaments to promote the four-strand exchange. There are at least four experimentally or at least conceptually distinguishable phases of a RecA protein-mediated DNA strand-exchange reaction. The first is the formation of a RecA nucleoprotein filament that completely coats the single-stranded or gapped DNA substrate. In the discussion below, this DNA substrate is called DNA1, and it binds at the DNA strand-binding site defined as site I. The resulting nucleoprotein filament is effectively a sequence-specific DNA-binding entity, with the sequence specificity determined by the bound DNA. Phase 1 is also called presynapsis. The bound single strand is aligned with a linear duplex DNA (DNA2) in the second phase of the reaction. DNA2 is brought into the filament at DNA strand-binding sites II and III. In the third phase, there is a rapid strand switch to create 1000 bp or more of hybrid DNA. This often produces the branched molecules shown in Fig. 12. Phase 3 requires ATP but not its hydrolysis. In reactions involving four DNA strands, the DNA pairing and strand switching in phases 2 and 3 are limited to the single-strand gap in the gapped duplex DNA (115, 251). In other words, the fundamental DNA pairing process in RecA filaments will accommodate only three DNA strands. Phases 1-3 can result in the formation of thousands of base pairs of hybrid DNA without ATP hydrolysis. However, under most conditions, the completion of DNA strand exchange requires a fourth phase in which the nascent hybrid DNA is extended in a facilitated branch migration reaction that proceeds until products are formed. This final phase requires ATP hydrolysis and will accommodate four DNA strands. For a variety of reasons, phases 2 and 3 are difficult to distinguish experimentally. Phases 1 and 4 are readily distinguished and experimentally defined relative to the DNA pairing in phases 2 and 3. The RecA filament formed in phase 1 hydrolyzes ATP with a kcat of 30 min⁻¹. On addition of a homologous duplex DNA, the rate of ATP hydrolysis decreases rapidly to a turnover of 20-22 min⁻¹ (239). The decline is dependent on the length of homology in the duplex DNA, and therefore reflects a change brought on by DNA pairing (239). The apparent efficiency of the strand-exchange reaction is low, with about 100 ATPs hydrolyzed for every base pair of hybrid DNA formed under typical reaction conditions. The branch migration in phase 4 is unidirectional (5′ to 3′ with respect to the single strand initially bound), and proceeds at a rate of 380 bp min⁻¹ at 37°C (233). This polarity affects the efficiency with which DNA pairing occurs at the two ends of the duplex DNA substrate. The end of the duplex at which a complete strand exchange would normally be initiated (the 5′ end with respect to the displaced strand) is therefore called the proximal end, and the other end is called the distal end. Phase 4 proceeds readily past mismatches, lesions, and even heterologous inserts (up to a few hundred base pairs in length) in one or both DNA substrates, a capability that may be critical to its function in recombinational DNA repair (115, 252-255). The DNA binding and filament formation that occur in phase 1, as well as the properties of the resulting filaments, have already been described in Section III,A. The discussion below focuses on the DNA pairing (phases 2 and 3), and the hybrid DNA extension that occurs in phase 4.
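The figure of about 100 ATPs per base pair can be checked for plausibility with a back-of-the-envelope estimate that combines numbers quoted in this chapter: one RecA monomer per three nucleotides, a turnover of about 20 min⁻¹ after pairing, and branch migration at 380 bp min⁻¹. The sketch below is an illustration under stated assumptions; the text does not say that the reported value was derived this way. The 6,400-nucleotide substrate length (roughly the size of an M13 genome) and the function name are hypothetical choices.

```python
def atp_per_base_pair(filament_nt,
                      nt_per_monomer=3.0,
                      kcat_per_min=20.0,
                      branch_migration_bp_per_min=380.0):
    """Estimate ATP hydrolyzed per base pair of hybrid DNA formed.

    Assumes every monomer in a filament covering `filament_nt` nucleotides
    hydrolyzes ATP at `kcat_per_min` for the whole reaction, while a single
    branch point advances at `branch_migration_bp_per_min`.
    """
    monomers = filament_nt / nt_per_monomer
    atp_per_min = monomers * kcat_per_min
    return atp_per_min / branch_migration_bp_per_min


if __name__ == "__main__":
    # Hypothetical substrate: a 6,400-nt circular ssDNA fully coated by RecA.
    print(round(atp_per_base_pair(6400)))
```

With these inputs the estimate is on the order of 100 ATPs per base pair, consistent with the low apparent efficiency described above. The result scales linearly with the assumed filament length.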
B. DNA Pairing
DNA pairing occurs within the filament, as originally proposed by Paul Howard-Flanders (256) and confirmed over the past decade in a wide range of studies. However, the four-stranded DNA pairing intermediate suggested is yielding to models in which DNA pairing within the RecA filament is restricted to three strands. The initial pairing interaction in these reactions always involves alignment of a single strand with a homologous duplex DNA. Holliday structures involving reciprocal strand exchange among four DNA strands are created via branch migration following a three-stranded DNA pairing event. Although four-stranded DNA structures have often been promoted as potential recombination intermediates (5, 256-260), there is no unambiguous evidence for such an intermediate in RecA reactions and much evidence against it (121). Consider the DNA binding properties of RecA protein reviewed above. RecA binds readily to ssDNA, but very slowly to dsDNA. Many of the enzymes identified as functioning early in recombinational processes are helicases and nucleases (RecJ, RecQ, RecBCD) that help generate a single-stranded DNA substrate for RecA (29, 85, 217). A wide range of biochemical and biophysical studies highlight the ease with which three DNA strands are incorporated into the major groove of a RecA filament, and the general absence of evidence for the incorporation of a fourth DNA strand (117, 119, 121, 199, 261-264). The DNA strand-binding site III can be distinguished by some of the same experimental approaches used to distinguish site I from site II. For example, a DNA strand bound at site II is more sensitive to nuclease degradation than strands bound at site I or III (196). When combined with in vivo work demonstrating the importance of single-stranded DNA in initiating recombination, a simple picture emerges. In most situations requiring homologous recombination, the major DNA pairing event involves a single-stranded DNA bound by RecA protein and a homologous duplex DNA. The mechanism of the DNA pairing process in the DNA strand-exchange reaction has not been worked out in detail. However, several experimental facts seem clear and consistent over a range of studies in different laboratories. First, the DNA pairing process does not require ATP hydrolysis. Second, DNA pairing does require the extended RecA filament conformation brought about by binding to ATP or ATPγS. Finally, the product of the reaction is a complex in which the hybrid DNA product of a strand switch is stabilized in the filament, probably with the displaced strand remaining within the filament, wound around the hybrid DNA but not necessarily interacting with it.
1. KINETICS
The kinetic mechanism of DNA pairing is surprisingly poorly understood. DNA pairing can, in principle, be broken down into a number of steps. The RecA nucleoprotein filaments on ssDNA will bind at least weakly to duplex DNA whether it is homologous or not (118, 119). Nonhomologous interactions with the duplex DNA substrate would be a presumed first step. These interactions lead to the formation of extensive aggregated networks of DNA when RecA protein is mixed with heterologous duplex DNA (265, 266). Rapid DNA pairing can occur within these networks when the DNA is homologous, although incubation of RecA nucleoprotein filaments with completely heterologous DNA can block pairing of homologous DNA added later (266). The presence of heterologous DNA sequences flanking a homologous sequence in the duplex DNA did not enhance the rate of DNA pairing in one study using contiguous RecA filaments formed on ssDNA in the presence of SSB (267). However, another study demonstrated an enhancing effect of flanking heterologous DNA when the more fragmented filaments formed in the absence of SSB were used (265). Pairing, in the sense of an initial homologous alignment, can occur anywhere along the length of either DNA substrate, and is generally completed within a few minutes (239, 268, 269). Formation of a paired DNA species retaining significant stability when RecA protein is removed can take somewhat longer and generally requires the presence of a DNA end allowing a net exchange of DNA strands (268, 269). There is still much to be learned about this process by careful kinetic studies. One effort in this direction has recently been reported (270). The use of very short single-strand oligonucleotides (suboptimal for RecA binding under most conditions) and an unusual mixture of ATPγS and ADP in this study produced rather slow rates of pairing, making the results difficult to compare with other RecA studies. However, this and most other studies done to date contribute to the present consensus that the homologous alignment of the two DNA substrates (roughly the process described above as phase 2) does not limit the overall rate at which stably paired complexes are formed. This suggests that some aspect of the strand switch in phase 3 is rate limiting.
2. DNA STRUCTURES AND PROPOSED INTERMEDIATES
A more vigorous effort has been directed at the DNA structural aspects of the pairing reaction. The central idea guiding much of this effort is the possible involvement of a novel DNA triplex structure, sometimes referred to as R-form DNA (271). Perhaps the first mention of a recombination triplex DNA intermediate was put forward by Lacks (272), and the structures of DNA triplex intermediates suggested by a variety of studies are remarkably consistent with the original proposal by Lacks. A consideration of possible paths of DNA pairing reveals two reaction stages during which a DNA triplex could occur (Fig. 13). In one path, the duplex DNA would enter the RecA filament groove and align via its major groove with the ssDNA. This would form what has been called a prestrand switch triplex (273).
The strands of the duplex substrate would retain their Watson-Crick base pairing in this intermediate. Rotation of the DNA bases to pair the ssDNA with its complement to make a hybrid duplex would subsequently place the displaced DNA strand into the minor groove of the hybrid. In the second path, the duplex DNA would enter the filament and interact with the ssDNA initially through its minor groove. Rotation of the bases would generate a hybrid DNA and leave the displaced DNA strand in the major groove to form a poststrand switch triplex. In a poststrand switch triplex, the hybrid DNA strands would be paired in a Watson-Crick conformation. A variety of models have been proposed for the location of the third strand in either triplex, with the most probable involving secondary hydrogen bonds to groups in the major groove (271, 274-277). A current proposal for base triplets in an R-form triplex is presented in Fig. 14. The first pathway in Fig. 13 is particularly attractive, because it provides
FIG. 13. Possible DNA pairing pathways during DNA strand exchange in the RecA protein filament. Figure labels: RecA filament; ssDNA; dsDNA; prestrand switch triplex; poststrand switch triplex. Nucleotide bases are represented by circles; the small attached filled circles denote the location of the covalent bonds to deoxyribose. Pathway 1 is one in which the major groove of the dsDNA substrate is presented first in the filament groove. Rotation of the bases then results in the displaced strand positioned in the minor groove of the hybrid DNA product. Pathway 2 is a "minor groove first" path. In this case, the displaced DNA strand is positioned in the major groove of the hybrid DNA. DNA strand-binding sites would be numbered as shown in the insets for the two pathways. Potential DNA triplex structures in which the extra DNA strand is arrayed in the major groove of the Watson-Crick duplex are indicated.
a clear structural mechanism whereby the two DNA molecules can be homologously aligned. However, there is no evidence for a prestrand switch triplex in RecA reactions. Most studies have detected only the hybrid DNA present after the strand switch has taken place. The RecA filament appears to stabilize the products of DNA strand exchange, using binding energy to promote the strand switch. The existence of a triplex DNA intermediate on the reaction pathway remains controversial.
Experiments probing the structure of paired DNA species formed within RecA filaments in the presence of ATPγS suggest a Watson-Crick pairing relationship between the two strands of the hybrid DNA product (198). When distal joints are formed with ATPγS, and cross-linked with psoralen derivatives prior to deproteinization, only the two strands corresponding to the product hybrid duplex are cross-linked (278), consistent with the result of Adzuma (198). These results do not provide information about the location of the displaced DNA strand.
A series of recent chemical probing and cross-linking studies (279-281) has provided strong support for the second pathway in Fig. 13. The duplex DNA appears to enter the groove of the filament and align via its minor groove with the ssDNA. A rotation of the DNA bases produces hybrid DNA. The results of one of these studies suggest that the displaced strand is within the RecA filament, but somewhat displaced from the major groove of the hybrid duplex and not hydrogen-bonded to it (281). However, some direct evidence that all three strands that can bind to a RecA filament are close enough to interact has been provided by studies using BPDE-labeled oligonucleotides (199, 200). An important implication of this work is that there is no homologous recognition between the two DNA molecules prior to the strand switch.
FIG. 14. Proposed DNA triplets for the R-form DNA. Reprinted from Cox (124, with permission.
All three strands are cross-linked only in distal joints formed within a RecA filament while ATP was being hydrolyzed (278, 282). This structure may represent a triplex DNA species, but the cross-linking patterns are also consistent with a dynamic structure in which the two alternative duplexes are locally interconverted as ATP is hydrolyzed, or a structure in which regions of hybrid DNA alternate with regions that have not undergone a strand switch (278, 282). When distal joints were formed with ATP, but cross-linking was carried out after the RecA protein was removed, cross-links were limited to the two strands corresponding to the hybrid duplex (278). At a minimum, this indicates that the DNA structure changes when RecA protein is removed. In distal joints formed within the RecA filament, the third strand appears to be at least transiently displaced when ATP is hydrolyzed. This strand is susceptible to degradation by exonuclease I (54). When pairing is restricted to the distal end of the duplex DNA, the addition of exonuclease I results in a reverse strand-exchange reaction, in which the hybrid DNA is stabilized by the progressive enzymatic removal of the displaced strand (54).
Much of the evidence for a poststrand switch triplex involves the characterization of a putative stable triplex species that persists after the RecA filament is removed (276, 277, 283-285). The stable triplex is detected only when DNA pairing is restricted to the distal end of the duplex DNA substrate. The putative triplexes remaining after the removal of RecA protein cannot be extended as they must be within the RecA filament, and their potential relationship to the species present in the filament can be debated. The triplex structure itself, however, could represent a novel DNA conformation with interesting evolutionary significance. The triplexes have been defined by their thermal stability (distal joints appear to be more stable than proximal joints), and by the resistance of all three strands to various nucleolytic and chemical probes (277, 283-285).
To function in recombination, an R-form triplex must have like strands arranged in parallel, and the structure must form in a sequence-independent manner. These features set it apart from the stable triplex DNA species characterized to date (286, 287). Early models for R-form DNA placed the third strand in the major groove, stabilized in part by hydrogen bonds to the N-7 of guanine (which is not involved in Watson-Crick base pairing) and secondary hydrogen bonds to groups involved in Watson-Crick pairing.
However, substituting 7-deazaguanine for guanine in both strands of the duplex DNA substrate had no effect on the rate or efficiency of the DNA strand-exchange reaction (275). This was also true when three of four strands were substituted in a four-strand reaction. This result has been extended to DNA substituted with 7-deazaadenine, and DNA with both purine N-7 substitutions (S. K. Jain and M. M. Cox, unpublished results). These results argue against triplex models that rely on hydrogen bonding to purine N-7 for significant stabilization, and have been reinforced by methylation protection experiments done on the structures remaining after RecA deproteinization (277). Recent computer modeling predicts that stable triplex structures can be assembled without involving purine N-7 (271). The formation of stable triplex DNA species has proved difficult to confirm by electron microscopy (EM). The triplexes have a reported thermal stability at least as great as duplex DNA (277, 283, 284), and the methods used for spreading DNA do not denature duplexes. Nevertheless, no triplexes have
been observed in many EM trials, using conditions inspired by the various published reports (278; R. B. Inman and M. M. Cox, unpublished results). Any triplexes present in these reactions are disrupted by surface tension or some other factor associated with DNA spreading. There has been only one report of results suggesting the spontaneous formation of an R-form DNA triplex (288). However, the DNA species studied have a Tm of only about 30°C. Many of the studies of stable DNA species formed after removal of RecA protein have been carried out at higher temperatures. This further suggests that any triplexes left behind after RecA removal exhibit some hysteresis in their formation. Within the “minor groove first” pathway (Fig. 13), the formation of a DNA triplex, even transiently, remains attractive as a structural device to facilitate homologous recognition between the two DNA substrates. The one viable alternative is a pathway in which the RecA filament catalyzes the conversion of the duplex substrate to a duplex hybrid DNA without a triplex intermediate of any kind. In this last scenario, homologous recognition would be mediated by Watson-Crick base-pairing alone, with the homology search involving iterative rotation of bases in the duplex substrate followed by dissociation until homologous alignment was established. This part of the RecA-mediated DNA strand-exchange mechanism is still in need of further definition.
3. ENERGETICS
Whatever the detailed pathway for the pairing reaction, it is clear that it results in an efficient formation of hybrid DNA and does not require ATP hydrolysis. The strand switch and stabilization of the hybrid DNA product are facilitated by binding energy within the major RecA filament groove. This has been demonstrated in several studies. First, a limited DNA strand-exchange reaction (producing 1-2 kbp of hybrid DNA) can occur under some conditions in the presence of ATPγS (251, 255, 289-291). A similar DNA strand-exchange reaction is promoted by the RecA K72R mutant protein, with an alteration in the consensus ATP binding site that allows it to bind but not hydrolyze ATP (114,119. It has been further demonstrated that it is merely the filament conformation conferred by ATP that is required for DNA strand exchange, because an NTP cofactor is not needed if this conformation can be established (292). These results are in line with the fact that DNA strand exchange between homologous DNA molecules is not an energetically demanding reaction. It occurs under some conditions in the complete absence of proteins (293), and is promoted by a variety of eukaryotic proteins that do not hydrolyze ATP and (in many cases) have no role in recombination (5, 294).
The reaction observed in the absence of ATP hydrolysis is generally limited to formation of 1-2 kbp of hybrid DNA (289, 291). The reason for this limitation is not clear. It has been proposed that the limitation reflects discontinuities in the RecA filaments (114, 298, 292). Addition of excess RecA protein does not overcome the limitation (119, but small discontinuities might still be present. An alternative proposal is presented in Fig. 15. The discontinuities might lie not in the filaments but in the DNA pairing intermediate. Efficient DNA pairing in the absence of ATP hydrolysis requires the addition of Mg2+ in significant excess to that required to saturate the available NTP cofactor as Mg-NTP (115). Initiation of pairing at one point in the filament would lead to propagation as the filament binding energy drew in additional DNA. The propagation would be accompanied by rotation of the filament and the duplex DNA, as a by-product of the binding energy-facilitated incorporation of more DNA. As propagation continued, DNA pairing at other locations within the filament would be an intramolecular and potentially favorable process. Pairing at some distance from the point where the initial pairing interaction was being propagated would create a topologically trapped external loop of substrate duplex DNA (Fig. 15). Multiple loops of this type might be created. Once RecA protein was removed, the stable hybrid DNA would be defined by the DNA between the end of the substrate duplex and the first external loop. Because moving these loops would require disruption of paired regions to extend others, and would have to involve rotation about the RecA filament, such loops might well be immobile without an active mechanism to produce the required rotation or disassembly of RecA filaments in these locations. The effects of Mg2+ on the DNA strand-exchange reaction in the absence of ATP hydrolysis may be instructive.
When Mg2+ concentrations are similar to the NTP concentration so that the concentration of free Mg2+ is minimal, the formation of paired DNA is reduced considerably. The limited pairing that does occur, however, can slowly generate completely exchanged hybrid DNA products over 7 kbp in length at significant yields (115). If the probability of pairing is reduced, the probability of the secondary pairing events needed to form the loops must also be reduced. Hence, the pairing that does occur generates longer hybrid DNAs. The capacity of the filaments to generate the longer hybrid DNAs under these conditions also tends to argue against the idea that limitations to the length of the hybrid DNA at higher Mg2+ concentrations are brought about by filament discontinuities. The effects of high Mg2+ concentrations in blocking DNA strand exchange are not relieved if RecA K72R filaments are formed at low Mg2+ concentration, followed by raising the latter (115).
The DNA strand-exchange reaction through phase 3 is probably sufficient to explain the role of RecA protein in genetic recombination during conjugation or transduction. However, the limitations of the pairing reactions described above are not only manifested in the length of the hybrid DNA generated. Without ATP hydrolysis, DNA strand exchange also is bidirectional (223, 291), will not bypass structural barriers in the DNA (115, 254, 255), and will not accommodate four DNA strands (115, 251). A unidirectional DNA strand exchange capable of bypassing structural barriers in the DNA could be useful in the context of recombinational DNA repair, and herein lies a function for ATP hydrolysis.
C. Unidirectional Extension of the Hybrid DNA and the Role of ATP Hydrolysis
NTPases can generally be classified according to one of three biological functions: motor proteins, molecular timing devices, or recycling functions (295). The ATPase activity of RecA is most commonly portrayed as a recycling function (5, 85, 257, 295), causing the dissociation of RecA monomers from the filament after DNA strand exchange has occurred. As already noted, there is evidence that the end-dependent dissociation of RecA monomers from RecA filaments is at least one of the functions of RecA-mediated ATP hydrolysis (226). However, we have also noted that disassembly of RecA filaments rarely accounts for more than a minute fraction of the ATP hydrolyzed in RecA filaments. ATP is hydrolyzed uniformly by RecA monomers throughout the filament (176, 212, 239), and under some conditions ATP or dATP hydrolysis proceeds in the absence of measurable RecA dissociation (228, 229).
Defining the role of the RecA ATPase can be divided into two parts: biochemical function and mechanism. The observation that DNA strand exchange proceeds in the absence of ATP hydrolysis provides an opportunity to further define function. In effect, the limitations of the reaction in the absence of ATP hydrolysis elucidate biochemical functions provided by the ATPase activity. When ATP is hydrolyzed, the DNA strand-exchange reaction becomes unidirectional (223, 291), will bypass substantial structural barriers in the DNA substrates (115, 254, 255), and will accommodate four DNA strands (115, 251). All of these become properties defining a fourth phase of DNA strand exchange, a phase involving the active and ATP-dependent conversion of the DNA pairing intermediates generated in phase 3 to fully exchanged DNA products.
FIG. 15. A model to explain the cessation of DNA strand exchange in the absence of ATP hydrolysis after formation of 1-2 kbp of hybrid DNA. Formation of a discontinuous DNA pairing intermediate is illustrated in five steps. (I) Pairing is initiated at one end of a duplex DNA substrate. Extension of the paired region requires the rotation of both the filament and the duplex DNA, as shown by circular arrows. (II) As the paired region lengthens, some probability exists for an intramolecular pairing interaction elsewhere in the filament (black arrow). (III) Pairing at the new location creates a new point for continued spooling of the duplex DNA into the filament. However, a segment of DNA is left outside of the filament as an external loop by the second pairing initiation. (IV) Multiple loops can form (e.g., segments B-C and D-E), with paired segments (e.g., C-D) between them. (V) Resolution of the loops requires their rotation about the axis of the RecA nucleoprotein filament. Reprinted from Shan et al. (119,with permission.
The molecular mechanism by which the hydrolysis of ATP confers these properties on the strand-exchange reaction is more difficult to resolve. At least three ideas have been proposed for how ATP hydrolysis might effect DNA strand exchange at the molecular level. Konforti et al. (223) proposed that ATP hydrolysis caused dissociation of RecA from the hybrid DNA, with the overall reaction rendered unidirectional by unidirectional assembly of a new filament on the displaced strand. This mechanism provides no explanation for the RecA-mediated bypass of short heterologous insertions during strand exchange. In addition, in reactions carried out in the presence of SSB, the RecA protein involved can be quantitatively accounted for in a complex with the hybrid dsDNA long after DNA strand exchange is complete (296). Other results have shown that the SSB is bound to the displaced DNA strand (297). No net disassembly of the RecA filament is required during unidirectional DNA strand exchange, nor is assembly of RecA filaments on the displaced strand. The results eliminate the mechanistic proposal that DNA strand exchange is driven by dissociation of RecA monomers from the initial filament followed by a unidirectional reassembly of the filament on the displaced strand. However, they do not rule out a localized redistribution of RecA monomers within the filament. The two remaining models are both consistent with existing data.
The first is derived from the proposal that ATP hydrolysis is coupled to dissociation of RecA monomers from filaments at the point where strand exchange is taking place (256). In its original form, this proposal did not provide an explanation for RecA-mediated bypass of DNA structural barriers during strand exchange. This idea has been updated with the suggestion (289) that discontinuities in the RecA filament limit the progress of ATP-independent DNA strand exchange. ATP hydrolysis would then be required simply to redistribute the RecA protein (114, 289, 292). Continued extension of the hybrid DNA would depend on available contiguous RecA filament, remaining largely ATP independent. The second model (298) is derived from the fact that if DNA strand exchange is to take place, the two DNA substrates must rotate (Fig. 16). The most straightforward way to facilitate strand exchange is to couple ATP hydrolysis directly to this rotation. This could involve external loops of DNA with the RecA protein facilitating the rotation shown in panel V of Fig. 15. If one of the two DNA substrates binds to the outside of the filament, even
FIG. 16. A DNA strand exchange between two duplex DNA molecules. To move the DNA branch, the molecules must be rotated.
over a short region, the rotation can be effected by passing this external segment between monomers in a ratchetlike action coupled to ATP hydrolysis (298). There are six RecA monomers per turn in the filament. Each 360° rotation would occur in six steps, with one-sixth of the RecA monomers in the filament participating in each step. Because DNA within a RecA filament is underwound to 18 bp per turn, each complete rotation would move the DNA branch by 18 bp. A unidirectional rotation would necessarily result in unidirectional strand exchange. A rotation of this kind coupled to ATP hydrolysis would resolve any external loops of DNA such as those depicted in Fig. 15. This model has been called the facilitated DNA rotation model (298). It requires that ATP hydrolysis in the filament be organized into waves that would be separated by perhaps six monomers and travel unidirectionally through the filament, much as was suggested in Section III,C, describing the ATP hydrolytic activity of RecA protein. The model has the problem that no DNA binding sites have been defined on the filament exterior, although little effort has been directed at this possibility. A different version of a facilitated DNA rotation model has recently been proposed (298, 299). In this altered version, the DNA is rotated, but all strands remain within the filament.
Both the RecA redistribution and facilitated DNA rotation models provide an explanation for how heterologous sequences are bypassed during DNA strand exchange in an ATP-dependent manner. Bypass requires that the heterologous DNA sequences be unwound. Radding and colleagues demonstrated that a nick in the heterologous region prevents bypass, indicating that the bypassed DNA is unwound indirectly through the application of torsional stress (299). The underwound state of DNA bound by RecA is the key to understanding how redistribution of RecA monomers might facilitate bypass (Fig. 17). RecA must be bound to a homologously paired three-stranded complex on both sides of the heterologous insertion. Disassembly of part of the RecA filament on one side of the insertion would lead to the release of underwound DNA. If free rotation of the remaining filament ends was precluded (as it would be in the circular nucleoprotein filaments commonly used
FIG. 17. Mechanisms for the bypass of heterologous DNA insertions in the duplex DNA substrate during DNA strand exchange. (A) Bypass by filament disassembly. DNA strand exchange proceeds up to the homology-heterology boundary without the aid of ATP hydrolysis. DNA bound within the RecA filament is extended and underwound by about 40%. Dissociation of some RecA monomers (ovals) results in the release of some underwound DNA. Because the filament is formed on a DNA circle and has no ends free to rotate, the underwinding in the DNA can be translated into unwinding of the DNA in the heterologous insertion. DNA strand exchange can then be continued beyond the insertion without the aid of ATP hydrolysis. This model is based on ideas communicated to us by S. Kowalczykowski. (B) Bypass by facilitated DNA rotation. If RecA protein is bound to the DNA with the filled strand, and promotes the DNA rotation shown by a mechanism described elsewhere (298), the DNA within the heterologous insertion will be unwound as a result of the rotation.
in these experiments), then the underwinding in the released DNA could be translated into strand separation, with as many as 8 bp separated for every six RecA monomers dissociated. A modest degree of filament disassembly in a stalled strand-exchange complex could therefore explain the bypass. In the facilitated DNA rotation model, continued rotation of the exchanging DNAs stalled at the insertion would directly translate into unwinding of the DNA in the insertion, as shown in Fig. 17.
The RecA redistribution model predicts that the bypass of heterologous insertions requires dissociation of part of the adjacent filament, while maintaining the underwound state of the released DNA so that it can be translated into unwinding of the insertion. A linear filament of RecA protein, unconstrained and free to rotate at the ends, should be unable to promote bypass if the model is correct. However, RecA filaments formed on linear ssDNA promote an efficient bypass of heterologous insertions (K. MacFarland and M. M. Cox, unpublished results). In addition, assembly at one end of a RecA filament is generally faster than disassembly at the other end, making it unlikely that a gap in the filament large enough to facilitate the unwinding of 50- to 100-bp heterologous inserts would appear on a circular DNA substrate with any reasonable probability. However, there is an exchange of RecA monomers between free and bound forms that is observed when the RecA is bound to duplex DNA or during DNA strand exchange (228; Q. Shan and M. M. Cox, unpublished results). This could signal the kind of RecA redistribution required by the model under strand-exchange conditions, and an evaluation of the role of this exchange process must await a definition of its mechanism.
The facilitated DNA rotation model has the unique virtue of providing a quantitative accounting of the ATP hydrolyzed by RecA protein during DNA strand exchange. There are two relevant calculations.
First, to bring about a 360° rotation by the facilitated rotation mechanism (298), every RecA monomer in the filament would hydrolyze 1 ATP. In a typical filament containing 2000 RecA monomers, this translates into 2000 ATPs to move the DNA branch by 18 bp, or roughly 100 ATPs per base pair of hybrid DNA created. This is the efficiency observed under optimal conditions in vitro. The evident inefficiency can be rationalized as an energetic surplus that permits the bypass of structural barriers during repair. Notably, DNA repair processes tend to be profligate consumers of chemical energy (26). Consider, for example, the chemical energy invested in the degradation and replacement of a strand of DNA, 1 kb or more in length, to repair one DNA mismatch (300).
Second, if ATP hydrolysis and DNA rotation are coupled, the rate of branch movement should have a predictable relationship to the rate of ATP hydrolysis. A given monomer will hydrolyze ATP every time the external DNA molecule passes, so that each hydrolytic event corresponds to one rotation. During strand exchange, each monomer hydrolyzes about 20 ATPs per minute. The resulting 20 rotations should move the branch 20 × 18 bp, or 360 bp, per minute. This prediction has been tested in a detailed study of the temperature dependence of the DNA strand-exchange reaction and the ATP hydrolysis that accompanies it. Arrhenius plots were constructed for strand exchange and RecA-mediated ATP hydrolysis during strand exchange, based on experiments carried out from 25° to 45°C (233). The plots were linear and parallel over the entire temperature range, providing another piece of evidence consistent with a direct coupling of ATP hydrolysis to the last phase of DNA strand exchange (233). Perhaps more significantly, the distance between the lines corresponded to a factor of about 18 bp.
The role of ATP hydrolysis in the DNA strand-exchange reaction remains a controversial topic. However, the potential exists for the redefinition of the RecA ATPase activity in terms of a novel motor function in recombinational DNA repair. Resolution of the various issues remaining in defining the function and mechanism of ATP hydrolysis will continue to be a challenge.
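The two calculations given for the facilitated DNA rotation model can be written out as a short sketch. The constants (18 bp per filament turn, a 2000-monomer filament, about 20 ATPs per monomer per minute) are the values quoted in the text; the function names and the parameter for ATPs hydrolyzed per monomer per rotation are illustrative assumptions, not part of the original analysis.

```python
# Illustrative arithmetic for the facilitated DNA rotation model.
# Constants come from the text; function names are hypothetical.

BP_PER_TURN = 18  # bp of DNA per turn of the RecA filament (from the text)


def atp_per_bp(filament_monomers: int, atp_per_monomer_per_rotation: float = 1.0) -> float:
    """ATP hydrolyzed per base pair of branch movement.

    Assumes every monomer in the filament hydrolyzes a fixed number of ATPs
    (one, in the text) for each 360-degree rotation, and that one rotation
    moves the branch by BP_PER_TURN base pairs.
    """
    atp_per_rotation = filament_monomers * atp_per_monomer_per_rotation
    return atp_per_rotation / BP_PER_TURN


def branch_rate_bp_per_min(atp_per_monomer_per_min: float) -> float:
    """Predicted branch movement in bp per minute.

    Assumes each hydrolytic event by a given monomer corresponds to one
    rotation, so rotations per minute equal ATPs per monomer per minute.
    """
    rotations_per_min = atp_per_monomer_per_min
    return rotations_per_min * BP_PER_TURN


if __name__ == "__main__":
    # 2000 monomers: 2000 / 18 is about 111, which the text rounds to ~100 ATP per bp.
    print(f"ATP per bp: {atp_per_bp(2000):.1f}")
    # 20 ATP per monomer per minute: 20 * 18 = 360 bp per minute.
    print(f"Branch rate: {branch_rate_bp_per_min(20):.0f} bp/min")
```

Written this way, the dependence on filament length is explicit: under the model's assumptions the ATP cost per base pair scales with the number of monomers in the filament, whereas the predicted branch rate depends only on the per-monomer turnover.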
D. Exchange Reactions with Four DNA Strands
ATP hydrolysis is required for the promotion of a reciprocal DNA strand exchange involving four DNA strands. This is perhaps the observation most difficult to rationalize in terms of the RecA redistribution proposal. If a RecA filament can bind only three DNA strands within its major groove, and the filament cannot promote a four-strand DNA exchange reaction without ATP hydrolysis, redistribution of the monomers within the filament is unlikely to produce a different result. The facilitated DNA rotation model provides a mechanism for the promotion of a four-strand exchange reaction. Once pairings were initiated in the single-strand gap as a three-strand reaction, as required, homologous DNA in the flanking four-strand regions would effectively be aligned as well. DNA rotation as depicted in Fig. 16 would inevitably spool one strand of the duplex region and the gapped DNA substrate out of the filament ahead of the point where a strand was being brought into the filament from the second duplex substrate. By this mechanism, only two DNA strands would occupy the interior groove of a RecA filament at any point during the four-strand portion of an exchange reaction.
V. Interaction of RecA Protein with Other Proteins
The RecA protein promotes only a few stages of recombinational processes in which over a dozen other proteins may take part. The list below is admittedly selective and focuses on proteins with a demonstrated effect on
RecA activities in vivo and/or in vitro. The discussion is also limited to proteins involved in the recombinational functions of RecA protein. A particularly complete listing and description of bacterial recombination functions appears in another review (85).
A. The Single-Strand DNA Binding Protein
Some effects of the single-strand DNA binding protein (SSB) on RecA protein binding to ssDNA have already been noted. SSB and RecA compete for binding sites on the DNA. SSB inhibits the nucleation of RecA filaments, but not extension. When RecA protein is added before or concurrently with SSB, some RecA protein will bind to ssDNA regions lacking DNA secondary structure. The SSB will melt out regions of secondary structure, then will be displaced from the DNA by RecA filament extension (210, 211). A relatively nonspecific binding to ssDNA is the only biochemical activity of SSB.
The ssb gene was first detected by virtue of the effects of gene defects on DNA replication (301). Several mutant alleles have been studied. In addition to effects on DNA replication, ssb mutants also confer a sensitivity to DNA-damaging agents such as UV light, and a reduction in recombination proficiency (302). One allele, ssb-3, confers a UV sensitivity approaching that observed for recA deletions (303). The wild-type gene encodes a protein of 177 amino acids with a combined molecular weight of 18,843 (304).
The DNA-binding properties of SSB have been reviewed in detail (305, 306). SSB generally binds to DNA as a tetramer. At least four distinct binding modes have been detected, with transitions between them occurring as a function of solution conditions. The various SSB species have binding site sizes ranging from 35 to 65 nucleotides of ssDNA. The species predominating at low salt concentrations and relatively high SSB binding densities is SSB35. The smaller site size in this DNA binding mode reflects an interaction of only two of the tetramer subunits with the DNA.
This form of SSB is characterized by a high level of cooperativity in DNA binding (305, 306), giving rise to a fairly uniform coating of SSB on the DNA (307). The DNA contour length is reduced by about 60% when complexed with SSB35 (307). At very high salt concentrations or low SSB binding densities, the SSB65 species predominates. This form is characterized by a more limited cooperativity reflecting interactions between two tetramers to form octamers, but no higher order structures (308). In the electron microscope, the octamers give rise to a “beads on a string” appearance to the SSB-DNA complexes (307). The contour length of the ssDNA is reduced about 78% when complexed to SSB65 (307). A prominent intermediate DNA binding mode is SSB56, with many properties similar to SSB65, including a beaded appearance on ssDNA when viewed in the electron microscope. Another species detected in some studies, SSB40, has not been characterized in detail. SSB,, is the predominant
species under the conditions used for most in vitro experiments with RecA protein (309), and is the species responsible for the enhancement of RecA protein filament formation (307, 310, 311). Bound RecA protein tends to be displaced by SSB under conditions favoring the formation of SSB,, (311).
The maintenance of contiguous extended RecA filaments (hydrolyzing ATP) on random sequence ssDNA requires a continuous presence of SSB (311). This could be explained by a cycle of partial RecA dissociation followed by SSB-facilitated rebinding. Alternatively, it could reflect a stabilizing interaction of SSB with RecA filaments. The former explanation is favored by the observation that other single-strand DNA-binding proteins that should not interact with RecA protein, including the gene-32 protein of bacteriophage T4 and even the yeast protein yRPA, will facilitate RecA filament formation. In addition, there is no indication that SSB is present in RecA filaments formed in the presence of SSB when visualized by electron microscopy (307, 312; R. B. Inman and M. M. Cox, unpublished results). However, it has already been noted that RecA filaments on ssDNA exhibit little exchange between free and bound RecA monomers in the presence of SSB, as might be expected in the dissociation/rebinding model. A direct and apparently specific interaction between SSB and RecA protein is suggested by changes in the intrinsic fluorescence of SSB (234) and by immunoprecipitation studies (310). In addition, the SSB that promotes the formation of RecA filaments in one reaction mixture is not immediately available to promote formation of new RecA filaments when combined with a second RecA-ssDNA reaction mixture incubated without SSB (M. M. Cox, unpublished results). The latter experiments suggest that SSB is sequestered in a weak interaction with RecA filaments.
We postulate that a relatively weak interaction between SSB and RecA filaments exists that is subject to disruption with the methods used to spread RecA protein for electron microscopy. The functional significance of such an interaction is not clear. It must be inconsequential for the formation of RecA filaments in vitro, because other single-strand binding proteins can substitute for SSB in this function. Many studies indicate that the elimination of secondary structure in the ssDNA is the unique function of SSB in presynapsis. Once RecA filaments were formed, an SSB interaction with them might have a modest effect in preventing the dissociation of RecA monomers in the interior of RecA filaments. Indeed, these filaments undergo a gradual partial disassembly when SSB is removed (311). However, the most likely role of an SSB-RecA interaction would come in the later stages of DNA strand-exchange reactions. SSB binds to the displaced single strand as it is generated and tends to prevent secondary DNA pairing reactions involving it (297). In vitro, an interaction of SSB with the RecA filament might not provide an
enhancement of SSB binding to the displaced DNA strand. However, an advantage might be realized in the more complex in vivo environment if the same RecA-SSB interaction helped sequester SSB and position it for binding to the displaced DNA strand. Over a billion years of coevolution has provided ample opportunity for the development of both weak and strong interactions between proteins involved in recombinational processes, to better organize events in recombinational DNA repair.
B. The RecF, RecO, and RecR Proteins
RecA protein may never act by itself as a uniform filament of RecA monomers in the cell. Evidence is accumulating that a number of proteins not only affect the assembly and disassembly of RecA filaments but also remain intimately associated with the filament and may affect its function. Primary candidates at present are the RecF, RecO, and RecR proteins. The RecO and RecR proteins have already been shown to interact with RecA filaments in a way that facilitates assembly and blocks disassembly.
The recF gene was first detected in 1973 (152) as a UV-sensitive, recombination-deficient mutant in a recBC sbcBC background. The gene is contained in an operon that also includes the dnaN gene, an arrangement that seems to suggest some link with replication. It is speculated that RecF is part of a system that diverts ssDNA from being a template for replication to being a substrate for recombination (27). The RecO and RecR proteins described below are included in this hypothesis. The sequenced recF gene encodes a 357-amino acid polypeptide (40.5 kDa). The RecF protein has been purified by at least four research groups and characterized in vitro (313-318). The protein contains a consensus nucleotide-binding fold (Walker A box). RecF protein alone has no stimulatory effect on any RecA activity tested to date. At sufficiently high concentrations, it inhibits RecA binding to DNA and RecA protein-mediated DNA strand exchange (314, 316). The RecF protein possesses a very weak ATP hydrolytic activity (kcat about 0.3 min⁻¹) (318). In vivo, recF mutant bacteria exhibit a delayed activation of the SOS response that might reflect slow formation of the RecA filaments required to facilitate LexA cleavage (319). The E. coli strains in which SSB is overexpressed exhibit a recF-like phenotype (320). Mutations in recF are suppressed by certain recA mutations, such as recA441 and recA803 (319, 321).
The RecA441 (previously tif) and RecA803 proteins exhibit an enhanced capacity to displace SSB and bind ssDNA in vitro (322, 323). This suggests some sort of in vivo interaction between RecF protein and RecA protein, although the function of RecF has not yet been elucidated. The recO gene was identified in 1985 (324). It is situated in an operon with the rnc gene, which encodes ribonuclease III, and the era gene, which encodes a GTP-binding protein with sequence similarities to the yeast RAS proteins (325).
The sequenced recO gene encodes a protein with 242 amino acids (26,000 Da) (326, 327). A predicted nucleotide-binding fold in the RecO protein (326) is not conserved among the limited number of other RecO sequences available (A. I. Roca, unpublished results). No RecO-mediated ATP hydrolysis has been detected. The purified RecO protein has been characterized in several studies (227, 316, 317, 328, 329). It binds to both ssDNA and dsDNA and behaves as a monomer in solution. It promotes an ATP-independent renaturation of complementary DNA strands (328). RecO protein also catalyzes a weak assimilation of ssDNA fragments into homologous supercoiled duplexes to form D-loops (329). The recR gene was identified in 1989 (330, 331). It is located on the E. coli chromosome near the dnaX gene and shares an operon with a small open reading frame of unknown function called orf-12. The sequenced gene encodes a 201-amino acid protein with a predicted molecular mass of 21,965 Da. A RecR sequence alignment with the two other available RecR sequences (A. I. Roca, unpublished results) does not show a canonical Walker A box, contrary to the reported existence of a nucleotide-binding fold (332). There is a highly conserved Cys cluster (CXXCX~CXXC), which may chelate zinc as in other proteins (332a). Zinc fingers are required for DNA damage recognition by UvrA (333). The RecR protein migrates in SDS-PAGE as a 26-kDa protein. The purified protein has been studied in concert with the RecF and RecO proteins (227, 316-328). The RecR protein from Bacillus subtilis has also been purified (332). The B. subtilis RecR protein binds to dsDNA, and the binding is stimulated by DNA damage, ATP, and divalent cations. The B. subtilis and E. coli genes exhibit a 43% sequence identity.
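As a rough cross-check on the polypeptide sizes quoted above, a molecular mass can be estimated from the residue count alone. The sketch below is illustrative only: it assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the studies cited here, and the function and variable names are hypothetical. The residue counts and reported masses are those given in the text for RecF, RecO, and RecR.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This value is an assumption for illustration, not a figure from the text.
AVG_RESIDUE_MASS_DA = 110.0

# (residue count, reported mass in kDa) as quoted in the text above.
REPORTED = {
    "RecF": (357, 40.5),
    "RecO": (242, 26.0),
    "RecR": (201, 21.965),
}


def estimate_mass_kda(n_residues, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Return a rough molecular mass in kDa for a polypeptide of n_residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


def percent_difference(estimate, reported):
    """Signed difference of the estimate from the reported value, in percent."""
    return 100.0 * (estimate - reported) / reported


if __name__ == "__main__":
    for name, (n, reported) in REPORTED.items():
        est = estimate_mass_kda(n)
        diff = percent_difference(est, reported)
        print(f"{name}: {n} aa, estimated {est:.1f} kDa, "
              f"reported {reported:.1f} kDa ({diff:+.1f}%)")
```

Under this assumption the three estimates fall within a few percent of the sequence-derived masses, whereas the 26-kDa apparent mass of RecR in SDS-PAGE lies well above both its estimate and its predicted mass.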
Several lines of evidence indicate that the RecF, RecO, and RecR proteins function at the same stage of recombination, possibly as part of a single complex. The phenotypes of mutations in the three genes are very similar, defining them as an epistatic group (27, 334). Mutations in all three genes are suppressed by recA441 and recA803 mutations (335). In addition, a gene in bacteriophage λ, called orf, has been identified that can replace recF, recO, and recR functions in lambda recombination (336). An interaction has been detected between the RecF and RecR proteins in vitro. In the presence of ATP, RecF protein binds to dsDNA very weakly, and RecR protein does not bind to dsDNA by itself at all. When both proteins are added to a reaction together, RecF protein binds to dsDNA much more readily and appears to coat the DNA if enough protein is present (318). RecR protein is present in these complexes. The ATPase activity of RecF protein is also stimulated two- to threefold by RecR protein under some conditions (318). The functional significance of these reactions has not yet been worked out. Alone, RecF binds preferentially to gapped DNA (336a).
The RecF, RecO, and RecR proteins have been examined together in vitro in two studies (316, 317). The RecO and RecR proteins, either with or without RecF protein, stimulated RecA protein binding to ssDNA coated with SSB. None of the proteins was active on its own. Although RecF protein was not required for the stimulatory effect, the presence of RecO and RecR seemed to nullify the inhibitory effect seen when RecF was used alone. RecO and RecR proteins remain bound to the DNA after formation of the RecA filament. The resulting filament has only 70% of the RecA of a normal filament, but it can still promote DNA strand exchange (317). Significant amounts of RecO, RecR, and SSB remained associated with RecA nucleoprotein filaments formed on ssDNA (317). The effects of RecO and RecR proteins do not end once the RecA filament is assembled. As already noted, the RecO and RecR proteins completely block the end-dependent disassembly of RecA filament from ssDNA (227). The proteins may remain bound as part of a joint filament with the RecA protein. Both proteins are required in amounts equivalent to about 1 monomer of each for every 50 monomers of RecA protein (227). This may reflect a kind of regulatory function for RecO and RecR. At a stalled replication fork where recombinational DNA repair is required, the ssDNA will generally be bound with SSB. The RecO and RecR proteins greatly facilitate the binding of RecA to an SSB-bound DNA substrate, and they prevent disassembly once bound. This may give the filament a sufficient lifetime to carry out an efficient DNA strand-exchange reaction to effect repair (see Fig. 1). The filament must eventually be disassembled and the replication fork restarted.
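The stoichiometry described above can be made concrete with a small calculation. This is only a sketch under stated assumptions: it takes a binding-site size of about 3 nucleotides per RecA monomer (a commonly cited figure, not stated in this section), applies the ratio of about 1 RecO and 1 RecR per 50 RecA monomers (227), and uses a hypothetical function name and an arbitrary 5386-nucleotide ssDNA as the example substrate.

```python
import math

NT_PER_RECA = 3          # assumption: ~3 nucleotides of ssDNA covered per RecA monomer
RECA_PER_MEDIATOR = 50   # from the text: ~1 RecO and 1 RecR per 50 RecA monomers (227)


def filament_stoichiometry(ssdna_nt):
    """Approximate protein counts for a RecA filament saturating ssdna_nt nucleotides."""
    if ssdna_nt < NT_PER_RECA:
        raise ValueError("substrate too short to bind a RecA monomer")
    reca = ssdna_nt // NT_PER_RECA                   # whole monomers that fit on the ssDNA
    mediator = math.ceil(reca / RECA_PER_MEDIATOR)   # round up to whole RecO/RecR monomers
    return {"RecA": reca, "RecO": mediator, "RecR": mediator}


if __name__ == "__main__":
    # 5386 nt is an arbitrary example length chosen for illustration.
    print(filament_stoichiometry(5386))
```

The point of the sketch is the scale: on this assumption a filament of nearly two thousand RecA monomers would need only a few dozen RecO and RecR monomers, which fits the suggestion that these proteins act in a regulatory rather than a structural capacity.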
Recent genetic studies indicating overlapping functions for the PriA protein (involved in the assembly of the replication primosome) and RecF protein are consistent with a role for RecF in the reassembly of an active replication fork (337). Additional work on priA has served to highlight the important role of DNA replication in many recombinational processes (338). A DNA repair perspective may be especially useful in constructing hypotheses for the function of the RecF, RecO, and RecR proteins that can be addressed experimentally. Mutations of the recF, recO, and recR genes exhibit a substantial defect in recombination only when they appear in recBC sbc genetic backgrounds (27, 85, 217). Single mutations in some of these genes do affect plasmid recombination, but nearly normal levels of conjugational DNA recombination can be achieved without them. However, cells that contain mutations in any of these genes in an otherwise wild-type background are sensitive to DNA-damaging agents such as ionizing radiation,
mitomycin C, or UV light (152, 331, 339-343). All of these gene products appear to have a fundamental role in DNA repair.
C. The RuvA and RuvB Proteins
Some mechanism must exist for the disassembly of RecA filaments in vivo, especially if they are stabilized by the RecO and RecR proteins. RecA filaments may be displaced from the DNA by the combined action of two other proteins, the RuvA and RuvB proteins. The first ruv mutants were isolated in 1974 (344). Two separate genes, ruvA and ruvB, were subsequently found at the ruv locus in a single, LexA-regulated operon (345). The ruvC gene is nearby. The ruvA gene encodes a 203-amino acid polypeptide with a molecular mass of 22 kDa. The protein migrates in SDS-PAGE at the position expected for a protein of 27 kDa. The ruvB gene encodes a 306-amino acid polypeptide of 37 kDa; the sequence includes a consensus nucleotide-binding fold. As with the RuvA protein, RuvB migrates in SDS-PAGE at a position expected for a slightly larger protein (41 kDa). The fact that these proteins are induced during SOS provides another suggestive link with DNA repair. The RuvA protein structure has been determined (345a). The RuvA and RuvB proteins were first purified by the Shinagawa and West groups (346, 347), and some of their in vitro activities have been described. The proteins act together to process branched DNA strand-exchange intermediates created by the RecA protein. RuvA protein binds to DNA, binding most tightly to Holliday intermediates and cruciform structures (348). One proposed role is that of a molecular matchmaker (348, 349), targeting the RuvB protein to the DNA. RuvB protein exhibits a DNA-dependent ATPase activity and catalyzes the branch migration of Holliday intermediates (346, 348, 350, 351). Both activities are stimulated by RuvA protein. The requirement for RuvB protein decreases when RuvA protein is present, and the optimal conditions for ATP hydrolysis or branch migration change.
Reported rates of ATP hydrolysis are greater when RuvB is bound to circular dsDNA, suggesting that hydrolysis is coupled to a processive movement along the DNA (352, 353). RuvB protein binds weakly to ssDNA or dsDNA, but the binding can readily be measured in the presence of high (> 10 mM) levels of Mg2+. RuvB protein is also a helicase (351, 354), although detectable helicase activity is observed only in the presence of RuvA protein. RuvB protein moves 5' to 3' along one DNA strand that is partially single stranded, displacing a short complementary strand. Complementary strands nearly 600 nucleotides long can be displaced, although the efficiency falls off more or less linearly as the length of the strand increases from 100 to 600 nucleotides (354). No unwinding of fully duplex DNA has been observed, even for duplexes as short
as 50 bp. The helicase is thought to have an unspecified mechanistic role in the branch migration activity. The helicase activity is enhanced at DNA branch points (351). Under some conditions, the RuvB protein binds to dsDNA as a dodecamer (440 kDa), forming back-to-back hexameric rings with D6 symmetry (355). The DNA takes a path through the center of the fused rings, much like the situation with the β subunit of DNA polymerase III (356). The openings at either end of this donutlike structure have a diameter of about 20 Å. More recent results indicate that the active form of RuvB protein is a hexamer (352). The functional structure may be similar to that of some other helicases, such as the SV40 T-antigen (357, 358), the rho transcriptional terminator (359-364), and the DnaB protein (356). An analysis of the RuvB ATPase activity indicates the presence of nonequivalent ATP hydrolytic sites within a RuvB hexamer (353). ATP hydrolysis in individual subunits is likely to be coupled to conformation changes that modulate the interaction of each subunit with DNA so as to effect movement along the DNA. Overall, the RuvA protein is thought to bind to branched recombination intermediates and facilitate the binding of RuvB. The RuvB protein (or a RuvAB complex) then promotes a branch migration reaction that has been advanced as the major RuvAB function in vivo. EM observations and other work have led West and colleagues to propose a model in which a RuvA protein is bound to a Holliday junction, sandwiched between two RuvB hexamers (366, 367). In the model, two branches of the Holliday structure are propelled outward through the hexameric RuvB rings to promote branch migration (366, 367). These proteins may also modulate in some way the binding and cleavage of Holliday intermediates by RuvC protein (348, 368). When added to RecA protein-mediated DNA strand-exchange reactions, the RuvA and RuvB proteins greatly enhance the bypass of heterologous insertions in the duplex DNA substrate.
The combined action of these proteins permits significant bypass of inserts that are 1 kbp in length (369). When barriers that cannot be bypassed are encountered, the RuvA and RuvB proteins promote an efficient reversal of the DNA strand-exchange reaction carried out by RecA protein (370). Based on in vitro results, RuvA and RuvB could act as antirecombinases under some conditions, reversing potentially deleterious intrachromosomal recombinational events (370). Although the existence of a protective antirecombination pathway has been postulated (374), the idea that RuvA and RuvB might be involved has not been tested in vivo. It has been directly demonstrated that the RuvA and RuvB proteins will displace RecA filaments from dsDNA under some conditions (372). This may be a general pathway for RecA filament disassembly in the cell, but many questions remain about where RecA protein leaves off and RuvA and RuvB take over in a recombinational process. The issue is complicated by the presence of another protein in E. coli, RecG protein, with functions that at least partially overlap those of RuvA and RuvB (368, 373, 374).
D. Other Proteins Affecting RecA Activities
A number of other proteins can affect RecA protein-mediated DNA strand-exchange reactions. Kowalczykowski and colleagues have reconstituted a DNA pairing reaction that is dependent on the concerted action of the RecBCD enzyme, SSB, and RecA proteins (213, 214, 375, 376). The reaction is also dependent on the recombinational hotspot sequence, chi. The system provides an important model for the presumed early steps in the double-strand break recombinational repair pathway. RecA function in vitro is also affected by some nucleases. RecJ is a 63-kDa protein with a 5' to 3' exonuclease activity (377, 378). It is involved in the RecF and RecE pathways for genetic recombination (379-381). Under some conditions, the RecJ exonuclease enhances the RecA protein-mediated strand exchange by degrading the displaced DNA strand (382). Exonuclease I is a 3' to 5' exonuclease specific for ssDNA (383). Some mutant alleles of the gene encoding exonuclease I (sbcB) suppress the recombination-deficient phenotype of recBC mutations (384). In concert with the RecA protein, exonuclease I will facilitate a DNA strand-exchange reaction with a polarity opposite to that normally promoted by RecA protein (54). As with RecJ, the reaction is accompanied by digestion of the displaced DNA strand. Exonuclease I copurifies with RecA protein through a wide range of chromatographic resins, suggesting a rather strong association of the two proteins (54). However, there has been no demonstration that the interaction is specific, and no function has been suggested or demonstrated.
VI. Other Functions of RecA Protein in Vivo
The role of RecA protein in DNA metabolism is remarkably diverse. In its various capacities, RecA protein exhibits enzymatic activities not described above and interacts with a number of proteins not linked with recombination. These other functions are summarized briefly below.
A. The RecA Coprotease Function
In the early 1960s and early 1970s, a series of observations had suggested the existence of an elaborate pathway for DNA repair and mutagenesis that was induced by DNA damage (385, 386). The observations and ideas were unified under the SOS hypothesis in 1974 by Radman (387). The first in vitro activity associated with RecA protein was the cleavage of the bacteriophage λ repressor as a biochemical step in the induction of the bacterial SOS response (388). Later, it became clear that the LexA repressor was inactivated by RecA-mediated proteolytic cleavage as well (389). Repressor cleavage was also the first functional assay used to guide the purification of RecA protein (388). The SOS induction pathway has been worked out in some detail (390-393). High levels of DNA damage cause an interruption of replication. The single-stranded DNA created as one result of the interruption is the primary signal for SOS induction (47). RecA protein binds to the ssDNA, perhaps aided by the RecF, RecO, and RecR proteins. The bound RecA protein assumes the extended filament conformation in the presence of ATP, and this filament form facilitates an autocatalytic cleavage of the LexA repressor, the bacteriophage λ repressor, and a few other proteins. The result is a general induction of over 20 genes repressed by LexA and induction of any bacteriophage λ lysogens present. The induced gene products include a variety of enzymes with functions in DNA repair, replication, recombination, and a specialized pathway for mutagenesis. RecA protein is induced as an SOS function. Facilitating the autocatalytic cleavage of LexA protein involves the binding of LexA deep within the groove of a RecA filament (394). The binding site generally spans two adjacent RecA monomers (394). The form of RecA protein bound on ssDNA has often been referred to as an activated species, or RecA* (390-393). It is likely that this RecA species is identical to the extended RecA filament involved in recombinational processes (394). The use of ssDNA as the molecular signal for SOS induction may be a simple manifestation of the fundamental DNA binding properties of RecA protein. The propensity of RecA to bind ssDNA almost exclusively under physiological conditions serves to target RecA protein to ssDNA gaps created by the interruption of replication.
There is evidence that RecA protein mutants that bind more readily to dsDNA exhibit a phenotype in which SOS is constitutively induced (131, 394), presumably because RecA protein is constantly and nonselectively bound to DNA as an extended filament. Binding of LexA in the groove of a RecA filament blocks DNA strand exchange by excluding the dsDNA substrate (39547).
B. SOS Mutagenesis
Heavy DNA damage leads to the induction of a specialized system for mutagenic DNA lesion bypass as a part of the SOS system (392, 396-398). The existence of the system can be rationalized as a pathway permitting replication restart in an environment where replication would otherwise be impossible (398). The deleterious mutagenic effects of replicative bypass of lesions would be balanced by the survival of a few cells that would otherwise perish. The pathway may also permit resistance to DNA lesions for which no effective repair pathways exist (393, 398). RecA protein has at least three roles
in SOS mutagenesis. First, some of the required activities are induced as part of the SOS response. Second, at least one of the required proteins is proteolytically processed in a reaction facilitated by RecA protein. Finally, RecA protein participates directly in the lesion bypass. Replicative lesion bypass in SOS mutagenesis requires the combined activities of several proteins, including the UmuC, UmuD', and RecA proteins, along with DNA polymerase III (398, 399). UmuD' is a processed form of the UmuD protein brought about by a RecA-facilitated proteolytic cleavage (400-402). Several studies have provided evidence that RecA protein has a direct role in this process (403-409), and an in vitro replication bypass requiring all of these proteins has been demonstrated (399). However, the precise molecular role of RecA in this process remains obscure.
C. Chromosome Partitioning
The abnormal chromosome complement in cells lacking recA function has led to a proposal that RecA protein has a role in the proper partitioning of chromosomes at cell division (406). Part of the role of RecA in partitioning may be indirect, mediated by the protection of chromosomes from the nuclease digestion that occurs in the absence of RecA (407). RecA could also be required to produce in concatemers of daughter chromosomes the tension needed for partitioning (406), mirroring the need for genetic recombination for the proper partitioning of chromosomes in cell division in eukaryotes.
D. Induced Stable DNA Replication
DNA damage brings about elevated levels of DNA replication that are not dependent on initiation at the bacterial replication origin, oriC. This phenomenon is called induced stable DNA replication, or iSDR (338, 408-410). The replication is closely linked to recombination functions and recombinational DNA repair (338, 408-410). To some extent, it may reflect replication that must accompany many of the kinds of recombinational DNA repair processes outlined in Fig. 1. It has been suggested that single-strand 3' ends involved in an early stage of recombinational processes might, after being paired with a new complementary strand, act as primers for this DNA synthesis (411).
VII. Epilogue: Relating RecA Biochemistry to DNA Repair
Many aspects of RecA enzymology remain controversial. The DNA pairing pathway is still in need of molecular definition, and the role of ATP hydrolysis continues to be debated. Another key challenge for the future is the
elucidation of the collaborative roles of the various recombination proteins. In particular, a better understanding is needed of where one activity leaves off and another takes over. These issues are interrelated. For example, it can be argued that a hybrid DNA extension reaction coupled to ATP hydrolysis would in any case be of limited physiological significance (85, 348, 412). The RuvA and RuvB proteins promote the extension of hybrid DNA (a DNA branch migration) significantly faster than RecA protein, and they promote the bypass of long heterologous insertions that cannot be bypassed by RecA alone (369). There seems little need for RecA in this capacity. At a minimum, it is clear that the ATPase activity of RecA protein is biologically important, in spite of the existence of a range of proteins that promote DNA strand exchange without ATP (5). The ATP binding site in RecA protein is the most conserved part of the protein structure. Also, the RecA K72R mutant protein, which binds but does not hydrolyze ATP and promotes a significant strand-exchange reaction in vitro, produces a recA null phenotype in vivo (113; R. Devoret, personal communication). We have postulated that RecA-mediated ATP hydrolysis provides a molecular motor that augments an intrinsic ATP-independent DNA strand-exchange process, rendering it unidirectional and capable of barrier bypass in the context of efficient recombinational DNA repair. The prediction arises that it should be possible to find recA mutants that confer a repair-deficient, recombination-proficient phenotype. Such mutants would have an intact strand-exchange capability within the filament, but the motor required to direct this process past DNA lesions during recombinational DNA repair would be compromised. Searching for such mutants is complicated by the fact that RecA protein is required to induce the SOS response to DNA damage.
A repair deficiency can therefore result from a defect in the coprotease activity rather than a defect in recombinational repair. The mutant in question should therefore confer a recombination-proficient, repair-defective, and coprotease-constitutive phenotype (or the mutant should retain the repair deficiency when the SOS response is induced by other genetic devices). One mutant of this kind has been found and characterized (413). It is designated RecA423, or R169H (in α-helix F). Cells containing recA423 are nearly as proficient in conjugational recombination, transductional recombination, and recombination of λ red- gam- phage as are wild-type cells. At the same time, the mutant cells are deficient in intrachromosomal recombination and nearly as sensitive to UV irradiation as are recA deletion strains. RecA423 is coprotease constitutive, so the repair defect does not reflect a lack of SOS repair functions. In vitro, the mutant protein exhibits a reduced capacity to bind DNA, and promotes DNA strand exchange and ATP hydrolysis at much reduced rates. The strand exchange is blocked by short heterologous insertions in one of the DNA substrates (413). The results demonstrate that a RecA mutant protein with a significantly compromised DNA strand-exchange capacity can suffice completely in conjugational and transductional recombination. However, the structural and energetic requirements for recombinational DNA repair are clearly more stringent, because the mutant cells are almost completely devoid of recombinational DNA repair function. The work provides experimental support for the idea that many of the biochemical features of the RecA protein evolved to address the requirements of DNA repair. It also argues that RuvA and RuvB do not compensate for a reduced RecA strand-exchange function in DNA repair. The models for recombinational DNA repair presented in Fig. 1 are quite similar to current models drawn for homologous recombination carried out to generate genetic diversity. A focus on the mechanisms of conjugational and transductional recombination can then be viewed as a means of elucidating the mechanisms of homologous genetic recombination in all contexts. However, the biological context within which research is carried out can greatly affect the outcome of the research, and a focus on recombination as a means to generate genetic diversity has not always been fruitful. A decades-long and fruitless search for four-stranded DNA pairing intermediates in recombination was undertaken. Work on RecBCD enzyme and RecA protein proceeded at a fast pace in the late 1970s and 1980s, while many other interesting bacterial recombination functions were left in relative obscurity over this same period, perhaps due in part to their perceived involvement in secondary recombination pathways. The ATPase activity of RecA protein is still evaluated in the context of the minimal energetic requirements for homologous DNA strand exchange evident in the conjugational recombination pathway.
In thinking about the function of proteins involved in recombination, and their mechanisms of action, a recombinational DNA repair paradigm is likely to have a predictive value superior to that provided by a paradigm centered on conjugational and transductional recombination. An understanding of protein functions in recombinational DNA repair should facilitate a more rapid advance in the understanding of all recombination processes.
ACKNOWLEDGMENTS
The authors thank Randy Bryant, John Clark, Howard Gamper, Richard Michod, and John Roth for helpful discussions; and Randy Bryant, Robert Schleif, William Reznikoff, Kenji Adzuma, Mikael Kubista, Howard Gamper, Masayuki Takahashi, and Brian Cali for reading and commenting on parts of the manuscript. We appreciate the help of John Clark, Steven Sandler,
Stephen Kowalczykowski, Howard Gamper, Randy Bryant, John Roth, Dan Camerini-Otero, and Stephen West in communicating results prior to publication. The model in Fig. 17A was based on ideas communicated to MMC by Stephen Kowalczykowski during a meeting in Seillac, France, during the summer of 1994. We also thank the following people for communicating unpublished sequences: John Coleman, Peter Emmerson, Sumiko and Masayori Inouye, Skorn Mongkolsuk, Wai Mun Huang, Joanne L. Johnston, Joan Sloan, and Julian Rood. We thank A. John Clark and Steve Sandler for communication of the Archaea RadA sequence prior to publication. Irv Edelman and Mike Gribskov helped with technical aspects of the sequence analysis; Mike Hogan and Don Thomson provided computer programming assistance; and David Goodsell, Jean-Yves Sgro, Adam Steinberg, and Gary Wesenberg helped with computer graphics. Finally, we gratefully acknowledge the researchers who published RecA protein sequences and, in particular, those who verified their data for this work. This work was supported by NIH grant GM32335.
REFERENCES
1. C. M. Fraser, J. D. Gocayne, O. White, M. D. Adams, R. A. Clayton, R. D. Fleischmann, C. J. Bult, A. R. Kerlavage, G. Sutton, J. M. Kelley, et al., Science 270, 397 (1995).
2. S. Sandler, L. H. Satin, H. S. Samra and A. J. Clark, NARes 24, 2125 (1996).
2a. C. R. Woese, O. Kandler and M. L. Wheelis, PNAS 87, 4576 (1990).
3. T. Ogawa, X. Yu, A. Shinohara and E. H. Egelman, Science 259, 1896 (1993).
4. R. M. Story, D. K. Bishop, N. Kleckner and T. A. Steitz, Science 259, 1892 (1993).
5. S. C. Kowalczykowski and A. K. Eggleston, ARB 63, 991 (1994).
6. A. Campbell, CSHSQB 49, 839 (1984).
7. J. F. Crow, in “The Evolution of Sex: An Examination of Current Ideas” (R. E. Michod and B. R. Levin, eds.), p. 56. Sinauer Assoc., Sunderland, Massachusetts, 1988.
8. M. T. Ghiselin, in “The Evolution of Sex: An Examination of Current Ideas” (R. E. Michod and B. R. Levin, eds.), p. 7. Sinauer Assoc., Sunderland, Massachusetts, 1988.
9. J. Maynard Smith, “The Evolution of Sex.” Cambridge Univ. Press, London and New York, 1978.
10. J. Maynard Smith, in “The Evolution of Sex: An Examination of Current Ideas” (R. E. Michod and B. R. Levin, eds.), p. 106. Sinauer Assoc., Sunderland, Massachusetts, 1988.
11. H. J. Muller, Mutat. Res. 1, 2 (1964).
12. J. Felsenstein, Genetics 78, 737 (1974).
13. D. A. Hickey and M. R. Rose, in “The Evolution of Sex: An Examination of Current Ideas” (R. E. Michod and B. R. Levin, eds.), p. 161. Sinauer Assoc., Sunderland, Massachusetts, 1988.
14. E. C. Dougherty, Syst. Zool. 4, 145 (1955).
15. H. Bernstein, BioScience 33, 326 (1983).
16. H. Bernstein, F. A. Hopf and R. E. Michod, in “The Evolution of Sex: An Examination of Current Ideas” (R. E. Michod and B. R. Levin, eds.), p. 139. Sinauer Assoc., Sunderland, Massachusetts, 1988.
17. B. R. Levin, in “The Evolution of Sex: An Examination of Current Ideas” (R. E. Michod and B. R. Levin, eds.), p. 194. Sinauer Assoc., Sunderland, Massachusetts, 1988.
18. R. E. Michod, “Eros and Evolution: A Natural Philosophy of Sex.” Addison Wesley, Menlo Park, California, 1995.
19. N. A. Ellis, J. Groden, T.-Z. Ye, J. Straughen, D. J. Lennon, S. Ciocci, M. Proytcheva and J. German, Cell 83, 655 (1995).
20. N. Irino, K. Nakayama and H. Nakayama, MGG 205, 298 (1986).
21. K. Umezu and H. Nakayama, JMB 230, 1145 (1993).
22. C.-E. Yu, J. Oshima, Y.-H. Fu, E. Wijsman, F. Hisama, R. Alisch, S. Matthews, J. Nakura, T. Miki, S. Ouais, G. M. Martin, J. Mulligan and G. D. Schellenberg, Science 272, 258 (1996).
23. H. Bernstein, H. C. Byerly, F. A. Hopf and R. E. Michod, Science 229, 1277 (1985).
24. H. Potter and D. Dressler, in “The Recombination of Genetic Material” (K. B. Low, ed.), p. 217. Academic Press, San Diego, 1988.
25. M. M. Cox, Mol. Microbiol. 5, 1295 (1991).
26. M. M. Cox, BioEssays 15, 617 (1993).
27. A. J. Clark and S. J. Sandler, Crit. Rev. Microbiol. 20, 125 (1994).
28. R. K. Selander and B. R. Levin, Science 210, 545 (1980).
29. G. R. Smith, Cell 64, 19 (1991).
30. J. M. Smith, J. Hered. 84, 326 (1993).
31. E. M. Park, M. K. Shigenaga, P. Degan, T. S. Korn, J. W. Kitzler, C. M. Wehr, P. Kolachana and B. N. Ames, PNAS 89, 3375 (1992).
32. F. N. Capaldo, G. Ramsey and S. D. Barbour, J. Bact. 118, 242 (1974).
33. R. Holliday, Genet. Res. 5, 282 (1964).
33a. A. Kuzminov, “Recombinational Repair of DNA Damage.” R. G. Landes Co., Austin, Texas, 1996.
34. S. C. West, E. Cassuto and P. Howard-Flanders, Nature (London) 294, 659 (1981).
35. J. W. Szostak, T. L. Orr-Weaver, R. J. Rothstein and F. W. Stahl, Cell 33, 25 (1983).
36. L. S. Morse and C. Pauling, PNAS 72, 4645 (1975).
37. A. G. Miguel and R. M. Tyrrell, Biophys. J. 49, 485 (1986).
38. T. C. Wang and K. C. Smith, J. Bact. 165, 1023 (1986).
39. M. J. Daly, L. Ouyang, P. Fuchs and K. W. Minton, J. Bact. 176, 3508 (1994).
40. K. W. Minton, Mol. Microbiol. 13, 9 (1994).
41. K. W. Minton and M. J. Daly, BioEssays 17, 457 (1995).
42. V. Mattimore and J. R. Battista, J. Bact. 178, 633 (1996).
43. W. Y. Feng, E. H. Lee and J. B. Hays, Genetics 129, 1007 (1991).
44. J. B. Hays and J. G. Hays, Biopolymers 31, 1565 (1991).
45. W. Y. Feng and J. B. Hays, Genetics 140, 1175 (1995).
46. D. Touati, M. Jacques, B. Tardat, L. Bouchard and S. Despied, J. Bact. 177, 2305 (1995).
46a. T. Galitski, Ph.D. Thesis, University of Utah, 1996.
47. N. Higashitani, A. Higashitani, A. Roth and K. Horiuchi, J. Bact. 174, 1612 (1992).
48. M. M. Cox, K. McEntee and I. R. Lehman, JBC 256, 4676 (1981).
49. T. Shibata, R. P. Cunningham and C. M. Radding, JBC 256, 7557 (1981).
50. S. M. Cotterill, A. C. Satterthwait and A. R. Fersht, Bchem 21, 4332 (1982).
51. J. Griffith and C. G. Shores, Bchem 24, 158 (1985).
52. S. Kuramitsu, K. Hamaguchi, T. Ogawa and H. Ogawa, J. Biochem. (Tokyo) 90, 1033 (1981).
53. T. Ogawa, H. Wabiko, T. Tsurimoto, T. Horii, H. Masukata and H. Ogawa, CSHSQB 2, 909 (1979).
54. W. A. Bedale, R. B. Inman and M. M. Cox, JBC 268, 15004 (1993).
55. T. Simonson, M. Kubista, R. Sjoback, H. Ryberg and M. Takahashi, J. Mol. Recog. 7, 199 (1994).
56. L. J. Gudas and D. W. Mount, PNAS 74, 5280 (1977).
57. J. H. Krueger and G. C. Walker, PNAS 81, 1499 (1984).
58. N. Garvey, A. C. St. John and E. M. Witkin, J. Bact. 163, 870 (1985).
59. C. Cazaux, F. Larminat, G. Villani, N. P. Johnson, M. Schnarr and M. Defais, JBC 269, 8246 (1994).
60. S. L. Brenner, A. Zlotnick and J. D. Griffith, JMB 204, 959 (1988).
ReCA PROTEIN
215
61. M. Takahashi,JBC 264,288 (1989). 62. D. H. Wilson and A. S. Benight, JBC 265,7351 (1990). 63. S. L. Brenner, A. Zlotnick and W. F. Stafford,JMB 216,949 (1990). 64. R. M. Story, I. T. Weber and T. A. Steitz, Nature (London)355,318 (1992). 65. R. M. Story and T. A. Steitz, Nature (London) 355,374 (1992). 66. X. Yu and E. H. Egelman,JMB 227,334 (1992). 67. E. H. Egelman and A. Stasiak, Micron 24,309 (1993). 68. E. H. Egelman, Current Opin. Struct. Biol. 3,189 (1993). 68a. C. Ellouze, M. Takahashi, P. Wittung, K. Mortensen, M. Schnarr and B. Norden, EJB 233, 579 (1995) 69. M. Kimura and T. Ohta, PNAS 71,2848 (1974). 70. A. I. Roca and M. M. Cox, CRC Crit. Rev. Bwchem. Molec. Bid. 25,415 (1990). 71. J. E. Walker, M. Saraste, M. J. Runswick and N. J. Gay, EMBOJ 1 9 4 5 (1982). 72. E. F. Pai, U. Krengel, G. A. Petsko, R. S. Goody, W. Kabsch and A. Wittinghofer, EMBOJ 9,2351 (1990). 73. T. F. M. la Cour, J. Nyborg, S. Thinip and B. F. C. Clark, EMBOJ 4,2385 (1995). 74. J. A. Eisen,]. Mol. Euol. 41, 1105 (1995). 75. S. Karlin, G. M. Weinstock and V. Brendel,]. Bact. 177, 6881 (1995). 76. P. Sung, Science 265,1241 (1994). 77. D. K. Bishop, D. Park, L. Xu and N. Kleckner, Cell 69,439 (1992). 78. B. P. Rubin, D. 0. Ferguson and W. K. Holloman, McBioZ 14,6287 (1994). 79. T. Flores, C. A. Orengo and D. Moss, Protein Sci. 2,1811 (1993). 80. X. Yu, E. Angov, R.D. Camerin-Otero and E. Egelman, Biophys. J. 69,2728 (1995). 81. B. Honig and A. Nicholls, Science 268,1144 (1995). 82. M. Jonsson, U. Jacobsson, M. Takahashi and B. NordBn, JCS Faraduy Trans. 89, 2791 (1993). 83. T. Mikawa, R. Masui, T. Ogawa, H. Ogawa and S. Kuramitsu,JMB 250,471 (1995). 84. S. C. Kowalczykowski,Biochimie 73,289 (1991). 85. S. C. Kowalczykowski,D. A. Dixon, A. K. Eggleston, S. D. Lauder and W. M. Rehrauer, Microbiol. Rev. 58,401 (1994). 86. M. Saraste, P. R. Sibbald and A. Wittinghofer, Trends Biochem. Sci. 15,430 (1990). 87. A. Bairoch, NARes 19, 2241 (1991). 88. S. K. Liu, J. A. Eisen, P. C. 
Hanawalt and I. Tessman,J. Bact. 175,6518 (1993). 89. F. B. Perler, E. 0.Davis, G. E. Dean, F. S. Gimble, W. E. Jack, N. Neff, C. J. Noren, J. Thorner and M. Belfort, NARes 22, 1125 (1994). 90. E. 0.Davis, P. J. Jenner, P. C. Brooks, M. J. Colston and S. C. Sedgwick, Celt 71,201 (1992). 91. E. 0.Davis, H. S. Thangaraj, P. C. Brooks and M. J. Colston, EMBOJ 13, 699 (1994). 92. R. A. Kumar, M. B. Vaze, N. R. Chandra, M. Vijayan and K. Muniyappa, Bchem 35,1793 (1996). 93. W. Selbitschka,W. Arnold, U. B. Priefer, T. Rottschafer, M. Schmidt, R. Simon and A. Piihler, MGG 229, 86 (1991). 94. A. Shinohara, H. Ogawa and T. Ogawa, Ce2Z 69,457 (1992). 95. M. Gribskov, A. D. McLachlan and D. Eisenberg, PNAS 84,4355 (1987). 96. S. F. Altschul, W. Gish, W. Miller, E. W. Myers and D. J. Lipman, JMB 215,403 (1990). 97. J. Griffith and T. Formosa,JBC 260,4484 (1985). 98. X. Yu and E. H. Egelman,JMB 232, 1 (1993). 99. T. Formosa and B. M. Alberts,JBC 26l, 6107 (1986). 100. T. Yonesaki and T. Minagawa, EMBOJ 4,3321 (1985). 101. T. Kodadek, M. L. Wong and B. M. Alberts,JBC 263,9427 (1988). 102. H. Cerutti, M. Osman, P. Grandoni and A. T. Jagendorf, PNAS 89,8068 (1992).
216
ALBERT0 I. ROCA AND M I C m L M. COX
103. R. Cedergren, M. W. Gray, Y. Abel and D. Sankoff,,]. MoZ. Evol. 28,98 (1988). 104. R. Cheng, T. I. Baker, C. E. Cords and R. J. Radloff, Mutat. Res. 294,223 (1993). 105. J. C. Game, in “Yeast Genetics: Fundamental and Applied Aspects” (J. F. T. Spencer, D. Spencer and A. R. W. Smith, eds.), p. 109. Springer Publ., New York, 1983. 106. M. A. Resnick, in “Meiosis” (€? B. Moens, ed.), p. 157. Academic Press, New York, 1987. 107. A. J. Rattray and L. S. Symington, Genetics 139,45 (1995). 108. D. K. Bishop, Cell 79, 1081 (1994). 109. A. F. Neuwald, D. E. Berg and G. V. Stauffer, Gene 120, 1 (1992). 110. S. J. Sander, L. H. Satin, H. S. Samra and A. J. Clark, NARes 24,2 125 (1996). 111. C. R. Woese, 0. Kandler and M. L. Whellis, PNAS 87,4576 (1990). 112. M. Gribskov, in “Computer Analysis of Sequence Data, Part 11” (A. M. Griffin and H. G. Griffin, eds.),p. 247. Humana Press, Totowa, New Jersey, 1994. 113. J. T. Konola, K. M. Logan and K. L. Knight,JMB 237,20 (1994). 114. W. M. Rehrauer and S. C. Kowalczykowski,JBC 268,1292 (1993). 115. Q. Shan, M. M. Cox and R. B. Inman,JBC27l, 5712 (1996). 116. J. T. Konola, H. G . Nastri, K. M. Logan and K. L. Knight, JBC 270,8411 (1995). 117. M. Takahashi, M. Kubista and B. NordBn, Biochimie 73,219 (1991). 118. M. Takahashi and B. Nordh, A h . Biophys. 30, 1 (1994). 119. M. Kubista, T. Sirnonson, R. Sjoback, H. Widlund and A. Johansson, eds., “Towardsan Understanding of the Mechanism of DNA Strand Exchange Promoted by RecA Protein. Biological Structure and Function.” Proceedings of the Ninth Conversation, The State University of New York (R. H. Sarma and M. H. Sarmq eds.). Adenine Press, New York, 1995. 120. H. Kwumizaka, B. J. Rao, T. Ogawa, C. M. Radding and T. Shibata,NARes 22,3387 (1994). 121. M. M. Cox,JBC 270,26021 (1995). 122. H. G. Nasbi and K. L. Knight,JBC 269,26311 (1994). 123. R. V. Gardner, 0. N. Voloshin and R. D. Camerini-Otero,EJB 233,419 (1995). 124. V. A. Malkov and R. D. Camerini-Otero,JBC 270,30230 (1995). 
125. 0.N. Voloshin, L. Wang and R. D. Camerini-Otero, Science 272,868 (1996). 126. Y. Wang and K. Adzuma, Bchem 35,3563 (1996). 127. K. Morimatsu and T. Horii, EJB 228,772 (1995). 128. W. M. Rehrauer and S. C. Kowalczykowski,JBC 2 7 1 11996 (1996). 129. A. Zlotnick and S. L. Brenner,JMB 209,447 (1989). 230. R. C. Benedict and S. C. Kowalczykowski,JBC 263,15513 (1988). 131. S. Tateishi, T. Horii, T. Ogawa and H. Ogawa, JMB 223,115 (1992). 132. B. V. Prasad and W. Chiu,JMB 193,579 (1987). 133. E. A. de Jong, D. J. van, B. J. Harmsen, G . I. Tesser, R. N. Konings and C. W. Hilbers,JMB 206,133 (1989). 134. A. Stasiak, E. Di Capua and T. Koller, JMB 151 557 (1981). 135. A. Stasiak and E. Di Capua, Nature (London) 299,185 (1982). 136. B. F. Pugh, B. C. Schutte and M. M. Cox,JMB 205,487 (1989). 137. D. F. Dombroski, D. C. Scraba, R. D. Bradley and A. R. Morgan, NARes 11,7487 (1983). 138. R. J. Thresher and J. D. Griffith, PNAS 87,5056 (1990). 139. S. K. Kim, B. Norden and M. Takahashi,JBC 268,14799 (1993). 140. R. V. Prigodich, J. Casas-Finet,K. R. Williams, W. Konigsberg and J. E. Coleman, Bchem 23,522 (1984). 141. M. I. Khamis, J. R. Casas-Finetand A. H. Maki, BBA 950,132 (1988). 142. S. Eriksson, B. NordBn and M. Takahashi,JBC 268,1805 (1993). 143. F. Maraboeuf, 0.Voloshin, 0. R. Camerini and M. Takahashi,JBC 270,30927 (1995). 144. T ‘I:Nguyen, K. A. Muench and F. R. Bryant,JBC 268,3107 (1993). 145. M. C. Skiba and K. L. Knight,JBC 269,3823 (1994).
ReCA PROTEIN
217
146. P. Howard-Flanders and L. Theiiot, Genetics 53, 1137 (1966). 147. N. S. Willetts and A. J. Clark, J. Bact. 100,231 (1969). 148. R. G. Lloyd and B. Low, Genetics 84,675 (1976). 149. L. N. Csonka and A. J. Clark, Genetics 93,321 (1979). 150, M. Sassanfar and J. W. Roberts,JMB 212,79 (1990). 151. A. J. Clark, J. Cell. Physiol. 70,165 (1967). 152. Z. Horii and A. J. Clark,JMB 80,327 (1973). 153. A. Bagg, C. J. Kenyon and G. C. Walker, PNAS 78,5749 (1981). 154. J. M. Weisemann, C. Funk and G. M. Weinstock,J. But. 160, 112 (1984). 155. D. G. Ennis, N. Ossanna and D. W. Mount,]. Bact. 171 2533 (1989). 156. S. D. Lauder and S. C. Kowalczykowski,JMB 234,72 (1993). 157. J. G. Webnur, D. M. Wong, B. Ortiz,J . Tong, F. Reichert andD. H. Gelfand,JBC269,25928 (1994). 158. K. M. Logan and K. L. Knight,JMB 232,1048 (1993). 159. K. Morimatsu, T. Horii and M. Takahashi, EJB 228,779 (1995). 160. K. L. Knight and K. McEntee,JBC 260,867 (1985). 161, S. Eriksson, B. Nordh, K. Morimabu, T. Ilorii and M. Takahashi,JBC 268, 1811 (1993). 162. E. Stole and F. R. Bryant,JBC 269, 7919 (1994). 163. W. P. Diver, N. J. Sargentini and K. C. Smith, k t . J. E d . BioZ. Ad. Stud. Physics. Chem. Med. 42,339 (1982). 163a. Y. Song and N. J. Sargentini,J. Bact. 178,5045 (1996). 163h. E. V. Koonin, R. L. Tatusov and K. E. Rudd, in “Escherichiu coli and Salmonella (F. C. Neidhardt, ed.), p. 2203. ASM Press, Washington, D.C., 1996. 164. M. M. Cox and 1. R. Lehman, ARB 56,229 (1987). 165. C. M. Radding, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith,eds.),p. 193. American Society for Microbiology,Washington, D.C., 1988. 166. A. Stasiak and E. H. Egelman, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.),p. 265. American Society for Microbiology,Washington, D.C., 1988. 167. J. Heuser and J. Criffith,JMB 210,473 (1989). 168. E. H. Egelman and A. Stasiak,JMB 191 677 (1986). 169. E. Di Capua, A. Engel, A. Stasiak and T. Koller,JMB 157,87 (1982). 170. B. Nordh, C. Elvingson,M. 
Kubista, B. Sjoberg, H. Ryberg, M. Ryberg, K. Mortensen and M. Takahashi,JMB 226, 1175 (1992). 171. M. C. Leahy and C. M. Radding, JBC 261,6954 (1986). 172. E. Di Capua and B. Miiller, EMBOJ. 6,2493 (1987). 173. E. H. Egelman and X. Yu, Science 245,404 (1989). 174. K. McEntee, G. M. Weinstock and 1. R. LehmanJBC 256,8835 (1981). 175. M. Amaratunga and A. S. Benight, BBRC 157,127 (1988). 1 7 5 ~ C. . Cazenave, M. Chabbert and J. J. Toulme, BBA 781,7 (1984). 176. S. L. Brenner, R. S. Mitchell, S. W. Momcal, S. K. Neuendorf, B. C. Schutte and M. M. Cox, JBC 262,4011 (1987). 177. J. P. Menetslo and S. C. Kowalczykowski,JBC262,2085 (1987). 178. J. P. Menetski and S. C. Kowalczykowski,Bchein 28,5871 (1989). 179. R. S. Mitchell, A. Zlotnick and S. L. Brenner, Biophys. J. 53, 220 (1988). 180. S. D. Lauder and S.C. Kowalczykowski,JBC 266,5450 (1991). 181. A. Zlotnick, R. S. Mitchell, R. K. Steed and S. L. Brenner,JBC 268,22525 (1993). 182. E. H. Egelman and A. Stasiak,JMB 200,329 (1988). 183. A. Stasiak, E. H. Egelman and P. Howard-Flanders,JMB202,659 (1988). 184. B. F. Pugh and M. M. Cox, JBC 262,1326 (1987). 185. B. F. Pugh and M. M. Cox,JMB 203,479 (1988).
218
ALBERT0 I. ROCA AND MICHAEL M. COX
186. S. C. Kowalczykowski,J. Clow and R. A. Krupp, PNAS 84,3127 (1987). 187. C. Lu, R. H. Scheuermann and H. Echols, PNAS 83,619 (1986). 188. M. Kojima, M. Suzuki, T. Morita, T. Ogawa, H. Ogawa and M. Tada, NARes 18,2707 (1990). 189. Y. H. Wang, C. D. Bortner and J. Griffith,JBC 268,17571(1993). 189a. S. K. Kim, M. Takahashi, B. Jernstrom and B. Nordkn, Carcinogenesis 14,311 (1993). 190. J. A. Blaho and R. D. Wells, JBC 262,6082 (1987). 191. J. A. Blaho and R. D. Wells, This Series 37,107 (1989). 192. J. Kim, J. Heuser and M. M. Cox,JBC 264,21848 (1989). 193. P. Cluzel, A. Lebrun, C. Heller, R. Lavery, J.-L. Viovy, D. Chatenay and F. Caron, Science 271, 792 (1996). 194. S. B. Smith, Y. Cui and C. Bustamante, Science 271,795 (1996). 195. M. Takahashi, M. Kubista and B. NordBn,]MB205, 137 (1989). 196. M. Kubista, M. Takahashi and B. NordBn,JBC 265,18891 (1990). 197. B. Nordkn, C. Elvingson, T. Eriksson, M. Kubista, B. Sjoberg, M. Takahashi and K. Mortensen,JMB 216,223 (1990). 198. K. Adzuma, Genes Dev. 6, 1679 (1992). 199. P. Wittung, B. Norden, S. K. Kim and M. Takahashi,JBC 269,5799 (1994). 200. P. Wittung, M. Funk, 8.Jemstrom, B. Nord6n and M. Takahashi,FEB S Lett. 368,64 (1995). 201. E. Di Capua, M. Schnarr and P. A. Timmins, Bchm 28,3287 (1989). 202. S. A. Chow, S. M. Honigberg and C. M. Radding,JBC 263,3335 (1988). 203. J. E. Lindsley and M. M. Cox,JMB 205,695 (1989). 204. T. Shibata, C. Das Gupta, R. P. Cunningham and C. M. Radding, PNAS 77,2606 (1980). 205. K. McEntee, G. M. Weinstock and I. R. Lehman, PNAS 77,857 (1980). 206. M. M. Cox and I. R. Lehman, PNAS 78,3433 (1981). 207. M. M. Cox and I. R. Lehman,JBC 257,8523 (1982). 208. M. M. Cox, D. A. Soltis, Z. Livneh and I. R. Lehman, CSHSQB 47, 803 (1983). 209. M. M. Cox, D. A. Soltis, Z. Livneh and I. R. LehmanJBC 258,2577 (1983). 210. K. Muniyappa S. L. Shaner, S. S. Tsang and C. M. Radding, PNAS Sl, 2757 (1984). 211. S. C. Kowalczykowski,J. Clow, R . Somani and A. Varghese,JMB 193,8 1 (1987). 212. S. C. 
Kowalczykowski and R. A. Krnpp,JMB 193,97 (1987). 213. D. A. Dixon and S. C. Kowalczykowski, Cell 66,361 (1991). 214. S. C. Kowalczykowski, Experientia50,204 (1994). 215. G. R. Smith, Experientia 50,234 (1994). 216. G. J. Phillips, D. C. Prasher and S. R . Kushner,J. Bact. 170,2089 (1988). 217. G. R. Smith, Cell 58,807 (1989). 218. A. J. Clark, Annu. Reu. Mimobid. 25, 437 (1971). 219. H. Razavy, S. K. Szigety and S. M. Rosenberg, Genetics 142,133 (1996). 220. J. C. Register 111and J. Griffith,JBC 260, 12308 (1985). 221. B. B. Konforti and R. W. Davis,JBC 265,6916 (1990). 222. B. B. Konforti and R. W. Davis,/BC 266,10112 (1991). 223. B. B. Konforti and R. W. Davis,JMB 227,38 (1992). 224. M. Dutreix, B. J. Rao and C. M. Radding,JMB 219,645 (1991). 225. S. L. Shaner and C. M. Radding,JBC 262,9211 (1987). 226. J. E. Lindsley and M. M. Cox,JBC 265,9043 (1990). 227. Q. Shan, J. M. Bork, B. L. Webb, R. B. Inman and M. M. Cox, unpublished (1996). 228. Q.Shan and M. M. Cox,JMB 257,756 (1996). 229. S. K. Neuendorfand M. M. Cox,JBC26& 8276 (1986). 230. J. P. Menetski and S. C. Kowalczykowsh,]BC 262,2093 (1987). 231. X. Yu and E. H. Egelman,JMB 225,193 (1992). 232. G. M. Weinstock, K. McEntee and I. R. Lehman, JBC 256,8829 (1981).
ReCA PROTEIN
219
W. A. Bedale and M. M. Cox,JBC 271,5725 (1996). S. W. Morrical,J. Lee and M. M. Cox, Bchem 25,1482 (1986). L. J. Roman and S. C. Kowalczykowski, Bchem 25,7375 (1986). B. F. Pugh and M. M. Cox, in “Protein Structure, Folding, and Design 2” (D. L. Oxender, ed.), p. 275. Alan R. Liss, New York, 1987. 237. B. F. Pugh and M. M. COKJBC263,76 (1988). 238. K. A. Kumar, S. Mahalakshmi and K. Muniyappa, JBC 268,26 162 (1993). 23th. P. Wittung, B. Nordkn and M. Takahashi, EJB 228,149 (1995). 239. B. C. Schutte and M. M. Cox, Bcheni 26,5616 (1987). 240. J. W. Lee and M. M. Cox, Bchem 29,7666 (1990). 241. J. W. Lee and M. M. Cox, Bchem 29,7677 (1990). 242. S. C. Kowalczykowski, Bchem 25,5872 (1986). 243. K. L. Menge and F. R. Bryant, Bchein 34 5151 (1992). 244. K. L. Menge and F. R. Bryant, Bchem 34 5158 (1992). 245. J. P. Menetski and S. C. Kowalczykowski,JMB ISl, 281 (1985). 246. J. P. Menetski, A. Varghese and S. C. Kowalczykowski, Bchem 27,1205 (1988). 247. E. Stole and F. R. Bryant,JBC 270,20322 (1995). 248. M. M. Cox, D. A. Soltis, I. R. Lehman, C. DeBrosse and S. J. Benkovic, JBC 258,2586 (1983). 249. G . M. Weinstock, K. McEntee and I. R. Lehman,JBC 256,8845 (1981). 250. K. L. Menge and F. R. Bryant, Bchem 27,2635 (1988). 251. J. I. Kim, M. M. Cox and R. B. Inman,JBC 267,16444 (1992). 252. M. E. Bianchi and C. E. Radding, CeZE 35,511 (1983). 253. Z. Livneh and I. R. Lehman, PNAS 79,3171 (1982). 254. W. RosseUi and A. Stasiak, EMBOJ. 10,4391 (1991). 255. J. I. Kim, M. M. Cox and R. B. Irunan,JBC 267,16438 (1992). 256. P Howard-Flanders,S. C. West and A. Stasiak, Nature (London) 309,215 (1984). 257. S. C. West, ARB 6l, 603 (1992). 258. S. McGavin,JMB 55,293 (1971). 259. J. H. Wilson, PNAS 76,3641 (1979). 260. R. A. Fishel and A. Rich, iiz “Mechanisms and Consequences of DNA Damage Processing” (E. C. Friedberg and P. C. Hanawalt, eds.), p. 23. Alan R. Liss, New York, 1988. 261. J. E. Lindsley and M. M. Cox,JBC 265, 10164 (1990). 262. B. Miiller, T. Koller and A. 
Stasiak,JMB 2 U , 9 7 (1990). 263. S. A. Chow, S. K. Chiu and B. C. Wong,JMB 223, 79 (1992). 264. E. C. Conley and S. C. West, JBC 265, 10156 (1990). 265. S. A. Chow and C. M. Radding, PNAS 82,5646 (1985). 266. D. K. Gonda and C. M. Radding,JBC 261,13087 (1986). 267. D. A. Julin, P. W. Riddles and I. R. Lehman,JBC 261, 1025 (1986). 268. M. Bianchi, C. Das Gupta and C. M. Radding, CeEZ 34,931 (1983). 269. P. W. Riddles and I. R. Lehman,JBC 260, 165 (198.5). 270. J. E. Yancey-Wrona and R. D. Camerini-Otero, C u r . BWZ. 5, 1149 (1995). 271. V. B. Zhurkin, C. Raghunathan, N. B. Ulyanov, 0. R. Camerini and R. L. Jernigan,]MB 239, 181 (1994). 272. S. Lacks, Genetics 53,207 (1966). 273. B. Burnett, B. J. Rao, B. Jwang, G . Reddy and C. M. Raddmg,JMB 238,540 (1994). 274. S. Umlauf, “Unusual DNA Structure in Site-Specific and Homologous Recombination.” Ph.D. Thesis, University of Wisconsin-Madison, 1990. 275. S. K. Jain, R. B. Inman and M. M. Cox,JBC 267,4215 (1992). 276. A. Stasiak, Mol. Microbiol. 6,3267 (1992).
233. 234. 235. 236.
220
ALBERT0 I. ROCA AND MICHAEL M. COX
277. 278. 279. 280. 281. 282. 283. 284. 285. 286. 287.
B. J. Rao, S. K. Chiu and C. M. Radding,JMB 229,328 (1993). S. K. Jain. M. M. Cox and R. B. Inman,JBC 270,4943 (1995). R. Baliga, J. W. Singleton and P. B. Dervan, PNAS 92,10393 (1995). M. A. Podyminogin,R. B. Meyer and H. B. Gamper, Bchem 34,13098 (1995). M. A. Podyminogen, R. B. Meyer and H. B. Gamper, B c h m 35,7267 (1996). S. W. Umlauf, M. M. Cox and R. B. Inman,]BC 265,16898 (1990). P. Hsieh, 0. C. Camerini and 0. R. Camerini, Genes Dev. 4,1951 (1990). B. J. Rao, M. Dutreix and C . M. Radding, PNAS 88,2984 (1991). S. K. Chiu, B. J. Rao, R. M. Story and C. M. Radding, Bchem 32,13146 (1993). H. E. Moser and P. B. Dervan, Science 238,645 (1987). R. D. Wells, D. A. Collier, J. C. Hanvey, M. Shimizu and F. Wohlrab, FASEBJ. 2, 2939
(1988). 288. A. K. Shchyolkina, E. N. Timofeev, 0. F. Borisova, I. A. Il’ichevcl, E. E. Minyat, E. B. Khomyakova and V. L. Florentiev, FEBS Lett. 339,113 (1994). 289. J. P. Menetski, D. G. Bear and S. C. Kowalczykowski, PNAS 87,21(1990). 290. W. Rosselli and A. Stasiak,JMB 216,335 (1990). 291. S. K. Jain, M. M. Cox and R. B. InmaqJBC 269,20653 (1994). 292. S. C. Kowalczykowski and R. A. Krupp, PNAS 92,3478 (1995). 293. J. L. Sikorav and G. M. Church,JMB 222,1085 (1991). 294. E. Kaslan and W. D. Heyer, JBC 269,14 103 (1994). 295. B. Alberts and R. Miake-Lye, Cell 68,415 (1992). 296. C. J. Ullsperger and M. M. Cox, Bchem 34,10859 (1995). 297. P. E. Lavery and S. C. Kowalczykowski,]BC 267,9315 (1992). 298. M. M. Cox, Trends Biochem. Sci. 19, 217 (1994). 299. B. Jwang and C. M. Radding, PNAS 89,7596 (1992). 300. P. Modrich, ARGen 25,229 (1991). 301. R. R. Meyer, J. Glassberg and A. Komberg, PNAS 76,1702 (1979). 302. R. R. Meyer and P. S. Laine, Microbial. Rev. 54,342 (1990). 303. S. C. Schmellik and E. S. Tessman,]. Bact. 172,4378 (1990). 304. A. Sancar, K. R. Williams, J. W. Chase and W. D. Rupp, PNAS 78,4274 (1981). 305. T. M. Lohman, W. Bujalowski and L. B. Overman, Trends Biochem. Sci. 13, 250 (1988). 306. T. M. Lohman and M. E. Ferrari, ARB 63,527 (1994). 307. J. D. Griffith, L. D. Hams and J. Register 111, CSHSQB 49,553 (1984). 308. W. Bujalowski and T. M. Lohman,JMB 195,897 (1987). 309. W. Bujalowski and T. M. Lohman, B c h m 25,7799 (1986). 310. K. Muniyappa, K. Williams, J. W. Chase and C. M. Radding, NARes 18,3967 (1990). 311. S. W. Momcal and M. M. Cox, Bchem 29,837 (1990). 312. C. Bortner and J. Griffith,JMB 215,623 (1990). 313. T. Griffin IV and R. D. Kolodner,]. Bact. 172,6291 (1990). 314. M. V. V. S. Madiraju and A. J. Clark, NARes 19, 6295 (1991). 315. M. V. Madiraju and A. J. Clark,]. Bact. 174,7705 (1992). 316. K. Umezu, N. W. Chi and R. D. Kolodner, PNAS 90,3875 (1993). 317. K. Umezu and R. D. Kolodner,JBC 269,30005 (1994). 318. B. L. Webb, M. M. 
Cox and R. B. Inman,JBC 270,31397 (1995). 319. M. V. Madiraju, A. Templin and A. J. Clark, €“AS 85,6592 (1988). 320. P. L. Moreau,]. Bcwt. 170,2493 (1988). 321. M. Sassanfar and J. Roberts, J. Bact. 173,5869 (1991). 322. P. E. Lavery and S. C. Kowalczykowski,]MB 203,861 (1988). 323. M. V. Madiraju, P. E. Lavery, S. C. Kowalczykowski and A. J. Clark, Bchem 3 1 10529 (1992).
ReCA PROTEIN
221
324. R. Kolodner, A. A. Fishel and M. Howard,]. Bact. 163,1060 (1985). 325. J. Ahnn, P. E. March, H. E. Takiff and M. Inouye, PNAS 83, 8849 (1986). 326. P. T. Momson, S. T. Lovett, L. E. Gilson and R. Kolodner, J. Bact. 171, 3641 (1989). 327. H. E. Takiff, S. M. Chen and D. L. Court,]. Back 1 7 1 2581 (1989). 328. C. Luisi-DeLuca and R. Kolodner,JMB 236,124 (1994). 329. C. Luisi-DeLuca,!. Bact. 177, 566 (1995). 330. A. A. Mahdi and R. G. Lloyd, NARes 17,6781 (1989). 331. A. A.Mahdi and R.G . Lloyd, MGG 216,503 (1989). 332. J. C. Alonso, A. C. Stiege, B. Dobrinski and R. Lurz, JBC 268,1424 (1993). 332a. J. M. Berg and Y. Shi, Science 271, 1081 (1996). 333. B. Van Houten and A. Snowden, BioEssays 15,51 (1993). 334. G. R. Smith, Genome 31,520 (1989). 335. T. C. Wang, H. Y. Chang and J. L. Hung, Mutat. Res. 294,157 (1993). 336. J. A. Sawitzke and F. W. Stahl, Genetics 130,7 (1992). 336a. S. P. Hegde, M. Rajagopalan and M. V. Madiraju, J. Bact. 178,184 (1996). 337. S. J. Sander, Mol. Microbial. 19,871 (1996). 338. T. Kogoma, G. W. Cadwell, K. C. Barnard and T. Asai, J. Bact. 178,1258 (1996). 339. N. J. Sargentini and K. C. Smith, Radiat. Res. 107,58 (1986). 340. R. G. Lloyd, M. C. Porton and C. Buckman, MGG 212,317 (1988). 341. Y. C. Tseng, J. L. Hung and T. C . Wang, Mutat. Res. 315, 1 (1994). 342. S . J. Sander and A. J. Clark,!. Bnct. 176,3661 (1994). 343. T. V. Wang and K. C. Smith,]. Bact. 158,727 (1984). 344. N. Otsuji, H. Ilyehara and Y. Hideshima,J. Bact. 117, 337 (1974). 345. F. E. Benson, G. T. Illing, G. J. Sharples and R. G. Lloyd, NARes 16,1541 (1988). 34.5~.J. B. Rafferty, S. E. Sedelnikova,D. Hargreaves, P. J. Artymiuk, P. J. Baker, G. J. Sharples, A. A. Mahd~,R. G. Lloyd and D. W. Rice, Science 274,415 (1996). 346. I. R. Tsaneva, B. Muller and S. C. West, Cell 69,1171 (1992). 347. T. Shiba, H. Iwasald, A. Nakata and H. Shinagawa, MGG 237,395 (1993). 348. S. C . West, Cell 76, 9 (1994). 349. A. Sancar and J. E. Hearst, Science 259,1415 (1993). 350. B. 
Miiller, I. R. Tsaneva and S. C. West,JBC 268,17179 (1993). 351. I. R. Tsaneva and S. C. West,JBC 269,26552 (1994). 352. A. H. Mitchell and S . C. West, JMB 243,208 (1994). 353. P. E. Marrione and M. M. Cox,Bcliem 34,9809 (1995). ,354. 1. R. Tsaneva, B. Muller and S. C. West, PNAS 90,1315 (1993). 35.5. A. Stasiak, I. R. Tsaneva,S . C. West, C. J. Benson, X. Yu andE. H. Egelman, PNAS 91,7618 (1994). 356. J. Kuriyan and M. ODonnell,JMB 234,915 (1993). 357. E B. Dean and J. Hurwitz,JBC 266,5062 (1991). 358. F. B. Dean, J. A. Borowiec, T. Eki and J. Hurwitz,JBC 267,14129 (1992). 359. Y. Wang and P. H. von Hippel, JBC 268,13947 (1993). 360. J. Geiselmann, T. D. Yager and P. H. von Hippel, Protein Sci. 1, 861 (1992). 361. J. Geiselmann, S. E. Seifried, T. D. Yager, C. Liang and P. H. von Hippel, Bcheni 31, 121 (1992). 362. J. Geiselmann, T. D. Yager, S. C. Gill, P. Calmettes and P. H. von Hippel, Bchem 3 1 121 (1992) 363. J: Geiselmann, Y. Wang, S. E. Seifried and P. H. von Hippel, PNAS 90, 7754 (1993). 364. S. E. Seifried,J. B. Easton, and P. H. von Hippel, PNAS 89, 10454 (1992). 365. W. Bujalowski, M. M. Klonowska, and M. J. Jezewska,JBC 269,31350 (1994). 366. K. Hiom and S. C. West, Cell 80, 787 (1995).
222
ALBERT0 I. ROCA AND MICHAEL M. COX
367. C. A. Parsons, A. Stasiak, R. J. Bennett and S. C . West, Nature (London) 374,375 (1995). 368. B. Miiller and S. C. West, Experierataa 50,216 (1994). 369. L. E. Iype, E. A. Wood, R. B. Inman and M. M. Cox,JBC 269,24967 (1994). 370. L. E. Iype, R. B. Inman and M. M. Cox,JBC 270,19473 (1995). 371. A. Segall,M. J. Mahan and J. R. Roth, Science 241, 1314 (1988). 372. D. E. Adams, I. R. Tsaneva and S. C. West, PNAS 91,9901 (1994). 373. R. G. Lloyd and G . J. Sharples,NARa 21, 1719 (1993). 374. M. C. Whitby, S. D. Vincent and R. G. Lloyd, EMBOJ. 13,5220 (1994). 375. L. J. Roman and S. C. Kowalczykowslo,JBC 264,18340 (1989). 376. L. J. Roman, D. A. Dixon and S. C. Kowalczykowski,PNAS 88,3367 (1991). 377. S. T. Lovett and R. D. Kolodner, PNAS 86,2627 (1989). 378. S. T. Lovett and R. D. Kolodner,J. Bad. 173,353 (1991). 379. S. T.Lovett and A. J. Clark, 3. Bmt. 157,190 (1984). 380. S. T. Lovett and A. J. Clark,J. Bat. 162,280 (1985). 381. S. T. Lovett, D. C. Luisi and R. D. Kolodner, Genetics l20,37 (1988). 382. S. E. Corrette-Bennett and S. T. Lovett,JBC270, 6881 (1995). 383. I. R. Lehman and A. L. Nussbaum,JBC 239,2628 (1964). 384. S. R. Kushner, H. Nagaishi, A. Templin and A. J. Clark, PNAS 68,824 (1971). 385. M. Defais, P. Fauquet, M. Radman and M. Errera, Wrology 43,495 (1971). 386. D. W. Mount, K. B. Low and S. J. Edmiston,J. Bad. 112,886 (1972). 387. M. Radman, in “Molecular and Environmental Aspects of Mutagenesis” (L. Frakash, F. Sherman,M. Miller, C. Lawrence and H. W. Tabor, eds.), p. 128. Charles C. Thomas F’ubl., Springfield, Jllinois, 1974. 388. J. W. Roberts, C. W. Roberts and N. L. Craig, PNAS 75,4714 (1978). 389. J. W. Little, S. H. Edmiston, L. Z. Pacelli and D. W. Mount, PNAS 77,3225 (1980). 390. G. C. Walker, ARB 54,425 (1985). 391. G. C. Walker, in ‘‘Escherichiucoli and Salmonella typhimurium:Cellular and Molecular Biology” (F. C. Neidhardt, J. L. Ingraham, K. B. Low, B. Magasanik, M. Schaechter and H. E. Umbarger,eds.), p. 
1346.American Society for Microbiology,Washington, D.C., 1987. 392. R. Devoret, Ann. Z’Zmt.Pasteur Actuul. 1, 11 (1992). 393. E C. Friedberg, G. C. Walker and W. Siede, “DNA Repair and Mutagenesis.” ASM Press, Washington, D.C., 1995. 394. X. Yu and E. H. Egelman,JMB 231, 29 (1993). 395. C. Lu and H. Echols,JMB 196,497 (1987). 3952. W. M. Rehrauer, P. E. Lavery, E. L. Palmer, R. N. Singh and S. C. KowalczykowskiJBC 271,23865 (1996). 395b. F. G. Harmon, W. M. Rehrauer and S. C. Kowalczykowski,JBC 271,23874 (1996). 396. S. Murk and G . C. Walker, Cum @in. Genet. Deo. 3,719 (1993). 397. Z. Livneh, F. 0.Cohen,R. Skaliter andT. Elizur, Cht. Rev. Biochem. Mol. BWZ. 28,465 (1993). 398. G. C. Walker, Trends Biochem. Sci. 20,416 (1995). 399. M. Rajagopalan, C. Lu, R. Woodgate, M. ODonneU, M. E Goodman and H. Echols, PNAS 89,10777 (1992). 400. H. Shinagawa, H. Iwasaki, T. Kato and A. Nakata, PNAS 85,1806 (1988). 401. S. E. Burckhardt,R. Woodgate, R. H. Scheuermann and H. Echols, PNAS 85,1811 (1988). 402. R. Woodgate, M. Rajagopalan, C. Lu and H. Echols, PNAS 86,7301 (1989). 403. M. Dutreix, B. Burnett, A. Bailone, C. M. Radding and R. Devoret, MGG 232,489 (1992). 404. J. B. Sweasy, E. M. Witkin, N. Sinha and V. Roegner-Maniscalco,J.Bact. 172,3030 (1990). 405. E. G. Frank, J. Hauser, A. S. Levine and R. Woodgate, PNAS 90,8 169 (1993). 406. J. W. Zyskind, A, L. Svitil, W. B. Stine, M. C. Biery and D. W. Smith, Mol. Mimobiol. 6, 2525 (1992).
ReCA PROTEIN
407. 408. 409. 410. 411. 412. 413.
223
K. Skarstad and E. Boye,J. Bact. 175,5505 (1993). T. R. Magee, T. Asai, D. Malka and T. Kogoma, EMBOJ. 11,4219 (1992). T. Asai, S. Sommer, A. Bailone and T. Kogoma, EMBOJ. 12,3287 (1993). T. Asai and T. Kogoma, J. Bact. 176,7113 (1994). T. Asai and T. Kogoma, J. Bact. 176, 1807 (1994). S. C. West and B. Connolly, Mol. Microbiol. 6,2755 (1992). K. Ishimori, S. Sommer, A. Bailone, M. Takahashi, M. M. Cox and R. Devoret,JMB (1996).
Submitted. 414. S. Cheng, A. Sancar and J. E. Hearst, NARes 19,657 (1991). 415. S. Karlin and G. Ghandour, PNAS 82,8597 (1985). 416. S. Henikoff and J. G. Henikoff, PNAS 89, 10915 (1992). 417. J. Devereux, I? H. Haeberli and 0.S. Smithies, NARes 12,387 (1984). 418. D. F. Feng and A. F. Dolittle,J. Mol. Euol. 25,351 (1987). 419. T Horii, T. Ogawa and H. Ogawa, PNAS 7 7 , 3 13 (1980). 420. J. Felsenstein, Am. J. Human Genet. 25,471 (1973). 421. D. S. Goodsell and A. J. Olson,J. MoZ. Gruphics 10,235 (1992). 422. A. White, P. Handler and E. L. Smith, “Principles of Biochemistry,”3rd Ed., p. 124. McGraw-Hill,New York, 1964. 423. l? M. Richards, Annu. Reu. Biophys. Bioeng. 61, 151 (1977). 424. T. E. Femn, C. C. Huang, L. E. Jarvis and R. Langridge,J. Mol. Graphics 6,13 (1988). 425. A. Nicholls, K. A. Sharp and B. Honig, Proteins Struct. Funct. Genet. ll, 281 (1991). 426. M. Gribskov and D. Eisenberg, in “Techniques in Protein Chemistry” p.E. Hugh, ed.), p. 108. Academic Press, New York, 1989. 427. H. Fujisawa, T. Yonesaki and T. Minagawa, NARes 13, 7473 (1985). 428. P. J. Kraulis,]. Awl. Crystallog. 24, 946 (1991).
Molecular Biology of Axon-Glia Interactions in the Peripheral Nervous System
VERDON TAYLOR AND UELI SUTER
Institute of Cell Biology
Swiss Federal Institute of Technology
ETH-Hönggerberg
CH-8093 Zurich, Switzerland
I. Axon-Glial Interactions during Neural Crest Development
II. Regulation of Schwann Cell Proliferation and Differentiation by Growth Factors and Their Receptors
III. Role of the Extracellular Matrix in PNS Development
IV. Myelination as a Speciality of Axon-Schwann Cell Interactions
V. Transcriptional Regulation of Axon-Schwann Cell Interactions
VI. Degeneration and Regeneration in the Nervous System
VII. Axon-Schwann Cell Interactions as a Bilateral Communication
VIII. Mechanisms of Membrane Sorting in Myelinating Schwann Cells
IX. Future Perspectives
References
The nervous system is the most complex and sophisticated organ in the mammalian body. Composed of billions of cells arranged in a precise network, its formation during development requires a temporally and spatially controlled and accurate regulation of cell division, migration, and communication.
Abbreviations: ARIA, acetylcholine receptor inducing agent; BDNF, brain-derived neurotrophic factor; CNS, central nervous system; CMT, Charcot-Marie-Tooth disease; CNTF, ciliary neurotrophic factor; Cx32, connexin 32; ECM, extracellular matrix; FGF, fibroblast growth factor; GFAP, glial fibrillary acidic protein; GGFs, glial growth factors; GAS6, growth arrest sequence 6; HMSN, hereditary motor and sensory neuropathies; HNPP, hereditary neuropathy with liability to pressure palsies; LAR, leukocyte common antigen-related phosphatase; MAG, myelin-associated glycoprotein; MAL, myelin and lymphocyte protein; MBP, myelin basic protein; NGFR, nerve growth factor receptor; NDF, neu differentiation factor; N-CAM, neural-cell adhesion molecule; PMP22, peripheral myelin protein 22; PNS, peripheral nervous system; PDGF, platelet-derived growth factor; POU, Pit-Oct-Unc domain; P0, protein zero; PLP, proteolipid protein; SCIP, suppressed cAMP-inducible POU-domain protein; TGF, transforming growth factor.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 56
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
During embryogenesis, the nervous system is the first organ to be generated. Soon after denotation of the body axis, the neural ectoderm begins its differentiation. These specialized ectodermal cells proliferate and differentiate into the plethora of different cell types of the central nervous system (CNS), which comprises the brain and spinal cord, and the peripheral nervous system (PNS), which consists of ganglia and peripheral nerves that lie outside of the brain and spinal cord. The motor neuron is an anomaly in the anatomical distinction between CNS and PNS in that its cell body is positioned within the spinal cord but most of the cell, i.e., the axon, is located in the periphery. During development, motor neurons differentiate from a distinct region of the ventral neural tube, the floor plate, and extend processes into the surrounding tissue, contacting their target muscles. On leaving the spinal cord, motor axons are fasciculated and form the ventral root. Conversely, peripheral neurons, which form the sensory system and their associated glial cells, are the progeny of specialized and highly migratory pluripotent cells of the trunk neural ectoderm, termed the neural crest (1-5). The sensory neurons are the first neural crest-derived progeny, migrating laterally into the dorsal roots and colonizing the ganglia (6, 7). Subsequently, sensory neurons extend processes both into the tissue, where they will interact with sensory receptors, and into the CNS, where they synapse with commissural neurons of the spinal reflex circuit and the ascending afferents to the brain. The next neural crest descendants are the prospective peripheral glial cells (4, 5, 7, 8). These cells migrate laterally and ventrally into the dorsal and ventral roots and contact the axons of sensory and motor neurons (5, 6, 9).
The prospective glial cells continue to proliferate as they migrate along the axon toward the target tissue (5). At a particular point in development, the cells cease proliferation and enter a differentiation pathway toward mature glial cells of either the myelin-forming or the nonmyelinating Schwann cell types, including the specialized neuromuscular junction endplate-capping Schwann cell (10). This distinction between myelinating and nonmyelinating Schwann cells appears to depend on the association with axons. Nonmyelinating Schwann cells can form myelin sheaths if transplanted into a nerve that normally contains a substantial number of myelinated axons (1.2). Schwann cells that are destined to deposit myelin sheaths come into close apposition with large-caliber PNS axons to form a one-to-one relationship. This differs from the CNS counterpart of Schwann cells, the oligodendrocytes, which extend several processes that myelinate dozens of axons. Nonmyelinating Schwann cells associate with bundles of small-caliber axons, surrounding and insulating them simultaneously without depositing myelin. At the time of birth, rodent Schwann cells that are engaged in a one-to-one relationship with axons undergo a developmental switch and begin to express myelin genes at high levels (3, 12). The ensuing process of myelination peaks after 2 weeks but carries on for approximately an additional 5 weeks. Although the onset of CNS myelination varies between different brain regions, it is generally slightly delayed compared to the PNS.
The PNS is a very attractive place to study the molecular basis of cell-cell and cell-extracellular matrix interactions during development and in adulthood. It is morphologically well-defined and easily amenable to experimental examination. Furthermore, PNS differentiation can be mimicked (although only within the limits of a model) by injuring PNS nerves through cut or crush lesions. As a result, the previously myelinating Schwann cells in the distal stump will start to proliferate again, and myelin and axonal debris will be removed by phagocytic Schwann cells and invading macrophages (13). In sharp contrast to the CNS, the axons of the PNS will be able to regenerate functionally along the permissive substrate of the distal stump.
In this review, we focus on some of the critical findings over the past years with respect to the fascinating interplay between peripheral neurons/axons and Schwann cells. We describe molecules that are clearly involved in these interactions, and we speculate about some of the potential molecules and mechanisms that may also play crucial roles in these processes.
I. Axon-Glial Interactions during Neural Crest Development
The interplay between glial cells and neurons during PNS development is very complex and regulates cell migration, proliferation, differentiation, and survival. It is becoming increasingly clear that Schwann cells depend on contact with axons for many of their migrational and developmental cues. After entering the rat PNS at embryonic day 9 to 10 (E9 to E10), neural crest cells expressing high levels of the low-affinity nerve-growth-factor receptor (p75NGFR) travel along the already fasciculated axons of the sensory and motor neurons (1-3, 7, 8). During this migrational stage, which extends up to E16, the prospective Schwann cells are in a continuous state of proliferation and depend on the axon for proliferation and survival signals (14, 15). The nature and location of these signals are not clear, although both diffusible and membrane-associated growth factors are likely to be involved (16, 17). Isolation of the migratory cells from the axon results in rapid cell death in vitro, which can be prevented by neuron-conditioned medium or by members of the fibroblast growth factor (FGF) and insulin-like growth factor families (16). Due to the limited amounts of extracellular matrix (ECM) components
present in the early embryonic nerve, many interactions between axons, Schwann cells, and axon-Schwann cell units that are in close apposition are likely to involve membrane-associated molecules (18-20). Up to E16, both glial progenitors and neurons express specific cell-recognition proteins, including the members of the immunoglobulin superfamily L1 and the neural cell adhesion molecule (N-CAM), which can mediate homophilic and heterophilic interactions. Interestingly, L1 and N-CAM expression on the surface of Schwann cell precursors is already polarized toward the axon bundles on which they are migrating (21, 22). No expression, however, is found on their outer surface, which is in contact with the mesenchyme. This finding reinforces the hypothesis that Schwann cells are highly specialized bipolar cells even at an early time point in their development, a concept that will be a recurrent theme throughout this review. Between E15 and E17, there is a striking conversion of the precursor cells into early Schwann cells. These cells begin to express the small calcium-binding protein S100 and the cytoskeletal element glial fibrillary acidic protein (GFAP) (1, 3, 14, 23, 24). Furthermore, axonal signals are responsible for the appearance of the O4 antigen (25). Early Schwann cells remain mitotically active up to a second developmental switch into the myelinating phenotype at approximately E18 to E20. However, the expression of the classical myelin genes is delayed for approximately another 2 days until the time of birth. The mitotic signals for these cells are associated, at least in part, with the axonal membrane, because cultured neonatal rat Schwann cells respond by undergoing mitosis when placed in contact with neurites or when axolemma (axonal membranes) is added to the culture medium (12, 26). This response is not mimicked by neuron-conditioned medium, emphasizing the need for direct contact with the axonal membrane.
Around E18 to E20, the early Schwann cells invade the axon bundles along which they were migrating and begin to separate the axons from each other (27). After forming a one-to-one relationship with the axons that are destined to be myelinated, the Schwann cells wrap the axon one and one-half times with a single process. This induction of the myelinating phenotype results in the down-regulation of p75NGFR, GFAP, L1, and N-CAM expression as well as the up-regulation of galactocerebroside and the A5E3 antigen (28). The differentiation states of Schwann cells and neurons are critical in the regulation of proliferation and differentiation. In the postnatal peripheral nerve, when the Schwann cells have formed myelin sheaths around the axon, the mitogenic axonal signal is completely suppressed. However, if the Schwann cell loses contact with the axon as a result of neuronal injury or degeneration, it can dedifferentiate into a proliferative state independent of axonal contact. This phenotypic dedifferentiation is also reflected at the molecular level because the Schwann cells start to reexpress the premyelination marker proteins p75NGFR, GFAP, L1, and N-CAM (see 20 and references therein).
II. Regulation of Schwann Cell Proliferation and Differentiation by Growth Factors and Their Receptors
The molecular nature of the mitogenic signals that act on Schwann cells during development is slowly starting to emerge. cAMP was the first mitogen for cultured Schwann cells to be described. The addition of membrane-permeable derivatives of cAMP or forskolin, an activator of adenylate cyclase, to neonatal rat Schwann cells grown in serum-containing medium was found to result in proliferation (17, 29). However, in the absence of growth factors, cAMP induces a differentiation pathway resulting in an increase in myelin gene expression (30) (the interpretation of these results has recently been questioned; see 31). Nevertheless, these findings led to the identification of a transcription factor called suppressed cAMP-inducible POU domain protein (SCIP) (32, 33). The putative role of SCIP in the regulation of myelination is discussed in Section V,A, and it remains to be resolved whether the mitogenic and potentiating effects of cAMP are controlled via SCIP. Several growth factors exhibit mitogenic properties on isolated neonatal Schwann cells in vitro (reviewed in 34). Acidic and basic fibroblast growth factor induce proliferation in cultured rat Schwann cells, specifically in the presence of increased levels of cAMP.
In addition, platelet-derived growth factor (PDGF) A and B, transforming growth factors (TGFs), and, in particular, the glial growth factors (GGFs) have been identified in vitro as Schwann cell mitogens, either alone or by potentiating the actions of other growth factors and cAMP. During prenatal development, the members of the FGF family are good candidates for the nondiffusible neuron-derived mitogenic and survival signals due to the association of the FGFs with heparan sulfate-like molecules in the ECM and on the surface of cells (16). In particular, FGFs act as survival factors for neural crest-derived cells that migrate within the peripheral nerve up to E15 (16). Thereafter, FGFs induce a proliferative response and suppress the expression of myelin genes in isolated Schwann cells in the presence of increased cAMP levels. Unlike most other Schwann cell growth factors, TGF-β1 and TGF-β2 are mitogenic in their own right and are able to substitute for increased cAMP
in potentiating the proliferative effects of the other growth factors (35, 36). TGF-β, similar to FGFs, is also associated with the ECM and cell surface. Furthermore, TGF-β is highly expressed early in the developing nerve, where it might be involved in guiding the differentiation of the neural crest-derived cells to a glial as opposed to a pigment cell fate (37, 38). TGF-β also induces the proliferation of Schwann cells following axotomy in vitro, and the same growth factors are up-regulated in proliferating Schwann cells of the distal stump following nerve transection (35). These findings have been interpreted to suggest that TGF-β may act in an autocrine loop to increase Schwann cell proliferation and to suppress progression to the myelinating phenotype without affecting the association with axons (39). However, TGF-β is unlikely to be involved in the proliferation of early Schwann cells and their precursors because, in coculture assays, it suppresses the axon-induced proliferation response, which is thought to be responsible for mitogenesis at these developmental stages (36, 39). Although cAMP can mimic many of the actions of axonal contact in cultured Schwann cells, there is increasing doubt as to whether an elevated intracellular cAMP level is a major mediator of Schwann cell-axon interactions (31, 40). For example, cAMP might activate a pathway similar to or downstream of an unknown endogenous signal that may involve the transcription factor SCIP. The ability of TGF-β to substitute for cAMP in vitro suggests either that there is redundancy within the system or that multiple signals are able to synergize to achieve a similar response.
A. Neuregulins and Schwann Cell Development
Glial growth factors were first identified as mitogenic substances in purified extract from bovine brain glial cells (41, 42). Three major active components have been purified, cloned, and found to be products derived by alternative splicing from a single gene (43-45). Subsequently, GGFs have been renamed neuregulins and shown to be identical to heregulins (46), acetylcholine receptor-inducing agent (ARIA) (47), and neu differentiation factor (NDF) (48, 49). The alternative splicing of the neuregulin gene generates a wide range of products that encode proteins with varying structures that are either membrane bound or soluble (50). These polypeptides are divided into the α- and β-type neuregulins, depending on the form of the receptor-binding epidermal growth factor (EGF)-like repeat they exhibit. A high complexity in the cellular response to neuregulins is likely because some of these proteins may act as agonists or antagonists at the neuregulin receptors (reviewed in 50). The proteins erbB2 (p185neu), erbB3, and erbB4 are receptors for neuregulins (49, 51). These tyrosine kinase receptors are members of the EGF
receptor-like family and erbB2 (if activated) is indeed able to cross-phosphorylate and activate the EGF receptor (52, 53). However, the neuregulin binding and activation of the different erbB receptors is complex. The erbB2 receptor displays a high affinity for neuregulins, but only in the presence of a partner, which is thought to be erbB3 (51). Conversely, erbB3 can induce a signal only in response to neuregulins in conjunction with erbB2 (51). The erbB4 receptor is the only member of the erbB family that binds neuregulins and transduces a signal on its own, but erbB4 may also form heterodimers with erbB2 and erbB3 on ligand binding (51, 54, 55). In vitro, neuregulins induce DNA synthesis in Schwann cells (56, 57), and mutations affecting erbB2 result in constitutively active receptors that are associated with Schwannomas in vivo (58). During development, neuregulins are expressed at high levels in the dorsal root ganglia and by motor neurons throughout embryogenesis (44, 59). To clarify the role of these growth factors in vivo, a genetically engineered neuregulin-deficient mouse mutant has been generated (60). This mutation was not compatible with normal development and led to premature death of the mice at E10 and E11 due to heart malformations. Severe deficiencies in the cranial ganglia were also observed, probably due to a loss of the neural crest-derived cells. Furthermore, there was an ablation of neural crest cells migrating along the seemingly unaffected peripheral nerves in the trunk region. It has been demonstrated in in vitro systems that neuregulins can drive isolated neural crest cells from a neuronal to a Schwann cell phenotype and can prevent apoptosis of E14 neural crest cells (27, 61). These findings suggest that neuron-derived neuregulins may act not only as Schwann cell mitogens but also to induce neural crest cells to adopt a Schwann cell fate and/or to promote their survival (27, 61).
Individual gene-targeting experiments in each of two erbB receptors (erbB2 and erbB4) support the hypothesis proposed above, because similar cranial neural crest-derived abnormalities are present in these mutants (62, 63). Both the erbB2- and erbB4-deficient animals died due to heart defects at E10 to E11, but, interestingly, trunk neural crest cells did not seem to be affected. Thus, these cells may use the unaffected neuregulin receptors in order to maintain their developmental course. It is anticipated that the generation of erbB3-deficient animals may resolve this apparent discrepancy. In summary, it seems likely that neuregulins, probably in collaboration with other growth factors (e.g., TGF-β), are involved in decision-making steps during neural crest differentiation by providing some of the environmental cues that have been proposed to regulate neural crest cell fate both temporally and spatially (37, 38, 61). With the advent of modern tissue-specific and inducible gene-targeting methods (64-66), it will be an exciting task to examine the effects of neuregulins and their receptors later in Schwann cell development.
B. Neurotrophins and Schwann Cell Differentiation
Early in the development of the Schwann cell-axon partnership, Schwann cell precursors express high levels of p75NGFR, and evidence has been presented that this receptor plays a role in the control of Schwann cell migration (67). Generally, however, the function of p75NGFR, which binds all known neurotrophins, is still unclear. It has been suggested that it may function in regulating apoptosis and retrograde axonal transport of neurotrophins, and/or as a permissive substrate for the growth of neurotrophin-responsive neurons (68-70). Recent data suggest that p75NGFR may act as an accessory receptor, either assisting in the passage of ligand to the high-affinity neurotrophin receptors of the Trk family or binding ligand and creating a neurotrophin-concentrated microenvironment (71). Conversely, it might prevent excess neurotrophin from diffusing into the surrounding tissue, hence restricting the spatial effects of these growth factors. Furthermore, p75NGFR may also participate directly in signaling via the Trk receptors (72). Finally, p75NGFR-deficient mice display a similar, albeit less severe, phenotype compared to TrkA-deficient mice (73), suggesting an important role for p75NGFR at least in the TrkA-signaling pathway (74). Schwann cell precursors express both TrkB and its main ligand, brain-derived neurotrophic factor (BDNF) (75). Interestingly, glia-derived BDNF in the PNS appears to provide survival signals not only for neurons but also, in an autocrine fashion, for Schwann cells (76, 77). This may be of particular importance for Schwann cell survival after losing axonal contact as a result of injury (78, 79).
C. Role of the GAS6/Axl-Rse Family in the PNS
Axl (Tyro 7, UFO, ark) and Rse (Tyro 3, Sky, brt, tif) belong to the same family of receptor tyrosine kinases (80-83). Both Axl and Rse are widely expressed in the nervous system, and the growth-arrest sequence 6 (GAS6) has been recently identified as a common ligand (82, 83). Human Schwann cells show a pronounced growth response on stimulation with GAS6, which is in line with the observation that Schwann cells express Axl and Rse receptors (84). Together with the findings that GAS6 is expressed by spinal motor neurons and in the dorsal root ganglia during development and adulthood, and that both Axl and Rse are up-regulated in Schwann cells after nerve injury, these results suggest that the GAS6/Axl-Rse system may play a major role in the regulation of neuron-Schwann cell interactions (84).
D. Are Eph Tyrosine Kinase Receptors Involved in PNS Development?
Recently, Eph-related proteins have risen to prominence as the largest subfamily of receptor tyrosine kinases and as having a role in the development of the nervous system. Eph receptors interact with membrane-associated ligands of the B61 family and provide crucial signals in cell-cell interactions (85-88). Within the CNS, Eph receptors and their glia-expressed ligands affect the migration and pathfinding of axonal growth cones by providing a set of mainly repulsive signals (89, 90). Due to the nature of the receptor-ligand interactions of the Eph family, the axon-Schwann cell interplay might be an ideal setting for the cell recognition and migration cues that these proteins provide in other systems. In support of this hypothesis, the Eph receptors Tyro 5, 6, and 11 are expressed in Schwann cells of the developing sciatic nerve and in culture (91). Exposure of cultured Schwann cells to forskolin did not change the expression of Tyro 5 and 11 but resulted in a down-regulation of Tyro 6. Thus, Eph receptor tyrosine kinases might be involved in Schwann cell development by interacting with ligands on the axonal membrane. These interactions may be involved in differentiation decisions by Schwann cell progenitors and in their multiple maturation switches during development.
III. Role of the Extracellular Matrix in PNS Development
As previously described, the migratory precursor cells of peripheral glia are tightly associated with the surface of fasciculated axons during development. The role of the relatively sparse ECM at this stage is not clear, but Schwann cells ensheath nerve fibers in the continuous presence of basal lamina (92, 93). As Schwann cell development progresses, the ECM starts to play a pivotal role in establishing the myelinating phenotype (94, 95). This can be effectively demonstrated in vitro, where Schwann cells cannot deposit myelin in the absence of ECM in coculture with neurites (96). It is thought that a basal lamina is a prerequisite for assembly of the myelin sheath, a process that is likely to depend on the capability of the Schwann cell to become polarized (96). PNS ECM consists mainly of collagen, laminin, thrombospondin, fibronectin, tenascin, and proteoglycans (18). In particular, laminin is one of the main components of the Schwann cell basal lamina and might be involved in the stabilization of embryonic nerves. The critical role of the ECM
becomes evident after nerve injury, when Schwann cells up-regulate their expression of laminin and tenascin, which are important components in fostering axonal regrowth (97, 98). Furthermore, the laminin α2 chain is deficient in dy mice, in which partial PNS dysmyelination is a prominent feature (99-101). By comparison with other cellular systems, the interesting hypothesis emerges that ECM receptors might play a role in signal transduction mechanisms in Schwann cells. The heterodimeric integrins are particularly attractive candidates for playing such a role because an integrin-dependent signaling pathway has been proposed that involves the activation of a kinase cascade, including focal adhesion kinase and src-like nonreceptor type tyrosine kinases (102, 103). Integrin activation results in spatially restricted phosphorylation of cytoskeletal elements, which leads to stabilization of the focal adhesion points. Subsequent activation of a receptor tyrosine phosphatase (leukocyte common antigen-related phosphatase; LAR) at the posterior extremity of the plaque may release the cytoskeleton and allow motile propagation of the cell along a substrate (104). Thus, one might speculate that the extensive and specific movements of the Schwann cell along axon bundles involve integrin-ECM interactions to propagate the motile cues.
In vitro studies have shown that Schwann cells constitutively express the α6β1 laminin receptor in the presence or absence of axons (105). In contrast, the α6β4-type integrin receptor is axonally regulated and, on myelination, increased in its expression at the abaxonal surface opposite the Schwann cell basal lamina, which contains the putative ligand(s) (105). Complementary in vivo studies in developing and regenerating peripheral nerves confirmed the tight regulation of the β4 subunit by axon-Schwann cell interactions and the polarized expression of the α6β4 receptors (106). The available data are consistent with the idea that the drastic phenotypic changes Schwann cells undergo from an ensheathing to a myelinating state, in particular with respect to the cytoskeleton (107), are associated with a switch in subunit composition of integrin receptors from β1 to β4, probably affecting ligand specificity. The integrin-associated cell differentiation antigen CD9 is expressed by neurons and Schwann cells and is coordinately regulated with the myelin genes during sciatic nerve development (108, 109). Interestingly, various antibodies that are specific for this putative four-transmembrane protein promote adhesion, induce proliferation, and enhance migration of Schwann cells in in vitro paradigms (110, 111). Most interestingly, binding of antibodies to CD9 can induce tyrosine phosphorylation in Schwann cells, and CD9 can associate with α6β1 integrin and L1 (112). Although these findings await confirmation by appropriate in vivo approaches, it is tantalizing to hypothesize that the observed tyrosine phosphorylation is the result of integrin activation, and that CD9 could be involved in regulating recognition processes between axons and Schwann cells via integrin receptors.
IV. Myelination as a Speciality of Axon-Schwann Cell Interactions
Myelination is one of the most complex and striking phenomena in the nervous system and is strongly regulated by axon-Schwann cell interactions. Although our knowledge has improved considerably concerning the structure and molecular composition of myelin, there still remain many basic questions. Some of these open issues concern the molecular determinants that act during initial myelination: What are the basic cues involved in the initiation of myelination? How is it determined that a Schwann cell will always myelinate only one axon, resulting in a 1:1 relationship (in contrast to its cellular counterpart in the CNS, the oligodendrocyte, which myelinates several axons)? Which molecular mechanisms decide that only large-caliber axons will be myelinated? Why do small-caliber axons remain in bundles associated with nonmyelinating Schwann cells? Other unresolved questions concern later development and the maintenance of PNS nerves: How does the Schwann cell sense that the developing myelin sheath has reached proportionality to the axonal caliber and that no further membrane wrappings are necessary? Why is the myelin homeostasis disturbed in various late-onset peripheral neuropathies? The key to most (if not all) of these burning questions lies in the bidirectional dialogue between Schwann cells and axons. Whereas axonal components appear to initiate myelin formation and play a crucial role in the regulation of myelin gene expression (113, 114), myelinating Schwann cells are responsible for the correct distribution of ion channels along the axon (115, 116) and determine the molecular composition of the axonal cytoskeleton (117, 118). Most of these mechanisms are likely to involve membrane-bound cell recognition molecules (119), whereas others might be mediated by diffusible factors (120).
A first major step in the elucidation of the molecular basis of axon-Schwann cell interactions was the biochemical dissection of myelin followed by the identification and cloning of myelin-associated proteins. To test the function of these molecules, in vitro models using dorsal root ganglia neurons in cocultures with Schwann cells, with or without allowing myelination, have been developed. Manipulation of these systems using retroviral expression vectors has contributed significantly to our present understanding of axon-Schwann cell interactions and paved the way for further investigations
(121). The arrival of advanced genetic techniques in vertebrates, in particular novel methods allowing the generation of transgenic rodents and directed gene disruptions in mice, has yielded considerable further progress. There is little doubt that additional refinements of these techniques, for example, tissue-specific and conditional (inducible) gene targeting, have the potential to revolutionize the field again. Another attraction of the PNS derives from the observation that many of the molecular events that take place during PNS development are recapitulated during degeneration and regeneration of peripheral nerves after injury. If a peripheral nerve is crushed or cut, axons in the distal stump degenerate and the resulting debris, together with the degenerating myelin sheaths, is removed by phagocytic Schwann cells and macrophages in a process termed "Wallerian degeneration" (13). The debris-cleared distal stump provides a permissive substrate for the regrowing axons; finally, recovery is observed, as manifested by the regeneration of a functionally intact nerve. PNS regeneration has been used to study axon-Schwann cell interactions in combination with transplantation technology in pioneering work by Aguayo and co-workers (124, 125). In a number of experiments, hypomyelinating nerves from the spontaneous mouse mutant Trembler [which is associated with a point mutation causing a nonconservative amino acid substitution in the myelin protein PMP22 (122, 123)] were grafted into normal mice and vice versa (124, 125). Recent continuation of this work demonstrates that after regeneration of host axons through a nonmyelinating Trembler graft, axons show reduced calibers, slowed axonal transport, and altered neurofilament density and phosphorylation (126). These results suggest the activation of a kinase/phosphatase pathway in hypomyelinated nerves, the exact molecular mechanisms of which have not yet been elucidated.
In the following sections, we first describe what is known about the molecular basis of axon-Schwann cell interactions in relation to the structure and function of myelin and its molecular components. This summary will provide the basis for a follow-up discussion about the molecular control mechanisms that regulate myelination and nerve regeneration.
A. Structure and Function of Myelin
Myelin is an important vertebrate specialization that allows the rapid and effective saltatory propagation of action potentials along individual axons between the ion channel-containing nodes of Ranvier (127). Structurally, myelin is generated during ontogeny by membrane processes of Schwann cells (PNS) or oligodendrocytes (CNS), which spirally wrap around centrally placed axons. The resulting myelin sheaths are mainly compacted, i.e., the extracellular space between the turning loops is strongly reduced to form the intraperiod lines and
the cytoplasm is extruded to generate the major dense lines. Minor parts of myelin remain uncompacted, containing widely spaced membrane loops and extra cytoplasm. These structural domains include the paranodal loops and the Schmidt-Lanterman incisures as well as the cytoplasmic collar.
B. Biochemical Composition of Myelin
The main components of myelin are lipids (approximately 70% of the dry weight), whereas proteins represent only a minor, but functionally crucial, fraction (approximately 30%). The lipid content of myelin is unusually high compared to most other biological membranes, and it is thought that this particular biochemical composition may reflect the specialized function of myelin as an electrical insulator. Although myelin lipids have been studied intensively, no myelin-specific lipids have been found so far, but myelin is enriched in cholesterol and cerebrosides. There is good evidence for the importance of axon-Schwann cell interactions in the regulation of the rapid but transient myelin protein gene expression during development and nerve regeneration. Thus, one might anticipate that lipid biosynthesis in Schwann cells is subject to similar control mechanisms. Although this area has not been thoroughly investigated, it has been demonstrated that 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMG-CoA reductase; EC 1.1.1.34), a key regulatory enzyme in cholesterol metabolism, is rapidly down-regulated following nerve injury in the PNS in a time course that parallels the rapid down-regulation of the major PNS myelin proteins (128). In nonglial cells, HMG-CoA reductase is regulated by modifying its transcription rate through specific activators via sterol response elements in its promoter, altering the stability of its mRNA, or changing the rate of degradation of the enzyme itself (129). It will be an interesting task to determine if any of these mechanisms are also operative in myelinating Schwann cells and possibly regulated by axon-glia interactions. Studies of mouse myelination mutants and various human diseases affecting myelination revealed that the correct development and maintenance of PNS nerves are critically dependent on the correct function and expression of most myelin proteins (130, 131).
Interestingly, the major myelin proteins of the CNS and PNS differ. Some proteins are restricted either to the CNS [e.g., proteolipid protein (PLP)] or the PNS [e.g., protein zero (P0) and peripheral myelin protein 22 (PMP22)], whereas other proteins are found in both tissues [e.g., myelin basic protein (MBP) and myelin-associated glycoprotein (MAG)].
1. MYELIN PROTEIN ZERO (P0)
P0 is exclusively expressed by Schwann cells, and it is thus confined to PNS myelin, where it accounts for almost 50% of the total protein (132).
Structurally, P0 is a transmembrane glycoprotein of approximately 30 kDa that carries a single immunoglobulin (Ig)-like domain (133). Consequently, P0 can be viewed as one of the most primitive members of the immunoglobulin domain-containing superfamily of cell recognition proteins. During nerve development, a high level of P0 expression is first observed on Schwann cells at initial stages of myelination in the Schwann cell loops and at the Schwann cell-axon contact phase (for review see 134). As myelination proceeds, P0 is completely down-regulated in noncompacted myelin but very strongly expressed in compact myelin (135). After nerve injury, P0 disappears from previously myelinating Schwann cells and is reexpressed during myelination of regenerating axons in a pattern that is consistent with a putative regulation by axon-Schwann cell interactions (29). A series of in vitro and in vivo approaches have yielded consistent and complementary results concerning the function of P0 (136-138). In compact myelin, the mainly homophilic interactions between the Ig-like domains of P0 proteins are involved in the maintenance of the intraperiod line (138), although heterophilic P0 interactions with carbohydrate ligands may also contribute significantly (139). Furthermore, the highly basic, positively charged cytoplasmic domain of P0 appears to be required for the integrity of the major dense line by interaction with negatively charged membrane phospholipids (132, 140). Besides these two well-established functions of P0 in myelin compaction, P0 might also play a role in the promotion of myelin spiral formation, because myelination is significantly retarded in genetically engineered P0-deficient (P0 knockout) mice (138).
Further studies in the P0 knockout mice have revealed that P0 is involved not only in PNS development, but also in the maintenance of myelin and axon integrity (141). Myelin degeneration is a common theme in young P0 knockout mice, and similar features have been observed in older mice that have only one functional copy of the P0 gene (heterozygous P0 knockout mice). Most interestingly, the genetic alteration of Schwann cells also affects axons by causing significant morphological changes. Besides showing significantly reduced calibers, a considerable number of axons that are associated with an abnormal myelin sheath appear to degenerate with increasing age (138). Thus, normal P0 expression is involved in determining, directly or indirectly, the phenotype of the axonal partner, suggesting an important role for this protein in axon-Schwann cell interactions.
2. MYELIN BASIC PROTEINS
MBPs are small basic, positively charged cytoplasmic proteins that vary in molecular mass between 12 and 22 kDa, a heterogeneity that is generated via differential splicing from a single gene and by posttranslational modifications (142). Evidence from the naturally occurring mouse mutant shiverer (partial deletion in the MBP gene) suggests that MBPs are involved in maintaining CNS myelin stability and thickness (143). In the PNS, shiverer animals are only slightly affected, mainly by a significant increase in the number of Schmidt-Lanterman incisures (144). As an explanation for the apparently different effects of the MBP deletion in the CNS versus the PNS, it was suggested that the positively charged intracellular domain of P0 might compensate for the functional loss of MBP in the PNS. To test this hypothesis, mouse mutants lacking both P0 and MBP were generated (140). As anticipated, analysis of the double mutants showed that PNS myelin was now devoid of major dense lines, and Schwann cell processes forming spirals around their axons were uncompacted and contained cytoplasm. Thus, P0 and MBP play interchangeable roles during the formation of the major dense line in PNS myelin.
The regulation of MBP expression during PNS nerve development and regeneration is closely correlated with myelination and the expression of the P0 gene. Thus, the two genes that encode the largest fractions of PNS myelin protein might be coordinately regulated by axon-Schwann cell interactions (145).
3. PERIPHERAL MYELIN PROTEIN 22
PMP22 is a small hydrophobic 22-kDa PNS myelin protein (146-148). Although minor PMP22 expression has been found in nonmyelinating Schwann cells, it is predominantly localized in compact myelin, where it accounts for 2-5% of total protein. In the sciatic nerve, PMP22 expression parallels the expression of P0 and MBP during development and regeneration (147, 149). Transgenic experiments in rodents revealed that PMP22 is required for the correct development of peripheral nerves, the maintenance of axons, and the determination of myelin thickness and stability (150). Similar to the findings in mice lacking P0, PMP22-deficient mice are also retarded in myelination, suggesting a role of PMP22 in early steps of myelination (150). Based on these observations, it might be speculated that both PMP22 and P0 are components of a potential "premyelination" complex that must be formed to allow efficient initiation of myelination. Similar situations have been proposed in other systems, including the formation of connexons and glycoprotein "rafts," which are composed of molecules targeted for cellular compartmentalization by sorting in the epithelial trans-Golgi network (151-154). It is feasible that the myelin components, both protein and lipid, may need to be structurally organized in a precise stoichiometric ratio within the endoplasmic reticulum or Golgi apparatus of the cell before being transported to the myelin lamellae. The lack of one of these myelin proteins or a stoichiometric imbalance could result in delayed myelination or in the formation of unstable myelin structures. Heterozygous PMP22 knockout mice that have retained only one functional PMP22 gene copy display haploinsufficiency inasmuch as they develop sausagelike focal hypermyelinations (so-called tomacula) in peripheral nerves (150).
The expression of PMP22 is regulated by a dual promoter system (155). Whereas one of these promoters is active exclusively in the myelinating Schwann cell, a second, more ubiquitous and weaker promoter is responsible for the expression in other tissues and also in the non- and premyelinating Schwann cell (155, 156). The PNS, unlike other tissues, appears to be exquisitely sensitive to PMP22 gene dosage. The relatively mild 50% overexpression of PMP22 in the human peripheral nerve is responsible for a common peripheral neuropathy [Charcot-Marie-Tooth disease (CMT)] (reviewed in 157). The mode of action of this overexpression is unclear. However, overexpression of PMP22 in transgenic mice and rats leads to hypomyelination, with the severity correlating with the number of transgenic copies in different animal lines (158, 159). In the presence of high PMP22 gene dosage, myelination is completely inhibited and the transgenic Schwann cells show impaired differentiation characterized by expression of premyelination markers such as N-CAM, L1, and p75NGFR (158). It remains to be seen whether this effect reflects a general cellular disturbance, such as defective protein transport culminating in cell death, as has been observed in a comparable situation in CNS myelination diseases caused by mutations affecting the myelin proteolipid protein (PLP) (160, 161). Interestingly, the overexpression of PMP22, a presumed growth arrest gene (gas3) (162, 163), results in hyperproliferation of the Schwann cells in transgenic animals. Whether the reported prolongation of the G1 phase in PMP22-overexpressing Schwann cells in vitro or the induction of an apoptotic phenotype in NIH 3T3 cells as a result of overexpression is relevant in this context remains to be determined (164, 165).
However, although the exact function of PMP22 is not known, these seemingly contradictory results suggest that PMP22 may be involved in the regulation of general Schwann cell physiology, possibly by forming an adhesive channel or pore.
4. MYELIN-ASSOCIATED GLYCOPROTEIN
Although myelin-associated glycoprotein (MAG) is only a minor constituent of PNS myelin, its structure, expression, and localization make this protein a prime candidate for mediating axon-Schwann cell interactions during the initiation of myelin formation (20, 166). MAG is a heavily glycosylated recognition protein of 100 kDa and, with its five Ig-like domains, belongs to the immunoglobulin superfamily (121). MAG is first expressed in vitro after the Schwann cell has established the 1:1 relationship with the axon (114), or in vivo when the Schwann cell membrane has made one and one-half turns around the axon (22). In later developmental stages, MAG is not present in compact myelin but remains expressed periaxonally (periaxonal collar) and in noncompacted domains of myelin (22). Several lines of evidence suggest that MAG is directly involved in signal transduction events, in particular in association with the tyrosine kinase fyn (167, 168). Nevertheless, the exact extracellular ligand(s) of MAG are not known, although experimental results suggest that MAG belongs to a family of proteins that recognize sialylated glycans (169). Furthermore, gangliosides have been suggested to be involved in MAG-mediated cell-cell interactions (170).
In vitro experiments using cocultures of dorsal root ganglia and Schwann cells constitutively overexpressing MAG showed accelerated segregation and ensheathment of larger caliber axons (121). In contrast, the sorting of larger caliber axons and their ensheathment appeared inhibited if Schwann cells carrying a MAG-specific antisense mRNA were used in the same experiments (171). These data strongly suggested a crucial role of MAG at initial stages of myelination and in axon-Schwann cell interactions. However, peripheral nerves of MAG-deficient mice formed morphologically and biochemically normal myelin, and no delay in myelination or abnormal sorting of large-caliber axons was observed (172, 173). An exception was the absence of cytoplasmic collars from some Schwann cell/large-caliber axon units in the ventral roots (172). However, these abnormalities were not detected in sciatic and femoral nerves (173, 174). Interestingly, the scene changes dramatically when aged animals (older than 8 months) are examined.
Particularly in the quadriceps nerve, degenerating myelin, onion bulb formations (supernumerary Schwann cells and Schwann cell processes that are sensitive indicators of continuous processes of demyelination and remyelination), as well as axonal degeneration, often associated with myelin tomacula, become prominent morphological features (174). Based on these findings, MAG appears to be essential for the maintenance of but not the formation of myelin. In this respect, it is noteworthy that peripheral nerves of MAG-deficient mice show a distinct up-regulation of the neural cell adhesion molecule N-CAM at locations where MAG is normally expressed in wild-type mice (173). These data suggest a potential compensatory mechanism mediated by N-CAM during myelin formation and in the maintenance of uncompacted myelin domains.
5. OTHER MYELIN-ASSOCIATED PROTEINS IN THE PNS
Additional proteins found in PNS myelin include the cytosolic lipid-binding P2 protein, whose expression is restricted to myelinating cells (175). Furthermore, two related proteolipid proteins, plasmolipin, which might form cation-specific channels (176), and myelin and lymphocyte protein (MAL; MVP17) (177, 178), have been described as myelin components. E-Cadherin and connexin32 (Cx32) are localized in the uncompacted portions of myelin, the Schmidt-Lanterman incisures and paranodal loops, and may be involved in maintaining a Schwann cell cytoplasmic channel network (179, 180). Finally, periaxin is exclusively expressed by myelinating Schwann cells and its expression is under stringent axonal control. Although the function of periaxin is not yet known, it is likely to participate in early events of myelin formation as well as in the stabilization of the mature myelin sheath (179, 181).
6. NEURAL ADHESION MOLECULE AND L1
N-CAM and L1 are two additional members of the immunoglobulin superfamily that are expressed in neural cells in patterns that suggest potential roles in axon-Schwann cell interactions. As described earlier, both proteins are found during the development of peripheral nerves. However, they can also be found on fasciculated axons and Schwann cells at later stages, suggesting that they may have functional roles in myelination (21, 22, 182). Although nonmyelinating Schwann cells retain expression of N-CAM and L1 as development progresses, myelinating Schwann cells down-regulate both molecules. However, after nerve injury, N-CAM and L1 are up-regulated in the proliferating Schwann cells of the distal stump and on the regenerating axon (21, 182). Thus, N-CAM and L1 expression in myelinating Schwann cells seems to be negatively regulated by axonal contact. Interestingly, N-CAM knockout mice display no morphological phenotype in the PNS, indicating that PNS myelination does not depend on N-CAM expression (183).
C. Myelin-associated Genes and Hereditary Peripheral Neuropathies
Mutations affecting the genes encoding P0, PMP22, and Cx32 are associated with the dominantly inherited Charcot-Marie-Tooth disease (CMT1) or hereditary motor and sensory neuropathies (HMSNs) in humans (134, 184, 185). Interestingly, the phenotypes associated with different mutations are surprisingly similar, although there is some variation depending on the molecular nature of individual mutations. CMT1 shows onset of symptoms in the first to second decade of life. Behaviorally, progressive muscle weakness is the hallmark of the disease, but sensory problems are usually mild. Furthermore, foot deformities (including pes cavus) and difficulties with fine muscle movements of the hands are frequent. Clinically, CMT1 patients are characterized by decreased nerve conduction velocities due to demyelination of PNS nerves. In severely affected patients, congenital hypomyelination is often observed. The most common form of CMT1 (CMT1A) is the result of a 1.5-megabase duplication of chromosome 17p11.2, which includes the PMP22 gene, whereas the reciprocal 1.5-megabase deletion is associated with the mild peripheral neuropathy, hereditary neuropathy with liability to pressure palsies (HNPP). Thus, even minimal abnormalities in PMP22 expression can cause diseases of the PNS, a notion that can be extended to similar findings with PLP, the potential counterpart of PMP22 in the CNS. Because PMP22 expression is regulated by axon-glia interactions, it is reasonable to suggest that this regulation must be extraordinarily well controlled.
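The gene dosage argument can be made concrete with a small piece of arithmetic. The sketch below is only an illustration, not a model taken from the studies cited: it assumes that the duplication or deletion is present on one of the two chromosome 17 homologs and that PMP22 expression scales linearly with the number of functional gene copies. The function name and genotype labels are hypothetical.

```python
# Illustrative gene-dosage arithmetic.
# Assumption (not from the source): expression scales linearly with
# the number of functional PMP22 gene copies.

NORMAL_COPIES = 2  # one PMP22 copy on each chromosome 17 homolog


def relative_dosage(copies: int, normal: int = NORMAL_COPIES) -> float:
    """Return gene dosage relative to the normal diploid state."""
    if copies < 0 or normal <= 0:
        raise ValueError("copies must be >= 0 and normal must be > 0")
    return copies / normal


# Hypothetical labels for the genotypes discussed in the text,
# assuming the rearrangement affects only one homolog.
genotypes = {
    "normal": 2,
    "CMT1A (duplication on one homolog)": 3,
    "HNPP (deletion on one homolog)": 1,
}

for name, copies in genotypes.items():
    print(f"{name}: {copies} copies, {relative_dosage(copies):.0%} of normal")
```

Under these assumptions, three copies correspond to 150% of the normal dosage, which matches the 50% overexpression mentioned for CMT, and one copy corresponds to 50%, the situation in HNPP and in heterozygous PMP22 knockout mice.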
V. Transcriptional Regulation of Axon-Schwann Cell Interactions
As described in the preceding paragraphs, many proteins show a pronounced regulation by axon-Schwann cell interactions. It is reasonable to assume that one of these regulatory mechanisms operates at the transcriptional level. In support of this hypothesis, the transcription factor Krox24 (Zif268, Egr-1, NGFIA, tis8) has been shown to be required for p75NGFR expression (186). The most striking example, however, is suggested by the well-controlled, coordinate expression of the myelin genes as exemplified by the extremely high, transient demand for newly synthesized myelin proteins during development and nerve regeneration. Many efforts have been made over the last years to identify potential master regulators of myelin gene expression in analogy to the helix-loop-helix transcription factor MyoD in muscle development (187, 188). However, this search has proved to be extremely difficult, although some candidates have been identified and are discussed in the following sections.
A. Suppressed cAMP-inducible POU Protein
cAMP has been studied extensively as a possible general signaling molecule involved in axon-Schwann cell interactions based on the finding in vitro that myelin genes are up-regulated by forskolin, an activator of adenylate cyclase (32). Although the physiological relevance of these experiments has been questioned, SCIP was initially found in cultured Schwann cells due to its up-regulation by cAMP (32, 189, 190). Because the pit-oct-unc (POU) domain transcription factors are often associated with the regulation of cell type-specific events, and SCIP expression was shown to be tightly regulated by axon-glia interactions (33), SCIP appeared to be an attractive candidate for the regulation of myelin gene expression. In support of this hypothesis and in agreement with the regulation of SCIP expression in PNS nerves, SCIP was shown to act as a repressor of the P0 gene promoter in cotransfection assays (191).
Recently, controversial results have been obtained from the analysis of two sets of transgenic mice in which the expression of SCIP has been manipulated. In the first experiment, a truncated dominant-negative variant of SCIP was expressed specifically in myelinating Schwann cells under the transcriptional control of the P0 promoter (192). Besides modest hypermyelination, these transgenic mice showed striking premature myelination, suggesting that SCIP functions as a regulator of Schwann cell differentiation and, possibly, as a repressor of myelin gene expression. In contrast, mice completely lacking SCIP expression display a quite different phenotype in peripheral nerves, which is characterized by severe congenital hypomyelination (193). Thus, SCIP is likely to perform multiple functions in the development of peripheral nerves that extend beyond myelin protein gene regulation.
B. Role of the Zinc-finger Family Protein Krox20 in Schwann Cell Biology
Transgenic mice carrying a null mutation in the Krox20 gene show severe abnormalities in the segmentation of the hindbrain, leading to increased postnatal lethality (194). The few animals that survive into the second postnatal week start to tremble, a phenotype indicative of PNS abnormalities. On closer examination, a severe defect in Schwann cell development can be observed, resulting in hyperproliferation and a presumed differentiation arrest in a premyelination state just prior to myelination (195). These findings suggest that Krox20 regulates the expression of genes involved in the final progression of Schwann cells to myelin formation.
C. Function of Pax3 in Control of the Schwann Cell Lineage
Pax3 is a member of the paired domain transcription factors, which have recently received considerable attention due to their ability to function as master switches in cell fate decision. In particular, the Drosophila Pax6 homolog (eyeless) has been shown to be able to determine the phenotype of any cell by inducing it, and the surrounding cells, to differentiate into cellular components of the eye (196). Pax3 may play a similar role in controlling the Schwann cell lineage. Examination of the expression of the Pax3 gene shows a biphasic expression pattern in Schwann cells (197). The first peak at E14 coincides with the beginning of proliferative early Schwann cell development, and a second wave of expression is observed at the onset of myelination from birth to postnatal day 5. Two naturally occurring Pax3 mutants, splotch and splotch delayed, are affected by mutations in the DNA-binding paired domain, and both mutants are homozygous lethal at E13 to E14. Peripheral nerves of the splotch mutant reveal a complete lack of S100-positive cells in the PNS, although the neurons and the migrating neural crest cells appear normal (198). In contrast, a few S100-positive cells can be found in peripheral nerves of the splotch delayed mutant. Thus, one might speculate that during embryogenesis, Pax3 is likely to be involved in regulation of Schwann cell differentiation by inducing the switch from the Schwann cell precursor to the proliferating early Schwann cell phenotype around E14 (Fig. 1). Similarly, Pax3 appears to be involved in regulating the postnatal switch from a nonmyelinating to a myelinating phenotype.
FIG. 1. Schematic representation and molecular regulation of the generation of the Schwann cell lineage. [Diagram labels: TGFRs, neuregulins, neurotrophins; sensory neuron, enteric neuron; Pax-3; early Schwann cell; myelinating Schwann cell; non-myelinating Schwann cell; Pax-3, Krox24, Krox20; myelin gene; injury; regeneration.]
However, it is unclear whether Pax3 regulates the proliferation state of Schwann cells, or whether Schwann cell proliferation results in Pax3 down-regulation. The use of tissue-specific inducible knockout mutants may be able to address the question by modulating the expression of Pax3 during the time of Schwann cell proliferation and developmental switching. As previously described, neuregulins induce Schwann cell development from neural crest cells and repress neuronal differentiation. In addition, TGF-β can repress crest formation of pigment cells (37, 38). Although the downstream signaling cascades from neuregulin and TGF receptors are not fully understood, it is tantalizing to suggest that both may be linked to the regulation of Pax3 expression. Based on the master switch theory described for Pax6, it would be interesting to express Pax3 ectopically in early peripheral sensory neuron precursor cells to assess whether this regulatory protein might be able to bypass the signals that normally regulate ordered phenotypic neural crest differentiation, resulting in the ectopic generation of Schwann cells from neuronal progenitors.
D. Role of c-Jun in Axon-Schwann Cell Interactions
The transcription factor c-Jun, which belongs to the bZIP family of transcription factors (199), is exclusively expressed by nonmyelinating Schwann cells in normal nerve and previously myelinating Schwann cells after injury (200). During PNS development and regeneration, the expression pattern of c-Jun is consistent with its regulation by axon-Schwann cell interactions. Thus, c-Jun is likely to be involved in determining the differentiated state of nonmyelinating and axon-deprived Schwann cells, although the target genes for this transcription factor in such cells remain to be determined.
VI. Degeneration and Regeneration in the Nervous System
It has been suggested that the drastic changes taking place in Schwann cells following the loss of axonal contact show similarities to those in the developing nerve. Thus, the sciatic nerve crush-and-cut models have frequently been used to study not only the regeneration process, but also the factors regulating PNS development. If Schwann cells lose axonal contact, they enter a phase of dedifferentiation and rapid proliferation (201). The mitogenic signals that control this proliferation event are likely to be similar to those involved in the nonaxon contact-mediated proliferation of Schwann cell precursors. In this respect, the degeneration and regeneration processes in the PNS provide a unique opportunity to study regulatory changes that take place during development, without encountering problems of limited tissue availability and potential in vitro cell culture artifacts.
A. Regulation of Axonal Regeneration Is the Key to Repair
It has been suggested that the different protein composition of PNS versus CNS myelin might be of functional significance with respect to the regenerative ability of the PNS, which contrasts sharply with that of the CNS. Indeed, potent neurite growth-inhibiting factors, NI-35 and NI-250, are expressed by oligodendrocytes and influence the plastic potential and regenerative capacity of the CNS (202). However, a neutralizing antibody (IN-1) directed against the inhibitory components appears only partially effective in improving axonal regeneration after lesioning of CNS tracts in vivo (203). This suggests that additional inhibitory factors might also contribute to the inability of the CNS to regenerate. MAG has been proposed as an additional inhibitory myelin protein (204, 205), although the available evidence is controversial (206). Because MAG is also expressed in the regeneration-competent PNS, albeit at a tenth of the level in the CNS, it will be interesting to determine whether this quantitative difference of MAG expression is crucial for the ability of the PNS to regenerate. Alternatively, regeneration-promoting molecules may override the inhibitory activity of MAG in the PNS, or the removal of the MAG-containing myelin debris during Wallerian degeneration may clear the path for regenerating axons by providing a favorable substrate for regeneration.
The ECM is known to be an important regulator of Schwann cell development, particularly in the myelinating state, and it has been shown that tenascin-C is a marker for the regeneration process in the peripheral nerve (207). After nerve transection, Schwann cells produce tenascin-C, which is deposited in the ECM, where it acts as a permissive substrate for axonal regeneration (98, 207).
Although some components of PNS myelin are known inhibitory substrates for axonal regeneration, neurites of transected neurons grow into the region after the myelin debris has been cleared by phagocytosis, and laminin has been identified as one of the permissive factors in sciatic nerve preparations that could override the myelin-associated inhibitory effects (97, 208).
B. Neurotrophic Factors Are Released during Wallerian Degeneration
Following injury to the nervous system, the prime objective is to prevent neurons from dying and to facilitate regeneration. Ciliary neurotrophic factor (CNTF) is renowned for its ability to rescue neurons deprived of trophic factors. Schwann cells are the major source of intracellularly stored CNTF (reviewed in 209). Because CNTF lacks a classical signal sequence for secretion, it is conceivable that CNTF is produced by Schwann cells and stored in the cytoplasm, only to be released in case of cell damage, to support the surrounding neurons. After a crush or cut injury to the nerve, Schwann cells lose contact with axons and CNTF production is down-regulated (210), but on reinnervation, Schwann cells are induced to reexpress CNTF. This process is reproducible in vitro when isolated neonatal CNTF-negative rat Schwann cells are placed in coculture with dorsal root ganglion neurons. In this system, up-regulation of CNTF expression is independent of myelination (209).
Interleukin-6 (IL-6) shares gp130 as a receptor-signaling component with CNTF. Interestingly, Schwann cells increase their expression of IL-6 in response to a number of pathological situations, including inflammation, infection, and nerve injury (211). Whether IL-6, besides its function as a neurotrophic factor, may also support Schwann cell survival in an autocrine fashion remains to be determined. In addition, IL-6 may have indirect effects in the interplay of various cytokines, including TNF-α (212), IL-1 (213, 214), and IL-10 (215), which have been found in injured PNS nerves.
VII. Axon-Schwann Cell Interactions as a Bilateral Communication
The interactions between axons and Schwann cells are not unidirectional. The axons also respond to the Schwann cells, as exemplified by the redistribution of Na+ channels on axons that are ensheathed by Schwann cells (115, 116). Nonmyelinated axons show a ubiquitous distribution of these channels with no focal concentration to specific regions. In the myelinated axon, however, these channels are concentrated at the nodes of Ranvier between two consecutive myelin sheaths, in agreement with the concept of saltatory conduction (115). It could be imagined that these specialized regions regulate the positioning of the ensheathing glial cells, but regeneration studies demonstrated that the converse is true in that Schwann cells regulate the position and also the channel composition of the nodes of Ranvier. When Schwann cells remyelinate axons during regeneration, the glial processes extending along the axon are preceded by clusters of Na+ channels (116). These clusters eventually aggregate between two adjacent Schwann cells and form new nodes, which are positionally distinct from the nodes of Ranvier of the original uninjured nerve (216). Thus, Schwann cells can regulate the distribution of proteins on the axonal membrane.
VIII. Mechanisms of Membrane Sorting in Myelinating Schwann Cells
Schwann cells require a powerful sorting and transport system during myelin formation because large amounts of membrane components have to be synthesized and transported within a short time. Interestingly, the growth-associated protein-43 (GAP-43) has been proposed to be involved in the membrane fusion process as part of exocytosis and transfer of membrane proteins to the cell surface (217). However, GAP-43, although present in nonmyelinating Schwann cells (218), is not highly expressed during the myelination period, which makes it an unlikely candidate for a similar role in Schwann cells (219, 220). Nevertheless, such transport-facilitating proteins are also expected to be present in actively myelinating Schwann cells, where they may be involved in the regulation of vesicle formation and in determining the rate of protein transport to myelin membranes. To expand further on this highly speculative hypothesis, some components that may normally reside in intracellular membranous compartments, such as the endoplasmic reticulum or the Golgi apparatus, may be transported to the myelin sheath because of the extraordinarily high demand for membranes during myelination. An indication for such a potential mechanism is provided by the finding that some myelin glycoproteins (such as P0) are found in myelin in both terminally and immaturely glycosylated states (221). The presence of high mannose-type sugars on surface-expressed molecules may be interpreted to mean that either the cellular glycosylation or the transport machinery has been overrun and proteins are being expressed on the surface prematurely (221).
IX. Future Perspectives
Although our understanding of the molecular basis of interactions in the PNS has advanced considerably and a number of the communicating molecules in this complex scenario have been identified, we are still a long way from understanding this elaborate system of cell-cell and cell-matrix interactions. The nervous system is without doubt a unique place to study interactions between different cell types within specified developmental windows and to elucidate how these multilateral interplays guide the precise array of phenotypic changes observed. Much of our future work will be directed toward a detailed understanding of these processes, in particular, by further cataloguing the participants and examining their functions within the system. As the next step, the molecular biology of spatial and temporal resolution will be of fundamental importance to the understanding of such dynamic arrays as the nervous system. To achieve this goal, new advanced molecular and cell biology techniques will be necessary, for example, specific tracers that will allow us to follow molecules precisely in living tissue by microscopic techniques, particularly to support our attempts to dissect the specific signaling pathways involved. Last but not least, it is hoped that a thorough understanding of the special dynamics of the nervous system at the molecular level will ultimately lead to the development of effective treatment strategies for nerve regeneration after injury and for the devastating degenerative diseases of the nervous system.
REFERENCES
1. D. J. Anderson, Neuron 3, 1 (1989).
2. N. M. Le Douarin and C. Ziller, Curr. Opin. Cell Biol. 5, 1036 (1993).
3. D. J. Anderson, Curr. Opin. Neurobiol. 3, 8 (1993).
4. D. L. Stemple and D. J. Anderson, Cell 71, 973 (1992).
5. K. R. Jessen, A. Brennan, L. Morgan, R. Mirsky, A. Kent, Y. Hashimoto and J. Gavrilovic, Neuron 12, 509 (1994).
6. M. Bronner-Fraser, Curr. Opin. Genet. Dev. 3, 641 (1993).
7. E. Frank and J. R. Sanes, Development 111, 895 (1991).
8. N. Le Douarin, C. Dulac, E. Dupin and P. Cameron-Curry, Glia 4, 175 (1991).
9. M. Bronner-Fraser, BioEssays 15, 221 (1993).
10. J. T. Trachtenberg and W. J. Thompson, Nature (London) 379, 174 (1996).
11. A. J. Aguayo, L. Charron and G. M. Bray, J. Neurocytol. 5, 565 (1976).
12. K. R. Jessen, R. Mirsky and L. Morgan, J. Neurosci. 7, 3362 (1987).
13. A. Waller, Philos. Trans. R. Soc. Lond. (Biol.) 140, 423 (1850).
14. K. R. Jessen and R. Mirsky, Curr. Opin. Neurobiol. 2, 575 (1992).
15. K. R. Jessen and R. Mirsky, Glia 4, 185 (1991).
16. J. Gavrilovic, A. Brennan, R. Mirsky and K. R. Jessen, Eur. J. Neurosci. 7, 77 (1995).
17. H. J. Stewart, P. A. Eccleston, K. R. Jessen and R. Mirsky, J. Neurosci. Res. 30, 346 (1991).
18. J. R. Sanes, Annu. Rev. Neurobiol. 12, 491 (1989).
19. M. Schachner and R. Martini, Trends Neurosci. 18, 183 (1995).
20. R. Martini, J. Neurocytol. 23, 1 (1994).
21. R. Martini and M. Schachner, JCB 106, 1735 (1988).
22. R. Martini and M. Schachner, JCB 103, 2439 (1986).
23. D. J. Anderson, FASEB J. 8, 707 (1994).
24. L. Lo and D. J. Anderson, Neuron 15, 527 (1995).
25. R. Mirsky, C. Dubois, L. Morgan and K. R. Jessen, Development 109, 105 (1990).
26. G. Sobue, B. Kreider, A. Asbury and D. Pleasure, Brain Res. 280, 263 (1983).
27. Z. Dong, A. Brennan, N. Liu, Y. Yarden, G. Lefkowitz, R. Mirsky and K. R. Jessen, Neuron 15, 585 (1995).
28. K. R. Jessen, L. Morgan, H. J. Stewart and R. Mirsky, Development 109, 91 (1990).
29. G. Lemke and M. Chao, Development 102, 499 (1988).
30. L. Morgan, K. R. Jessen and R. Mirsky, JCB 112, 457 (1991).
AXON-GLIA INTERACTIONS I N THE PNS
251
31. L. Cheng and A. W. Mudge, Neuron 16,309 (1996). 32. E. S. Monuki, G. Weinmaster, R. Kuhn and G. Lemke, Neuron 3,783 (1989). 33. S. S. Scherer, D. Y. Wang, R. Kuhn, G. Lemke, L. Wrabetz and J. Kamholz,J. Neurosci. 14, 1930 (1994). 34. M. L. Reynolds and C. J. Woolf, Cuw. @in. Neurobiol. 3,683 (1993). 35. S. Einheber, M. J. Hannocks, C. N. Metz, D. B. Rifkin and J. L. Salzer,JCB 129,443 (1995). 36. V. Guenard, T. Rosenbaum, L. A. Gwynn, T. Doetschman, N. Ratner and P. M. Wood, Glia 13,309 (1995). 37. K. M. Stocker, L. Sherman, S. Rees and G. Ciment, Develupment I l l , 635 (1991). 38. S. L. Rogers, P. J. Gegick, S. M. Alexander and P. G. McGuire, Deu. Biol. 154 192 (1992). 39. V. Guenard, L. A. Gwynn and P. M. Wood, /. Neurosci. 15,419 (1995). 40. L. Cheng, M. Khan and A. W. Mudge,JCB 129,789 (1995). 41. M. C. Raff, E. Abney, J. P. Brockes and A. Hornby-Smith, Cell 15,813 (1978). 42. J. P. Brockes, K. J. Fryxell and G. E. Lemke, J. Exp. Biol. 95,215 (1981). 43. A. D. Goodearl, J. B. Davis, K. Mistry, L. Minghetti, M. Otsu, M. D. Waterfield and P. Stroobant,JBC 268,18095 (1993). 44. M. A. Marchionni,A. D. J. Goodearl, M. S. Chen, 0. Bermingharr-McDonogh, C. Kirk, M. Hendricks, F. Danehy, D. Misumi, J. Sudhalter and K. Kobayashi, Nature (London) 362, 312 (1993). 45. M. A. Marchionni, Nature (London)378,334 (1995). 46. W. E. Holmes, M. X. Sliwkowski, R. W. Akita, W. J. Henzel, J. Lee, J. W. Park, D. Yansura, N. Abadi, H. Raab, G. D. Lewis, H. M. Shepard, W. J. Kuang, W. I. Wood, D. V. Goeddel and R. L. Vandlen, Science 256,1205 (1992). 47. D. L. Falls, K. M. Rosen, G. Corfas, W. S. Lane and G. D. Fischbach, Cell 72,801 (1993). 48. D. Wen, S. V., Suggs, D. Karunagaran, N. Liu, R. L. Cupples, Y. Luo, A. M. Jmssen, N. BenBaruch, D. B. Trollinger, V. L. Jacobsen, S.-Y. Meng, H. S. Lu, S. Hu, D. Chang, W. Yang, D. Yanigahara, R. A. Koski and Y. Yarden, MCBiol 14, 1909 (1994). 49. D. Wen, E. Peles, R. Cupples, S. V. Suggs, S. S. Bacus, Y. Luo, G. Trail, S. Hu, S. M. 
Silbiger, R. B. Levy, Y. Luo and Y. Yarden, Cell 69, 559 (1992). 50. E. Peles and Y. Yarden, BioEssays 15, 815 (1993). 51. K. L. I. Carraway and S. J. Burden, Cum. @in. Neurobiol. 5,606 (1995). 52. T. Wada, X. Qian and M. I. Greene, Cell 6l, 1339 (1990). 53. R. Goldman, R. B. Levy, E. Peles and Y. Yarden, Bchem 29,11024 (1990). 54. G. D. Plowman, J. M. Green, J. M. Culouscou, G. W. Carlton, V. M. Rothwell and S. Buckley, Nature (London)366,473 (1993). 55. K. L. R. Carraway and L. C. Cantley, Cell 7 8 , s (1994). 56. A. D. Levi, R. P. Bunge, J. A. Lofgren, L. Meima, F. Hefti, K. Nikolics and M. X. Sliwkowski, J. Neurosci. 15,1329 (1995). 57. B. T. Zhang, N. Hlkawa, H. Hone and T. Takenaka,J. Neurosci. Res. 41,648 (1995). 58. A. Nikitin, L. A. Ballering, J. Lyons and M. F. Rajewsky, PNAS 88,9939 (1991). 59. A. Orr-Urtreger, L. Trakhtenbrot, R. Ben-Levy, D. Wen, G. Rechavi, P. Lonai and Y. Yarden, PNAS 90,1867 (1993). 60. D. Meyer and C. Birchmeier, Nature 378,386 (1995). 61. N. M. Shah, M. A. Marchionni, I. Isaacs, P. Stroobant and D. J. Anderson, Cell 77,349 (1994). 62. M. Gassmann, F. Casagranda, D. Orioli, 1% Simon, C. Lai, R. Klein and G. Lemke, Nature (London)378,390 (1995). 63. K. F. Lee, H. Simon, H. Chen, B. Bates, M. C. Hung and C. Hauser, Nature 378, 394 (1995). 64. H. Gu, J. D. Marth,P. C. Orban, H. Mossmann and K. Rajewsky, Science 265,103 (1994).
252 65. 66. 67. 68. 69.
VERDON TAYL.OR AND UELI SUTER
P. Soriano, Annu. Reu. Neurosci. 18, 1 (1995). R. Kuhn, F. Schwenk, M. Aguet and K. Rajewsky, Science 269,1427 (1995). E. S. Anton, G. Weskamp, L. F. Reichardt and W. D. Matthew, PNAS 91,2795 (1994). G. L. Barrett and P. F. Bartlett, PNAS 9 4 6501 (1994). S. Rabizadeh, J. Oh, L. T. Zhong, J. Yang, C. M. Bitler, L. L. Butcher and D. E. Bredesen, Science 2 6 4 345 (1993). 70. R. Curtis, K. M. Adryan, J. L. Stark, J. S. Park, D. L. Compton, G . Weskamp, L. J. Huher, M. V. Chao, R. Jaenisch, K. F. Lee, R. M. Lindsay and P. S. Distefano, Neuron 14, 1201 (1995). 71. M. Bothwell, Annu. Rev. Neurosci. 18,223 (1995). 72. P. Kahle, P. A. Barker, E. M. Shooter and C. Hertel, J. Neumsci. Res. 38,599 (1994). 73. R. J. Smeyne, R. Klein, A. Schnapp, L. K. Long, S. Bryant, A. Lewin, S. A. Lira and M. Barhacid, Nature (London)368,246 (1994). 74. K. F. Lee, E. Li, L. J. Huber, S. C. Landis, A. H. Sharpe, M. V. Chao and R. Jaenisch, Cell 69,737 (1992). 75. L. C. Schecterson and M. Bothwell, Neuron 9,449 (1992). 76. R. Klein, R. J. Smeyne, W. Wurst, L. K. Long, B. A. Auerhach, A. L. Joyner and M. Barbacid, Cell 75,113 (1993). 77. M. E. Sebert and E. M. Shooter,J. Neurosci. Res. 36,357 (1993). 78. R. Heumann, S. Korsching, C. Bandtlow and H. Thoenen,JCB 104,1623 (1987). 79. R. Heumann, D. Lindholm, C. Bandtlow, M. Meyer, M. J. Radeke, T. P. Misko, E. Shooter and H. Thoenen, PNAS 84,8735 (1987). 80. B. C. Varnum, C. Young, G. Elliott, A. Garcia, T. D. Bartley, Y. W. Fridell, R. W. Hunt, G . Trail, C. Clogston, R. J. Toso, D. Yanaghara, L. Bennet, M. Sylber, L. A. Merewether, A. Tseng, E. Escohar, E. T. Liu and H. K. Yamane, Nature (London) 373,623 (1995). 81. K. Ohashi, K. Nagata, J. Toshima, T. Nakano, H. Arita, H. Tsuda, K. Suzuki and K. Mizuno, JBC 270,22681 (1995). 82. T.N. Stitt, G. COM, M. Gore, C. Lai, J. Bruno, C. Radziejewski, K. Mattsson, J. Fisher, D. R. Gies, P. F.Jones, l? Masiakowski, T. E. Ryan, N. J. Tobkes, D. H. Chen, P. S. Distefano, G. L. Long, C. Basilico, M. P. 
Goldfbrb, G . Lemke, D. J. Glass and G. D. Yancopoulos, Cell 80,661 (1995). 83. P. J. Godowski, M. R. Mark, J. Chen, M. D. Sadick, H. Raab and R. G. Hammonds, Cell 82, 355 (1995). 84. R. H. Li, J. Chen, G. Hammonds, H. Phillips, M. Armanini, P. Wood, R. Bunge, P. L. Godowski, M. X. Sliwkowski and J. P. Mather,J. Neurosci. 15,2012 (1996). 85. T. D. Bartley, R. W. Hunt, A. A. Welcher, W. J. Boyle, V. P. Parker, R. A. Lindherg, H. S. Lu, A. M. Colombero, R. L. Elliott, B. A. Guthrie, P. L. Holst, J. D. Skrine, A. J. Toso, M. Zang, E. Femandez, G . Trail, B. Vamum, Y. Yarden, T. Hunter, G . M. Fox Nature (London) 368, 558 (1994). 86. M. P. Beckmann, D. P. Cerretti, P. Baum, T. Vanden Bos, L. James, T. Farrah, C. Kozlosky, T. Hollingsworth, H. Shilling, E. Maraskovaky, F. A. Fletcher, V. Lhotak, T. Pawson and S. D. Lymann, EMBO]. 13,3757 (1994). 87. S. Davis, N. W. Gale, P. C. Aldrich, M. Maisonpierre, V. Lhotak, T. Pawson, G . D. Goldfarb and G . D. Yancopoulos, Science 266,5 186 (1994). 88. J. W. Winslow, P. Moran, J. Valverde, A. Shih, J. Q. Yuan, S. C. Wong, S. P. Tsai, A. Goddard, W. J. Henzel, F. He&, K. D. Beck and I. W. Caras, Neuron 14,973 (1995). 89. U. Drescher, C. Kremoser, C. Handwerker,J. Loschinger,M. Noda and F. Bonhoeffer, Cell 82,359 (1995). 90. H. J. Cheng, M. Nakamoto, A. D. Bergemann and J. G . Flanagan, eel2 82,371 (1995). 91. C. Lai and G . Lemke, Neuron 6,691 (1991).
AXON-GLIA INTERACTIONS IN THE PNS
253
92. R . P. Bunge, M. B. Bunge and M. Cochran, Neurology 28,59 (1978). 93. R. P. Bunge and M. B. Bunge,JCB 78,943 (1978). 94. C. F. Eldridge, M. B. Bunge, R. P. Bunge and P. M. Wood,JCB 105,1023 (1987). 95. C. F. Eldridge, M. B. Bunge and R. P. Bunge,J. Neurusci. 9,625 (1989). 96. R . P. Bunge, M. B. Bunge and C . F. Eldridge, Annu. Rm. Neurusci. 9,305 (1986). 97. E. S. Anton, A. W. Sandrock, Jr. and W. D. Matthew, Dev. Biol. 164,133 (1994). 98. M. Fruttiger, M. Schachner and R. Martini,]. Neurocytol. 24, 1 (1995). 99. H. Xu, X . R. Wu, U. M. Wewer and E. Engvd, Nuture Genet. 6,297 (1994). 100. H. J. Weinberg, P. S. Spencer and C. S. Raine, Bruin Res. 88, 532 (1975). 101. H. Yamada, A. Chiba, T. Endo, A. Kobata, L. V. B. Anderson, H. Hori, H. Fukuta-Ohi, I. Kanazawa, K. P. Campbell, T. Shimizu and K. Matsumura,]. Neurochem. 66,1518 (1996). 102. M. D. Schaller, J. D. Hildebrand, J. D. Shannon, J. W. Fox, R. R. Vines and J. T. Parsons, MCBiuZ 14, 1680 (1994). 103. S. J. Shattil, B. Haimovich, M. Cunningham, L. Lipfert, J. T. Parsons, M. H. Ginsberg and J. S. Brugge,JBC 269,14738 (1994). 104. C. Serra-Pages,N. L. Kedersha, L. Fazikas, Q. Medley, A. Debant and M. Streuli, EMBOJ. 14, 2827 (1995). 105. S. Einheber, T. A. Milner, F. Giancotti and J. L. Salzer,JCB 123, 1223 (1993). 106. M. L. Feltri, S. S. Scherer, R. Nemni, J. Kamholz, H. Vogelbacker, M. 0.Scott, N. Canal, V. Quaranta and L. Wrabetz, Develgment 120,1287 (1994). 107. G. Kidd, S. B. Andrews and B. D. Trapp,]. Neurusci. 16,946 (1996). 108. S. Higashiyama,R. Iwamoto, K. Goishi, G. Raab, N. Taniguchi, M. Klagsbmn and E. Mekada, ]CB 128,929 (1995). 109. K. Nakamura, R. Iwamoto and E. Medada,JCB 129,1691 (1995). 110. E. S. Anton, M. Hadjiargyrou, P. H. Patterson and W. D. Matthew,]. Neurusci. 15, 584 (1995). 111. M. Hadjiargyrou and P. H. Patterson,]. Neurosci. 15, 574 (1995). 112. C. Schmidt, V. Kiinemund, E. S. Wintergerst, B. Schmitz and M. Schachner,1.Neurosci. Res. 43, 12 (1996). 113. P. M . Wood, M. 
Schachner and R. P. Bunge,]. Neurosci. 10,3635 (1990). 114. G. C. Owens and R. P. Bunge, Gliu 2,119 (1989). 115. E. H. Joe and K. J. Angelides,]. Neurosci. 13,2993 (1993). 116. E . H . Joe and K. Angelides, Nature (London) 356,333 (1992). 117. S. T. Hsieh, G. J. Kidd, T. 0. Crawford, Z. Xu, W. M. Lin, B. D. Trapp, D. W. Cleveland and J. W. Griffin, ]. Neurosci. 14,6392 (1994). 118. L. L. Kirkpatrick and S. T. Brady,]. Neurusci. 14,7440 (1994). 119. R. Martini, M. Schachner and T. M. Brushart,]. Neurosci. 14,7180 (1994). 120. L. M. Bolin and E. M. Shooter,]CB 123,237 (1993). 121. G. C. Owens, C. J. Boyd, R. P. Bunge and J. L. Salzer,JCB 111,1171 (1990). 122. U . Suter, A. A. Welcher, T. Ozcelik, G. J. Snipes, B. Kosaras, U. Francke, G. S. Billings, R. L. Sidman and E. M. Shooter, Nature (London)356,241 (1992). 123. U . Suter, J. J. Moskow, A. A. Welcer, G. J. Snipes, B. Kosaras, R. L. Sidman, A. M. Buchberg and E. M. Shooter, PNAS 69,4382 (1992). 124. A. J. Aguayo, M. Attiwell, J. Trecarten, S. Perkins and G. M. Bray, Nature (London)265,73 (1977). 125. A. 1. Aguayo, G . M. Bray and S. C . Perkins, Ann. N.Y. Acad. Sci. 317, 512 (1979). 126. S. M . de Waegh, V. M. Lee and S. T. Brady, CeZE 68,451 (1992). 127. A. Peters, S . L. Palay and D. F. H. Webster, “The Fine Structure of the Neurvous System: The Neurons and Supporting Cells.” W. B. Saunders, Co., Philadelphia, 1976. 128. J. E Goodrum,]. Neurochem. 54,1709 (1990).
254
VERDON TAYLOR AND UELI SUTER
129. J. L. Goldstein and M. S. Brown, Nature (London)343,425 (1990). 130. G. J. Snipes and U. Suter, Brain Pathol. 5,233 (1995). 131. U. Suter and G. J. Snipes, Annu. Rev. Neurosci. 18,45 (1995). 132. G. Lemke, Glia 7,263 (1993). 133. G. Lemke and R. Axel, Cell 40, 501 (1985). 134. G. J. Snipes and U. Suter,]. Anat. 186,483 (1995). 135. R. Martini, E. Bollensen and M. Schachner, Dev. Biol. 129,330 (1988). 136. D. D’Urso, P. J. Brophy, S. M. Staugaitis, C. S. Gillespie, A. B. Frey, J. G. Stempak and D. R. Colman, Neuron 4,449 (1990). 137. M. T. Fdbin, F. S. Walsh, B. D. Trapp, J. A. Pizzey and G. I. Tennekoon, Nature (London) 344, 871 (1990). 138. K. P. Giese, R. Martini, G. Lemke, P. Soriano and M. Schachner, Cell 71,565 (1992). 139. J. Schneider-Schaulies,A. von Brunn and M. Schachner,J. Neurosci. Res. 27,286 (1990). 140. R. Martini, M. H. Mohajeri, S. Kasper, K. P. Giese and M. Schachner,]. Neurosci. 15,4488 (1995). 141. R. Martini, J. Zielasek, K. V. Toyka, K. P. Giese and M. Schachner, Nature Genet. 11,281 (1995). 142. J. Kamholz and L. Wrabetz, in “Myelin:Biology and Chemistry” (R. E. Martenson, ed.), p. 367. CRC Press, Boca Raton, 1992. 143. Y. Inoue, R. Nakamura, K. Mikoshiba and Y. Tsukada, Bruin Res. 219,85 (1981). 144. R. M. Could, A. L. Byrd and E. Barbarese,]. Neurocytol. 24,85 (1995). 145. S. K. Gupta, J. F. Poduslo and C . Mezei, Brain Res. 464, 133 (1988). 146. U. Suter and G. J. Snipes,]. Neurosci. Res. 40, 145 (1995). 147. G. J. Snipes, U. Suter, A. A. Welcher and E. M. Shooter,]CB 117,225 (1992). 148. A. A. Welcher, U. Suter, M. De Leon, G .J. Snipes and E. M. Shooter, PNAS 88,7195 (199 1). 149. G . Kuhn, A. Lie, S. Wilms and H. W. Muller, Glia 8,256 (1993). 150. K. Adlkofer, R. Martini, A. Aguzzi, J. Zielasek, K. V. Toyka and U. Suter, Nature Genet. 11, 274 (1995). 151. L. S. Musil land D. A. Goodenough, Cell 74,1065 (1993). 152. K. Fiedler, R. G. Parton, R. Kelher, T. Etzold and K. Simons, EMBOJ.13,1729 (1994). 153. L. A. Hannan, M. P. 
Lisanti, E. Rodriguez-Boulan and M. Edidin,JCB 120,353 (1993). 154. D. A. Brown and J. K. Rose, Cell 68,533 (1992). 155. U. Suter, G.J. Snipes, R. Schoener-Scott, A. A. Welcher, S. Pareek, J. R. Lupski, R. A. Murphy, E. M. Shooter and P. I. Pate1,JBC 269,25795 (1994). 156. D. Baechner, T. Liehr, H. Hameister, H. Altenberger, H. Grehl, U. Suter and B. Rautenstrauss,]. Neurosci. Res. 42, 733 (1995). 157. U. Suter and P. I. Patel, Human Mutat. 3 , 9 5 (1994). 158. J. P. Magyar, R. Martini, T. Ruelicke, A. Aguzzi, K. Adlkofer, Z. Dembic, J. Zielasek, K. V. Toyka and U. Suter, J. Neurosci. 16,5351 (1996). 159. M. Sereda, I. CrifEths, A. Puhlhofer, H. Stewart, M. J. Rossner, F. Zimmermann, J. P. Magyar, A. Schneider, E. Hund, H.-M. Meinck, U. Suter and K. A. Nave, Neuron 16,1049 (1996). 160. I. R. Griffiths, A. Schneider, J. Anderson and K. A. Nave, Brain Pathol. 5,275 (1995). 161. P. E. Knapp, R. P. Skoff and D. W. Redstone, J. Neurosci. 6,2813 (1986). 1fi2. C. Schneider, R. M. King and L. Philipson, Cell 54,787 (1988). 163. G. Manfioletti, M. E. Ruaro, G . Del Sal, L. Philipson and C. Schneider, MCBioZ 10,2924 (1990). 164. E. Fabbretti, P. Edomi, C. Brancolini and C. Schneider, Genes Dev. 9,1846 (1995). 165. G. Zoidl, S. Blass-Kampmann, D. D’Urso, C. Schrnalenbach and H. W. Muller, EMBOJ. 14, 1122 (1995). 166. N. H. Stemberger, R. H. Quarks, Y. ltoyama and H. D. Webster, PNAS 76, 1510 (1979).
AXON-GLIA INTERACTIONS IN THE PNS
255
167. H . Umemori, S . Sato, T. Yagi, S. Aizawa and T. Yamamoto, Nature (London) 367, 572 (1994). 168. M. L. Jaramillo, D. E. Afar, G. Almazan and J. C. Bell, JBC 269,27240 (1994). 169. S. Kelm, A. Pelz, R. Schauer, M. T. Filbin, S. Tang, M. E. de Bellard, R. L. Schnaar, J. A. Mahoney, A. Hartnell, P. Bradfield and P. R. Crocker, Cuff.Bid.4 , 9 6 5 (1994). 1 70. L.J. Yang, C. B. Zeller, N. L. Shaper,M. Kiso, A. Hasegawa, R. E. Shapiro and R. L. Schnaar, PNAS 93,814 (1996). 171. G. C. Owens and R. P. Bunge, Neuron 7,565 (1991). 172. C. Li, M . B. Tropak, R. Gerlai, S. Clapoff; W. Abramow-Newerly, B. Trapp, A. Peterson and J. Roder, Nature (London) 369, 747 (1994). 173. D. Montag, K. P. Giese, U. Bartsch, R. Martini, Y. Lang, H. Bluthmann, J. Karthigasan, D. A. Kirschner, E. S. Wintergerst, K. A. Nave, J. Zeilasek, K. V. Toyka, H.-P. Lipp and M. Schachner, Neuron 13,229 (1994). 174. M. Fruttiger, D. Montag, M. Schachner and R. Martini, Eur J. Neut-osci.7,511 (1995). 175. R . E. Martenson and K. Uyemura, in “Myelin: Biology and Chemistry” (Ti. E. Martenson, ed.), p. 509. CRC Press, Boca Raton, 1992. 176. I. Fischer and V. S. Sapirstein,JBC 269,24912 (1994). 177. N . Schaeren-Wiemers,D. M. Valenzuela, M. Frank and M. E. Schwab,]. Neurosci. 15,5753 (1995). 178. T. Kim, K. Fiedler, D. L. Madison, W. H. Krueger and S. E. Pfeiffer,J. Neurosci. Res. 42, 413 (1995). 179. S . S . Scherer, S. M. Deschenes, Y.-t. Xu, J. B. Grinspan, K. H. Fischbeck and D. L. Paul,]. Neurosci. Res. 15, 8281 (1995). 180. A. M. Fannon, D. L. Sherman, G. Ilyina-Gragerova,l? J. Brophy, V. L. Friedrich, Jr. and D. R. Colman,]CB 129,189 (1995). 181. C. S. Gillespie, D. L. Sherman, G. E. Blair and P. J. Brophy, Neuron 12,497 (1994). 182. R. Tacke and R. Martini, Neurosci. Lett. 120,227 (1990). 183. H . Cremer, R. Lange, A. Christoph, M. Plomann, G. Vopper, J. Roes, R. Brown, S. Baldwin, P. Kraemer, s. Scheft D. Barthels, K. Rajewsky and W. Wille, Nature (London)367, 455 (1994). 184. U . Suter and P. I. 
Patel, Human Mutat. 3, 95 (1994). 185. P. I . Patel and J. R. Lupski, Trends Biochem. Sci. 10, 128 (1994). 186. S. S . Nikam, G. I. Tennekoon, B. A. Christy, J. E. Yoshino and J. L. Rutkowski, Mol. Cell. Neurosci. 6,337 (1995). 187. J. B. Miller, E. A. Everitt, T. H. Smith, N. E. Block and J. A. Dominov, BioEssays 15, 191 (1993). 188. E. N. Olson, T. J. Brennan, T. Chakraborty, T. C. Cheng, P. Cserjesi, D. Edmondson, G. James and L. Li, MCBchem 104,7 (1991). 189. U. Suter and G. J. Snipes, Trends, Neurosci. 17,399 (1994). 190. E. S . Monuki, R. Kuhn, G. Weinmaster, B. D. Trapp and G. Lemke, Science 249, 1300 (1990). 191. E. S. Monuki, R. Kuhn and G. Lemke, Mech. Deu. 42,15 (1993). 192. D. E. Weinstein, P. G. Burrola and G. Lemke, MoZ. Cell. Nmrosci. 6,212 (1995). 193. J. R. J. Bermingham, S. O’Connell, E. Arroyo, F. Powell, K. Kalla, R. McEvilly, S. Scherer and M. G. Rosenfeld, Soc. Neurosci. Abstr 21, 5 (1995). 194. S . Schneider-Maunoury,P. Topilko, T. Seitandou,G. Levi, M. Cohen-Tannoudji,S. Pournin, C. Babinet and P. Charnay, Cell 75, 1199 (1993). 195. P. Topilko, S . Schneider-Maunoury,G. Levi, A. Baron-Van Evercooren, A. 3.Chennoufi, T Seitanidou, C. Babinet and P. Chamay, Nature (London)3 7 1 796 (1994). 196. G. Halder, P. Callaerts and W. J. Gehring, Science 267,1788 (1995).
256
VERDON TAYLOR AND UELI SUTER
197. C. Kioussi, M. K. Gross and P. Gruss, Neuron 15,553 (1995). 198. D. J. Epstein, M. Verkermans and P. Gros, Cell 67,767 (1991). 199. W. W. Lamph, P. Wamsley, P. Sassone-Corsi and I. M. Verma, Nature (London) 334,629 (1988). 200. M. E. Shy, Y. Shi, L. Wrabetz, J. Kamholz and S. S. Scherer,]. Neurosci. Res. 43 (1996). 201. J. W. Fawcett and R. J. Keynes, Annu. Rev. Neurosci. 13,43 (1990). 202. M. E. Schwab, J. P. Kampfhammer and C. E. Bandtlow, Annu. Reu. Neurosci. 16, 565 (1993). 203. B. S . Bregman, E. Kunkel-Bagden, L. Schnell, H. N. Dai, D. Gao and M. E . Schwab, Nature (London)378,498 (1995). 204. L. McKerracher, S . David, D. L. Jackson, V. Kottis, R. J. Dunn and P. E. Braun, Neuron 13, 805 (1994). 205. G. Mukhopadhyay,P. Doherty, F. S. Walsh, P. R. Crocker and M. T. FiIbin, Neuron 13,757 (1994). 206. U . Bartsch, C. E. Bandtlow, L. Schnell, S. Bartsch, A. A. Spillmann, B. P. Rubin, R. Hillenbrand, D. Montag, M. E. Schwab and M. Schachner, Neuron 15,1375 (1995). 207. R. Martini, M. Schachner and A. Faissner,J. Neurocytol. 19,601 (1990). 208. S. David, P. E. Braun, D. L. Jackson, V. Kottis and L. McKerracher, J. Neurosci. Res. 42, 594 (1995). 209. N . Y. Ip and G . D. Yancopoulos, Annu. Reu. Neurosci. 19,491 (1996). 210. E. D. Rabinovsky, G . M. Smith, D. P. Browder, H. D. Shine and J. L. McManaman,J. Neurosci. Res. 31,188 (1992). 211. L. M. Bolin, A. N. Verity, J. E. Silver, E. M. Shooter and J. S. Abrams, J. Neurochem. 64, 850 (1995). 212. G. Stoll, S. Jung, S. Jander, P. H. Van der Meide and H.-P.Hartmung,J. Neuroirnmunol. 45, 175 (1993). 213. D. Lindholm, R. Heumann, M. Meyer and H. Thoenen, Nature (London)330,658 (1987). 214. M. C. Brown, V. H. Perry, E. R. LUM, S. Gordon and R. Heumann, Neuron 6,359 (1991). 215. S. Jander, J. Pohl, C. Gillen and G. S t o t J. Neurosci. Res. 43,254 (1996). 216. S. Dugandzija-Novakovic,A. G . Koszowski, S. R. Levinson and P. Shrager,J. Neurosci. 15, 492 (1995). 217. H. J. Stewart, R. Curtis, K. R. Jessen and R. 
Mirsky, Eur. 1.Neurosci. 7,1761 (1995). 218. R. Curtis, €1.J. Stewart, S. M. Hall, G. P. WilkiT1, R. Mirsky and K. R. Jessen,JCB 116,1455 (1992). 219. L. C. Plantinga, L. H. Schrama, B. J. Eggen, W. H. Gispen, J. Verhaagen and G . Lemke, Neuroreport 5,2465 (1994). 220. C. J. Woolf, M. L. Reynolds, M. S. Chong, P. Emson, N. Irwin and L. I. Benowitz,]. Neurosci. 12,3999 (1992). 221. K. R. Brunden,J. Neurochm. 58,1659 (1992).
Regulation of Eukaryotic Messenger RNA Turnover¹

LAKSHMAN E. RAJAGOPALAN AND JAMES S. MALTER

Department of Pathology and Laboratory Medicine
University of Wisconsin-Madison Hospitals and Clinics
Madison, Wisconsin 53792

I. Measurement of mRNA Decay Rates
   A. Pulse-Chase
   B. Transcriptional Blockade
   C. Serum-responsive fos Promoter
   D. In Vitro Systems
   E. mRNA Transfections
   F. Particle-mediated mRNA Transfections
II. Cis Elements
   A. Adenosine-Uridine-rich Elements
   B. Approaches to Identifying New Cis Elements
   C. The 29-Base Element of Amyloid Precursor Protein mRNA
   D. The Iron Response Element
   E. Stability Elements of Globin mRNAs
III. ...
   A. Approaches to Identifying Trans Factors
   B. The Adenosine-Uridine Binding Factor
   C. Function and Regulation of Adenosine-Uridine Binding Factor
   D. hnRNP C and Nucleolin
... and Intact Animals: Application to ...
References
¹ Abbreviations: GAPDH, glyceraldehyde phosphate dehydrogenase; MLA, gibbon leukemia cell line; GM-CSF, granulocyte-macrophage colony-stimulating factor; ARE, adenosine-uridine-rich element; DRB, 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole; AUBF, adenosine-uridine binding factor; PMGT, particle-mediated gene transfer; APP, amyloid protein precursor; PBMC, peripheral blood mononuclear cells.

Gene expression can be modulated at multiple control points. Until recently, attention has been focused largely on understanding transcriptional regulation. This is both natural and appropriate, because without transcription there can be no gene expression. However, for a growing list of genes
and gene families, transcriptional activation or inhibition fails to account for the observed changes in gene expression, especially as environmental or intracellular conditions fluctuate. Under such conditions, posttranscriptional events such as mRNA turnover and translation or posttranslational protein processing must be involved. In mammalian cells, mRNAs display substantial differences in their susceptibility to degradation by ribonucleases. Their half-life values (t1/2) can vary from about 15 min for the very labile protooncogene c-fos mRNA (1-4) to 17 hr for β-globin mRNA (1). Also, these rates can be modulated by extracellular stimuli or intracellular conditions. For example, in resting cells, mRNAs encoding granulocyte-macrophage colony-stimulating factor (GM-CSF), interleukin 2 (IL-2), tumor necrosis factor-α (TNF-α), interferon-β (IFN-β), c-myc, and c-fos are all very unstable, with half-lives of 15 to 40 min (4-9). Such instability effectively limits the synthesis of intracellular protein by preventing the accumulation of significant levels of cytoplasmic mRNA. However, in activated cells, these same mRNAs are relatively stable, decaying in some cases with half-lives of greater than 4 hr (5, 8, 9). In mitogen-stimulated T lymphocytes or fibroblasts, GM-CSF mRNAs can accumulate several hundredfold, accounting for the characteristic burst of cytokine secretion following activation. Similarly, changes in mRNA decay rates have been observed during development, senescence, and differentiation (10-12). These data demonstrate that mRNA decay is an actively regulated process, extensively utilized by mammals throughout their life cycle, and one that critically determines the expression of individual genes or gene families.
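The link between half-life and mRNA accumulation can be illustrated with a simple first-order model. This is a sketch under stated assumptions rather than a result from the studies cited: the symbols $M$ (cytoplasmic mRNA level), $k_s$ (a constant synthesis rate), and $k_d$ (a first-order decay constant) are introduced here for illustration, and the 30-min and 4-hr half-lives are round values chosen from within the ranges quoted above.

```latex
\begin{align}
\frac{dM}{dt} &= k_s - k_d M, \qquad k_d = \frac{\ln 2}{t_{1/2}} \\
M_{\mathrm{ss}} &= \frac{k_s}{k_d} = \frac{k_s \, t_{1/2}}{\ln 2} \\
\frac{M_{\mathrm{ss}}^{\mathrm{activated}}}{M_{\mathrm{ss}}^{\mathrm{resting}}}
  &= \frac{t_{1/2}^{\mathrm{activated}}}{t_{1/2}^{\mathrm{resting}}}
   = \frac{240\ \mathrm{min}}{30\ \mathrm{min}} = 8
\end{align}
```

In this simplified model, with transcription held constant, lengthening the half-life from 30 min to 4 hr raises the steady-state mRNA level eightfold.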
There are critical issues that remain largely unknown: (1) how specific mRNAs are recognized and targeted for either rapid decay or stabilization; (2) what trans-acting protein factors are responsible for rapid decay or stability under different physiologic conditions; (3) where these processes occur; and (4) what signal transduction pathways connect mRNA decay to cell surface events. A focus of our laboratory has been to identify and evaluate the functions of specific mRNA regions (cis sequences) that regulate mRNA turnover as well as to characterize stabilizing or destabilizing proteins (trans factors) that specifically recognize and bind these sequences. We have also devoted considerable time to developing experimental systems, both in vitro and in vivo, that allow us to analyze mRNA decay pathways and their regulation.
I. Measurement of mRNA Decay Rates
A. Pulse-Chase

For many years, it has been appreciated that mRNAs coding for distinct genes can have widely disparate decay rates. As discussed briefly above,
mammalian mRNAs display half-lives from 15 min (c-fos) (1-4) to >17 hr (globin) (1). The initial evaluation of decay rates was performed for globin mRNAs in reticulocytes by the classical pulse-chase methodology (13, 14). Radiolabeled nucleotides were added to the culture and allowed to be incorporated into elongating transcripts. After a brief incubation, unlabeled nucleotides were added in huge excess to "chase" the radiolabeled nucleotide. This effectively produced a pulse of radiolabeled globin message whose decay could be followed thereafter by autoradiography. Although this methodology is effective for measuring globin mRNA decay rates, its widespread application is limited. First, it was assumed, without direct measurement, that the intracellular pools of radiolabeled nucleotide could be instantaneously diluted and hence "chased" by unlabeled nucleotides. This is unlikely, and incomplete dilution could account for continued (albeit at lower rates) production of radiolabeled mRNAs. Second, due to the low specific activity of labeling, mRNAs less abundant than globin were difficult or impossible to detect. Thus, pulse-chase methodology is inadequate to evaluate mRNAs of moderate or lower abundance.
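To make the spread of half-lives quoted above concrete, the following minimal sketch computes the fraction of a labeled mRNA pulse that remains after a given chase time. It assumes simple first-order (exponential) decay, which is an assumption of the sketch and not a claim from the studies cited; the function name `fraction_remaining` is hypothetical, and the 17-hr figure is used as a lower bound for globin.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of an mRNA pulse still present after elapsed_min minutes,
    assuming first-order decay with the given half-life (in minutes)."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


# Half-lives taken from the text: about 15 min for c-fos, more than 17 hr for globin.
C_FOS_HALF_LIFE_MIN = 15.0
GLOBIN_HALF_LIFE_MIN = 17 * 60.0

for name, t_half in (("c-fos", C_FOS_HALF_LIFE_MIN), ("globin", GLOBIN_HALF_LIFE_MIN)):
    left = fraction_remaining(60.0, t_half)
    print(f"{name}: {left:.1%} of the pulse remains after a 60-min chase")
```

Under this assumption, a 1-hr chase spans four c-fos half-lives but only a small fraction of one globin half-life, which is why the two messages behave so differently in a chase experiment.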
B. Transcriptional Blockade

Alternative methods have been developed to measure the decay rates of less abundant mRNAs in a variety of cell types. The most commonly employed current approach is to block transcription and quantitate residual amounts of the mRNA of interest over time, by Northern or slot-blotting techniques. Such an analysis permits calculation of the mRNA half-life. The decay of mRNAs transcribed from endogenous (cellular) or exogenous (transfected) genes can be evaluated without difficulty and often simultaneously. Transcriptional blockade can be accomplished with RNA polymerase II inhibitors such as actinomycin D (Act D) (15), α-amanitin (16), or the adenosine analog 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB) (17). However, recent data have demonstrated that actinomycin D and possibly DRB can have direct effects on mRNA stability. c-fos (4, 18), erythropoietin (19), transferrin receptor (20), and GM-CSF (18) mRNAs have all been reported to be stabilized in cells treated with actinomycin D. These data demonstrate that the very drugs used for measurement of mRNA decay may alter that process. At this time, the mechanisms underlying these effects are largely unknown, but the observations suggest that mRNA decay measurements made in the presence of such drugs may not be valid.
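The half-life calculation mentioned above can be sketched as a log-linear regression of residual signal against time after blockade. This is an illustration only: it assumes first-order decay, the function name `estimate_half_life` is hypothetical, and the time points and band intensities are invented values, not data from any study cited here.

```python
import math


def estimate_half_life(times_min, signal):
    """Estimate an mRNA half-life from a decay time course.

    Fits ln(signal) = ln(s0) - k * t by ordinary least squares and
    returns (half_life_min, k). Assumes first-order decay.
    """
    if len(times_min) != len(signal) or len(times_min) < 2:
        raise ValueError("need at least two paired time points")
    if any(s <= 0 for s in signal):
        raise ValueError("signal values must be positive to take logarithms")

    logs = [math.log(s) for s in signal]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))

    k = -sxy / sxx  # decay constant is the negative of the fitted slope
    if k <= 0:
        raise ValueError("signal does not decrease over time")
    return math.log(2) / k, k


# Hypothetical blot quantitation (arbitrary units) at minutes after blockade.
times = [0, 15, 30, 45, 60]
signal = [100.0, 70.7, 50.0, 35.4, 25.0]

half_life, decay_constant = estimate_half_life(times, signal)
print(f"estimated half-life: {half_life:.1f} min")
```

Fitting on the logarithm of the signal uses every time point rather than a single pair, but it inherits the caveat raised in the text: if the inhibitor itself stabilizes the message, the fitted half-life describes the drug-treated cell, not the untreated one.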
C. Serum-responsive fos Promoter

Perhaps the most popular approach to avoid transcriptional poisons has been the use of the serum-responsive fos promoter (1, 4). A cDNA of interest is cloned downstream of the c-fos promoter and transiently or permanently transfected into NIH 3T3 cells. After approximately 24 hr of serum starvation, the cells are challenged with serum, which induces a brief pulse (approximately 15-30 min) of transcription from the c-fos promoter (21, 22). The promoter rapidly down-regulates, effectively producing a pulse of mRNA whose decay can be monitored by Northern blotting or RNase protection. Transcription during serum starvation is minimal, although some variability in the time required for promoter shutdown after serum treatment is commonly observed. Other potential problems include the requirement for serum starvation and serum replacement, which in itself could alter mRNA decay rates. For the study of c-fos mRNA, for which serum treatment normally induces transcriptional up-regulation, such issues are largely immaterial. Following the same rationale, many additional regulatable promoters have been examined for their utility in measuring mRNA decay rates. These include metal-responsive promoters (metallothionein) (23) as well as those activated by antibiotics (tetracycline) (24, 25). Neither of these has gained widespread acceptance, although tetracycline-responsive promoters can be activated by nontoxic levels of antibiotic and may see broader future use.
D. In Vitro Systems

As an alternative to working with intact cells, a number of in vitro systems have been developed (26, 27). Postmitochondrial supernatants (28), whole or partial cell extracts (29-32), and reticulocyte lysates (33-35) have been employed to examine the decay of endogenous or exogenous mRNAs. However, the fraction of an mRNA that is polysome associated in such systems is unknown. This places a limitation on their reliability, because there is growing evidence that, at least for some classes of mRNAs, decay is coupled to ongoing translation (7, 36-42). This provided a rationale for the development of polysome-based in vitro decay systems (43-45), which, a priori, might be a better approximation of intracellular events. The decay properties of polysome-associated mRNAs as well as the function of putative stabilizing or destabilizing protein factors could be examined. Using a polysome-based system, we were able to show that more than 90% of the total cellular activity of the adenosine-uridine binding factor (AUBF; see Section III,C) was polysome associated (46). AUBF is a cytosolic protein that is activated posttranslationally to bind specifically to the AUUUA repeats found in the 3' untranslated regions (UTRs) of several cytokine, growth factor, and protooncogene mRNAs (47, 48). When we physically removed all polysome-associated AUBF activity with biotinylated competitor RNA coupled to streptavidin-linked magnetic beads, we were able to accelerate the decay of polysome-associated GM-CSF mRNA fivefold (46). Thus the polysome system was a very useful tool to examine mRNA decay. However, it is a static system, with no capacity for reiterative translation, that could be missing essential trans factors dissociated during isolation.
E. mRNA Transfections

The direct introduction into cells of a bolus of exogenous mRNAs would permit subsequent measurement of their decay in vivo, without the need for transcriptional blockers. This has been practical for many decades by microinjection (49). However, practical considerations, including the number of cells that can be microinjected in a time period short enough for measurement of their decay, have prevented application of this method to mammalian cells. mRNA transfections by electroporation (50) or lipofection (51) have been reported. Neither technique is particularly efficient; both require extensive cell recuperation and deliver mRNAs into unknown or potentially nonphysiologic intracellular compartments.
F. Particle-mediated mRNA Transfections

We have recently evaluated particle-mediated gene transfer (PMGT) as an alternative methodology to introduce mRNAs into normal resting cells as well as transformed cell lines. PMGT employs microcarrier gold beads coated with the mRNA of choice; these beads are accelerated at high speed into mammalian cells (52-54). On contact with the aqueous cytoplasm, the mRNAs are immediately released from the gold beads and become translationally active as well as susceptible to cytoplasmic degradation. Approximately 5 to 10% of the cells are productively transfected, with approximately 20% cell death (53). Full-length cDNAs were subcloned into a transcription vector downstream from a T7 RNA polymerase start site and upstream from a 90-base poly(dT) tract. Plasmids linearized at the terminus of the poly(dT) tract were transcribed in vitro in the presence of a cap analog [m7G(5')ppp(5')G], and polyadenylylated mRNAs were selected using oligo(dT)-cellulose prior to precipitation onto gold beads for transfection. Wild-type and mutant GM-CSF mRNAs were produced and delivered into normal, resting lymphocytes or tumor cell lines. If transfected mRNA accurately mimicked the behavior of endogenous mRNA, we expected rapid mobilization onto polyribosomes and the synthesis of immunologically detectable GM-CSF protein, coupled with the rapid decay of the mRNA. The transfected mRNA should also be responsive to known modulators of mRNA decay pathways, including phorbol ester (5, 55-57), cycloheximide (36-42), and possibly actinomycin D (4, 7, 18-20).
1. TURNOVER AND TRANSLATION OF GM-CSF mRNAs IN NORMAL CELLS
Three versions of human GM-CSF mRNA were compared. The wild-type mRNA (hGM-AUUUA) contained five tandem AUUUA repeats in its 3' untranslated region (58-59). The presence of these repeats in the 3' UTR of cytokine and protooncogene mRNAs targets them for rapid decay in resting cells (60-62). A mutated version with four tandem AUGUA repeats (hGM-AUGUA) and a truncated version with the entire 3' UTR deleted (hGM-d3'UTR) were also constructed. Following transfection of normal Ficoll-Hypaque purified T lymphocytes, cells were immediately washed in culture media to remove any mRNA adhering to the cell surface, and at various times thereafter the amount of intracellular, transfected mRNA was analyzed by Northern blotting. hGM-AUUUA mRNA decayed extremely rapidly, with a half-life of about 9 min (Fig. 1A). This rate was approximately four- to fivefold faster than that observed in actinomycin-D- or DRB-treated fibroblasts or lymphocytes (5, 52). hGM-AUGUA mRNA was significantly more stable (t½ = 30 min; Fig. 1A), with transfected cells secreting 20-fold more transgenic protein into the culture medium than cells receiving wild-type GM-CSF mRNAs (Fig. 2A). Transgenic protein was detectable within 15 min of transfection, consistent with a rapid mobilization of transfected mRNAs onto polysomes. GM-CSF mRNA without a 3' UTR (hGM-d3'UTR) was even more stable (t½ = 80 min; Fig. 1A), but was less efficiently translated (Fig. 2A), consistent with a role for the 3' UTR in translational control (63). Although these mRNAs decayed more rapidly than in the presence of transcriptional blockers, the relative rates of decay were preserved. Transfected globin mRNA was quite stable, decaying with a calculated half-life of about 6 hr (Fig. 1A). The enhanced stability of hGM-d3'UTR compared with hGM-AUGUA implies the existence of a second destabilizing element distinct from the AUUUA motifs. The location of this element remains unknown, but is under active investigation.

[Figure 1. Panel A: hGM-AUUUA, hGM-AUGUA, hGM-d3'UTR, and β-globin mRNAs plotted against time (0-60 minutes). Panel B: hGM-AUUUA(-cap) and hGM-AUGUA(-cap) mRNAs plotted against time (0-90 minutes).]
FIG. 1. (A) Decay kinetics of transfected GM-CSF and β-globin mRNAs in normal PBMCs. Resting PBMCs were transfected with capped and polyadenylylated (A90) hGM-AUUUA, hGM-AUGUA, hGM-d3'UTR, or β-globin mRNAs via particle-mediated gene transfer. Immediately after transfection, cells were washed twice with culture media to remove any mRNA on the cell surface. Transfected cells were then placed in culture; at the indicated time points, equal numbers of cells were harvested and total RNA was quantitatively isolated and Northern blotted. Northern blots were sequentially probed with 32P-labeled cDNA probes for either GM-CSF or β-globin cDNA and 18S ribosomal RNA. Radioactive signals were quantified using a phosphorimager, and GM-CSF or β-globin signals were normalized to 18S ribosomal RNA signals and plotted versus time. (B) Uncapped, transfected mRNAs decay with altered kinetics. Polyadenylylated RNAs were produced in vitro, without 5' caps, selected on oligo(dT)-cellulose columns, loaded onto gold beads, and delivered into resting PBMCs via particle-mediated gene transfer. Decay rates of transfected RNAs were analyzed as indicated in A.

2. INFLUENCE OF PHORBOL ESTER ON mRNA DECAY
The decay of transgenic mRNAs was also examined in the presence of drugs that influence mRNA decay, such as phorbol ester (TPA), cycloheximide, and actinomycin D. Phorbol ester has been shown to stabilize a variety of cytokine and other mRNAs, presumably through activation of protein kinase C signal transduction cascades (64-66). Treatment of transfected cells with phorbol ester (20 ng/ml) for 1 hr resulted in a significant stabilization of hGM-AUUUA and hGM-AUGUA mRNAs to t½ > 120 min. The equivalent stabilization of mutant and wild-type GM-CSF mRNAs suggests that the phorbol ester response element (56, 67-69), which is yet to be characterized, resides outside the AUUUA motifs.

[Figure 2. Panel A: hGM-AUUUA, hGM-AUGUA, and hGM-d3'UTR plotted against time (0-180 minutes). Panel B: GM-AUGUA (Cell), GM-d3'UTR (Cell), and GM-AUGUA (Med) plotted against time (0.5-4 hours).]
FIG. 2. Transgenic protein production in transfected PBMCs and K562 cell lines. (A) Culture media of transfected PBMCs and (B) culture media and cell pellets from transfected K562 erythroleukemia cell lines were analyzed for transgenic GM-CSF protein at the indicated time points after transfection of cells with either hGM-AUUUA, hGM-AUGUA, or hGM-d3'UTR mRNAs. Measurements were made using a human GM-CSF-specific enzyme-linked immunosorbent assay.
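The half-lives quoted for the transfected mRNAs can be estimated from a normalized Northern time course by assuming simple first-order decay. The sketch below is illustrative only and is not the analysis procedure used in the cited work: the function names, the 18S loading values, and the data points are hypothetical, and the signal is synthesized from an assumed 9-min half-life so that the fit can be checked.

```python
import math

def normalize(target_signals, loading_signals):
    """Divide each mRNA signal by its loading control (e.g., 18S rRNA)."""
    return [t / l for t, l in zip(target_signals, loading_signals)]

def fit_half_life(times_min, signals):
    """Least-squares fit of ln(signal) = ln(S0) - k*t; returns ln(2)/k.

    Assumes single-phase, first-order decay and strictly positive signals.
    """
    logs = [math.log(s) for s in signals]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx          # equals -k for decaying signals
    return math.log(2) / -slope

# Hypothetical time course: raw signal scales with lane loading and decays
# with an assumed half-life of 9 min (values invented for illustration).
times = [0, 15, 30, 45, 60]
rrna_18s = [1.0, 0.9, 1.1, 1.0, 0.95]
raw_gm = [100 * load * 0.5 ** (t / 9) for t, load in zip(times, rrna_18s)]

half_life = fit_half_life(times, normalize(raw_gm, rrna_18s))
print(round(half_life, 1))
```

A log-linear fit of this kind is only appropriate for monophasic decay; biphasic curves would need to be fitted phase by phase.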
3. INFLUENCE OF CYCLOHEXIMIDE ON mRNA DECAY
The protein synthesis inhibitor cycloheximide has repeatedly been shown to stabilize a variety of mRNAs, including those with AUUUA motifs (60). The mechanism for this effect remains elusive and has been ascribed to the inhibition of synthesis of a labile protein necessary for rapid mRNA decay (39) or to the need for continuous ribosome movement (38, 41). We used this well-known effect of cycloheximide to assess the validity of our mRNA transfection system. Transfected lymphocytes were treated with cycloheximide (15 µg/ml) prior to measuring decay rates. Based on Northern blot analysis, both wild-type hGM-AUUUA and mutant hGM-AUGUA mRNAs were stabilized to t½ > 90 min. Under these conditions, no intracellular or extracellular GM-CSF protein was detectable over a 6-hr period. These data further demonstrate that protein synthesis must be functional for labile mRNAs, such as GM-CSF, to decay.
4. INFLUENCE OF ACTINOMYCIN D ON mRNA DECAY
The effects of actinomycin D on the mRNA decay machinery are only now becoming appreciated. mRNAs encoding c-Fos (4, 18), erythropoietin (19), and transferrin receptor (20) appear to be directly stabilized by actinomycin D. As such, the true half-lives of these messages must be measured without transcriptional blockade. As most mRNA decay rates have been determined in the presence of actinomycin D, the mere suggestion that this compound may directly impact this catabolic pathway is of great concern. Using particle-mediated gene transfer (PMGT), we introduced GM-CSF mRNAs (hGM-AUUUA and hGM-AUGUA) into resting lymphocytes prior to their treatment with concentrations of actinomycin D typically used to block transcription (5 µg/ml). Decay rates were assessed 15 min after the addition of actinomycin D, allowing estimation of the kinetics of any detectable effects. Like cycloheximide and phorbol ester, actinomycin D had a profound and nearly instantaneous stabilizing effect on both GM-CSF mRNAs. Decay half-lives for both were approximately 90 min, or 3- to 10-fold greater than that in the absence of the drug. Interestingly, the synthesis and secretion of GM-CSF protein into the extracellular medium was delayed by about 2 hr in actinomycin-D-treated cells compared to untreated cells, although the total amounts of secreted cytokine closely approximated the control condition after 10 hr. These data demonstrate that actinomycin D has direct effects on
the mRNA catabolic pathways and appears to influence protein synthesis as well (70). As polyribosomes appear to be the site of labile mRNA decay, these effects may be interrelated.
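The 3- to 10-fold range follows directly from the half-lives reported for untreated cells (about 9 min for hGM-AUUUA and 30 min for hGM-AUGUA) and the approximately 90-min half-life seen with actinomycin D. The short sketch below works through that arithmetic; the helper names are hypothetical and first-order decay is assumed for the fraction-remaining estimate.

```python
def fold_stabilization(t_half_treated, t_half_untreated):
    """Ratio of half-lives with and without drug."""
    return t_half_treated / t_half_untreated

def fraction_remaining(elapsed_min, t_half_min):
    """Fraction of mRNA left after elapsed_min, assuming first-order decay."""
    return 0.5 ** (elapsed_min / t_half_min)

# Half-lives (min) taken from the text for untreated resting lymphocytes.
untreated = {"hGM-AUUUA": 9.0, "hGM-AUGUA": 30.0}
treated_half_life = 90.0  # approximate value with actinomycin D

folds = {name: fold_stabilization(treated_half_life, t)
         for name, t in untreated.items()}
print(folds)

# Message remaining 60 min after transfection, without and with drug.
for name, t_half in untreated.items():
    print(name,
          round(fraction_remaining(60, t_half), 3),
          round(fraction_remaining(60, treated_half_life), 3))
```

On these numbers the wild-type message accounts for the upper end of the range and the AUGUA mutant for the lower end.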
5. DECAY OF UNCAPPED mRNAs
In corollary experiments, we introduced uncapped GM-CSF mRNAs into resting lymphocytes. These RNAs would not be expected to be mobilized onto polyribosomes or translated. Northern analysis showed a biphasic pattern of decay. Over the initial 30 min, no visible decay was observed for either wild-type or mutant GM-CSF message (Fig. 1B). Over the next 30 min, however, rapid decay occurred with an almost complete loss of message. Although these experiments are not as elegant as those in which the start codon has been mutated (42), they demonstrate that polysome mobilization must occur for normal decay. In this particular case, clearance of cap-negative mRNAs appears to occur outside polysomes, because these mRNAs failed to produce detectable protein.

6. TURNOVER AND TRANSLATION OF GM-CSF mRNAs IN TRANSFORMED CELL LINES
In addition to normal, resting lymphocytes, particle-mediated gene transfer can productively transfect tumor cell lines as well as intact tissues (52-54). hGM-AUUUA, hGM-AUGUA, and hGM-d3'UTR mRNAs all decayed with half-lives in excess of 90 min after transfection into cycling K562 erythroleukemia cell lines. In the few available studies (71, 72), freshly expanded human tumors have often shown dramatic stabilization of many cytokine mRNAs. These data suggest that mRNA stabilization is a critical derangement of tumors, permitting excessive secretion of biologically potent cytokines. The mechanism for this effect is unclear, but may reflect abundant, constitutive activity of the AUUUA-mRNA-stabilizing, adenosine-uridine binding factor (AUBF) (see Section III,A). As we saw in resting lymphocytes, hGM-d3'UTR mRNA is less efficiently translated in K562 cells, despite its comparable stability with hGM-AUUUA and hGM-AUGUA mRNAs (Fig. 2B). Our data would therefore again imply either the existence of a cis-acting translational element in the 3' UTR of GM-CSF mRNA or that the length of the 3' UTR in itself influences translation. The latter has recently received experimental support (63).
In summary, the direct transfection of mRNA avoids metabolic derangement by nonspecific drugs and permits real-time assessment of mRNA decay and translation rates. mRNA decay occurs in the physiologic context of protein synthesis, such that both processes can be assessed simultaneously. The effect of exogenous agents can be measured and mechanistic hypotheses developed to examine where these
drugs exert their effects. Through such a process, signal transduction cascades that influence mRNA decay can eventually be dissected.
II. Cis Elements
As approximately 20,000-25,000 mRNAs exist in any given cell at any time, mechanisms must exist by which the cellular machinery can discriminate one mRNA from another. Selective subcellular localization, rapid decay, mobilization onto polyribosomes for protein synthesis, assembly into cytoplasmic ribonucleoprotein particles, or other specific events must be driven by information encoded by the mRNA. Such information could be packaged as a primary sequence, a secondary or higher order structure, or a combination of both. The identification of such elements can be facilitated by aligning primary mRNA sequences from divergent species. In the case of cytokine mRNAs, the coding regions typically show substantial homology, on the order of 60-80% (at the nucleotide level). However, the 3' untranslated regions can show up to 90% homology. The extreme conservation of such sequences that lie outside of the coding region must signify the maintenance of an important functional capability. If so, loss of these domains through mutation or experimental design would be expected to change the metabolism of these mRNAs. Such changes could be altered stability, localization (73-75), or translatability (63, 76, 77). This is not to imply that all mRNA cis determinants reside in untranslated domains. Recent data with c-fos (1, 4, 78-80) and c-myc (81, 82) mRNAs demonstrate the existence of destabilizing coding region elements that appear to function independently of the 3' UTR AUUUA repeats. Thus, these particular mRNAs, presumably due to the significance of their protein products, have higher orders of regulatory complexity, perhaps to ensure normal control despite a single genetic mutation.
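As a rough illustration of how conserved blocks might be located in aligned untranslated regions from two species, the sketch below scores percent identity in a sliding window. It assumes the sequences have already been aligned to equal length with "-" marking gaps; the function names, window size, threshold, and toy sequences are hypothetical and are not taken from any real mRNA.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical residues over aligned, non-gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

def conserved_windows(seq_a, seq_b, window=10, threshold=90.0):
    """Return (start, identity) for each window at or above the threshold."""
    hits = []
    for start in range(len(seq_a) - window + 1):
        identity = percent_identity(seq_a[start:start + window],
                                    seq_b[start:start + window])
        if identity >= threshold:
            hits.append((start, identity))
    return hits

# Toy aligned 3' UTR fragments (invented for illustration only).
human = "GCAUCGUACGAUUUAUUUAUUUAGC"
mouse = "GAAUCCUAGGAUUUAUUUAUUUAGC"

overall = percent_identity(human, mouse)
blocks = conserved_windows(human, mouse, window=10, threshold=100.0)
print(overall, [start for start, _ in blocks])
```

Windows that score well above the overall identity are candidates for functional testing; conservation alone does not establish function.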
A. Adenosine-Uridine-rich Elements
The identification of elements by homology search still requires experimental demonstration of their functionality. In 1986, Shaw and Kamen (60) as well as Caput et al. (61) identified a common, conserved nucleotide sequence consisting of repeated AUUUA pentamers in the 3' untranslated regions of mRNAs encoding inflammatory mediators. They demonstrated that the AUUUA motifs present in GM-CSF and TNF-α mRNAs targeted them for rapid decay. When fused to globin mRNA, the AU-rich domain greatly accelerated the decay of this previously stable transcript (t½ reduced from >17 hr to 45 min) (60). Rapid decay required ongoing protein synthesis because cycloheximide blocked degradation of mRNAs with AUUUA repeats (60).
These results demonstrated that fragments of the 3' untranslated region of cytokine mRNAs that contained multiple reiterations of the AUUUA motif are mRNA destabilizers, and also established a potential link between mRNA turnover and translation. Further sequence analysis showed that these domains are present in a large number of cytokine, protooncogene, and growth factor mRNAs (61). They were found exclusively in the 3' UTR, but without an obvious spatial relationship to either the stop codon or the polyadenylylation signals (61). Occasional single reiterations of the AUUUA motif were identified in stable mRNAs, such as globin. This suggested that the AUUUA sequence might not be the true destabilizing entity, but rather a higher order or larger structure that included it. Recent work (83, 84) has conclusively shown that the true destabilizing motif is the nonamer UUAUUUA(U/A)(U/A), which in cis can destabilize a chimeric message. Interestingly, additional reiterations of this sequence are more destabilizing, suggesting a dosage effect. Reiterations of AUUUA (AUUUAAUUUA) are not destabilizing (84), suggesting an inhibitory effect of multiple purines within the context of the U-rich domain. The higher order structure assumed by AU-rich elements (AREs) remains unknown. Computer-assisted folding has failed to demonstrate a stable structure (such as a stem-loop), and ribonuclease mapping of such domains is yet to be reported. However, there is no a priori requirement for the ARE to assume a stem-loop or other structural configuration. Additional work has furthered our understanding of how the ARE functions.
The ARE of GM-CSF fused to the β-globin coding region was nonfunctional when (1) initiation of protein synthesis was inhibited by mutating the start codon, (2) all the AUUUA motifs in the ARE were disrupted by G or C substitutions, or (3) ribosomes were allowed to transit across the ARE (42). These stable, chimeric mRNAs remained associated with cytoplasmic ribonucleoproteins rather than with polyribosomes. mRNAs with an intact ARE and functional start codon were associated with a large (>20S) translation-dependent destabilizing complex. Others (38) have similarly shown that ARE-containing mRNAs are unstable only if translated. Inhibition of ribosome translocation as a result of a stable stem-loop structure in the 5' UTR of a chimeric mRNA with a 3' GM-CSF ARE resulted in the loss of destabilizing function of the ARE (38). Surprisingly, a stable stem-loop anywhere in the 3' UTR upstream of the ARE also prevented rapid decay (41). These data suggest that ARE-mediated decay might involve the movement of ribosome-associated, translation-dependent decay factors into the 3' UTR. However, using β-globin-c-fos ARE constructs, Chen et al. (18) showed rapid chimeric mRNA decay in the absence of cotranslation. Finally, when an iron response element (IRE) was used to regulate translation of a chimeric
mRNA containing the c-fos ARE, the mRNA decayed rapidly in the absence of translation (39). Whether these discrepancies reflect true differences between the AREs of c-fos and GM-CSF or system-dependent artifacts remains unresolved. Although the use of such chimeric mRNAs can provide useful information regarding stability elements, the remaining 5' UTR, 3' UTR, and coding region sequences may also be involved. For example, long-range interactions between the 3' UTR and the coding region or even the 5' UTR are possible and should not be ignored. The mRNA stability determinant of insulin-like growth factor II (85, 86) is created by long-range interactions, where two elements separated by almost 2 kb form a stable stem-loop structure. Similarly, chimeric ARE-containing mRNAs usually cannot be stabilized by phorbol ester (57). These data reinforce the likely interaction between AREs and other regions of the mRNA. As a result of these concerns, we have examined the function of the AREs in the context of the full-length, wild-type mRNA. Thus, we constructed in vivo expression vectors coding for either full-length, wild-type GM-CSF mRNA or a mutant version containing AUGUA repeats in place of the AUUUA repeats. These mRNAs differ by approximately 50 nucleotides, solely at the AREs. Internal substitutions within the UUAUUUAUU motif block function. The introduction of purines, especially guanosine, into the core AUUUA region prevents destabilization (48). The expression vectors were transfected into normal resting peripheral blood mononuclear cells (PBMCs) via particle-mediated gene transfer, and decay rates were assessed after actinomycin D treatment (52). The wild-type mRNA decayed with a half-life of approximately 20 min, whereas the mutant was significantly more stable and showed a half-life of approximately 90 min.
Despite the observed stabilization of mutant GM-CSF mRNA, it remained relatively unstable compared to a host of more stable lymphocyte mRNAs, such as those coding for the amyloid protein precursor (APP) (87) or GAPDH (52). This suggested that GM-CSF may contain additional, heretofore uncharacterized, destabilizing domains. Homology search of the 3' untranslated region shows considerable conservation in areas outside of the AUUUA motifs. It is thus tempting to speculate that GM-CSF, much like c-fos and c-myc mRNAs, may contain ancillary or distinct elements that can lead to accelerated decay in the absence of functional AREs. The identity of this ancillary domain is yet to be critically established.
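The sequence motifs discussed in this section lend themselves to a simple pattern search. The sketch below locates overlapping AUUUA pentamers and the UUAUUUA(U/A)(U/A) nonamer in an RNA string, and converts each AUUUA to AUGUA in the manner of the mutant constructs described above. It is an illustration under stated assumptions: the function names and the toy sequence are hypothetical, and the code says nothing about whether a given match is functional.

```python
import re

# Zero-width lookaheads let overlapping motifs (e.g., AUUUAUUUA) all match.
PENTAMER = re.compile(r"(?=AUUUA)")
NONAMER = re.compile(r"(?=UUAUUUA[UA][UA])")

def motif_positions(pattern, rna):
    """0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in pattern.finditer(rna)]

def mutate_pentamers(rna):
    """Change the central U of every AUUUA to G (AUUUA -> AUGUA)."""
    bases = list(rna)
    for start in motif_positions(PENTAMER, rna):
        bases[start + 2] = "G"
    return "".join(bases)

# Toy 3' UTR fragment with three overlapping pentamers (invented).
toy_utr = "GCAUUUAUUUAUUUAGC"
mutant_utr = mutate_pentamers(toy_utr)

print(motif_positions(PENTAMER, toy_utr))
print(motif_positions(NONAMER, toy_utr))
print(mutant_utr)
```

Editing by position rather than by plain string replacement matters here, because replacing non-overlapping AUUUA occurrences in a tandem array would leave some pentamers intact.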
B. Approaches to Identifying New Cis Elements
The ARE remains the most intensively studied of all mRNA instability determinants, but it is by no means the only one. Literally dozens and perhaps hundreds of mRNAs appear to be regulated at posttranscriptional levels. In
general, this conclusion has been reached by investigators after nuclear runoff experiments failed to account for changes in mRNA levels of the gene of interest. This suggests that there may be dozens of unique cis elements that direct the decay of these mRNAs or families of mRNAs. Just as common promoter elements can coregulate the transcription of gene families, shared mRNA instability or stability determinants would unify cytoplasmic regulation. Furthermore, the inclusion of distinct cis elements would permit cellular metabolism to control individual mRNA levels and hence translatability. The growing body of evidence suggests that mRNA levels are as closely regulated posttranscriptionally in the cytoplasm of cells as they are in the nucleus. The identification of new cis elements can be performed in a variety of ways. As alluded to previously, homology searches between divergent species sharing the same gene can reveal unexpected conservation. Such regions should not be assumed to be mRNA-stabilizing or -destabilizing motifs, but certainly have the potential to be so. There are two complementary approaches to establish the functional role of a homologous region. First, directed mutation of the domain followed by transfection and half-life analysis can establish whether the region is a regulatory one. In complementary studies, radiolabeled RNA containing a putative regulatory sequence is produced in vitro and used for RNA mobility-shift assays. As discussed later in this essay, mRNA regions capable of binding cytoplasmic proteins are often regulatory in nature. Therefore, one can be directed to a potential cis element by mapping the binding sites. There are a number of caveats, however, that must be observed in these types of analysis. First, the type of cell used for the transfection or isolation of cytoplasmic protein must be appropriate for the system under study. For example, a variety of tumor cell lines show dysregulated cytokine mRNA decay (88).
In these lines, GM-CSF as well as IL-2 and a host of other mRNAs are constitutively stable. The removal of a destabilizing motif that is already nonfunctional in this context may show little or no effect on the decay rate of the mRNA. The obvious cell for functional assessment of the mutant mRNA would be a normal, untransformed cell. In the case of GM-CSF this could be a fibroblast, lymphocyte, or endothelial cell, all of which can produce GM-CSF message and protein after appropriate stimulation. Even when a normal cell is used, one must pay attention to its activation state to ensure appropriate context. For example, activation of a lymphocyte with phorbol ester and phytohaemagglutinin (PHA) will dramatically stabilize GM-CSF mRNA (5, 56). If a mutant mRNA lacking the AUUUA motifs is inserted into such a context, it too will be stable. Therefore, the investigator may miss the effect of the mutation under study. Similar issues impact on the selection of cytoplasmic lysates for trans-factor analysis. Presumably, proteins that stabilize cytokine mRNAs are present in activated normal cells or tumor cell lines, but not in resting, nondividing normal cells. Thus, during the purification of such cells, care must be taken not to induce any activation. Methodologies as benign as leukopheresis, which is often used to collect large numbers of peripheral blood mononuclear cells for analysis, may indeed induce partial cell activation. Finally, the particular cell used for study must contain the appropriate machinery to recognize and degrade the message under study. The use of irrelevant cell types may therefore be as deleterious as the use of transformed cell lines.
C. The 29-Base Element of Amyloid Precursor Protein mRNA
We employed the above guidelines to evaluate the possibility that amyloid precursor protein (APP) mRNAs are regulated at the level of mRNA stability. This message codes for APP, from which β-amyloid is proteolytically released (89). In normal cells, β-amyloid processing liberates a nonamyloidogenic fragment that fails to accumulate intracellularly or extracellularly (90). In Alzheimer's disease, however, the proteolytic pathway is altered such that the released β-amyloid has a longer half-life and accumulates in the extracellular matrix. Although it remains controversial, the levels of APP mRNA are often elevated in the brains of Alzheimer's patients (91), again suggesting a transcriptional or posttranscriptional defect in APP gene expression, coexisting with abnormal β-amyloid posttranslational processing. We screened cytosolic lysates, prepared from normal resting cells, activated cells, and transformed cell lines, with radiolabeled 3' UTR of APP mRNA in RNA mobility-shift experiments. UV cross-linking of the radiolabeled APP RNA to the cytosolic protein extracts, followed by sodium dodecyl sulfate/polyacrylamide gel electrophoresis, identified six distinct RNA-protein complexes of molecular sizes 42, 47, 65, 73, 84, and 104 kDa (92). These complexes were not found in normal resting peripheral blood mononuclear cells but were readily induced after the cells were stimulated for 3 hr with TPA and PHA (87). The complexes were also detected in all tumor cell lines that were tested (92). By using unlabeled competitor RNAs from overlapping regions of the 3' UTR, we were able to map the binding site to a 29-base element that was highly conserved in human and mouse samples (26/29 bases) and located at nearly the same distance (about 200 nucleotides) from the stop codon in both species.
We were also able to show that in TPA- and PHA-stimulated PBMCs, concurrent with the appearance of RNA binding activity, the half-life of endogenous APP mRNA increased from 4 to >10 hr (87). We therefore proposed that the binding of these protein factors to the 29-base element blocked normal APP mRNA decay.
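The logic of mapping a binding site with overlapping unlabeled competitors can be sketched as an interval intersection: if every fragment that competes must contain the whole element, the element lies within the region shared by all competing fragments. The code below is a hypothetical illustration of that reasoning only. The fragment coordinates are invented, not those used in the cited experiments, and the function names are assumptions.

```python
def candidate_region(fragments):
    """Region shared by all competing fragments.

    fragments: list of (start, end, competes) with 0-based, half-open
    coordinates along the 3' UTR. Assumes a single binding element that
    must be fully present for a fragment to compete.
    """
    competitors = [(s, e) for s, e, competes in fragments if competes]
    if not competitors:
        raise ValueError("no competing fragment; site cannot be mapped")
    start = max(s for s, _ in competitors)
    end = min(e for _, e in competitors)
    if start >= end:
        raise ValueError("competing fragments share no common region")
    return start, end

def inconsistent_fragments(fragments):
    """Non-competing fragments that nevertheless span the whole candidate."""
    start, end = candidate_region(fragments)
    return [(s, e) for s, e, competes in fragments
            if not competes and s <= start and e >= end]

# Invented competition results for overlapping 3' UTR fragments.
fragments = [
    (0, 150, False),
    (100, 260, True),
    (190, 350, True),
    (200, 229, True),
    (300, 450, False),
]

region = candidate_region(fragments)
print(region, region[1] - region[0], inconsistent_fragments(fragments))
```

A non-competing fragment that spans the entire candidate region would contradict the single-element assumption and would suggest either a structural requirement or more than one site.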
In transformed cell lines in which binding activity was constitutively present, APP mRNA decayed with a half-life of >12 hr. We also transcribed, in vitro, wild-type and mutant APP mRNAs with a 90-base poly(A) tail. The mutant differed from the wild-type mRNA only in the 29-base element. When transfected into resting lymphocytes via particle-mediated gene transfer, wild-type APP mRNA decayed with a half-life of about 1.5 hr. Loss of the 29-base element in the mutant led to a more stable mRNA (t½ = 4 hr), thereby implicating this element as a destabilizer. In aggregate, these data strongly suggest that the 29-base domain of the 3' untranslated region is integral to the regulated stability of APP mRNA. This region is relatively AU rich, but lacks tandem AUUUA boxes. Computer-assisted folding suggests a stem-loop structure, although confirmation by ribonuclease mapping has not yet been performed. Homology searches of the database have not revealed any other mRNAs containing this domain. We are currently in the process of evaluating these two mRNAs in vitro using a rabbit reticulocyte lysate translation system. Preliminary work indicates that the lysate differentiates between these two mRNAs and that the wild-type decays approximately twice as fast as the mutant. We are also engaged in altering the relationship of the element to the stop codon, both by reducing the intervening sequence length and by mutating the stop codon to allow read-through into the 3' untranslated region. We have not yet mutated the start codon to establish the intracellular localization of APP mRNA decay. Presumably, this will occur on a polyribosome during cotranslation. However, in preliminary work, cycloheximide had no effect. Therefore, APP mRNA decay may occur at a nonpolysomal location.
D. The Iron Response Element
As demonstrated for APP mRNA, it is likely that a large number of mRNAs are regulated at the posttranscriptional level through variable mRNA stability. As this field develops, many new cis elements are being characterized and defined. Perhaps best known is the iron response element (IRE), which has been identified in the 5' untranslated region of ferritin mRNA as well as in multiple reiterations in the 3' untranslated region of transferrin receptor mRNA (93, 94). This is an approximately 30-base primary sequence that is highly conserved from oocytes to mammals. Based on computer-assisted folding, the element appears as a stable stem-loop (93, 94). The components of the stem are somewhat variable, but retain functionality so long as a conserved cytosine residue forms a bulge on the 5' end of the stem at a position five nucleotides from the loop. The loop is also highly conserved and is composed of six nucleotides, with a C residue at the 5' end. When this element is present in the 5' untranslated region of ferritin, it controls the translation of this mRNA. Binding of the iron regulatory protein (IRP) to the IRE represses translation, whereas release of this factor stimulates translation 40- to 100-fold. The position of this element with respect to the 5' cap is critical to its function (95, 96). As it is moved toward the start codon, it becomes progressively less active. This suggests that the element, bound to its protein effector, can inhibit the assembly of ribosomal components but not block the movement of a preformed ribosomal subunit. When this element is present in the 3' untranslated region of transferrin receptor mRNA, it regulates the stability of the mRNA (39, 97, 98). The mechanisms that underlie transferrin receptor mRNA degradation are not known. A host of other cis elements have recently been identified, including those regulating the stability of c-jun (99), ribonucleotide reductase (67, 100), histone (101, 102), tubulin (103, 104), and insulin-like growth factor II mRNAs (85, 86), to name a few. In general, these elements are found in the 3' untranslated region, although, as mentioned above, c-fos and c-myc mRNAs also contain redundant coding region determinants. The mechanisms of action of these diverse elements remain to be clarified, but it is likely that the 3' untranslated region determinants will be distinct from those present in the coding region.
E. Stability Elements of Globin mRNAs
So far, we have commented on the function of destabilizing motifs that lead to a more rapid decay of cognate mRNAs, generally in resting cells. There also exist classes of mRNAs with tremendous stability, such as globin or actin. The phenotype of these mRNAs could be explained by the absence of any destabilizing motifs or by the presence of stability elements that prevent those mRNAs from being degraded. For globin, the restricted production by erythrocytes and erythroid precursor cells suggests a cell-specific stabilization mechanism. Recent work (105, 106) has demonstrated a 3' UTR stability determinant in α-globin mRNA. The element is apparently functional only in the context of erythroid precursor cells (107). In patients carrying the α-thalassemia mutation, ribosomal readthrough into the 3' UTR of α-globin mRNA destabilizes the message. The instability determinant, which functions independently of translation, has been mapped to three cytidine-rich regions in the 3' UTR (108). Recently described mutations in the 5' UTR of the β-globin gene may be associated with the stability of the mRNA (109).
III. Trans Factors
As mentioned above, a typical mammalian cell may contain upward of 20,000 distinct mRNAs at any given time. Because these mRNAs are subject to divergent degradation rates that, in many cases, can be varied under different cellular conditions, a substantial and intricate regulatory machinery must exist. Such an apparatus must be able to distinguish different mRNAs and be modulated by changes in extracellular or intracellular conditions. The most logical effectors to perform these complex functions are cytoplasmic mRNA binding proteins. Such factors would likely exist distinct from the known ribonucleoproteins engaged in translation, such as elongation and initiation factors. They may, however, be associated with polyribosomes and could conceivably comprise some of the large number of uncharacterized proteins that make up that supramolecular entity. Based on the large and growing number of mRNAs subject to posttranscriptional gene regulation, the number of proteins engaged in this process must indeed be large. In the interest of brevity, the remainder of this section focuses on the identification and characterization of proteins that interact with either the AUUUA motif or the 29-base element of APP mRNA.
A. Approaches to Identifying Trans Factors
After the pioneering work (60, 62) in 1986 that demonstrated the regulated stability of GM-CSF and TNF-α mRNAs, many investigators began to search for mRNA binding proteins that might mediate either rapid decay in resting cells or stabilization after cell activation. A number of approaches were used, including filter hybridization involving the immobilization of cytoplasmic proteins onto solid supports and their incubation with radiolabeled ARE-containing mRNAs. This Northwestern blot approach assumes that the immobilized proteins will be able to interact with their RNA ligands, that background binding to the filter or to nonspecific proteins will be minimal, and that competition assays with unlabeled, ARE-containing mRNAs (but not irrelevant mRNAs) can displace bound radiolabeled ligand. Whether appropriately folded, filter-bound protein could be generated after transfer of cytosolic proteins from denaturing SDS-PAGE was unknown. In an effort to circumvent these difficulties, we and others incubated cytoplasmic protein extracts with radiolabeled AU-containing RNA ligands prior to an electrophoretic mobility-shift assay. This technique has been successfully utilized by investigators studying the interactions of transcription factors with their DNA target sequences (110). It is both highly sensitive and specific because native protein conformation can be maintained and nonspecific binding minimized by the inclusion of irrelevant competitors such as tRNA, poly(I)·poly(C), or heparin. However, for this method to work, nonspecific protein-RNA complexes must be eliminated and specific complexes must be stable through gel electrophoresis under nondenaturing (native) conditions. Despite the presence of nonspecific competitors, we have found it necessary to add ribonuclease, usually T1 or A, to the reaction mix to eliminate completely nonspecific interactions between charged protein and RNA
as well as to degrade uncomplexed, full-length radiolabeled mRNA ligand. After ribonuclease treatment, samples can be directly analyzed by electrophoresis on native gels under low ionic strength conditions (0.25-0.5× TBE) or subjected to UV cross-linking and analyzed by SDS-PAGE. Each method has advantages and disadvantages. The native gel preserves approximately 50% of the initial complex through electrophoresis, permitting a far greater sensitivity for the detection of novel RNA-binding activities. However, no molecular size determination can be made, and multiple proteins with distinct molecular sizes may migrate at nearly identical positions, obscuring the complexity of interactions with the target mRNAs. In contrast, SDS-PAGE after UV cross-linking can provide information regarding the molecular masses of the RNA-protein complexes. Depending on the ribonucleases used to cleave unprotected portions of the target mRNA, the mass of the complex may be partially or predominantly contributed by the protein. However, UV cross-linking to proteins is very inefficient, with 2-4% of all complexes being productively cross-linked. As such, the sensitivity of SDS-PAGE analysis is far less than that of native gel analysis. In addition, any protein-protein interactions that occur will not be identified under denaturing conditions. It is probably worthwhile to consider the two techniques as complementary and to use them in combination.
Additional caveats to consider in order to identify unique mRNA-protein interactions are the choice of mRNA ligand and which radiolabeled nucleotides to incorporate into the ligand. The longer the mRNA target, the lower the relative specific activity and molar ratio of the unknown element. Thus, for initial screens, we use the minimal sequence necessary that includes the full-length element. This typically involves utilizing mRNAs in the range of 50-200 nucleotides in length.
Once a putative element has been identified using a partial sequence, the entire mRNA can be radiolabeled in vitro and employed for mobility-shift assays or used as an unlabeled competitor. The choice of nucleotides for labeling is equally important, but is rarely considered. For AU-rich elements, radiolabeled UTP is obviously a good choice. However, when confronted with identifying a unique element, it is often best to begin the survey with mRNA ligands labeled with [32P]GTP, [32P]UTP, and [32P]CTP. In some cases a combination of all three nucleotides can be beneficial, although such intensively labeled probes tend to be autolyzed rapidly. The purpose of employing different radiolabeled nucleotides is to maximize the specific activity of labeling within the putative element. If RNA band-shift assays are performed in the absence of a ribonuclease treatment, such considerations become less important. For our systems, however, where nonspecific interactions are eliminated with ribonuclease, this is critical.
Final concerns regard buffers and other solution-phase binding conditions. We have generally employed low ionic strength conditions at physiologic pH. Such conditions tend to be permissive for the binding of most mRNAs to the proteins that we have examined to date. We routinely include an approximately 500- to 1000-fold molar excess of tRNA to serve as a nonspecific competitor. Once a complex is identified, its formation can be optimized by altering ionic and redox conditions, incubation temperatures and times, as well as the ratio of protein to RNA. Using these general rules, we have identified mRNA binding proteins that interact with the AUUUA motifs of cytokine and protooncogene mRNAs (47, 48), a unique 3’ UTR determinant found in erythropoietin mRNA (113), and the unique element found in the amyloid protein precursor mRNA (92).
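The point above about matching the labeled nucleotide to the putative element can be illustrated with a small calculation. The following is a minimal sketch, not a protocol from the source: the function name, the 40-nucleotide ligand, and its GC-rich flanks are hypothetical, and the score used (the fraction of each nucleotide's occurrences that lie inside the element) is only a rough proxy for how much label would remain in a ribonuclease-protected fragment.

```python
def label_fraction_in_element(ligand: str, element: str) -> dict:
    """For U, G, and C, return the fraction of that nucleotide's occurrences
    in the ligand that lie inside the putative element.

    Assumption: the element occurs once in the ligand; only its first
    occurrence is checked for presence.
    """
    ligand = ligand.upper().replace("T", "U")
    element = element.upper().replace("T", "U")
    if ligand.find(element) < 0:
        raise ValueError("element not found in ligand")
    fractions = {}
    for nt in "UGC":  # corresponds to labeling with UTP, GTP, or CTP
        total = ligand.count(nt)
        inside = element.count(nt)
        fractions[nt] = inside / total if total else 0.0
    return fractions


# Hypothetical ligand: GC-rich flanks around four tandem AUUUA repeats.
element = "AUUUA" * 4
ligand = "GGCCGGCCGC" + element + "GCCGGCCGGC"
fractions = label_fraction_in_element(ligand, element)
best = max(fractions, key=fractions.get)
print(fractions, "-> label with", best + "TP")
```

For an AU-rich element in GC-rich flanks, all of the uridine label sits within the element, which is consistent with the remark that radiolabeled UTP is the natural choice for AU-rich ligands.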
B. The Adenosine-Uridine Binding Factor
The adenosine-uridine binding factor (AUBF) was identified in crude cytosolic extracts from Jurkat cells using mobility-shift assays with an 80-base radiolabeled oligoribonucleotide probe containing four consecutive reiterations of the AUUUA motif. This probe clearly resembles the AU-rich region of GM-CSF mRNA. After a brief incubation, followed by ribonuclease T1 or ribonuclease A digestion, the reactions were electrophoresed on native gels and showed the presence of a dominant RNA-protein complex that migrated distinctly from free RNA. The complex could be destroyed by treatment with heat or proteinase K digestion, demonstrating that it indeed contained protein (47). The complex was specific as assessed by competition with unlabeled AUUUA-containing RNAs that displaced the labeled probe at molar ratios between 25:1 and 50:1 (47). At similar concentrations, non-AUUUA-containing RNAs had no effect on the complex. When subjected to UV cross-linking and analyzed by SDS-PAGE, a dominant RNA-protein complex of 42 kDa was observed (47). There was competition for this complex by unlabeled AUUUA-containing RNA, but not by irrelevant control RNA.
The complex could be identified as long as the radiolabeled probe maintained a minimum length of approximately 30-35 nucleotides and carried at least three consecutive reiterations of the AUUUA motif (47, 48). RNAs containing single reiterations of the motif or those shorter than 30 bases failed to interact with the protein. These data suggested that secondary or higher order structures were involved in presenting the primary sequence to the protein and that full-length mRNAs such as globin, which contain a single AUUUA pentamer, would not be ligands for AUBF. When multiple AUUUA motifs were forced into a double-stranded conformation, RNA-protein complexes continued to be observed, suggesting that AUBF was able to melt potentially interfering secondary structures.
Single mutations within the AUUUA boxes had powerful inhibitory effects on AUBF binding (48). When the middle uridine was converted to a guanosine in each of the AUUUA motifs, forming four consecutive AUGUA motifs, protein binding was completely abolished. A similar, although slightly less dramatic, effect was observed when cytosine residues were inserted in the middle of the AU boxes. These data suggested that the recognition by AUBF of the AUUUA motifs is strictly defined and raise the possibility that mutations within the 3’ UTR of cytokine or protooncogene mRNAs might have dramatic and deleterious effects on the regulation of these mRNAs. If the ribonuclease machinery that normally recognizes and rapidly degrades AU-containing mRNAs showed similar specificity, such mRNAs would escape ribonuclease surveillance and be long-lived in the cytoplasm of cells. Indeed, several transformed cell lines have viral insertions within the AU-rich 3’ UTR of IL-2 (MLA-144) (114) or IL-3 (71, 72); these insertions disrupt the AU boxes and lead to the production of extremely stable cytokine mRNAs. Truncations of c-fos mRNA with the elimination of the AU boxes have also been identified. Under such conditions, c-fos is far more stable than normal (115).
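The sequence requirements described in this section (a probe of roughly 30 nucleotides or more, carrying at least three consecutive AUUUA reiterations, with AUGUA substitutions abolishing binding) can be expressed as a simple screen. The sketch below is illustrative only: the function names and test sequences are hypothetical, "consecutive" is assumed to mean tandem, non-overlapping copies, and the thresholds are the approximate values quoted in the text rather than a validated predictor of AUBF binding.

```python
import re

MOTIF = "AUUUA"


def max_tandem_repeats(rna: str, motif: str = MOTIF) -> int:
    """Return the largest number of back-to-back copies of motif in rna."""
    rna = rna.upper().replace("T", "U")
    runs = re.findall(f"(?:{re.escape(motif)})+", rna)
    return max((len(run) // len(motif) for run in runs), default=0)


def is_candidate_ligand(rna: str, min_repeats: int = 3, min_length: int = 30) -> bool:
    """Apply the approximate length and repeat thresholds quoted in the text."""
    return len(rna) >= min_length and max_tandem_repeats(rna) >= min_repeats


# Hypothetical probes with arbitrary GC flanks.
wild_type = "GC" * 8 + "AUUUA" * 4 + "GC" * 8   # four tandem AUUUA motifs
mutant = "GC" * 8 + "AUGUA" * 4 + "GC" * 8      # middle U changed to G
single = "GC" * 15 + "AUUUA" + "GC" * 15        # one isolated pentamer
short = "AUUUA" * 3                             # enough repeats, too short

for name, seq in [("wild_type", wild_type), ("mutant", mutant),
                  ("single", single), ("short", short)]:
    print(name, len(seq), max_tandem_repeats(seq), is_candidate_ligand(seq))
```

A screen of this kind captures only primary sequence; the text notes that secondary or higher order structure also appears to contribute to recognition.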
C. Function and Regulation of Adenosine-Uridine Binding Factor
Studies on the function and regulation of AUBF were performed with normal, resting T lymphocytes, rather than transformed cell lines. Tumor cell lines expressed constitutive AUBF activity. We suspected, and others have confirmed, that mRNA decay pathways are suppressed or inhibited in tumor lines (88) as well as explants of fresh human tumors (116-118). Normal resting tissues, including liver or lymphocytes, fail to demonstrate active AUBF. Therefore, we explored whether mitogens such as phorbol ester and phytohemagglutinin, cytokines such as TNF-α, or cyclic AMP analogs might affect AUBF activity. These mitogens, which partially or fully drive lymphocytes into the cell cycle, stabilize subclasses of mRNAs, including those coding for cytokines. Phorbol ester, in particular, has profound effects on the stability of GM-CSF (5, 56, 57), IL-2 (119), and interferon-γ mRNAs (120). The kinetics of these effects are yet to be characterized fully, but based on our examination of the decay of transgenic mRNAs as discussed above, the effects likely occur within 1 hr or less.
If AUBF was involved in the stabilization of GM-CSF mRNA, we would expect phorbol ester treatment to up-regulate its activity. Under such conditions, active protein would interact with the AUUUA motifs and block decay. Thus, resting lymphocytes were treated with mitogenic doses (20 ng/ml) of TPA and lysates were examined for AUBF activity through RNA mobility-shift assay (65). Resting lymphocytes lack detectable activity, but treatment for as little as 15-30 min induces rapid up-regulation that is detectable for as long as 10-12 hr (65). Separate experiments with the calcium ionophore A23187 demonstrated nearly identical results (65). Interestingly, treatment with both agonists failed to reveal additive or synergistic effects, suggesting that either pathway alone maximally activates AUBF. Because both ionophore and phorbol ester stabilize GM-CSF mRNA, it is possible that the up-regulation of AUBF activity is associated with cytokine mRNA stabilization. The simplest possible model is that AUBF occupies the AUUUA motifs and blocks ribonuclease recognition and/or cleavage. These results also indicate that calcium- and protein kinase C-dependent pathways, which can activate transcription, coregulate mRNA stability. This is a logical outcome, because transcriptional up-regulation and mRNA stabilization are both required for the elaboration of cytokines by activated cells.
The involvement of protein kinases was further verified by separate experiments evaluating the effects of the phosphatase inhibitor okadaic acid (121). If phosphorylation is an obligatory step in the activation of AUBF, phosphatase inhibition should alter the balance between phosphorylation and dephosphorylation, ultimately activating AUBF. Despite the toxicity of okadaic acid, NIH 3T3 cells treated for 8 to 12 hr with this agent showed increases in AUBF activity and stabilization of glucose transporter (GLUT-1) mRNA. GLUT-1 contains multiple AUUUA motifs in the context of a poly(U)-rich region, which we had previously demonstrated to be an AUBF ligand (47). The increased stability of GLUT-1 with increased AUBF activity suggests a cause-and-effect relationship between the two.
We next assessed whether AUBF is the target of posttranslational modification by protein kinase C or some downstream kinase, or whether active AUBF results from new gene transcription and translation (65). Resting lymphocytes were therefore activated with phorbol ester in the presence of actinomycin D and/or cycloheximide. These agents would be expected to block gene transcription and protein synthesis, respectively.
If AUBF activity could be up-regulated under such conditions, it would argue for posttranslational modification of preexisting protein with the conversion of an inactive to an active form. Such a situation is analogous to the conversion of transcriptional regulators from an inactive protein to an active transcription factor (122). Band-shift assays revealed that AUBF could indeed be up-regulated in the presence of actinomycin D and cycloheximide (65), demonstrating that preexisting protein must be modified through protein kinase C-dependent pathways. The most obvious explanation was the direct phosphorylation of AUBF by a kinase. To assess this possibility, cytoplasmic extracts from tumor cell lines or TPA-treated lymphocytes were incubated with phosphatases (65). Under such conditions, AUBF could be inactivated, suggesting that it is likely a phosphoprotein. hnRNP C protein, as well as a host of other nuclear-based activities, appears to go through similar phosphorylation-dephosphorylation cycles that regulate their ability to bind nucleic acid ligands.
Whether AUBF is a direct substrate of protein kinase C remains to be established.
In vitro manipulation of AUBF activity has revealed additional levels of regulatory oversight. Several RNA binding proteins, for example, the iron response element binding protein (123, 124) and R17 coat protein (125), and many DNA binding proteins (126), including c-Fos, are sensitive to oxidation-reduction. Despite the maintenance of an overall reducing environment in the cytoplasm of cells, individual enzymes or proteins can exist in microenvironments that are dominantly oxidizing. We therefore tested whether AUBF was sensitive to changes in redox state by incubating cytoplasmic lysates with the reversible oxidant diamide or the irreversible sulfhydryl-modifying agent N-ethylmaleimide (NEM) (65). In the presence of such reagents, AUBF activity was completely abolished. 2-Mercaptoethanol (2-ME) treatment of diamide-treated lysates fully reversed the effect of oxidation, suggesting that redox regulation may also participate in the regulation of AUBF activity (65).
Oxidizing agents are thought to modify free sulfhydryl groups, which may function either to coordinate metals or to interact directly with RNA ligands. In order to discriminate between these two possibilities, AUBF was incubated with GM-CSF mRNA to form a stable complex prior to treatment with NEM. If the target sulfhydryl groups interacted directly with the RNA ligand, they would be shielded in the complex and NEM should have no effect on preformed complexes. Conversely, if the sulfhydryls remained available and were involved in other functions such as the chelation of metals, then NEM should inhibit preformed mRNA/AUBF complexes. Indeed, NEM had inhibitory effects on AUBF activity, whether RNA-protein complexes had formed or not. These data suggested that the sulfhydryl groups likely participate in metal ion chelation rather than interacting directly with the RNA ligand. This was directly assessed by exhaustive dialysis of AUBF against EDTA- and EGTA-containing solutions.
After dialysis, protein activity was eliminated, but could be completely restored by reconstitution with calcium or magnesium (127). A variety of other divalent and trivalent metals were assayed, but none restored activity (127). These data suggest that AUBF, on reduction, can bind calcium and/or magnesium, leading to RNA binding activity. The levels of calcium necessary to activate AUBF were found to be in the low micromolar range, which is close to the physiologic intracellular level after the engagement of cell surface receptors or treatment with calcium ionophore. Therefore, AUBF appears to exist as an inactive precursor within the cytoplasm of cells. On activation and attendant up-regulation of PKC with calcium flux, AUBF becomes reduced, capable of binding calcium or magnesium, and phosphorylated. Once these posttranslational modifications have occurred, AUBF can engage and bind to AUUUA-containing RNA ligands. We have since broadened these observations and demonstrated that non-PKC-mediated signal transduction pathways can also activate AUBF. These include those mediated by cyclic AMP, TNF-α, and phytohemagglutinin. Although not all of these reagents have yet been demonstrated to stabilize AUUUA-containing mRNAs, we predict that they would do so.
Important as these results are, they do not directly demonstrate that AUBF stabilizes GM-CSF mRNA. In order to test this hypothesis directly, we employed a polysome-based in vitro mRNA decay system derived from TPA/PHA-activated peripheral blood mononuclear cells (46). Polysomes loaded with the mRNA of interest can accurately and specifically mimic in vivo mRNA decay (43, 44). Traditionally, the polysomes are derived from tumor cell lines, including K562, but we were concerned that these transformed cells might lack appropriate regulatory pathways. We therefore isolated polysomes from TPA/PHA-activated lymphocytes. RNA mobility-shift analysis of subcellular fractions showed that >90% of the total cellular activity of AUBF was polysome associated, with approximately 6% of the activity in the S130 supernatant, supporting its postulated role as a stabilizer of polysome-associated AUUUA-containing mRNAs. Northern blotting of RNA isolated from the polysomes revealed ample GM-CSF mRNA, which would serve as our target AUUUA-containing mRNA. When polysomes were incubated under protein synthesis conditions, GM-CSF mRNA decayed with an apparent half-life of 90 min as assessed by Northern blotting. Removal of the polysome-bound AUBF activity with biotinylated AUUUA-containing RNA linked to streptavidin magnetic beads accelerated the decay of polysome-associated GM-CSF mRNA approximately fivefold, to a half-life of 17 min. These data directly demonstrated that AUBF stabilizes GM-CSF mRNAs on polyribosomes.
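The half-lives reported above (90 min with polysome-bound AUBF, 17 min after its removal) can be put in more concrete terms with a short calculation. This is a minimal sketch that assumes simple first-order (single-exponential) decay, which the text does not state explicitly; the function and variable names are illustrative.

```python
import math


def decay_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def fraction_remaining(t_min: float, half_life_min: float) -> float:
    """Fraction of the starting mRNA left after t_min minutes."""
    return 0.5 ** (t_min / half_life_min)


HALF_LIFE_WITH_AUBF = 90.0   # min, polysomes with bound AUBF (from the text)
HALF_LIFE_DEPLETED = 17.0    # min, after removal of AUBF (from the text)

fold_acceleration = HALF_LIFE_WITH_AUBF / HALF_LIFE_DEPLETED
left_with_aubf = fraction_remaining(90.0, HALF_LIFE_WITH_AUBF)
left_depleted = fraction_remaining(90.0, HALF_LIFE_DEPLETED)

print(f"decay is {fold_acceleration:.1f}-fold faster without AUBF")
print(f"after 90 min: {left_with_aubf:.1%} vs {left_depleted:.1%} remaining")
```

Under this assumption, about half of the GM-CSF mRNA would remain after 90 min on intact polysomes, compared with a few percent on AUBF-depleted polysomes.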
D. hnRNP C and Nucleolin
As discussed above, the AUUUA motifs are found in a variety of posttranscriptionally regulated genes. On sequence analysis, the 3’ untranslated region of the amyloid protein precursor (APP) mRNA was found to contain four reiterations of the AUUUA motif (92). These motifs were separated by approximately 50 nucleotides, with each embedded in a relatively GC-rich region. At the time these studies were initiated, it was unknown whether single AUUUA motifs in such a context would be destabilizing. We therefore began to search for RNA binding proteins that might interact with some component of the APP 3’ UTR. Using progressively shorter APP mRNAs, we identified a 29-base region approximately 200 bases from the stop codon that, when incubated with lysates from normal, activated cells or tumor cell lines, produced six RNA-protein complexes (92). Fine mapping demonstrated that this region
contained no AUUUA motifs, but was rather AC rich. Using classical purification techniques, and assaying for APP RNA binding activity, we ultimately purified two proteins, hnRNP C and nucleolin, from cytoplasmic lysates of tumor cell lines. We demonstrated by mobility-shift assay, Northwestern blotting, and Western blotting that these proteins were the authentic RNA binding proteins that interacted with the APP 3’ UTR (128). Further, we demonstrated that approximately 30% of hnRNP C and nucleolin were cytosolic (128). These data are striking for a number of reasons. First, both nucleolin and hnRNP C were previously considered as nuclear activities. Their identification in the cytoplasm and particularly on polyribosomes would suggest they may have additional cytoplasmic functions. One can envision modifications including phosphorylation, methylation, and/or reduction that might mediate such functional diversity. Second, these results suggest specific roles for nucleolin and hnRNP C protein in APP mRNA decay. We are currently assessing their function directly by using in vitro decay systems wherein these proteins can be added or removed and subsequent effects on decay of APP mRNA can be assessed. These experiments will establish whether these proteins are merely RNP passengers or whether they mediate specific functions.
IV. Overproduction of Cytokines in Cells and Intact Animals: Application to Gene Therapy
As mentioned earlier, mutations within the AUUUA motifs of short-lived mRNAs result in a significant stabilization of these mRNAs. Clearly, these mutant mRNAs must be as efficiently translated as their wild-type counterparts in order to have functional significance. To assess this question directly, we transfected resting peripheral blood mononuclear cells (PBMCs; approximately 70% T lymphocytes) with cytomegalovirus (CMV) promoter-driven in vivo expression vectors coding for either wild-type human GM-CSF mRNA (hGM-AUUUA) or a mutant version with four tandem AUGUA repeats in the 3’ UTR (hGM-AUGUA) (52). The protein synthetic capability of these mRNAs was assessed by enzyme-linked immunosorbent assays (ELISAs) performed 24 hr after transfection, on conditioned culture medium and cell lysates from identical numbers of cells. Under the conditions employed, PBMCs transfected with vector control failed to produce any detectable GM-CSF protein. hGM-AUUUA transfectants secreted about 25 pg of GM-CSF protein/ml/10^6 cells, with no detectable protein in the cell pellet. hGM-AUGUA transfectants secreted about 550 pg of GM-CSF protein/ml/10^6 cells, with an additional 200 pg of protein/10^6 cells in the cell pellet. Thus, a 4.5-fold increase in half-life from 20 min for GM-AUUUA mRNA to 90 min for GM-AUGUA mRNA resulted in an increase in protein production of 20- to 30-fold.
This dramatic increase was also observed after the transfection of intact animals (52). Particle-mediated gene transfer has been employed to introduce a variety of expression vectors into intact animal skin, as well as into internal organs. Depending on the velocity and size of DNA-coated gold particles, DNA can be delivered 20-40 cell layers deep into tissues, including skin, muscle, brain, liver, or mucosa (52, 53). We analyzed the ability of hGM-AUUUA and hGM-AUGUA expression vectors to produce immunologically detectable GM-CSF after a single transfection of rat epidermis. At 24 hr after transfection, serum and skin (transfection site) samples were collected and analyzed for GM-CSF protein. About 100 pg of GM-CSF per milliliter of tissue homogenate was detected at the site of wild-type GM-CSF transfection, which was approximately 1/100th of that measured at the site of hGM-AUGUA cDNA introduction. The serum of animals transfected with wild-type GM-CSF had undetectable levels of protein, whereas those receiving mutant constructs had approximately 650 pg of protein/ml of serum. These data demonstrate the power of subtle alterations in the stability of GM-CSF mRNA to up-regulate protein production in cells or intact animals. We are currently using such technology to optimize the expression of cytokines for cancer gene therapy in humans with melanoma and breast cancer. The transgenic proteins produced were fully biologically active, as assessed by massive neutrophil and macrophage inflammation at the transfection site of mutant constructs (52).
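The 20- to 30-fold figure quoted for the transfected PBMCs follows directly from the ELISA values given above. The sketch below simply reproduces that arithmetic; the variable names are illustrative, and adding the cell-pellet value to the secreted value assumes the two are expressed on a comparable per-10^6-cell basis, which the text implies but does not spell out.

```python
# Values quoted in the text, 24 hr after transfection.
WT_SECRETED = 25.0     # pg GM-CSF/ml/10^6 cells, hGM-AUUUA (wild type)
MUT_SECRETED = 550.0   # pg GM-CSF/ml/10^6 cells, hGM-AUGUA (mutant)
MUT_PELLET = 200.0     # pg GM-CSF/10^6 cells retained in the mutant cell pellet

WT_HALF_LIFE_MIN = 20.0    # GM-AUUUA mRNA
MUT_HALF_LIFE_MIN = 90.0   # GM-AUGUA mRNA


def fold_change(mutant: float, wild_type: float) -> float:
    """Ratio of mutant to wild-type values; wild_type must be positive."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return mutant / wild_type


half_life_fold = fold_change(MUT_HALF_LIFE_MIN, WT_HALF_LIFE_MIN)
secreted_fold = fold_change(MUT_SECRETED, WT_SECRETED)
# Assumption: secreted and pellet protein can be summed per 10^6 cells.
total_fold = fold_change(MUT_SECRETED + MUT_PELLET, WT_SECRETED)

print(f"half-life: {half_life_fold:.1f}-fold")
print(f"protein: {secreted_fold:.0f}-fold secreted, {total_fold:.0f}-fold total")
```

The lower bound corresponds to secreted protein alone and the upper bound to secreted plus cell-associated protein, so the increase in protein output is several times larger than the increase in mRNA half-life.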
V. Summary
We have demonstrated the existence of multiple mRNA binding proteins that interact specifically with defined regions in posttranscriptionally regulated mRNAs. These domains appear to be destabilizers whose function can be attenuated by the interaction with the specific binding proteins. Thus, the ability to alter mRNA decay rates on demand, given different environmental or intracellular conditions, appears to be mediated by controlling the localization, activity, and overall function of the cognate binding protein. Based on our limited experience, we predict that most, if not all, similarly regulated mRNAs will ultimately be found to interact with regulatory mRNA binding proteins. Under conditions whereby the mRNA binding proteins are constitutively active (e.g., tumor cell lines), abnormal mRNA decay will result, with accumulation and overtranslation. Such appears to be the case for cytokines and possibly amyloid protein precursor mRNAs in cancer and Alzheimer's disease, respectively. Conversely, mutagenesis of these critical 3’
untranslated region elements will likely have comparable deleterious effects on the regulation of gene expression. To the extent that such derangements exist in human disease, attention to understanding the mechanistic detail at this level may provide insights into the development of appropriate therapeutics or treatment strategies.
ACKNOWLEDGMENTS
Part of this work was supported by National Institutes of Health Grant AG 10675. We thank members of the laboratory for their support and valuable input. We also thank the Department of Cancer Gene Therapy, Agracetus Inc., Middleton, Wisconsin, for providing the Accell particle-mediated gene transfer instrument and protocols.
REFERENCES
1. K. S. Kabnick and D. E. Housman, MCBiol 8, 3244 (1988).
2. H. J. Rahmsdorf, A. Schonthal, P. Angel, M. Liftin, U. Ruther and P. Herrlich, NARes 15, 1643 (1987).
3. I. M. Verma and P. Sassone-Corsi, Cell 51, 513 (1987).
4. A. B. Shyu, M. E. Greenberg and J. G. Belasco, Genes Dev. 3, 60 (1989).
5. M. Bickel, R. B. Cohen and D. H. Pluznik, J. Immunol. 145, 840 (1990).
6. G. Brewer, MCBiol 11, 2460 (1991).
7. R. Wisdom and W. Lee, Genes Dev. 5, 232 (1991).
8. A. Wodnar-Filipowicz and C. Moroni, PNAS 87, 777 (1990).
9. T. Lindsten, C. H. June, J. A. Ledbetter, G. Stella and C. B. Thompson, Science 244, 339 (1989).
10. V. Volloch, B. Schweitzer and S. Kits, Exp. Cell Res. 173, 38 (1987).
11. S. G. Swartwout and A. J. Kinniburgh, MCBiol 9, 288 (1989).
12. H. M. Jack and M. Wabl, EMBO J. 7, 1041 (1988).
13. R. B. Alterman, S. Ganguly, D. H. Schulze, W. F. Marzluff, C. L. Schildkraut and A. I. Skoultchi, MCBiol 4, 123 (1984).
14. R. Levis and S. Penman, Cell 11, 105 (1977).
15. R. H. Singer and S. Penman, Nature (London) 240, 100 (1972).
16. Y. Chen, J. Weeks, M. A. Mortin and A. L. Greenleaf, MCBiol 13, 4214 (1993).
17. P. B. Sehgal, L. Tamm and J. Vilcek, Science 190, 282 (1975).
18. C. A. Chen, N. Xu and A. Shyu, MCBiol 15, 5777 (1995).
19. M. A. Goldberg, C. C. Gaut and H. F. Bunn, Blood 77, 271 (1991).
20. C. Seiser, M. Posch, N. Thompson and L. C. Kuhn, JBC 270, 29400 (1995).
21. M. E. Greenberg and E. B. Ziff, Nature (London) 311, 433 (1984).
22. W. Kruijer, J. A. Cooper, T. Hunter and I. M. Verma, Nature (London) 312, 711 (1984).
23. R. A. Hurta, A. H. Greenberg and J. A. Wright, J. Cell. Physiol. 156, 272 (1993).
24. E. R. Eldredge, P. J. Chiao and K. P. Lu, Methods Enzymol. 254, 481 (1995).
25. M. Gossen and H. Bujard, PNAS 89, 5547 (1992).
26. J. Ross, in “Control of mRNA Stability” (J. Belasco and G. Brawerman, eds.), p. 417. Academic Press, San Diego, 1993.
27. J. Ross, in “RNA Processing: A Practical Approach” (B. D. Hames and S. J. Higgins, eds.), Vol. II, p. 107. IRL Press, Oxford, 1994.
28. M. S. Altus and Y. Nagamine, JBC 266, 21190 (1991).
29. R. Pei and K. Calame, MCBiol 8, 2860 (1988).
30. R. Bandyopadhyay, M. Coutts, A. Krowczynska and G. Brawerman, MCBiol 10, 2060 (1990).
31. C. R. Krikorian and G. S. Read, J. Virol. 65, 112 (1990).
32. F. C. Nielsen and J. Christiansen, JBC 267, 19404 (1992).
33. M. Gorospe and C. Baglioni, JBC 269, 11845 (1994).
34. J. E. Hepler, J. J. Van Wyk and P. K. Lund, Endocrinology 127, 1550 (1990).
35. D. H. Wreschner and G. Rechavi, EJB 172, 333 (1988).
36. M. J. Ernest, Bchem 2 4 6761 (1982).
37. E. Stimac, V. E. Groppi and P. Coffino, BBRC 19, 917 (1983).
38. T. Aharon and R. J. Schneider, MCBiol 13, 1971 (1993).
39. D. M. Koeller, J. A. Horowitz, J. L. Casey, R. D. Klausner and J. B. Harford, PNAS 88, 7778 (1991).
40. R. A. Graves, N. B. Pandey, N. Chodchoy and W. F. Marzluff, Cell 48, 615 (1987).
41. A. M. Curatola, M. S. Nadal and R. J. Schneider, MCBiol 15, 6331 (1995).
42. S. Savant-Bhonsale and D. W. Cleveland, Genes Dev. 6, 1927 (1992).
43. J. Ross and G. Kobs, JMB 188, 579 (1986).
44. I. Sunitha and L. I. Slobin, BBRC 144, 560 (1987).
45. G. Brewer and J. Ross, MCBiol 8, 1697 (1988).
46. L. E. Rajagopalan and J. S. Malter, JBC 269, 23882 (1994).
47. J. S. Malter, Science 246, 664 (1989).
48. P. Gillis and J. S. Malter, JBC 266, 3172 (1991).
49. R. Koren, Y. Burstein and H. Soreq, PNAS 80, 7205 (1983).
50. H. J. Song, D. R. Gallie and R. F. Duncan, EJB 232, 778 (1995).
51. R. W. Malone, P. L. Felgner and I. M. Verma, PNAS 86, 6077 (1989).
52. L. E. Rajagopalan, J. K. Burkholder, J. Turner, J. Culp, N.-S. Yang and J. S. Malter, Blood 86, 2551 (1995).
53. J. K. Burkholder, J. Decker and N.-S. Yang, J. Immunol. Methods 165, 149 (1993).
54. P. Qiu, P. Ziegelhoffer, I. Sun and N.-S. Yang, Gene Therapy 15, 45 (1996).
55. T. J. Ernest, A. R. Ritchie, G. D. Demetri and J. D. Griffin, JBC 264, 5700 (1989).
56. Y. Iwai, M. Bickel, D. H. Pluznik and R. B. Cohen, JBC 266, 17959 (1991).
57. M. Akashi, G. Shaw, M. Hachiya, E. Elstner, G. Suzuki and P. Koeffler, Blood 83, 3182 (1994).
58. G. G. Wong, J. S. Witek, P. A. Temple, K. M. Wilkens, A. C. Leary, D. P. Luxenberg, S. S. Jones, E. L. Brown, R. M. Kay, E. C. Orr, C. Shoemaker, D. W. Golde, R. J. Kaufman, R. M. Hewick, E. A. Wang and S. C. Clark, Science 228, 810 (1985).
59. N. M. Gough, J. Gough, D. Metcalf, A. Kelso, D. Grail, N. A. Nicola, A. W. Burgess and A. R. Dunn, Nature (London) 309, 763 (1984).
60. G. Shaw and R. Kamen, Cell 46, 659 (1986).
61. D. Caput, B. Beutler, K. Hartog, R. Thayer, S. Brown-Shimer and A. Cerami, PNAS 83, 1670 (1986).
62. M. Akashi, M. Hachiya, H. P. Koeffler and G. Suzuki, BBRC 189, 986 (1992).
63. R. L. Tanguay and D. R. Gallie, MCBiol 16, 146 (1996).
64. M. Gorospe, M. S. Kumar and C. Baglioni, JBC 268, 6214 (1993).
65. J. S. Malter and Y. Hong, JBC 266, 3167 (1991).
66. F. Y. Chen, F. M. Amara and J. A. Wright, BJ 302, 125 (1994).
67. F. M. Amara, F. Y. Chen and J. A. Wright, JBC 269, 6709 (1994).
68. F. Y. Chen, F. M. Amara and J. A. Wright, EMBO J. 12, 3977 (1993).
69. Y. Iwai, K. Akahane, D. H. Pluznik and R. B. Cohen, J. Immunol. 150, 4386 (1993).
70. S. W. Peltz, G. Brewer, P. Bernstein, P. A. Hart and J. Ross, Crit. Rev. Eukaryotic Gene Expression 1, 99 (1991).
71. P. A. Algate and J. A. McCubrey, Oncogene 8, 1221 (1993).
72. H. H. Hirsch, A. P. K. Nair and C. Moroni, J. Exp. Med. 178, 403 (1993).
73. E. H. Kislauskis, X. Zhu and R. H. Singer, J. Cell. Biol. 127, 441 (1994).
74. D. Ferrandon, L. Elphick, C. Nusslein-Volhard and D. St. Johnston, Cell 79, 1221 (1994).
75. J. L. Smith, J. E. Wilson and P. M. MacDonald, Cell 70, 849 (1992).
76. V. Kruys, O. Marinx, G. Shaw, J. Deschamps and G. Huez, Science 245, 852 (1989).
77. G. Grafi, I. Sela and G. Galili, MCBiol 13, 3487 (1993).
78. S. C. Schiavi, C. L. Wellington, A. B. Shyu, C. Y. A. Chen, M. E. Greenberg and J. G. Belasco, JBC 269, 3441 (1994).
79. A. B. Shyu, J. G. Belasco and M. E. Greenberg, Genes Dev. 5, 221 (1991).
80. C. L. Wellington, M. E. Greenberg and J. G. Belasco, MCBiol 13, 5034 (1993).
81. P. L. Bernstein, D. J. Herrick, R. D. Prokipcak and J. Ross, Genes Dev. 6, 642 (1992).
82. R. D. Prokipcak, D. J. Herrick and J. Ross, JBC 269, 9261 (1994).
83. A. M. Zubiaga, J. G. Belasco and M. E. Greenberg, MCBiol 15, 2219 (1995).
84. C. A. Lagnado, C. Y. Brown and G. J. Goodall, MCBiol 14, 7984 (1994).
85. D. Meinsma, W. Scheper, P. E. Holthuizen, J. L. Van den Brande and J. D. Sussenbach, NARes 20, 5003 (1992).
86. W. Scheper, D. Meinsma, P. E. Holthuizen and J. D. Sussenbach, MCBiol 15, 235 (1995).
87. S. H. E. Zaidi and J. S. Malter, JBC 269, 24007 (1994).
88. H. J. Ross, N. Sato, Y. Ueyama and H. P. Koeffler, Blood 77, 1787 (1991).
89. S. Ishiura, J. Neurochem. 56, 363 (1991).
90. D. J. Selkoe, Sci. Am. 11, 68 (1991).
91. S. A. Johnson, T. McNeill, B. Cordell and C. E. Finch, Science 248, 854 (1990).
92. S. H. E. Zaidi, R. Denman and J. S. Malter, JBC 269, 24000 (1994).
93. J. B. Harford, T. A. Rouault and R. D. Klausner, in “Iron Metabolism in Health and Disease” (J. H. Brock, J. W. Halliday, M. J. Pippard and L. W. Powell, eds.), p. 123. W. B. Saunders Co., Philadelphia, 1994.
94. R. D. Klausner, T. A. Rouault and J. B. Harford, Cell 72, 19 (1993).
95. Z. Kikinis, R. S. Eisenstein, A. J. Bettany and H. N. Munro, NARes 23, 4190 (1995).
96. B. Goossen and M. W. Hentze, MCBiol 12, 1959 (1992).
97. J. L. Casey, D. M. Koeller, V. C. Ramin, R. D. Klausner and J. B. Harford, EMBO J. 8, 3693 (1989).
98. E. W. Mullner and L. C. Kuhn, Cell 53, 815 (1988).
99. S. S. Peng, C. A. Chen and A.-B. Shyu, MCBiol 16, 1490 (1996).
100. F. Y. Chen, F. M. Amara and J. A. Wright, NARes 22, 4796 (1994).
101. W. F. Marzluff and N. B. Pandey, Trends Biochem. Sci. 13, 49 (1988).
102. D. Schumperli, Trends Genet. 4, 187 (1988).
103. J. S. Pachter, T. J. Yen and D. W. Cleveland, Cell 51, 283 (1987).
104. T. J. Yen, D. A. Gay, J. S. Pachter and D. W. Cleveland, MCBiol 8, 1224 (1988).
105. X. Wang, M. Kiledjian, I. M. Weiss and S. A. Liebhaber, MCBiol 15, 1769 (1995).
106. I. M. Weiss and S. A. Liebhaber, MCBiol 14, 8123 (1994).
107. R. N. Bastos and H. Aviv, JBC 110, 205 (1977).
108. I. M. Weiss and S. A. Liebhaber, MCBiol 15, 2457 (1995).
109. P. J. Ho, J. Rochette, C. A. Fisher, B. Wonke, M. K. Jarvis, A. Yardumian and S. L. Thein, Blood 87, 1170 (1996).
110. D. Lane, P. Prentki and M. Chandler, Microbiol. Rev. 56, 509 (1992).
New and Atypical Families of Type I Interferons in Mammals: Comparative Functions, Structures, and Evolutionary Relationships¹

R. MICHAEL ROBERTS,² LIMIN LIU AND ANDREI ALEXENKO

Departments of Veterinary Pathobiology and Animal Sciences
University of Missouri
Columbia, Missouri 65211
I. Interferon-ω
II. Interferon-τ
III. Comparison of Structures of IFN-ω and IFN-τ with Other Type I Interferons
IV. Evolution of IFNW and IFNT
VII. Is There a Human IFN-τ?
VIII. Concluding Remarks
References
¹ Abbreviations: bTP-1, bovine trophoblast protein-1 (bovine IFN-τ); CL, corpus luteum (or corpora lutea); GAF, gamma (interferon) activating factor; GAS, gamma (interferon) activating sequence; GM-CSF, granulocyte-macrophage colony stimulating factor; IFN, interferon; IFNAR, interferon α/β receptor; IFNA, gene for IFN-α; IFNB, gene for IFN-β; IFNT, gene for IFN-τ; IFNW, gene for IFN-ω; IGF, insulin-like growth factor; ISGF3, interferon-stimulated gene factor-3; ISRE, interferon-stimulated response element; Jak, Janus kinase; MDBK, Madin-Darby bovine kidney (cells); oTP-1, ovine trophoblast protein-1 (ovine IFN-τ); PGF2α, prostaglandin F2α; STAT, signal transducers and activators of transcription; tyk, tyrosine kinase.
² To whom correspondence should be addressed.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 56. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00

The concept of viral interference grew primarily out of experiments performed on chick allantoic membranes 40 or more years ago, when it was realized that tissue exposed to an inactivated influenza virus could resist a challenge from virulent live virus (1, 2). Initially it was thought that the protective factor comprised only a single antiviral substance, for which the term "interferon" was coined. It was not until the first interferon (IFN) cDNA and IFN genes were cloned in the early 1980s that the full complexity of the IFN system began to be appreciated, although attempts at purification and serological studies had earlier hinted that more than one active factor was present in the preparations from virally challenged cells (3).

There are two distinct groups of IFN, type I and type II (Fig. 1) (4, 5). The latter, better known as IFN-γ, and once referred to as "immune interferon," seems to be confined to mammals (Table I). In whatever species it has been studied, IFN-γ has been encoded by a single gene containing three introns (5). IFN-γ is a homodimeric molecule and bears little or no resemblance to
FIG. 1. The type I interferon signal transduction pathway. The figure summarizes what is known about type I IFN. The IFNAR1 receptor, originally cloned by Uzé et al. (24), has an extracellular structure consisting of four immunoglobulin-like domains. The intracellular region associates with the tyk2 kinase and undergoes phosphorylation on IFN binding (P). The IFNAR2c receptor (the so-called long form) (23) can bind type I IFN directly. High-affinity binding requires both subunits. IFNAR2c associates with Jak1 kinase. The STAT factors associate with the receptors by their SH2 domains and become phosphorylated on tyrosine residues. Once activated in this manner, they can associate with p48 to form the transcription factor ISGF3, which binds to the IFN-stimulated response element (ISRE) on type-I-responsive genes. STAT1 can also homodimerize. As such, it corresponds to GAF, the transcription factor activated by IFN-γ. The IFNAR2 receptor exists in two additional forms (a and b) as the result of alternative transcript splicing. The functions of these additional binding proteins are unclear. IFNAR2a is soluble and was originally purified from urine (see 22). IFNAR2b is membrane spanning and was originally identified as the type I IFN receptor by Novick et al. (22), but has a different sequence, compared to IFNAR2c, in its cytoplasmic region. The representation does not preclude additional signal pathways, e.g., involving other STAT, nor does it rule out the presence of additional receptor subunits, possibly with subtype specificity.
TABLE I
A SUMMARY COMPARISON OF TYPE I AND TYPE II INTERFERONS

Feature                    Type I IFN                                  Type II IFN
Distribution               Mammals, birds, fish, and possibly          Mammals only
                           amphibians and reptiles
Genes                      (a) Multiple                                (a) Single
                           (b) Intronless                              (b) Three introns
Protein                    Monomer                                     Dimer
Receptor                   Two known subunits                          Two known subunits
Signal induction           See Fig. 1                                  Jak STAT pathway
Cell type for expression   Many cell types                             T cells and a limited range of other
                                                                       cells, including pig trophoblasts(a)
Stability                  Stable at low pH                            Unstable at low pH

(a) See Section VI,B.
the single-chain type I IFN in primary structure. It also binds to its own specific receptor, and its actions are potentiated through a distinct signal transduction pathway within its target cell (6, 7). Note from Fig. 1, however, that the signal transducer and activator of transcription (STAT1) factor provides a common element to both the type I and type II signaling pathways. As a phosphorylated homodimer [the gamma interferon activation factor (GAF)] (7), it can bind to the gamma activating sequence (GAS) element on IFN-γ-responsive genes, whereas as a component of interferon-stimulated gene factor 3 (ISGF3) [the heterotrimeric complex that binds the interferon-stimulated response element (ISRE)], it is involved in the transactivation of genes that are transcriptionally regulated by type I IFN. It is possibly for this reason that the biological activities of type I and type II IFN overlap. Genes that contain both GAS and ISRE elements are probably responsive to both types of IFN.

The type I IFNs are a diverse group of molecules (Table II). By about 1980, two major subtypes (IFN-α and IFN-β) had been recognized. In humans and mice, there is only a single gene for IFN-β (IFNB), although cattle carry at least five (8, 9). In contrast, there are multiple genes for IFN-α (IFNA) in all mammalian species so far examined (10), including humans (11, 12) and cattle (13, 14). As discussed later, humans carry 13 IFNA genes that are transcribed, and several additional pseudogenes (12).

The multiplicity of type I IFN has raised two important questions regarding IFN function (15). First, do individual IFNs have special biological properties that equip them particularly well for certain roles? Second, are individual IFN genes induced differentially, so that a cell can provide an IFN
TABLE II
TYPE I INTERFERONS OF CATTLE

Subtype                                  IFN-α                   IFN-β                    IFN-ω                   IFN-τ
Number of genes                          >15                     ~5                       ~4(a)                   ~4-5
Inducer                                  Virus                   Virus                    Virus                   Unknown
Cell of origin                           Leukocytes and others   Fibroblasts and others   Leukocytes and others   Trophectoderm
Genes                                    Intronless              Intronless               Intronless              Intronless
Length of polypeptide sequence           166                     166                      172(b)                  172
Percent sequence identity with IFN-α     100                     ~30                      ~75                     ~50
Antiviral activity                       Yes                     Yes                      Yes                     Yes
Antiproliferative activity               Yes                     Yes                      ?                       Yes

(a) Three to four genes are detected with a highly specific 3' cDNA probe, but at least 15 are detectable with a less specific, full-length probe (see Fig. 2). Additional genes related to IFN-ω probably exist.
(b) Human IFN-ω is either 172 or 174 amino acids long.
response to a wide range of different pathogens and other external cues? If the answer to either of these questions is yes, there has likely been positive Darwinian selection operating on the IFNA genes, i.e., they have not assumed their individual sequence identities simply by random genetic drift. What also seems probable is that the IFNA genes have duplicated independently in different orders of placental mammals (10, 16). Therefore, if selection has driven IFNA duplication and diversity, different mammals may not utilize their IFN-α for identical purposes.

Despite the large numbers of type I IFNs and claims that some of them might have unique properties, only a single kind of type I receptor has so far been recognized (Fig. 1). Within a species, all known type I IFNs appear capable of binding to this receptor and competing with other type I IFNs for occupancy (17-20). The human receptor consists of at least two distinct subunits, IFNAR1 (21) and IFNAR2 (22, 23). The former is sometimes referred to as the transducing subunit because it has little ability to bind IFN on its own. The second subunit, which occurs in several forms as the result of alternative splicing, is considered to be the primary binding subunit. Both may be required to provide high-affinity binding (29), and each forms at least one noncovalent association with a tyrosine kinase (Fig. 1). The two chains become phosphorylated at specific tyrosines in their cytoplasmic regions following ligand binding to the extracellular domain of the receptor (25, 26) and are then able to recruit STAT1 and STAT2 as substrates for the tyrosine kinases. Once STAT1 is phosphorylated, it can dimerize, creating GAF (6, 7). Alternatively, STAT1 heterodimerizes with STAT2 and then associates with the DNA-binding protein p48 to form ISGF3, which migrates into the nucleus and activates type-I-responsive genes (Fig. 1). The receptor complex thus functions as an adapter molecule, linking tyrosine kinases to potential transcription factors.

Although type I IFNs are undoubtedly a primary line of defense against viruses, they are also pleiotropic cytokines capable of inducing a remarkably varied range of changes on their targets (4, 19). They form part of the complex regulatory network that controls the immune system and can also influence the growth, behavior, and metabolic activities of many kinds of nonimmune cells. To regard the IFN system as merely a response to infection would be a serious underestimation of its much broader role in cellular homeostasis and development.

Over the last decade it has become evident that the type I IFNs are an even more extensive family than was anticipated from the initial cloning studies. Screening of human and bovine cDNA and genomic libraries under nonstringent conditions revealed an entirely new subtype, serologically and structurally distinct from the IFN-α and -β. Although originally named IFN-αII (11), the term IFN-ω, coined by Hauptman and Swetly (27), is now recommended. IFN-ω is the first of the atypical IFNs reviewed in this essay. In 1987, yet a different subtype was described. In this case, the IFN was the major secretory product of the trophoblast of preimplantation ovine embryos and functioned as a hormone of pregnancy. It has become known as IFN-τ. Since that time other type I IFNs have been described, including ones from nonmammalian species.
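The STAT-complex logic described in the introduction (and in Fig. 1) can be sketched as a small lookup table. This is only an illustrative model and not part of the original review: the names `COMPLEXES`, `complexes_binding`, and `shared_components` are hypothetical, and the table encodes nothing beyond two statements in the text, namely that the STAT1 homodimer (GAF) binds GAS elements and that ISGF3 (STAT1, STAT2, and p48) binds ISRE elements.

```python
# Illustrative model only; all names here are hypothetical and chosen for this sketch.
# Each complex lists its protein components and the DNA element it binds, as stated in the text.
COMPLEXES = {
    "GAF": {"components": ("STAT1", "STAT1"), "binds": "GAS"},
    "ISGF3": {"components": ("STAT1", "STAT2", "p48"), "binds": "ISRE"},
}


def complexes_binding(promoter_elements):
    """Return the sorted names of complexes whose target element occurs in a promoter."""
    present = set(promoter_elements)
    return sorted(name for name, c in COMPLEXES.items() if c["binds"] in present)


def shared_components():
    """Return the components found in every complex (the text singles out STAT1)."""
    component_sets = [set(c["components"]) for c in COMPLEXES.values()]
    return set.intersection(*component_sets)


print(complexes_binding(["ISRE"]))         # a gene carrying only an ISRE
print(complexes_binding(["GAS", "ISRE"]))  # a gene carrying both elements
print(shared_components())
```

A gene listed with both elements is matched by both complexes, which mirrors the suggestion that such genes are probably responsive to both types of IFN. The sketch deliberately ignores kinetics, receptor subunits, and any additional pathways.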
The purpose of this essay is to describe what is known about IFN-ω and IFN-τ and related mammalian type I IFNs, and to discuss, as far as it is possible, their function. Wherever it is appropriate, we have made comparisons with the better known IFN-α and -β (Table II). Finally, we speculate on the evolutionary origin and relationships among the various subtypes so far discovered.
I. Interferon-ω
A. Discovery

IFN-ω was first described by Capon et al. (11) and Hauptman and Swetly (27) in 1985. The former screened human and bovine genomic libraries under low-stringency conditions with huIFN-α probes. Weakly hybridizing clones were isolated that represented novel type I IFNs distinct from IFN-α and -β. They were named IFN-αII. Hauptman and Swetly used a similar strategy to screen a human lymphoma cDNA library, but named the novel IFN they discovered IFN-ω. The latter designation is now accepted (28). The most unusual feature that sets IFN-ω apart from the related IFN-α is that it possesses an extension of six amino acids at the carboxyl terminus. A similar hexapeptide "tail" characterizes IFN-τ (discussed in Section II).

Although IFN-ω is believed to be widespread (29) (Fig. 2), it has not been extensively studied. (For a full discussion of IFN-ω origin and species distribution, see Section IV.) HuIFN-ω is readily induced by viruses in a wide range of cells, including peripheral blood leukocytes and lymphoma cell lines (11, 27, 30, 31) and placental trophoblast cells (32). In the latter it is expressed simultaneously with IFN-β, but in leukocytes, where it constitutes about 15% of the antiviral activity induced by Sendai virus, it is partnered by a mixture of IFN-α.
B. Structure

Recombinant huIFN-ω has been produced in yeast (30), in insect cells by employing baculovirus (33), in Escherichia coli (34), and in Chinese hamster ovary cells (34). A monoclonal antibody (31) raised against the bacterial form was used (34) to purify IFN-ω from the total mixture of IFN released by virally induced human leukocytes. Leukocyte IFN-ω is a mixture of two forms: a predicted 172-amino acid polypeptide with a cysteine at position 1, and a longer 174-amino acid form with a 2-amino acid extension at the N terminus that has been proposed to arise by aberrant cleavage by signal peptidase (35, 36). The two are otherwise identical and are probably transcribed from a single functional gene. HuIFN-ω1 possesses a single N-linked complex biantennary carbohydrate chain on Asn-78 (Asn-80 in the long form) and has a molecular weight calculated from SDS-PAGE of 24,500 (35, 36).

Little information is available about the stability of huIFN-ω. It has been suggested that huIFN-ω, unlike IFN-α, is denatured at low pH (37), but this observation has not been reconciled with several purification procedures in which an acid treatment step is incorporated to precipitate contaminating proteins (27, 32, 35).
C. Genes

The IFN-ω genes (IFNW) are believed to have diverged from the IFNA 116 to 132 million years ago (10, 11), i.e., well before the establishment of eutherian mammals (see Section IV). Curiously, they have been reported to be absent from dog DNA (38), and they have not so far been detected in rodents. However, they are found in diverse mammalian groups, including cattle, goats and sheep (Ruminantia) (11, 27, 29, 39-41), pigs (Suina) (42), horses (Perissodactyla) (43), rabbits (44), and humans (11, 12, 27, 34) (Fig. 2, A and B). In most cases, there appear to be multiple IFNW or IFNW-related genes.

FIG. 2. Genomic Southern blot analysis of gene distribution for IFNW genes (encoding IFN-ω) and IFNT genes (encoding IFN-τ) in a variety of mammalian species (zoo blots). DNA from each species (5-8 µg except from musk ox, where only 3 µg was utilized) was digested to completion with restriction enzyme EcoRI, electrophoresed, and transferred to nylon membrane for hybridization. (A) Blot hybridized with a full-length oTP-1 cDNA probe, expected to recognize both IFNT and IFNW genes but not IFNA and IFNB genes in a variety of species. (B) Same blot after it was stripped and hybridized with a full-length equine IFNW probe to see if the same pattern of genes was identified with this nonruminant IFN-ω probe. (C) Duplicate blot hybridized with a specific IFNT probe derived from the 3' untranslated region of the oTP-1 cDNA. This probe has been shown to recognize only IFNT genes in sheep and cattle, and was utilized to determine how widely distributed similar genes were in other mammalian species. Molecular sizes are indicated in kilobase pairs.
There is only a single functional gene for IFN-ω in human DNA, although there are six pseudogenes (12). The single gene product is therefore IFN-ω1. All the genes are located on the short arm of chromosome 9 in band 9p22-p13 in association with the IFNA (see Section V). As with all other type I genes, IFNW lacks introns and is believed to be about 2 kbp long. The IFNW1 transcript size is 1.2 kb (11, 27).

The IFNW from humans (11, 27, 32) and cattle (45) is virally inducible, but the viral response elements in the promoters have not been well-studied or well-defined. The genes are expressed simultaneously with certain IFN-α in peripheral leukocytes from human blood after the cells are exposed to virus (11, 31, 35, 36). Overexpression of interferon regulatory factor 1 (IRF-1) will induce IFN-ω as well as IFN-α and IFN-β from the transfected human genes in Cos-1 cells (46). These observations suggest that there may be features common to the promoter elements of all these type I IFNs.
D. Receptor

HuIFN-ω1 competes for binding to the common type I receptor complex shared by IFN-α and -β (20, 47). Antibodies against the ligand-binding subunit of this receptor block the antiviral activity of IFN-ω1 (25, 48). As with huIFN-α and IFN-β, huIFN-ω binding to human cells induces tyrosine phosphorylation of the receptor subunits (25), the associated tyrosine kinases tyk2 and Jak-1, and the STAT components of the transcription factor ISGF3 (26). It is presumably through this signaling pathway that IFN-ω exerts its antiviral activity and its ability to up-regulate interferon-responsive genes (49, 50). As with IFN-τ, discussed in the next section, it remains to be seen whether IFN-ω can trigger signaling pathways distinct from those utilized by other type I IFNs and unrelated to their antiviral activities. To date, there are no suggestions that IFN-ω possesses properties that set it apart from IFN-α, but it would not be surprising to find that it does.
E. Function

What then is the function of IFN-ω? It appears to be no less potent than the commoner IFN-α in antiviral and antiproliferative activities (34, 51) and is coinduced with IFN-α in leukocytes by virus. No unusual biological activity has yet been ascribed to it. As with IFN-α and IFN-β, it may prove to have value in controlling viral hepatitis and chronic papillomavirus infections (52), in limiting progression of certain kinds of tumor, or in alleviating autoimmune conditions (53). The type I IFNs are now widely used in treatment of many diseases of humans and animals. Conceivably, IFN-ω may find a special niche among these therapies.
II. Interferon-τ
A. Identification of the Antiluteolytic Factors in Cattle and Sheep as a Type I IFN

IFN-τ was discovered as a result of efforts to identify a factor released from ovine and bovine embryos prior to embryonic attachment to the uterine wall; the factor of interest was responsible for "rescuing" the corpus luteum during early pregnancy (54, 58). A failure to prevent the normal cyclic regression of this ovarian structure results in a decrease in serum progesterone concentration and a subsequent inability of the uterine endometrium, a progesterone-responsive tissue, to support the continued growth and development of the fetus and its membranes. The mechanism that preserves corpus luteum function in these ruminant species is totally unlike that found in the human and higher primates, wherein the hormone chorionic gonadotropin is released by the invading trophoblast tissue, enters the maternal bloodstream, and acts directly via receptors on the luteal cells to promote continued progesterone synthesis (59). In cattle and sheep, and probably in all related ruminant ungulates, including goats, deer, antelopes, and giraffes (29), the antiluteolytic factor produced by the embryo acts locally on the uterus rather than directly on the corpus luteum and, by mechanisms still not understood, prevents the pulsatile release of the luteolytic hormone prostaglandin F2α, whose action on the corpus luteum normally causes luteal cell death and leads to the initiation of a new ovarian cycle (58, 60).

The pregnancy factor responsible for preventing luteolysis was identified by culturing preimplantation sheep embryos flushed from the uterine lumen of pregnant ewes in medium supplemented with radioactive amino acids (55). Two-dimensional electrophoresis identified several isoforms of a protein of Mr approximately 20,000 that was produced transiently in the period immediately preceding firm attachment of the trophoblast (preplacenta) to the uterine wall. Small amounts of this protein, known originally as ovine trophoblast protein-1 (oTP-1) (61), or trophoblastin (54), were purified and shown to be capable of extending estrous cycle length when introduced into the uterine lumen of nonpregnant ewes. Parallel experiments in cattle revealed a protein (bTP-1) immunologically related to oTP-1, but slightly larger (57). The difference in size of the two proteins is now known to be due to the presence of a single asparagine-linked carbohydrate chain, present on bTP-1 but missing on oTP-1 (62-64).
B. Structure

Molecular cloning of oTP-1 and bTP-1 cDNA showed that both were represented by multiple mRNA copies of length approximately 1 kb (63-67). The open reading frames encoded polypeptides 195 amino acids long, which included a 23-amino acid signal peptide. Surprisingly, both trophoblast proteins showed a clear structural resemblance to known type I IFN. For example, there was approximately 50% amino acid sequence identity to IFN-α and ~30% to IFN-β. However, the greatest similarity was to the IFN-ω (~75%), and, like the latter, the trophoblast IFN had an extension of six amino acids at the carboxyl end relative to the IFN-α and -β. The four cysteines involved in intrachain disulfide bridges in IFN-α (1 → 99; 29 → 139) were conserved, and hydrophobicity/hydrophilicity plots for IFN-α, -ω, oTP-1 and bTP-1 were barely distinguishable (68).

That oTP-1 and bTP-1 were indeed IFNs was confirmed by showing that they possessed just about every activity expected of this class of protein (see 58). They have, for example, potent antiviral activity and antiproliferative properties (58, 69); they can activate natural killer cells (70) and can up-regulate a variety of IFN-responsive genes (71, 72). As discussed in Section II,C, IFN-τ will compete with IFN-α for binding to a common type I receptor on uterine endometrium and other tissues, and can activate both STAT1-containing factors ISGF3 and GAF (D. Leaman, A. Alexenko, K. Cox and R. M. Roberts, unpublished results).

Initially, the trophoblast IFNs were considered to be variant forms of IFN-ω (63-66), but they were sufficiently dissimilar in structure and immunogenicity to be given a separate subtype designation (28). Their trophoblast-specific expression, lack of viral inducibility, and the unique promoter regions in their genes (see Section II,D) reinforced the view that they were indeed a distinct subtype of type I IFN.
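The length and identity figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the numeric constants come from the text and Table II, whereas the function names and the short toy sequences are hypothetical and are not real interferon sequences. The identity function assumes the two sequences have already been aligned to equal length, with "-" marking gaps.

```python
# Values taken from the text and Table II.
ORF_LENGTH = 195           # amino acids encoded by the open reading frame
SIGNAL_PEPTIDE_LENGTH = 23
IFN_ALPHA_LENGTH = 166
C_TERMINAL_EXTENSION = 6   # extra residues relative to IFN-alpha and IFN-beta


def mature_length(orf_length, signal_length):
    """Length of the secreted polypeptide once the signal peptide is removed."""
    return orf_length - signal_length


def percent_identity(aligned_a, aligned_b):
    """Percent identical residues between two pre-aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


print(mature_length(ORF_LENGTH, SIGNAL_PEPTIDE_LENGTH))  # mature trophoblast IFN
print(IFN_ALPHA_LENGTH + C_TERMINAL_EXTENSION)           # IFN-alpha plus the six-residue tail
print(percent_identity("ACDEFGHIKL", "ACDEYGHIKV"))      # toy sequences, 8 of 10 positions match
```

Both length calculations give 172 residues, which agrees with the entry for IFN-τ in Table II. Published identity values depend on the alignment used, so the helper should be read as a definition of the measure and not as a way to reproduce the ~50%, ~30%, and ~75% figures.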
C. Binding of IFN-τ to the Type I Receptor

Among the first indications that the antiluteolytic product produced by the sheep embryo is an interferon was its ability to compete with huIFN-α2 for binding to an apparently common receptor (73). This competition between subtypes has since been confirmed for both endometrial tissues and cultured Madin-Darby bovine kidney (MDBK) cells (58, 74-77) (Fig. 3). Both boIFN-α1 and boIFN-τ had similar affinities for the receptors (~3.5 × 10- M) on bovine endometrium, and the binding data were consistent with the presence of only a single receptor class (75). Affinity cross-linking analysis also revealed a major receptor-ligand complex, with an Mr around 130,000 (Fig. 3). Treatment of the cross-linked polypeptide-IFN with N-glycosidase decreased its apparent Mr to ~75,000. Thus, after allowing for the mass of the bound IFN, the receptor polypeptide had a mass of ~55 kDa. Most likely this polypeptide is the "long form" of the ligand-binding subunit (IFNAR2) of the type I receptor known to be capable of binding IFN-α, -β, and -ω (23) (Fig. 1).

The binding and cross-linking data obtained with boIFN-τ on bovine endometrium contrast with observations made on the interaction of a mixture of naturally occurring ovIFN-τ (74) and recombinant boIFN-τ (77) with ovine endometrial membranes. With such combinations of ligand and receptor, at least one additional cross-linked band of ~95,000 is observed. The identity of this complex is unclear because it seems to be too small to represent the other subunit (IFNAR1) of the type I receptor (21, 78); possibly it is an accessory protein associated with the receptor complex. Why cross-linking to ovine receptor reveals this second band whereas bovine cells and tissues do not is puzzling, but the difference probably reflects the relative distribution of reactive amino groups on both the ligand and the polypeptides with which the ligand associates. Similar complex cross-linking patterns have been observed with certain human IFN-α subtypes in their binding to particular cells (79, 80).

FIG. 3. Electrophoretic analysis of cross-linked [125I]boIFN-τ1 and [125I]boIFN-α1 complexes with polypeptides in bovine endometrial membranes and MDBK cells. (A) 20 ng of iodinated IFN was bound (18 hr, 4°C) and cross-linked to either bovine endometrial membranes or to MDBK cells in the presence of either 0 or 400 ng of the alternative IFN. After immunoprecipitation, the IFN-bound complexes were analyzed by electrophoresis in 7.5% polyacrylamide gels. (B) 20 ng of [125I]boIFN-τ1 was bound (18 hr, 4°C) and cross-linked to IFN receptors on MDBK cells (2 × 10^7 cells per reaction) in the presence of either 0 (lane 1) or 500 ng (lane 4) of unlabeled boIFN-τ1. The complex was analyzed in either the presence (lane 1) or the absence (lane 2) of β-mercaptoethanol (BME). The complex was also digested with N-glycosidase F (lane 3). Arrows indicate positions of main radioactive bands. The IFN-τ used, recombinant (r) boIFN-τY2, is genetically engineered IFN-τ with additional tyrosine (Y) residues near the carboxyl terminus, allowing it to be readily iodinated (75).

FIG. 4. Expression of IFN-τ mRNA (a and b) and actin mRNA (c and d) in day 12 ovine conceptuses during the initial period of elongation. Sections were prepared from a single day 12 conceptus (86). In situ hybridization was performed with a 35S-labeled probe specific for the 3' untranslated region of an IFN-τ cDNA or for γ-actin. (a and c) Sections are stained with toluidine blue and are viewed (b and d) by dark-ground illumination. The open arrow shows the embryonic disk; the closed arrow shows the trophectoderm. The solid arrowhead indicates extraembryonic endoderm that detached from the trophectoderm during tissue processing; bar = 100 µm. Note the low signal for IFN-τ mRNA hybridization over the embryonic disk and endoderm, but high silver grain density over trophectoderm (b). By contrast, actin mRNA is present in all cell types (d). This figure was prepared by Charlotte Farin.
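The mass estimate for the cross-linked receptor polypeptide in Section II,C is a simple subtraction, sketched below. The complex masses (~130,000 before and ~75,000 after N-glycosidase treatment) are from the text. The ligand mass of ~20,000 is an assumption for this sketch, borrowed from the Mr reported earlier for the trophoblast protein, and the function name is hypothetical.

```python
def crosslink_mass_breakdown(complex_mr, deglycosylated_complex_mr, ligand_mr):
    """Split a cross-linked receptor-ligand complex into approximate mass contributions.

    complex_mr: apparent Mr of the intact cross-linked complex
    deglycosylated_complex_mr: apparent Mr after N-glycosidase digestion
    ligand_mr: assumed Mr of the bound (deglycosylated) IFN
    """
    if not complex_mr >= deglycosylated_complex_mr >= ligand_mr >= 0:
        raise ValueError("expected complex >= deglycosylated complex >= ligand >= 0")
    return {
        # Mass removed by N-glycosidase; this includes any N-linked sugar on the ligand itself.
        "n_linked_carbohydrate": complex_mr - deglycosylated_complex_mr,
        # What remains of the deglycosylated complex once the ligand is subtracted.
        "receptor_polypeptide": deglycosylated_complex_mr - ligand_mr,
    }


# 130,000 and 75,000 are from the text; 20,000 for the ligand is an assumption.
print(crosslink_mass_breakdown(130_000, 75_000, 20_000))
```

With these inputs the receptor polypeptide comes out at 55,000, matching the ~55 kDa quoted in the text. Apparent masses from gel electrophoresis are approximate, so the result is an estimate and not a measured molecular weight.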
D. Trophoblast-specific Expression

Perhaps the most unusual feature of IFN-τ is its massive and apparently constitutive expression in the outer epithelial layer (trophectoderm) of the developing placenta during the days preceding attachment of the embryo to the uterine wall (81, 82) (Fig. 4). The trophectoderm forms as the cavitating blastocyst develops from a ball of cells constituting the morula at approximately day 7 of pregnancy in cattle and sheep. As the blastocoele cavity expands, two cell types become evident: a cluster of what appear to be undifferentiated cells constituting the inner cell mass, which ultimately gives rise to the embryo proper, and the trophectoderm, a polarized epithelium responsible for pumping fluid into the blastocoelic cavity. A layer of extraembryonic endoderm also quickly grows out from the inner cell mass and attaches to the inner surface of the trophectoderm. About this time (day 8), the blastocyst, which is only about 150 µm in diameter, hatches from the acellular sheet (zona pellucida) that encloses it, but rather than attaching to the uterine wall, as does the human or mouse blastocyst, it continues to be free-floating and to expand until it reaches a diameter of a millimeter or more (82) (Fig. 4).
This sphere of cells then grows and elongates. Within 3 to 4 days, it can reach 10-15 cm in length and occupy most of the uterine lumen. However, it remains only loosely associated with the uterine wall and, with care, can be flushed out in intact form (55). By day 17 in sheep and by days 19-20 in cattle, definitive attachment is evident and invasive binucleate cells present in the trophectoderm begin to invade the uterine epithelium (83).

IFN-τ secretion first occurs as the blastocyst starts to expand and then to hatch (84, 85). It is also about this stage that IFN-τ mRNA can be detected by reverse transcription polymerase chain reaction (PCR) (85). Expression per cell is low in these early blastocysts but increases markedly just before the conceptus begins its elongation (58, 81, 86) (Fig. 4). This increase in expression could be prompted by factors released by the maternal endometrium (84). For example, the cytokines GM-CSF (87) and IL-3 (88) and insulin-like growth factor 1 (IGF-1) (89) have been reported to increase IFN-τ production by cultured ovine conceptuses. IFN-τ transcripts are confined to the mononucleate cells of the trophectoderm, and expression quickly falls as attachment begins (90). At its zenith, at about day 15, a single ovine conceptus can produce well over 200 µg of IFN-τ in a 24-hr period of in vitro culture (91). There seems to be no comparable system whereby type I interferons are produced in such quantity. Amounts must far exceed those that are required to saturate type I IFN receptors present in the tissue abutting the uterine lumen. Possibly a low-affinity receptor is required for antiluteolytic function. Alternatively, the action of IFN-τ may not be strictly local, although there is no convincing evidence that it enters the peripheral blood circulation of pregnant ewes in appreciable quantity.
Perhaps the best explanation is that the embryo must declare its presence when it is quite small, possibly when it is only beginning to elongate and occupies only a fraction of the uterine lumen. To extend its sphere of influence to the rest of the uterine endometrium at a time when the corpus luteum is wavering on the verge of regression in anticipation of prostaglandin F2α (PGF2α) release from the uterus, it is crucial that the embryo produce the antiluteolytic factor in large amounts. The apparently excessive production a few days later (see 58) may be merely an outcome of this early commitment to signal vigorously to the mother.
This pattern of expression contrasts sharply with that of IFN-α, -β, and -ω, whose genes are normally quiescent and are activated only in response to viral infection. The induction of IFN-α and -β expression by virus requires only about 120 bases upstream of the site of transcriptional initiation and depends on many minienhancer sites that provide a flexible and graded response to virus and various modulating stimuli (92-99). A variety of transcription factors bind to this region and have been implicated in the IFN response. By contrast, the genes for IFN-τ have promoter regions that are highly conserved up to approximately 400 bp upstream of the transcription start site (29, 100) (Fig. 5). Although there is limited similarity to other type I genes within the proximal regions of these promoters, most of the general features that apply to the transcriptional regulation of other type I IFNs, including viral induction, seem not to pertain to IFN-τ. Indeed, it seems that it is the promoters of IFNT that, more than any other feature, set these genes apart from all other type I IFNs.
FIG. 5. Nucleotide sequences of a bovine (bTP-1) and an ovine (oTP-p7) IFN-τ gene 5' of the start site for transcription. Differences between the two genes are few and are indicated in the appropriate position on the ovine gene sequence. Several base sequence motifs are illustrated. GAAANN motifs (where N is any nucleotide) are underlined by thick, solid bars; decamers similar to those that bind the Antp gene product are in open boxes, and octamers resembling those that bind the Oct-3 gene product are in bold letters. Sequences similar to those that bind IRF-1 and IRF-2 are highlighted by a narrow bar above the line. The TATA box is underlined. Numbers indicate the number of bases from the transcription start site. Note that bTP and oTP are terms previously used to refer to bovine and ovine IFN-τ, respectively.
E. Transcriptional Regulators
Study of the transcriptional regulation of IFN-τ has been considerably hampered because there are currently no well-defined trophoblast cell lines available from cattle or sheep in which to study IFN-τ expression in vitro. Fortunately, human choriocarcinoma cells, such as JAr, are competent to support constitutive IFN-τ promoter activity from transfected reporter genes and presumably contain a complement of transcription factors compatible with IFN-τ expression (101, 102). This is so despite the apparent absence of IFNT genes in human DNA (see Section VII) and the fact that these cells, when cultured normally and in the absence of virus, do not express any other type I IFN. Transient transfection experiments with JAr cells suggest that two distinct promoter regions are required for full constitutive expression (101, 102). One, proximal to position -126, appears to be necessary for basal expression, whereas a more distal region (-280 to -400) seems to be an enhancer. Electrophoretic mobility-shift assays employing nuclear extracts from ovine embryos prepared during the period of maximal IFN-τ expression are consistent with these transfection assays. For example, a proximal region (-69 to -91) (Fig. 5) was defined that formed high-mobility complexes with nuclear proteins from day 15 embryos. These complexes could not be dissociated by competition with a 100-fold molar excess of DNA derived from the same region of an IFNW gene promoter. In addition, a single complex, specific for the time at which IFN-τ expression was maximal, was formed in association with the -322 to -358 (distal) region of the promoter. The association of transcription factors with these regions is presently being examined by using the yeast single-hybrid system.
Between the -120 proximal region and the distal enhancer region there is a domain of ~200 nucleotides that can be deleted without influencing expression (102). This region also fails to form complexes with nuclear proteins in electrophoretic mobility-shift assays, yet it is well conserved across species and contains several decamers identical or closely related to the sequence ATTTAATTGA (Fig. 5). The latter corresponds to the recognition sequence most favored by the homeodomain regions of products of the Antennapedia (Antp) locus of Drosophila (see 100). A second homeodomain protein, engrailed (en), binds the same 10-base sequence, although its recognition sites seem generally to be found in the opposite orientation relative to the direction of transcription, that is, TCAATTAAAT. The presence of these binding sites has suggested that homeodomain factors might be involved in regulating, and possibly silencing, IFN-τ expression during early development.
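The relationship between the two decamers quoted above can be checked directly: TCAATTAAAT is the reverse complement of ATTTAATTGA, which is what "opposite orientation relative to the direction of transcription" implies. The following is a minimal sketch in Python, not a method used in the work reviewed here. The function names and the short test string are hypothetical and serve only to illustrate a search of both strands for exact matches.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(promoter, motif):
    """Return sorted (0-based offset, strand) pairs for exact motif matches.

    "+" means the motif occurs as written; "-" means its reverse
    complement occurs, i.e., the site lies in the opposite orientation.
    """
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical 27-base test string (not a real promoter) carrying one
# site in each orientation.
toy_promoter = "GGC" + "ATTTAATTGA" + "CC" + "TCAATTAAAT" + "GG"

print(reverse_complement("ATTTAATTGA"))
print(find_sites(toy_promoter, "ATTTAATTGA"))
```

Because only exact matches are reported, decamers that are "closely related" rather than identical to the consensus would require a mismatch-tolerant search, which this sketch does not attempt.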
F. Production of IFN-τ and Its Value in Medicine and Agriculture
IFN-τ (known originally as ovine trophoblast protein-1, or oTP-1) was first purified as a mixture of isoforms from the medium after culturing ovine conceptuses in vitro (55). Purification involved DEAE ion-exchange and gel-filtration chromatography. A protein now known to be identical to IFN-τ/oTP-1, and called trophoblastin, had previously been identified in extracts of sheep embryos (54). Although IFN-τ is the principal secretory product of the preimplantation embryo, purification from this source yields only small amounts of product, depends on access to pregnant ewes, and requires surgical intervention. As a consequence, various procedures for producing recombinant product have been pursued. IFN-τ has, for example, been synthesized in E. coli (103, 104) and yeast (105, 106) in amounts sufficient for large-scale experimental animal testing. The products are active in all in vitro tests for type I IFN and have the ability to extend estrous cycle length in both sheep and cattle (105-107).
Recombinant IFNs may have value as fertility agents for livestock (58, 82, 108, 109). A significant proportion of pregnancies are believed to be lost in mammals because the embryo fails to signal its presence sufficiently robustly at the time that the corpus luteum (CL) is poised on the verge of regression. Administering IFN-τ to pregnant ewes or cows by intramuscular injection may serve to rescue embryos that are lagging in development and are otherwise destined to be lost. Experiments utilizing boIFN-α supplied by CIBA-GEIGY illustrated the potential of this procedure for improving pregnancy success in sheep (108, 109), but not in cattle (110), in which the injections caused pronounced hyperthermia and flulike symptoms (110-112). One other drawback of the procedure is that it might override a maternal mechanism for selecting “good” embryos. However, experiments so far have not indicated any increase in the frequency of abnormal lambs born to ewes treated with IFN during the period of maternal recognition of pregnancy (108, 109).
IFN-τ may have other properties that make it attractive for pharmaceutical purposes. It has been reported to be much less cytotoxic than IFN-α and hence likely to exert fewer side effects in treatment of autoimmune and inflammatory diseases and even cancer (53, 113-115). On the other hand, it seems improbable that ovine or bovine IFNs will be used for human therapy because they are likely to evoke an immune response. Therefore, the reported existence of a human IFN-τ (116) has raised expectations that this IFN might be a valuable therapeutic agent (53, 115). However, as discussed later, the existence of such an IFN-τ still remains in doubt.
G. Do the IFN-τ Possess Unique Functions?
The question has been posed as to whether IFN-τ possesses special biological features that enable it to act as a pregnancy hormone or whether it is effective as a result of being produced in the right place, at the right time, and in quantities sufficient to fulfill its function. Evidence is accumulating that IFN-τ is more efficient at extending estrous cycle length than is IFN-α (see 58), and may be able to induce expression of one or more unique endometrial proteins that are not up-regulated by IFN-α (117, 118). There are precedents for believing that different type I IFNs might differ in their relative activities. For example, huIFN-α8, unlike other huIFN-α tested, has no ability to activate natural killer (NK) cells (19, 119). Similarly, antiviral and antiproliferative properties of various IFN-α subtypes are not necessarily well correlated (120). Human cells lacking the tyk2 component of the signal transduction pathway cannot mount an antiviral response to IFN-α, but can still do so when treated with IFN-β (121). Also, huIFN-α and huIFN-β cause different phosphorylation patterns in the interferon receptor and its associated kinases (122), despite competing with each other for receptor binding. These differences are often quite subtle but have been difficult to reconcile with the presence of a common type I receptor. One explanation is that several signal transduction pathways, which are differentially activated by binding of different ligands, emanate from the receptor. Another is that there are subtype-specific accessory polypeptides associated with the receptor complex that trigger different phosphorylation or second-messenger cascades within the target cell.
III. Comparison of Structures of IFN-ω and IFN-τ with Other Type I Interferons
A. Primary and Secondary Structures
Only in cattle have genes for all four type I subtypes (α, β, ω, and τ) been cloned and sequenced, thereby permitting full pairwise comparisons to be made (9, 11, 13, 14) (Table III). There is a report of an IFNT in human DNA (116), but the nucleotide sequence of the cDNA more closely resembles that of ovIFN-ω (94% identity) than that of ovIFN-τ (86% identity). A comparison of the bovine sequences indicates that boIFN-τ exhibits about 50, 30, and 75% amino acid sequence identity to bovine IFN-α, -β, and -ω, respectively (Table III). These values are consistent with the view that the IFNT diverged from the IFNW relatively recently (see Section IV).
TABLE III
PAIRWISE COMPARISONS OF AMINO ACID SEQUENCES AMONG TYPE I INTERFERONS FROM CATTLE
                                           Similarity (%)
N  IFN subtype  GenBank accession no.      2      3      4      5      6      7
1  IFN-β        M15477                  84.4   28.8   29.8   28.3   29.8   27.7
2  IFN-β        M15478                     -   31.5   31.5   33.2   34.8   34.2
3  IFN-α        M11001                            -   92.6   53.2   50.0   49.5
4  IFN-α        M29314                                   -   50.0   46.3   45.7
5  IFN-ω        M11002                                          -   73.9   72.8
6  IFN-τ        M60913                                                 -   97.4
7  IFN-τ        M31557                                                        -
Compared with IFN-α, both IFN-ω and IFN-τ have six-amino acid extensions at their carboxyl termini. It is unclear how this tail originated, but it may have arisen by a frameshift or a mutation in the stop codon. Comparison of the six extra codons of IFN-τ and IFN-ω with the proximal end of the 3' UTR of IFN-α is not particularly revealing in this regard, most probably because of the length of time that has passed since the genes diverged. It is also unclear whether the extra length of IFN-ω and IFN-τ has any functional significance.
Figure 6 provides an alignment of selected IFN-α, -ω, and -τ. Each structure is based on five major regions of α-helix (helices A, B, C, D, and E). The refinement of the original mouse IFN-β structure (123) shows that it contains an additional short helix (CD) (124). Because this structure appears to be conserved in IFN-τ (125), it is likely to be ubiquitous. IFN-τ and IFN-ω generally possess the conserved cysteines characteristic of IFN-α at positions 1, 29, 99, and 139, although there are several exceptions. In human IFN-ω1, which possesses an additional two residues at its amino terminus (35, 36), these cysteines would be Cys-3, -31, -101, and -141. Similarly, porcine IFN-ω has a deletion of five amino acids between residues 113 and 117, so that the conserved cysteines are at positions 1, 29, 99, and 134.
Such minor changes in the primary sequence, including the two above, probably do not interfere with the formation of the two disulfide bonds, analogous to 1-99 and 29-139, generally considered to be typical of all IFN-α (126, 127) and IFN-ω (34). The Cys29-Cys139 disulfide is essential for biological activity in huIFN-α (128-130), whereas the 1-99 bond seems less important (131). Mouse IFN-β lacks both disulfides, but an engineered disulfide, equivalent to Cys29-Cys139 in huIFN-α, increases its antiviral activity 10-fold (132). All IFN-τ and most IFN-ω have an additional Cys at 86 in the center of helix C, and both bovine (bo) and giraffe (gi) IFN-τ have a Cys at 64 in helix B as well (Fig. 6). Residues 64 and 86, which are present on the antiparallel helices B and C, respectively, may be sufficiently close to each other to permit formation of an additional disulfide bond (see Section III,B). In the case of giIFN-τ, which lacks an equivalent of Cys-99, this bond could provide the second stabilizing disulfide, substituting for 1-99. Several other IFN-τ also possess a sixth cysteine, but not at position 64 (Fig. 6), and these residues are unlikely to be positioned appropriately to interact with Cys-86. Dog IFN-α has a Cys at 69, which has been suggested to form a third disulfide bond, but its existence has not been confirmed experimentally (38).
FIG. 6. An alignment of selected type I IFNs (α, ω, and τ) showing the regions of α-helix (for IFN-τ) and the relative positions of cysteine (C) and proline (P) residues. Capitalized letters indicate full conservation; lowercase, italicized letters indicate residues that are not fully conserved. Numbers indicate amino acid residue. Abbreviations: hu, human; bo, bovine; ca, caprine (goat); eq, equine; gi, giraffe; po, porcine; ra, rabbit.
In addition to comparing the distribution of cysteines among IFNs, Fig. 6 also provides the locations of prolines. As expected, prolines are generally found in nonhelical regions or at the ends of helices. Pro-55, found in helix B of some IFN-α and in some ovIFN-τ, is an exception. The distribution of proline residues confirms that IFN-ω are intermediate in structure between IFN-α and IFN-τ. Two prolines (at positions 26 and 39) are well conserved across all type I IFNs, whereas the one at position 4 (or 5) is generally common to IFN-α and IFN-ω, and the one at 116 to IFN-ω and IFN-τ. The simultaneous presence of prolines at positions 4 and 9 in rabbit IFN-ω and in some ovine IFN-ω strongly suggests that helix A must be shorter in these molecules than its usual span of residues 4-20. Similarly, the absence of Pro-4 in all IFN-τ probably ensures that the conformation of the IFN-τ amino terminus is different from that of either IFN-ω or IFN-α.
Although most huIFN-α do not carry an Asn-X-Thr/Ser consensus glycosylation site, many other type I IFNs, including most IFN-ω and many IFN-τ, do have such a sequence, centered around Asn-78 (or its equivalent). In cases where it has been studied, e.g., huIFN-ω1 (33, 34) and boIFN-τ (62), this asparagine is glycosylated. Despite the lack of N-glycosylation sites on many huIFN-α, there are reports that these molecules are glycoproteins (4, 133, 134), presumably glycosylated at serine or threonine residues. It is unclear whether similar modifications occur on IFN-τ and IFN-ω. The significance of such carbohydrate groups is likewise unclear. Sugar chains may be important for stability, solubility, or control of circulating half-life. In general, however, bacterially produced and natural forms of these IFNs have comparable antiviral activities, although they are likely to differ in antigenicity.
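The Asn-X-Thr/Ser consensus discussed above lends itself to a simple sequence scan. The sketch below is illustrative only and is not taken from the studies cited. It assumes the commonly used refinement that X may be any residue except proline, which the text itself does not state, and it runs on a made-up peptide, not on any interferon sequence. The function name is hypothetical.

```python
import re
from typing import List

# N-glycosylation sequon: Asn, then any residue except Pro (an assumed
# refinement of "X"), then Ser or Thr. The lookahead keeps the match
# one residue wide so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> List[int]:
    """Return 1-based positions of Asn residues that begin a sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Made-up peptide: Asn-3 (N-K-T) and Asn-12 (N-G-S) qualify, whereas
# Asn-7 (N-P-S) is excluded because X is proline.
toy_peptide = "MANKTLNPSAANGSQ"
print(find_sequons(toy_peptide))
```

A match identifies only a potential site. Whether the asparagine is actually glycosylated must be established experimentally, as was done for huIFN-ω1 and boIFN-τ.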
B. Three-dimensional Structures
Although crystals of IFN-α and IFN-β were reported well over a decade ago (135, 136), they have not been of sufficient quality for X-ray crystal structure analysis. However, recombinant murine IFN-β did yield crystals that allowed its structure to be solved (123, 124). That success may stem from the lack of disulfide bonds in muIFN-β, which obviates disulfide interchange reactions and the possibility of oligomer formation. The structural information on muIFN-β has allowed homology models to be constructed for huIFN-α (137-139) and later for IFN-τ (125, 140). The basic feature, five α-helices and the long loop between helices A and B, is remarkably well preserved in all three IFNs and is illustrated for muIFN-β and ovIFN-τ in Fig. 7. The short CD helix, recently demonstrated in the refined muIFN-β structure (123), also appears to be preserved, although it is somewhat shorter in ovIFN-τ than in muIFN-β.
FIG. 7. Stereo overlay of the crystallographically determined structure of murine IFN-β (thin line) and the modeled structure of bovine IFN-τ (thick line). Helix identifications A-E, as well as the N and C termini, are marked. The largest deviation in overlap occurs in loop AB (see text). The C-terminal “tail” part of IFN-τ could not be uniquely modeled. Cys residues that form disulfide bridges (Cys1-Cys99, Cys29-Cys139) are marked by solid circles. From Senda et al. (125) with permission.
BoIFN-τ differs from muIFN-β in three additional respects. The first is the carboxyl tail, which extends nine residues beyond the terminus of muIFN-β and which therefore cannot be modeled. Its conformation relative to the rest of the structure is unknown. It would have a length of ~30 Å if fully extended (e.g., Fig. 7), but it could possibly fold back over the body of the molecule rather than project downward as shown. A second difference is that IFN-τ, as well as IFN-α and IFN-ω, has a three-amino acid insertion in loop AB and a likely disulfide bridge between Cys-29 in that loop and Cys-139 at the beginning of helix E. These features would almost certainly provide conformational differences between them and muIFN-β in a region of loop AB (Leu-22 to Arg-33) thought likely to interact with the receptor (see Section III,C). A final difference is that all IFN-τ and most IFN-ω possess a Gly at 126 in place of the normally conserved Arg at that position (125). In muIFN-β and huIFN-α, this Arg forms a hydrogen bond network with several residues in the distal end of the AB loop; this network probably stabilizes this rather unstructured region in its association with helix D. The Gly replacement at 126 would impair such interactions and may provide more conformational flexibility to the loop of IFN-τ.
As mentioned earlier, boIFN-τ and giIFN-τ have cysteines at positions 64 and 86. Figure 8 illustrates that the side chains of these cysteines are positioned relatively close to each other and could be in an appropriate conformation during folding to form an additional disulfide bond.
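The estimate of roughly 30 Å for the fully extended nine-residue carboxyl tail can be reproduced with a back-of-the-envelope calculation. The sketch below assumes an axial rise of about 3.4 Å per residue, a value typical of an extended, β-strand-like backbone. That figure and the function name are assumptions made here for illustration and are not given in the text.

```python
def extended_length_angstrom(n_residues: int, rise_per_residue: float = 3.4) -> float:
    """Approximate end-to-end length (in angstroms) of an extended chain.

    rise_per_residue is an assumed value; 3.4 angstroms per residue is
    typical of an extended, beta-strand-like backbone.
    """
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * rise_per_residue


# Nine residues project beyond the end of the murine IFN-beta template.
print(round(extended_length_angstrom(9), 1))
```

Varying the assumed rise between about 3.3 and 3.5 Å per residue changes the result by only 1 to 2 Å, so the ~30 Å figure is not sensitive to the exact value chosen.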
C. Receptor-binding Sites
A number of different approaches have been used to define the functionally important regions of type I IFN, including site-directed mutagenesis, limited proteolysis, construction of hybrid IFN molecules from the same or different species, competition with synthetic peptides, and antibodies directed toward particular surface epitopes (reviewed in 141, 142). Recent interpretations of such data applied to the structural models discussed above (Fig. 8) have defined certain “hot areas” on type I IFNs that appear to be the most important regions for receptor binding. These include the central part of loop AB, helix D, and loop DE (138, 141). In addition, it has been suggested that helix A and helix C also make primary contact with one or another of the two known receptor subunits (143). There may be other functionally important regions as well. Site-directed mutagenesis of Lys-160 on IFN-τ, for example, reduces antiviral activity on bovine cells by about 90% without greatly altering receptor-binding affinity (144). The amino terminus of IFN-τ has also been suggested to contribute to its biological activity (145, 146). One possibility is that the sequences on the distal end of helix E and the adjacent amino terminus (see Fig. 8) form a contact with a receptor subunit responsible for signal transduction, e.g., IFNAR1 or some other accessory polypeptide. As emphasized earlier, any unique biological activities that might distinguish IFN-τ from IFN-α could possibly require an additional dedicated subunit.
FIG. 8. The three-dimensional structures of bovine IFN-τ represented in ribbon format. The diagram is based on a model of ovine IFN-τ4 (Brookhaven data base; 1OVI), which was calculated from the atomic coordinates of murine IFN-β (124). The structures were prepared by using SYBYL 6.2 software (Tripos Inc., St. Louis, MO). Three views of the molecule are shown: from beneath (left), from above (right), and a side view (middle). The carboxyl terminus (C) and the amino terminus (N) are shown. Because the most distal nine residues (relative to murine IFN-β) cannot be modeled, the carboxyl terminus here is at about residue 163 and is at the end of helix E. IFN-τ is characterized by five major helices (A-E) and a smaller helix (CD), which is separated from helix C only by Pro-102. The positions of the two disulfides (1-99; 29-139) are shown. A possible third disulfide between Cys-86 and Cys-64 is illustrated and would connect helices B and C. As computed in the murine IFN-β, the distance between the cysteine side chains may be too great to permit such a bond.
IV. Evolution of IFNW and IFNT
A. Coding Region
Phylogenetic analyses of the primary structures and gene sequences of type I IFN show three main clusters in mammals: IFNB, IFNA, and IFNW/T, well separated from the type I IFNs of birds and fish (10). IFNB and IFNA are thought to have arisen by a duplication event occurring at least 250 million years ago (MYA) (4, 147). Calculated values have varied, most likely because assumed mutational rates differ. The divergence occurred after the separation of the mammalian and avian lineages. What remains controversial is whether the different IFNA genes duplicated independently from their progenitor gene after the major eutherian orders diverged, or whether there were already multiple IFNA genes in the earliest eutherian mammals (10, 148). A recent opinion (10), based on comparison of all human and rodent sequences available in 1995, favors the former hypothesis, i.e., duplication occurring late rather than early.
The same controversy exists for IFNW. Multiple IFNW genes have been identified in humans (12), cattle (11, 39), sheep (29, 40, 149), pig (42), horse (43), rabbit (44), and many other mammalian species (29) (see Fig. 2). If rates of mutational change provide a clock, it has been calculated that IFNW diverged from IFNA before the divergence of placental mammals, between 116 and 132 MYA (11). A more recent calculation has given a value of 129 MYA (10). Curiously, IFNW genes are absent from the dog (38) and have not been detected in rodents. Either these genes have been lost or IFNW originated more recently than mutational rates predict. A more detailed analysis of the distribution and sequences of IFNW genes among modern mammals must be undertaken before this question can be properly addressed.
Hughes, in an analysis of type I genes, concluded that the IFNW family failed to show species-specific clusters (10). He has suggested that duplication of IFNW genes occurred relatively early and probably well before primates and Artiodactyla diverged. However, he failed to take into consideration that IFNW and the recently evolved IFNT genes are distinct subtypes. Moreover, his analyses did not include the extensive group of rabbit IFNW genes (94). A follow-up determination by us shows that the sequence similarities of IFNW genes from the same species are relatively high, and that there are, in fact, considerable differences between species. For example, there is over 96% nucleotide sequence conservation of the coding region among the cloned IFNW genes in rabbits (44), over 92% in pigs (42), and about 94% in sheep (29, 40, 146).
By contrast, pig and cattle IFNW genes show only about 83% identity. Again, there are two explanations. Recent duplications could have provided multiple genes of considerable similarity. Alternatively, frequent recombination events between homologous, but relatively ancient, genes may have continued to blend differences and prevent divergence within species. Examination of IFNT genes also reveals species-specific clusters of genes (Fig. 9), although in this case there is also considerable conservation across species and clear differences between them and the IFNW genes of their own species (29, 58). These data strongly suggest that the IFNT genes arose relatively recently and that the duplication events occurred since the divergence from the IFNW genes.
FIG. 9. A phylogenetic tree based on amino acid sequence identities for the type I IFNs from several different species. Protein sequences were obtained from the Swiss Protein, GenBank, and PIR data bases. The tree was established by doing a pairwise alignment that scores the similarity between every possible pair of sequences (UPGMA dendrogram; see 164). The sequences chosen for the analysis from a particular species were selected only if they differed by more than 1% from each other. Each IFN is listed by its databank code identification and by its common name (bo, bovine; ca, goat; do, dog; fe, cat; gi, giraffe; mo, mouse; ov, ovine; ov mo, musk ox; po, pig; ra, rat; rb, rabbit). Distances along the horizontal axis are proportional to the differences between sequences.
Divergence in sequence provides a measure of evolutionary distance. When two genes diverged relatively recently from a common ancestor, the observed “distance” between the two sequences is a reasonably accurate representation of the evolutionary divergence between them. With time, more than one substitution event can occur at a single site, and the observed distance has to be corrected appropriately (150). As a general rule, nucleotide substitutions at synonymous sites, i.e., ones that do not alter the amino acid sequence, are better tolerated than ones that alter the primary polypeptide sequence, and they provide a better clock than substitutions at nonsynonymous sites when short evolutionary time periods are involved. Table IV is a matrix of corrected distances within the coding regions of several representative IFNW and IFNT genes within the suborder Ruminantia. The calculations of average distances are based on the distances between all reported full-length sequences of IFNT and IFNW genes in the species listed. There is presently little evidence to support the presence of IFNT in the two other suborders of Artiodactyla (Suiformes and Tylopoda) (29). The distances between the IFNT genes of cattle (subfamily Bovinae) and those of sheep, goat, and musk ox (subfamily Caprinae) average 10.9 per 100 bases calculated by the Kimura two-parameter method. It is generally accepted that the ancestors of the Bovinae and Caprinae diverged about 20 MYA (148). If it is assumed that the IFNT genes evolved at the same rate in both lineages, the base substitution rate has been 0.271 ± 0.004 per 100 bases per million years (MY). A quite similar substitution rate (0.275 ± 0.008) can be calculated for the single IFNT of the giraffe (family Giraffidae), whose ancestors diverged from Bovinae approximately 24 MYA. If similar calculations are performed for the IFNW genes, the results are quite different. The substitution rate within Bovinae is 0.186 (± 0.015) over 20 MY and is 0.179 (± 0.003) if the bovine and ovine IFNW genes are compared with those of pigs (suborder Suiformes), whose ancestors diverged from the precursors of modern-day ruminants approximately 55 MYA. Clearly, the IFNT genes are evolving almost 50% faster than the IFNW genes, a feature that may be indicative of adaptive diversification within the IFN-τ subtype.
TABLE IV
DISTANCES BETWEEN NUCLEIC ACID SEQUENCES IN CODING REGIONS OF IFNT AND IFNW IN RUMINANTIA^a
Coding region^b       1      2      3      4      5      6      7      8      9     10     11     12
 1 OvIFN-ω1        0.00   5.87   6.06   7.37  17.14  18.23  16.68  17.34  16.56  15.71  16.96  16.33
 2 OvIFN-ω2               0.00   6.61   6.43  16.92  18.00  16.91  17.34  15.92  15.49  16.32  18.35
 3 OvIFN-ω3                      0.00   8.51  17.33  17.97  16.43  17.06  16.96  16.11  17.14  18.26
 4 BoIFN-ω1                             0.00  17.35  18.22  16.47  17.80  16.57  16.14  17.43  18.36
 5 OvIFN-τ1                                    0.00   1.91   3.68   6.62  11.68  10.69  11.87  14.18
 6 GoIFN-τ1                                           0.00   3.50   6.05  12.28  11.27  12.46  15.23
 7 OvIFN-τ2                                                  0.00   4.59  10.88   9.89  11.06  13.34
 8 MuIFN-τ                                                          0.00  12.11  11.11  12.29  15.30
 9 BoIFN-τ1                                                                0.00   1.03   1.91  11.61
10 BoIFN-τ2                                                                       0.00   1.55  10.82
11 BoIFN-τ3                                                                              0.00  11.99
12 GiIFN-τ                                                                                      0.00
^a Distances are the number of substitutions per 100 bases calculated by the Kimura two-parameter method (150) by using the Genetics Computer Group (University of Wisconsin) sequence analysis software package (Version 7.1). The accession numbers for the IFN used in this comparison are X59067 (OvIFN-ω2), M73245 (OvIFN-ω3), M11002 (BoIFN-ω1), X56345 (OvIFN-τ1), M73243 (GoIFN-τ1), X56346 (OvIFN-τ2), M73244 (MuIFN-τ), M31557 (BoIFN-τ3), M31558 (BoIFN-τ1), and M60913 (BoIFN-τ2). OvIFN-ω1 is from Charlier et al. (40) and GiIFN-τ is from this laboratory.
^b Ov, ovine; Bo, bovine; Go, goat; Mu, musk ox; Gi, giraffe.
The results in Table IV confirm the view that IFN-τ and IFN-ω are distinct groupings despite their similarities in primary sequence. If the same base substitution rates noted in Table IV are assumed to have been maintained in the two sets of genes after they diverged from the common IFNT/IFNW ancestor, the branch point for the IFNT and IFNW genes occurred 36.5 (± 0.24) MYA. This predicts that the IFNT gene evolved from the IFNW gene after the appearance of the suborder Ruminantia (45-48 MYA) and before the radiation of the “true” ruminants (Pecora) (24 MYA) (148). The prediction is fully in agreement with the experimental data that there are IFNT in Bovidae and Giraffidae, but not in either pigs (suborder Suiformes) or llamas (family Camelidae, suborder Tylopoda) (29). A tree indicating evolutionary relationships among representative IFNA, IFNW, and IFNT genes is shown in Fig. 9.
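The arithmetic behind the rates and the branch point quoted above can be sketched as follows. This is a minimal illustration and not the authors' procedure, which used the GCG package. The function names are hypothetical. The per-lineage rate is taken as d/(2T), because a pairwise distance d accumulates along both branches, and the split time for two lineages evolving at different rates as d/(r1 + r2). The value of 16.69 substitutions per 100 bases used for the IFNW to IFNT comparison is the coding-region average listed in Table V, and its use here is an assumption. The first function is the standard Kimura two-parameter correction, which returns substitutions per site (multiply by 100 for the units of Table IV).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance (substitutions per site).

    The sequences must already be aligned and of equal length; columns
    containing gaps or ambiguous bases are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def substitution_rate(distance_per_100: float, divergence_my: float) -> float:
    """Per-lineage rate (substitutions per 100 bases per MY)."""
    return distance_per_100 / (2 * divergence_my)


def split_time(distance_per_100: float, rate_a: float, rate_b: float) -> float:
    """Time (MY) since two lineages with different rates separated."""
    return distance_per_100 / (rate_a + rate_b)


# IFNT, cattle versus sheep/goat/musk ox: 10.9 per 100 bases over 20 MY.
print(round(substitution_rate(10.9, 20), 3))
# IFNT versus IFNW branch point, using the rates quoted in the text.
print(round(split_time(16.69, 0.271, 0.186), 1))
# Ratio of the two rates.
print(round(0.271 / 0.186, 2))
```

Under these assumptions the sketch gives values close to those quoted in the text (a rate near 0.27, a branch point near 36.5 MYA, and a rate ratio of about 1.46, i.e., "almost 50% faster"). Small discrepancies are expected because the published figures are averages over individual gene pairs.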
B. Promoter Region
The expression of the IFNT gene is limited to the trophoblast of Ruminantia, and the genes are not inducible by virus. By contrast, the IFNW genes are responsive to virus and are more generally expressed. Presumably these differences can be accounted for by the promoter regions of the respective genes. The first 130 bases of the promoter region of IFNT are highly conserved among sheep, goats, musk ox, cattle, and giraffes (Fig. 10) (29; L. Liu, D. W.
FIG. 10. Alignment of the putative promoter sequence for IFNW and IFNT genes. Gaps (indicated by a dash) have been introduced into the sequences to provide optimal alignments. Sequences identical to those in the respective consensus sequences of IFNW and IFNT genes are indicated by dots. Nucleotides that differ between the consensuses of the IFNW and IFNT genes are marked by an asterisk. Po, pig; Bo, bovine; Ov, ovine; Ca, goat; Ov mo, musk ox; Gi, giraffe. The accession numbers for the IFNs used here are X57196 (PoIFN-ω5), M11002 (BoIFN-ω1), X59067 (OvIFN-ω2), M73245 (OvIFN-ω3), M88771 (OvIFN-τ4), M73241 (OvIFN-τ5), M73242 (OvIFN-τ6), M73243 (GoIFN-τ1), M73244 (MuIFN-τ), and X65539 (BoIFN-τ4). OvIFN-ω1 is from Charlier et al. (40) and GiIFN-τ is from this laboratory (unpublished).
TABLE V
DISTANCES IN NUCLEOTIDE SEQUENCES
Average distance (SEM)a
Comparison       Promoter        Coding region    3' UTR
IFN-τ/IFN-ω      50.78 ± 0.78    16.69 ± 0.11     43.97 ± 0.75
IFN-τ/IFN-τ       9.02 ± 0.85    10.85 ± 0.16     10.05 ± 0.20
IFN-ω/IFN-ω      11.56 ± 0.91     7.44 ± 0.60     8.58b and 31.0
a Distances are the number of substitutions per 100 bases calculated by the Kimura two-parameter method (150). Cattle sequences are compared to sheep, musk ox, and goat sequences to obtain the distances of IFN-τ (IFN-τ/IFN-τ) and IFN-ω (IFN-ω/IFN-ω) in Bovinae and Caprinae. Both intra- and interspecies comparisons are included in calculating the average distance between IFNT and IFNW within the family Bovidae. The promoter region used in the comparison is about 130 bp upstream of the transcription start site, and the 3' UTR is about 300 bp downstream of the stop codon.
b The distance between the bovine IFNW and ovine IFNW2 (8.58) was very different from that (31.0) between bovine IFNW and ovine IFNW1.
The accession numbers for IFN used in this comparison are M11002, M31557, M31558, M60908, M60913, M73241, M73242, M73243, M73244, M73245, M88771, M88772, X56342, X56345, X56346, X59067, and X65539. The sequences for IFNT and IFNW from Charlier et al. (40) are also included in the comparison; they are not in GenBank.
Leaman and R. M. Roberts, unpublished results). In cases where more extended regions of the promoters have been sequenced, conservation is evident up to about 400 bases beyond the transcription start site (see Fig. 5) and is then lost. The evolutionary distances between IFNT genes of Bovinae and Caprinae within the promoter region are quite close to those within the coding region (Table V). Thus, the regions are equally conserved and have evolved at similar rates. The IFNW promoters are also relatively well conserved, not only within these pecoran species, but between them and pigs (suborder Suiformes). Not unexpectedly, however, the rate of substitution has been somewhat higher in the promoter and 3' UTR than in the coding region (Table V). The promoter sequences of the IFNT and IFNW genes bear only a limited resemblance to each other (Fig. 10), and the calculated substitution rate (0.71 per 100 bases per MY) is improbably high. The actual substitution rate within the IFNT promoter over the 24 MY since the beginning of the pecoran radiation is quite modest (~0.23 per 100 bases per MY) and consistent with that within the coding region. It could not account for the differences between the IFNT genes and the IFNW genes. Presumably a more abrupt series of genetic changes, e.g., deletions and recombination events, in the promoter region accompanied the first emergence of the IFNT gene as a distinct subtype. This acquisition of a unique promoter may have been the triggering
event that provided placental expression and negated viral inducibility of these genes. Surprisingly, the entire 3' UTR (~300 bp) of the IFNT genes is as well conserved as the 5' UTR and the open reading frame (ORF) (Table V). The average evolutionary distance (number of substitutions per 100 bases) between the 3' UTR of the IFNT gene of the Bovinae and Caprinae, for example, is only 10.05 ± 0.20. The 3' UTRs of the IFNT and IFNW genes share about 80% (range 74-83%) sequence identity over the first ~120 bases beyond the stop codon, but then diverge markedly (<65%) further downstream. It is for this reason that probes prepared from the 3' UTR can be used for Southern genomic blotting to distinguish IFNT from IFNW genes, whereas probes representing the ORF cannot (29) (Fig. 2). These observations strongly suggest that the duplication event that initially separated the IFNW and IFNT genes was initiated by acquisition, not only of a specialized promoter, but also of a unique region in the 3' end of the genes. It seems possible that the conserved 3' UTR of the IFNT gene plays some role in the control of IFN-τ expression.
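The distances in Table V are stated to be Kimura two-parameter values (150). As a minimal sketch of how such a value could be obtained from a pair of aligned sequences, the Python below applies the standard formula d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the fractions of compared sites showing transitions and transversions. The function names and the toy sequences are hypothetical. The conversion of a distance to a rate as d/(2T), with T the divergence time, is an assumption made here for illustration; the text does not state how its rates were derived.

```python
import math


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance (substitutions per site) for two aligned sequences.

    Columns containing a gap or any character other than A, C, G, T are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    purines = {"A", "G"}
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: ignore this column
        compared += 1
        if a == b:
            continue
        if (a in purines) == (b in purines):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def substitution_rate(distance_per_100, divergence_time_my):
    """Rate per 100 bases per MY, ASSUMING rate = distance / (2 * divergence time).

    The factor of 2 reflects substitutions accumulating along both lineages;
    this conversion is an illustrative assumption, not taken from the source.
    """
    return distance_per_100 / (2 * divergence_time_my)


if __name__ == "__main__":
    # Hypothetical toy alignment, not real IFNT or IFNW sequence.
    d = k2p_distance("ACGTTGCA-ACGT", "ACATTGCAGACGC")
    print(f"K2P distance per 100 bases: {100 * d:.2f}")
```

Multiplying the returned value by 100 gives the "substitutions per 100 bases" unit used in Table V.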
V. Chromosomal Location and Linkage of IFNW and IFNT
Human IFNW genes are closely linked to the IFNA genes within 400 kb of chromosome 9, but a single IFNB gene is placed at the distal end of the cluster relative to the centromere (Fig. 11). There are seven IFNW genes, but only one of them, the most distal, is functional. Diaz et al. (12) have discussed in detail the likely way in which this locus evolved and have drawn attention to the placement of many of the IFNW genes after pairs of IFNA genes. The most likely explanation is that the IFNW genes duplicated in tandem with one or possibly a pair of IFNA genes. Clearly there has been no positive selection to maintain functionality in the majority of the IFNW genes. By contrast, there are 13 apparently functional IFNA genes, one closely related pseudogene, and three additional type I pseudogenes with only limited resemblance to IFNA genes. It is unfortunate that the type I IFN locus of cattle has not yet been physically mapped to determine whether it is organized similarly to that of the human. Such an analysis would probably indicate whether the duplication events that provided the multiple IFNA and IFNW genes occurred after the diversification of the major mammalian orders. Certainly, the apparent absence of IFNW in the dog and in rodents might be more easily accounted for if there were only a single IFNA-IFNW-IFNB gene cluster at the time the carnivore and rodent lineages separated from other eutherian mammals.
FIG. 11. Map of the IFN gene cluster showing the locations of the different IFN genes and pseudogenes as vertical bars. The arrowheads show the directions of transcription inferred for each gene. The gene for IFNW1, the only IFNW that is transcribed, is distally placed relative to the centromere (cen) and located between the genes for IFN-α21 (IFNA21) and IFN-β (IFNB1), which is closest to the telomere (tel). Pseudogenes are designated with a P, e.g., IFNWP15 and IFNAP22. The IFN-related pseudogenes are designated by open bars. From Diaz et al. (12), with permission.
The IFNW and IFNT genes of cattle have been localized to bovine chromosome 8, band 15, by fluorescent in situ hybridization with a full-length boIFN-τ cDNA probe (152). They are closely linked to the IFNA gene. The genes have also been localized to the homologous chromosome banding region in river buffalo (3q15) (153), goat (8q15) (41), and sheep (2p15) (41). Physical linkage of the type I IFN gene families in the bovine genome has been addressed in only an exploratory manner by hybridizing subtype-specific probes to large, endonuclease-restricted DNA fragments that had been separated by pulsed-field gel electrophoresis (154). A tentative order of the genes IFNA-IFNW-IFNT-IFNB was inferred. If correct, it seems likely that the IFNT genes form their own minicluster, separated from the IFNA-IFNW cluster.
VI. Other Atypical Type I Interferons
A. The IFNW Variant (IFNWvar) of Sheep
A type I IFN gene with such limited similarity to ovine IFNW that it appeared to constitute a separate subtype has recently been described (155). However, another analysis (156) shows that the open reading frame of this gene and the first 423 nucleotides upstream of the start codon are almost identical to those of a gene for rabbit IFN-ω (44). There seems little chance that the gene was discovered as the result of rabbit DNA contamination of the ovine genomic library from which it was cloned (156). Moreover, the ovine gene had a unique 3' UTR. Because the IFNWvar gene had an intact ORF, it was possible to express
it in E. coli. Curiously, this IFN had very low antiviral activity on bovine and goat cells but relatively normal activity on sheep and rabbit cells. One possibility is that the IFNWvar gene was acquired by interspecies transfer from rabbits, possibly by a mechanism involving a virus. It will be important to determine whether this gene is closely linked to the true ovine IFNW genes.
B. The Short Type I Interferon (spI) Expressed by Pig Trophoblast
In the period immediately preceding attachment of the pig trophoblast to the uterine wall, the conceptus begins to produce a mixture of type I and type II (IFN-γ) interferons (157-160). There is no evidence that these IFNs play an analogous role to the IFN-τ of cattle, i.e., preventing regression of the corpus luteum and ensuring continued progesterone production. Their function in early pregnancy remains unknown, although it seems likely that they influence the local immune system of the mother. Conceivably they play some role in modulating immune and inflammatory responses at the uterine-placental interface. Low-stringency screening of a day 14-15 pig conceptus cDNA library with a porcine IFN-ω probe allowed the type I IFN to be identified (161). This IFN is highly unusual. It is 149 amino acids long and is only distantly related to other type I IFNs (159) (Fig. 9). Nevertheless, it binds to the type I receptor (162) and has antiviral activity in the range expected for a type I porcine IFN. This activity is not neutralized by any of the common antisera available against other type I IFNs (161). The gene for this short porcine IFN lacks obvious viral response elements, and, like the IFNT gene, appears not to be virally inducible. It seems reasonable to conclude that the short porcine IFN represents yet another type I subtype. It is unclear how widely it is represented in other mammals, although it is seemingly of relatively ancient origin (Fig. 9) and might be expected to be ubiquitous.
VII. Is There a Human IFN-τ?
The cloning of a cDNA from a human term placental cDNA library that encoded an IFN with close similarity to ovine and bovine IFN-τ has been reported (116). In situ hybridization indicates that the gene is expressed primarily in cytotrophoblast. This discovery raises some interesting questions regarding the origin of the IFNT gene, because it implies that these genes arose much earlier in mammalian evolution than had been inferred from phylogenetic trees based on gene and protein sequences (58, 161) (Fig. 9). As discussed in Section II,F, this report has raised considerable interest because of the prospects for using IFN-τ in treatment of human disease, including multiple sclerosis (115) and AIDS (163). Despite the initial cloning (116) and the unpublished observation that there may be as many as seven human IFNT genes (cited in 115), there are reasons to suspect that such genes may not exist. Careful analysis of the inferred amino acid sequence (116) shows that it most resembles that of bovine or ovine IFNW (89-87%) rather than bovine or ovine IFNT (72-75%). Moreover, no such gene has been identified in the 400-kb region encompassing the type I IFN locus on human chromosome 9 (12). Attempts in this laboratory to isolate the gene either by screening a human genomic library with IFNT probes or by PCR amplification from human DNA with specific oligonucleotide primers based on the published huIFN-τ sequence (116) have given only genes corresponding to a human IFNW (T. Ezashi, J. Bixby and R. M. Roberts, unpublished results). Finally, the nucleotide sequence conservation between the putative human gene and that of present day IFNW and IFNT genes of Ruminantia appears much higher than could be expected for species that diverged >65 MYA.
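The argument above rests on comparing a query sequence with IFN-ω and IFN-τ references and asking which it resembles more closely. A minimal sketch of that comparison, assuming the sequences have already been aligned to equal length, might look like the following. The function names and the short toy strings are hypothetical placeholders, not real interferon sequences, and percent identity over ungapped columns is only one of several ways such similarity could be scored.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def closest_reference(query, references):
    """Return (name, percent identity) of the reference most similar to the query.

    `references` maps a label to an aligned sequence of the same length as `query`.
    """
    scores = {name: percent_identity(query, ref) for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Hypothetical toy strings standing in for aligned sequences.
    refs = {"IFNW_like": "ACGTAA", "IFNT_like": "ACCTTA"}
    name, score = closest_reference("ACGTAC", refs)
    print(name, round(score, 1))
```

The same functions work on amino acid or nucleotide alignments, since they only compare characters column by column.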
VIII. Concluding Remarks
The information reviewed here shows that the type I IFNs are considerably more diverse than was originally suspected at the time the first IFN-α and IFN-β were being identified and their genes and cDNA were cloned. Almost certainly, the headcount is not yet complete, and other IFN or IFN-like molecules will emerge. A second conclusion that can be drawn from the studies on IFN-τ, in particular, is that some type I IFNs are produced without the stimulus of disease and can direct normal developmental processes. These IFNs might well be curiosities, restricted as they are to establishment of pregnancy in a rather narrow group of mammals, but they may reflect a mainstream function for IFN as developmental regulators.
ACKNOWLEDGMENTS
We thank Jim Bixby for assisting with the sequence alignments, Charlotte Farin for supplying Fig. 4, Toshihiko Ezashi for unpublished work on the human “tau” genes, and Gail Foristal for assembling the manuscript. The work was supported by NIH Grant HD21896.
REFERENCES
1. W. Henle, J. Immunol. 64, 203 (1950).
2. A. Isaacs and J. Lindenmann, Proc. R. Soc. Lond. (Biol.) 147, 258 (1957).
3. W. E. Stewart II, “The Interferon System.” Springer-Verlag, Berlin and New York, 1979.
4. E. DeMaeyer and J. DeMaeyer-Guignard, “Interferons and Other Regulatory Cytokines.” Wiley, New York, 1988.
5. M. A. Farrar and R. D. Schreiber, Annu. Rev. Immunol. 11, 571 (1993).
6. J. E. Darnell, I. M. Kerr and G. R. Stark, Science 264, 1415 (1994).
7. K. Shuai, C. M. Horvath, L. H. T. Huang, S. A. Qureshi, D. Cowburn and J. E. Darnell, Cell 76, 821 (1994).
8. V. Wilson, A. J. Jeffreys, P. A. Barrie, P. G. Boseley, P. M. Slocombe, A. Easton and D. C. Burke, JMB 166, 457 (1983).
9. D. W. Leung, D. J. Capon and D. V. Goeddel, Bio/Technology, May, 458 (1984).
10. A. L. Hughes, J. Mol. Evol. 41, 539 (1995).
11. D. J. Capon, H. M. Shepard and D. V. Goeddel, MCBiol 5, 768 (1985).
12. M. O. Diaz, H. M. Pomykala, S. K. Bohlander, E. Maltepe, K. Malik, B. Brownstein and O. I. Olopade, Genomics 22, 540 (1994).
13. B. Velan, S. Cohen, H. Grosfeld, M. Leitner and A. Shafferman, JBC 260, 5498 (1985).
14. P. D. Chaplin, G. Entrican, K. I. Gelder and R. A. Collins, J. Interferon Cyt. Res. 16, 25 (1996).
15. N. B. Finter, J. Interferon Res. (special issue), 185 (1991).
16. T. Miyata and H. Hayashida, Nature (London) 295, 165 (1982).
17. A. A. Branca and C. Baglioni, Nature (London) 294, 768 (1981).
18. J. A. Langer and S. Pestka, Immunol. Today 9, 393 (1988).
19. S. Pestka, J. A. Langer, K. C. Zoon and C. E. Samuel, ARB 56, 727 (1987).
20. I. Flores, T. M. Mariano and S. Pestka, JBC 266, 19875 (1991).
21. G. Uzé, G. Lutfalla and I. Gresser, Cell 60, 225 (1990).
22. D. Novick, B. Cohen and M. Rubinstein, Cell 77, 391 (1994).
23. P. Domanski, M. Witte, M. Kellum, M. Rubinstein, R. Hackett, P. Pitha and O. R. Colamonici, JBC 270, 21616 (1995).
24. D. Russell-Harde, H. Pu, M. Betts, R. N. Harkins, H. D. Perez and E. Croze, JBC 270, 26033 (1995).
25. L. C. Platanias, S. Uddin and O. R. Colamonici, JBC 269, 17761 (1994).
26. S. Uddin, A. Chamdin and L. C. Platanias, JBC 270, 24627 (1995).
27. R. Hauptmann and P. Swetly, NARes 13, 4739 (1985).
28. G. Allen, M. O. Diaz, N. B. Finter, G. R. Adolf, J. Doly, E. Lundgren, S. Pestka, R. M. Roberts, D. Testa and J. Wietzerbin, J. Interferon Res. 14, 223 (1994).
29. D. W. Leaman and R. M. Roberts, J. Interferon Res. 12, 1 (1992).
30. G. R. Adolf, J. Gen. Virol. 68, 1669 (1987).
31. G. R. Adolf, Virology 175, 410 (1990).
32. G. Aboagye-Mathiesen, F. D. Toth, C. Juhl, N. Norskov-Lauritsen, P. M. Petersen, V. Zachar and P. Ebbesen, J. Gen. Virol. 72, 1871 (1991).
33. T. Voss, E. Ergalen, H. Ahorn, V. Kubelka, K. Sugiyama, I. Maurer-Fogy and J. Glossl, EJB 217, 913 (1993).
34. G. R. Adolf, B. Fruhbeis, R. Hauptmann, I. Kalsner, I. Maurer-Fogy, E. Ostermann, E. Patzelt, R. Schwendenwein, W. Sommergruber and A. Zophel, BBA 1089, 167 (1991).
35. G. R. Adolf, I. Maurer-Fogy, I. Kalsner and K. Cantell, JBC 265, 9290 (1990).
36. H. Shirono, K. Kono, J. Koga, S. Hayashi, A. Matsuo and H. Hiratani, BBRC 168, 16 (1990).
37. P. Kontsek, L. Borecky and M. Novak, Virology 181, 416 (1991).
38. A. Himmler, R. Hauptmann, G. R. Adolf and P. Swetly, J. Interferon Res. 7, 173 (1987).
39. T. R. Hansen, D. W. Leaman, J. C. Cross, N. Mathialagan, J. A. Bixby and R. M. Roberts, JBC 266, 3060 (1991).
40. M. Charlier, D. Hue, M. Boisnard, J. Martal and P. Gaye, Mol. Cell. Endocrinol. 76, 161 (1991).
41. L. Iannuzzi, G. P. DiMeo, D. S. Gallagher, A. M. Ryan, L. Ferrara and J. E. Womack, J. Heredity 84, 301 (1993).
42. D. Mege, F. Lefèvre and C. LaBonnardière, J. Interferon Res. 11, 341 (1991).
43. A. Himmler, R. Hauptmann, G. R. Adolf and P. Swetly, DNA 5, 345 (1986).
44. M. Charlier, R. L'Haridon, M. Boisnard, J. Martal and P. Gaye, J. Interferon Res. 13, 313 (1993).
45. J. C. Cross and R. M. Roberts, PNAS 88, 3817 (1991).
46. T. Fujita, Y. Kimura, M. Miyamoto, E. C. Barsoumian and T. Taniguchi, Nature (London) 337, 270 (1989).
47. P. Benoit, D. Maguire, I. Plavec, H. Kocher, M. Tovey and F. Meyer, J. Immunol. 150, 707 (1993).
48. P. Eid and M. G. Tovey, J. Interferon Cytokine Res. 15, 205 (1995).
49. S. Y. Hwang, P. J. Hertzog, K. A. Holland, S. H. Sumarsono, M. J. Tymms, J. A. Hamilton, G. Whitty, I. Bertoncello and I. Kola, PNAS 92, 11284 (1995).
50. M. A. Meraz, J. M. White, K. C. F. Sheehan, E. A. Bach, S. J. Rodig, A. S. Dighe, D. H. Kaplan, J. K. Riley, A. C. Greenlund, D. Campbell, K. Carver-Moore, R. N. DuBois, R. Clark, M. Aguet and R. D. Schreiber, Cell 84, 431 (1996).
51. M. Kubes, N. Fuchsberger and P. Kontsek, J. Interferon Res. 14, 57 (1994).
52. N. B. Finter, Biotherapy 7, 151 (1994).
53. H. M. Johnson, F. W. Bazer, B. E. Szente and M. A. Jarpe, Sci. Am. 270, 68 (1994).
54. J. Martal, M. C. Lacroix, C. Loudes, M. Saunier and S. Wintenberger-Torres, J. Reprod. Fertil. 56, 63 (1979).
55. J. D. Godkin, F. W. Bazer, J. Moffatt, F. Sessions and R. M. Roberts, J. Reprod. Fertil. 65, 141 (1982).
56. F. F. Bartol, R. M. Roberts, F. W. Bazer, G. S. Lewis, J. D. Godkin and W. W. Thatcher, Biol. Reprod. 32, 681 (1985).
57. S. D. Helmer, P. J. Hansen, R. V. Anthony, W. W. Thatcher, F. W. Bazer and R. M. Roberts, J. Reprod. Fertil. 79, 83 (1987).
58. R. M. Roberts, J. C. Cross and D. W. Leaman, Endocrine Rev. 13, 432 (1992).
59. J. P. Hearn, G. E. Webley and A. A. Gidley-Baird, J. Reprod. Fertil. 92, 497 (1991).
60. W. W. Thatcher, M. D. Meyer and G. Danet-Desnoyers, J. Reprod. Fertil. Suppl. 49, 15 (1995).
61. J. D. Godkin, F. W. Bazer, W. W. Thatcher and R. M. Roberts, J. Reprod. Fertil. 71, 57 (1984).
62. R. V. Anthony, S. D. Helmer, S. F. Sharif, R. M. Roberts, P. J. Hansen, W. W. Thatcher and F. W. Bazer, Endocrinology 123, 1274 (1988).
63. K. Imakawa, R. V. Anthony, M. Kazemi, K. R. Marotti, H. G. Polites and R. M. Roberts, Nature (London) 330, 337 (1987).
64. K. Imakawa, T. R. Hansen, P.-V. Malathy, R. V. Anthony, H. G. Polites, K. R. Marotti and R. M. Roberts, Mol. Endocrinol. 3, 127 (1989).
65. M. Charlier, D. Hue, J. Martal and P. Gaye, Gene 77, 341 (1989).
66. H. J. Stewart, S. H. E. McCann, A. J. Northrop, G. E. Lamming and A. P. F. Flint, J. Mol. Endocrinol. 2, 65 (1989).
67. S. W. Klemann, K. Imakawa and R. M. Roberts, NARes 18, 6724 (1990).
68. R. M. Roberts, C. E. Farin and J. C. Cross, in “Oxford Reviews of Reproductive Biology” (S. R. Milligan, ed.), Vol. 12, p. 147. Oxford University Press, London, 1977.
69. A. Assal-Meliani, G. Charpigny, P. Reinaud, J. Martal and G. Chaouat, J. Reprod. Immunol. 25, 149 (1993).
70. W. Tuo, T. L. Ott and F. W. Bazer, Am. J. Reprod. Immunol. 29, 26 (1993).
71. H. J. Stewart, F. M. Guesdon, J. H. Payne, B. Charleston, J. L. Vallet and A. P. Flint, J. Reprod. Fertil. Suppl. 45, 59 (1992).
72. B. Charleston and H. J. Stewart, Gene 137, 327 (1993).
73. H. J. Stewart, S. H. E. McCann, P. J. Barker, K. E. Lee, G. E. Lamming and A. P. F. Flint, J. Endocrinol. 115, R13 (1987).
74. T. R. Hansen, M. Kazemi, D. H. Keisler, P. V. Malathy, K. Imakawa and R. M. Roberts, J. Interferon Res. 9, 215 (1989).
75. J. Li and R. M. Roberts, JBC 269, 13544 (1994).
76. P. S. Subramaniam, S. A. Khan, C. H. Pontzer and H. M. Johnson, PNAS 92, 12270 (1995).
77. J. Li, Ph.D. Thesis, University of Missouri-Columbia (1994).
78. E. Mouchel-Vielh, G. Lutfalla, K. E. Mogensen and G. Uzé, FEBS Lett. 313, 255 (1992).
79. O. R. Colamonici, L. M. Pfeffer, F. D'Alessandro, L. C. Platanias, S. A. Gregory, A. Rosolen, R. Nordan, R. A. Cruciani and M. O. Diaz, J. Immunol. 148, 2126 (1992).
80. L. C. Platanias, L. M. Pfeffer, R. Cruciani and O. R. Colamonici, J. Immunol. 150, 3382 (1993).
81. C. E. Farin, K. Imakawa and R. M. Roberts, Mol. Endocrinol. 3, 1099 (1989).
82. R. M. Roberts, BioEssays 13, 121 (1991).
83. F. B. F. Wooding, Placenta 4, 527 (1983).
84. J. J. Hernandez-Ledezma, N. Mathialagan, C. Villanueva, J. D. Sikes and R. M. Roberts, Mol. Reprod. Dev. 36, 1 (1993).
85. J. J. Hernandez-Ledezma, J. D. Sikes, C. N. Murphy, A. J. Watson, G. A. Schultz and R. M. Roberts, Biol. Reprod. 47, 374 (1992).
86. C. E. Farin, K. Imakawa, T. R. Hansen, J. J. McDonnell, C. N. Murphy, P. W. Farin and R. M. Roberts, Biol. Reprod. 43, 210 (1990).
87. K. Imakawa, S. D. Helmer, K. P. Nephew, C. S. R. Meka and R. K. Christenson, Endocrinology 132, 1869 (1993).
88. K. Imakawa, K. Tamura, W. J. McGuire, S. Khan, L. A. Harbison, J. P. Stanga, S. D. Helmer and R. K. Christenson, Endocrine 3, 511 (1995).
89. Y. Ko, C. Y. Lee, T. L. Ott, M. A. Davis, R. C. M. Simmen, F. W. Bazer and F. A. Simmen, Biol. Reprod. 45, 135 (1991).
90. M. Guillomot, C. Michel, P. Gaye, N. Charlier, J. Trojan and J. Martal, Biol. Cell 68, 205 (1990).
91. C. J. Ashworth and F. W. Bazer, Biol. Reprod. 40, 425 (1989).
92. H. Ragg and C. Weissmann, Nature 303, 439 (1983).
93. N. J. MacDonald, D. Kuhl, D. Maguire, D. Naf, P. Gallant, A. Goswamy, H. Hug, H. Bueler, M. Chaturvedi, J. de la Fuente, H. Ruffner, F. Meyer and C. Weissmann, Cell 60, 767 (1990).
94. W. Du, D. Thanos and T. Maniatis, Cell 74, 887 (1993).
95. M. Miyamoto, T. Fujita, Y. Kimura, M. Maruyama, H. Harada, Y. Sudo, T. Miyata and T. Taniguchi, Cell 54, 903 (1988).
96. H. Harada, T. Fujita, M. Miyamoto, Y. Kimura, M. Maruyama, A. Furia, T. Miyata and T. Taniguchi, Cell 58, 729 (1989).
97. M. J. Lenardo, C.-M. Fan, T. Maniatis and D. Baltimore, Cell 57, 287 (1989).
98. N. B. Raj, W. C. Au and P. M. Pitha, JBC 266, 11360 (1991).
99. D. Thanos and T. Maniatis, Cell 83, 1091 (1995).
100. R. M. Roberts, D. W. Leaman, J. J. Hernandez-Ledezma and N. C. Cosby, in “Trophoblast Cells: Pathways for Maternal-Embryonic Communication” (M. J. Soares, S. Handwerger and F. Talamantes, eds.), Chap. 14, p. 206. Springer-Verlag, Berlin and New York, 1993.
101. J. C. Cross and R. M. Roberts, PNAS 88, 3817 (1991).
102. D. W. Leaman, J. C. Cross and R. M. Roberts, Mol. Endocrinol. 8, 456 (1994).
103. S. W. Klemann, J. Li, K. Imakawa, J. C. Cross, H. Francis and R. M. Roberts, Mol. Endocrinol. 4, 1506 (1990).
104. J. Li, A. P. Alexenko and R. M. Roberts, Protein Expression Purif. 6, 401 (1995).
105. J. Martal, E. Degryse, G. Charpigny, N. Assal, P. Reinaud, M. Charlier, P. Gaye and J. P. Lecocq, J. Endocrinol. 127, R5 (1990).
106. T. L. Ott, G. Vanheeke, H. M. Johnson and F. W. Bazer, J. Interferon Res. 11, 357 (1991).
107. K. D. Niswender, J. Li, K. R. Loos, M. R. Powell, R. M. Roberts, D. H. Keisler and M. F. Smith, Biol. Reprod. Suppl. 50, 89 (1994) (abstract).
108. K. P. Nephew, K. E. McLure, M. L. Day, S. Xie, R. M. Roberts and W. F. Pope, J. Anim. Sci. 68, 2766 (1990).
109. T. K. Schalue-Francis, P. W. Farin, J. C. Cross, D. Keisler and R. M. Roberts, J. Reprod. Fertil. 91, 347 (1991).
110. C. M. Barros, J. G. Betts, W. W. Thatcher and P. J. Hansen, J. Endocrinol. 133, 175 (1992).
111. G. R. Newton, S. Martinod, P. J. Hansen, W. W. Thatcher, B. Siegenthaler, C. Gerber and M.-J. Voirol, J. Dairy Sci. 73, 3439 (1990).
112. M. D. Meyer, P. J. Hansen, W. W. Thatcher, M. Drost and R. M. Roberts, J. Dairy Sci. 78, 1470 (1995).
113. C. H. Pontzer, F. W. Bazer and H. M. Johnson, Cancer Res. 51, 5304 (1991).
114. F. W. Bazer and H. M. Johnson, Am. J. Reprod. Immunol. 26, 19 (1991).
115. J. M. Soos, P. S. Subramaniam, A. C. Hobeika, J. Schiffenbauer and H. M. Johnson, J. Immunol. 155, 2747 (1995).
116. A. E. Whaley, C. S. R. Meka, L. A. Harbison, J. S. Hunt and K. Imakawa, JBC 269, 10864 (1994).
117. B. R. Rueda, K. A. Naivar, E. M. George, K. J. Austin, H. Francis and T. R. Hansen, J. Interferon Res. 13, 303 (1993).
118. K. A. Naivar, S. K. Ward, K. J. Austin, D. W. Moore and T. R. Hansen, Biol. Reprod. 52, 848 (1995).
119. J. R. Ortaldo, R. B. Herberman, C. Harvey, P. Osheroff, Y. C. E. Pan, B. Kelder and S. Pestka, PNAS 81, 4926 (1984).
120. R. Hu, Y. Gan, J. Liu, D. Miller and K. C. Zoon, JBC 268, 12591 (1993).
121. L. Velazquez, M. Fellous, G. R. Stark and S. Pellegrini, Cell 70, 313 (1992).
122. C. Abramovich, L. M. Shulman, E. Ratovitski, S. Harroch, M. Tovey, P. Eid and M. Revel, EMBO J. 13, 5871 (1994).
123. T. Senda, T. Shimazu, S. Matsuda, G. Kawano, H. Shimizu, K. T. Nakamura and Y. Mitsui, EMBO J. 11, 3193 (1992).
124. T. Senda, S.-I. Saitoh and Y. Mitsui, JMB 253, 187 (1995).
125. T. Senda, S.-I. Saitoh, Y. Mitsui, J. Li and R. M. Roberts, J. Interferon Cytokine Res. 15, 1053 (1995).
126. R. Wetzel, Nature (London) 289, 606 (1981).
127. N. B. Lydon, C. Favre, S. Bove, O. Neyret, S. Benureau, A. M. Levine, G. F. Seelig, T. L. Nagabhushan and P. P. Trotta, Bchem 24, 4131 (1985).
128. H. Morehead, P. D. Johnston and R. Wetzel, Bchem 23, 2500 (1984).
129. G. J. Waine, M. J. Tymms, E. R. Brandt, B. F. Cheetham and A. W. Linnane, J. Interferon Res. 12, 43 (1992).
130. M. W. Beilharz, I. T. Nisbet, M. J. Tymms, P. J. Hertzog and A. W. Linnane, J. Interferon Res. 6, 677 (1986).
131. I. T. Nisbet, M. W. Beilharz, P. J. Hertzog, M. J. Tymms and A. W. Linnane, Biochem. Int. 11, 301 (1985).
132. C. Day, B. Schwartz, B.-L. Li and S. Pestka, J. Interferon Res. 12, 139 (1992).
133. K. C. Zoon, D. Miller, J. Bekisz, D. zur Nidden, J. C. Enterline, N. Y. Nguyen and R. Hu, JBC 267, 15210 (1992).
134. G. R. Adolf, I. Kalsner, H. Ahorn, I. Maurer-Fogy and K. Cantell, BJ 276, 511 (1991).
135. D. L. Miller, H. Kung and S. Pestka, Science 215, 689 (1982).
136. S. Matsuda, G. Kawano, S. Itoh, Y. Mitsui and Y. Iitaka, JBC 261, 16207 (1986).
137. N. J. Murgolo, W. T. Windsor, A. Hruza, P. Reichert, A. Tsarbopoulos, S. Baldwin, E. Huang, B. Pramanik, S. Ealick and P. P. Trotta, Proteins: Struct. Funct. Genet. 17, 62 (1993).
138. A. P. Korn, D. R. Rose and E. N. Fish, J. Interferon Res. 14, 1 (1994).
139. M. H. Seto, R. N. Harkins, M. Adler, M. Whitlow, W. B. Church and E. Croze, Protein Sci. 4, 655 (1995).
140. M. A. Jarpe, H. M. Johnson, F. W. Bazer, T. L. Ott, E. Curto, N. R. Krishna and C. H. Pontzer, Protein Eng. 7, 863 (1994).
141. Y. Mitsui, T. Senda, T. Shimazu, S. Matsuda and J. Utsumi, Pharmacol. Therap. 58, 93 (1993).
142. P. Kontsek, Acta Virol. 38, 345 (1994).
143. G. Uzé, G. Lutfalla and K. E. Mogensen, J. Interferon Cyt. Res. 15, 3 (1995).
144. J. Li and R. M. Roberts, JBC 269, 24826 (1994).
145. C. H. Pontzer, T. L. Ott, F. W. Bazer and H. M. Johnson, PNAS 87, 5945 (1990).
146. C. H. Pontzer, T. L. Ott, F. W. Bazer and H. M. Johnson, J. Interferon Res. 14, 133 (1994).
147. D. Gillespie, E. Pequignot and W. E. Carter, in “Handbook of Experimental Pharmacology” (P. E. Came and W. A. Carter, eds.), Vol. 71, p. 45. Springer-Verlag, New York, 1984.
148. D. Gillespie and W. Carter, J. Interferon Res. 3, 83 (1983).
149. K. P. Nephew, A. E. Whaley, R. K. Christenson and K. Imakawa, Biol. Reprod. 48, 768 (1993).
150. M. Kimura, J. Mol. Evol. 16, 111 (1980).
151. M. M. Miyamoto, F. Kraus, P. J. Laipis, S. M. Tanhauser and S. D. Webb, in “Mammal Phylogeny” (F. S. Szalay, M. J. Novacek and M. C. McKenna, eds.), p. 268. Springer-Verlag, Berlin and New York, 1993.
152. A. M. Ryan, D. S. Gallagher and J. E. Womack, Cytogenet. Cell Genet. 63, 6 (1993).
153. L. Iannuzzi, D. S. Gallagher, A. M. Ryan, G. P. DiMeo and J. E. Womack, Cytogenet. Cell Genet. 62, 224 (1993).
154. A. M. Ryan and J. E. Womack, Animal Genet. 24, 9 (1993).
155. A. E. Whaley, R. S. Carroll and K. Imakawa, Gene 106, 281 (1991).
156. L. Liu, D. W. Leaman, J. A. Bixby and R. M. Roberts, BBA 1294, 55 (1996).
157. J. C. Cross and R. M. Roberts, Biol. Reprod. 40, 1109 (1989).
158. M. A. Mirando, J. P. Harney, S. Beers, C. H. Pontzer, B. A. Torres, H. M. Johnson and F. W. Bazer, J. Reprod. Fertil. 88, 197 (1990).
159. F. Lefèvre, F. Martinat-Botte, M. Guillomot, K. Zouari, B. Charley and C. LaBonnardière, Eur. J. Immunol. 20, 2485 (1990).
160. C. LaBonnardière, F. Martinat-Botte, M. Terqui, F. Lefèvre, K. Zouari, J. Martal and F. W. Bazer, J. Reprod. Fertil. 91, 469 (1991).
161. F. Lefèvre and V. Boulay, JBC 268, 19760 (1993).
162. P. D. Niu, F. Lefèvre and C. LaBonnardière, J. Interferon Cyt. Res. 15, 769 (1995).
163. N. Dereuddre-Bosquet, P. Clayette, M. Martin, A. Mabondzo, P. Fretier, G. Gras, J. Martal and D. Dormont, J. Acq. Immune Defic. Syndromes Human Retrovirol. 11, 241 (1996).
164. P. H. A. Sneath and R. R. Sokal, in “Numerical Taxonomy,” p. 230. W. H. Freeman, San Francisco, 1973.
General Transcription Factors for RNA Polymerase II¹
RONALD C. CONAWAY AND JOAN WELIKY CONAWAY
Program in Molecular and Cell Biology
Oklahoma Medical Research Foundation
Oklahoma City, Oklahoma 73104
I. TFIID and Formation of the First Stable Intermediate in Assembly of the Preinitiation Complex
II. TFIIB and Selective Binding of RNA Polymerase II to the TFIID-Core Promoter Complex
III. TFIIF and Assembly of the Active Preinitiation Complex
IV. Roles of TFIIE and TFIIH in Formation and Activation of the Assembled Preinitiation Complex
V. Overview of RNA Polymerase II General Elongation Factors
VI. SII and Nascent Transcript Cleavage
VII. The Elongation Activity of TFIIF
VIII. The Elongin (SIII) Complex and von Hippel-Lindau Disease
IX. ELL and Acute Myeloid Leukemia
References
Messenger RNA synthesis is a major site for the regulation of gene expression. Eukaryotic mRNA synthesis is an elaborate biochemical process catalyzed by multisubunit RNA polymerase II and controlled by the concerted action of a diverse collection of transcription factors that fall into at least three functional classes: (1) DNA binding transactivators, which activate expression of specific genes or gene families by increasing the rate of initiation (1) or, as shown recently, the efficiency of elongation (2-4) by RNA polymerase II; (2) coactivators, such as the SRB-containing Mediator complex (5, 6), Creb binding protein (CBP) (7-9), and PC4 (10, 11), which are required for transcriptional activation and appear to function by promoting essential communication between DNA binding transactivators and the RNA polymerase II initiation complex; and (3) the general transcription factors, which are characterized by their ability to function in intimate association with RNA polymerase II as components of the preinitiation, initiation, and elongation complexes, by their apparent roles in transcription of most, if not all, eukaryotic protein-coding genes, and by their striking structural and functional conservation in eukaryotes from yeast to man (12-14).
¹Abbreviations: RAP, RNA Polymerase associated Protein; TFG, transcription factor g.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 56
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
RONALD C. CONAWAY AND JOAN WELIKY CONAWAY
The past decade was a golden age for biochemical studies of the general transcription factors. During this time, many general transcription factors were identified and purified to homogeneity, and working models for their roles in transcription were established. Among the most striking features of the general transcription factors are their sheer number and rich functional diversity. To date, more than 10 general transcription factors have been purified to homogeneity and classified according to the transcriptional stage they regulate. Biochemically defined general transcription factors include the general initiation factors TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH, which function in the preinitiation and initiation stages of transcription to expedite selective binding of RNA polymerase II to promoters and to promote synthesis of the first few phosphodiester bonds of nascent transcripts (12), and the general elongation factors SII, P-TEFb, TFIIF, elongin (SIII), and ELL, which function by a variety of mechanisms to promote efficient elongation of transcripts by RNA polymerase II (13-15). Here we review the general transcription factors with particular emphasis on their roles in the preinitiation, initiation, and elongation stages of transcription by RNA polymerase II.
I. TFIID and Formation of the First Stable Intermediate in Assembly of the Preinitiation Complex

Transcription initiation by RNA polymerase II is preceded by assembly of polymerase and the general initiation factors into an active preinitiation complex at the core region of a class II promoter. Core promoters transcribed by RNA polymerase II are structurally diverse (16). Whereas a small subset of core promoters contains both a consensus TATA box (consensus sequence TATAAAA) located ~30 base pairs upstream of the transcriptional start site and a strong initiator (Inr) element [consensus sequence (Y),-,CANT(Y),-,] surrounding the transcriptional start site, a large number of core promoters lack a consensus TATA box, a strong Inr element, or both. In some TATA-less promoters, the TATA box is replaced by an (A + T)-rich sequence that differs from the consensus TATA box, but is nevertheless a functional promoter element. In other TATA-less promoters, a functional promoter element at -30 appears to be lacking entirely.

RNA polymerase II preinitiation complexes assemble at many TATA and TATA-less core promoters by a common pathway (17) (Fig. 1). Assembly of the preinitiation complex is nucleated by sequence-specific binding of TFIID to the core promoter to form the nucleoprotein recognition site for RNA polymerase II and the other general initiation factors on the DNA. RNA polymerase II, assisted by TFIIB and TFIIF, can then bind selectively to the
RNA POLYMERASE II TRANSCRIPTION FACTORS
[Figure 1. Panels: TATA+ Inr+ (AdML) and TATA- Inr+ promoters. Intermediates shown: unstable complex, committed complex, stable committed complex, complete complex.]

FIG. 1. Similar pathways for assembly of the RNA polymerase II preinitiation complex at many TATA and TATA-less promoters. TATA, consensus TATA element; Inr, strong initiator element; AdML, adenovirus 2 major late promoter; POL II, RNA polymerase II; IIB, TFIIB; IIF, TFIIF; IIE, TFIIE; IIH, TFIIH.
TFIID-promoter complex. Assembly of the complete preinitiation complex is accomplished by binding of TFIIE and TFIIH.

TFIID is the only general initiation factor that possesses measurable sequence-specific DNA binding activity and is therefore believed to be largely responsible for sequence-specific recognition of the core promoter by the preinitiation complex. TFIID is a multisubunit complex composed of the TATA box binding protein (TBP) and as many as 10 additional polypeptides, TBP-associated factors (TAFs) (18-22). At least two TFIID subunits possess sequence-specific DNA binding activities: TBP, which binds to a variety of consensus and nonconsensus TATA boxes (23, 24), and TAF150, which binds the initiator (Inr) element (25-27). The combined DNA binding activities of these two TFIID subunits may account for the sequence-specific DNA binding properties of TFIID.

TFIID is believed to direct assembly of the RNA polymerase II preinitiation complex at all class II core promoters, even though it binds most avidly to promoters that have consensus TATA boxes and strong Inr elements, such as the AdML promoter. At the AdML promoter, TFIID alone is sufficient for formation of the first stable intermediate on the pathway to assembly of the preinitiation complex. At this promoter, sequence-specific interactions between the TATA box and TBP and between the Inr and TAF150 or
other subunits of TFIID apparently provide sufficient binding energy for formation of a stable, committed intermediate. At many other promoters, which lack consensus TATA boxes or strong Inr elements, TFIID binds sequence specifically but weakly to form unstable preinitiation intermediates. At these promoters, formation of the first stable preinitiation intermediate depends strongly on TFIID and both TFIIB and RNA polymerase II; these intermediates are further stabilized by binding of TFIIF (17).
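The distinction drawn in this section between core promoters that carry a consensus TATA box near -30 and those that do not can be made concrete with a small sequence scan. The following is only a minimal sketch: it looks for exact matches to the consensus TATAAAA quoted above, and the search window, the function names, and the demonstration sequence are all hypothetical choices made for illustration. Exact matching is a deliberate simplification, since the text notes that TBP also binds nonconsensus TATA boxes, so a negative result here means only that the strict consensus is absent.

```python
import re

# Consensus TATA box as quoted in the text; everything else is illustrative.
TATA_CONSENSUS = "TATAAAA"


def find_tata_boxes(seq, tss_index, window=(-40, -20)):
    """Return start positions (relative to the start site) of exact
    consensus TATA boxes found upstream of the transcriptional start site.

    seq       -- promoter sequence, 5'->3', nontemplate strand
    tss_index -- 0-based index of the transcriptional start site in seq
    window    -- hypothetical range of allowed box start positions, in bp
                 relative to the start site (an assumption, not from the text)
    """
    seq = seq.upper()
    lo = max(0, tss_index + window[0])
    hi = min(len(seq), tss_index + window[1] + len(TATA_CONSENSUS))
    # A lookahead pattern reports overlapping matches as well.
    pattern = "(?=" + re.escape(TATA_CONSENSUS) + ")"
    return [lo + m.start() - tss_index for m in re.finditer(pattern, seq[lo:hi])]


def classify_core_promoter(seq, tss_index):
    """Label a promoter by presence or absence of a strict consensus TATA box."""
    if find_tata_boxes(seq, tss_index):
        return "consensus TATA box present"
    return "no consensus TATA box in window"


# Made-up sequence with a consensus TATA box starting 30 bp upstream of
# the start site (index 50); not a real promoter.
demo = "GC" * 10 + "TATAAAA" + "G" * 23 + "A" + "C" * 10
print(find_tata_boxes(demo, tss_index=50))        # [-30]
print(classify_core_promoter("GC" * 30, 50))      # no consensus TATA box in window
```

A scan of this kind says nothing about TFIID binding strength; it only separates sequences by the presence of the strict consensus, which is the first of the several promoter classes discussed above.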
II. TFIIB and Selective Binding of RNA Polymerase II to the TFIID-Core Promoter Complex

Selective binding of RNA polymerase II to the TFIID-core promoter complex requires TFIIB (28). TFIIB from higher eukaryotes is a protein of ~35 kDa; it was initially purified to homogeneity from rat liver (29) and subsequently from human cells (30) and Drosophila melanogaster (31). TFIIB from yeast is a protein of ~41 kDa (32, 33).

TFIIB functions as a “bridging factor” that promotes binding of RNA polymerase II to core promoters by direct and stable interactions with polymerase and TFIID or its TBP subunit (28). Distinct regions of TFIIB appear to mediate its interactions with RNA polymerase II and TBP. A protease-resistant C-terminal core, which contains two 84-amino acid direct repeats, is sufficient for interaction with the TBP-promoter complex. A protease-susceptible N-terminal region, which contains a putative zinc-finger structure (CysX,CysX,,CysX,Cys), is required for selective binding of polymerase to the TBP-promoter complex (34-40).

Substantial evidence indicates that TFIIB plays an important role in establishing the proper spacing between the TATA box and transcriptional start site. Some mutations in Saccharomyces cerevisiae TFIIB lead to dramatic alterations in the position of the transcriptional start site (33). In addition, in an S. cerevisiae in vitro transcription system, replacement of TFIIB and RNA polymerase II from S. cerevisiae with their Schizosaccharomyces pombe counterparts is sufficient to change the transcriptional start site to the position characteristic of S. pombe (41).
III. TFIIF and Assembly of the Active Preinitiation Complex

Although TFIIF is not essential for entry of RNA polymerase II into the preinitiation complex (17, 42, 43), it strongly stabilizes the binding of polymerase and TFIIB to the TFIID-core promoter complex (17, 44-46). The
role of TFIIF in transcription initiation is most likely not limited to stabilizing binding of RNA polymerase II to the core promoter, however, because initiation depends strongly on TFIIF even under conditions where the factor makes only a relatively small contribution to the strength of polymerase binding (17). In addition, although transcription initiation from nearly all promoters tested shows a very strong dependence on TFIIF, RNA polymerase II is capable of initiating transcription in vitro from a human IgH promoter in the absence of TFIIF, in a reaction requiring a negatively supercoiled DNA template and depending only on TFIID and TFIIB (42).

In higher eukaryotes, TFIIF is a heterodimer composed of ~30- (RAP30) and ~70-kDa (RAP74) subunits, which were first identified among a small group of proteins capable of binding to immobilized RNA polymerase II (47). TFIIF was initially identified as one of the general initiation factors (29, 48) and purified to homogeneity from rat liver (49), Drosophila melanogaster Kc cells (50), and subsequently human cells (51, 52). In S. cerevisiae, TFIIF is a heterotrimer composed of ~105- (TFG1), ~54- (TFG2), and ~30-kDa (TFG3) subunits (53, 54).

TFIIF shares structural and functional properties with bacterial σ factors (12, 55). First, like bacterial σ factors, TFIIF binds stably to its cognate RNA polymerase. As mentioned above, human RAP30 and RAP74 were initially identified among a small group of proteins that bind to immobilized RNA polymerase II (47, 56, 57); in addition, human (52, 58, 59), D. melanogaster (50), rat (60), and S. cerevisiae (53) TFIIF have each been shown to associate stably with RNA polymerase II in solution. On isolating a cDNA encoding human RAP30, Greenblatt and co-workers identified a RAP30 region sharing sequence similarity with the RNA polymerase binding domains of Escherichia coli σ70 and Bacillus subtilis σ43 (61).
In subsequent experiments, it was observed that a serine residue located at the C terminus of this RAP30 region is protected from phosphorylation by cAMP-dependent protein kinase when TFIIF and RNA polymerase II are mixed, suggesting that TFIIF binds to RNA polymerase II at least in part through this region (62). In the same study, it was also observed that TFIIF is capable of binding to E. coli RNA polymerase and can be displaced by σ70. Taken together, these results suggest that RAP30 and σ70 have related RNA polymerase-binding domains.

Second, a variety of evidence suggests that, like bacterial σ factors, TFIIF promotes stable binding of its cognate polymerase to promoters through a direct interaction with DNA. Results from restriction-site protection (44) and phenanthroline-copper footprinting (45) experiments indicate that TFIIF promotes formation of stable protein-DNA contacts between the TATA box and transcriptional start site, either by promoting binding of RNA polymerase II, TFIIB, TFIID, or some combination of these proteins to this
core promoter region or by binding directly to DNA. More recent results suggest that the RAP30 and RAP74 subunits of TFIIF contain C-terminal regions capable of interacting directly with DNA (63, 64). The RAP30 C terminus exhibits statistically significant similarity to the C terminus of B. subtilis σK (65). This region of σK includes the highly conserved σ homology region 4. Within this region is a helix-turn-helix DNA binding motif that is a cryptic DNA binding domain that interacts with the -35 element of bacterial promoters (66, 67). Using the approach of Gross and co-workers (66, 67), we obtained evidence that the RAP30 C terminus is also a cryptic DNA binding domain (63). Consistent with the hypothesis that the DNA binding activity of RAP30 is relevant to its function in transcription initiation, (1) RAP30 can be cross-linked in the preinitiation complex to core promoter sequences between the TATA box and transcriptional start site (68), (2) the RAP30 C terminus, including the cryptic DNA binding domain, is critical for TFIIF activity in transcription initiation (63, 65, 69, 70), and (3) there is a strong correlation between the effects of RAP30 C-terminal deletion mutations on RAP30 DNA binding and transcription activities (63). Despite the ability of both of its subunits to interact with DNA, TFIIF most likely does not make a major contribution to sequence-specific recognition of promoter sequences by RNA polymerase II, because substantial evidence argues that TFIID is the primary determinant of sequence specificity in assembly of the preinitiation complex.

Third, TFIIF has been shown to prevent nonselective binding of RNA polymerase II to nonpromoter sites on DNA, much as E. coli σ70 prevents nonselective binding of E. coli RNA polymerase to nonpromoter sites on DNA (71, 72). Purified RNA polymerase II binds DNA nonselectively but quite stably to form a binary complex that dissociates with a half-life of more than 1 hr.
Both formation and stability of these binary complexes are substantially reduced in the presence of TFIIF. Thus, another contribution of TFIIF to selective binding of RNA polymerase II to core promoters may be to suppress formation of nonproductive binary complexes of polymerase and DNA.
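To give a sense of what a half-life of more than 1 hr implies for the nonselective binary complexes described above, the short sketch below converts a half-life into a first-order rate constant and a surviving fraction. It assumes simple first-order dissociation, which the text does not state, and it uses 60 min only as the lower bound on the half-life given above; the function names are hypothetical.

```python
import math


def dissociation_rate_constant(half_life_min):
    """First-order rate constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def fraction_remaining(t_min, half_life_min):
    """Fraction of complexes still intact after t_min minutes,
    assuming first-order (single-exponential) dissociation."""
    return 0.5 ** (t_min / half_life_min)


# The text gives only a lower bound on the half-life (more than 1 hr),
# so these values bound the true behaviour rather than estimate it.
HALF_LIFE_LOWER_BOUND_MIN = 60.0

k_max = dissociation_rate_constant(HALF_LIFE_LOWER_BOUND_MIN)
print(f"rate constant is at most {k_max:.4f} per min")

for t in (10, 30, 60):
    f = fraction_remaining(t, HALF_LIFE_LOWER_BOUND_MIN)
    print(f"after {t} min at least {f:.0%} of binary complexes remain")
```

Under these assumptions more than half of the polymerase trapped in binary complexes would still be bound after an hour, which illustrates why suppression of these complexes by TFIIF could matter for selective promoter binding.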
IV. Roles of TFIIE and TFIIH in Formation and Activation of the Fully Assembled Preinitiation Complex

Entry of the final two general initiation factors, TFIIE and TFIIH, into the preinitiation complex results in formation of stable protein-DNA contacts near the transcriptional start site (44, 73). Whether TFIIE and TFIIH bind DNA directly or whether they induce RNA polymerase II or the other
general initiation factors to bind this region of the core promoter is unclear. Substantial evidence argues, however, that the major contribution of TFIIE and TFIIH to formation of the productive initiation complex is to promote ATP-dependent formation of an open promoter complex by unwinding a short stretch of promoter DNA near the transcriptional start site prior to initiation (74, 75).

TFIIE from higher eukaryotes is a heterodimer composed of ~34- and ~58-kDa polypeptides and has been purified to homogeneity independently from human cells (76) and rat liver (77). TFIIE from S. cerevisiae is a heterodimer of ~43- and ~66-kDa subunits (78, 79).

TFIIH from higher eukaryotes and S. cerevisiae is a nine-subunit protein that has now been implicated in cellular processes as diverse as transcriptional regulation, DNA repair, and cell cycle control. TFIIH was initially identified and purified from rat liver (73, 80) and subsequently from S. cerevisiae (81, 82) and human cells (83, 84). TFIIH is the only general transcription factor known to possess associated enzymatic activities. Purified rat TFIIH possesses a weak DNA-dependent ATPase activity specific for adenine nucleoside triphosphates (80). Although this ATPase was stimulated by a variety of single- and double-stranded DNAs, it was most strongly stimulated by DNA fragments containing the TATA regions of the AdML and mouse interleukin-3 core promoters, which are both strong promoters in vitro. Saccharomyces cerevisiae (85) and human TFIIH (86, 87) were subsequently shown to possess closely associated DNA-dependent ATPase activity. Human (88), rat (89), and S. cerevisiae (90) TFIIH also possess ATP-dependent DNA helicase activities. The largest subunit of human TFIIH is a DNA helicase encoded by the nucleotide excision repair (NER) gene XPB/ERCC3 (88), which is mutated in individuals suffering from the human genetic disorders xeroderma pigmentosum (group B) and Cockayne’s syndrome.
In other studies, TFIIH was shown to function directly in NER in cells (91, 92) and to be composed of additional subunits encoded by known NER genes, including TFB1 (93-95), SSL1 (95, 96), and XPD/ERCC2 (yeast RAD3) (90, 97, 98), which encodes a second TFIIH-associated DNA helicase.

Substantial evidence suggests that the TFIIH DNA helicase mediates ATP-dependent formation of an open promoter complex prior to synthesis of the first phosphodiester bond of nascent transcripts. It is well established that, under most conditions, promoter-specific transcription requires a hydrolyzable ATP cofactor. It was first observed that the nonhydrolyzable ATP analog, AMP-PNP, does not replace ATP in synthesis of accurately initiated transcripts from the AdML promoter, even though AMP-PNP is a substrate for elongation by RNA polymerase II (99). Subsequently it was demonstrated that ATP is required at some stage during synthesis of the first eight phosphodiester bonds of transcripts initiated at the AdML promoter (100). Furthermore, ATP is required for synthesis of dinucleotide-primed, abortive trinucleotide transcripts initiated at the AdML promoter, providing evidence that ATP is essential for synthesis of the first phosphodiester bond of nascent transcripts (101). Using a partially fractionated rat liver transcription system, we identified ATPγS as a potent inhibitor of the ATP-requiring step in transcription initiation and used this inhibitor to demonstrate that ATP activates the preinitiation complex in a reversible step preceding synthesis of the first phosphodiester bond of nascent transcripts (102). The requirement for ATP in activation of the preinitiation complex, although somewhat controversial (103), has recently been confirmed in studies carried out with crude extracts (104) and fully reconstituted, purified transcription systems (105). These studies have also provided strong evidence that TFIIH is indispensable for ATP-dependent transcription initiation by RNA polymerase II.

Evidence consistent with the idea that ATP-dependent activation of the preinitiation complex involves formation of an open promoter complex by the TFIIH DNA helicase has come from several laboratories. First, using KMnO4 as a probe for DNA melting, Gralla and co-workers obtained evidence that ATP hydrolysis provides energy to drive unwinding of a short stretch of DNA surrounding the transcriptional start site prior to transcription initiation in crude HeLa cell extracts (74). Like ATP-dependent activation of the preinitiation complex, formation of the open promoter complex requires ATP or dATP (100, 102); AMP-PNP, CTP, UTP, and GTP do not support DNA melting.
Second, Timmers and co-workers confirmed and extended these findings by demonstrating that TFIIH is essential for ATP-dependent open complex formation in a fully reconstituted, purified transcription system (75). Finally, additional evidence that the TFIIH DNA helicase plays a role in ATP-dependent activation of the preinitiation complex has come from biochemical studies indicating that an ATP cofactor is not required for transcription initiation under a limited set of conditions where TFIIH is dispensable for initiation. These conditions include initiation by RNA polymerase II from promoters on negatively supercoiled DNA templates (42, 43, 106) or “preopened” promoters containing a short stretch of mismatched base pairs surrounding the transcriptional start site (105, 107, 108).

In addition to DNA-dependent ATPase and DNA helicase activities, S. cerevisiae TFIIH (85) and, later, rat (109) and human (110) TFIIH were found to possess a protein kinase activity capable of phosphorylating the heptapeptide repeats in the C-terminal domain (CTD) of the largest subunit of RNA polymerase II. The mammalian TFIIH kinase is composed of cdk7 (MO15) (111-113), cyclin H (111-113), and MAT1 (84). The S. cerevisiae
TFIIH kinase is a three-subunit complex composed of the cdk-like kinase KIN28, the cyclin-like protein CCL1, and TFB3, a homolog of MAT1 (82, 114, 115; R. Kornberg, personal communication). Based on findings indicating (1) that the CTDs of RNA polymerase II molecules actively engaged in transcription are highly phosphorylated (116-118) and (2) that polymerases containing hypophosphorylated CTDs preferentially enter the preinitiation complex (119-121), where they are subsequently phosphorylated (119), it was proposed that CTD phosphorylation could play a role in transcription initiation or in the transition of polymerase from initiation to elongation. Several lines of evidence argue, however, that CTD phosphorylation is not an essential step in promoter-specific transcription by RNA polymerase II. First, the TFIIH kinase utilizes both ATP and GTP as phosphate donors, whereas transcription initiation by RNA polymerase II exhibits a strict requirement for adenine nucleoside triphosphates (85, 89, 109). Second, the isoquinoline sulfonamide derivatives H-7 and H-8, which potently inhibit the TFIIH kinase, have no effect on promoter-specific transcription by RNA polymerase II in vitro in highly purified basal transcription systems (122, 123). Finally, TFIIH containing a mutant cdk7 subunit that lacks CTD kinase activity is fully functional in both basal and activated transcription in vitro (124).
V. Overview of RNA Polymerase II General Elongation Factors

A requirement for a class of elongation factors that promotes eukaryotic messenger RNA synthesis was predicted by early biochemical studies of the catalytic properties of RNA polymerase II. These studies revealed that purified RNA polymerase II lacks the capacity to catalyze RNA chain elongation in vitro at rates sufficient to account for the observed rates of messenger RNA synthesis in vivo; whereas eukaryotic messenger RNA synthesis is estimated to proceed in vivo at rates of 1200-2000 nucleotides per minute (125-127), RNA polymerase II synthesizes RNA in vitro at 100-300 nucleotides per minute under optimal conditions (128). Furthermore, RNA chain elongation by purified RNA polymerase II is an inherently discontinuous process interrupted by frequent pausing and, in some cases, premature arrest at a variety of sequences found within eukaryotic protein-coding genes (13, 129). Consequently, elongation factors that increase the overall rate of messenger RNA synthesis by suppressing transient pausing or preventing premature arrest by transcribing RNA polymerase II might be expected to play important roles in eukaryotic gene expression by expediting passage of polymerase through the long stretches of chromosomal DNA comprising most eukaryotic protein-coding genes. Indeed, eukaryotes have evolved such a family of elongation
factors. Biochemically defined members of this family include P-TEFb, SII, TFIIF, elongin (SIII), and ELL. P-TEFb functions in partially fractionated transcription systems to convert early, promoter-specific, termination-prone transcription complexes into productive elongation complexes (15, 130). SII, TFIIF, elongin (SIII), and ELL all regulate the activity of the purified RNA polymerase II ternary elongation complex. These four factors are the subject of the following sections.
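The gap between the in vivo and in vitro elongation rates quoted in this section is easier to appreciate as a transcription time for a single gene. The sketch below is a back-of-the-envelope calculation only: the two rate ranges are taken from the text, whereas the gene length of 100,000 nucleotides is a hypothetical value chosen for illustration, and the calculation treats elongation as uniform, ignoring the pausing and arrest that the text describes.

```python
# Elongation rates quoted in the text, in nucleotides per minute.
IN_VIVO_RATE = (1200, 2000)
IN_VITRO_RATE = (100, 300)

# Hypothetical gene length in nucleotides; an assumption, not from the source.
GENE_LENGTH_NT = 100_000


def time_range_min(gene_length_nt, rate_range):
    """Return (shortest, longest) time in minutes to transcribe a gene,
    given a (slowest, fastest) rate range and uniform elongation."""
    slowest, fastest = rate_range
    return (gene_length_nt / fastest, gene_length_nt / slowest)


in_vivo = time_range_min(GENE_LENGTH_NT, IN_VIVO_RATE)
in_vitro = time_range_min(GENE_LENGTH_NT, IN_VITRO_RATE)

# Smallest and largest fold difference consistent with the two ranges.
fold_gap = (IN_VIVO_RATE[0] / IN_VITRO_RATE[1],
            IN_VIVO_RATE[1] / IN_VITRO_RATE[0])

print(f"in vivo:  {in_vivo[0]:.0f}-{in_vivo[1]:.0f} min")
print(f"in vitro: {in_vitro[0]:.0f}-{in_vitro[1]:.0f} min")
print(f"in vivo elongation is {fold_gap[0]:.0f}- to {fold_gap[1]:.0f}-fold faster")
```

For the assumed gene length, purified polymerase would need roughly 5.5 to 17 hr where the cell needs about 50 to 85 min, a 4- to 20-fold difference that elongation factors would be expected to help close.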
VI. SII and Nascent Transcript Cleavage

SII is an ~38-kDa protein originally identified and purified to homogeneity from Ehrlich ascites tumor cells (131). SII does not appear to increase the overall catalytic rate of nucleotide addition to growing RNA chains by RNA polymerase II. Instead, SII expedites passage of polymerase through transcriptional impediments, including various nucleoprotein complexes as well as a collection of related DNA sequences that act as intrinsic arrest sites (132). On encountering transcriptional impediments, a fraction of transcribing polymerase arrests, but can be reactivated by SII. SII-sensitive intrinsic arrest sites are found in many genes, including the human histone H3.3, adenovirus 2 major late, and adenosine deaminase genes (14, 129). Typical intrinsic arrest sites include two or more closely spaced stretches of T residues in the nontemplate strand. Why particular DNA sequences function as arrest sites is not clear. However, the DNA at some arrest sites can adopt a bent conformation, and it has been proposed that these DNA bends are directly responsible for inducing arrest (133).

Although it is not known whether SII-sensitive arrest sites play a role in regulating the expression of particular eukaryotic genes, SII-sensitive arrest sites are distributed throughout many eukaryotic protein-coding genes. Yeast cells lacking the SII gene (PPR2) are sensitive to the uracil analog, 6-azauracil, which lowers intracellular UTP and GTP pools and thus might be expected to affect transcription elongation. Notably, the phenotypic effects of some yeast RNA polymerase II large subunit mutations that render messenger RNA synthesis sensitive to 6-azauracil are overcome by increasing the dosage of the wild-type PPR2 gene (134).
Efforts to understand how SII promotes passage of RNA polymerase II through intrinsic arrest sites led to the discovery that SII-dependent readthrough is accompanied by reiterative endonucleolytic cleavage and reextension of nascent transcripts in the ternary elongation complex (135-137). Many observations argue that SII-activated transcript cleavage is an essential step in SII-dependent transcription. The SII-dependent readthrough of all transcriptional impediments examined is accompanied by nascent transcript
cleavage (138-141). Also, the appearance of SII-dependent cleavage products precedes the appearance of readthrough products (138, 139, 144), and all SII-dependent transcription through intrinsic arrest sites in the human histone H3.3 gene or past DNA-bound Lac repressor is preceded by cleavage (138, 141). Finally, SII deletion or point mutants that fail to activate nascent transcript cleavage also fail to promote readthrough (138, 142-144). It is noteworthy that, although this evidence argues that cleavage is necessary for readthrough, it may not be sufficient in light of the recent identification of an SII mutant that promotes transcript cleavage but not readthrough (143, 144).

Attempts to identify the catalytic site responsible for SII-activated transcript cleavage show that RNA polymerase II participates directly in the cleavage reaction. The SII-activated transcript cleavage requires a physical interaction between polymerase and the cleaved transcript and is inhibited by low concentrations of the drug, α-amanitin, which inhibits elongation by RNA polymerase II (135-137). Whereas purified SII has no detectable transcript cleavage activity in the absence of polymerase, highly purified RNA polymerase II elongation complexes exhibit low but detectable levels of transcript cleavage activity (135-137). Finally, evidence that the RNA polymerase II catalytic site is responsible for transcript cleavage has come from the observation that SII-activated transcript cleavage and pyrophosphorolysis result in cleavage of the same internal phosphodiester bonds of nascent transcripts (145).
VII. The Elongation Activity of TFIIF

TFIIF is unique among RNA polymerase II general transcription factors by virtue of its ability to function in both the initiation and the elongation stages of transcription. Greenleaf and co-workers first demonstrated that TFIIF is capable of stimulating elongation by RNA polymerase II (50). Their work and subsequent studies from several laboratories demonstrated that TFIIF stimulates elongation by a mechanism that involves suppression of the frequency or duration of transient pausing by RNA polymerase II at many sites on DNA templates (58, 128, 146-149). Unlike SII, TFIIF does not promote nascent transcript cleavage by the RNA polymerase II elongation complex, and it is not capable of releasing polymerase from arrest at intrinsic arrest sites. Nevertheless, interaction of TFIIF with transcribing RNA polymerase II decreases the likelihood that polymerase will suffer arrest at these sites (150 and references therein). Thus, TFIIF and SII appear to regulate the activity of transcribing RNA polymerase II in different yet complementary ways: whereas TFIIF suppresses transient pausing by polymerase and protects the elongation complex from becoming arrested, SII reactivates the elongation complex once it has arrested.
Although the mechanism by which TFIIF stimulates elongation by RNA polymerase II has not been established unequivocally, evidence suggests that TFIIF elongation activity is executed through a direct but transient interaction with transcribing polymerase. TFIIF is capable of stimulating the elongation rate of purified RNA polymerase II on oligo(dC)-tailed DNA templates in the absence of other transcription factors (50). Although TFIIF does not remain stably bound to transcribing polymerase during elongation in vitro (50), TFIIF is capable of binding stably to RNA polymerase II in solution (47, 50, 53, 58, 60, 151). Phosphorylation of RAP74 both stabilizes binding of TFIIF to RNA polymerase II and increases TFIIF elongation activity (152). Analysis of a series of mutant TFIIFs containing RAP30 deletion mutations revealed that TFIIF elongation activity is strongly dependent on the RAP30 region proposed (62) to bind RNA polymerase II, but not on RAP30 C-terminal sequences required for initiation (69).
VIII. The Elongin (SIII) Complex and von Hippel-Lindau Disease

Like TFIIF, elongin (SIII) increases the overall rate of elongation by RNA polymerase II by decreasing the frequency or duration of transient pausing by polymerase at many sites on DNA templates (148). Elongin (SIII) was initially identified and purified to homogeneity from rat liver nuclei (153) as a heterotrimeric complex of A, B, and C subunits with apparent molecular masses of ~110, ~18, and ~15 kDa. Subsequent biochemical studies have shown (1) that elongin A is the transcriptionally active subunit of elongin (SIII) (154) and stimulates elongation by a novel, inducible transcriptional activation domain that exhibits an overall architecture similar to the ligand-inducible activation domains of members of the nuclear receptor superfamily (T. Aso, J. W. Conaway and R. C. Conaway, unpublished results) and (2) that elongin B and C are positive regulatory subunits (155, 156).

Elongin B and C regulate elongin A by different mechanisms (154) (Fig. 2). Elongin C functions as a direct activator of elongin A, because it is capable of interacting directly with elongin A in the absence of elongin B to form an AC complex with increased specific activity relative to that of elongin A. In contrast, elongin B, a member of the ubiquitin homology (UbH) gene family, neither activates elongin A nor is capable of interacting with elongin A in the absence of elongin C; instead, elongin B binds directly to elongin C and promotes assembly and stability of the elongin (SIII) complex.

The elongin BC complex has recently been shown to be a potential target for regulation by the product of the von Hippel-Lindau (VHL) tumor suppressor gene (157, 158). The VHL gene is mutated in families with VHL
[Figure 2. Panel labels: low specific activity; high specific activity (unstable); high specific activity (stable).]

FIG. 2. Assembly and activities of elongin (SIII) and elongin subassemblies. Pol II, RNA polymerase II; A, elongin A; B, elongin B; C, elongin C; mRNA, messenger RNA.
disease, a rare genetic disorder (incidence ~1 in 36,000) that predisposes affected individuals to a variety of cancers, including retinal hemangiomas, central nervous system hemangioblastomas, multiple endocrine neoplasias, and clear-cell renal carcinoma (159-161). Of more general clinical importance, somatic mutations of the VHL gene are found in most sporadic clear-cell renal carcinomas (161-164).

The VHL protein binds tightly and specifically to the elongin BC complex both in vitro and in cells (157, 158). A subset of naturally occurring VHL mutants from VHL tumors and clear-cell renal carcinomas exhibits substantially reduced binding to the elongin BC complex, arguing that the VHL-elongin BC interaction is likely to be important for the tumor suppressor activity of the VHL protein. Binding of the VHL protein and elongin A to the elongin BC complex in vitro is mutually exclusive, and binding of the VHL protein to the elongin BC complex inhibits its ability to activate elongin A transcriptional activity (157). Taken together, these results suggest that the normal tumor suppressor function of the VHL protein could involve regulation of elongin (SIII) transcriptional activity.
IX. ELL and Acute Myeloid Leukemia

We isolated a novel ~80-kDa elongation factor from rat liver nuclei and found that the purified protein exhibits significant sequence similarity to the product of the human ELL gene (165). Subsequent experiments led to the discovery that the human ELL protein is also an RNA polymerase II elongation factor (165). Like TFIIF and elongin (SIII), ELL stimulates the rate of elongation by RNA polymerase II by suppressing transient pausing by polymerase at many sites on DNA templates (165).

The human ELL (eleven-nineteen lysine-rich leukemia) gene on chromosome 19p13.1 was initially isolated as a gene that undergoes frequent translocations with the MLL (mixed-lineage leukemia) gene on chromosome 11q23 in acute myeloid leukemia (166, 167). ELL encodes a 621-amino acid protein that is highly conserved and ubiquitously expressed in higher eukaryotes, but that contains no obvious structural motifs characteristic of transcription factors (166, 167). The N-terminal half of the 3968-amino acid MLL gene product contains (A-T)-hook DNA binding (167a) and methyltransferase-like domains, whereas the C-terminal half of the MLL-encoded protein contains several regions that resemble the Drosophila trithorax gene product, including multiple contiguous zinc-finger motifs and a highly conserved 215-amino acid region at the C terminus of the protein (168, 169). Like its potential Drosophila counterpart, the MLL gene product regulates expression of homeotic genes (170).
[Figure 3: schematic of ELL (19p13.1) and MLL, with the A-T hooks, MT domain, Zn-fingers, and trithorax-like regions labeled.]
FIG. 3. The chimeric MLL-ELL gene in acute myeloid leukemia. MT, Methyltransferase-like domain; Zn, zinc.
Chromosomal translocations involving MLL and ELL result in formation of a chimeric gene encoding all but the first 45 amino acids of ELL fused to the N-terminal ~1400 amino acids of the MLL protein, including its (A-T)-hook and methyltransferase domains, but lacking the C-terminal trithorax-like regions (Fig. 3). Whether leukemogenesis results from expression of the MLL-ELL chimera or from loss of one allele of ELL, MLL, or both remains unclear. Nevertheless, the identification of two RNA polymerase II elongation factors, elongin (SIII) and ELL, which are implicated in oncogenesis, supports the idea that there may be a close connection between the regulation of transcription elongation and cell growth.
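The residue counts quoted above allow a rough estimate of the size of the chimeric product. The following Python sketch simply does that arithmetic. The constant and function names are illustrative assumptions, and the figure of 1400 retained MLL residues is the approximate value given in the text, so the result should be read as an estimate only.

```python
# Rough size estimate for the MLL-ELL chimera, using only figures quoted in the text.
# All names below are hypothetical helpers introduced for illustration.

ELL_LENGTH = 621             # amino acids in full-length human ELL
ELL_MISSING_N_TERM = 45      # N-terminal ELL residues absent from the chimera
MLL_LENGTH = 3968            # amino acids in full-length MLL
MLL_N_TERM_RETAINED = 1400   # approximate N-terminal MLL residues retained


def chimera_length(mll_retained: int, ell_length: int, ell_missing: int) -> int:
    """Return the approximate length of the fusion protein in residues."""
    if ell_missing > ell_length:
        raise ValueError("cannot remove more residues than ELL contains")
    return mll_retained + (ell_length - ell_missing)


def fraction_of_mll_lost(mll_length: int, mll_retained: int) -> float:
    """Return the fraction of MLL residues (C-terminal portion) absent from the fusion."""
    return (mll_length - mll_retained) / mll_length


approx_length = chimera_length(MLL_N_TERM_RETAINED, ELL_LENGTH, ELL_MISSING_N_TERM)
lost = fraction_of_mll_lost(MLL_LENGTH, MLL_N_TERM_RETAINED)
print(f"chimera: ~{approx_length} residues; MLL residues lost: ~{lost:.0%}")
```

Under these assumptions the fusion retains 576 ELL residues, and roughly two-thirds of MLL, including the trithorax-like C-terminal regions, is absent.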
ACKNOWLEDGMENTS
Work in our laboratory is supported by National Institutes of Health Grant GM41628 and by funds provided to the Oklahoma Medical Research Foundation by the H. A. and Mary K. Chapman Charitable Trust.
REFERENCES
1. R. Tjian and T. Maniatis, Cell 77, 5 (1994).
2. D. L. Bentley, Curr. Opin. Genet. Dev. 5, 210 (1995).
3. K. Yankulov, J. Blau, T. Purton, S. Roberts and D. L. Bentley, Cell 77, 749 (1994).
4. A. Krumm, L. B. Hickey and M. Groudine, Genes Dev. 9, 559 (1995).
5. Y. J. Kim, S. Bjorklund, Y. Li, M. H. Sayre and R. D. Kornberg, Cell 77, 599 (1994).
6. A. J. Koleske and R. A. Young, Nature (London) 368, 466 (1994).
7. J. C. Chrivia, R. P. S. Kwok, N. Lamb, M. Hagiwara, M. R. Montminy and R. H. Goodman, Nature (London) 365, 855 (1993).
8. R. P. S. Kwok, J. R. Lundblad, J. C. Chrivia, J. P. Richards, H. P. Bachinger, R. G. Brennan, S. G. E. Roberts, M. R. Green and R. H. Goodman, Nature (London) 370, 223 (1994).
9. J. Arias, A. S. Alberts, P. Brindle, F. X. Claret, T. Smeal, M. Karin, J. Feramisco and M. Montminy, Nature (London) 370, 226 (1994).
10. M. Kretzschmar, K. Kaiser, F. Lottspeich and M. Meisterernst, Cell 78, 525 (1994).
11. H. Ge and R. G. Roeder, Cell 78, 513 (1994).
12. R. C. Conaway and J. W. Conaway, ARB 62, 161 (1993).
13. T. Aso, J. W. Conaway and R. C. Conaway, FASEB J. 9, 1419 (1995).
14. C. M. Kane, in “Transcription: Mechanisms and Regulation” (R. C. Conaway and J. W. Conaway, eds.), p. 279. Raven Press, New York, 1994.
15. N. F. Marshall and D. H. Price, JBC 270, 12335 (1995).
16. S. T. Smale, in “Transcription: Mechanisms and Regulation” (R. C. Conaway and J. W. Conaway, eds.), p. 63. Raven Press, New York, 1994.
17. T. Aso, J. W. Conaway and R. C. Conaway, JBC 269, 26575 (1994).
18. Q. Zhou, P. M. Lieberman, T. G. Boyer and A. J. Berk, Genes Dev. 6, 1964 (1992).
19. Q. Zhou, T. G. Boyer and A. J. Berk, Genes Dev. 7, 180 (1993).
20. B. F. Pugh and R. Tjian, JBC 267, 679 (1992).
21. E. Martinez, C. M. Chiang, H. Ge and R. G. Roeder, EMBO J. 13, 3115 (1994).
22. C. Brou, S. Chaudhary, I. Davidson, Y. Lutz, J. Wu, J. M. Egly, L. Tora and P. Chambon, EMBO J. 12, 489 (1993).
23. S. Hahn, S. Buratowski, P. A. Sharp and L. Guarente, PNAS 86, 5718 (1989).
24. S. R. Wiley, R. J. Kraus and J. E. Mertz, PNAS 89, 5814 (1992).
25. C. P. Verrijzer, K. Yokomori, J. L. Chen and R. Tjian, Science 264, 933 (1994).
26. M. A. Sypes and D. S. Gilmour, NARes 22, 807 (1994).
27. J. Kaufmann, C. P. Verrijzer, J. Shao and S. T. Smale, Genes Dev. 10, 873 (1996).
28. S. Buratowski, S. Hahn, L. Guarente and P. A. Sharp, Cell 56, 549 (1989).
29. J. W. Conaway, M. W. Bond and R. C. Conaway, JBC 262, 8293 (1987).
30. I. Ha, W. S. Lane and D. Reinberg, Nature (London) 352, 689 (1991).
31. S. L. Wampler and J. T. Kadonaga, Genes Dev. 6, 1542 (1992).
32. H. Tschochner, M. H. Sayre, P. M. Flanagan, W. J. Feaver and R. D. Kornberg, PNAS 89, 11292 (1992).
33. I. Pinto, D. E. Ware and M. Hampsey, Cell 68, 977 (1992).
34. A. Barberis, C. W. Muller, S. C. Harrison and M. Ptashne, PNAS 90, 5628 (1993).
35. S. Buratowski and H. Zhou, PNAS 90, 5633 (1993).
36. I. Ha, S. Roberts, E. Maldonado, X. Sun, L. U. Kim, M. Green and D. Reinberg, Genes Dev. 7, 1021 (1993).
37. S. Malik, D. K. Lee and R. G. Roeder, MCBiol 13, 6253 (1993).
38. K. Hisatake, R. G. Roeder and M. Horikoshi, Nature (London) 363, 744 (1993).
39. S. Yamashita, K. Hisatake, T. Kokubo, K. Doi, R. G. Roeder, M. Horikoshi and Y. Nakatani, Science 261, 463 (1993).
40. D. B. Nikolov, H. Chen, E. D. Halay, A. A. Usheva, K. Hisatake, D. K. Lee, R. G. Roeder and S. K. Burley, Nature (London) 377, 119 (1995).
41. Y. Li, P. M. Flanagan, H. Tschochner and R. D. Kornberg, Science 263, 805 (1994).
42. J. D. Parvin and P. A. Sharp, Cell 73, 533 (1993).
43. J. D. Parvin, B. M. Shykind, R. E. Meyers, J. Kim and P. A. Sharp, JBC 269, 18414 (1994).
44. R. C. Conaway, K. P. Garrett, J. P. Hanley and J. W. Conaway, PNAS 88, 6205 (1991).
45. S. Buratowski, M. Sopta, J. Greenblatt and P. A. Sharp, PNAS 88, 7509 (1991).
46. O. Flores, H. Lu, M. Killeen, J. Greenblatt, Z. F. Burton and D. Reinberg, PNAS 88, 9999 (1991).
47. M. Sopta, R. W. Carthew and J. Greenblatt, JBC 260, 10353 (1985).
48. D. H. Price, A. E. Sluder and A. L. Greenleaf, JBC 262, 3244 (1987).
49. J. W. Conaway and R. C. Conaway, JBC 264, 2357 (1989).
50. D. H. Price, A. E. Sluder and A. L. Greenleaf, MCBiol 9, 1465 (1989).
51. O. Flores, I. Ha and D. Reinberg, JBC 265, 5629 (1990).
52. S. Kitajima, Y. Tanaka, T. Kawaguchi, T. Nagaoka, S. M. Weissman and Y. Yasukochi, NARes 18, 4843 (1990).
53. N. L. Henry, M. H. Sayre and R. D. Kornberg, JBC 267, 23388 (1992).
54. N. L. Henry, A. M. Campbell, W. J. Feaver, D. Poon, P. A. Weil and R. D. Kornberg, Genes Dev. 8, 2868 (1994).
55. J. Greenblatt, Trends Biochem. Sci. 16, 408 (1991).
56. Z. F. Burton, L. G. Ortolan and J. Greenblatt, EMBO J. 5, 2923 (1986).
57. Z. F. Burton, M. Killeen, M. Sopta, L. G. Ortolan and J. Greenblatt, MCBiol 8, 1602 (1988).
58. O. Flores, E. Maldonado and D. Reinberg, JBC 264, 8913 (1989).
59. O. Flores, E. Maldonado, Z. Burton, J. Greenblatt and D. Reinberg, JBC 263, 10812 (1988).
60. H. Serizawa, J. W. Conaway and R. C. Conaway, in “Transcription: Mechanisms and Regulation” (R. C. Conaway and J. W. Conaway, eds.), p. 27. Raven Press, New York, 1994.
61. M. Sopta, Z. F. Burton and J. Greenblatt, Nature (London) 341, 410 (1989).
62. S. McCracken and J. Greenblatt, Science 253, 900 (1991).
63. S. Tan, K. P. Garrett, R. C. Conaway and J. W. Conaway, PNAS 91, 9808 (1994).
64. B. Q. Wang and Z. F. Burton, JBC 270, 27035 (1995).
65. K. P. Garrett, H. Serizawa, J. P. Hanley, J. N. Bradsher, A. Tsuboi, N. Arai, T. Yokota, K. Arai, R. C. Conaway and J. W. Conaway, JBC 267, 23942 (1992).
66. A. J. Dombroski, W. A. Walter, M. T. Record, D. A. Siegele and C. A. Gross, Cell 70, 501 (1992).
67. A. J. Dombroski, W. A. Walter and C. A. Gross, Genes Dev. 7, 2446 (1993).
68. B. Coulombe, J. Li and J. Greenblatt, JBC 269, 19962 (1994).
69. S. Tan, R. C. Conaway and J. W. Conaway, PNAS 92, 6042 (1995).
70. D. J. Frank, C. M. Tyree, C. P. George and J. T. Kadonaga, JBC 270, 6292 (1995).
71. J. W. Conaway and R. C. Conaway, Science 248, 1550 (1990).
72. M. T. Killeen and J. F. Greenblatt, MCBiol 12, 30 (1992).
73. J. W. Conaway, J. N. Bradsher and R. C. Conaway, JBC 267, 10142 (1992).
74. W. Wang, M. Carey and J. D. Gralla, Science 255, 450 (1992).
75. F. C. P. Holstege, P. C. van der Vliet and H. Th. M. Timmers, EMBO J. 15, 1666 (1996).
76. Y. Ohkuma, H. Sumimoto, M. Horikoshi and R. G. Roeder, PNAS 87, 9163 (1990).
77. J. W. Conaway, J. P. Hanley, K. P. Garrett and R. C. Conaway, JBC 266, 7804 (1991).
78. M. H. Sayre, J. Tschochner and R. D. Kornberg, JBC 267, 23383 (1992).
79. W. J. Feaver, N. L. Henry, D. A. Bushnell, M. H. Sayre, J. H. Brickner, O. Gileadi and R. D. Kornberg, JBC 269, 27549 (1994).
80. R. C. Conaway and J. W. Conaway, PNAS 86, 7356 (1989).
81. W. J. Feaver, O. Gileadi and R. D. Kornberg, JBC 266, 19000 (1991).
82. W. J. Feaver, J. Q. Svejstrup, N. L. Henry and R. D. Kornberg, Cell 79, 1103 (1994).
83. M. Gerard, L. Fischer, V. Moncollin, J. M. Chipoulet, P. Chambon and J. M. Egly, JBC 266, 20940 (1991).
84. J. P. Adamczewski, M. Rossignol, J. P. Tassan, E. A. Nigg, V. Moncollin and J. M. Egly, EMBO J. 15, 1877 (1996).
85. W. J. Feaver, O. Gileadi, Y. Li and R. D. Kornberg, Cell 67, 1223 (1991).
86. R. Roy, L. Schaeffer, S. Humbert, W. Vermeulen, G. Weeda and J. M. Egly, JBC 269, 9826 (1994).
87. Y. Ohkuma and R. G. Roeder, Nature (London) 368, 160 (1994).
88. L. Schaeffer, R. Roy, S. Humbert, V. Moncollin, W. Vermeulen, J. H. J. Hoeijmakers, P. Chambon and J. M. Egly, Science 260, 58 (1993).
89. H. Serizawa, R. C. Conaway and J. W. Conaway, JBC 268, 17300 (1993).
90. W. J. Feaver, J. Q. Svejstrup, L. Bardwell, A. J. Bardwell, S. Buratowski, K. D. Gulyas, T. F. Donahue, E. C. Friedberg and R. D. Kornberg, Cell 75, 1379 (1993).
91. Z. Wang, J. Q. Svejstrup, W. J. Feaver, X. Wu, R. D. Kornberg and E. C. Friedberg, Nature (London) 368, 74 (1994).
92. A. J. van Vuuren, W. Vermeulen, L. Ma, G. Weeda, E. Appeldorn, N. G. J. Jaspers, A. J. van der Eb, D. Bootsma, J. H. J. Hoeijmakers, S. Humbert, L. Schaeffer and J. M. Egly, EMBO J. 13, 1645 (1994).
93. O. Gileadi, W. J. Feaver and R. D. Kornberg, Science 257, 1389 (1992).
94. L. Fischer, M. Gerard, C. Chalut, Y. Lutz, S. Humbert, M. Kanno, P. Chambon and J. M. Egly, Science 257, 1392 (1992).
95. Z. Wang, S. Buratowski, J. Q. Svejstrup, W. J. Feaver, X. Wu, R. D. Kornberg, T. F. Donahue and E. C. Friedberg, MCBiol 15, 2288 (1995).
96. S. Humbert, H. van Vuuren, Y. Lutz, J. H. Hoeijmakers, J. M. Egly and V. Moncollin, EMBO J. 13, 2393 (1994).
97. L. Schaeffer, V. Moncollin, R. Roy, A. Staub, M. Mezzina, A. Sarasin, G. Weeda, J. H. Hoeijmakers and J. M. Egly, EMBO J. 13, 2388 (1994).
98. R. Drapkin, J. T. Reardon, A. Ansari, J. C. Huang, L. Zawel, K. Ahn, A. Sancar and D. Reinberg, Nature (London) 368, 769 (1994).
99. D. Bunick, R. Zandomeni, S. Ackerman and R. Weinmann, Cell 29, 877 (1982).
100. M. Sawadogo and R. G. Roeder, JBC 259, 5321 (1984).
101. D. S. Luse and G. A. Jacob, JBC 262, 14990 (1987).
102. R. C. Conaway and J. W. Conaway, JBC 263, 2962 (1988).
103. J. A. Goodrich and R. Tjian, Cell 77, 145 (1994).
104. Y. Jiang, M. Yan and J. D. Gralla, JBC 270, 27332 (1995).
105. A. Dvir, K. P. Garrett, C. Chalut, J. M. Egly, J. W. Conaway and R. C. Conaway, JBC 271, 7245 (1996).
106. H. Th. M. Timmers, EMBO J. 13, 391 (1994).
107. D. Tantin and M. Carey, JBC 269, 17397 (1994).
108. F. Holstege, D. Tantin, M. Carey, P. C. van der Vliet and H. Th. M. Timmers, EMBO J. 14, 810 (1995).
109. H. Serizawa, R. C. Conaway and J. W. Conaway, PNAS 89, 7476 (1992).
110. H. Lu, L. Zawel, L. Fischer, J. M. Egly and D. Reinberg, Nature (London) 358, 641 (1992).
111. R. Roy, J. P. Adamczewski, T. Seroz, W. Vermeulen, J. P. Tassan, L. Schaeffer, E. A. Nigg, J. H. J. Hoeijmakers and J. M. Egly, Cell 79, 1093 (1994).
112. H. Serizawa, T. P. Makela, J. W. Conaway, R. C. Conaway, R. A. Weinberg and R. A. Young, Nature (London) 374, 280 (1995).
113. R. Shiekhattar, F. Mermelstein, R. P. Fisher, R. Drapkin, B. Dynlacht, H. C. Wessling, D. O. Morgan and D. Reinberg, Nature (London) 374, 283 (1995).
114. J. Q. Svejstrup, W. J. Feaver, J. W. LaPointe and R. D. Kornberg, JBC 269, 28044 (1994).
115. J. Q. Svejstrup, Z. Wang, W. J. Feaver, X. Wu, D. A. Bushnell, T. F. Donahue, E. C. Friedberg and R. D. Kornberg, Cell 80, 21 (1995).
116. J. M. Payne, P. J. Laybourn and M. E. Dahmus, JBC 264, 19621 (1989).
117. D. L. Cadena and M. E. Dahmus, JBC 262, 12468 (1987).
118. B. Bartholomew, M. E. Dahmus and C. F. Meares, JBC 261, 14226 (1986).
119. P. J. Laybourn and M. E. Dahmus, JBC 265, 13165 (1990).
120. J. D. Chesnut, J. H. Stephens and M. E. Dahmus, JBC 267, 10500 (1992).
121. H. Lu, O. Flores, R. Weinmann and D. Reinberg, PNAS 88, 10004 (1991).
122. H. Serizawa, J. W. Conaway and R. C. Conaway, Nature 363, 371 (1993).
123. Y. Li and R. D. Kornberg, PNAS 91, 2362 (1994).
124. T. P. Makela, J. D. Parvin, J. Kim, L. J. Huber, P. A. Sharp and R. A. Weinberg, PNAS 92, 5174 (1995).
125. C. N. Tennyson, H. J. Klamut and R. G. Worton, Nature Genet. 9, 184 (1995).
126. D. S. Ucker and K. R. Yamamoto, JBC 259, 7416 (1984).
127. C. S. Thummel, K. C. Burtis and D. S. Hogness, Cell 61, 101 (1990).
128. M. G. Izban and D. S. Luse, JBC 267, 13647 (1992).
129. T. K. Kerppola and C. M. Kane, FASEB J. 5, 2833 (1991).
130. N. F. Marshall and D. H. Price, MCBiol 12, 2078 (1992).
131. K. Sekimizu, N. Kobayashi, D. Mizuno and S. Natori, Bchem 15, 5064 (1976).
132. D. Reines, in “Transcription: Mechanisms and Regulation” (R. C. Conaway and J. W. Conaway, eds.), p. 263. Raven Press, New York, 1994.
133. T. K. Kerppola and C. M. Kane, Bchem 29, 269 (1990).
134. J. Archambault, F. Lacroute, A. Ruet and J. D. Friesen, MCBiol 12, 4142 (1992).
135. D. Reines, JBC 267, 3795 (1992).
136. M. G. Izban and D. S. Luse, Genes Dev. 6, 1342 (1992).
137. D. G. Wang and D. K. Hawley, PNAS 90, 843 (1993).
138. D. Reines, P. Ghanouni, Q. Li and J. Mote, JBC 267, 15516 (1992).
139. M. G. Izban and D. S. Luse, JBC 268, 12874 (1993).
140. J. Mote, P. Ghanouni and D. Reines, JMB 236, 725 (1994).
141. D. Reines and J. Mote, PNAS 90, 1917 (1993).
142. C. J. Jeon, H. S. Yoon and K. Agarwal, PNAS 91, 9106 (1994).
143. G. Cipres-Palacin and C. M. Kane, PNAS 91, 8087 (1994).
144. G. Cipres-Palacin and C. M. Kane, Bchem 34, 15375 (1995).
145. M. D. Rudd, M. G. Izban and D. S. Luse, PNAS 91, 8057 (1994).
146. S. Tan, T. Aso, R. C. Conaway and J. W. Conaway, JBC 269, 25684 (1994).
147. D. D. Kephart, B. Q. Wang, Z. F. Burton and D. H. Price, JBC 269, 13536 (1994).
148. J. N. Bradsher, S. Tan, H.-J. McLaury, J. W. Conaway and R. C. Conaway, JBC 268, 25594 (1993).
149. E. Bengal, O. Flores, A. Krauskopf, D. Reinberg and Y. Aloni, MCBiol 11, 1195 (1991).
150. W. Gu and D. Reines, JBC 270, 11238 (1995).
151. Y. Kobayashi, S. Kitajima and Y. Yasukochi, NARes 20, 1994 (1992).
152. S. Kitajima, T. Chibazakura, M. Yonaha and Y. Yasukochi, JBC 269, 29970 (1994).
153. J. N. Bradsher, K. W. Jackson, R. C. Conaway and J. W. Conaway, JBC 268, 25587 (1993).
154. T. Aso, W. S. Lane, J. W. Conaway and R. C. Conaway, Science 269, 1439 (1995).
155. K. P. Garrett, S. Tan, J. N. Bradsher, W. S. Lane, J. W. Conaway and R. C. Conaway, PNAS 91, 5237 (1994).
156. K. P. Garrett, T. Aso, J. N. Bradsher, S. I. Foundling, W. S. Lane, R. C. Conaway and J. W. Conaway, PNAS 92, 7172 (1995).
157. D. R. Duan, A. Pause, W. H. Burgess, T. Aso, D. Y. T. Chen, K. P. Garrett, R. C. Conaway, J. W. Conaway, W. M. Linehan and R. D. Klausner, Science 269, 1402 (1995).
158. A. Kibel, O. Iliopoulos, J. A. DeCaprio and W. G. Kaelin, Science 269, 1444 (1995).
159. F. Latif, K. Tory, J. Gnarra, M. Yao, F. M. Duh, M. L. Orcutt, T. Stackhouse, I. Kuzmin, W. Modi, L. Geil, L. Schmidt, F. Zhou, H. Li, M. H. Wei, F. Chen, G. Glenn, P. Choyke, M. M. Walther, Y. Weng, D. R. Duan, M. Dean, K. Glavac, F. M. Richards, P. A. Crossey, M. A. Ferguson-Smith, D. Le Paslier, I. Chumakov, D. Cohen, A. C. Chinault, E. R. Maher, W. M. Linehan, B. Zbar and M. I. Lerman, Science 260, 1317 (1993).
160. F. Chen, T. Kishida, M. Yao, T. Hustad, D. Glavac, M. Dean, J. R. Gnarra, M. L. Orcutt, F. M. Duh, G. Glenn, J. Green, Y. E. Hsia, J. Lamiell, H. Li, M. H. Wei, L. Schmidt, K. Tory, I. Kuzmin, T. Stackhouse, F. Latif, W. M. Linehan, M. Lerman and B. Zbar, Human Mutat. 5, 66 (1995).
161. J. M. Whaley, J. Naglich, L. Gelbert, Y. E. Hsia, J. M. Lamiell, J. S. Green, D. Collins, H. P. H. Neumann, J. Laidlaw, F. P. Li, A. J. P. Klein-Szanto, B. R. Seizinger and N. Kley, Am. J. Human Genet. 55, 1092 (1994).
162. J. R. Gnarra, K. Tory, Y. Weng, L. Schmidt, M. H. Wei, H. Li, F. Latif, S. Liu, F. Chen, F. M. Duh, I. Lubensky, D. R. Duan, C. Florence, R. Pozzatti, M. M. Walther, N. H. Bander, H. B. Grossman, H. Brauch, S. Pomer, J. D. Brooks, W. B. Isaacs, M. I. Lerman, B. Zbar and W. M. Linehan, Nature Genet. 7, 85 (1994).
163. T. Shuin, K. Kondo, S. Torigoe, T. Kishida, Y. Kubota, M. Hosaka, Y. Nagashima, H. Kitamura, F. Latif, B. Zbar, M. I. Lerman and M. Yao, Cancer Res. 54, 2852 (1994).
164. K. Foster, A. Prowse, A. van den Berg, S. Fleming, M. M. F. Hulsbeek, P. A. Crossey, F. M. Richards, P. Cairns, N. A. Affara, M. A. Ferguson-Smith, C. H. C. M. Buys and E. R. Maher, Human Mol. Genet. 3, 2169 (1994).
165. A. Shilatifard, W. S. Lane, K. W. Jackson, R. C. Conaway and J. W. Conaway, Science 271, 1873 (1996).
166. M. J. Thirman, D. A. Levitan, H. Kobayashi, M. C. Simon and J. D. Rowley, PNAS 91, 12110 (1994).
167. K. Mitani, Y. Kanda, S. Ogawa, T. Tanaka, J. Inazawa, Y. Yazaki and H. Hirai, Blood 85, 2017 (1995).
167a. R. Reeves and M. S. Nissen, JBC 265, 8573 (1990).
168. D. C. Tkachuk, S. Kohler and M. L. Cleary, Cell 71, 691 (1992).
169. Y. Gu, T. Nakamura, H. Alder, R. Prasad, O. Canaani, G. Cimino, C. M. Croce and E. Canaani, Cell 71, 701 (1992).
170. B. D. Yu, J. L. Hess, S. E. Horning, G. A. J. Brown and S. J. Korsmeyer, Nature (London) 378, 505 (1995).
Biochemistry and Molecular Genetics of Cobalamin Biosynthesis
MICHELLE R. RONDON, JODI R. TRZEBIATOWSKI AND JORGE C. ESCALANTE-SEMERENA
Department of Bacteriology
University of Wisconsin-Madison
Madison, Wisconsin 53706-1567

I. Nomenclature of Corrinoids
II. Diversity of Corrinoids
III. Cobalamin-producing Organisms
IV. Cobalamin-dependent Reactions
V. Biochemistry of Cobalamin Synthesis
   A. Biosynthesis of Uroporphyrinogen III
   B. Biosynthesis of Cobalamin
   C. Biosynthesis of the Lower Ligand
   D. Utilization of Exogenous Corrinoids
VI. Molecular Genetics of Cobalamin Synthesis
   A. Genetics of Uroporphyrinogen III Synthesis
   B. Genetics of Cobalamin Synthesis
VII. Regulation of Cobalamin Synthesis
VIII. Concluding Remarks
References
Cobalamin (Cbl) is one of the largest nonpolymeric molecules with biological activity. It belongs to an important group of compounds, called cyclic tetrapyrroles, that includes hemes, chlorophylls, siroheme, and coenzyme F430. These compounds are involved in many important biological processes, including photosynthesis (chlorophylls), respiration (hemes), and methanogenesis (coenzyme F430 and Cbl). Cbl differs from these other molecules in a number of ways, including having one less bridge carbon (between the A and D rings), addition of many C-methyl groups to the ring structure, amidation of six of the carboxyl side chains, containing cobalt as the metal atom, and, except for coenzyme F430, having a more reduced ring structure. Cbl is also unique among tetrapyrroles in having upper and lower ligands attached to the central metal atom. The structure of adenosylcobalamin (Ado-Cbl) is shown in Fig. 1. Cbl is an essential nutrient for humans, although it is synthesized exclusively by prokaryotes.

Abbreviations: Cbl, cobalamin; Cba, cobamide; Me2Bza, 5,6-dimethylbenzimidazole; dAdo, 5'-deoxyadenosyl; uro'gen III, uroporphyrinogen III; AmLev, δ-aminolevulinic acid; Gla, glutamate semialdehyde; PBG, porphobilinogen; pre'U, preuroporphyrinogen; SUMT, uro'gen III methyltransferase; Cbi, cobinamide; cob, cobalamin biosynthetic genes; cbi, cobinamide biosynthetic genes; pdu, 1,2-propanediol utilization genes; AdoMet, S-adenosylmethionine; Nir, ribosyl-nicotinamide (nicotinamide ribonucleoside); ORF, open reading frame; CRP, cAMP receptor protein; P. denitrificans, Pseudomonas denitrificans.
To whom correspondence should be addressed. Email: [email protected].
Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/97 $25.00
[Figure 1: structure of adenosylcobalamin, with the 5'-deoxyadenosine, cobyric acid, cobinamide, aminopropanol, and nucleotide loop portions labeled.]
FIG. 1. Structure of adenosylcobalamin. The names of portions of the molecule discussed in the text are noted in the figure. The corrin ring and the lower ligand are numbered separately. The letters a through g refer to the corrin ring side chains; the letters A through D refer to the pyrrole rings; DMB, 5,6-dimethylbenzimidazole (Me2Bza).
The study of cobalamin synthesis, chemistry, and cobalamin-dependent metabolism has long occupied chemists, biochemists, nutritionists, enzymologists, and, more recently, geneticists and molecular biologists. Cobalamin was first discovered as the factor responsible for curing patients with pernicious anemia, a discovery that earned a Nobel Prize [for a historical review see Folkers (1)]. Determination of the crystal structure of cobalamin was completed by Hodgkin and colleagues, an accomplishment that was also recognized by a Nobel Prize (2, 3). Due to its structural complexity, the total chemical synthesis of Cbl was a major victory in the field of synthetic chemistry (4). Recently, most of the in vivo pathway of Cbl synthesis has been determined in the bacterium Pseudomonas denitrificans, with the isolation of many intermediates, purification of enzymes, and cloning and characterization of most of the genes (5).
In light of what has been learned from research on cobalamin, the study of its biosynthesis can now be used to examine new questions. Because it is synthesized by a wide variety of bacteria and archaea, its biosynthesis provides a good system for examining the evolution of complex biosynthetic pathways. Also, because it is utilized for many different and important cellular processes, it is of interest to examine how the synthesis is regulated in different organisms that require it for different reasons.
The aim of this review is to provide a general introduction to cobalamin synthesis. The reader is encouraged to consult the references provided for details, including reviews on cobalamin-dependent reactions (6), cobalamin synthesis in Propionibacterium freudenreichii and P. denitrificans (5, 7, 8), and several books devoted to cobalamin (9, 10).
I. Nomenclature of Corrinoids
Due to the large number of biosynthetic intermediates and to the natural occurrence of compounds structurally related to, but distinct from, cobalamin, the nomenclature of corrinoids is complex. (For a more complete discussion of this issue, see Refs. 10 and 11.) Briefly, the term corrinoid is a general one, referring to compounds consisting of four reduced pyrrole rings linked at the α positions. Three of these links are formed by methine bridges; the fourth is a direct Cα-Cα bond. Cobamides (Cbas) are corrinoids that contain an upper ligand complexed to the cobalt atom and a lower ligand attached via a nucleotide loop (see Fig. 1). Cobalamin refers to cobamides containing the base 5,6-dimethylbenzimidazole (Me2Bza) (12). The general nomenclature for a cobamide is Coα-(ligand in α position)-Coβ-(ligand in β position)-(corrinoid), where the α ligand is that below the plane of the molecule and the β ligand is that above the plane of the molecule (11). Other terms defining corrinoid structures are defined where appropriate in the text.
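The naming template described above can be sketched as a small string-building helper. This is only an illustration of the Coα/Coβ pattern as stated in the text: the function name, the ligand spellings, and the exact punctuation of the output are assumptions, and the sketch makes no attempt to implement the full IUPAC-IUB rules.

```python
# Sketch of the Coα/Coβ naming template described in the text.
# The function name and output punctuation are assumptions for illustration only.

def cobamide_name(alpha_ligand: str, beta_ligand: str, corrinoid: str = "cobamide") -> str:
    """Assemble a name as Coα-(lower ligand)-Coβ-(upper ligand)-corrinoid."""
    for part in (alpha_ligand, beta_ligand, corrinoid):
        if not part.strip():
            raise ValueError("every component of the name must be non-empty")
    return f"Coα-({alpha_ligand})-Coβ-({beta_ligand})-{corrinoid}"


# Lower (α) ligand: 5,6-dimethylbenzimidazole; upper (β) ligand: 5'-deoxyadenosyl.
name = cobamide_name("5,6-dimethylbenzimidazolyl", "5'-deoxyadenosyl")
print(name)
```

Keeping the α (lower) ligand first and the β (upper) ligand second mirrors the order of the template, so the position of each ligand relative to the plane of the ring can be read directly from the name.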
II. Diversity of Corrinoids
Numerous corrinoids synthesized by archaea and bacteria contain different ligands in the upper and lower positions, although greater diversity is seen in the identity of the lower ligand. These alternative lower ligands comprise two groups, derivatives of benzimidazoles and aromatics (13). Table I lists lower ligands found naturally in bacteria and archaea. The aromatics p-cresol and phenol are unique in that no direct linkage to cobalt may occur, and these ligands are bound via an α-O-glycosidic linkage to the ribose moiety of the nucleotide loop rather than the α-N-glycosidic linkage observed in cobamides containing benzimidazole derivatives (13, 28).
In addition to natural Cbas, a number of Cbas containing different lower ligands have been isolated from cells grown in the presence of an alternative base (29). Many prokaryotes will synthesize a Cba containing the exogenously supplied base as the lower ligand. For example, the Cba isolated from Salmonella typhimurium cells grown in medium supplemented with benzimidazole contained this base as the lower ligand instead of Me2Bza (16). This phenomenon is called guided biosynthesis.
The diversity of the lower ligand raises two questions. First, how are these alternative ligands incorporated into cobamides in the cell? It has been proposed that the phosphoribosyltransferase that transfers the ribose moiety of nicotinic acid mononucleotide (N@ to Me2Bza (see Section V,B) can incorporate different ligands, based on experimental results demonstrating a lack of specificity of this enzyme for its base substrate (18, 30, 31). Additionally, the cobalamin synthase that joins the nucleoside to the rest of the cobamide molecule must recognize its substrate regardless of the identity of the base. The second question is whether different lower ligands affect the function or utilization of cobamides. Natural corrinoids contain a lower ligand, suggesting that the lower ligand does serve some function (13). However, cobamide-dependent enzymes may utilize particular lower ligands more efficiently than others. This idea is supported by observations in some systems that the natural cobamide is preferred, because alternative cobamides added exogenously are modified by the cell into its natural cobamide (18, 24, 28).

Nir is recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (see J. Biol. Chem. 245, 5172, 1970).
TABLE I
LOWER LIGANDS FOUND IN NATURAL COBAMIDES ISOLATED FROM A VARIETY OF MICROORGANISMS
Columns: Lower ligand; Lower ligand structure(a); Representative microorganisms; Ref.
5,6-Dimethylbenzimidazole
5-Methylbenzimidazole
Eubacterium limosum, Pseudomonas denitrificans, Salmonella typhimurium, Acetobacterium woodii, Bacillus megaterium, Clostridium formicoaceticum, Propionibacterium freudenreichii, Desulfobulbus autotrophicum, Desulfobulbus propionicus, Archaeoglobus fulgidus, Propionibacterium arabinosum, Propionigenium modestum, Clostridium tetanomorphum, Methanosarcina barkeri, Methanoplanus limicola, all Methanococcales tested
Adenine
5-Methoxybenzimidazole
Clostridium thermoaceticum
5-Methoxy-6-methylbenzimidazole
Clostridium formicoaceticum, Pelobacter propionicus, all Methanobacteriales tested, Methanolobus tindarius, Methanogenium marisnigri, Methanospirillum hungatii
5-Hydroxybenzimidazole
Phenol
Sporomusa ovata
p-Cresol
Sporomusa ovata
(a) R represents the ribose-phosphate moiety of the cobamide nucleotide loop. R is joined to the lower ligand via either an α-N- or an O-glycosidic linkage.
The upper ligand is also variable and may be an adenosyl moiety (coenzyme B12), a methyl group, a hydroxyl group (hydroxo-cobamide), a cyano group (vitamin B12), or a glutathionyl moiety (GS-Cbl) (32, 33). As discussed below, only the first two are thought to be physiologically functional.
III. Cobamide-producing Organisms
Thus far, cobamides have been found to be synthesized de novo only by archaea and bacteria (34). Genera of archaea shown to synthesize cobamides include Methanosarcina, Methanococcus, Methanobacterium, and Methanobrevibacter. Among bacteria, corrinoid producers are found in many evolutionarily distinct groups, including the gram positives (Eubacterium, Clostridium, Propionibacterium, Bacillus, Streptomyces, and Rhodococcus), the purple bacteria (Chromatium, Rhodobacter, Agrobacterium, Pseudomonas, and Salmonella), and the cyanobacteria (Synechocystis). That the capacity to synthesize cobamide exists in bacteria and archaea supports the idea that its synthesis is an ancient property. In general, eukaryotes are not thought to synthesize cobamides, although some exceptions have been reported (35).
IV. Cobalamin-dependent Reactions
A large number of cobalamin-dependent reactions have been described (6, 36). Enzymes catalyzing these reactions require either 5'-deoxyadenosylcobalamin (dAdo-Cbl) or methylcobalamin (Me-Cbl).
A. dAdo-Cbl-dependent Enzymes
dAdo-Cbl was first discovered as a cofactor required for glutamate metabolism in clostridia (22). Most dAdo-Cbl-dependent enzymes studied to date catalyze intramolecular rearrangements (a general scheme for this reaction is shown in Fig. 2), which are necessary for the catabolism of compounds that serve as carbon/energy sources in prokaryotes. dAdo-Cbl-dependent isomerizations involve cleavage of the C-Co bond of the cofactor to generate an adenosyl radical. This radical abstracts a hydrogen atom from the substrate, forming a substrate radical. Rearrangement of this radical yields the product radical, which recaptures the hydrogen atom from the cofactor to generate the product and regenerate the adenosyl radical (6, 9, 37). dAdo-Cbl-dependent enzymes of this type include ethanolamine ammonia-lyase (EC 4.3.1.7) (38), diol dehydratase (EC 4.2.1.28) (39), glutamate mutase (EC 5.4.99.1) (22), leucine 2,3-aminomutase (EC 5.4.3.7) (40), ornithine mutase (EC 5.4.3.5) (41, 42), lysine mutase (EC 5.4.3.3) (43, 44), methylmalonyl-CoA mutase (EC 5.4.99.2) (45), glycerol dehydratase (EC 4.2.1.30) (46), and 2-methyleneglutarate mutase (EC 5.4.99.4) (47). These enzymes have been studied from a number of sources, including Clostridium sp. (amino acid mutases), enteric bacteria (diol, glycerol dehydratases, and ethanolamine ammonia-lyase), and humans (only methylmalonyl-CoA mutase). A more complete discussion of these enzymes and their sources is found elsewhere (6, 9, 10).

FIG. 2. dAdo-Cbl-dependent rearrangement. The R represents a number of different functional groups, e.g., amino, hydroxyl.

dAdo-Cbl is a cofactor for one class of ribonucleotide reductase (EC 1.17.4.2). This enzyme catalyzes the removal of a hydroxyl group from ribonucleotides to form deoxyribonucleotides, which are essential for DNA synthesis (48-50). There are four classes of ribonucleotide reductases, each utilizing a different mechanism to generate the radical site required for catalysis (50). Class II enzymes use dAdo-Cbl to generate the protein radical directly involved in hydrogen abstraction from the substrate (49). dAdo-Cbl-dependent ribonucleotide reductases have been found in many prokaryotes, including Bacillus megaterium, Rhizobium sp., Thermus aquaticus, cyanobacteria, Propionibacterium sp., Streptomyces aureofaciens, Clostridium sp., and Lactobacillus sp. (48, 50). Interestingly, enteric bacteria, which include Cbl-producing members, appear not to contain the Cbl-dependent ribonucleotide reductase (49, 50). Most eukaryotes are thought not to contain Cbl-dependent ribonucleotide reductase (50), although the ciliate Euglena gracilis contains an Ado-Cbl-dependent enzyme, and thus its growth is dependent on an exogenous supply of Cbl (50).
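The EC numbers listed above already encode how these enzymes are classified, because the first field of an EC number gives the top-level enzyme class. The following Python sketch groups the dAdo-Cbl-dependent enzymes named in this section by that first field. The dictionary and function names are illustrative assumptions; the EC numbers are those quoted in the text, and the class labels are the standard Enzyme Commission top-level names.

```python
from collections import defaultdict

# EC numbers as quoted in the text for dAdo-Cbl-dependent enzymes.
# The variable and function names are hypothetical, chosen for this sketch.
DADO_CBL_ENZYMES = {
    "ethanolamine ammonia-lyase": "4.3.1.7",
    "diol dehydratase": "4.2.1.28",
    "glutamate mutase": "5.4.99.1",
    "leucine 2,3-aminomutase": "5.4.3.7",
    "ornithine mutase": "5.4.3.5",
    "lysine mutase": "5.4.3.3",
    "methylmalonyl-CoA mutase": "5.4.99.2",
    "glycerol dehydratase": "4.2.1.30",
    "2-methyleneglutarate mutase": "5.4.99.4",
    "ribonucleotide reductase (class II)": "1.17.4.2",
}

# Top-level EC classes relevant here (standard Enzyme Commission categories).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


def group_by_ec_class(enzymes: dict) -> dict:
    """Group enzyme names by the top-level class given in the first EC field."""
    groups = defaultdict(list)
    for enzyme_name, ec_number in enzymes.items():
        top_level = int(ec_number.split(".")[0])
        groups[EC_CLASSES[top_level]].append(enzyme_name)
    return dict(groups)


grouped = group_by_ec_class(DADO_CBL_ENZYMES)
for ec_class, names in sorted(grouped.items()):
    print(f"{ec_class}: {', '.join(sorted(names))}")
```

Grouped this way, the mutases fall under the isomerases (EC 5), the dehydratases and ethanolamine ammonia-lyase under the lyases (EC 4), and the ribonucleotide reductase under the oxidoreductases (EC 1).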
B. Me-Cbl-dependent Enzymes
Me-Cbl is one of three coenzymes capable of mediating methyl group transfer, the others being S-adenosylmethionine (AdoMet) and tetrahydrofolate (51, 52). Me-Cbl is required by one class of methionine synthase (EC 2.1.1.13). This enzyme catalyzes the last step in methionine synthesis, the methylation of homocysteine (53). The Cbl-dependent enzyme is the only methionine synthase in humans; however, it is dispensable for organisms synthesizing the Cbl-independent enzyme (e.g., S. typhimurium and Escherichia coli). Other bacteria containing the Cbl-dependent synthase include Rhizobium sp., Streptomyces olivaceus, and Rhodospirillum rubrum (54).
Me-Cbl is also involved in methyltransferase reactions during acetogenesis in both bacteria and archaea and during methanogenesis (55). In the formation of methane from CO2, the methyl group is transferred from methyltetrahydromethanopterin to coenzyme M (β-mercaptoethane sulfonic acid) via a Cbl-containing enzyme. During methanogenesis from methanol or methylamines, an analogous reaction occurs (56, 57). Me-Cbl is utilized as a methyl donor in a similar manner in the acetyl-CoA pathway used by a number of organisms for energy generation (acetogenesis or methanogenesis from acetate) or autotrophic metabolism (CO2 fixation via acetate) (58). In this pathway the methyl group is transferred between the Ni/Fe-S component of carbon monoxide dehydrogenase and tetrahydrofolate (or tetrahydrosarcinapterin) by a Cbl-containing methyltransferase (58, 59).
V. Biochemistry of Cobalamin Synthesis
We divide the synthesis of Ado-Cbl into three sections. The initial steps to uroporphyrinogen III (uro'gen III) are common to all the cyclic tetrapyrrole derivatives. Because this review focuses on Cbl synthesis, these early steps are not discussed in detail. The reader is referred to other references for a more complete discussion of this aspect of tetrapyrrole synthesis (7, 34, 60, 61). Biosynthesis of the corrin ring from uro'gen III ([1] in Fig. 3) to Ado-Cbl is considered next, with a focus on the pathway that occurs in P. denitrificans. Finally, synthesis of the lower ligand is considered separately.
A. Biosynthesis of Uroporphyrinogen III
The synthesis of all tetrapyrrole derivatives begins with the synthesis of δ-aminolevulinic acid (AmLev), which can proceed by two different pathways, the C-5 pathway or the C-4 pathway (also called the Shemin pathway). Both pathways occur among Cbl-producing organisms. The Shemin pathway consists of the formation of aminolevulinate directly by condensation of succinyl-CoA and glycine. This reaction is catalyzed by AmLev synthase (EC 2.3.1.37) (62, 63). The C-4 pathway is used by some bacteria, including P. denitrificans, and by animals and yeasts (34).
The C-5 pathway seems to be more ancient and more widely distributed (34). It begins with an activated tRNA(Glu) molecule produced by glutamyl-tRNA synthetase (EC 6.1.1.17). Glutamyl-tRNA dehydrogenase uses the charged tRNA as a substrate to produce glutamate 1-semialdehyde (Gla) or its cyclized form 2-hydroxy-3-aminotetrahydropyran-1-one (61, 64). Gla is then internally transaminated by Gla aminotransferase (EC 5.4.3.8) to produce AmLev. This pathway occurs in Chlorobium sp., E. coli, S. typhimurium, Eubacterium limosum, and Methanobacterium thermoautotrophicum, as well as in plants and algae.
Porphobilinogen synthase (AmLev dehydratase, EC 4.2.1.24) catalyzes the condensation of two molecules of AmLev to form one porphobilinogen (PBG) molecule (65, 66). The conversion of four molecules of PBG into one molecule of uro'gen III [1] proceeds in two steps. First, PBG deaminase (EC 4.3.1.8) catalyzes the polymerization of four PBG monomers. This enzyme uses a cofactor consisting of a dimer of PBG molecules as the scaffold on which four other PBG molecules are added. Hydrolytic cleavage of the hexamer product by PBG deaminase yields the tetrameric product preuroporphyrinogen (pre'U). Uro'gen III synthase (EC 4.2.1.75) catalyzes formation of [1] from pre'U, although the reaction mechanism is still not understood (34). Uro'gen III [1] is the last common precursor in the biosynthetic pathway of the tetrapyrroles.
FIG. 3. Synthesis of precorrin-4 from uroporphyrinogen III. The enzyme from P. denitrificans responsible for each reaction is given below the arrows. Where the corresponding protein from S. typhimurium has been demonstrated to catalyze a reaction, the name is included in parentheses. Me indicates a methyl group. For simplicity, other substrates and products of the reactions are not shown. [Structures shown: [1] uro'gen III, [2] precorrin-1, [3] precorrin-2, [4] precorrin-3A, [5] precorrin-3B, [6] precorrin-4.]
B. Biosynthesis of Cobalamin
The synthesis of Cbl from [1] has been studied in numerous bacteria, especially P. freudenreichii, S. typhimurium, P. denitrificans, and Clostridium sp. For more complete information on this subject, the reader is referred to the original literature as referenced in the text, and to other reviews (5, 7, 8). The pathway in P. denitrificans is the best understood; thus it serves as a framework for this discussion. The pathway for the aerobic synthesis of Cbl in P. denitrificans is shown in Figs. 3-6. It should be kept in mind that other prokaryotes may synthesize Cbl via modifications of this pathway (e.g., anaerobic Cbl synthesis in P. freudenreichii and S. typhimurium). We have attempted to point out where these differences may occur.
The nomenclature of the Cbl synthetic enzymes is confusing because different gene designations have been given to homologous genes from P. denitrificans and S. typhimurium. We hope that the presentation of this information in Table II will alleviate this problem. Cbl intermediates that do not contain a cobalt atom are referred to as precorrins. The number associated with each precorrin indicates the number of additional C-methyl groups carried by the molecule. Letters discriminate different intermediates with the same number of methyl groups (8). For the purposes of this review, we refer to both Propionibacterium shermanii and P. freudenreichii as P. freudenreichii, because it has been determined that P. shermanii is not a valid species designation (67).
1. PRECORRIN-1 [2] AND PRECORRIN-2 [3]
Modification of [1] begins with methylation at C-2 and C-7 to form dihydrosirohydrochlorin [3]. These modifications were predicted to occur following the isolation of the oxidized form of [3] (factor II) from cells of P. freudenreichii and Clostridium sp. (7). Precorrin-2 [3] is the last common intermediate in corrinoid, siroheme, and coenzyme F430 synthesis.
FIG. 4. Synthesis of hydrogenobyrinic acid from precorrin-4. [Structures shown: [6] precorrin-4, [7] precorrin-5, [10] precorrin-8x, [11] hydrogenobyrinic acid; the first step is labeled CobM (CbiF).]
Addition of the two methyl groups at C-2 and C-7 is catalyzed by one enzyme, S-adenosylmethionine:uroporphyrinogen-III methyltransferase (SUMT; EC 2.1.1.107). SUMT has been purified from several sources. The enzymes from P. denitrificans, B. megaterium, and P. freudenreichii (encoded by cobA) and Methanobacterium ivanovii (encoded by corA) have similar molecular masses (25-29 kDa) (68-71). The enzymes from P. denitrificans and B. megaterium, but not the one from M. ivanovii, exhibit substrate inhibition by [1] (68-70). In contrast, the analogous enzyme from E. coli and S. typhimurium, encoded by cysG, is much larger (52 kDa) and contains another domain (72). Whereas the C-terminal domain of CysG exhibits homology to the enzymes discussed above, CysG has an additional N-terminal domain of approximately 22 kDa. In addition to its SUMT activity, purified CysG protein from E. coli catalyzes the subsequent reactions required for siroheme synthesis, namely, oxidation to sirohydrochlorin and insertion of iron to form siroheme (72, 73). Also, purified CysG can insert cobalt into factor II, thus raising the possibility that CysG might function as a cobaltochelatase in S. typhimurium (72). This was an interesting finding, because no homologs of the cobaltochelatase proteins of P. denitrificans have been found thus far in S. typhimurium (74).
2. PRECORRIN-3A [4]
Methylation of [3] at C-20 by S-adenosylmethionine:precorrin-2 methyltransferase yields [4]. This enzyme has been purified from P. denitrificans (CobI) and S. typhimurium (CbiL) (75, 76). Structures [3] and [4] accumulate both in E. coli cells containing the Cbl synthetic genes from S. typhimurium and in P. freudenreichii cells grown without cobalt, suggesting that cobalt insertion occurs at this point or before in the biosynthetic pathway in these bacteria (7, 77). This may be one instance where the aerobic and anaerobic pathways differ, because cobalt insertion occurs later in the pathway in P. denitrificans.
FIG. 5. Synthesis of Ado-Cbi from hydrogenobyrinic acid. [Structures legible in the source: [15] adenosylcobyric acid, [16] adenosylcobinamide.]
FIG. 6. Synthesis of Ado-Cbl from Ado-Cbi and Me2Bza. [Structures shown: [16] adenosylcobinamide, [17] adenosylcobinamide phosphate, [18] adenosylcobinamide guanosine diphosphate, [19] Me2Bza, [22] adenosylcobalamin; the first steps are labeled CobP (CobU).]
TABLE II
COBALAMIN BIOSYNTHETIC GENES AND THEIR FUNCTIONS(a)
Enzyme(b)                                     Pseudomonas denitrificans gene(c)   Salmonella typhimurium gene(d)
C-2, C-7 methylase (SUMT)(f)                  cobA                                cysG
C-20 methylase (SP2MT)                        cobI                                cbiL(g)
Precorrin-3B synthase                         cobG                                -
C-17 methylase                                cobJ                                cbiH
C-11 methylase                                cobM                                cbiF(g)
C-1 methylase                                 cobF                                -
Precorrin-6A reductase                        cobK                                cbiJ
C-5, C-15 methylase/C-12 decarboxylase        cobL                                cbiE/cbiT
C-11/C-12 mutase                              cobH                                cbiC
Hydrogenobyrinic acid a,c-diamide synthase    cobB                                cbiA(g)
Cobaltochelatase                              cobN, cobS, cobT                    -
Cob(II)yrinic acid a,c-diamide reductase(j)   -                                   -
Cob(I)alamin adenosyltransferase              cobO                                cobA(g)
Cobyric acid synthase                         cobQ                                cbiP
Cobinamide synthase                           cobC, cobD, protein α               cobD, cbiB
Cobinamide kinase/guanylyltransferase         cobP                                cobU(g)
Cobalamin (5'-P) synthase                     cobV                                cobS
Nir:Me2Bza phosphoribosyltransferase          cobU                                cobT
α-Ribazole-5'-P phosphatase                   -                                   cobC(g)
Other organisms [row assignments not recoverable; entries in the order printed]: Rhodococcus sp.; Rhodococcus sp.; Rhodococcus sp.(h); Synechocystis sp.(h); Listeria monocytogenes(i); Synechocystis sp.; Listeria monocytogenes(i); Escherichia coli; Methanococcus voltae; Rhodobacter capsulatus; Synechocystis sp.; Rhodobacter capsulatus; Rhodobacter capsulatus; Listeria monocytogenes(i); Escherichia coli; Rhodobacter capsulatus; Synechocystis sp.; Escherichia coli; Synechocystis sp.; Escherichia coli; Rhodobacter capsulatus; Rhizobium meliloti; Bacillus stearothermophilus; Escherichia coli.
Gene/Ref.(e) [entries in the order printed]: cobK (170); cobL (170); (167); btuR (165); (174); cobQ (172); (167); bluC (172); bluD (172); cobU (166); cobP (172); (167); cobS (166); cobT (166); cobU (172); (173); phpB(k).
(a) Only genes with a demonstrated function in cobalamin biosynthesis are given here. Genes with no demonstrated function are from P. denitrificans (cobW, cobE) and S. typhimurium (cbiD, cbiG, cbiK, cbiM, cbiN, cbiO, cbiQ, cobB).
(b) Enzyme names follow those given in other reviews (5, 8) and are not meant to be systematic.
(c) For sequence analysis of these genes, see the original references: cobABCDE (7.9), cobFGHIJKLM (81, 8.9), cobNOPQW (92), cobST (91), and cobUV (113).
(d) For sequence analysis of these genes, see the following references: cysG (169), cobA (162), cobCD (116), and (74) for the rest.
(e) These genes encode proteins likely to be involved in cobalamin synthesis based on sequence homology; names are assigned only when done so in the reference given. Only genes that are thought to be cobalamin synthetic genes are shown; the homology of P. denitrificans or S. typhimurium cob genes to genes with other known function is discussed in the text.
(f) The list of SUMT homologs is too long to be included here; see the text for a discussion of these proteins.
(g) Salmonella typhimurium genes whose products have been biochemically shown to have the stated function; see text for details.
(h) These genes are similar to cobL.
(i) A. Camilli and D. Portnoy, personal communication.
(j) No gene has been reported to encode this function.
(k) Accession numbers: R. meliloti (S27658); E. coli (U23163).
3. PRECORRIN-3B [5]
The next step in Cbl synthesis in P. denitrificans involves the oxidation of [4] to produce [5]. This is catalyzed by CobG, which appears to be an iron-sulfur protein, and involves hydroxylation at C-20 and lactone formation of ring A acetate to C-1 (78). Oxidation does not cause ring contraction directly, but lays the foundation for CobJ-catalyzed ring contraction (8). CobG was found to be oxygen dependent in vitro (79), suggesting that this step may occur differently in anaerobic Cbl synthesis. No CobG homolog was found in the facultative anaerobe S. typhimurium, suggesting a different mechanism in this organism. CobG is homologous to ferredoxin-nitrite and sulfite reductase proteins, which are siroheme-dependent enzymes (5, 74). It has been suggested that, in anaerobes, this step may occur using a cobalt-containing precorrin-3A as a metallosystem for internal reduction of Co(III) to Co(I) in lieu of an oxygen-dependent two-electron oxidation of [4]. For discussion of this proposed anaerobic mechanism, see Scott (80).
4. PRECORRIN-4 [6]
Ring contraction and AdoMet-dependent methylation at C-17 are both catalyzed by the CobJ enzyme (79, 81). The acetyl group generated from the
ring contraction is found at C-1. Methylation at C-17 in P. freudenreichii occurs after insertion of cobalt, again suggesting early cobalt insertion in this organism (82). The structure of [6] (Fig. 4) was unexpected because ring contraction was not thought to occur this early, and because the acetyl group was found at C-1 instead of the expected C-19 (8).
5. PRECORRIN-5 [7]
The next step involves methylation of [6] at C-11 by the AdoMet-dependent CobM enzyme (81). CobM and CobI are the only two methyltransferases involved in Cbl synthesis that do not have additional activities (5). This reaction is catalyzed by the homologous CbiF protein in S. typhimurium (76). The methyl group at C-11 is later transferred to C-12.
6. PRECORRIN-6A [8]
The formation of [8] involves loss of an acetyl group at C-1 and subsequent AdoMet-dependent methylation at this position. These reactions are catalyzed by CobF, another multifunctional enzyme (5). Acetate was detected as a probable product of the elimination reaction (78). There appears to be no homolog of CobF in S. typhimurium (74), and this reaction has been suggested to occur differently in anaerobic bacteria (80). Identification of [8] was a pivotal step in elucidation of the Cbl synthetic pathway in P. denitrificans (83). An in vitro system had been set up for the synthesis of hydrogenobyrinic acid from precorrin-3. When NADPH was omitted, [8] accumulated. This structure showed that ring contraction and extrusion of C-20 had occurred, and that C-11 was methylated (83, 84).
7. PRECORRIN-6B [9]
The accumulation of [8] in the reaction mix lacking NADPH suggested that the next step involved reduction of the structure to the level of the final product (5, 83). Reduction of the C-18/C-19 double bond requires the CobK enzyme, with reduction occurring at C-19 (5). This reaction was also demonstrated in extracts from P. freudenreichii, showing that reduction is also required in the anaerobic biosynthetic pathway (5). This is further supported by the existence of a homolog to CobK (CbiJ) in S. typhimurium (5). Interestingly, cobK is located among other cob genes in P. denitrificans (in cluster A), but the gene is transcribed from the opposite strand of DNA (85). It is not known whether this has any regulatory significance. The coding sequence for CobK overlaps the N-terminus of the CobL coding sequence (85).
8. PRECORRIN-8x [10]
The AdoMet-dependent CobL protein catalyzes the methylations at C-5 and C-15, and decarboxylation of the acetate side chain at C-12. Methylation
probably occurs before decarboxylation, although the order has not been determined (86). CobL contains a C-terminal domain of approximately 20 kDa in addition to the N-terminal methyltransferase domain. The C-terminal domain probably contains the decarboxylase activity (81). Interestingly, these two functions are carried out by two separate proteins, CbiE and CbiT, respectively, in S. typhimurium (74).
9. HYDROGENOBYRINIC ACID [11]
The final reaction in the synthesis of the corrin ring involves methyl group migration from C-11 to C-12. This reaction, thought to be a suprafacial 1,5-sigmatropic reaction, is catalyzed by the CobH protein (87). This small protein (20 kDa) is thought to act as a monomer. Rearrangement to [11] allows the double bonds to move into conjugation (8).
10. HYDROGENOBYRINIC ACID a,c-DIAMIDE [12]
Because [12] (Fig. 5) had been detected as the most elaborate cobalt-free corrinoid isolated from P. denitrificans, it seemed likely that a,c-amidation was the next step in the pathway. These amidation reactions are catalyzed by CobB (5, 88). It appears that the c side chain is amidated first, followed by the a side chain, both reactions requiring one ATP and using glutamine as the amine donor (88). The descobalto compounds are much better substrates for CobB than the corresponding cobalt-containing compounds, supporting the idea that cobalt insertion in vivo occurs after this step in P. denitrificans. In S. typhimurium, cobyrinic acid is amidated to cobyrinic acid a,c-diamide [13] by CbiA (the S. typhimurium homolog of CobB), but hydrogenobyrinic acid [11] is not (89).
11. COBYRINIC ACID a,c-DIAMIDE [13]
As mentioned, [11] had been shown to accumulate in vivo in P. denitrificans, along with the corresponding diamide (5). This suggested that cobalt insertion does not occur in this organism until corrin ring synthesis is complete. Hydrogenobyrinic acid a,c-diamide [12] was found to be the substrate for the cobalt-inserting enzyme, cobaltochelatase, in an ATP-requiring reaction, producing [13] as the product (90). Cobaltochelatase was shown to be a complex enzyme, consisting of CobN (140 kDa) and a complex of CobS (38 kDa) and CobT (70 kDa) (90-92). Cobaltochelatase shares a number of properties with magnesium chelatase, the enzyme that inserts magnesium into protoporphyrin IX, the committed step in the synthesis of chlorophylls (5, 90, 93, 94). Both systems require ATP and are composed of three subunits of similar masses. CobN is homologous to BchH, the 140-kDa Mg-chelatase subunit, and it is likely that these proteins bind the tetrapyrrole substrates. No homology is seen between
the 70-kDa subunits (CobT and BchD) or the 38-kDa subunits (CobS and BchI), although the latter both have an ATP-binding site consensus (94).
12. ADENOSYLCOBYRINIC ACID a,c-DIAMIDE [14]
Adenosylation of corrinoids requires that the cobalt atom be in the +1 oxidation state; thus, reduction of [13] is required prior to adenosylation. An NADH-dependent flavoprotein (EC 1.6.99.9) with this activity has been purified from P. denitrificans (95). The enzyme functions as a dimer, reducing Co(III) to Co(I) in a number of substrates, including cobyric acid and Cbl (95). The N-terminal amino acid sequence of the protein showed no homology to any of the previously identified cob genes of P. denitrificans. The adenosylating enzymes [cob(I)alamin adenosyltransferase, EC 2.5.1.17] encoded by cobO in P. denitrificans and by cobA in S. typhimurium have been purified (96, 97). The enzyme from P. denitrificans adenosylates a variety of compounds, but because [12] accumulates in cobO mutants, it was suggested that this is the in vivo substrate (96). The in vivo biosynthetic intermediate that is the substrate for CobA in S. typhimurium is not yet known. Genetic evidence supports a role for CobA in de novo corrin ring biosynthesis in S. typhimurium (98).
13. ADENOSYLCOBYRIC ACID [15]
The b, d, e, and g amidation of [14] is catalyzed by CobQ (99). In vitro, CobQ requires adenosylated substrates, glutamine, and ATP, and it appears that the amidations occur in a specific sequence.
14. ADENOSYLCOBINAMIDE [16]
In P. denitrificans, attachment of 1-amino-2-propanol (AP) to [15] to form [16] requires a complex of the CobC and CobD proteins and a 38-kDa protein called α (5). This system is specific for adenosylcobyric acid and requires ATP and AP. CobC shows homology to aminotransferase proteins (5). The source of the AP moiety of Cbl has not been determined. Early studies using Clostridium tetanomorphum, P. freudenreichii, or Streptomyces griseus extracts suggested threonine as the source of AP (100-103). Labeling studies performed in Methanobacterium thermoautotrophicum (Marburg strain) suggest that the source of AP may be pyruvate (104). In P. denitrificans and S. typhimurium, AP-correctable mutants have been isolated, although the biochemical basis for their phenotype has not been determined (5, 75, 105).
15. COBINAMIDE PHOSPHATE, COBINAMIDE-GDP
Cobinamide phosphate (Cbi-P, [17]) and cobinamide guanosine diphosphate (Cbi-GDP, [18]) (Fig. 6) had been isolated from cells of Nocardia rugosa (106). Subsequently, enzymatic synthesis of these two compounds was accomplished in vitro using cell-free extracts of P. freudenreichii (107, 108). These studies suggested that the synthesis of Cbl proceeded via [17] and [18] intermediates. Recently a bifunctional enzyme having both Cbi kinase and Cbi-P guanylyltransferase activity was purified from P. denitrificans (CobP) and S. typhimurium (CobU) (109, 110). In vitro studies of the enzymes show that ATP or GTP can serve as the phosphate donor for the kinase reaction, and that transferase activity is specific for GTP (109, 110). Ado-Cbi was the preferred substrate for the enzymes. Unlike CobP, CobU shows significant oxygen sensitivity (109, 110). Genetic evidence suggests that the affinity of CobU for its corrinoid substrate is altered by the presence of oxygen (98, 111). Recent studies of CobU suggest that the guanylylation reaction proceeds via a CobU-GMP intermediate (110).
16. 5,6-DIMETHYLBENZIMIDAZOLE RIBONUCLEOTIDE [20]
Studies on the formation of nucleotide derivatives of Me2Bza [19] in P. freudenreichii led to the partial purification of a phosphoribosyltransferase enzyme (EC 2.4.2.21) catalyzing the formation of [20] (α-ribazole-5'-P, N1-α-D-ribofuranosyl-Me2Bza 5'-phosphate) and nicotinate from [19] and ribosylnicotinamide (Nir) (30, 112). This enzyme was shown to utilize a variety of substituted benzimidazoles as substrates, suggesting a low specificity for this moiety. Subsequently, similar enzymes were purified from Clostridium sticklandii (partially), P. denitrificans (CobU), and S. typhimurium (CobT) (31, 113, 114; J. R. Trzebiatowski and J. C. Escalante-Semerena, unpublished). In all cases, the product of the reaction was shown to be α-ribazole-5'-P.
17. ADO-CBL [22]
Synthesis of [22] from [18] and [20] could occur by two pathways, depending on whether the 5'-phosphate is cleaved from the nucleotide before or after incorporation into [22]. Cbl-5'-P was isolated from cells of P. freudenreichii, suggesting that this was an intermediate in Cbl synthesis. A very labile phosphatase activity capable of converting this intermediate to Cbl was detected (115). A gene from S. typhimurium encoding a phosphatase (CobC) that can dephosphorylate [20] has been identified and sequenced (116). A similar activity has been reported in P. denitrificans and P. freudenreichii (112, 113), although the gene encoding this activity has not been identified in these bacteria. This suggests that α-ribazole [21] might be the relevant intermediate in Cbl synthesis in S. typhimurium. The ability of the CobC enzyme from S. typhimurium to dephosphorylate Cbl-5'-P has not been tested. An enzyme with Cbl synthase activity (CobV) has been partially purified from P. denitrificans (113). The preparation showed both Cbl and Cbl-5'-P synthase activity. Although true substrate specificity has not been determined
because the enzyme remained associated with a high-molecular-weight complex, it was suggested that [21] is the true intermediate because this compound was more abundant in cells of P. denitrificans than the nucleotide.
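The corrin ring steps described above for P. denitrificans, from uro'gen III [1] through cobalt insertion [13], can be summarized as an ordered list. The Python sketch below is only an illustration: the names `Step`, `AEROBIC_PATHWAY`, and `enzymes_between` are hypothetical, and the entries simply restate the structure numbers and proteins given in the text of this section.

```python
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    substrate: int            # bracketed structure number of the substrate
    product: int              # bracketed structure number of the product
    product_name: str
    enzymes: Tuple[str, ...]  # P. denitrificans proteins named in the text


# Precorrin-1 [2] is folded into the first step because a single enzyme
# (SUMT) carries out both the C-2 and C-7 methylations.
AEROBIC_PATHWAY = (
    Step(1, 3, "precorrin-2", ("CobA",)),
    Step(3, 4, "precorrin-3A", ("CobI",)),
    Step(4, 5, "precorrin-3B", ("CobG",)),
    Step(5, 6, "precorrin-4", ("CobJ",)),
    Step(6, 7, "precorrin-5", ("CobM",)),
    Step(7, 8, "precorrin-6A", ("CobF",)),
    Step(8, 9, "precorrin-6B", ("CobK",)),
    Step(9, 10, "precorrin-8x", ("CobL",)),
    Step(10, 11, "hydrogenobyrinic acid", ("CobH",)),
    Step(11, 12, "hydrogenobyrinic acid a,c-diamide", ("CobB",)),
    Step(12, 13, "cobyrinic acid a,c-diamide", ("CobN", "CobS", "CobT")),
)


def enzymes_between(start: int, end: int) -> List[str]:
    """List the proteins needed to convert structure `start` into `end`."""
    names: List[str] = []
    collecting = False
    for step in AEROBIC_PATHWAY:
        if step.substrate == start:
            collecting = True
        if collecting:
            names.extend(step.enzymes)
            if step.product == end:
                return names
    raise ValueError(f"no forward route from [{start}] to [{end}]")
```

For example, `enzymes_between(3, 6)` walks from precorrin-2 to precorrin-4 and collects CobI, CobG, and CobJ, the three proteins discussed in subsections 2 through 4.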
C. Biosynthesis of the Lower Ligand
Both aerobic and anaerobic pathways have been proposed for synthesis of [19]. In the aerobic pathway, the aerobes B. megaterium, Nocardia rugosa, and Streptomyces sp. (ATCC 11071), and the aerotolerant anaerobe P. freudenreichii (117), appear to derive [19] from riboflavin via flavin mononucleotide (FMN) in an oxygen-dependent pathway as demonstrated by radiolabel tracing and NMR spectroscopy studies (118-120). As shown in Fig. 7, both the dimethylbenzene moiety and the C-1' of the ribityl side chain of riboflavin are incorporated into [19]. Interestingly, P. freudenreichii appears to synthesize the corrin ring via an "anaerobic" pathway, while synthesizing [19] via an "aerobic" pathway.
FIG. 7. Aerobic synthesis of Me2Bza from riboflavin. The C-1' of the ribityl side chain of riboflavin becomes C-2 of Me2Bza. An eight-step reaction sequence for Me2Bza synthesis from FMN has been proposed (120). [Scheme: riboflavin, FMN, then 5,6-dimethylbenzimidazole [19], with O2 required.]
In the anaerobic pathway, the anaerobic prokaryotes studied to date synthesize [19] from a variety of building blocks. For example, in E. limosum, glycine is incorporated into N-1, C-3a, and C-7a (121); methionine provides the methyl groups at positions 5 and 6 (14); erythrose-4-P becomes C-4, C-5, C-6, and C-7 (122, 123); formate is the precursor of C-2 (123); and the amide of glutamine becomes N-3 (124). Incorporation of glycine and the acquisition of methyl groups from methionine have also been demonstrated in the anaerobe Clostridium barkeri, further supporting the existence of this pathway in anaerobes (117).
There is evidence suggesting that in anaerobic prokaryotes there exists a common pathway for the synthesis of [19] and other benzimidazole derivatives. The anaerobes C. thermoaceticum, which synthesizes 5-methoxybenzimidazolyl-Cba, and M. barkeri, which synthesizes 5-hydroxybenzimidazolyl-Cba, have both been shown to derive their lower ligands from glycine (117, 125). In addition, E. limosum can transform 5-hydroxybenzimidazole and 5-methoxy-6-methylbenzimidazole into [19], suggesting that these bases not only serve as lower ligands but are also intermediates in anaerobic Me2Bza biosynthesis (126). It has been proposed that the early steps in anaerobic biosynthesis of [19] prior to the synthesis of 5-hydroxybenzimidazole and 5-methoxy-6-methylbenzimidazole may be shared with the purine biosynthetic pathway, because both pathways utilize formate, glycine, and the glutamine amide as building blocks (123, 126). The pathways would not be identical because purines have a pyrimidine ring instead of a benzene ring, and purine nucleotides contain a ribose moiety in a β-N-glycosidic linkage with the N from glutamine, whereas [19] is linked via an α-N-glycosidic linkage with the N from glycine (123).
To date there is no genetic support for either proposed Me2Bza biosynthetic pathway. As far as we know, only in S. typhimurium have mutations been isolated that render the cell unable to make Cbl unless provided with Me2Bza (127). Interestingly, all of these mutations map within a single gene, cobT (114, 128). This is somewhat paradoxical because it has been shown that cobT encodes the Nir:Me2Bza phosphoribosyltransferase that synthesizes [20]. Given the phenotype of cobT mutants, it is possible that in addition to phosphoribosyltransferase activity CobT may also be involved in Me2Bza biosynthesis, although other explanations have been proposed (114). CobT-dependent Me2Bza biosynthetic activity has not been demonstrated in vitro.
The radiolabel tracing studies performed for both the aerobic and the anaerobic Me2Bza biosynthetic pathways indicate that the biosynthetic precursors are incorporated into [19] in a regiospecific manner. In response to this observation it has been suggested that Cbl biosynthesis involves a multienzyme complex in which the Me2Bza biosynthetic enzymes are in close association with the next enzyme in the pathway, the Nir:Me2Bza phosphoribosyltransferase (123).
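The regiospecific labeling pattern reported for E. limosum in the anaerobic pathway can be written down as a small lookup table, which makes it easy to confirm that every ring atom of the benzimidazole has an assigned precursor. This is a minimal Python sketch; the names `ATOM_ORIGINS`, `RING_ATOMS`, `precursor_of`, and `unassigned_ring_atoms` are hypothetical, and the assignments are only those stated in the text of this section.

```python
# Precursor -> positions of 5,6-dimethylbenzimidazole [19] that it supplies
# in E. limosum, as stated in the text of this section.
ATOM_ORIGINS = {
    "glycine": ("N-1", "C-3a", "C-7a"),
    "methionine": ("5-methyl", "6-methyl"),
    "erythrose-4-P": ("C-4", "C-5", "C-6", "C-7"),
    "formate": ("C-2",),
    "glutamine amide": ("N-3",),
}

# Standard ring numbering of benzimidazole.
RING_ATOMS = {"N-1", "C-2", "N-3", "C-3a", "C-4", "C-5", "C-6", "C-7", "C-7a"}


def precursor_of(position):
    """Return the precursor reported to supply `position`, or None."""
    for precursor, positions in ATOM_ORIGINS.items():
        if position in positions:
            return precursor
    return None


def unassigned_ring_atoms():
    """Return ring atoms for which the table lists no precursor."""
    assigned = {p for positions in ATOM_ORIGINS.values() for p in positions}
    return RING_ATOMS - assigned
```

With the assignments above, `unassigned_ring_atoms()` returns an empty set, so the five precursors listed account for all nine ring atoms as well as both methyl groups.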
D. Utilization of Exogenous Corrinoids
The coenzymatic forms of cobalamin are either adenosyl- or methylcobalamin. However, the vitamin form, cyanocobalamin, can be utilized efficiently when provided exogenously. Therefore, cells must have ways to exchange upper ligands. Little is known about how this is achieved. Conversion of CN-Cbl to Ado-Cbl probably requires the same enzymes as those needed for reduction and adenosylation of the biosynthetic intermediates. It has been shown, in vitro, that the reductase from P. denitrificans
and the adenosyltransferases from P. denitrificans and S. typhimurium can utilize CN-Cbl as substrate, albeit less efficiently (95-97).
Conversion of CN-Cbl to Me-Cbl, the form of cobalamin that predominates in human cells, has not been investigated in detail, although GS-Cbl may be an intermediate (33, 129). It was suggested that CN-Cbl β-ligand transferase produces GS-Cbl from CN-Cbl, which is then reduced to cob(II)alamin by cob(II)alamin reductase (129). Although AdoMet can methylate cob(I)alamin chemically, it is not known if it is the methyl donor in vivo (32). Me-Cbl is formed by various reactions in methanogenic archaea, and in bacteria and archaea containing Me-Cbl-dependent acetate metabolism pathways (58, 130). Because synthesis of Cbl in these organisms has not been investigated, it is not known at what point the corrinoid initially becomes methylated, nor whether the biosynthetic intermediates are adenosylated or methylated.
Synthesis of enzyme-bound Me-Cbl has been demonstrated in the methionine synthase system. Two proteins that reduce the cobalt ion from Co(II) to Co(I) generate the substrate suitable for methylation by AdoMet to produce enzyme-bound Me-Cbl (131, 132). It has been suggested that other Cbl-dependent enzymes also have mechanisms for regeneration of enzyme-bound Cbls (133).
VI. Molecular Genetics of Cobalamin Synthesis
A. Genetics of Uroporphyrinogen III Synthesis
Genetic analysis of the genes and enzymes required for the synthesis of [1] has been performed in a number of organisms, mainly in relation to their role in heme and chlorophyll synthesis. The genetics of this pathway will be reviewed only briefly here. Comprehensive reviews have been given previously (7, 60, 61).
The gene encoding the first enzyme in the aminolevulinate biosynthetic pathway is called hemA, even though this designation refers to two different enzymes. In organisms utilizing the C-4 pathway, hemA encodes AmLev synthase (134-138). In Rhodobacter sphaeroides, there appear to be two genes encoding this enzyme; the second has been named hemT (139). In organisms using the C-5 pathway, hemA encodes glutamyl-tRNA dehydrogenase (140). The hemL gene encodes Gla aminotransferase (141, 142). The gene for AmLev dehydratase has been named hemB, and PBG deaminase is encoded by hemC. The gene encoding the enzyme that catalyzes the last step in the common pathway, uro'gen III synthase, is called hemD. These genes have been cloned from numerous organisms, including E. coli, B. subtilis, yeast, humans, and rats (143-155).
In E. coli and S. typhimurium, the hem genes appear to be unlinked, with the exception of hemCD, which are thought to form an operon (61, 156, 157). In S. typhimurium, insertions in the hemA gene are lethal, due to transcriptional polarity onto the prf gene, an essential gene encoding peptide release factor, located in the same operon and downstream of hemA (158). In B. subtilis, at least some genes appear to form an operon including hemA, C, D, B, and L (148). A similar operon organization has been found in Clostridium josui (159). [Regulation of hem gene expression and enzyme activity is beyond the scope of this review. The reader is referred to several books containing extensive information on this subject (60, 64, 160).]
B. Genetics of Cobalamin Synthesis
A large number of genes encoding proteins required for Cbl synthesis in P. denitrificans and S. typhimurium have been cloned and sequenced. In both cases, the genes were identified by complementation of mutants unable to synthesize Ado-Cbl (15, 74). In addition, a number of cob (for Cbl biosynthesis) sequences from other organisms have been reported, and continue to appear in the databases. A summary of cob genes and their functions is found in Table II.
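Because the same letter designation can refer to different functions in the two organisms, a small lookup may help when reading the sections that follow. The Python sketch below is illustrative only: `PD_TO_ST`, `st_counterparts`, and `pd_counterpart` are hypothetical names, and the pairs are limited to those stated explicitly in the text of Section V.

```python
# P. denitrificans protein -> S. typhimurium counterpart(s), restricted to
# pairs stated explicitly in the text of Section V.
PD_TO_ST = {
    "CobA": ("CysG",),          # SUMT; CysG has an additional N-terminal domain
    "CobI": ("CbiL",),          # C-20 methylation
    "CobM": ("CbiF",),          # C-11 methylation
    "CobK": ("CbiJ",),          # precorrin-6A reduction
    "CobL": ("CbiE", "CbiT"),   # two separate proteins in S. typhimurium
    "CobB": ("CbiA",),          # a,c-diamide synthesis
    "CobO": ("CobA",),          # corrinoid adenosyltransferase
    "CobP": ("CobU",),          # Cbi kinase / Cbi-P guanylyltransferase
    "CobU": ("CobT",),          # Nir:Me2Bza phosphoribosyltransferase
}


def st_counterparts(pd_name):
    """S. typhimurium counterpart(s) of a P. denitrificans protein."""
    return PD_TO_ST.get(pd_name, ())


def pd_counterpart(st_name):
    """P. denitrificans counterpart of an S. typhimurium protein, or None."""
    for pd_name, st_names in PD_TO_ST.items():
        if st_name in st_names:
            return pd_name
    return None
```

The lookup makes the naming clash explicit: `st_counterparts("CobU")` gives CobT, whereas `pd_counterpart("CobU")` gives CobP, so "CobU" denotes the phosphoribosyltransferase in P. denitrificans but the kinase/guanylyltransferase in S. typhimurium.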
1. Ado-Cbl BIOSYNTHETIC GENES OF Salmonella typhimurium

The cob operon of S. typhimurium contains 17 cbi and 3 cob genes thought to be involved in Ado-Cbl synthesis (74). In most cases, function was assigned based on homology to P. denitrificans cob genes. The cob operon is located adjacent to and is transcribed divergently from the pdu operon, which encodes enzymes required for the Ado-Cbl-dependent utilization of 1,2-propanediol as a carbon and energy source (161). This organization has regulatory significance (see Section VII,A).

A number of cbi/cob genes located in the cob operon are without homologs in P. denitrificans. These are cbiD, cbiK, cbiM, cbiN, cbiO, and cbiQ. The first three of these genes encode proteins of unknown function, although cbiD mutants were found to be unable to synthesize Cbi (73), suggesting that cbiD encodes a function required for Cbl synthesis. cbiN, cbiO, and cbiQ may encode elements of a cobalt transport system. This idea is based on the homology of CbiO to known transport proteins and the fact that the cbiN and cbiQ gene products appear to be membrane-spanning proteins (74). Recent analysis of S. typhimurium cob gene expression in E. coli suggests that cbiM, N, O, and Q may not be essential for Cbl synthesis under laboratory conditions (77).

In S. typhimurium, an additional five genes involved in Ado-Cbl synthesis, located away from the main cob cluster at minute 41, have been characterized. These are cobA (encoding corrinoid adenosyltransferase), cobB [encoding a putative phosphoribosyltransferase (A. Tsang and J. C. Escalante-Semerena, unpublished)], cobC (encoding α-ribazole-5'-phosphatase), cobD (homologous to cobC of P. denitrificans), and cysG (116, 162, 163). Two genes, cobD and cobC, are located adjacent to one another and are transcribed divergently, whereas the other genes are not located near other cob genes. DNA sequence analysis suggests that there may be overlap between the regulatory regions of cobC and cobD (116). The CobC protein is homologous to prokaryotic phosphoglucomutases and acid phosphatases and to eukaryotic fructose 2,6-bisphosphatase enzymes (116).

2. Ado-Cbl BIOSYNTHETIC GENES OF Pseudomonas denitrificans
A large number of cob genes from P. denitrificans have been sequenced. They are located at four loci on the chromosome (15). This organization contrasts with that seen in S. typhimurium. These genes and their functions are described in Table II. In most cases, the function of the gene product has been demonstrated biochemically. The two exceptions to this are cobE and cobW, to which no function has been assigned, although they are essential for Cbl synthesis. The cobE gene has homology to the cbiG gene from S. typhimurium (77).

In P. denitrificans, there exist a number of genes without homologs in S. typhimurium. These are cobF (encoding C-1 methyltransferase), cobG (encoding precorrin-3B synthase), cobN, S, and T (encoding the subunits of cobaltochelatase), and cobW. The cobW gene product is thought to contain both NAD(H) and ATP binding sites, and shows homology to a protein required for nitrile hydratase activity in Rhodococcus (5, 164). These genes may represent steps where there are biochemical differences between the pathways in the two organisms.

Additionally, several biochemical activities involved in Ado-Cbl synthesis in P. denitrificans have been described for which no gene has been identified. These are cobyrinic acid a,c-diamide reductase and protein α of the cobinamide synthase complex (5). N-Terminal amino acid sequencing of the purified proteins showed that the genes encoding these functions are not located in the previously identified cob gene clusters.

3. cob GENES OF Escherichia coli

Some Cbl biosynthetic genes from E. coli have been described. This bacterium contains genes homologous to the cobA and cysG genes of S. typhimurium, as well as to cobUST and cobC. The homolog to cobA has been named btuR (165). These results correlate with the ability of E. coli to make siroheme (requiring CysG function), to convert nonadenosylated corrinoids to Ado-Cbl (requiring BtuR function), and to synthesize Cbl if Cbi is provided (requiring CobUST function) (166).
Escherichia coli appears not to have any genes required for synthesis of the corrin macrocycle beyond CysG (166).

4. OTHER ENTERIC BACTERIA

A number of enteric bacteria besides S. typhimurium synthesize Cbl (133). These bacteria, including Klebsiella, Citrobacter, Enterobacter, and some species of Escherichia (not E. coli), can synthesize Cbl under aerobic and anaerobic growth conditions (133). This suggests that the Cbl synthetic pathway in these organisms differs from the pathway utilized by S. typhimurium, which can synthesize Ado-Cbl de novo only under anaerobic growth conditions (127). In fact, DNA probes from the S. typhimurium cob gene cluster did not hybridize to DNA from other enteric bacteria (133). Therefore, S. typhimurium does not appear to be representative of the enteric bacteria in this regard. The origins of the S. typhimurium Cbl synthetic genes are unknown (133) (but see Section VI,B,7). The conclusion appears to be that there are two different Cbl synthetic pathways in the enteric bacteria (133).
5. cob GENES OF Synechocystis sp.

Sequence analysis of 1 Mb of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803 has identified 818 open reading frames (ORFs), 488 of which were homologous to other proteins or hypothetical proteins (167). A number of genes homologous to cob genes were identified. Assignment of these genes as cob genes is based solely on sequence homology; thus the results are tentative. However, some interesting points can be made. First, the genes (10 total) in general do not appear to be clustered on the chromosome. Second, this bacterium appears to have homologs to cob genes that have thus far been found only in P. denitrificans or S. typhimurium (i.e., cbiM and cobW). Because this organism is phylogenetically distant from many other Cbl-synthesizing bacteria (168), analysis of Cbl synthesis in this organism should prove interesting.

6. OTHER ORGANISMS

Although Cbl synthesis has been described and studied in a number of organisms, the Cbl synthetic genes have not been investigated. However, homologs to cob genes have been found in other organisms (see Table II). These genes have generally been identified by sequence homology. Cbl synthetic genes have been identified from Rhizobium meliloti, Rhodococcus sp., B. megaterium, P. freudenreichii, M. ivanovii, Methanococcus voltae, Listeria monocytogenes, Synechocystis sp., and Rhodobacter capsulatus (see Table II). These organisms are scattered across the phylogenetic tree of prokaryotes (168), and further study of the Cbl synthetic genes from diverse organisms should provide insight into the different pathways for Cbl synthesis.
7. cob GENE ORGANIZATION

Although the individual genes for Cbl synthesis are for the most part conserved among various prokaryotes, the gene organization does not appear to be conserved. This is shown schematically in Fig. 8, which illustrates the organization of cob genes from some bacteria. For example, the cobV and cobU genes of P. denitrificans are adjacent to one another but are transcribed divergently, whereas in Rhodobacter capsulatus they are cotranscribed. In S. typhimurium, these two genes are also cotranscribed (cobS and cobT), but the order of the two genes is switched.

Analysis of gene organization can provide clues as to the origin of genes as well. The cbi genes of S. typhimurium do not appear to be closely related to the cobamide synthetic genes from other enteric bacteria (133), and it is currently thought that S. typhimurium acquired these genes by horizontal transfer. Listeria monocytogenes contains cbi genes organized in the same order as those in S. typhimurium (see Fig. 8), thus raising the possibility that the origin of the cbi genes of S. typhimurium may be from a relative of L. monocytogenes (e.g., a Gram-positive bacterium).

[Figure 8: gene maps for S. typhimurium, P. denitrificans, L. monocytogenes, Rhodococcus sp., R. capsulatus, and E. coli.]

FIG. 8. Organization of the cob genes. Each gene is designated by a rectangle. The pattern in the rectangles indicates homology between genes from different bacteria. Rectangles with no patterns are those with no homology to other cob genes. Arrows indicate the direction of transcription of operons. The cobL gene of P. denitrificans and Rhodococcus sp. has two patterns to indicate its homology with two S. typhimurium genes, cbiE and cbiT. Only organisms with multiple known cob genes are shown here, the exception being Synechocystis sp. Genes that are shown as individual units are not thought to be in an operon with other cob genes.

8. ANALYSIS OF GENES ENCODING SUMT ACTIVITY

The Cbl biosynthetic protein for which the most nucleotide sequences are available is SUMT (CobA/CysG). This enzyme is responsible for the committed step in the Cbl/siroheme/F430 branch of the pathway. Analysis of genes encoding this activity can provide insight into the importance and origin of the Cbl pathway in different prokaryotes. There appear to be several different types of genes encoding SUMT and siroheme synthase activity. This is shown schematically in Fig. 9.

Most of the sequences are of the type represented by P. denitrificans. Single-domain SUMT genes have been found in M. ivanovii (69), P. freudenreichii (71), Bacillus stearothermophilus (173), Anacystis nidulans (174), Vibrio anguillarum (175), Pseudomonas fluorescens (176), B. megaterium (70), and Paracoccus denitrificans (GenBank Accession number U05002). This protein, called CobA or CorA, contains only one domain, that with SUMT activity. A second type of protein is encoded by the cysG gene of E. coli and S. typhimurium. This protein contains a C-terminal SUMT domain attached to another domain catalyzing siroheme synthesis (72, 73, 163, 169). CysG contains the functions required for synthesis of siroheme from uro'gen III. A third type of protein is found in the anaerobic bacterium C. josui (159), and a sequence from Synechocystis has also been reported (167). In these bacteria, the SUMT domain is found within yet another multifunctional protein, called HemD. In this case, however, the other region has uro'gen III synthase (HemD) activity, which catalyzes the step preceding SUMT in the biosynthetic pathway, namely, the formation of uro'gen III. Interestingly, this gene is found in a porphyrin biosynthetic gene cluster, which also includes another bifunctional protein containing siroheme synthase activity and glutamyl tRNA reductase (HemA) activity (167).

Thus it appears that the SUMT, siroheme synthase, HemA, and HemD protein domains can function when present as different mono- or bifunctional proteins. The modular nature of these proteins is also evident in the genetic organization of the genes encoding these functions. For example, in A. nidulans the gene encoding the monodomain SUMT protein is found immediately upstream of a gene encoding HemD, thus suggesting how the bifunctional SUMT-HemD protein may have originated (174).

[Figure 9: domain diagrams for CobA (P. denitrificans, etc.), CysG (S. typhimurium, E. coli), HemA (C. josui), and HemD (C. josui, Synechocystis sp.).]

FIG. 9. Structure of CobA/CysG-type proteins. The functional domains are shown as rectangles. SS indicates siroheme synthase activity. See text for details.

The sequences encoding SUMT and siroheme synthase activity are often present adjacent to genes encoding functionally related proteins. In E. coli, S. typhimurium, and Paracoccus denitrificans, the SUMT-encoding gene is present adjacent to genes required for nitrite reductase activity, one function that requires siroheme as a cofactor (163, 169). In other cases, such as C. josui and A. nidulans, SUMT-encoding genes are found adjacent to porphyrin biosynthetic genes (159, 174). Finally, the SUMT-encoding gene can be adjacent to cobalamin synthetic genes, as is the case in P. denitrificans (near cobBCDE), P. freudenreichii (near a putative cbiO homolog), and B. stearothermophilus (near a putative P. denitrificans cobU homolog). Given the complexity of the Cbl biosynthetic pathway, and the phylogenetic breadth represented among Cbl producers, analysis of this type may prove to be a useful tool to trace biochemical pathways in time, and to understand their importance to the cells that produce them.

9. HOMOLOGY AMONG cob GENES
A number of Cob proteins show homology among themselves. The Ado-Met-dependent methyltransferases from S. typhimurium and P. denitrificans are more closely related to one another than to other proteins, and were suggested to have diverged from a putative common ancestor prior to the divergence of the two bacteria. This is discussed more thoroughly elsewhere (74). Interestingly, S. typhimurium appears to lack a homolog to one of these methyltransferases (cobF). Additionally, the proteins catalyzing amidation of the corrin ring (CobB and CobQ in P. denitrificans) are homologous to one another (5).
VII. Regulation of Cobalamin Synthesis

Thus far, regulation of Cbl synthesis has been studied only in S. typhimurium. Given that the synthesis of Cbl by this organism displays distinctive features, it seems likely that regulation of Cbl synthesis in other systems will differ from the S. typhimurium model.
A. Salmonella typhimurium

Regulation of the Cbl synthetic operon (cob operon) has been studied using Mud-lac operon fusions (177). The major factors affecting cob expression are 1,2-propanediol and dAdo-Cbl, although these molecules exert their effect at different stages in cob expression (178-184). The presence of 1,2-propanediol causes activation of cob (and pdu) transcription under aerobic and anaerobic conditions (178, 179), whereas Ado-Cbl causes repression of cob expression, probably due to a posttranscriptional mechanism (185, 186). In addition, global control of cob transcription has been demonstrated, suggesting that cob expression responds to anaerobiosis and carbon source (178-180, 182, 183). Figure 10 summarizes the factors affecting transcription of the cob/pdu regulon. These regulatory mechanisms appear to affect only the cob operon. The cob genes that map elsewhere (cobABCD) are not regulated by those factors affecting expression of the main cob operon (105, 162; A. Tsang and J. C. Escalante-Semerena, unpublished).

FIG. 10. Regulation of cob and pdu transcription by PocR and global factors. For illustration purposes, PocR is shown as a dimer activated by 1,2-propanediol (1,2-PDL) prior to DNA binding; however, it has not been demonstrated that PocR-dependent transcription activation occurs in this manner. The size of the genes and operons is not to scale. The cob and pdu operons are indicated by one arrow. The pduF gene is thought to encode a facilitator protein, which may transport 1,2-PDL. The involvement of the cAMP receptor protein (CRP) and ArcA has not been shown directly. It is thought that transcription from the pduF promoter may also include pocR (190).

1. THE PocR PROTEIN MEDIATES THE EFFECT OF 1,2-PROPANEDIOL

The PocR protein has been shown to be a DNA-binding protein that specifically binds to the cob promoter region in vitro (187). This was predicted by the location of the pocR gene (directly upstream of the cob operon), by the predicted amino acid sequence of PocR (homologous to the AraC family of positively acting transcription factors), and by the phenotype of pocR mutants [unable to activate transcription of cob and pdu in response to 1,2-propanediol (74, 178, 179)]. It seems likely that PocR recognizes 1,2-propanediol in the medium and then activates transcription of the cob and pdu operons. This regulatory coupling of cob and pdu expression suggests that the most important use of Cbl by S. typhimurium may be to support 1,2-propanediol catabolism.

2. Cbl REPRESSION

The end product of the pathway, dAdo-Cbl, affects translation of the cob mRNA, although the level at which this molecule acts is not known. Mutations affecting Cbl repression map in the long untranslated region upstream of the first cob ORF (cbiA) and within cbiA (185, 186). It is not known whether repression occurs via another protein or whether dAdo-Cbl binds directly to the mRNA. A similar dAdo-Cbl repression phenomenon affecting expression of the E. coli btuB gene, required for Cbl transport, has also been reported (188).

3. GLOBAL FACTORS AFFECTING cob EXPRESSION
Although PocR is the major factor responsible for regulating cob transcription in S. typhimurium, contributions from other proteins have been noted. cAMP receptor protein (CRP) (189) probably regulates cob expression indirectly by modulating pocR expression (190), and ArcA, another global regulatory protein (191), may act in a similar manner, although there is also evidence for direct activation of cob expression by ArcA, in a PocR-independent manner (183).
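The two main inputs described for S. typhimurium act at different stages, which can be summarized in a small qualitative sketch. This is only a toy Boolean illustration under simplifying assumptions: the function name and return labels are hypothetical, the global factors (CRP, ArcA) are ignored, and nothing here reflects measured expression levels.

```python
def cob_regulation(propanediol, pocr_functional=True, ado_cbl=False):
    """Qualitative summary of cob operon control in S. typhimurium.

    Returns a (transcription, post_transcription) pair of labels.
    Assumptions of this sketch: 1,2-propanediol activates transcription
    only through a functional PocR, and Ado-Cbl represses expression at a
    posttranscriptional stage regardless of the transcriptional state.
    """
    # Transcriptional stage: PocR-dependent activation by 1,2-propanediol.
    transcription = "activated" if (propanediol and pocr_functional) else "basal"
    # Posttranscriptional stage: end-product repression by Ado-Cbl.
    post_transcription = "repressed" if ado_cbl else "not repressed"
    return transcription, post_transcription
```

For example, `cob_regulation(True, pocr_functional=False)` gives a basal transcription label, mirroring the pocR mutant phenotype described in the text. Keeping the two stages as separate return values reflects the statement that the two molecules exert their effects at different stages.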
B. Escherichia coli

Expression of the E. coli cobUST genes is induced by Cbi, but does not seem to be regulated by factors regulating cob expression in S. typhimurium (166). This difference may reflect the different evolutionary origin of the cob genes from these two organisms. Additionally, this result predicts that a corrinoid-sensing regulatory pathway exists in E. coli.
C. Other Organisms

Regulation of Cbl synthesis has not been studied in any detail in any other system. In several organisms, Cbl levels vary depending on growth conditions, but the mechanisms responsible for this have not been investigated (192, 193). SUMT activity is inhibited by uro'gen III in some organisms (e.g., P. denitrificans) but not in others (e.g., M. voltae). Other Cbl synthetic enzymes have not been studied in this regard. Translational coupling of cob genes has been suggested for S. typhimurium and P. denitrificans (5, 74). However, the regulatory significance of these observations is not known.
VIII. Concluding Remarks

Cbl has been considered an evolutionarily ancient molecule whose synthesis is widely distributed among bacterial and archaeal groups. Further studies of the physiology, biochemistry, and molecular genetics of cobalamin synthesis and utilization will not only clarify remaining questions in the field, but will add to our understanding of the evolution of microorganisms. Thus far, characterization of Cbl synthesis has been achieved in P. denitrificans (except for Me2Bza synthesis), and most of the genes involved in Cbl biosynthesis in P. denitrificans and S. typhimurium have been sequenced. This provides a solid foundation for further investigation of Cbl synthesis in other organisms. This section highlights some of the issues that merit further investigation.
A. Are There Different Pathways for dAdo-Cbl Synthesis?

As discussed above, there is evidence that there may be several Cbl biosynthetic pathways. We cannot assume that Cbl synthesis in S. typhimurium and other bacteria will not differ significantly from that determined in P. denitrificans. Considering this, it is important that the work done with P. denitrificans serve as a point of departure for future studies, rather than the end of the road. There are a number of genes in P. denitrificans without homologs in S. typhimurium, and vice versa. Perhaps these are indications of where there might be differences between the two organisms (e.g., the timing of cobalt insertion). Further analysis of the unassigned genes in S. typhimurium may clarify the biochemical pathway utilized by anaerobic Cbl producers.
B. How Is the Lower Ligand Synthesized?

Relatively little is known about the synthesis and significance of the lower ligand of Cbl. With the possible exception of cobT, no gene involved in Me2Bza synthesis has been identified. Although some precursors to Me2Bza have been identified, no specific biochemical activity has been characterized. Analysis of this part of the Cbl synthetic pathway is thus of significant interest. Given that numerous alternative bases exist, investigation of base synthesis in diverse bacteria should give us insight into the function of this part of the molecule.
C. How Is Metabolite Flux Regulated?

Cobalamin is only one product of the tetrapyrrole pathway. Other major products of this pathway are hemes, chlorophylls, and F430. The relative amounts of each of these compounds (and of the other products such as siroheme) are expected to vary from prokaryote to prokaryote as well as within one species of bacterium grown under different environmental conditions. The means by which cells ensure that there is sufficient flux through the different branches of the pathway is likely to be complex, especially in cases where the demand for the end products varies greatly, depending on the growth conditions. The study of the metabolic regulation of Cbl synthesis (and of tetrapyrrole synthesis in general) will improve our understanding of these issues.
D. How Is Cbl Synthesis Regulated?

Thus far, factors affecting cob gene expression have been studied only in S. typhimurium. In this bacterium, regulation of Cbl synthesis is closely tied to regulation of 1,2-propanediol utilization. It is likely that in other organisms this will not be the case. Regulatory mechanisms controlling production of a molecule reflect the conditions under which that molecule is important to the cell. Given that the role of Cbl varies widely, it is expected that control of cob gene expression will mirror this diversity.
ACKNOWLEDGMENTS

Work in the laboratory of JCES was funded in part by NIH Grant GM40313 and by USDA Health Grant WIS3765.
REFERENCES

1. K. Folkers, in "B12" (D. Dolphin, ed.), p. 1. Wiley, New York, 1982.
2. P. G. Lenhert and D. C. Hodgkin, Nature (London) 192, 937 (1961).
3. J. P. Glusker, in "B12" (D. Dolphin, ed.), p. 23. Wiley, New York, 1982.
4. M. Kamper and D. Hodgkin, Nature (London) 176, 551 (1955).
5. F. Blanche et al., Angew. Chem. Int. Ed. (Engl.) 34, 383 (1995).
6. J. Halpern, Science 227, 869 (1985).
7. A. I. Scott, Angew. Chem. Int. Ed. (Engl.) 32, 1223 (1993).
8. A. R. Battersby, Science 264, 1551 (1994).
9. D. Dolphin (ed.), "B12." Wiley, New York, 1982.
10. Z. Schneider and A. Stroiński (eds.), "Comprehensive B12." De Gruyter, Berlin, 1987.
11. IUPAC-IUB Commission on Biochemical Nomenclature, Bchem 13, 1555 (1974).
12. N. Brink and K. Folkers, JACS 71, 2951 (1949).
13. E. Stupperich, H. J. Eisinger and B. Krautler, EJB 172, 459 (1988).
14. L. Lamm, J. A. Horig, P. Renz and G. Heckmann, EJB 109, 115 (1980).
15. B. Cameron, K. Briggs, S. Pridmore, G. Brefort and J. Crouzet, J. Bact. 171, 547 (1989).
16. M. G. Johnson and J. C. Escalante-Semerena, JBC 267, 13302 (1992).
17. P. Renz, J. Horig and R. Wurm, in "Vitamin B12" (B. Zagalak and W. Friedrich, eds.), p. 317. De Gruyter, Berlin, 1979.
18. H. C. Friedmann and L. M. Cagen, Annu. Rev. Microbiol. 24, 159 (1970).
19. B. Krautler, H. P. E. Kohler and E. Stupperich, EJB 176, 461 (1988).
20. D. Perlman and J. M. Barrett, Can. J. Microbiol. 4, 9 (1958).
21. E. Stupperich and H. J. Eisinger, Adv. Space Res. 9, 117 (1989).
22. H. A. Barker, H. Weissbach and R. D. Smyth, PNAS 44, 1093 (1958).
23. A. Pol, C. Van der Drift and G. D. Vogels, BBRC 108, 731 (1982).
24. E. Stupperich and B. Krautler, Arch. Microbiol. 149, 213 (1988).
25. E. Irion and L. G. Ljungdahl, Bchem 4, 2780 (1965).
26. B. Krautler, J. Moll and R. K. Thauer, EJB 162, 275 (1987).
27. E. Stupperich, H. J. Eisinger and B. Krautler, EJB 186, 657 (1989).
28. E. Stupperich and H. J. Eisinger, Arch. Microbiol. 151, 372 (1989).
29. L. Mervyn and E. Smith, Prog. Indust. Microbiol. 5, 152 (1964).
30. H. C. Friedmann, JBC 240, 413 (1965).
31. J. A. Fyfe and H. C. Friedmann, JBC 244, 1659 (1969).
32. F. M. Huennekens, K. S. Vitols, K. Fujii and D. W. Jacobsen, in "B12" (D. Dolphin, ed.), p. 145. Wiley, New York, 1982.
33. E. Pezacka, R. Green and D. W. Jacobsen, BBRC 169, 443 (1990).
34. H. C. Friedmann and R. K. Thauer, in "Encyclopedia of Microbiology" (J. Lederberg, ed.), Vol. 3, p. 1. Academic Press, New York, 1992.
35. J. M. Poston and B. A. Hemmings, J. Bact. 140, 1013 (1979).
36. G. W. Ashley, G. Harris and J. Stubbe, JBC 261, 3958 (1986).
37. P. A. Frey, FASEB J. 7, 662 (1993).
38. G. W. Chang and J. T. Chang, Nature 254, 150 (1975).
39. A. A. Poznanskaja, K. Tanizawa, K. Soda, T. Toraya and S. Fukui, ABB 194, 379 (1979).
40. J. M. Poston, JBC 251, 1859 (1976).
41. J. K. Dyer and R. N. Costilow, J. Bact. 101, 77 (1979).
42. Y. Tsuda and H. C. Friedmann, JBC 245, 5914 (1970).
43. T. C. Stadtman and L. Tsai, BBRC 26, 920 (1967).
44. E. E. Dekker and H. A. Barker, JBC 243, 3232 (1968).
45. J. J. B. Cannata, J. Focesi, A. R. Mazumder, R. C. Warner and S. Ochoa, JBC 240, 3249 (1965).
46. H. A. Lee and R. A. Abeles, JBC 236, 2367 (1963).
47. H. F. Kung, S. Cederbaum, L. Tsai and T. C. Stadtman, PNAS 65, 978 (1970).
48. J. Stubbe, JBC 265, 5329 (1990).
49. P. Reichard, Science 260, 1773 (1993).
50. J. Harder, FEMS Microbiol. Rev. 12, 273 (1993).
51. J. M. Wood, in "B12" (D. Dolphin, ed.), p. 151. Wiley, New York, 1982.
52. E. Stupperich, FEMS Microbiol. Rev. 12, 349 (1993).
53. R. T. Taylor and H. Weissbach, in "The Enzymes" (P. D. Boyer, ed.), p. 121. Academic Press, New York, 1973.
54. R. R. Taylor, in "B12" (D. Dolphin, ed.), p. 307. Wiley, New York, 1982.
55. A. Yeliseev, P. Gaertner, U. Harms, D. Linder and R. K. Thauer, Arch. Microbiol. 159, 530 (1993).
56. P. van der Meijden, B. W. T. Brommelstroet, C. M. Poiroit, C. van der Drift and G. D. Vogels, J. Bact. 160, 629 (1984).
57. J. D. Kremer, X. Cao and J. Krzycki, J. Bact. 175, 8824 (1993).
58. S. W. Ragsdale, Crit. Rev. Biochem. Molec. Biol. 26, 261 (1991).
59. D. Grahame, Bchem 32, 10786 (1993).
60. H. A. Dailey (ed.), "Biosynthesis of Heme and Chlorophylls." McGraw-Hill, New York, 1990.
61. P. M. Jordan (ed.), "Biosynthesis of Tetrapyrroles." Elsevier, Amsterdam, 1991.
62. G. Kikuchi, A. Kumar, P. Talmage and D. Shemin, JBC 233, 1214 (1958).
63. K. D. Gibson, W. G. Laver and A. Neuberger, BJ 70, 71 (1958).
64. S. I. Beale and P. A. Castelfranco, Plant Physiol. 53, 291 (1974).
65. K. D. Gibson, A. Neuberger and J. J. Scott, BJ 61, 618 (1955).
66. R. Schmid and D. Shemin, JACS 77, 506 (1955).
67. P. H. A. Sneath (ed.), "Bergey's Manual of Systematic Bacteriology," Vol. 2. Williams & Wilkins, Baltimore, Maryland, 1986.
68. F. Blanche, L. Debussche, D. Thibaut, J. Crouzet and B. Cameron, J. Bact. 171, 4222 (1989).
69. F. Blanche et al., J. Bact. 173, 4637 (1991).
70. C. Robin et al., J. Bact. 173, 4893 (1991).
71. I. Sattler et al., J. Bact. 177, 1564 (1995).
72. J. B. Spencer, N. J. Stolowich, C. A. Roessner and A. I. Scott, FEBS Lett. 335, 57 (1993).
73. M. J. Warren et al., BJ 302, 837 (1994).
74. J. R. Roth, J. G. Lawrence, M. Rubenfield, S. Kieffer-Higgins and G. M. Church, J. Bact. 175, 3303 (1993).
75. J. Crouzet et al., J. Bact. 172, 5968 (1990).
76. C. A. Roessner et al., FEBS Lett. 304, 73 (1992).
77. E. Raux et al., J. Bact. 178, 753 (1996).
78. L. Debussche, D. Thibaut, B. Cameron, J. Crouzet and F. Blanche, J. Bact. 175, 7430 (1993).
79. J. B. Spencer, N. J. Stolowich, C. A. Roessner, C. Min and A. I. Scott, JACS 115, 11610 (1993).
80. A. I. Scott, Heterocycles 39, 471 (1994).
81. J. Crouzet et al., J. Bact. 172, 5980 (1990).
82. A. I. Scott, in "CIBA Foundation Symposium 180" (D. J. Chadwick and K. Ackrill, eds.), p. 285. Wiley, New York, 1994.
83. D. Thibaut, F. Blanche, L. Debussche, F. J. Leeper and A. R. Battersby, PNAS 87, 8800 (1990).
84. D. Thibaut, L. Debussche and F. Blanche, PNAS 87, 8795 (1990).
85. F. Blanche et al., J. Bact. 174, 1036 (1992).
86. F. Blanche et al., J. Bact. 174, 1050 (1992).
87. D. Thibaut et al., J. Bact. 174, 1043 (1992).
88. L. Debussche, D. Thibaut, B. Cameron, J. Crouzet and F. Blanche, J. Bact. 172, 6239 (1990).
89. N. P. J. Stamford, in "CIBA Foundation Symposium 180" (D. J. Chadwick and K. Ackrill, eds.), p. 247. Wiley, New York, 1994.
90. L. Debussche et al., J. Bact. 174, 7445 (1992).
91. B. Cameron et al., J. Bact. 173, 6058 (1991).
92. J. Crouzet et al., J. Bact. 173, 6074 (1991).
93. L. C. D. Gibson, R. D. Willows, C. G. Kannangara, D. von Wettstein and C. N. Hunter, PNAS 92, 1941 (1995).
94. R. D. Willows, L. C. D. Gibson, C. G. Kanangara, C. N. Hunter and D. von Wettstein, EJB 235, 438 (1996).
95. F. Blanche, L. Maton, L. Debussche and D. Thibaut, J. Bact. 174, 7452 (1992).
96. L. Debussche et al., J. Bact. 173, 6300 (1991).
97. S.-J. Suh and J. C. Escalante-Semerena, J. Bact. 177, 921 (1995).
98. J. C. Escalante-Semerena, S.-J. Suh and J. R. Roth, J. Bact. 172, 273 (1990).
99. F. Blanche et al., J. Bact. 173, 6046 (1991).
100. A. I. Krasna, C. Rosenblum and D. B. Sprinson, JBC 225, 745 (1957).
101. S. H. Ford, BBA 841, 306 (1985).
102. S. H. Ford and H. C. Friedmann, ABB 175, 121 (1976).
103. S. H. Ford and H. C. Friedmann, BBA 500, 217 (1977).
104. W. Eisenreich and A. Bacher, JBC 266, 23840 (1991).
105. C. Grabau and J. R. Roth, J. Bact. 174, 2138 (1992).
106. D. Barchielli et al., BJ 74, 382 (1960).
107. P. Renz, BBRC 30, 373 (1968).
108. R. A. Ronzio and H. A. Barker, Bchem 6, 2344 (1967).
109. F. Blanche et al., J. Bact. 173, 6052 (1991).
110. G. A. O'Toole and J. C. Escalante-Semerena, JBC 270, 23560 (1995).
111. G. A. O'Toole and J. C. Escalante-Semerena, J. Bact. 175, 6328 (1993).
112. H. C. Friedmann and D. L. Harris, JBC 240, 406 (1965).
113. B. Cameron et al., J. Bact. 173, 6066 (1991).
114. J. R. Trzebiatowski, G. A. O'Toole and J. C. Escalante-Semerena, J. Bact. 176, 3568 (1994).
115. H. C. Friedmann, JBC 243, 2065 (1968).
116. G. A. O'Toole, J. R. Trzebiatowski and J. C. Escalante-Semerena, JBC 269, 26503 (1994).
117. V. Hollriegl, L. Lamm, J. Rowold, J. Horig and P. Renz, Arch. Microbiol. 132, 155 (1982).
118. P. Renz, FEBS Lett. 6, 187 (1970).
119. P. Renz and R. R. Weyhenmeyer, FEBS Lett. 22, 124 (1972).
120. J. A. Horig and P. Renz, EJB 105, 587 (1980).
121. L. Lamm, G. Heckmann and P. Renz, EJB 122, 569 (1982).
122. J. R. A. Vogt, L. Lamm-Kolonko and P. Renz, EJB 174, 637 (1988).
123. M. Munder, J. R. A. Vogt, B. Volger and P. Renz, EJB 204, 679 (1992).
124. J. R. A. Vogt and P. Renz, EJB 171, 655 (1988).
125. P. Scherer, V. Hollriegl, C. Krug, M. Bokel and P. Renz, Arch. Microbiol. 138, 354 (1984).
126. P. Renz, B. Endres, B. Kurz and J. Marquart, EJB 217, 1117 (1993).
127. R. M. Jeter, B. M. Olivera and J. R. Roth, J. Bact. 159, 206 (1984).
128. P. Chen, M. Ailion, N. Weyand and J. R. Roth, J. Bact. 177, 1461 (1995).
129. E. H. Pezacka, BBA 1157, 167 (1993).
130. M. T. Latimer and J. G. Ferry, J. Bact. 175, 6822 (1993).
131. V. Frasca, R. V. Banerjee, W. R. Dunham, R. H. Sands and R. G. Matthews, Bchem 27, 8458 (1988).
132. K. Fujii and F. M. Huennekens, JBC 249, 6745 (1974).
133. J. G. Lawrence and J. R. Roth, Genetics 142, 11 (1996).
134. M. J. Bawden et al., NARes 15, 8563 (1987).
135. I. A. Borthwick, G. Srivastava, J. D. Brooker, B. K. May and W. H. Elliott, EJB 150, 481 (1985).
136. S. A. Leong, P. H. Williams and G. S. Ditta, NARes 13, 5965 (1985).
137. D. S. Schoenhaut and P. J. Curtis, Gene 48, 55 (1986).
138. D. Urban-Grimal, C. Volland, T. Garnier, P. Dehoux and R. Labbe-Bois, EJB 156, 511 (1986).
139. E. L. Neidle and S. K. Kaplan, J. Bact. 175, 2292 (1993).
140. Y. J. Avissar and S. I. Beale, J. Bact. 171, 2919 (1989).
141. T. Elliott, Y. J. Avissar, G.-E. Rhie and S. I. Beale, J. Bact. 172, 7071 (1990).
142. B. Grimm, A. Bull and V. Breu, MGG 225, 1 (1991).
143. Y. Echelard, J. Dymetryszyn, M. Drolet and A. Sasarman, MGG 214, 503 (1988).
144. J.-M. Li, C. S. Russell and S. D. Cosloy, Gene 75, 177 (1989).
145. A. M. Mayers, M. D. Crivellone, T. J. Koerner and A. Tzagoloff, JBC 262, 16822 (1987).
146. J. G. Wetmur, D. F. Bishop, C. Cantelmo and R. J. Desnick, PNAS 83, 7703 (1986).
147. T. R. Bishop, J. P. Frelin and S. H. Boyer, NARes 14, 10115 (1986).
148. M. Hansson, L. Rutberg, I. Schroder and L. Hederstedt, J. Bact. 173, 2590 (1991).
149. S. D. Thomas and P. M. Jordan, NARes 14, 6215 (1986).
150. N. Raich et al., NARes 14, 5955 (1986).
151. T. Keng, C. Richard and R. Larocque, MGG 234, 233 (1992).
152. A. L. Sharif, A. G. Smith and C. Abell, EJB 184, 353 (1989).
153. P. M. Jordan, B. I. A. Mgbeje, A. F. Alwan and S. D. Thomas, NARes 15, 10583 (1987).
154. P. M. Jordan, B. I. A. Mgbeje, S. D. Thomas and A. F. Alwan, BJ 249, 613 (1988).
155. A. Sasarman et al., J. Bact. 169, 4257 (1987).
156. K. E. Sanderson, A. Hessel and K. E. Rudd, Microbiol. Rev. 59, 241 (1995).
157. K. Xu, J. Delling and T. Elliott, J. Bact. 174, 3953 (1992).
158. T. Elliott, J. Bact. 171, 3948 (1989).
159. E. Fujino, T. Fujino, S. Karita, K. Sakka and K. Ohmiya, J. Bact. 177, 5169 (1995).
160. D. J. Chadwick and K. Ackrill (eds.), "CIBA Foundation Symposium 180." Wiley, Chichester, 1994.
161. R. M. Jeter, J. Gen. Microbiol. 136, 887 (1990).
162. S.-J. Suh and J. C. Escalante-Semerena, Gene 129, 93 (1993).
163. M. J. Warren, C. A. Roessner, P. J. Santander and A. I. Scott, BJ 265, 725 (1990).
164. Y. Hashimoto, M. Nishiyama, S. Horinouchi and T. Beppu, Biosci. Biotech. Biochem. 58, 1859 (1994).
165. M. D. Lundrigan and R. J. Kadner, J. Bact. 171, 154 (1989).
166. J. Lawrence and J. R. Roth, J. Bact. 177, 6371 (1995).
167. T. Kaneko et al., DNA Res. 2, 153 (1995).
168. G. J. Olsen, C. R. Woese and R. Overbeek, J. Bact. 176, 1 (1994).
384
MICHELLE R. R O N D O N ET AL.
169. J.-Y. Wu, L. M. Siege1 and N. M. Krednch,]. Bact. 173,325 (1991). 170. R. De Mot, I. Nagy, G. Schoofs and J. Vanderleyden, Gene 143,91 (1994). 171. C. Ouzounis, N. Kyrpides and C. Sander, NARe5 23,565 (1995). 172. M. Pollich and G. Klug,]. Bact. 177,4481 (1995). 273. P.-L. Chou et aZ., Biosci. Biotech. Biochem. 59, 1817 (1995). 174. M. C. Jones, J. M. Jenkins, A. G . Smith and C. J. Howe, PZant MoZec. B i d . 24,435 (1994). 275. D. L. Milton, A. Norqvist and H. Wolf-Watz, Gene 164,95 (1995). 176. R. De Mot, G. Schoofs, I. Nagy and J. Vanderleyden, Gene 150,199 (1994). 177. B. A. Castilho, P. Olfson and M. J. Casadaban,]. Bad. 158,488 (1984). 178. M. R. Rondon and J. C. Escalante-Semerena,]. Bact. 174,2267 (1992). 179. T. A. Bobik, M. Ailion and J. R. Roth,J. Bad. 174,2253 (1992). 180. J. C. Escalante-Semerena and J. R. Roth,]. Bad. 169,2251 (1987). 181. D. I. Andersson and J. R. Roth,]. Bad. 171,6726 (1989). 182. D. I. Andersson and J. R. Roth,]. B u t . 171, 6734 (1989). 183. D. I. Andersson, MoZ. Microbiol. 6, 1491 (1992). 184. A. A. Richter-Dahlfors and D. I. Andersson, MoZ. Microbiol. 5,1337 (1991). 185. A. A. Richter-Dahlfors and D. I. Andersson, MoZ. MlcrobioZ. 6, 743 (1992). 186. A. A. Richter-Dahlfors. S. Ravnum and D. I. Andersson, MoZ. Microbiol. 13,541 (1994). 187. M. R. Rondon and J. C. Escalanate-Semerena,]. Bact. 178,2196 (1996). 188. M. D. Lundrigan, W. Koster and R. J. Kadner, €"AS 88,1479 (1991). 189. A. Kolb, S. Busby, H. BUC,S. Garges and A. Adhya, ARB 62,749 (1993). 190. M. Ailion, T. A. Bob& and J. R. Roth,J. Bad. 175,7200 (1993). 191. S. Luchi and E. C. C. Lin, PNAS 85,1888 (1988). 192. P. Lebloas, P. Loubiere and N. D. Lindley, Biotech. Lett. 16,129 (1994). 293. J. Krzycki and J. G. Zeikus, Cum. Microbiol. 3,243 (1980).
Index
A Actinomycin D, effect on mRNA decay, 265-266 Acute myeloid leukemia, ELL translocation, 340-341 Adenosine-uridine binding factor (AUBF) discovery, 276 mRNA stabilization, 280 regulation, 277-280 sequence recognition, 276-277 Amyloid precursor protein, mRNA stability, 271-272 AUBF, see Adenosine-uridine binding factor Axon regeneration regulation, 247 Schwann cell interactions bilateral communication, 235, 248 myelination role, 235-243 neural crest development, 227-229 transcriptional regulation, 243-246
B Brown adipose tissue functions, 84 innervation, 84 morphology, 83 uncoupling pathway of mitochondria, 85-86 uncoupling protein, see Uncoupling protein
C cAMP, see Cyclic AMP Charcot-Marie-Tooth disease (CMT), genetics, 242-243 Ciliary neurotrophic factor (CNTF), release in nerve injury, 247-248 Ciliates conjugation, 5-6
DNA rearrangement chromosomal fragmentation, 6-7, 41-43 gene scrambling, 8, 43-44 interstitial DNA deletion, 7-8 evolution, 3 internal eliminated sequences cis-acting sequence requirements, 30-35 evolution abdication and fading, 52-54 bloom, 48-52 germ-line soma system, role in ciliates, 49-51 overview, 46, 58 Tetrahymena, 54-56 transposon function maintenance, 51-52 transposon invasion, 47-48 transposon multiplication limits, 48-49 excision products macronuclear junctions, 20-21, 26 M-region, 26-31 polymerase chain reaction analysis, 27-28 R-region, 26-27, 29 Tec transposon circles, 21-23, 32-34 telomere-bearing element transposon circles, 24-25 functions, 44-45 Oxytricha, 34-35 Paramecium, 18-19 phylogenetic distribution, 56-58 rearrangement mechanism, 7-8 short sequences, 8-9 Tetrahymena, 15-18, 26-30 trans-acting factors conjugation-specific proteins, 35-37 old macronucleus influence on excision, 38-41 transposon-encoded proteins, 37-38 transposon sequences, 9, 11-15
life cycle, 3, 5 macronuclear development, 6 nuclear dimorphism, 3, 5 c-Jun, transcriptional regulation in Schwann cells, 246 CMT, see Charcot-Marie-Tooth disease CNTF, see Ciliary neurotrophic factor Cobalamin adenosylcobalamin-dependent enzymes, 352-353 biosynthesis adenosylcobalamin, 365-369 adenosylcobyric acid, 365 cobinamide phosphate, 365-366 cobyrinic acid a,c-diamide, 364-365 cyanocobalamin, 368 5,6-dimethylbenzimidazole ribonucleotide, 366 genes cob gene organization and homology, 373, 376 Escherichia coli, 371-372 hem genes in uroporphyrinogen III synthesis, 369-370 Pseudomonas denitrificans, 371 Salmonella typhimurium, 370-371 Synechocystis, 372 uro'gen III methyltransferase, 373, 375-376 hydrogenobyrinic acid, 364 lower ligands, 367-368, 379 methylcobalamin, 369 pathways, 378-379 precorrins, 356-359, 362-364 regulation of synthesis Escherichia coli, 378 metabolic flux, 379 Salmonella typhimurium, 376-379 uroporphyrinogen III, 354, 356 discovery, 349 methylcobalamin-dependent enzymes, 353-354 structure, 347-348 Corrinoids, see also Cobalamin diversity of ligands, 350-351 nomenclature, 349-350 prokaryotic production, 352 Cyclic AMP (cAMP), Schwann cell regulation, 229-230 Cycloheximide, effect on mRNA decay, 265
D DNA methylation assay, 114, 116 interferon-γ gene regulation, 114, 116-119 DNA repair, see Homologous genetic recombination; Nucleotide excision repair assay
E ELL, gene translocation in acute myeloid leukemia, 340-341 Elongin (SIII) regulation by von Hippel-Lindau tumor suppressor, 338, 340 subunits, 338 Eph, Schwann cell regulation, 233 Extracellular matrix, role in peripheral nervous system development, 233-235
F FGF, see Fibroblast growth factor Fibroblast growth factor (FGF), Schwann cell regulation, 229 fos, serum-positive promoter assays for mRNA decay, 259-260
G GAS6, Schwann cell regulation, 232 Globin, mRNA stabilization, 273 GM-CSF, see Granulocyte macrophage-colony stimulating factor Granulocyte macrophage-colony stimulating factor (GM-CSF) mRNA stabilization for gene therapy, 281-282 particle-mediated transfection and decay of mRNAs, 261, 263, 266-267
H Hereditary neuropathy with liability to pressure palsies (HNPP), genetics, 243
HNPP, see Hereditary neuropathy with liability to pressure palsies hnRNP C, effect on mRNA stability, 280-281 Homologous genetic recombination, see also RecA functions in bacteria DNA repair, 132-133, 135-135 importance of elucidation, 130-132 initiation by DNA damage, 136-138 mechanism, 131
I IES, see Internal eliminated sequence Interferon short type I interferon expressed by pig trophoblast, 319-320 type I receptor, 290-291 types, 288-291 Interferon-γ gene promoter, 111, 119-120 regulation of transcription CD28, 111-112 cytokines, 112-114 DNA methylation, 114, 116-119 enhancer elements, 120-123 estrogen, 123 promoter-binding proteins, 119-123 silencer, 124 structure, 109, 119, 288 transfection, 111 natural killer cell production, 110 regulation of genes, 289 T cell production, 110, 116 Interferon-τ applications of recombinant protein, 303 discovery, 295 function, 303-304 gene evolution coding region, 309, 311, 313, 315 promoter region, 315-317 linkage analysis, 317, 319 locus, 317 glycosylation, 307 human genes, 320-321 purification, 302-303
receptor binding, 296-298 structure comparison with other type I interferons primary structure, 304-305, 307 receptor-binding sites, 309 secondary structure, 305, 307 three-dimensional structures, 307-309 structure, 295-296 transcriptional regulation, 301-302 trophoblast-specific expression, 298-301 Interferon-ω discovery, 291-292 function, 294 gene evolution coding region, 309, 311, 313, 315 promoter region, 315-317 linkage analysis, 317, 319 locus, 317 variant of sheep, 319 genes, 292, 294 glycosylation, 307 receptor, 294 structure comparison with other type I interferons primary structure, 304-305, 307 receptor-binding sites, 309 secondary structure, 305, 307 three-dimensional structures, 307-309 structure, 292 Internal eliminated sequence (IES), see Ciliates IRE, see Iron response element Iron response element (IRE), structure, 272-273
J Jak, cytokine signaling, 113
K Krox20, transcriptional regulation in Schwann cells, 243-244
L L1, role in myelin, 242
M MAG, see Myelin-associated glycoprotein Messenger RNA (mRNA) cis elements adenosine-uridine-rich elements, 267-269 amyloid precursor protein mRNA, 271-272 globin mRNA stabilization, 273 identification of new elements, 269-271 iron response element structure, 272-273 sequence homology, 267, 270 decay rate assays pulse-chase, 258-259 serum-positive fos promoter assays, 259-260 transcriptional blockade, 259 transfection assays, 261 in vitro systems, 260 half-life, 258 particle-mediated transfection and decay actinomycin D effects, 265-266 cycloheximide effects, 265 granulocyte macrophage-colony stimulating factor mRNAs, 261, 263, 266-267 phorbol ester effects, 263, 265 uncapped mRNAs, 266 stabilization for gene therapy, 281-282 trans factors adenosine-uridine binding factor, 276-280 hnRNP C, 280-281 identification, 274-276 nucleolin, 280-281 Mitochondria, see Uncoupling protein mRNA, see Messenger RNA Myelin axon-glial interactions, 235-243 components in peripheral nervous system L1, 242 lipids, 237 myelin-associated glycoprotein, 240-241, 247 myelin basic proteins, 238-239 myelin protein zero, 237-238 neural adhesion molecule, 242 P2, 241
periaxin, 242 peripheral myelin protein 22, 239-240, 242-243 structure, 236-237 Myelin-associated glycoprotein (MAG), role in myelin, 240-241, 247 Myelin basic proteins, role in myelin, 238-239 Myelin protein zero, role in myelin, 237-238
N N-CAM, see Neural adhesion molecule Nervous system axon-Schwann cell interactions bilateral communication, 235, 248 myelination role, 235-243 neural crest development, 227-229 transcriptional regulation, 243-246 degeneration and regeneration, 246-248 development, 226-227 extracellular matrix, role in peripheral nervous system development, 233-235 membrane sorting in Schwann cells, 249 Schwann cell, proliferation and differentiation, 229-233 Neural adhesion molecule (N-CAM), role in myelin, 242 Neural crest, axon-glial interactions during development, 227-229 Neuregulins, Schwann cell regulation, 230-232 Neurotrophins, Schwann cell regulation, 232 Nucleolin, effect on mRNA stability, 280-281 Nucleotide excision repair assay overview, 63-64 in vitro assays analytical chemical assays, 69 biological activity-transformation assay, 72-73 excision assay, 65-67, 69 nicking/incision assay, 64-65 repair synthesis assay, 71-72 restriction enzyme sensitivity, 72 in vivo assays host cell reactivation, 77-78 immunological detection of photolesions, 75-76
ligation-mediated polymerase chain reaction, 74-75 nicking assay, 73 postlabeling assay, 75 T4 endonuclease V-sensitive site assay, 73-74 unscheduled DNA synthesis and equilibrium sedimentation, 77
P P2, role in myelin, 241 Pax3, transcriptional regulation in Schwann cells, 244-246 PCR, see Polymerase chain reaction Periaxin, role in myelin, 242 Peripheral myelin protein 22 (PMP22), role in myelin, 239-240, 242-243 Phorbol ester, effect on mRNA decay, 263, 265 PMP22, see Peripheral myelin protein 22 Polymerase chain reaction (PCR) internal eliminated sequence analysis, 27-28 mapping of DNA photolesions, 74-75
R RecA ATP binding site, 163-164 hydrolysis conformational changes and cooperativity, 181-184 DNA-dependent hydrolysis, 179-180 DNA-independent hydrolysis, 179 crystal structure, 150, 156, 169 distribution in bacteria, 130 DNA binding double-stranded DNA, 174-175 gapped DNA, 175 single-stranded DNA, 173-174 site structure, 164-167 DNA strand exchange DNA pairing energetics, 192-193, 195 intermediates, 188-192 kinetics, 187-188 mechanism, 186-187
overview, 184-186 domains, 138 exchange reactions with four DNA strands, 200 functions chromosome partitioning, 210 coprotease, 208-209 DNA repair, 136-137, 210-212 induced stable DNA replication, 210 SOS mutagenesis, 209-210 hybrid DNA, unidirectional extension and ATP hydrolysis, 195-197, 199-200 MAW motif, 168-169 monomer-monomer interface, 167 polar filament, assembly and disassembly, 176-179, 184 protein interactions exonuclease I, 208 RecF, 203, 205 RecJ, 208 RecO, 203-205 RecR, 204-205 RuvA, 206-207 RuvB, 206-207 single-strand DNA binding protein, 201-203 purification, 138 role in bacteria, 130 sequence alignment bacterial RecA proteins, 138, 140-141, 150, 154-156 bacteriophage homologs, 157, 162 eukaryotic homologs, 157, 162 Sms protein, 172 structure-function insights, 138, 140 tryptophan reporter group incorporation, 169, 171-172 RNA polymerase II, see also specific transcription factors elongation factors, 335-338, 340-341 preinitiation complex formation and activation, 328-335 Ruv proteins, see RecA
S Schwann cell axon-glial interactions bilateral communication, 235, 248
myelination role, 239-243 neural crest development, 227-229 transcriptional regulation c-Jun, 246 Krox20, 243-244 Pax3, 244-246 SCIP, 243-244 extracellular matrix, role in peripheral nervous system development, 233-235 membrane sorting in myelinating cells, 249 premyelination marker proteins, 229 regulation cyclic AMP, 229-230 Eph, 233 fibroblast growth factor, 229 GAS6, 232 neuregulins, 230-232 neurotrophins, 232 transforming growth factor, 229-230 SCIP, transcriptional regulation in Schwann cells, 243-244 SII, nascent transcript cleavage, 336-337 SIII, see Elongin Sms protein, homology with RecA, 172 STAT, cytokine signaling, 113, 291
T
TBE, see Telomere-bearing element T cell, cytokine production, 110 Tec, see Transposon Telomere-bearing element (TBE), see Transposon TFIIB, binding of RNA polymerase II to TFIID-core promoter complex, 330 TFIID DNA binding, 329-330 preinitiation complex assembly, 328-330 TFIIE, preinitiation complex formation and activation, 332-333 TFIIF assembly of active preinitiation complex, 330-332 elongation activity, 337-338 structure, 331 TFIIH ATPase, 334 helicase activity, 333-334 kinase activity, 334-335 preinitiation complex formation and activation, 332-335 TGF, see Transforming growth factor Transforming growth factor (TGF), Schwann cell regulation, 229-230 Transposon internal eliminated sequences in ciliates evolution abdication and fading, 52-54 bloom, 48-52 germ-line soma system, role in ciliates, 49-51 overview, 46, 58 Tetrahymena, 54-56 transposon function maintenance, 51-52 transposon invasion, 47-48 transposon multiplication limits, 48-49 Tec elements, 14-15, 21-23, 32-34 Tel-1, 17 telomere-bearing elements, 9, 11-14, 24-25, 51-52 Tlr1, 17-18, 35-36 retrovirus, 2
U
UCP, see Uncoupling protein Uncoupling protein (UCP) discovery, 86-87 expression mammalian cells, 92 yeasts, 92-94, 96 fatty acid sensitivity of mutants, 96 flow cytometry of yeast mutants, 93, 105 gene comparison between species, 97 comparison with genes of other carriers, 97-98 organization in rat, 96-97 polymorphism in humans, 104-105 regulation of transcription cis-acting elements, 98-99 enhancer, 99-103 inhibitory regions, 103 norepinephrine, 98
retinoic acid, 98, 100-103 thyroid hormones, 98 trans-factors, 100-101, 103 nucleotide mutants, 94 regulation, 85-86 sequence homology, 87-89 topology, 89-92
uncoupling pathway of brown adipose tissue mitochondria, 85-86
V von Hippel-Lindau tumor suppressor, regulation of elongin, 338, 340
ISBN 0-12-540056-X