PROGRESS IN
Nucleic Acid Research and Molecular Biology
Volume 58
edited by
KIVIE MOLDAVE
Department of Molecular Biology and Biochemistry
University of California, Irvine
Irvine, California

ACADEMIC PRESS
San Diego London Boston New York Sydney Tokyo Toronto
This book is printed on acid-free paper.
Copyright © 1998 by ACADEMIC PRESS

All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher.

The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-1998 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. 0079-6603/98 $25.00
Academic Press
a division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com

Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/

International Standard Book Number: 0-12-540058-6

PRINTED IN THE UNITED STATES OF AMERICA
97 98 99 00 01 02 BB 9 8 7 6 5 4 3 2 1
Contents

SOME ARTICLES PLANNED FOR FUTURE VOLUMES

The Hairpin Ribozyme: Discovery, Two-Dimensional Model, and Development for Gene Therapy
Arnold Hampel
    I. Discovery
    II. Biochemical Properties
    III. The Hairpin Ribozyme Model
    IV. Development for Gene Therapy
    V. Delivery of the Hairpin Ribozyme for Gene Therapy
    VI. Inhibition of HIV-1 Expression in Vivo
    VII. Additional Hairpin Ribozymes-GUA Specific
    VIII. Conclusions and Perspectives
    References

Serum- and Polypeptide Growth Factor-Inducible Gene Expression in Mouse Fibroblasts
Jeffrey A. Winkles
    I. Mitogenic Stimulation of Quiescent Fibroblasts: The Genomic Response
    II. Identification of Serum- and Polypeptide Growth Factor-Inducible Genes: Strategies and Results
    III. Serum- and Polypeptide Growth Factor-Inducible Gene Products and the Control of Cellular Proliferation
    IV. Conclusions
    References

Regulation of Translational Initiation during Cellular Responses to Stress
Charles O. Brostrom and Margaret A. Brostrom
    I. Stress Responses and Stress Proteins of Eukaryotic Cells
    II. Regulation of Translational Initiation
    III. Translational Accommodation to ER or Cytoplasmic Stress
    IV. Perspectives and Speculation
    References

Lactose Repressor Protein: Functional Properties and Structure
Kathleen Shive Matthews and Jeffry C. Nichols
    I. Lactose Repressor Protein
    II. DNA Binding
    III. Inducer Binding
    IV. Structure and Function
    V. NMR and X-ray Crystallographic Structures
    VI. Applications of LacI Control
    VII. Conclusion and Prospects for the Future
    References

Copper-Regulatory Domain Involved in Gene Expression
Dennis R. Winge
    I. Copper Ion Sensing in Prokaryotes
    II. Copper Sensing in Eukaryotes
    III. Copper Metalloregulation in Yeast
    IV. Metal Clusters in Regulation
    V. Summary and Perspective
    References

Molecular Biology of Trehalose and the Trehalases in the Yeast Saccharomyces cerevisiae
Solomon Nwaka and Helmut Holzer
    I. Metabolism of Trehalose in Yeast
    II. Biological Functions of Trehalose in Yeast
    III. Characterization and Location of the Yeast Trehalases
    IV. Molecular Analysis of the Yeast Trehalases
    V. Biological Functions of the Trehalase Genes
    VI. Trehalases and Heat Shock Proteins
    VII. Outlook on the Biotechnological Importance of Trehalose and the Trehalases
    References

Molecular and Structural Features of the Proton-Coupled Oligopeptide Transporter Superfamily
You-Jun Fei, Vadivel Ganapathy, and Frederick H. Leibach
    I. Two Different Peptide Transporter Subfamilies: A Comparison between the Members of the ABC Peptide Transporter Subfamily and the POT Subfamily
    II. Molecular Cloning Procedures Employed for Identification of the POT Family Members
    III. Comparison of Amino Acid Sequences of the Members of the POT Family
    IV. Topological Features of the POT Subfamily
    V. Conclusion
    References

Double-Strand Break-Induced Recombination in Eukaryotes
Fekret Osman and Suresh Subramani
    I. Models of Double-Strand Break-Induced Recombination
    II. Double-Strand Break-Induced Mitotic Recombination
    III. Double-Strand Break-Induced Meiotic Recombination
    IV. The Genetic Control of Double-Strand Break-Induced Recombination
    V. Concluding Remarks
    References

Impaired Folding and Subunit Assembly as Disease Mechanism: The Example of Medium-Chain acyl-CoA Dehydrogenase Deficiency
Peter Bross, Brage S. Andresen, and Niels Gregersen
    I. Protein Folding and Its Disturbance by Missense Mutations
    II. The Role of MCAD in Mitochondrial β-Oxidation of Fatty Acids
    III. Studies on the Molecular Pathology of MCAD Deficiency
    IV. Conclusions
    References

Interaction of Retroviral Reverse Transcriptase with Template-Primer Duplexes during Replication
Eric J. Arts and Stuart F. J. Le Grice
    I. Human Immunodeficiency Virus Reverse Transcriptase
    II. tRNA(Lys,3)-Mediated Initiation of (-) Strand DNA Synthesis
    III. Interaction of RT with the Template-Primer Duplex
    IV. The RNase H Domain and Hydrolysis of RNA-DNA Hybrids
    V. The Polypurine Tract and Second-Strand Synthesis
    VI. Conclusions
    References

INDEX
Some Articles Planned for Future Volumes

Structure and Transcription Regulation of Nuclear Genes for the Mouse Mitochondrial Cytochrome c Oxidase
    NARAYAN G. AVADHANI et al.

Tissue Transglutaminase: Retinoid Regulation and Gene Expression
    PETER J. A. DAVIES AND SHAKID MIAN

Genetic Approaches to Structural Analysis of Membrane Transport Systems
    WOLFGANG EPSTEIN

Intron-encoded snRNAs
    MAURILLE J. FOURNIER AND E. STUART MAXWELL

Molecular Analyses of Metallothionein Gene Regulation
    LASHITEW GEDAMU et al.

Mechanisms for the Selectivity of the Cell's Proteolytic Machinery
    ALFRED GOLDBERG, MICHAEL SHERMAN, AND OLIVER COUX

Mechanisms of RNA Editing
    STEPHEN L. HAJDUK AND SUSAN MADISON-ANTENUCCI

Structure/Function Relationships of Phosphoribulokinase and Ribulosebisphosphate Carboxylase/Oxygenase
    FRED C. HARTMAN AND HILLEL K. BRANDES

The Nature of DNA Replication Origins in Higher Eukaryotic Organisms
    JOEL A. HUBERMAN AND WILLIAM C. BURHANS

Synthesis of DNA Precursors in Lactobacillus acidophilus R-26
    DAVID H. IVES AND SEIICHIRO IKEDA

A Kaleidoscopic View of the Transcriptional Machinery in the Nucleolus
    SAMSON T. JACOB

Sphingomyelinases in Cytokine Signaling
    KRONKE MARTIN

Mammalian DNA Polymerase Delta: Structure and Function
    MARIETTA Y. W. T. LEE

DNA Helicases: Roles in DNA Metabolism
    STEVEN W. MATSON AND DANIEL W. BEAM

Heparan Sulfate-Fibroblast Growth Factor Family
    WALLACE L. MCKEEHAN AND MIKIO KAN

Molecular Biology of Snake Toxins: Is the Functional Diversity of Snake Toxins Associated with a Mechanism of Accelerated Evolution?
    ANDRE MENEZ et al.

Inosine Monophosphate Dehydrogenase: Role in Cell Division and Differentiation
    BEVERLY S. MITCHELL

Specificity of Eukaryotic Type II Topoisomerase: Influence of Drugs, DNA Structure, and Local Sequence
    MARK T. MULLER AND JEFFREY SPITZNER

Localization and Movement of tRNAs on the Ribosome during Protein Synthesis
    KNUD H. NIERHAUS

Immunoanalysis of DNA Damage and Repair Using Monoclonal Antibodies
    MANFRED F. RAJEWSKY

Mechanism of Transcriptional Regulation by the Retinoblastoma Tumor Suppressor Gene Product
    PAUL D. ROBBINS AND JON HOROWITZ

Organization and Expression of the Chicken α-Globin Genes
    KLAUS SCHERRER AND FELIX R. TARGA

Physicochemical Studies on DNA Triplexes and Quadruplexes
    RICHARD H. SHAFER

Bacillus subtilis as I Know It
    NOBORU SUEOKA

Transcriptional Regulation of Steroid Receptor Genes
    DONALD J. TINDALL AND M. V. KUMAR

Molecular Genetic Approaches to Understanding Drug Resistance in Protozoan Parasites
    DYANN WIRTH et al.
The Hairpin Ribozyme: Discovery, Two-Dimensional Model, and Development for Gene Therapy¹

ARNOLD HAMPEL
Departments of Biological Sciences and Chemistry
Northern Illinois University
DeKalb, Illinois 60115

I. Discovery
II. Biochemical Properties
III. The Hairpin Ribozyme Model
    A. Secondary Structure
    B. Three-Dimensional Interactions
IV. Development for Gene Therapy
    A. Targeting Rules
    B. Selection of Target Sites
    C. Design and Optimization of the Ribozyme
    D. Catalytic Improvements
V. Delivery of the Hairpin Ribozyme for Gene Therapy
    A. Autocatalytic Hairpin Cassette
    B. Promoter
VI. Inhibition of HIV-1 Expression in Vivo
    A. Targets and Ribozyme Selection
    B. The 5' Leader Target and Ribozyme
    C. The Pol-Specific Target and Ribozyme
    D. The Double Ribozyme
    E. Human Clinical Trials
    F. Improvements
VII. Additional Hairpin Ribozymes-GUA Specific
VIII. Conclusions and Perspectives
References
¹ Abbreviations: 3'F, 3' fragment; 5'F, 5' fragment; AIDS, acquired immunodeficiency syndrome; bp, base pair(s); ELISA, enzyme-linked immunosorbent assay; HC, hairpin autocatalytic cassette; HIV-1, human immunodeficiency virus type 1; HIV-2, human immunodeficiency virus type 2; LTR, long terminal repeat; MMLV, Moloney murine leukemia virus; MMTV, mouse mammary tumor virus; nt, nucleotide(s); p24 gag, one of the group-specific antigen proteins of HIV-1; pol, RNA polymerase gene of HIV-1; RRE, rev response element in HIV-1; RT-PCR, reverse transcription-polymerase chain reaction; Rz, ribozyme; S, substrate; sTRSV, satellite RNA from tobacco ringspot virus; sCYMV1, satellite RNA from chicory yellow mottle virus; sArMV, satellite RNA from arabis mosaic virus; TAR, transcriptional activation region on HIV-1 RNA; tat, transcriptional trans-activator protein of HIV-1.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 58
Copyright © 1998 by Academic Press.
All rights of reproduction in any form reserved.
0079-6603/98 $25.00
This review chronicles the discovery of the hairpin ribozyme, its characterization, and determination of the two-dimensional structure, culminating with its use for human gene therapy as an AIDS therapeutic. The minimal sequence constituting the hairpin ribozyme catalytic domain was identified from a small plant viral satellite RNA. Biochemical characterization showed it to be among the most efficient of all known ribozymes. Mutagenesis determined that the two-dimensional structure had four helices, consisting of 17 Watson-Crick base pairs and one A:G pair for a total of 18 bp. The helices were interspersed with five single-stranded loops. Helices 1 and 2 were located between the ribozyme and substrate, allowing the ribozyme to recognize the substrate. The substrate had a sequence preference of BN*GUC, where * is the site of cleavage and N*GUC the substrate loop between these two helices. By using sequences of this type, it was possible to design the ribozyme to base pair with the substrate and cleave heterologous RNA substrates, leading to design of the hairpin ribozyme for gene therapy. The HIV-1 sequence was searched for suitable target sites, and ribozymes were designed, optimized, catalytically characterized, and tested in vivo against HIV-1 targets. Two ribozymes had excellent in vitro catalytic parameters and inhibited in vivo expression of viral proteins by 3-4 logs in tissue culture cells. Viral replication was inhibited as well. They have been developed as human AIDS therapeutics, and will likely be the first ribozymes to be tested as human drugs in clinical trials. © 1998 Academic Press
RNA catalysis was co-discovered by S. Altman, who found the M1 RNA of RNase P could catalytically cleave and process the 5' terminus of the tRNA precursor (1), and by T. Cech, who found the Tetrahymena ribosomal RNA intron had autocatalytic activity (2). The term "ribozyme" was coined to describe catalytic RNA. Since then, four other catalytic RNAs, all self-cleaving, have been discovered: the hepatitis delta ribozyme, the Neurospora ribozyme, the hammerhead ribozyme, and the hairpin ribozyme. The latter two are from plant viral satellite, virusoid, and viroid RNAs (see Ref. 3 for review).

This review focuses on the hairpin ribozyme (4, 5). Specifically, I describe its discovery, followed by the many facets of development required to bring it to the point of being tested in clinical trials as an AIDS therapeutic for human use. (For previous reviews of this work, see 6, 7, and for a detailed description of many aspects, see 8.) We have used the HIV-1 system as an initial test model for determining the utility of the hairpin ribozyme in the down-regulation of gene expression. Based on our excellent success with that system, we are very optimistic that the hairpin ribozyme may have more general utility for a wide variety of applications in other systems as well.

The hairpin ribozyme was found as the catalytic center of three known plant satellite RNAs. These were the negative strands of the satellite RNAs from tobacco ringspot virus (sTRSV), chicory yellow mottle virus type 1 (sCYMV1), and arabis mosaic virus (sArMV) (9, 10). Initial studies identified the hairpin ribozyme first in the negative strand of sTRSV. Using molecular modeling of the negative strand of sTRSV as a first approximation, we made substrates and ribozymes of various lengths and sequences in order to determine the minimum catalytic center. A 50-nt ribozyme sequence was found to be capable of cleaving a 14-nt substrate sequence in a trans reaction. The reaction proceeded without depletion of the 50-nt RNA component, and therefore was catalytic. It followed true Michaelis-Menten kinetics, allowing determination of Km, kcat, energy of activation, Mg2+ dependence, temperature dependence, and pH optima (4).

The two-dimensional structure was determined by making an extensive collection of site-specific mutants for both the ribozyme and the substrate. The location of individual base pairs was determined by comparison of catalytic activity for these mutant sequences with that of the native sequence. That is, if activity was lost when a mismatch was introduced at the site of a predicted base pair, and if the activity was restored with an alternate base pair, then a base pair at this site had been identified. This method identified four helices and five loops for the ribozyme-substrate complex. The overall structure was hairpin-like, so I named it the hairpin ribozyme (5, 11). Of the four helices, helices 1 and 2 occurred between the ribozyme and the substrate, and helices 3 and 4 were within the ribozyme itself. Single-stranded loops 1, 2, 3, and 4 were in the ribozyme sequence, and loop 5 was in the substrate sequence (Fig. 1).

Following its discovery, its biochemical characterization, and determination of its two-dimensional structure, the hairpin ribozyme was engineered to cleave heterologous substrate RNAs (5, 11). This led to development of the hairpin ribozyme system for human gene therapy and other applications for down-regulation of gene expression. Targeting rules for cleavage of heterologous substrates were determined. The substrate had a sequence preference of BN*GUC, where the * is the site of cleavage. The nucleotide B is G, U, or C but not A.
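The BN*GUC targeting rule lends itself to a simple sequence scan. The following is a minimal sketch, assuming the rule exactly as stated above (B is C, G, or U; N is any base; cleavage occurs just 5' of GUC). The function name `find_target_sites` is hypothetical and is not taken from the original work; the only sequence used is the 14-nt native substrate given later in the text.

```python
import re

# B = C, G, or U (not A); N = any base; the motif is followed by GUC.
# A lookahead is used so that overlapping candidate sites are all reported.
SITE = re.compile(r"(?=([CGU][ACGU])(GUC))")

def find_target_sites(rna):
    """Return (n_5prime, motif) for every BN*GUC site in an RNA sequence.

    n_5prime is the number of nucleotides 5' of the cleavage point,
    i.e. the length of the 5' fragment if the RNA were cut at that site.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start() + 2, m.group(1) + "*" + m.group(2))
            for m in SITE.finditer(rna)]

# Native sTRSV substrate (14 nt): one site, cut between A5 and G6.
native_substrate = "UGACAGUCCUGUUU"
print(find_target_sites(native_substrate))
```

A scan of this kind only lists candidate sites. As the text goes on to describe, each candidate still has to be tested in vitro, and the length of helix 1 optimized, before a ribozyme is chosen.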
With these targeting rules in hand, we now had the possibility of specifically cleaving target mRNA or viral RNA molecules, resulting in inhibition of gene expression or viral replication. Sequence searches were done for a number of systems, including HIV-1, to identify sequences containing BN*GUC for use as possible target sites (5, 8, 12, 13). Using HIV-1 as an example, ribozymes were made to a number of potential targets, and the in vitro cleavage efficiency of the ribozymes against these targets was measured. Optimization was done by varying the length of helix 1 to determine its optimal length for maximum catalytic efficiency. In general the optimal length of helix 1 varied between 6 and 12 bp, with 8 bp being a useful first approximation. Helix 2 was fixed at 4 bp. The catalytic activity of the ribozyme was improved by making specific sequence changes in regions of the ribozyme containing nonessential nucleotides. Certain of these changes greatly improved catalytic activity for certain targets. Those ribozymes that had the best catalytic efficiency were used for gene therapy in tissue culture cells. Two ribozymes, targeted to the 5' leader region and a region of the pol gene of HIV-1, reduced expression of HIV-1 virus by 1000- to 10,000-fold. These two ribozymes, which we developed, have been approved by the RAC (Recombinant DNA Advisory Committee) for human use and will soon be tested as potential AIDS therapeutics in humans by Dr. Flossie Wong-Staal at the University of California-San Diego (7).

FIG. 1. The hairpin ribozyme model. The negative strand of sTRSV native hairpin ribozyme-substrate complex consists of a 14-nt substrate and a 50-nt ribozyme complexed to form four helices and five single-stranded loops named and located as shown in the model (4, 8, 12). The helices consist of 18 base pairs, of which 17 are canonical Watson-Crick base pairs and 1 is a noncanonical A:G base pair (22). Numbering from 5' to 3' is 1-14 for the substrate and 1-50 for the ribozyme as standardized in the original description of the hairpin ribozyme (5). Cleavage of the substrate occurs between A5 and G6. This model contains later modifications from the original model. The original model was published in both Ref. 5 and Ref. 8, and is reproduced in modified form from the publication of the original model in A. Hampel, R. Tritz, M. Hicks, and P. Cruz, Nucleic Acids Res. 18, 299 (1990) by permission of Oxford University Press.

We recently found that the hairpin ribozymes representing the catalytic cores of the negative strands of sCYMV1 and sArMV were also catalytically active. These ribozymes were highly active, with catalytic efficiencies only slightly less than that of sTRSV. Furthermore, these two new classes of hairpin ribozymes had *GUA target preference, in contrast to the *GUC preference of the sTRSV hairpin ribozyme (10). We have used the sCYMV1-based engineered ribozyme to cleave two target sites in HIV-1 and one target site in human papillomavirus type 16. Thus we have essentially doubled our repertoire of hairpin ribozymes available for targeting. This is especially important for targeting sites in HIV-1 because it has such a high mutation rate. By generating multiple ribozymes and delivering them simultaneously, the chances of reducing viral expression over the long term would be expected to be enhanced. Details of this discovery, characterization, and development of the hairpin ribozyme for gene therapy follow.
I. Discovery

The phenomenon of autocatalytic cleavage and ligation of small plant viroid and viral-associated RNAs had been observed to occur in cis in the large 359-nt native RNA strands (14). Others expanded upon this work to identify the minimal catalytic center of the positive strand of sTRSV and named it the hammerhead ribozyme (15). Similarly, my own laboratory, utilizing these results, carried out experiments to determine the minimum catalytic sequence of the negative strand of sTRSV, which we named the hairpin ribozyme (4, 5).

Autocatalytic cleavage and ligation were first seen for dimeric transcripts of the 359-nt long negative strand of sTRSV (16, 17). The cleaved RNA had a 5' fragment that had a 3' terminal A and a 2',3'-cyclic phosphate, while the 3' fragment had a 5' terminal G and a newly formed 5'-OH terminus (18).

Beginning with this previous knowledge that the 359-nt negative strand of sTRSV had catalytic activity, we began a search for the minimal sequence that could carry out catalysis. Figure 2 shows the catalytic center of the negative strand of sTRSV with the original numbering system for that of the positive strand of sTRSV (4, 19). We previously knew where the site of cleavage for the substrate was (cleavage occurred between positions 49 and 48), and from mutagenesis experiments we knew the general location of the catalytic center. Operating between positions 247 and 175 for the ribozyme (9) and 52 and 43 for the substrate (20), we began searching for the minimal sequences required for catalysis. We initially modeled these two regions to attempt to identify any regions of base pairing. We then made transcripts of the RNA components and combined them in trans-cleavage reactions to attempt to elicit catalysis. We tested several of our models without success. We observed cleavage with the minimal sequence of the catalytic center of sTRSV in a trans reaction (8). We later named this sequence the hairpin ribozyme (5).
The minimum sequence consisted of substrate nt 53-40, which corresponds to sequence 1-14 in Fig. 1, and ribozyme nt 224-175, which corresponds to ribozyme sequence 1-50 in Fig. 1. These two sequences are shown in Fig. 2 as the "active site," and in Fig. 1 are modeled according to later experimental results. When these two sequences were combined in a trans reaction, the ribozyme gave cleavage of substrate. These were the experiments that determined the minimal sequence of the catalytic center of the negative strand of sTRSV. Following cleavage in this reaction, we sequenced the cleavage products of the substrate and found that, indeed, cleavage had occurred in the same position as in the large 359-nt negative strand of sTRSV, between nt A49 and G48. Furthermore, cleavage generated a 5' cleavage fragment with a 2',3'-cyclic phosphate terminus and a 3' cleavage fragment with a 5'-OH terminus (4, 8). This showed that trans cleavage of a small substrate RNA by a portion of the sequence of the negative strand of sTRSV occurred at the same cleavage site and gave the same cleavage termini as that of the large 359-nt native sequence. The reaction is summarized as

    substrate RNA:   UGACA*GUCCUGUUU
                          |
                          |  catalytic RNA
                          v
    products:        UGACA>p   +   HO-GUCCUGUUU
                       5'F             3'F

FIG. 2. The catalytic center of the negative strand of sTRSV. The entire 359-nt sequence of the negative strand of sTRSV was folded into a minimum energy structure, a portion of which is shown, and the minimum sequence determined (4, 8). Numbering of the sequence is as originally described for the negative sTRSV strand (19). Shown is the active site (catalytic center of this molecule) with the ribozyme (bottom stippled area) located between nt 224 and 175 (1-50 in parentheses numbered according to Fig. 1), and the substrate (top stippled area) located between nt 53 and 40 (1-14 in parentheses numbered according to Fig. 1). The arrow marks the site of cleavage between A49 and G48 (A5/G6 according to the substrate numbering scheme of Fig. 1). Reproduced from A. Hampel and R. Tritz, Biochemistry 28, 4929 (1989). Copyright 1989 American Chemical Society.

Further attempts to reduce the number of bases in the ribozyme from either the 5' or 3' terminus resulted in reduced activity (4, 8). Specifically, we removed the 3' terminal A and the 3' terminal UA, and then determined catalytic activity (kcat/Km) by measuring both kcat and Km. When the terminal A was removed, activity was reduced fivefold. Removal of the 3' terminal UA reduced activity 20-fold. Similarly, when the substrate was shortened to less than 14 nt, activity was reduced. Thus this is the minimal sequence for both the ribozyme and substrate in the native negative strand of sTRSV.
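The cleavage reaction summarized in this section can be written out as a short string operation, which makes the fragment termini explicit. This is only an illustrative sketch: the function name `cleave` and the text labels `>p` (2',3'-cyclic phosphate) and `HO-` (5'-hydroxyl) are notational assumptions, not part of the original experiments.

```python
def cleave(substrate, n_5prime):
    """Split a substrate RNA at the cleavage site.

    n_5prime is the number of nucleotides in the 5' fragment.
    The 5' fragment is labeled with '>p' for its 2',3'-cyclic phosphate,
    and the 3' fragment with 'HO-' for its new 5'-OH terminus.
    """
    five_prime = substrate[:n_5prime]
    three_prime = substrate[n_5prime:]
    return five_prime + ">p", "HO-" + three_prime

# Native 14-nt substrate, cut between A5 and G6 (A49/G48 in sTRSV numbering).
frag_5, frag_3 = cleave("UGACAGUCCUGUUU", 5)
print(frag_5, "+", frag_3)
```

The 5' fragment ends in A and the 3' fragment begins with G, matching the termini reported for cleavage of the full 359-nt negative strand.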
II. Biochemical Properties

Once we had two RNA sequences that, when combined, resulted in cleavage of one of them, the next step was to biochemically characterize them. Using an approach similar to that successfully used for the hammerhead ribozyme (21), we carried out biochemical characterizations of these two minimal sequences from the negative strand of sTRSV (4, 8). At this point, we only had a single-event cleavage reaction and had no idea if it was catalytic, that is, if the supposed ribozyme turned over. We carried out a time course of substrate cleavage by ribozyme using a molar ratio of substrate to ribozyme of 30:1. The substrate cleaved to near completion (98%), with no loss of ribozyme during the course of the reaction (Fig. 3). This showed the ribozyme turned over and was not used up during the catalytic reaction, a necessary characteristic of a biological catalyst. Furthermore, the rate of cleavage of substrate was linearly dependent on ribozyme concentration, again a characteristic of a biological catalyst.

Since the RNA we identified was catalytic, we designed and carried out kinetic analyses with ribozyme concentration limiting. The reaction followed Michaelis-Menten kinetics, with initial velocity being dependent on substrate concentration when ribozyme concentration was limiting. The initial kinetic constants determined were a Km of 30 nM and a kcat of 2.1/min, which are excellent catalytic parameters for an RNA-catalyzed reaction (Fig. 4).
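To make the reported constants concrete, the sketch below evaluates the Michaelis-Menten rate law with Km = 30 nM and kcat = 2.1/min. The function names are hypothetical, and the ribozyme concentration of 0.4 nM is borrowed from the conditions quoted for the kinetic analysis in Fig. 4. The calculation assumes ideal Michaelis-Menten behavior with ribozyme limiting.

```python
def initial_velocity(s_nM, rz_nM, kcat_per_min=2.1, km_nM=30.0):
    """Michaelis-Menten initial velocity in nM/min.

    v0 = kcat * [Rz] * [S] / (Km + [S])
    """
    return kcat_per_min * rz_nM * s_nM / (km_nM + s_nM)

def catalytic_efficiency(kcat_per_min=2.1, km_nM=30.0):
    """kcat/Km in units of 1/(nM * min)."""
    return kcat_per_min / km_nM

# At [S] = Km the velocity is half of Vmax = kcat * [Rz].
v_half = initial_velocity(s_nM=30.0, rz_nM=0.4)
print(v_half, catalytic_efficiency())
```

With [S] equal to Km, the velocity is half of Vmax (0.84 nM/min at 0.4 nM ribozyme), which is a quick sanity check on any fitted constants.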
FIG. 3. Time course of cleavage of substrate RNA. The substrate and ribozyme sequences used were those in Fig. 1. (A) Percent substrate (%S) remaining as a function of time of reaction (min). (B) 7 M urea-20% polyacrylamide gel electrophoresis of the reaction products: ribozyme (Rz), substrate (S), 3' fragment (3'F), and 5' fragment (5'F). The reaction was carried out at 37°C in 12 mM MgCl2, 2 mM spermidine, and 40 mM Tris at pH 7.5 for 30 sec (lane 1), 5 min (lane 2), 15 min (lane 3), 30 min (lane 4), 60 min (lane 5), 90 min (lane 6), and 150 min (lane 7). [Rz] = 3.2 nM and [S] = 90 nM. Sample was labeled with [α-32P]CTP. Reproduced from A. Hampel and R. Tritz, Biochemistry 28, 4929 (1989). Copyright 1989 American Chemical Society.
FIG. 4. Kinetic analysis of cleavage of varying concentrations of substrate RNA by ribozyme. (A) Eadie-Hofstee plot of initial velocity against V0/[S]. (B) 7 M urea-20% polyacrylamide gel electrophoresis. Conditions were the same as Fig. 3 except [Rz] = 0.4 nM and [S] was: 125 nM, lane 1; 62 nM, lane 2; 42 nM, lane 3; 31 nM, lane 4; 21 nM, lane 5; 16 nM, lane 6; 8 nM, lane 7; and 3.9 nM, lane 8. The substrate and ribozyme sequences used were those in Fig. 1. Reproduced from A. Hampel and R. Tritz, Biochemistry 28, 4929 (1989). Copyright 1989 American Chemical Society.
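An Eadie-Hofstee analysis of the kind shown in Fig. 4A plots v0 against v0/[S]; the slope is -Km and the intercept is Vmax. The sketch below shows the fitting step with an ordinary least-squares line. The velocities are synthetic values generated from the reported constants (Km = 30 nM, kcat = 2.1/min), not the measured data, and the function name `eadie_hofstee_fit` is hypothetical. The substrate concentrations and [Rz] = 0.4 nM are those listed in the caption.

```python
def eadie_hofstee_fit(s_values, v_values):
    """Least-squares line through (v/[S], v); returns (Km, Vmax).

    Eadie-Hofstee form: v = Vmax - Km * (v / [S]).
    """
    xs = [v / s for s, v in zip(s_values, v_values)]
    ys = list(v_values)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept

# Substrate concentrations (nM) from the Fig. 4 caption.
substrate_nM = [125, 62, 42, 31, 21, 16, 8, 3.9]
rz_nM = 0.4

# SYNTHETIC velocities (nM/min) computed from the reported constants,
# used here only to show that the fit recovers them.
velocities = [2.1 * rz_nM * s / (30.0 + s) for s in substrate_nM]

km, vmax = eadie_hofstee_fit(substrate_nM, velocities)
kcat = vmax / rz_nM
print(km, vmax, kcat)
```

With real gel-quantified velocities the points scatter around the line, and the Eadie-Hofstee transform tends to amplify error at low substrate concentration, so a direct nonlinear fit is often preferred today.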
Initial velocity of the reaction was determined as a function of temperature and analyzed using an Arrhenius plot to determine the energy of activation for the reaction. The energy of activation was 19 kcal/mol. Interestingly, the optimal temperature for the reaction was 37°C. This, however, was later shown to be dependent on the sequence and length of both helix 1 and helix 4.

The reaction was dependent on both [Mg2+] and pH. Increasing [Mg2+] increased the rate of the reaction. We have only taken the Mg2+ concentration to 20 mM, so we do not know what the optimal concentration is. For our general in vitro laboratory studies, we chose 12 mM to carry out our reactions, and this is what was used for all of our studies.

Hydrolysis of RNA to produce a 2',3'-cyclic phosphate is base catalyzed, so one would expect this catalyzed reaction to be pH dependent. As expected, increasing pH gave an increase in reaction rate; however, the increased rate due to pH was not linear with [OH-], as would be expected if it were simply a hydroxyl-driven hydrolysis. Rather, a 100-fold increase in [OH-] from pH 6 to 8 gave a far smaller increase in reaction rate. Thus the catalytic effect of the ribozyme is the significant factor in cleavage of the substrate. This is evidence that titratable functional groups in the ribozyme itself are likely involved in the cleavage reaction.
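Two of the numbers in this section can be checked with a few lines of arithmetic. The sketch below uses the Arrhenius equation with the reported activation energy of 19 kcal/mol to estimate how much the rate changes between two temperatures, and confirms that going from pH 6 to pH 8 corresponds to a 100-fold increase in [OH-]. The function names and the choice of 27°C as a comparison temperature are illustrative assumptions. The Arrhenius estimate is only meaningful below the temperature optimum, where the plot is linear.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def arrhenius_rate_ratio(ea_kcal_per_mol, t1_celsius, t2_celsius):
    """Ratio k(T2)/k(T1) predicted by the Arrhenius equation.

    k2/k1 = exp(-(Ea/R) * (1/T2 - 1/T1)), temperatures in kelvin.
    """
    t1 = t1_celsius + 273.15
    t2 = t2_celsius + 273.15
    return math.exp(-(ea_kcal_per_mol / R_KCAL) * (1.0 / t2 - 1.0 / t1))

def hydroxide_fold_change(ph_low, ph_high):
    """Fold increase in [OH-] when pH rises from ph_low to ph_high."""
    return 10 ** (ph_high - ph_low)

# Ea = 19 kcal/mol: predicted rate gain on warming from 27 to 37 degrees C.
ratio = arrhenius_rate_ratio(19.0, 27.0, 37.0)
print(round(ratio, 2), hydroxide_fold_change(6, 8))
```

Under these assumptions a 10-degree rise gives a rate gain of somewhat under threefold, which is in the usual range for enzyme-catalyzed reactions.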
III. The Hairpin Ribozyme Model

A. Secondary Structure

The hairpin ribozyme model was developed by specific single-site mutagenesis (5, 8, 11). Mutations were introduced at positions of suspected base pairing to create mismatches and alternate base pairs. If the mismatches were inactive but the compensatory mutation to create an alternate base pair was active, then a base pair was identified (Table I). By using this method, we simultaneously eliminated a number of other possible models and identified the 17 Watson-Crick-type base pairs shown in Fig. 1, as well as determining features of the loop regions.

Our first experiment to determine how the model was put together was done after we had identified the final minimum sequence of both the substrate and the ribozyme. By molecular modeling, we had identified possible pairing schemes for testing. As an experimental test of one of our models, we mutagenized both the substrate and ribozyme to create a mismatch and alternate base pair in one of the positions where we predicted a base pair (Fig. 5). We chose for testing the ribozyme-substrate bp G11:C4 in what we later named helix 2. When we introduced the G11C mutation into the ribozyme, it did not cleave the native substrate (Fig. 5, lane 2) because the mutation created a C:C mismatch. When we made a new substrate with a G4C mutation, this was now cleaved by the
TABLE I
MUTATION ANALYSIS OF sTRSV HAIRPIN RIBOZYME (a, b)

Mutation                        Percent activity
No mutation (wild-type)         100
Substrate
  g6a                           NC (c)
  g6c                           NC
  [illegible]                   NC
  [illegible]                   1
  u7c                           [illegible]
  u7g                           2
  c8a                           2
  c8g                           1
  c8u                           25
Substrate/ribozyme
  c4g/G11C                      12
  c4u/G11U                      NC
  c4u/G11A                      40
  c4u/G11G                      32
  g6c/U39G                      NC
  u7g/G8U                       NC
  u7a/A9U                       NC
  c8g/G8C                       NC
  c8a/A7C                       14
  c8u/A20C                      7
  c8a/A20C:A7C                  9
Ribozyme
  A7C                           100
  A7G                           100
  A7C/A20C                      104
  G8C                           NC
  G8U                           5
  [illegible]                   NC
  A10G                          61
  A15U/U49A                     115
  C16G/G48C                     NC
  C17G                          NC
  C17G/G47C                     21
  G19C                          1
  G19C/C45G                     25
  G21K                          58
Mutation                                            Percent activity
  A20C                                              81
  22AAA24CGU                                        NC
  A24G/U37C                                         NC
  C25G/G36C                                         NC
  del A26                                           NC
  C27G                                              NC
  C27G/G35C                                         10
  A28U/U34A                                         113
  C29G/G33C                                         12
  C29G/G35C                                         NC
  29CGU31CCUC(GUUA)GACC                             NC
  30GUU32UUCC                                       NC
  30GUU32GGAC(UUCG)GUCC                             116
  30GUU32GGUC(GUUA)GACC                             100
  30GUU32GGUC(GUUA)GACC idel U31 cut U32            NC
  G33C                                              10
  U34A                                              11
  G35C                                              NC
  U37A/A43U                                         NC
  A38G                                              2
  A38U                                              NC
  U39C                                              100
  U39G                                              100
  A40U                                              3
  A40G                                              3
  U41C                                              25
  U42C                                              3
  del U42                                           NC
  A43U                                              NC
  G47C                                              NC
  G48C                                              NC
  del U49:del A50                                   38
  U49A                                              80
  U49C                                              60
  del A50                                           49
  A50G                                              75
a Reproduced from P. Anderson, J. Monforte, R. Tritz, S. Nesbitt, J. Hearst, and A. Hampel, Nucleic Acids Res. 22, 1096 (1994) by permission of Oxford University Press.
b Percent activity is catalytic rate relative to that of the unmutated wild-type sequence. The substrate mutations are given in lower case and ribozyme mutations in upper case text. The changes are read as follows: g6a means base number 6 in the substrate was changed from its native G to an A; A7G means base number 7 in the ribozyme was changed from its native A to a G. All bases are numbered according to Fig. 1.
c NC, no detectable cleavage.
FIG. 5. The first mutagenesis experiment showing the location of a base pair defining the existence of a helix between the substrate and ribozyme. (A) The portion of the hairpin ribozyme mutagenized in base position 4 in the substrate and position 11 in the ribozyme in what came to be known as helix 2. (B) 7 M urea-20% polyacrylamide gel electrophoresis. Lane 1, native ribozyme and native substrate G:C base pair; lane 2, native substrate and ribozyme G11C to give a C:C mismatch; lane 3, native substrate alone; lane 4, ribozyme G11C and substrate C4G to give an alternate C:G base pair; lane 5, substrate C4G alone. Conditions for cleavage were as in the legend to Fig. 3. This experiment was carried out by Richard Tritz in my laboratory (8).
mutated ribozyme because an alternate C:G base pair was created (Fig. 5, lane 4). Thus we identified the existence of this base pair and defined the first base in helix 2 (8). This experiment had two major implications:
1. It identified the presence and location of a helical region between the ribozyme and the substrate.
2. Heterologous RNA sequences could be cleaved with appropriately engineered ribozymes, suggesting gene therapy might be possible with this ribozyme.
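The logic of the mismatch and compensatory-mutation test can be written out as a small sketch. It considers Watson-Crick pairs only and uses the position 11 (ribozyme) and position 4 (substrate) bases from the experiment in Fig. 5; the function and variable names are illustrative assumptions.

```python
# Watson-Crick pairs only; wobble and other noncanonical pairs are ignored here.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}

def pair_status(ribozyme_base, substrate_base):
    """Return 'pair' for a Watson-Crick pair, otherwise 'mismatch'."""
    if (ribozyme_base, substrate_base) in WATSON_CRICK:
        return "pair"
    return "mismatch"

# Base at ribozyme position 11 against base at substrate position 4.
combinations = {
    "native ribozyme + native substrate": ("G", "C"),
    "G11C ribozyme + native substrate": ("C", "C"),
    "G11C ribozyme + C4G substrate": ("C", "G"),
}

results = {name: pair_status(rz, sub) for name, (rz, sub) in combinations.items()}
for name, status in results.items():
    print(f"{name}: {status}")
```

A position is assigned to a helix when the single mutant (mismatch) loses activity and the double mutant (alternate pair) regains it, which is the pattern seen in lanes 2 and 4 of Fig. 5.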
Since helix 2 existed, it was likely that helix 1 did also. Therefore, the next series of experiments cleaved a large variety of different substrates with ribozymes that had been changed in both helices 1 and 2 to allow base pairing to the substrate. Various lengths of helix 1 were used and cleavage still occurred. Mismatched substrates were not cleaved or were cleaved very poorly. These experiments clearly showed the existence of helices 1 and 2 (5, 8). Helices 3 and 4 were determined similarly. By making mismatch and compensatory changes, four Watson-Crick bp were identified in helix 3 and three bp in helix 4. Modeling predicted a fifth base pair in helix 3, between A15 and U49; however, compensatory mutagenesis showed this base pair did not exist. That is, when U49C and U49A mutations were made, no significant loss of activity was seen (Table I; 8, 11). Most recently we identified an 18th base pair in the structure of the hairpin ribozyme. A non-Watson-Crick A:G base pair was found in helix 4 between A26 and G36 (see Fig. 1). Mismatch mutations in either base reduced activity, and alternate base pairs (both G:C and A:U) restored activity (21a). Thus this base interaction occurs to make a total of 18 bp determined in the hairpin ribozyme-substrate structure. Additional base pairs have been proposed (5, 22), but compensatory mutational analysis either has not confirmed their existence (11; Table I) or was not done (22). Thus the two-dimensional hairpin model remains at 18 known bp with only certain required bases in the loops. With the identification of four helical regions, five unpaired single-stranded sequences remained between them, which we named single-stranded loops 1, 2, 3, 4, and 5 (8, 11) (Fig. 1). Of the five loops, all have essential bases except loop 3, which is not needed. It can be replaced with a completely different sequence (8, 11) or removed entirely (23) with no loss of activity. The other four loops all have required bases.
An early experiment was to replace 22AAA24 in loop 2 with a CGU sequence. When we did this, catalytic activity was completely lost (Table I; 8, 11). Loop 1 has the sequence 7AGAA10. Mutations were made in each of these positions and catalytic activity analyzed. The A7C and A7G mutations gave no reduction in catalytic
activity, nor did the A10G mutation. However, changes in G8 eliminated activity nearly entirely (Table I). We have just completed an extensive analysis of the required bases in loops 2 and 4 by using a highly specific mutational analysis method. Specific mutations were placed in each position of loops 2 and 4. That is, each of the seven bases (including A26) in loop 2 was singly changed to each of the other three bases and assayed for catalytic activity. The same thing was done for each of the nine bases (including G36) in loop 4. The result of this study was to identify all required and nonrequired bases in these loops. It was found that three bases were required in loop 2: A22, A23, and C25. Only one base was required in loop 4: A38 (21a). Loop 5, with the sequence 5A*GUC8, presented interesting ramifications, because this is the substrate loop with cleavage occurring at the *. Early experiments determined that the A base was completely variable, the G required, and the UC preferred (5, 8, 11). This determined targeting sequence constraints for the substrate. The substrate loop 5 prefers the sequence N*GUC in order for cleavage to occur efficiently.
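The single-base scan of a loop can be enumerated mechanically. The sketch below lists every single-base change for loop 2, using the naming convention of Table I (native base, position, new base). The loop 2 sequence A20 G21 A22 A23 A24 C25 A26 is assembled here from the individual positions named in the text and in Table I; the function name is an illustrative assumption.

```python
BASES = "AGCU"

# Loop 2 of the ribozyme, positions 20-26, as pieced together from the text.
LOOP2_START = 20
LOOP2_SEQ = "AGAAACA"

def single_base_mutants(seq, start):
    """Name every single-base substitution, e.g. 'A22G' for A at 22 changed to G."""
    names = []
    for offset, native in enumerate(seq):
        for new in BASES:
            if new != native:
                names.append(f"{native}{start + offset}{new}")
    return names

loop2_mutants = single_base_mutants(LOOP2_SEQ, LOOP2_START)
print(len(loop2_mutants), loop2_mutants[:6])
```

Seven positions with three alternative bases each give 21 ribozymes to assay for loop 2; the same function applied to a nine-base loop gives 27.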
B. Three-Dimensional Interactions
No three-dimensional x-ray diffraction structure has been determined for the hairpin ribozyme. Tertiary interactions are likely to be essential to maintain the catalytically active conformation; however, in the absence of a three-dimensional x-ray diffraction-derived structure, the location and nature of these interactions is speculative. Suggestions for specific interactions of bases between helical and loop regions have been made and models proposed. Attempts have been made to identify these tertiary interactions using non-x-ray diffraction methods. One model proposed interactions between loops 4 and 5 (24). Three proposals have been made for interactions between loops 2 and 4 (22, 25, 26) and one proposal for interactions between loops 1, 2, and 5 (11). The model for interactions between loops 4 and 5 found that psoralen crosslinking placed U37, U39, and U42 from loop 4 of the ribozyme in very close proximity to C4 and U7 of the substrate (24). The nature of these proposed interactions between loops 4 and 5, however, was not known. To date, all of these models remain suggestive and none has been verified. The one piece of good evidence for three-dimensional modeling is that supporting the existence of a hinge at A15 allowing the molecule to fold on itself. A15 is the base between helices 2 and 3. If it were base paired to the U49 opposite it, then helices 2 and 3 would be one continuous helix. Computer modeling therefore gives a structure with an A15:U49 bp (4, 5); however, the A:A and A:C mismatch mutations made by changing U49 to A or C did not significantly reduce activity (Table I; 8, 11). We speculated that A15 served as a hinge, allowing helices 2 and 3 to coaxially stack. This has been
supported by experiments removing A15 and linking the ribozyme at its 3' terminus to the 5' end of the substrate with a linker, showing that when a bent structure was allowed, A15 was not needed at all (27). Thus the hairpin ribozyme likely folds back on itself, to bring some combination of loops 1, 2, 4, and 5 in proximity.
IV. Development for Gene Therapy
A. Targeting Rules
Following the discovery, biochemical characterization, and determination of the two-dimensional structure of the hairpin ribozyme, it was apparent to us that the ribozyme might be engineered to cleave a variety of heterologous substrate RNAs. We showed this was indeed the case. The engineered ribozyme must base pair to the substrate sequence in helices 1 and 2, and the substrate has a sequence preference of BN*GUC, where * is the site of cleavage (Fig. 6). The nucleotide B is G, U, or C but not A. This led to our development of the hairpin ribozyme system for human gene therapy. In initial experiments cleaving heterologous RNA in a trans reaction, we chose substrate RNA sequences that retained the A*GUC sequence in loop 5. The ribozyme was engineered to base pair to the substrate at helices 1 and 2, and, for most constructs, cleavage occurred. There were exceptions. We found that substrates with an A in position 4, the B nucleotide position in Fig. 6, were not cleaved by ribozymes with the corresponding U in position 11 (8, 11; Table I). For trans reactions, this has been true in all cases we have studied. Interestingly, for the cis reaction using an autocatalytic hairpin cassette, an A is permitted in this position. In order to determine the sequence requirements of the 5A*GUC8 loop 5 in the substrate, we mutated each position to each of the other three bases. The substrate A5 base could be anything; no change in catalytic activity occurred for any substrate with any of the other three bases in this position (5). The G6 base was absolutely required and both the U7 and C8 bases were preferred. When the U7 and C8 were individually changed, significant loss of activity was seen (5, 8, 11, 28; Table I). This then gave the substrate targeting sequence requirement of BN*GUC. We found no other substrate sequence requirements.
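The BN*GUC rule lends itself to a simple pattern search. The sketch below is one possible way to scan an RNA sequence for candidate sites; B is taken as G, U, or C and N as any base, as stated above. The function name and the returned tuple format are assumptions for illustration, and the two test sequences are the HIV-1 5' leader and pol targets discussed in this chapter.

```python
import re

# B = G, U, or C (not A); N = any base; cleavage falls between N and G.
# A lookahead is used so that overlapping candidate sites are all reported.
TARGET_PATTERN = re.compile(r"(?=([GUC][ACGU]GUC))")

def find_target_sites(rna):
    """Return (0-based start of B, site written as BN*GUC) for each candidate."""
    rna = rna.upper().replace("T", "U").replace(" ", "")
    sites = []
    for match in TARGET_PATTERN.finditer(rna):
        start = match.start(1)
        motif = match.group(1)
        sites.append((start, motif[:2] + "*" + motif[2:]))
    return sites

leader_sites = find_target_sites("UGCCCGUCUGUUGUGU")
pol_sites = find_target_sites("CACCUGUCAACAUAA")
print(leader_sites, pol_sites)
```

A site passing this filter only satisfies the sequence preference; whether it is accessible and efficiently cleaved still has to be determined experimentally.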
A different set of targeting rules has been proposed by others based on cis cleavage/ligation of the hairpin ribozyme (29). We found these rules were not true for trans cleavage of a substrate by the hairpin ribozyme. As an example, it was reported that cleavage was greatly reduced when the base following the BN*GUC was an A (29). This was not true for trans cleavage. One of the best
FIG. 6. The engineered hairpin ribozyme. This is the model used to design hairpin ribozymes capable of cleaving heterologous target RNAs. The ribozyme is engineered to bind to the substrate by forming base pairs in helices 1 and 2. Helix 2 is fixed at 4 bp and helix 1 is of variable length. The B nucleotide in the substrate can be U, G, or C; it cannot be an A. The corresponding V nucleotide in the ribozyme is an A, C, or G. (Inset) The tetraloop is used to replace loop 3/helix 4 and, for certain sequences, catalytic activity can be improved.
substrates we have, CACC U*GUC AACAUAA, found in the pol sequence of HIV-1, has an A in this position. The hairpin ribozyme to this substrate cleaves with a catalytic efficiency even greater than that of the native ribozyme-substrate (30; see Section V).
B. Selection of Target Sites
To initially locate appropriate targets, GenBank searches for the particular system chosen were carried out using the BN*GUC sequence requirement. All sites containing BN*GUC sequences were identified for that system. Sites containing regions of obvious strong secondary structure such as TAR or RRE were not further considered. Among the remaining sites, those near the 5' cap, near the 3' polyadenylation region, and near the exon side of the splice acceptor site were selected for further study since they were most likely to be exposed for eventual gene therapy. The reason for this was that, as a first approximation, antisense targeting showed these were preferred in vivo targets since they were the least likely areas to be parts of RNA structural elements or covered with proteins that make them inaccessible (31). For the HIV-1 targets selected, it was also necessary to identify those targets that were conserved. This was accomplished by doing a search of GenBank for the BN*GUC sequences found in the various HIV-1 isolates. Those that had greater than 80% conservation were chosen for in vitro testing of catalytic activity and development of the ribozyme (8, 32).
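The conservation filter can be sketched as a simple count of isolates carrying the exact target sequence. The example below uses the seven 5' leader sequences tabulated for individual HIV-1 isolates in Table II as a small stand-in for a full GenBank search; the function name and the use of exact string identity as the conservation criterion are assumptions for illustration.

```python
def conservation_percent(target, isolate_sequences):
    """Percentage of isolates whose sequence is identical to the target."""
    def clean(seq):
        return seq.replace("*", "").replace(" ", "").upper()

    wanted = clean(target)
    matches = sum(1 for seq in isolate_sequences.values() if clean(seq) == wanted)
    return 100.0 * matches / len(isolate_sequences)

# 5' leader target region in seven HIV-1 isolates (from Table II).
leader_isolates = {
    "HXB2": "UGCCC*GUCUGUUGUGU",
    "MAL": "UGCCC*GUCUGUUGUGU",
    "ELI": "UGCCC*GUCUGUUGUGU",
    "SF2": "UGCCC*GUCUGUUGUGU",
    "MN": "UGCCC*GUCUGUUAUGU",  # single base change
    "NL43": "UGCCC*GUCUGUUGUGU",
    "RF": "UGCCC*GUCUGUUGUGU",
}

leader_conservation = conservation_percent("UGCCC*GUCUGUUGUGU", leader_isolates)
passes_threshold = leader_conservation > 80.0
print(f"{leader_conservation:.1f}% conserved, passes 80% cutoff: {passes_threshold}")
```

Six of the seven tabulated isolates carry the identical sequence, so this site clears the 80% cutoff used for selecting targets.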
C. Design and Optimization of the Ribozyme
In order to test these targets in vitro, the ribozyme needed to be designed and optimized for each particular target (Fig. 6). The ribozyme to the chosen target was designed by changing bases in its helix 1 and 2 stem region to base pair with those of the substrate outside of the N*GUC loop 5 sequence. The base 5' to the N in helix 2 could not be an A; however, it could be a G, C, or U. This meets the sequence constraint of BN*GUC for the target. Helix 2 was fixed at 4 bp, while helix 1 was of variable length. A long helix 1 allowed very tight binding of the substrate to the ribozyme, giving a very low Km and a correspondingly low turnover number kcat. The result, in general, was a low catalytic efficiency (kcat/Km). Since long helix 1 sequences acted similarly to an antisense-type mechanism by turning over very slowly or not at all, they would obviate the catalytic advantages of the ribozyme. On the other hand, a very short helix 1 length had the disadvantage of a very large Km. We found the best catalytic efficiency (kcat/Km) occurred when both Km and kcat were at an intermediate range. Thus a range of helix 1 lengths was needed for each individual target to determine the optimal length. Depending on the sequence, we found optimal lengths in the range of 5-10 bp. A good first approximation was 8 bp. We typically prepared substrates with helix 1 lengths in this range in order to determine initial cleavage rates at a fixed ribozyme and a fixed, but high, substrate concentration to overcome high Km problems. Normally, a clear optimum was evident. If the optimal length was outside the initial chosen range, additional substrates needed to be prepared (8, 33).
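The design step can be sketched as computing the two substrate-binding arms of the ribozyme from a target written with * at the cleavage site. The sketch assumes strict Watson-Crick pairing, a 4-nt helix 2, and the native loop 1 sequence AGAA; it also assumes that the 5' region of the ribozyme runs helix 1 arm, loop 1, helix 2 arm, which is inferred from the numbering used in this chapter (loop 1 at positions 7-10, G11 pairing with C4). All function and key names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def design_binding_arms(target, helix2_len=4, loop1="AGAA"):
    """Split a BN*GUC target into helix 2, loop 5, and helix 1, and pair the arms."""
    left, right = target.replace(" ", "").upper().split("*")
    if len(left) < helix2_len + 1 or not right.startswith("GUC"):
        raise ValueError("target must look like <helix 2>N*GUC<helix 1>")
    helix2_substrate = left[-(helix2_len + 1):-1]  # 4 nt immediately 5' of N
    helix1_substrate = right[3:]                   # everything 3' of GUC
    if helix2_substrate[-1] == "A":
        raise ValueError("B nucleotide (5' of N) cannot be A for trans cleavage")
    helix1_arm = reverse_complement(helix1_substrate)
    helix2_arm = reverse_complement(helix2_substrate)
    return {
        "helix1_bp": len(helix1_substrate),
        "helix1_arm": helix1_arm,
        "helix2_arm": helix2_arm,
        "five_prime_region": helix1_arm + loop1 + helix2_arm,
    }

# HIV-1 5' leader target with an 8-bp helix 1.
arms = design_binding_arms("UGCCC*GUCUGUUGUGU")
print(arms)
```

Shortening or lengthening the sequence 3' of GUC in the target string changes the helix 1 length, which is the variable scanned when optimizing a ribozyme for a given site.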
D. Catalytic Improvements
Once the ribozyme was designed and optimized for helix 1 length, we found that for certain target sequences we could improve catalytic efficiency as much as 15-fold by replacing loop 3 with a tetraloop sequence. The commonly found GGAC(UUCG)GUCC tetraloop sequence was used in place of loop 3 (inset, Fig. 6). The tetraloop forms a very stable stem loop structure (34) and thus likely stabilizes the ribozyme itself against thermal denaturation and alternate inactive conformations (8, 11). Depending on the specific target sequence to be cleaved by the ribozyme, the tetraloop addition to the ribozyme either had no effect on activity, decreased activity slightly, or increased activity. The change in the catalytic parameters of the HIV-1 pol-specific hairpin ribozyme was most significant. When the tetraloop addition was made to the basic hairpin ribozyme, the Km decreased from 42 nM to 6.7 nM and kcat increased from 0.2 to 0.5/min to give an overall increase in catalytic efficiency (kcat/Km) of 15-fold (8, 30). The hairpin tetraloop ribozyme was designed to cleave specific target sequences, following the same targeting rules for helix 1 and 2 as for the conventional hairpin ribozyme described above. This was done for all ribozymes developed for targeting. Specific examples given in this review are the 5' leader and pol-specific ribozymes to HIV-1 described in the next section.
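The 15-fold figure follows directly from the kinetic parameters quoted above for the pol-specific ribozyme. A minimal check, with function and variable names chosen here for illustration:

```python
def catalytic_efficiency(kcat_per_min, km_nm):
    """Catalytic efficiency kcat/Km in units of 1/(min * nM)."""
    return kcat_per_min / km_nm

# HIV-1 pol-specific hairpin ribozyme, values quoted in the text.
basic = catalytic_efficiency(kcat_per_min=0.2, km_nm=42.0)
with_tetraloop = catalytic_efficiency(kcat_per_min=0.5, km_nm=6.7)

fold_improvement = with_tetraloop / basic
print(f"basic: {basic:.4f}, tetraloop: {with_tetraloop:.4f}, "
      f"fold improvement: {fold_improvement:.1f}")
```

The improvement comes from both terms: Km falls about sixfold and kcat rises 2.5-fold, and the product of the two is roughly 15.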
V. Delivery of the Hairpin Ribozyme for Gene Therapy
A. Autocatalytic Hairpin Cassette (8, 35)
Once hairpin ribozymes were optimized for these targets, we designed a unique in vivo delivery system for pol II promoters. We hypothesized that the ribozyme might work best in vivo if it were not part of a large mRNA transcript with a long 3' untranslated region and an additional terminal poly-A sequence. Therefore, we designed an autocatalytic hairpin cassette (HC) to terminate the transcript. This cassette consisted of the sequence of the native hairpin substrate tethered at its 3' end to the 5' end of the ribozyme by a CCUCC pyrimidine loop. This arrangement created an intramolecular ribozyme-substrate complex that would autocatalytically cleave, generating a defined 3' terminus. When it was cloned into a vector and transcribed in vitro by the T7 RNA polymerase system, autocatalysis occurred, generating both 5' and 3' cleavage fragments (Fig. 7). The idea was to clone the gene-specific hairpin ribozyme into an appropriate vector with an appropriate promoter, and allow transcription to proceed. The ribozyme would be transcribed first, followed by the HC. Transcription of the HC would generate an intramolecular ribozyme-substrate complex that would autocatalytically cleave, forming a precisely defined 3' terminus on the ribozyme. The RNA polymerase would proceed to the end of the gene, but this would have no effect on the gene-specific ribozyme transcript, because it would already have been terminated by the autocatalytic cassette (Fig. 8).
FIG. 7. The hairpin autocatalytic cassette. The 3' end of the substrate and 5' end of the ribozyme (Fig. 1) have been covalently joined by the CCUCC pyrimidine sequence to form an intramolecular ribozyme, the hairpin autocatalytic cassette (HC). When transcribed, this ribozyme self-cleaves to form two fragments (5'F and 3'F), shown on the gel. Note that the 5'F contains only 5 nt of the processed HC. It thus serves as an excellent transcriptional termination molecule (8, 35). Reproduced from M. Altschuler, R. Tritz, and A. Hampel, Gene 122, 85 (1992) with kind permission of Elsevier Science-NL, Sara Burgerhartstraat 25, 1055 KV Amsterdam, The Netherlands.
Ribozyme transcripts were prepared in vitro from vectors in which the transcripts were terminated by the HC. Autocatalytic termination occurred as predicted, even when the vectors were not linearized. It was necessary to determine if such a catalytically terminated ribozyme maintained its activity, so the resulting ribozymes were tested for catalytic activity. No change in activity was observed for the ribozymes that had the HC 3' terminus. They were fully active (35).
FIG. 8. Generation of an active hairpin ribozyme in vivo from pol II promoters. The hairpin autocatalytic cassette (HC) was cloned downstream of the gene-specific hairpin ribozyme in pol II promoters. Upon transcription the gene-specific ribozyme was transcribed first, followed by the HC. The HC then autocatalytically cleaves to generate the short (5 nt minimum) finished 3' terminus of the ribozyme.
B. Promoter
A variety of choices for promoters were available. For our studies of downregulation of gene expression in vivo by the HIV-1-specific 5' leader and pol ribozymes, six different promoters of the pol II and pol III type were used. All six have successfully delivered the ribozyme in vivo and successfully downregulated gene expression of HIV-1. The six promoters are as follows:

Promoters                            Comments
1) Pol II MMTV LTR                   Dexamethasone inducible and HC terminated
2) Pol II β-actin promoter           HC terminated
3) Pol III human tRNAVal (tV1)       This promoter processed at the 3' terminus of the tRNA
4) Pol III adenovirus VA1
5) Pol II MMLV LTR                   HC terminated
6) Pol III human tRNAVal (tV5)       This promoter does not process at the 3' end of the tRNA
VI. Inhibition of HIV-1 Expression in Vivo
A. Targets and Ribozyme Selection
In order for a ribozyme to be active in vivo against HIV-1, three criteria must be met.
1. The ribozyme must have suitable catalytic activity against the small exposed target. While the effectiveness of the ribozyme is not finally known until the in vivo assays are done, it is possible to approximately simulate the catalytic conditions in vitro. For this purpose in vitro assays are done under our standard assay conditions of 37°C, pH 7.5 in low salt, and the presence of Mg2+, using a very short substrate. It is pointless to use a long transcript for in vitro studies, because activity against only the exposed target is of interest. Additional variables of folding, helical structures, and so forth in long RNA only prevent determination of catalytic activity against the target of interest. Therefore great care was taken to determine and optimize this catalytic activity. Optimization was done by varying the length of helix 1 to determine optimal length and by measuring catalytic activity with and without the tetraloop addition (8, 33). As stated earlier, the tetraloop GGAC(UUCG)GUCC used to replace the GUU sequence of loop 3 greatly improved catalytic activity in many cases. In searching the HIV-1 genome for target sites, the sequence BN*GUC, while a preference for a target site, was found not to be sufficient for suitable targets. Certain target sequences containing this sequence, while catalytically active, had poor in vitro catalytic activity. The reasons for this were not known. We found that in some cases, by
using one of our improved hairpin ribozymes, this low activity could be overcome.
2. The ribozyme and substrate must be co-compartmentalized in vivo. The ribozyme cannot work unless it is next to the substrate. If they are in different cellular compartments, no cleavage will occur.
3. The substrate target site must be exposed in vivo. Many RNA sequences are either part of RNA structures, covered with proteins, or part of other cellular structures. In each of these cases, the target sequence would be inaccessible to the ribozyme. Such in vivo constraints are normally not known and cannot be predicted. They can only be obtained empirically. That is, once a highly active ribozyme to a suitable target has been made, in vivo activity must be determined. Beyond known regions of obvious structure in the target region, it is impossible to tell if a site is exposed or not. RNA folding programs for free RNA in vitro are of limited value because of other variables in vivo. At present, too many unknown variables exist, making it impossible to accurately predict in vivo activity. Once a target-ribozyme combination has been shown to have suitable catalytic activity in vitro, it must be tested in vivo to determine its effectiveness.
B. The 5' Leader Target and Ribozyme
1. IDENTIFICATION AND in Vitro PROPERTIES
Using the rules and procedures outlined, we identified a site in HIV-1 that was efficiently cleaved in vitro by an appropriately engineered hairpin ribozyme. The suitable target was in the 5' leader region of HIV-1, with the sequence UGCCC*GUCUGUUGUGU. This sequence was located at nt 561-576 for the sequence of the HXB2 clone. The cleavage site of the sequence is 111/112 nt from the start of transcription (Fig. 9). The ribozyme to this site had an optimal helix 1 length of 8 bp. The catalytic parameters were excellent, with a Km of 100 nM and kcat of 1.6/min, compared to those of the native hairpin ribozyme (Km = 30 nM and kcat = 2.1/min). The 5' leader ribozyme thus had a catalytic efficiency of 20% of the native sequence (8, 36). Addition of the tetraloop did not improve catalytic activity, although the ribozyme had increased thermostability (A. Hampel, unpublished). We therefore chose to continue studies with the conventional hairpin ribozyme form for this particular target site. The target selected had a cleavage site between positions 111 and 112 from the start of transcription. This sequence was untranslated, but essential because it contained the 5' cap and the TAR sequence. This target was chosen because it did not contain obvious secondary structure and it was at a key position in the mRNA. If the mRNA was made capless it would be inactive. In addition to being found on all HIV-1 transcripts, it was also found on
FIG. 9. The HIV-1 5' leader target and ribozyme. (A) The target is located between positions 107 and 122 from the start of transcription, with cleavage between positions 111 and 112. This corresponds to the nucleotide numbers 561-576 in the sequence from the HXB2 clone of HIV-1. This sequence is in the 5' leader of all HIV-1 transcripts. (B) The target-specific HIV-1 5' leader ribozyme is of the conventional sTRSV type, with an 8-bp helix 1. An inactive form of this ribozyme was also constructed that had an AAA → CGU mutation in loop 2, as shown in parentheses (8, 36).
the viral genomic RNA. Furthermore, this target was conserved among more than 85% of the HIV-1 sequences found in GenBank. It was found in all the major HIV-1 isolates, with only strain MN having a single base mutation (Table II). Thus we were targeting a region of the viral genome that appeared not to contain large mutagenic changes.
2. INHIBITION OF HIV-1 EXPRESSION in Vivo BY THE 5' LEADER RIBOZYME
a. Inhibition in Stable Transfectants with the Ribozyme Driven by the MMTV Promoter (8). The 5' leader HIV-1-specific ribozyme was cloned into a mammalian expression plasmid, pMSG-dhh, along with the HC to
TABLE II
CONSERVATION OF THE HIV-1 5' LEADER AND POL TARGET SITES

5' leader target sequence (a)     HIV-1 strain
UGCCC*GUCUGUUGUGU                 HXB2
UGCCC*GUCUGUUGUGU                 MAL
UGCCC*GUCUGUUGUGU                 ELI
UGCCC*GUCUGUUGUGU                 SF2
UGCCC*GUCUGUUAUGU                 MN
UGCCC*GUCUGUUGUGU                 NL43
UGCCC*GUCUGUUGUGU                 RF

pol target sequence (b)           HIV-1 strain
CACC U*GUC AACAUAA                HXB2
CACC U*GUC AACAUAA                BRU
CACC U*GUC AACAUAA                MN
CACC U*GUC AACAUAA                OYI
CACC U*GUC AACAUAA                SF2
CACC U*GUC AACAUAA                HAN
CACC U*GUC AACAUAA                RF
CACC U*GUC AACAUAA                Z2
CACC U*GUC AACAUAA                NDK
CACC U*GUC AACAUAA                MAL
CGCC U*GUC AACAUAA                ELI

a The 5' leader target sequence is nucleotides 561-576 (nucleotides 107-122 from the start of transcription). Reproduced from Ref. 36.
b The pol target sequence is nucleotides 2490-2504. Reproduced from M. Yu, E. Poeschla, O. Yamada, P. DeGrandis, M. Leavitt, M. Heusch, J. Yee, F. Wong-Staal, and A. Hampel, Virology 206, 381 (1995) by permission of Academic Press.
generate a construct, pdRHIV, with the ribozyme driven by the dexamethasone-inducible mouse mammary tumor virus (MMTV) LTR promoter and terminated by the hairpin autocatalytic cassette. This plasmid was stably cotransfected into HeLa T4+ cells along with the plasmid ptat and the reporter plasmid pCDLTR. The plasmid ptat provided tat, which transcriptionally trans-activated the HIV-1 LTR promoter found in the reporter plasmid pCDLTR. The plasmid pCDLTR had the HIV-1 LTR promoter and the first 132 nt of the HIV-1 transcript, followed by the CAT gene. The CAT gene encoded chloramphenicol acetyltransferase, which served as the reporter protein for detection purposes. The pCDLTR provided both the target for the ribozyme and the reporter CAT protein. If the ribozyme cleaved the target, then a reduction in CAT activity would occur. An outline of this experiment is shown schematically in Fig. 10.
FIG. 10. Schematic of inhibition of HIV-1 transcripts by the 5' leader hairpin ribozyme. HeLa T4+ cells were stably transfected with all plasmids shown and then induced with dexamethasone to allow expression of the 5' leader ribozyme from the mouse mammary tumor virus (MMTV) LTR promoter in plasmid pdRHIV. This transcript contained the hairpin autocatalytic cassette (HC) following the ribozyme. This allowed transcription of the ribozyme with a short processed 3' terminus from this plasmid. It was targeted to the HIV-1 5' leader region, which was the 5' region of the CAT transcript found in plasmid pCDLTR. The CAT transcript begins with TAR, and thus its transcription is under control of the tat protein from the plasmid ptat. However, cleavage of the 5' leader region of pCDLTR by the ribozyme would remove the TAR region and inactivate expression of CAT (8).
Stably transfected cells were isolated by selection. This created a cell expressing tat, which trans-activated the expression of CAT as determined by high levels of CAT being produced. The ribozyme was under control of the dexamethasone-inducible MMTV promoter and terminated with the hairpin autocatalytic cassette. Upon induction with dexamethasone, the levels of CAT activity fell on average 42% (Table III), showing the ribozyme was effective against the HIV-1 target sequence in human cells. This experiment clearly showed the ribozyme decreased expression of a protein dependent on the target sequence of this ribozyme. This was the first example of in vivo efficacy of the hairpin ribozyme against a gene target in human cells (8).
b. Hairpin Ribozyme Driven by a pol II β-Actin Promoter Inhibits HIV-1 Expression in Vivo (8, 36). The pol II β-actin promoter was also used to deliver the ribozyme into human cells. The catalytically active 5' leader hairpin ribozyme was cloned into a mammalian expression vector following the pol II β-actin promoter and terminated by the HC. As an antisense control, a disabled 5' leader hairpin ribozyme containing an AAA → CGU mutation in loop 2 was developed (Fig. 9B). This ribozyme had no catalytic activity, but it had the same affinity for the target sequence as the catalytically active 5' leader ribozyme. This was determined by binding experiments. Therefore,
TABLE III
INHIBITION OF HIV-1 EXPRESSION IN STABLY TRANSFECTED HELA CELLS BY THE 5' LEADER HAIRPIN RIBOZYME (a)

                          Experiment (cpm)
                    1           2           3           4
Uninduced           126,819     33,993      67,491      62,633
Induced             42,199      23,443      49,841      35,950
% Reduction (b)     70%         31%         26%         43%

a The 5' leader ribozyme was expressed in HeLa cells by dexamethasone induction. The ribozyme was designed to remove the TAR region from the CAT reporter plasmid, thus reducing expression of CAT. Given are the cpm for CAT expression (8).
b Average percent reduction: 42.5.
the disabled ribozyme served as an antisense control. Its effect would be only antisense, with no catalytic component. The disabled ribozyme was cloned following the β-actin promoter exactly as the active ribozyme. The ribozyme plasmids, along with an infective HIV-1 clone, were transiently transfected into HeLa cells and HIV-1 gene expression measured. The target of the ribozyme would be the HIV-1 transcripts for the tat and p24 gag proteins. Expression of the tat protein was monitored by co-transfecting a CAT reporter plasmid with CAT transcription under the control of the TAR region. The p24 gag protein was measured by ELISA. The ribozyme inhibited expression of HIV-1 proteins, reducing the levels of tat and gag proteins approximately fourfold relative to the control without ribozyme (Fig. 11). The disabled ribozyme gave only a small (10%) reduction in tat and gag activity. Since the disabled ribozyme was catalytically inactive, this small reduction was likely antisense. Therefore, the primary effect of the active ribozyme could be attributed to catalysis and not antisense. This experiment clearly showed a dramatic inhibition of expression of HIV-1 proteins in the presence of the catalytically active ribozyme. Both viral proteins were inhibited, indicating that the ribozyme was likely cleaving only the 5' leader region, since a capless mRNA would not express any viral proteins. No other effects on the cells were observed (36).
c. Delivery by Retroviral Vectors Inhibits Expression of HIV-1 in Transient Transfections (37). The 5' leader ribozyme was cloned into LNLG, a Moloney murine leukemia virus (MMLV) vector, with the ribozyme driven by the pol III human tRNAVal tV1 promoter (38). This was done such that the ribozyme was transcribed in the direction opposite that of the transcript from the viral LTR promoter. In transient transfections using HeLa cells,
[Figure 11. Panel A: Effect of Ribozyme on HIV-1 TAT Activity. Panel B: Effect of Ribozyme on p24 Production.]
FIG. 11. Hairpin ribozyme driven by a pol II β-actin promoter inhibits HIV-1 expression in vivo. HeLa cells were co-transfected with three plasmids. The first contained the ribozyme (Rz) driven by the β-actin promoter and terminated by the hairpin autocatalytic cassette. The second plasmid was HIV-1 (HXB2 strain), which contained the target for the ribozyme. The third plasmid was a CAT reporter plasmid, which was activated by tat from the HIV-1. HIV-1/Rz plasmid ratios of both 1:5 and 1:10 were used. As a control, the disabled hairpin ribozyme containing the AAA → CGU mutation (Fig. 9B) was used. Both (A) tat expression (as measured by CAT levels) and (B) p24 gag protein expression (as measured by ELISA) were assayed (36).
HIV-1 was inhibited 95% while the antisense control remained at 10% inhibition, again showing the effectiveness of this ribozyme against HIV-1 and the improved efficacy of the retroviral vector and pol III promoter. A lesser, but still highly effective, inhibition was seen when the ribozyme was driven by the pol III adenovirus VA1 promoter.
d. Expression of HIV-1 Is Inhibited in T Lymphocytes Stably Transfected with the 5' Leader Hairpin Ribozyme (39). Human T lymphocytes, stably transfected with the 5' leader hairpin ribozyme and challenged with live HIV-1 virus, displayed expression of viral proteins reduced by 1000- to 10,000-fold. This effect continued for the duration of the experiment, 35 days, without detection of any escape mutants. During this time period, viral titers were reduced to undetectable levels. The experiment was carried out using the 5' leader HIV-1 ribozyme driven by a pol III human tRNAVal tV1 promoter in the MMLV retroviral vector. A similar but slightly reduced effect was seen
with ribozyme driven by the viral LTR pol II promoter and terminated by the HC. As a control experiment, the 5' leader ribozyme was tested against HIV-2, a virus similar to HIV-1 but lacking the target sequence. No inhibition whatsoever was seen against this virus. Thus the ribozyme was exquisitely specific: it inhibited only the gene product of the mRNA sequence it was designed to cleave. In addition, cells containing the ribozyme were not impaired. Their growth properties and DNA synthesis rates were unchanged. Thus the ribozyme was not toxic to human cells, a necessary requirement for future human gene therapy using this tool.

It had been suggested that an additional target for the 5' leader hairpin ribozyme might be the incoming viral RNA itself (12). Human T cells were challenged with virus, and viral DNA was isolated after 6 h, a time before the virus could complete one full cycle of replication. In cells that had the ribozyme stably transfected, a 50- to 100-fold reduction in provirus DNA occurred. Thus, the ribozyme inhibited the incoming viral RNA in these cells. This places the hairpin ribozyme in a unique class of HIV-1 agents. It was able to serve as a combination drug at the level of both viral infection and viral expression. This is a regimen often promoted for the ultimate control of HIV-1. Rather than requiring multiple drugs to carry out this function, the hairpin ribozyme was able to perform combination therapy as a single entity (39).

e. Diverse Strains of HIV-1 Are Inhibited. Since HIV-1 is diverse in its sequence, a necessary requirement for a sequence-specific therapy such as the hairpin ribozyme was that the target sequence be present. In order for it to be present, it needed to be conserved. The 5' leader target sequence was indeed highly conserved, with 85% of all known HIV-1 isolates in GenBank having the identical sequence.
The 5' leader-specific hairpin ribozyme was able to inhibit all strains tested: HXB2, SF2, Eli, and an uncloned clinical isolate (8, 37, 39). An exception to this conserved sequence was that found in the strain MN. Here a single base substitution of G to A occurred in the middle of helix 1 (Table II). When this target sequence was assayed in vitro, about a 10% reduction in catalytic activity occurred (A. Hampel, unpublished). A similar result was seen in vivo, with expression of HIV-1 proteins reduced about 100-fold rather than the 1000-fold seen in strains containing the nonmutated sequence, making the ribozyme about one tenth as effective when the mutation was present (37, 39). Overall, these results clearly showed that the single 5' leader ribozyme was capable of inactivating a wide range of diverse HIV-1 strains, even the variants found in clinical isolates. Specificity was for the single unique target chosen, with no effect whatsoever on the related virus HIV-2,
which lacked the target sequence. Furthermore, no detectable deleterious effects occurred on the host cells themselves.
C. The Pol-Specific Target and Ribozyme

1. IDENTIFICATION AND in Vitro PROPERTIES (8, 30)

Because of the existence of diversity, albeit limited, at the 5' leader target site, and the plasticity of the HIV-1 genome, the virus likely would be able to escape such a single-sequence-specific targeting therapeutic. An answer to this problem may be to find additional site-specific hairpin ribozymes. Such a collection of hairpin ribozymes targeted to a number of sites would create an HIV-1-specific weapons arsenal. With this in mind, we continued the search of the HIV-1 genome for additional suitable targets and ribozymes. Another target site for which the ribozyme had excellent in vitro catalytic activity was found in the protease region of the pol gene. The sequence of the target at this site is CACCU*GUCAACAUAA (nt 2490-2504), with cleavage after position 2494, numbered according to the HXB2 clone of HIV-1 (Fig. 12A).
FIG. 12. The pol ribozyme target site in HIV-1. (A) A highly conserved target site in the protease region of the pol gene served as substrate for (B) the tetraloop hairpin ribozyme engineered to cleave this target. This is the pol-specific hairpin ribozyme. Helix 1 was optimized to 7 bp.
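The coordinates quoted for the target sites can be cross-checked with a few lines of Python. This is a minimal sketch, assuming the sequences and first-nucleotide positions exactly as printed in the text and in Table IV; the function and variable names are illustrative and do not come from the original work.

```python
# Target sequences as written in the text; '*' marks the cleavage site.
POL_TARGET = "CACCU*GUCAACAUAA"       # reported as nt 2490-2504 (HXB2 numbering)
LEADER_TARGET = "UGCCC*GUCUGUUGUGU"   # reported as nt 561-576 (Table IV)


def site_coordinates(marked_target, first_nt):
    """Return (last_nt, cleavage_after) for a target written with a '*' marker.

    first_nt is the genome coordinate of the first nucleotide of the target.
    cleavage_after is the coordinate of the nucleotide just 5' of the cut.
    """
    five_prime, three_prime = marked_target.split("*")
    length = len(five_prime) + len(three_prime)
    last_nt = first_nt + length - 1
    cleavage_after = first_nt + len(five_prime) - 1
    return last_nt, cleavage_after


print(site_coordinates(POL_TARGET, 2490))
print(site_coordinates(LEADER_TARGET, 561))
```

For the pol site this reproduces the stated end position (2504) and the stated cleavage after position 2494.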
Furthermore, this sequence was highly conserved. Sequence conservation among major HIV-1 isolates was greater than 85%, with only ELI being a variant. It had a single base mutation, A → G, in the middle of helix 2 (Table II). Optimization of the ribozyme was carried out, and helix 1 was optimized to 7 bp with this target. Both the conventional and tetraloop versions of the ribozyme were made and assayed for catalytic activity. The pol tetraloop ribozyme (Fig. 12B) had the highest catalytic efficiency (kcat/Km of 75 µM⁻¹ min⁻¹) of any hairpin ribozyme we have ever tested. It was 15-fold greater than that of the conventional version for this pol site, 5-fold greater than that of the 5' leader ribozyme, and even greater (by about 7%) than that of the native sTRSV hairpin ribozyme (Table IV). Note the very small Km value of 6.7 nM for the tetraloop version. This is the smallest Km we have seen for any hairpin ribozyme we have tested. It is especially significant when compared to that of the 5' leader ribozyme, which has a Km of 100 nM. One is large, and one is small. This makes them complementary in removing substrate, and argues for simultaneous in vivo delivery of the two ribozymes.

TABLE IV
COMPARISON OF CATALYTIC EFFICIENCY OF HAIRPIN RIBOZYMESᵃ

Ribozyme              Target sequence      Position     Km (nM)   kcat (min⁻¹)   Catalytic efficiencyᵇ
Native                UGACA*GUCCUGUUU                   30        2.1            70
HIV-1 pol/tetraloop   CACCU*GUCAACAUAA     2490-2504    6.7       0.5            75
HIV-1 pol             CACCU*GUCAACAUAA     2490-2504    42        0.2            5
HIV-1 5' leader       UGCCC*GUCUGUUGUGU    561-576      100       1.6            16

ᵃ Catalytic parameters of the pol ribozyme, both with and without the tetraloop modification, were determined. These were compared to those of the native sequence and the HIV-1 5' leader ribozyme (30).
ᵇ Catalytic efficiency = kcat/Km (µM⁻¹ min⁻¹).

2. In Vivo INHIBITION OF HIV-1 (30)

The pol-specific hairpin ribozyme with tetraloop and an optimized 7-bp helix 1 was tested in vivo against infective HIV-1 virus. This ribozyme was
cloned into the MMLV vector LNL6. It was transfected into the human transformed T cell lines Jurkat and Molt 4/8, and stable transfectants were selected. Ribozyme expression was confirmed by RT-PCR analysis, and no deleterious effect was seen when the ribozyme was expressed in long-term cultures. When transfected cells containing the pol ribozyme under control of the pol III adenovirus VA1 promoter were challenged with live HIV-1 isolates, expression of HIV-1 p24 protein was reduced to background levels up to 12 days posttransfection (Fig. 13). When the ribozyme was expressed under the control of the HIV-1 LTR with the HC in place to terminate it, inhibition of HIV-1 p24 expression was not as great. As a reference control, the 5' leader ribozyme under the control of the human tRNAVal promoter tV1 was tested alongside the pol ribozyme, and again, it was highly effective in inhibiting expression of HIV-1, but not as effective as the pol ribozyme from the pol III promoter. The pol ribozyme is, to date, the most effective in vivo ribozyme we have tested. The fact that this directly corresponds to its outstanding in vitro catalytic efficiency is likely not an accident. These results argue for a logical development of site-specific ribozymes using optimized in vitro enzymatic and biochemical properties of the system. By developing the most efficient and stable ribozymes possible, the chances of high in vivo efficacy may well be improved (8).

[Figure 13. Panel A: Jurkat. Panel B: Molt 4/8. Horizontal axis: days post infection.]

FIG. 13. Inhibition of HIV-1 expression in (A) Jurkat and (B) Molt 4/8 cells by pol-specific hairpin ribozyme. The hairpin ribozyme was cloned into MMLV vectors, stably transfected into cells, and challenged with HIV-1 virus strain HXB2. Transfected cells contained one of the following: the parental MMLV vector (no ribozyme); the pol ribozyme driven by the adenovirus pol III promoter VA1; the pol ribozyme driven by the MMLV LTR pol II promoter with the transcript terminated by the hairpin autocatalytic cassette; or the 5' leader ribozyme driven by the pol III tRNAVal tV1 promoter. Reproduced in modified form from M. Yu, E. Poeschla, O. Yamada, P. DeGrandis, M. Leavitt, M. Heusch, J. Yees, F. Wong-Staal, and A. Hampel, Virology 206, 381 (1995) with permission of Academic Press.
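The fold differences quoted for the pol tetraloop ribozyme can be recomputed from the Km and kcat values in Table IV. The Python sketch below is only a consistency check, assuming that the Km and kcat values pair with the ribozymes as read from Table IV; the dictionary layout and function name are illustrative.

```python
# Kinetic constants as read from Table IV: Km in nM, kcat in min^-1.
TABLE_IV = {
    "native sTRSV": {"km_nM": 30.0, "kcat_per_min": 2.1},
    "HIV-1 pol/tetraloop": {"km_nM": 6.7, "kcat_per_min": 0.5},
    "HIV-1 pol": {"km_nM": 42.0, "kcat_per_min": 0.2},
    "HIV-1 5' leader": {"km_nM": 100.0, "kcat_per_min": 1.6},
}


def catalytic_efficiency(kcat_per_min, km_nM):
    """Return kcat/Km in uM^-1 min^-1 (Km is converted from nM to uM)."""
    km_uM = km_nM / 1000.0
    return kcat_per_min / km_uM


efficiencies = {
    name: catalytic_efficiency(p["kcat_per_min"], p["km_nM"])
    for name, p in TABLE_IV.items()
}

# Ratios relative to the pol tetraloop ribozyme, the most efficient entry.
best = efficiencies["HIV-1 pol/tetraloop"]
fold_over_pol = best / efficiencies["HIV-1 pol"]
fold_over_leader = best / efficiencies["HIV-1 5' leader"]

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1f} uM^-1 min^-1")
print(f"pol/tetraloop vs pol: {fold_over_pol:.1f}-fold")
print(f"pol/tetraloop vs 5' leader: {fold_over_leader:.1f}-fold")
```

The recomputed ratios come out close to the rounded 15-fold and 5-fold figures given in the text; small differences reflect rounding of the tabulated constants.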
D. The Double Ribozyme

Since the HIV-1 genome has an extremely high frequency of mutation, due primarily to the very rapid rate of proliferation and turnover of the virus (40), we designed a construct containing both the 5' leader and the pol-specific ribozymes (A. Hampel, unpublished). Complementarity of the Km and kcat values for the pol and 5' leader ribozymes made this an extremely powerful construct. That is, when both ribozymes are delivered simultaneously, the higher Km and kcat values of the 5' leader ribozyme will allow this ribozyme to work against the initial higher concentrations of viral RNA. As the levels are reduced, the very low Km (6.7 nM) of the pol ribozyme will allow it to "clean up" remaining viral RNA transcripts. Another advantage of using both ribozymes simultaneously is the decrease in likelihood of the virus mutating both target sites simultaneously. Sequence searches of GenBank find only two rare isolates that are mutated in both sites. In vivo studies with the double ribozyme are in progress.
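The complementarity argument can be illustrated numerically. The sketch below is a simplified model, not the authors' analysis: it assumes simple Michaelis-Menten behavior, equal concentrations of the two ribozymes, and the in vitro constants from Table IV (kcat 1.6 min⁻¹ and Km 100 nM for the 5' leader ribozyme; kcat 0.5 min⁻¹ and Km 6.7 nM for the pol tetraloop ribozyme). Intracellular conditions may differ considerably, and all names are illustrative.

```python
# In vitro constants as read from Table IV (assumed to apply unchanged).
LEADER = {"kcat_per_min": 1.6, "km_nM": 100.0}
POL = {"kcat_per_min": 0.5, "km_nM": 6.7}


def turnover_per_min(substrate_nM, ribozyme):
    """Michaelis-Menten turnover per ribozyme molecule at a given substrate level."""
    kcat = ribozyme["kcat_per_min"]
    km = ribozyme["km_nM"]
    return kcat * substrate_nM / (km + substrate_nM)


def crossover_nM(a, b):
    """Substrate concentration at which ribozymes a and b turn over equally fast.

    From kcat_a*S/(Km_a + S) = kcat_b*S/(Km_b + S):
        S = (kcat_b*Km_a - kcat_a*Km_b) / (kcat_a - kcat_b)
    """
    return (b["kcat_per_min"] * a["km_nM"] - a["kcat_per_min"] * b["km_nM"]) / (
        a["kcat_per_min"] - b["kcat_per_min"]
    )


switch_point = crossover_nM(LEADER, POL)
print(f"crossover at about {switch_point:.1f} nM substrate")
for s in (5.0, 500.0):
    print(s, turnover_per_min(s, LEADER), turnover_per_min(s, POL))
```

Under these assumptions the 5' leader ribozyme turns over faster above the crossover concentration and the pol ribozyme turns over faster below it, which is the behavior the text describes qualitatively.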
E. Human Clinical Trials

As this review is being written, Phase I human clinical trials, under the supervision of Dr. Flossie Wong-Staal at the University of California-San Diego, are beginning with these two ribozymes. These Phase I human clinical trials are of an ex vivo nature. The general protocol proposed is to isolate human T lymphocytes, transfect one half of these cells with the MMLV vector containing both the 5' leader and pol-specific HIV-1 ribozymes, and transfect the other half with the parental control LNL6 vector alone. These transfected lymphocytes will then be reinjected into the patient. In addition to monitoring the patient for toxicity, the ratio of T cells containing the ribozymes relative to those containing the control will be monitored for efficacy (7, 41). This study represents the first human clinical trial of any ribozyme, and we excitedly await the results.
F. Improvements

A significant improvement has been made with the design of the promoter. The human tRNAVal promoter used for studies described in this review was the human tV1 tRNAVal(AAC) major promoter of Hans Gross (38). This is a normal tRNA promoter, and therefore after transcription the transcript is processed at its 3' terminus. Since the ribozyme was cloned at the 3' terminus of the tRNAVal, normal tRNA processing events will occur to release the ribozyme
[Figure 14. Secondary-structure drawing of the transcript, labeled 5', ribozyme, and nonprocessing tRNAVal.]
FIG. 14. The transcript from the nonprocessing tRNAVal promoter. This transcript from the plasmid ptV5 contains the ribozyme from the major tRNAVal(AAC) promoter with four mutations from the minor tRNAVal(CAC) form. These four mutations are identified by the arrows and make the transcript nonprocessing. A knotted tetraloop sequence has been added to the 3' terminus to slow 3' exonuclease activity (A. Siwkowski, M. DeYoung, J. Rappaport, unpublished).
with a new 5' terminus. The resultant ribozyme has now been processed away from the tRNA, and since it has a free 5'-OH terminus, it is likely to be susceptible to destruction by intracellular nucleases. This likely explains why intracellular levels of ribozyme using this construct were found to be low, decreasing the possibility of it being an effective therapeutic. To help overcome this processing problem, we are now utilizing a combination of two of the tRNAVal promoters of Gross. These two promoters are the human tRNAVal(AAC) promoter tV1, which is the major form, and the tRNAVal(CAC) promoter tV4, which is a nonprocessing minor form (42). We have inserted the four mutations from the tRNAVal(CAC) minor form, which made it nonprocessing, into the major form to create a hybrid promoter that we named ptV5. This combination was used because it would potentially give
the promoter with the highest in vivo transcription and, simultaneously, make it nonprocessing. Indeed, this was the case, as verified by analyzing the transcripts and processing products in HeLa cell nuclear extracts containing the pol III transcription system as well as the tRNA processing enzymes. High levels of transcription were achieved and the transcript lacked typical tRNA processing (A. Siwkowski and J. Rappaport, unpublished). Thus the construct now yields the ribozyme with a tRNA on the 5' terminus (Fig. 14). As a further improvement we have added a knotted tetraloop at the 3' terminus to inhibit 3' exonuclease activity. Transcripts containing the ribozyme with tRNA at the 5' terminus and the tetraloop at the 3' terminus have the same catalytic activity as those with the ribozyme alone (A. Siwkowski, unpublished). This promoter is presently being tested by using it to deliver hairpin ribozymes against HIV-1 and human papillomavirus to determine if it has improved in vivo effectiveness for delivering the ribozyme. Preliminary results for transient transfections, using the 5' leader ribozyme in human cells in culture, have shown that the tV5 promoter is 4 times more effective in the inhibition of expression of HIV-1 proteins than the tV1 processing tRNAVal(AAC) promoter (J. Rappaport, unpublished).
VII. Additional Hairpin Ribozymes: GUA Specific

While the hairpin ribozyme to date has been remarkably effective against HIV-1, additional ribozymes would be advantageous for the development of the hairpin ribozyme for gene therapy. Additional ribozymes would greatly expand the number of ribozymes available against the target. We determined the catalytic properties and targeting rules for two new classes of hairpin ribozymes, those from the negative strands of the satellite RNAs of chicory yellow mottle virus type 1 (sCYMV1) and arabis mosaic virus (sArMV). We modeled these ribozymes according to the rules for the sTRSV hairpin ribozyme and were able to identify a suitable natural substrate for trans cleavage studies for each of these systems (10, 43) (Fig. 15). Notice that helix 1 of both the sCYMV1 and sArMV hairpin ribozymes has 5 bp as compared to that from sTRSV, which has 6 bp. Furthermore, helix 2 of the sArMV ribozyme has only 3 Watson-Crick base pairs as compared to 4 bp for the ribozymes from sTRSV and sCYMV1. We determined catalytic parameters for these ribozymes and found they had excellent catalytic activity compared to the native sTRSV hairpin ribozyme (Table V). Improvements have been made for both the sCYMV1 and sArMV systems. For the sArMV system a 4-bp helix 2 was preferable, and we later found that catalytic parameters of both ribozymes improved with a 6-bp helix 1.
FIG. 15. GUA-specific hairpin ribozymes. Hairpin ribozymes from the negative strands of the satellite RNAs of (A) chicory yellow mottle virus (sCYMV1) and (B) arabis mosaic virus (sArMV). Both of these ribozymes have GUA specificity. The boxed sequences are identical between these two sequences and that of the sTRSV hairpin ribozyme in Fig. 1. These are the only three known naturally occurring forms of the hairpin ribozyme. Reproduced from M. B. DeYoung, A. Siwkowski, Y. Lian, and A. Hampel, Biochemistry 34, 15785 (1995). Copyright 1995 American Chemical Society.
Note that the catalytic parameters for the sTRSV hairpin ribozyme have varied somewhat during the 8 years this study was carried out. The reason for this is that, from time to time, we have observed variations in the kinetic parameters obtained for a given system. This is likely due to changes in buffers, pH, metal contamination, temperature, and technique of the laboratory worker doing the experiment. Therefore, all kinetic analyses for any given ribozyme in our studies are done in comparison to that of the native sTRSV hairpin sequence, assayed simultaneously. This allows us to have an invariant standard, the native sTRSV hairpin ribozyme, for comparison purposes.
TABLE V
KINETIC PARAMETERS AND COMPARISON OF THE EFFICIENCY OF *GUX CLEAVAGE BY sTRSV, sCYMV1, AND sArMV RIBOZYMESᵃ,ᵇ

Ribozyme   Substrate *GUX   kcat (min⁻¹)   Km (nM)     kcat/Km (10⁴ M⁻¹ min⁻¹)
sTRSV      GUA              0.038          2500        2
           GUC              0.36           96          360
           GUG              0.01           540         2
           GUU              0.045          390         12
sCYMV1     GUA              0.32           400         80
           GUC              0.14           1500        9
           GUG              NDᶜ            >6500       -
           GUU              ND             >10,000     -
sArMV      GUA              0.26           880         30
           GUC              0.22           5400        4
           GUG              0.26           1700        15
           GUU              0.40           3000        10

ᵃ Reproduced from M. B. DeYoung, A. Siwkowski, Y. Lian, and A. Hampel, Biochemistry 34, 15785 (1995). Copyright 1995 American Chemical Society.
ᵇ The kinetic parameters of these ribozyme substrates were determined by changing only the X base in the *GUX sequence of the substrate. For all the sArMV substrates a 4-bp helix 2 was used. Note that all sArMV and sCYMV1 reactions used a 5-bp helix 1 while the sTRSV reaction had a 6-bp helix 1.
ᶜ ND, not determined; Vmax could not be reached.
A most remarkable aspect of these studies was the discovery that both the sArMV and sCYMV1 ribozymes had different target specificity than the sTRSV-based hairpin ribozymes. The sTRSV system preferred *GUC after the site of cleavage, while both the sCYMV1 and sArMV systems preferred the *GUA sequence. At present we do not know the reasons for this variation in target specificity. However, the catalytic efficiency of the sCYMV1-based ribozyme for GUA sites was nearly 3-fold greater than that of the sArMV ribozyme. By extending helix 1 to 6 bp, we improved catalytic efficiency an additional 4-fold over the 5-bp complex. Catalytic efficiency was improved an additional 1.7-fold when the thermostable tetraloop replaced loop 3 of the ribozyme (10, 43, 44).

With another excellent ribozyme system in hand, and a new repertoire of targets containing *GUA, we began searching the HIV-1 genome for potential targets for gene inactivation. A large number of conserved sequences containing GUA were located, and appropriately engineered sCYMV1 hairpin ribozymes were constructed and tested. One site in particular, which was highly conserved, showed excellent in vitro activity. The cleavage site was at position 9218, located in the nef/LTR region of HIV-1 (Fig. 16).
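The substrate preferences can be read off Table V by ranking kcat/Km for each ribozyme. The Python sketch below assumes the kcat and Km values as listed in Table V and skips the two sCYMV1 substrates for which kcat was not determined; the data layout and names are illustrative. It works from unrounded ratios, so individual values can differ slightly from the rounded table entries.

```python
# (kcat in min^-1, Km in nM) as read from Table V; ND entries are omitted.
TABLE_V = {
    "sTRSV": {"GUA": (0.038, 2500), "GUC": (0.36, 96), "GUG": (0.01, 540), "GUU": (0.045, 390)},
    "sCYMV1": {"GUA": (0.32, 400), "GUC": (0.14, 1500)},
    "sArMV": {"GUA": (0.26, 880), "GUC": (0.22, 5400), "GUG": (0.26, 1700), "GUU": (0.40, 3000)},
}


def efficiency_1e4(kcat_per_min, km_nM):
    """Return kcat/Km in units of 10^4 M^-1 min^-1 (Km converted from nM to M)."""
    return kcat_per_min / (km_nM * 1e-9) / 1e4


efficiency = {
    ribozyme: {site: efficiency_1e4(*params) for site, params in substrates.items()}
    for ribozyme, substrates in TABLE_V.items()
}

# Preferred substrate = the *GUX site with the largest kcat/Km.
preferred = {
    ribozyme: max(values, key=values.get) for ribozyme, values in efficiency.items()
}

gua_ratio = efficiency["sCYMV1"]["GUA"] / efficiency["sArMV"]["GUA"]

print(preferred)
print(f"sCYMV1/sArMV efficiency on GUA: {gua_ratio:.1f}-fold")
```

Ranking by kcat/Km rather than by kcat alone matters here: sArMV has its highest kcat on GUU, but its much lower Km on GUA makes GUA the preferred site.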
FIG. 16. Target site for a GUA-specific ribozyme. This target site is located in the nef/LTR region and was cleaved by an engineered sCYMV1-based hairpin ribozyme (44).
The sCYMV1 ribozyme optimized to this site contained 6 bp in helix 1 and had an altered tetraloop in place of loop 3. The overall catalytic efficiency (kcat/Km) of this ribozyme was 51 × 10⁴ M⁻¹ min⁻¹, which is very close to that of the native sCYMV1 ribozyme, 80 × 10⁴ M⁻¹ min⁻¹ (Table V; 44). Thus a new class of hairpin ribozymes for targeting was discovered, those with *GUA specificity. This, when added to the repertoire of sTRSV-based hairpin ribozymes, which have a *GUC specificity, should essentially double the number of available hairpin ribozymes for gene therapy, simply because the number of potential targets is approximately doubled. We are continuing to develop these for HIV-1 and for human papillomavirus gene therapy applications.
VIII. Conclusions and Perspectives

A previously unknown catalytic RNA, the hairpin ribozyme, has been discovered in the negative strand of sTRSV and its minimal sequence determined. The essential elements of its two-dimensional structure have been identified by site-specific mutagenesis. Four helices containing 18 bp, one of which was A:G, were identified. The helices were interspersed by five single-stranded loops. The cleavage reaction has been shown to occur in trans for this minimal sequence, and it has been shown to be catalytic. The catalytic and thermodynamic parameters have been determined along with the pH and divalent cation requirements.

Once the secondary structure was known, it was evident that heterologous RNA could be cleaved by an appropriately engineered ribozyme. Targeting rules for cleavage of such substrate heterologous RNAs were determined, and it was found that the substrate had a BN*GUC sequence constraint where B was any nucleotide except A. We searched GenBank for suitable targets in a number of systems, including HIV-1. Highly conserved targets containing the BN*GUC sequence were identified and ribozymes engineered to these targets. Ribozymes were optimized by varying the length of helix 1 and adding a tetraloop in place of loop 3 to improve activity in certain cases. Two of these, the 5' leader ribozyme and the pol ribozyme, were found to have excellent catalytic efficiency in vitro against small targets.

An in vivo delivery and test system was developed. For pol II promoters, the ribozyme transcript was terminated with an autocatalytic hairpin cassette to give a defined 3' terminus. Both the HIV-1-specific 5' leader ribozyme and the pol ribozyme were tested against HIV-1 in vivo in human cell culture. Expression of viral proteins was reduced by 3-4 logs and viral titers were reduced to background. The ribozyme reduced both expression of the viral proteins and replication of the virus. It was effective in inhibiting a wide range of viral strains. The reduction was shown to be catalytic and not antisense. No deleterious effects of the ribozyme were observed on the cells in which it was being expressed. Human clinical trials, under the direction of Dr. Flossie Wong-Staal at the University of California-San Diego, are about to begin using these two ribozymes.

The system is currently being improved and expanded upon. As an improvement in delivery, a promoter containing a nonprocessing human tRNAVal was constructed and is now being used for in vivo studies. Two new classes of hairpin ribozymes from sCYMV1 and sArMV were identified and characterized. Both of these ribozymes had *GUA target specificity rather than the *GUC specificity of the sTRSV-based hairpin ribozyme. Catalytic efficiency was very high.
The sCYMV1 ribozyme has been engineered to cleave *GUA targets in both human papillomavirus- and HIV-1-infected human cells. These studies are continuing. Overall, these results are very exciting and bring the field of ribozymes to a new level of importance. In addition to the promise of the hairpin ribozyme as an AIDS therapeutic discussed in this review, a myriad of other gene therapy applications in human and nonhuman systems can be postulated for the hairpin ribozyme. It has promise for gene therapy applications in the areas of medicine, pharmaceuticals, virology, agriculture, and simply understanding how a gene works.
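The BN*GUC sequence constraint summarized above lends itself to a simple pattern search. The following Python sketch is only an illustration of that one rule, assuming B is C, G, or U and N is any nucleotide; it ignores the other criteria discussed in this review (sequence conservation, helix lengths, measured activity), and the function name is hypothetical.

```python
import re

# B = any nucleotide except A; N = any nucleotide; cleavage falls between N and GUC.
# A lookahead is used so that overlapping candidate sites are all reported.
BN_GUC = re.compile(r"(?=[CGU][ACGU]GUC)")


def candidate_cleavage_offsets(rna):
    """Return, for each BN*GUC match, the number of nucleotides 5' of the cut."""
    rna = rna.upper().replace("T", "U")   # accept DNA-style input as well
    return [match.start() + 2 for match in BN_GUC.finditer(rna)]


# Targets quoted in the text and Table IV, written without the '*' marker.
print(candidate_cleavage_offsets("CACCUGUCAACAUAA"))    # pol target
print(candidate_cleavage_offsets("UGCCCGUCUGUUGUGU"))   # 5' leader target
print(candidate_cleavage_offsets("AAGUC"))              # B = A, so no site
```

For both HIV-1 targets the search places the cut five nucleotides from the 5' end, matching the position of the asterisk in the sequences as printed.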
ACKNOWLEDGMENTS

I am indebted to the many people, both within my laboratory and in collaboration, who worked with me on this project. All are gratefully recognized, because without their efforts these
discoveries would not have occurred. Financial support from Northern Illinois University, the National Science Foundation, the National Institutes of Health, Genentech, and Biotechnology Research and Development Corporation is appreciated.
REFERENCES

1. C. Guerrier-Takada, K. Gardiner, T. Marsh, N. Pace, and S. Altman, Cell 35, 849 (1983).
2. T. Cech, Annu. Rev. Biochem. 59, 543 (1990).
3. D. Long and O. Uhlenbeck, FASEB J. 7, 25 (1993).
4. A. Hampel and R. Tritz, Biochemistry 28, 4929 (1989).
5. A. Hampel, R. Tritz, M. Hicks, and P. Cruz, Nucleic Acids Res. 18, 299 (1990).
6. A. Hampel, S. Nesbitt, R. Tritz, and M. Altschuler, Methods: Comp. Methods Enzymol. 5, 37 (1993).
7. P. Welch, A. Hampel, J. Barber, F. Wong-Staal, and M. Yu, in "Nucleic Acids and Molecular Biology" (F. Eckstein and D. Lilley, eds.), Vol. 10, p. 315. Springer-Verlag, Berlin, 1997.
8. A. Hampel and R. Tritz, "HIV Targeted Hairpin Ribozymes," U.S. Patent 5,527,895, June 18, 1996.
9. J. Haseloff and W. Gerlach, Gene 82, 53 (1989).
10. M. B. DeYoung, A. Siwkowski, Y. Lian, and A. Hampel, Biochemistry 34, 15785 (1995).
11. P. Anderson, J. Monforte, R. Tritz, S. Nesbitt, J. Hearst, and A. Hampel, Nucleic Acids Res. 22, 1096 (1994).
12. N. Sarver, M. Johnston, A. Hampel, J. Zaia, E. Cantin, P. Chang, and J. Rossi, in "Gene Regulation and AIDS" (T. Papas, ed.), p. 305. Gulf Publishing Co., Houston, 1989.
13. N. Sarver, A. Hampel, E. Cantin, J. Zaia, P. Chang, M. Johnston, J. McGowan, and J. Rossi, Annals New York Acad. Sci. 616, 606 (1990).
14. G. Prody, J. Bakos, J. Buzayan, I. Schneider, and G. Bruening, Science 231, 1577 (1986).
15. T. Forster and R. Symons, Cell 49, 211 (1987).
16. J. Buzayan, W. Gerlach, and G. Bruening, Nature (London) 323, 349 (1986).
17. W. Gerlach, J. Buzayan, M. Schneider, and G. Bruening, Virology 151, 172 (1986).
18. J. Buzayan, A. Hampel, and G. Bruening, Nucleic Acids Res. 14, 9729 (1986).
19. J. Buzayan, W. Gerlach, G. Bruening, P. Keese, and A. Gould, Virology 151, 186 (1986).
20. P. Feldstein, J. Buzayan, and G. Bruening, Gene 82, 53 (1989).
21. O. Uhlenbeck, Nature (London) 328, 596 (1987).
21a. R. Shippy, A. Siwkowski, and A. Hampel, Biochemistry 36, 3930 (1997).
22. S. Schmidt, L. Beigelman, A. Karpeisky, N. Usman, U. Sørensen, and M. Gait, Nucleic Acids Res. 24, 573 (1996).
23. B. Chowrira, A. Berzal-Herranz, C. Keller, and J. Burke, J. Biol. Chem. 268, 19458 (1993).
24. J. Monforte, Ph.D. Thesis, University of California-Berkeley (1991).
25. S. Butcher and J. Burke, Biochemistry 33, 992 (1994).
26. J. Grasby, K. Mersmann, M. Singh, and M. Gait, Biochemistry 34, 4068 (1995).
27. Y. Komatsu, I. Kanzaki, M. Koizumi, and E. Ohtsuka, J. Mol. Biol. 252, 296 (1995).
28. B. Chowrira and J. Burke, Nature (London) 354, 320 (1991).
29. S. Joseph, A. Berzal-Herranz, B. Chowrira, S. Butcher, and J. Burke, Genes Dev. 7, 130 (1993).
30. M. Yu, E. Poeschla, O. Yamada, P. DeGrandis, M. Leavitt, M. Heusch, J. Yees, F. Wong-Staal, and A. Hampel, Virology 206, 381 (1995).
31. J. Goodchild, S. Agrawal, M. Civeira, P. Sarin, D. Sun, and P. Zamecnik, Proc. Natl. Acad. Sci. U.S.A. 86, 5507 (1988).
32. M. DeYoung and A. Hampel, in "Methods in Molecular Biology: Ribozyme Protocols" (P. Turner, ed.), Vol. 74, p. 27. Humana Press, Totowa, NJ, 1997.
33. A. Hampel, M. DeYoung, S. Galasinski, and A. Siwkowski, in "Methods in Molecular Biology: Ribozyme Protocols" (P. Turner, ed.), Vol. 74, p. 171. Humana Press, Totowa, NJ, 1997.
34. C. Cheong, G. Varani, and I. Tinoco, Nature (London) 346, 680 (1990).
35. M. Altschuler, R. Tritz, and A. Hampel, Gene 122, 85 (1992).
36. J. Ojwang, A. Hampel, D. Looney, F. Wong-Staal, and J. Rappaport, Proc. Natl. Acad. Sci. U.S.A. 89, 10802 (1992).
37. M. Yu, J. Ojwang, O. Yamada, A. Hampel, J. Rappaport, D. Looney, and F. Wong-Staal, Proc. Natl. Acad. Sci. U.S.A. 90, 6340 (1993).
38. H. Thomann, C. Schmutzler, U. Hudepohl, M. Blow, and H. Gross, J. Mol. Biol. 209, 505 (1989).
39. O. Yamada, M. Yu, J. Yee, G. Kraus, D. Looney, and F. Wong-Staal, Gene Ther. 1, 38 (1994).
40. D. Ho, A. Neumann, A. Perelson, W. Chen, J. Leonard, and M. Markowitz, Nature (London) 373, 123 (1995).
41. F. Wong-Staal, HIV Adv. Res. Ther. 4, 3 (1994).
42. B. Kahnt, R. Frank, H. Blocker, and H. J. Gross, DNA 8, 51 (1989).
43. A. Siwkowski, M. DeYoung, P. Anderson, and A. Hampel, in "Methods in Molecular Biology: Ribozyme Protocols" (P. Turner, ed.), Vol. 74, p. 357. Humana Press, Totowa, NJ, 1997.
44. Y. Lian, M.S. Thesis, Northern Illinois University, 1996.
Serum- and Polypeptide Growth Factor-Inducible Gene Expression in Mouse Fibroblasts¹

JEFFREY A. WINKLES

Department of Molecular Biology²
Holland Laboratory
American Red Cross
Rockville, Maryland 20855
and
Department of Biochemistry and Molecular Biology and the Institute for Biomedical Sciences
George Washington University Medical Center
Washington, DC 20037

I. Mitogenic Stimulation of Quiescent Fibroblasts: The Genomic Response
II. Identification of Serum- and Polypeptide Growth Factor-Inducible Genes: Strategies and Results
    A. ...
    B. Platelet-Derived Growth Factor
    C. Epidermal Growth Factor
    D. Insulin-like Growth Factor-1
    E. Fibroblast Growth Factor-1
III. Serum- and Polypeptide Growth Factor-Inducible Gene Products and the Control of Cellular Proliferation
    A. Immediate-Early Response Transcription Factors
    B. Miscellaneous Proteins
IV. Conclusions
References
¹ Abbreviations: EGF, epidermal growth factor; Egr, early growth response; ES, embryonic stem; FGF, fibroblast growth factor; FIC, fibroblast-inducible cytokine; FR, FGF-regulated; IGF, insulin-like growth factor; IL, interleukin; MCP, monocyte chemoattractant protein; MGSA, melanoma growth-stimulatory activity; MKP, MAP kinase phosphatase; PDGF, platelet-derived growth factor; RT-PCR, reverse transcription-polymerase chain reaction; SRF, serum response factor.
² Address for correspondence.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 58
Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
Complex cellular processes such as proliferation, differentiation, and apoptosis are regulated in part by extracellular signaling molecules: for example, polypeptide growth factors, cytokines, and peptide hormones. Many polypeptide growth factors exert their mitogenic effects by binding to specific cell surface receptor protein tyrosine kinases. This interaction triggers numerous biochemical responses, including changes in phospholipid metabolism, the activation of a protein phosphorylation cascade, and the enhanced expression of specific immediate-early, delayed-early, or late response genes. In this review, I summarize the major findings obtained from studies investigating the effects of serum or individual polypeptide growth factors on gene expression in murine fibroblasts. Several experimental approaches, including differential hybridization screening of cDNA libraries and differential display, have been employed to identify mRNA species that are expressed at elevated levels in serum- or polypeptide growth factor-stimulated cells. These studies have demonstrated that serum- and growth factor-inducible genes encode a diverse family of proteins, including DNA-binding transcription factors, cytoskeletal and extracellular matrix proteins, metabolic enzymes, secreted chemokines, and serine-threonine kinases. Some of these gene products act as effectors of specific cell cycle functions (e.g., enzymes involved in nucleotide and DNA synthesis), others are required to successfully convert a metabolically inactive cell to a metabolically active cell that will eventually increase in size and then divide (e.g., glucose-metabolizing enzymes), and some actually function as positive or negative regulators of cell cycle progression. In conclusion, research conducted during the past 15 years on serum- and growth factor-regulated gene expression in murine fibroblasts has provided significant insight into mitogenic signal transduction and cell growth control. © 1998 Academic Press
Complex cellular processes such as proliferation, differentiation, and apoptosis are regulated in part by extracellular signaling molecules: for example, polypeptide growth factors, cytokines, and peptide hormones. Many polypeptide growth factors, including members of the fibroblast growth factor (FGF) family and platelet-derived growth factor (PDGF), the major growth promoter present in serum, exert their mitogenic effects by binding to specific cell surface receptor protein tyrosine kinases (reviewed in 1, 2). This interaction triggers numerous biochemical responses, including changes in phospholipid metabolism, the activation of a protein phosphorylation cascade, and the enhanced expression of specific gene products (Fig. 1, reviewed in 1-5). Numerous laboratories have initiated research programs focused on the identification and characterization of proteins that are encoded by serum- and/or polypeptide growth factor-inducible genes. Results reported to date indicate that the majority of these proteins perform the basic biochemical functions required for cell growth and division. However, there is evidence that several actually control cell cycle progression, and these proteins are of particular interest since they are likely to play an important role in cellular transformation and tumorigenesis.
GENE EXPRESSION IN MOUSE FIBROBLASTS
[Figure 1: flow diagram. Growth Factor Binding → Cell Surface Receptor Activation → Signal Transduction to the Nucleus → Changes in Gene Expression → Growth in Size of the Cell → DNA Synthesis and Chromosomal Replication → Mitosis and Cytokinesis]

FIG. 1. Polypeptide growth factors induce a specific genetic program that is required for cellular proliferation. The addition of serum or individual purified growth factors to quiescent fibroblasts promotes the transcriptional activation of numerous genes encoding proteins with diverse functions. Some of these proteins actually control cell cycle progression while others are required for specific cellular functions (e.g., energy production, biogenesis of organelles, nucleotide and DNA synthesis).
In this review, I summarize the major findings obtained from studies investigating the effects of serum or individual polypeptide growth factors on gene expression. Also, those reports addressing whether the specific proteins encoded by growth factor-responsive genes have a critical role in the control of cellular proliferation are described. The review is limited to studies using mouse fibroblast cell lines, which have proven to be especially useful in this area of research, and only considers the genomic response to mitogenic stimulation by serum or purified polypeptide growth factors. Reports describing the identification and analysis of genes induced by other mitogenic agents, such as the tumor promoter tetradecanoyl phorbol acetate, are not discussed. Several excellent reviews emphasizing different aspects of growth factor-inducible gene expression have been published and are recommended for additional information on this topic (3, 4, 6-10).
I. Mitogenic Stimulation of Quiescent Fibroblasts: The Genomic Response

Immortalized murine fibroblast cell lines (e.g., NIH 3T3, Swiss 3T3, Balb/c 3T3) are aneuploid, contact-inhibited cells that have been extensively used as model systems to study mitogen-regulated cell cycle progression (reviewed in 4, 11). They are generally grown in cell culture medium containing 10-20% calf serum, which provides the required growth-promoting agents, including the polypeptide growth factors PDGF, epidermal growth factor (EGF), and insulin-like growth factor (IGF)-1. When the serum concentration in the culture medium is reduced (usually to 0.1-0.5%), the cells will enter a nonproliferative, quiescent state termed the G0 phase of the cell cycle, which is characterized by a relatively low level of metabolic activity. The length of time required for any one individual cell to enter G0 depends on its position within the cell cycle at the time of mitogen withdrawal (reviewed in 4, 7, 12). Proliferating fibroblast populations are normally serum starved for 24-72 h to ensure that the majority of the cells enter the G0 phase. Quiescent cells can then be stimulated to synchronously reenter the cell cycle by the addition of serum or purified growth factors. Mitogen-stimulated fibroblasts will proceed into the G1 phase and then the DNA synthesis (S) phase, which usually occurs 10-12 h after mitogen addition (e.g., see 13, 14).

A number of early studies examining the biochemical events associated with serum- or growth factor-stimulated fibroblast proliferation established that (i) de novo RNA and protein syntheses were required for DNA synthesis and (ii) mitogen treatment could induce the expression of specific mRNAs and proteins that were not readily detectable in quiescent, nonproliferative cells (reviewed in 8, 15). In consideration of these initial findings, several investigators initiated research programs to identify and characterize serum- or growth factor-regulated genes. The first reports describing the molecular cloning of cDNA sequences representing mitogen-inducible genes were published in July of 1983 by Linzer and Nathans (16) and Cochran et al. (17).
These and other more recent studies, discussed in detail in Section II, have indicated that serum or polypeptide growth factor treatment of quiescent fibroblasts can induce the expression of more than 100 distinct genes. Individual growth factor-regulated genes are generally classified into one of three groups; however, it should be noted that some genes have a complex expression pattern and are therefore difficult to classify in this manner (e.g., Nur77 (18)). Immediate-early response genes, also referred to as primary response genes, are the first set of genes expressed in growth factor-stimulated cells. They are transcriptionally activated in the absence of de novo protein synthesis; therefore, quiescent cells must contain the regulatory factors required for their activation and, furthermore, these factors must be rapidly converted from an “inactive” to an “active” state in response to growth factor-mediated intracellular signals. Immediate-early response genes have been subclassified into “slow” or “fast” categories, depending on their kinetics of induction as well as their mechanism of transcriptional activation (19). Many of the immediate-early genes that encode transcription factors or cytokines contain a 7-nucleotide genomic element within their 3’-untranslated region that has been implicated in transcriptional regulation (20). Transcripts
encoded by immediate-early genes are transiently expressed, with peak levels generally detected within 30 min to 2 h after growth factor addition (Fig. 2). The majority of immediate-early mRNAs have relatively short half-lives (21, 22; Fig. 3); consistent with these findings, an AU-rich sequence motif implicated in rapid mRNA degradation (23-26) is found in the 3’-untranslated region of many of these transcripts (23, 24).

Another characteristic of the immediate-early group of growth factor-regulated genes is that they are frequently “superinduced” when quiescent cells are simultaneously treated with both a mitogenic agent and an inhibitor of protein synthesis (e.g., cycloheximide). In some cases, immediate-early mRNA expression is actually detectable when cells are treated with the inhibitor alone (Fig. 2). Various studies have indicated that several distinct mechanisms, perhaps working in concert, are likely to be responsible for these protein synthesis inhibitor effects. First, the inhibitors may prevent the synthesis of transcriptional repressor proteins, thus leading to prolonged gene transcription (21, 22, 27). Second, the inhibitors may prevent the synthesis of labile mRNA-degrading enzymes, thus resulting in mRNA stabilization (21, 22; Fig. 3). Third, immediate-early mRNA decay may be coupled to the translation process itself; in this case, protein synthesis inhibitors would also increase mRNA half-life. Finally, there is evidence that at certain concentrations some protein synthesis inhibitors can actually act as activators of intracellular signaling pathways (28). This latter finding emphasizes the general concept that drugs may have several “nonspecific” cellular effects, which can vary with dose and/or treatment length. Accordingly, if possible, results obtained using metabolic inhibitors should be confirmed using other experimental approaches.

[Figure 2: Northern blot panels. Treatments: FGF-1; FGF-1 + CHX; CHX. Time points: 0, 0.5, 1, 2, and 4 h. Probes: Egr-1, c-fos, c-jun, c-myc, TSP-1, GAPDH.]

FIG. 2. FGF-1 induction of immediate-early response gene expression in NIH 3T3 fibroblasts. Serum-starved cells were either left untreated or treated with FGF-1, FGF-1 and cycloheximide (CHX), or cycloheximide alone for the indicated time periods. RNA was isolated and equivalent amounts of each sample were analyzed by Northern blot hybridization as described (134) using the cDNA probes indicated on the left (Egr, early growth response; TSP, thrombospondin; GAPDH, glyceraldehyde-3-phosphate dehydrogenase). GAPDH mRNA levels were assayed to confirm that equivalent amounts of RNA were present in each gel lane. Only the region of each autoradiogram that contained a mRNA hybridization signal is shown.

[Figure 3: Northern blot panels. Treatments: ACT.D; ACT.D + CHX. Time points: 2, 4, 8, and 12 h. Probe shown: Fnk.]

FIG. 3. Fnk mRNA stabilization by cycloheximide. Serum-starved cells were either left untreated or treated with FGF-1 for 2 h and then treated with actinomycin D (ACT.D) alone or actinomycin D and cycloheximide (CHX) for the indicated time periods. RNA was isolated and equivalent amounts of each sample were analyzed by Northern blot hybridization (134) using the cDNA probes indicated on the left (Fnk, FGF-inducible kinase; GAPDH, glyceraldehyde-3-phosphate dehydrogenase). GAPDH mRNA levels were assayed to confirm that equivalent amounts of RNA were present in each gel lane. Only the region of each autoradiogram that contained a mRNA hybridization signal is shown.

The next two groups of genes expressed in growth factor-stimulated cells are referred to as delayed-early response genes (also termed secondary response genes) and late response genes. In general, delayed-early mRNA expression is first detected in mid to late G1 while late response mRNAs are expressed several hours later (e.g., during S phase) (Fig. 4). The induction of delayed-early and late response genes is dependent on de novo protein synthesis, consistent with studies indicating that several of these genes are regulated by transcription factors encoded by immediate-early response genes (reviewed in 6; also see 18, 29-33).

[Figure 4: Northern blot panels. Treatment: FGF-1. Time points: 0, 4, 8, 12, and 24 h. Panels shown: PLF, TK, H3, rRNA.]

FIG. 4. FGF-1 induction of delayed-early response and late response gene expression in NIH 3T3 fibroblasts. Serum-starved cells were either left untreated or treated with FGF-1 for the indicated time periods. RNA was isolated and equivalent amounts of each sample were analyzed by Northern blot hybridization (134) using the cDNA probes indicated on the left (PLF, proliferin; ODC, ornithine decarboxylase; TK, thymidine kinase; H3, histone H3). Only the region of each autoradiogram that contained a mRNA hybridization signal is shown. In the bottom panel, a photograph of the 28S rRNA band is shown to illustrate that equivalent amounts of RNA were present in each gel lane.

Three general properties of serum- and growth factor-inducible genes should be noted. First, many of these genes are expressed when fibroblasts are treated with a variety of distinct polypeptide growth factors, cytokines, or peptide hormones. This result indicates that different ligands can activate similar signal transduction pathways and consequently a common genetic program. Second, the majority of these genes are not cell cycle regulated in normally proliferating cells (e.g., see 34, 35). This implies that G0 → S progression and G1 → S progression are somewhat distinct cellular processes and, furthermore, that several specific gene products may be required during the G0 → G1 transition. Finally, the expression of many serum- and growth factor-inducible genes, in particular immediate-early response gene family members, frequently occurs in various cell types and can be associated with such diverse biological responses as proliferation, differentiation, hypertrophy, and excitation (e.g., see 36-38). This indicates that many of the gene products encoded by this family are involved in a common set of shared cellular functions rather than acting as effectors of particular biological responses.
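The three-group classification described in this section (immediate-early, delayed-early, and late response genes) can be restated as a simple decision rule. The following is a minimal sketch, not a method used in any of the studies reviewed here. The names `InductionProfile` and `classify_response_gene` are hypothetical, and the 10-h cutoff separating delayed-early from late response genes is an assumption taken from the statement that S phase usually begins 10-12 h after mitogen addition. Genes with complex expression patterns (e.g., Nur77) would not be captured by such a rule.

```python
from dataclasses import dataclass


@dataclass
class InductionProfile:
    """Hypothetical summary of one gene's response to mitogen addition."""
    gene: str
    induced_with_chx: bool   # still induced when protein synthesis is blocked
    first_detected_h: float  # hours after mitogen addition


def classify_response_gene(profile: InductionProfile,
                           s_phase_onset_h: float = 10.0) -> str:
    """Assign a kinetic class; the S-phase cutoff is an assumed parameter."""
    # Immediate-early genes do not require de novo protein synthesis.
    if profile.induced_with_chx:
        return "immediate-early"
    # Protein synthesis-dependent genes first seen before S phase (G1).
    if profile.first_detected_h < s_phase_onset_h:
        return "delayed-early"
    # Protein synthesis-dependent genes first seen at or after S-phase onset.
    return "late"


# Illustrative values only, not measurements from the studies reviewed here.
examples = [
    InductionProfile("gene_A", induced_with_chx=True, first_detected_h=0.5),
    InductionProfile("gene_B", induced_with_chx=False, first_detected_h=6.0),
    InductionProfile("gene_C", induced_with_chx=False, first_detected_h=12.0),
]
classes = {p.gene: classify_response_gene(p) for p in examples}
print(classes)
```

The cutoff is exposed as a parameter because the timing of S-phase entry differs between cell lines and culture conditions.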
II. Identification of Serum- and Polypeptide Growth Factor-Inducible Genes: Strategies and Results

Two basic experimental approaches have been employed to identify specific mRNA species that are expressed at elevated levels in serum- or polypeptide growth factor-stimulated fibroblasts. In one approach, RNA is isolated from quiescent or stimulated cells and the expression levels of a particular mRNA are assayed by Northern blot hybridization using a previously identified cDNA probe. In the second approach, one of several alternative strategies is used to isolate cDNAs derived from mRNAs present in stimulated cells but not quiescent cells. The cDNA clones are then characterized in detail. These latter studies, listed in Table I, are reviewed in this section. The studies are categorized with respect to the mitogenic stimulus that was used and then described in order of their publication date.
A. Serum

The majority of the studies described to date have used serum as the mitogenic stimulus for inducing cellular proliferation. Serum contains many nutrients, peptide hormones (e.g., insulin) and polypeptide growth factors (e.g., PDGF); thus, although it has proven to be a useful reagent for the identification of “growth-regulated” genes, it is a relatively nonspecific mitogen that can activate multiple intracellular signal transduction cascades.

TABLE I
IDENTIFICATION OF cDNAs CORRESPONDING TO SERUM- OR GROWTH FACTOR-INDUCIBLE GENES IN MOUSE FIBROBLAST CELL LINES^a

cDNA library (or PCR template) preparation:
Mitogen, Treatment period (h), Cycloheximide addition?^b

Mitogen   Treatment    Cycloheximide   Experimental   Year(s) of     Reference(s)
          period (h)   addition?^b     approach^c     publication
Serum     12           -               DC             1983           16
Serum     3            +               DC             1985, 1987     21, 50
Serum     3            +               DC             1987           63
Serum     8            -               DC             1988, 1990     88, 89
Serum     4            +               DC             1988           22
Serum     10           -               DC             1988           102
Serum     12           -               DC, SC         1991           110
Serum     10           -               SS             1992           35
Serum     8                            DS             1994           33
PDGF      4                            DC             1983           17
EGF       4                            DC             1988           125
IGF-1     3-4                          DC             1987           127
FGF-1     2 or 12                      DD             1993           129

^a Two studies using early-passage cultures of mouse embryo fibroblasts that had been treated with growth factor-supplemented serum are not included in this table (213, 214). It should be noted that studies using hamster (215, 216), rat (217), or human (34) fibroblasts have also been reported.
^b Cycloheximide is used to enrich libraries and probes for immediate-early response cDNAs.
^c Abbreviations: DC, differential screening of a cDNA library using cDNA probes; SC, differential screening of a subtracted cDNA library using cDNA probes; SS, differential screening of a subtracted cDNA library using subtracted cDNA probes; DS, differential screening of a cDNA library using subtracted cDNA probes; DD, differential display.

1. LINZER AND NATHANS (1983)

Linzer and Nathans co-authored the first report describing the successful isolation of cDNA clones representing genes that were up-regulated after serum stimulation of quiescent Balb/c 3T3 cells (16). The experimental approach that was used, now commonly referred to as differential hybridization screening, has been subsequently employed by numerous investigators and includes four basic steps: (i) construction of a cDNA library using RNA isolated from mitogen-stimulated cells (in their case, serum treatment for 12 h); (ii) preparation of two 32P-labeled cDNA probes, one synthesized from the quiescent cell mRNA population and one from the mitogen-stimulated
cell mRNA population; (iii) hybridization of each 32P-labeled cDNA probe to one of the duplicate sets of immobilized cDNA library clones; and (iv) selection of clones that hybridized preferentially to the probe made from mitogen-stimulated cells. Two of the genes identified by these authors, 18A2 and 28H6, have been studied in detail, and their properties are summarized here. The 18A2 gene is a late response gene expressed primarily during the S phase of the Balb/c 3T3 cell cycle (at ~18 h after serum stimulation) (16, 39). It is predicted to encode a 101-amino-acid protein with sequence similarity to several known Ca2+-binding proteins (39). One of these structurally related proteins, calcyclin, is also encoded by a serum-inducible gene (40, 41). The 28H6 gene is a serum- and polypeptide growth factor-inducible delayed-early response gene encoding a 1.0-kb mRNA that is expressed at maximal levels ~12 h after mitogen addition (16, 42). It encodes a 224-amino-acid protein with a significant degree of sequence identity to members of the prolactin-growth hormone family (43). This protein, termed proliferin by Linzer and Nathans (43), was subsequently shown to be closely related to mitogen-regulated protein, a growth factor-inducible, secreted glycoprotein characterized by other investigators (44, 45). In the mouse, proliferin is expressed in a tissue-specific manner, primarily within giant trophoblasts of the placenta (46, 47). The biological function of proliferin is unknown; however, forced expression in a myoblast cell line can inhibit muscle-specific gene expression and differentiation, apparently via an intracellular mechanism of action (48). In addition, proliferin stimulates endothelial cell migration in vitro and angiogenesis in vivo (49).
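Step (iv) of the differential hybridization screening approach described above, the selection of clones that hybridize preferentially to the stimulated-cell probe, can be expressed as a ratio test on the signals from the duplicate filters. The sketch below only illustrates that logic; in the original studies the comparison was made by inspecting autoradiograms. The function name `select_induced_clones`, the threshold `min_ratio`, the `background` floor, and all signal values are hypothetical.

```python
def select_induced_clones(signals, min_ratio=3.0, background=1.0):
    """Return clone IDs hybridizing preferentially to the stimulated-cell probe.

    signals maps clone ID -> (quiescent_signal, stimulated_signal), in
    arbitrary densitometry units. min_ratio and background are assumed
    values chosen for illustration.
    """
    selected = []
    for clone_id, (quiescent, stimulated) in signals.items():
        # Floor both signals at background so weak spots do not give huge ratios.
        ratio = max(stimulated, background) / max(quiescent, background)
        if ratio >= min_ratio:
            selected.append(clone_id)
    return sorted(selected)


# Hypothetical signals from duplicate filters of the same library clones.
duplicate_filter_signals = {
    "clone_01": (2.0, 40.0),   # strongly induced
    "clone_02": (35.0, 38.0),  # expressed equally in both populations
    "clone_03": (0.2, 6.0),    # induced, undetectable in quiescent cells
    "clone_04": (0.1, 0.3),    # below background on both filters
}
induced = select_induced_clones(duplicate_filter_signals)
print(induced)
```

Flooring the signals at a background value reflects one limitation of this approach noted later in the section: clones derived from low-abundance mRNAs give weak signals with both probes and are difficult to score.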
2. LAU AND NATHANS (1985, 1987)

In two separate reports published in the mid-1980s, Lau and Nathans described the isolation and initial characterization of cDNA clones representing 10 distinct serum- and polypeptide growth factor-inducible genes in mouse fibroblasts (21, 50). They used a differential hybridization screening approach that was designed to identify members of the immediate-early response gene family; specifically, the cDNA library was constructed using RNA isolated from cells subjected to a brief serum treatment in the presence of cycloheximide. Accordingly, as expected, all of the genes are transcriptionally activated within minutes of serum (or PDGF) stimulation (21), and the respective mRNA species are transiently expressed, superinduced in the presence of cycloheximide, and relatively unstable (21, 50). Full-length cDNA clones representing these 10 genes have been sequenced and are predicted to encode a diverse set of polypeptides, including a proline-rich cytoplasmic protein (pip92 (51)), a 153-amino-acid protein containing a potential transmembrane domain (gly96 (52)), and tissue factor, a protein involved in blood coagulation (53). Tissue factor was also identified in a differential screening experiment to isolate EGF-inducible genes in mouse fibroblasts (see Section II,C).

Five of the 10 immediate-early genes characterized by this group appear to encode transcriptional regulatory proteins that function during cell cycle progression: Nur77, Zif268, HLH462, jun-B, and Nup475. The Nur77 polypeptide is an orphan nuclear receptor that is structurally related to proteins comprising the steroid-thyroid-retinoid superfamily of ligand-binding transcription factors (54). This receptor has been described by other groups as well and referred to as N10 (55) or NGFI-B (56). Nur77 can bind a specific DNA sequence motif (57) and function as a potent transcriptional activator (57-60). The Zif268 gene encodes a 533-amino-acid DNA-binding transcription factor with three tandemly repeated copies of a zinc finger domain (61, 62). The same gene has also been cloned independently by several other groups and named early growth response (Egr)-1 (63), Krox-24 (64), and NGFI-A (65). This protein is one of the best characterized immediate-early transcription factors (reviewed in 38; also see Section II,A,3). HLH462 encodes a protein of 119 amino acids with a significant degree of sequence identity to members of the helix-loop-helix family of transcription factors (66). The HLH462 polypeptide does not contain a DNA-binding domain, but can dimerize with other helix-loop-helix proteins and inhibit their DNA-binding activity in vitro (66). The jun-B (clone 465) gene is one member of the Jun family of proto-oncogenes encoding related DNA-binding transcriptional regulatory proteins (67). Finally, the Nup475 gene encodes a 319-amino-acid nuclear protein that contains two zinc finger domains; as predicted, recombinant Nup475 can bind zinc ions (68).
The immediate-early genes Cyr61 and MAP kinase phosphatase (MKP)-1 have been characterized in some detail and are also of particular interest. Cyr61 encodes a cysteine-rich, heparin-binding protein that, although secreted from cells, is found associated mainly with the cell surface and the extracellular matrix (69, 70). The chicken homolog of this gene, CEF-10, was identified as a v-src-inducible gene (71). Two other structurally related proteins, Fisp-12/connective tissue growth factor (72, 73) and Nov (74), have been described; interestingly, Fisp-12/connective tissue growth factor is also encoded by a mitogen-inducible immediate-early response gene (72). The Cyr61 gene is somewhat unique among the immediate-early response gene family in that it remains transcriptionally active through the mid-G1 phase of the cell cycle (69). Recent studies using purified recombinant Cyr61 have indicated that it can promote cell proliferation, migration, and adhesion in vitro; thus, it has been proposed to function primarily as an extracellular matrix signaling factor (75).
The immediate-early gene originally referred to as 3CH134 encodes MKP-1, a dual-specificity mitogen-activated protein kinase phosphatase (76, 77). This gene was also independently cloned using a similar differential hybridization screening strategy by Almendral et al. (see Section II,A,5) and named externally regulated phosphatase (78). The human MKP-1 homolog, named CL100, is induced by oxidative stress and heat shock in human skin fibroblasts (79, 80). Mitogen-activated protein kinases are a family of serine-threonine kinases that play a key role in converting ligand-cell surface receptor interactions into specific cellular responses (reviewed in 81, 82). They are activated via phosphorylation on specific tyrosine and threonine residues, and certain phosphatases, including MKP-1, can inactivate them by dephosphorylating both sites (reviewed in 82, 83). Thus, MKP-1 may represent an immediate-early gene product that functions to attenuate growth factor-stimulated mitogenesis (see Section III).
3. SUKHATME et al. (1987)

A cDNA library constructed using RNA isolated from Balb/c 3T3 cells treated with serum and cycloheximide for 3 h was differentially screened with 32P-labeled cDNA probes by Sukhatme et al. (63). Seven distinct cDNA clones were identified, one of which represented the immediate-early proto-oncogene c-fos. Another gene identified by these investigators, designated Egr-1, encoded a protein with three zinc finger domains of the Cys2-His2 subclass (84). As mentioned earlier, the Egr-1 gene was also identified by Lau and Nathans as well as several other groups (see Section II,A,2). It is a member of a multigene family that is highly conserved during vertebrate evolution. At least two other family members, Egr-2 (Krox-20) (85, 86) and Egr-3 (87), are also immediate-early response genes in mouse fibroblasts. Egr-1 gene expression is induced in response to various extracellular signals (e.g., mitogenic, differentiative, hypertrophic, excitatory) and in a wide array of cell types (reviewed in 38). The ~80-kDa Egr-1 nuclear phosphoprotein can act as a positive or negative regulator of gene transcription, depending on the cell type (38).
4. MASIBAY et al. (1988); BOEGGEMAN et al. (1990)

A cDNA library constructed using RNA isolated from cells stimulated with serum for 8 h was screened by differential hybridization first by Masibay et al. (88) and then by Boeggeman et al. (89). Six cDNA clones representing serum-inducible genes in Swiss 3T3 fibroblasts were identified. Partial sequence analysis of the cDNAs indicated that one of the genes (@ME,) encoded γ-actin (88) while another (cl-15) encoded β-actin (89). Northern blot hybridization experiments indicated that these two cytoskeletal actin genes were induced in the presence of protein synthesis inhibitors and thus
were members of the immediate-early class of serum-regulated genes (88, 89). The remaining four cDNAs isolated by these investigators appear to represent delayed-early response genes. At the time this work was published the DNA sequences of these clones were not related to any previously identified genes.

5. ALMENDRAL et al. (1988)
These investigators reported the isolation of ~80 cDNA clones that represented distinct serum-inducible immediate-early response genes in NIH 3T3 fibroblasts (22). The experimental approach employed was differential hybridization screening of a cDNA library prepared using RNA isolated from cells treated for 4 h with serum and cycloheximide. The immediate-early mRNAs could be divided into two groups according to whether their peak expression levels occurred very early (~30 min) or somewhat later (~2 h) after serum stimulation. DNA sequence analysis indicated that this set of genes encoded transcription factors, secreted proteins, and cytoskeletal-extracellular matrix components (reviewed in 90; also see Table II).

Three of the most interesting immediate-early genes identified by this group, originally named N51, N65, and P16, encode members of a superfamily of secreted chemotactic and inflammatory proteins that are generally referred to as chemokines or intercrines (reviewed in 91, 92). All of the proteins in this superfamily are relatively basic, are small (between 70 and 100 amino acids), and contain four conserved cysteine residues that form intrachain disulfide bonds. These chemokines are classified into one of two subfamilies on the basis of whether the two NH2-terminal cysteine residues are adjacent to one another (the C-C family) or separated by one amino acid residue (the C-X-C family). The immediate-early gene N51 encodes a 96-amino-acid C-X-C type chemokine (93). N51 was also independently cloned as a PDGF-inducible gene in Balb/c 3T3 cells and named KC (see Section II,B,1). The human and rat N51/KC homologs appear to be the Groα-melanoma growth-stimulatory activity (MGSA) (94, 95) and CINC (96) genes, respectively. Recombinant N51 is a potent chemotactic factor for human neutrophils, and this response is mediated via interleukin (IL)-8 cell surface receptors (97). N51 is also a neutrophil-specific chemoattractant in vivo (98).
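The subfamily rule stated above, in which the spacing of the two NH2-terminal cysteine residues distinguishes C-C from C-X-C chemokines, can be written as a short check on a protein sequence. This is a minimal sketch under simplifying assumptions: the input is taken to be the mature protein in one-letter code, the first two cysteines in the string are assumed to be the two conserved NH2-terminal cysteines, and the function name `classify_chemokine` and the test sequences are hypothetical rather than real chemokine sequences.

```python
def classify_chemokine(mature_sequence: str) -> str:
    """Classify by spacing of the first two cysteines in a one-letter sequence."""
    seq = mature_sequence.upper()
    first = seq.find("C")
    if first == -1:
        return "unclassified"
    second = seq.find("C", first + 1)
    if second == -1:
        return "unclassified"
    gap = second - first - 1  # number of residues between the two cysteines
    if gap == 0:
        return "C-C"
    if gap == 1:
        return "C-X-C"
    return "unclassified"


# Invented toy sequences for illustration; not real chemokine sequences.
toy_sequences = {
    "toy_cc": "AAGSTCCKLM",
    "toy_cxc": "AAGSTCRCKLM",
    "toy_none": "MKLVAAGST",
}
subfamilies = {name: classify_chemokine(s) for name, s in toy_sequences.items()}
print(subfamilies)
```

A real analysis would first remove the signal peptide and confirm that all four conserved cysteines are present, since a cysteine in the leader sequence would otherwise be miscounted.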
The N65 and P16 genes are predicted to encode two distinct members of the C-C subfamily of proinflammatory chemokines. The N65 gene was also identified in a differential hybridization experiment designed to isolate PDGF-inducible genes and named JE (see Section II,B,1). N65/JE, now generally referred to by the name given to the human homolog, monocyte chemoattractant protein (MCP)-1 (99, 100), has been extensively characterized and is described later in Section II,B,1. The P16 gene encodes a 97-amino-acid C-C type protein named fibroblast-inducible cytokine (FIC), which has ~57% amino acid sequence identity to human MCP-1 (101). Purified recombinant FIC can bind to specific high-affinity receptors that are present on monocytes and endothelial cells but not lymphocytes, neutrophils, or fibroblasts (101). FIC binding to human monocytes promotes an increase in the intracellular calcium ion concentration, apparently via a pathway involving a pertussis toxin-sensitive G protein (101). It has been suggested that serum-inducible secreted proteins like MCP-1 and FIC may not function in cell growth control per se but instead may play a role in orchestrating complex biological responses such as tissue repair (101).

TABLE II
SERUM-INDUCIBLE IMMEDIATE-EARLY RESPONSE GENES IDENTIFIED IN THE DIFFERENTIAL SCREENING EXPERIMENT REPORTED BY ALMENDRAL ET AL. (22)

cDNA clone   Predicted protein                        Reference
A12          Fisp-12 (CTGF)                           72
A15          Plasminogen activator inhibitor-1        90
AC16         Krox-20 (Egr-2)                          86
AC113        Fos-B                                    218
AF21         c-Rel                                    90
AH119        c-Jun                                    219
B2           β-Actin                                  220
B31          Jun-B                                    90
C15          Tissue factor                            90
M57          Krox-24 (Egr-1, Zif268, NGFI-A)          64
M97          Macrophage-colony stimulating factor     221
N10          Nur77 (NGFI-B)                           55
N51          KC (Gro-α/MGSA, CINC)                    93
N65          JE (MCP-1)                               90
P16          FIC                                      101
P38          β1-Integrin                              220
P49          c-Fos                                    90
TT1          Fibronectin                              220
TT13         Helix-destabilizing protein              90
V58          α-Tropomyosin                            220
V59          c-Myc                                    90
V101         Actin-associated protein                 222
X34          Cyclooxygenase-2                         223
X97          ERP (MKP-1, CL100)                       78

6. TOMINAGA (1988)
Differential hybridization screening has also been performed by Tominaga (102) using a cDNA library constructed from RNA isolated from Balb/c 3T3 cells stimulated with serum for 10 h. Seven cDNA clones were isolated and two of these have been characterized. DNA sequence analysis indicated that one of the cDNAs, designated ST1, encoded the β1-integrin subunit (102). β1-Integrin was also identified as an immediate-early gene in the screening experiments of Almendral et al. (Section II,A,5). The second cDNA clone, ST2, was derived from an ~2.7-kb mRNA predicted to encode an ~37-kDa, secreted protein with ~25% amino acid sequence identity to the IL-1 receptor extracellular domain (103, 104). This gene has also been identified by a group studying c-Ha-ras-inducible mRNAs, who named it T1 (105, 106). It should be noted that, although the mouse ST2/T1 gene was originally reported to be an immediate-early response gene (107), Lanahan et al. (35) and Tavtigian et al. (33) identified this same gene in differential hybridization screening experiments for delayed-early response genes (see Sections II,A,8 and II,A,9). In addition, the rat ST2/T1 homolog (named Fit-1 (see later)) has been classified as a serum-inducible delayed-early response gene in Rat-1A cells. A murine cDNA representing a serum-inducible ~5-kb ST2-related transcript was also cloned and sequenced by Yanagisawa et al. (108). The results indicated that this cDNA, called ST2L, also encoded a protein related to the type 1 IL-1 receptor; in this case, the predicted protein contained extracellular, transmembrane, and cytoplasmic domains (108).
Characterization of the rat ST2/ST2L homolog, named Fit-1, has indicated that the ST2 and ST2L mRNAs are likely to be (i) transcribed from a single locus using distinct promoters and (ii) differentially processed at their 3’ end (109). Also, at least in the case of Fit-1, alternative promoter usage determines the mRNA splice variant ratio and thus the relative expression levels of the secreted or membrane-bound forms of the IL-1 receptor-related protein (109). It is currently unknown whether either of these protein isoforms can bind the cytokine IL-1; if so, they could function as physiological regulators of IL-1 activity.

7. NIKAIDO et al. (1991)

The identification of eight distinct cDNA clones representing transcripts that accumulate in the late G1 phase of the cell cycle was reported by Nikaido et al. (110). Six of these clones were identified by differential screening of a cDNA library prepared using RNA isolated from benzo[a]pyrene-transformed Balb/c 3T3 (clone A31) fibroblasts stimulated with serum for 12 h. The two remaining cDNAs were identified by differential screening of a subtracted cDNA library enriched for mRNAs expressed by untransformed clone A31 cells synchronized in the late G1-S phase. The use of subtracted cDNA libraries or screening probes is a strategy used by several groups (see later) to increase sensitivity for lower abundance mRNAs. Genes identified in these experiments included those encoding β-actin, α-tubulin, nonneuronal α-enolase, lactate dehydrogenase, and ribosomal protein L32. Northern blot hybridization experiments examining the effect of cycloheximide on serum-stimulated mRNA induction levels indicated that the latter two genes were delayed-early response genes.

8. LANAHAN et al. (1992)
Another study that was specifically designed to identify serum-induced delayed-early response genes in mouse fibroblasts was reported by Lanahan et al. (35). Their experimental strategy included three basic steps. First, a directional cDNA library was constructed using RNA isolated from Balb/c 3T3 cells stimulated with serum for 10 h. To enrich for delayed-early cDNAs, sequences derived from mRNAs that were also expressed in quiescent cells and/or cells co-stimulated with serum and protein synthesis inhibitors were reduced by subtraction procedures. Second, this subtracted cDNA library was differentially screened with two 32P-labeled subtracted cDNA probes, one enriched for those mRNAs expressed at 10 h after serum addition and one enriched for mRNAs expressed in quiescent cells and cells treated for 3 h with serum and protein synthesis inhibitors. Finally, those cDNA clones showing greater hybridization to the first probe were picked for further analysis. Thirteen different cDNA clones, representing ~40% of the 650 initial phage isolates, were subsequently characterized. All of the genes represented by this cDNA collection were transcriptionally activated within a few hours of serum or growth factor treatment and required de novo protein synthesis for expression. Most of the delayed-early mRNAs reached a maximum expression level from 7 to 10 h after serum stimulation. Partial or complete DNA sequence analysis indicated that 10 of the 13 cDNAs were related to known gene sequences, including those encoding the nonhistone chromosomal proteins HMGI(Y) and HMGI-C, adenine phosphoribosyltransferase, cyclin D1, macrophage inhibitory factor, the nucleolar protein FUN26, and CHIP28/aquaporin-1, a water channel (35, 111). These results indicate that, like the immediate-early response gene family, the delayed-early response gene family encodes a large group of functionally diverse polypeptides.
9. TAVTIGIAN et al. (1994)
The main goal of the work described by Tavtigian et al. (33) was to isolate and characterize cDNAs representing transcripts that were either up- or down-regulated at 8 h after serum stimulation of Balb/c 3T3 fibroblasts. To identify serum-induced genes, a cDNA library constructed using RNA isolated from cells treated with serum for 8 h was differentially screened using 32P-labeled cDNA probes enriched by subtractive hybridization for mRNAs specifically expressed in either quiescent or serum-stimulated (8 h) cells. Fifteen serum-induced genes were identified using this approach. DNA sequence analysis indicated that several of these genes encoded known proteins, including the extracellular matrix protein tenascin, lactate dehydrogenase, glyceraldehyde-3-phosphate dehydrogenase, ornithine decarboxylase, liver thioltransferase, the cytokine MGSA (also known as KC/N51, Gro-α, and CINC; see Sections II,A,5 and II,B,1), and the mouse T1 protein (see Section II,A,6). Several of these serum-induced genes were also up-regulated at the transcriptional level by conditional myc expression, which can drive fibroblast cell cycle progression in the absence of serum mitogens (see Section III,A,2).
B. Platelet-Derived Growth Factor
PDGF, originally discovered as a cationic protein present in platelet α-granules that exhibited growth-promoting activity, is one of the major mitogens in serum (reviewed in 1 1). It is composed of two closely related polypeptide subunits, PDGF A-chain and PDGF B-chain, which form three dimeric isoforms: AA, AB, and BB (reviewed in 112). PDGF B-chain is the product of the c-sis proto-oncogene. PDGF dimers induce mitogenesis via binding to high-affinity cell surface tyrosine kinase receptors, which themselves also consist of combinations of two related subunits (1, 112).
1. COCHRAN et al. (1983)
There has been one report describing the identification of PDGF-inducible genes in mouse fibroblasts (17). For these experiments, a cDNA library was constructed using RNA isolated from Balb/c 3T3 cells that had been density arrested and then incubated for 4 h with heat-treated platelet lysates (sufficient amounts of purified native or recombinant PDGF were unavailable at the time these experiments were initiated). This library was then differentially screened with 32P-labeled cDNA probes synthesized from RNA isolated from quiescent or PDGF-stimulated cells. Approximately 8000 clones were screened, 55 were scored as likely representatives of PDGF-inducible mRNAs, and 46 of these could be grouped into five independent gene sequences. Two of these PDGF-responsive genes, named KC and JE, were studied in detail and shown to be immediate-early response genes regulated by several distinct growth factors (17).
DNA sequence analysis indicated that both the KC and JE genes were predicted to encode low-molecular-weight proteins with NH2-terminal signal peptide sequences (113-115). They were subsequently classified as members of a family of secreted chemotactic proteins called chemokines or intercrines that are involved in the immune and inflammatory responses (reviewed in 91, 92). As mentioned earlier, KC and JE were also identified as immediate-early response genes by Almendral et al. and named N51 and N65, respectively (see Section II,A,5). The rat KC/N51 homolog is named CINC (96) and the human homolog Gro-α (94) or MGSA (95). The human homolog of JE/N65 is referred to as MCP-1 (99, 100) or monocyte chemotactic and activating factor (116). MCP-1, a member of the C-C subfamily of chemokines (see Section II,A,5), is a potent chemotactic factor for human monocytes and T lymphocytes, but not neutrophils (117-120). MCP-1 can bind and activate a pair of seven transmembrane domain-containing high-affinity receptors that differ in their carboxyl-terminal tails (121). MCP-1 has been implicated as an important factor mediating monocyte migration to inflammatory sites; however, high levels of MCP-1 expression in organs of transgenic mice do not induce monocyte infiltration (122).
C. Epidermal Growth Factor
EGF is the prototypic member of a family of structurally related polypeptide growth factors that are synthesized as membrane-anchored precursors and then proteolytically cleaved to release soluble mitogens (reviewed in 123, 124). EGF is one of the several mitogens that are present in platelet α-granules (and thus serum preparations). EGF and its related peptides (e.g., transforming growth factor-α) bind to an ~170-kDa cell surface protein tyrosine kinase receptor, which is the product of the c-erbB proto-oncogene (reviewed in 1, 123).
1. BLATTI et al. (1988)
Several EGF-inducible genes in AKR-2B mouse embryo fibroblasts were identified by Blatti et al. (125). The experimental approach employed was differential hybridization screening of a cDNA library constructed using RNA isolated from cells stimulated for 4 h with EGF and cycloheximide. Differential screening of 40,000 clones resulted in the isolation of 28 cDNAs corresponding to EGF-inducible mRNAs. The majority of these cDNAs represented either the β-actin, γ-actin, or VL30 element-containing genes (125). DNA sequencing established that 2 of the 5 remaining cDNA clones encoded fibronectin, an extracellular matrix glycoprotein (125), and tissue factor, a transmembrane protein involved in initiating the protease cascade leading to blood coagulation (27). The cytoskeletal actin, fibronectin, and tissue factor genes are all immediate-early response genes.
D. Insulin-like Growth Factor-1
IGF-1 (somatomedin-C) and the closely related IGF-2 molecule are multifunctional polypeptide mitogens found in the plasma fraction of whole blood in association with specific high-molecular-weight binding proteins (reviewed in 126). Most of the cellular effects of the IGFs are mediated by binding to the type 1 IGF receptor, a protein tyrosine kinase consisting of two extracellular α-subunits and two transmembrane β-subunits (reviewed in 1, 126).
1. ZUMSTEIN AND STILES (1987)
A cDNA library constructed using RNA isolated from Balb/c 3T3 fibroblasts treated with IGF-1 for 3-4 h was differentially screened with 32P-labeled cDNA probes by Zumstein and Stiles (127). A total of 12 distinct cDNA clones representing IGF-1-inducible genes were identified using this strategy and subsequently classified into two categories. The majority of the cDNAs represented category II mRNAs, which were up-regulated by IGF-1 via a posttranscriptional mechanism and were also PDGF inducible. In comparison, category I mRNAs were up-regulated by IGF-1 at the transcriptional level in a protein synthesis-independent manner but were not PDGF inducible. These authors did not comment in this report as to whether any of these cDNAs had nucleotide sequence similarity to known DNA sequences.
E. Fibroblast Growth Factor-1
FGF-1, also referred to as acidic FGF, is one member of a family of structurally related proteins that stimulate cellular proliferation, migration, and differentiation (reviewed in 128). The biological effects of this ~17-kDa heparin-binding protein are mediated via binding to a family of four cell surface protein tyrosine kinase receptors (reviewed in 1, 128). In contrast to the other three polypeptide mitogens described above, FGF-1 is (i) not synthesized with an NH2-terminal signal peptide sequence and consequently is not released from cells via the classical secretory pathway and (ii) not found at significant levels in platelets or plasma.
1. HSU et al. (1993)
Several years ago, our laboratory initiated a research program to identify and characterize FGF-1-inducible genes in NIH 3T3 fibroblasts. The first report describing our experimental approach was published in late 1993 (129). This approach, based on the reverse transcription-polymerase chain reaction (RT-PCR) technique, is conceptually similar to the mRNA differential display and RNA fingerprinting methods that have been used by other investigators to isolate differentially expressed genes (reviewed in 130). Briefly, the strategy includes four basic steps: (i) RNA is isolated from either quiescent or FGF-1-stimulated cells and then converted to cDNA using random hexamer primers and reverse transcriptase; (ii) PCR assays are performed using either degenerate oligonucleotide primers designed to amplify cDNA templates encoding proteins with particular structural motifs or, alternatively, long (20-25 nucleotides) arbitrary primers; (iii) amplification products are displayed using agarose gel electrophoresis and ethidium bromide staining; and (iv) products of interest are isolated and cloned into an appropriate plasmid vector. In comparison to the differential hybridization screening approaches used by other investigators, this RT-PCR-based strategy is relatively quick and inexpensive, is technically simple, does not require the use of radioactive compounds, and can be performed using very small quantities of total RNA. In our initial series of experiments using three cDNA templates (representing quiescent cells or cells stimulated with FGF-1 for either 2 or 12 h) and ~40 different motif primer combinations, 30 cDNA fragments were isolated and 25 of these were successfully cloned. When used as probes in Northern blot hybridization experiments, 15 of the 25 cDNAs detected transcripts that were expressed at an increased level in FGF-1-stimulated fibroblasts. DNA sequence analysis revealed that 13 of the 15 cDNAs were unique. It should be noted that, although our initial goal was to use the motif primers to enrich for differentially expressed members of particular gene families, the majority of the FGF-1-inducible genes characterized to date do not encode proteins with the targeted motifs. This is because, under the PCR conditions used, many of the motif primers were able to anneal to and prime cDNA templates with a relatively low degree of sequence identity. The 13 genes identified by this differential display approach include representatives of the immediate-early, delayed-early, and late response classes. DNA sequence analysis indicated that several of the genes were novel while others were either identical or related to known gene sequences. Two of the immediate-early genes identified by this strategy encode FGF-inducible kinase, a member of the polo subfamily of serine-threonine protein kinases (131; see Fig. 3), and thrombospondin-1, an extracellular matrix protein (see Fig. 2).
Delayed-early genes identified by this approach include fibroblast growth factor-regulated (FR)-1, which encodes an NADPH-dependent aldo-keto reductase (132, 133); FR-19, which encodes a member of the transcriptional enhancer factor-1 family of DNA-binding proteins (134); and the gene encoding α-actinin, an actin cross-linking protein found along microfilaments and in focal adhesion plaques (135). The FR-1 gene is of particular interest to our laboratory since it appears to be regulated by FGF-1 but not by serum mitogens (132). One of the late response genes identified using this approach, called FR-3, encodes G/T mismatch-binding protein, a component of the mammalian DNA mismatch correction system (136).
III. Serum- and Polypeptide Growth Factor-Inducible Gene Products and the Control of Cellular Proliferation
The cDNA cloning and characterization studies described in the preceding section, as well as Northern blot hybridization studies using previously identified cDNA probes, indicate that mitogenic stimulation of quiescent mouse fibroblasts promotes the expression of numerous genes encoding proteins with diverse biological functions (Table III). Why are these proteins expressed at elevated levels in mitogen-stimulated cells? It is apparent that some of these gene products act as effectors of specific cell cycle functions (e.g., enzymes involved in nucleotide and DNA synthesis) while others are required to successfully convert a metabolically inactive cell to a metabolically active cell that will eventually increase in size and then divide (e.g., glucose-metabolizing enzymes). It is anticipated that a third class of proteins would also be encoded by mitogen-inducible genes, and these would actually function in the control of cellular proliferation. These proteins could act as positive regulators or, alternatively, could counteract or minimize the growth-promoting effect of the mitogenic agent. In this section, I summarize the results of various studies designed to investigate whether particular serum- or growth factor-inducible gene products play a critical role in cell cycle progression. The approaches used by the majority of investigators to assess the role of these proteins in cellular proliferation include: (i) overexpressing the protein by introducing the cDNA, in vitro transcribed mRNA, or purified protein into cells by transfection or microinjection; (ii) inhibiting the synthesis of the endogenous protein by treatment of cells with antisense oligonucleotides, transfection with an antisense RNA expression vector, or microinjection of in vitro transcribed antisense RNA; (iii) neutralizing the biological activity of the protein by transfection with expression constructs encoding dominant-negative mutants or microinjection of specific antibodies or mutant proteins; and (iv) inactivating the gene encoding the protein of interest by homologous recombination in mouse embryonic stem (ES) cells and then deriving knockout mice. This latter approach in particular has provided new insight into the functions of several serum- and growth factor-regulated gene products.
A. Immediate-Early Response Transcription Factors
Many immediate-early genes encode transcription factors that are thought to regulate the complex genetic program that eventually leads to cell division. Studies reported to date have investigated the roles of c-Fos, Fos-B, c-Jun, c-Myc, Egr-1 (Zif268, Krox-24, NGFI-A), Egr-2 (Krox-20), Nur77 (NGFI-B), and serum response factor (SRF) in cellular proliferation.
1. c-FOS, FOS-B, c-JUN
The various members of the Fos and Jun families of transcription factors associate to form Jun homodimers or Fos-Jun heterodimers. These complexes bind AP-1 sites located within the regulatory regions of target genes (reviewed in 137). It has been demonstrated that the constitutive overexpression of the c-Fos, Fos-B, or c-Jun proto-oncogenes can induce cellular transformation in vitro (138-141). In addition, inhibition of c-Fos synthesis
TABLE III
SERUM- AND/OR GROWTH FACTOR-INDUCIBLE GENES IN MOUSE FIBROBLASTS(a)

Functional classification of gene product
  Gene(s)(b)                                  Reference(s)

A. DNA-binding proteins and transcription factors
  c-Fos, Fos-B, Fra-1                         13,63,90,218,224,225
  c-Jun, Jun-B                                67,219,226
  c-Myc                                       13,90,224,225,227
  c-Rel                                       90,228,229
  Egr-1, -2, -3                               61,63-65,84-87
  FR-19                                       134
  G/T mismatch-binding protein                136
  Helix-destabilizing protein                 90
  Histone H2B, H3                             230
  HMGI-C, HMGI(Y)                             35
  Myn                                         33,231
  NGFI-A-binding protein 2                    232
  Nur77                                       54-56
  p53                                         33,171
  Rat Y-box binding protein-a                 233
  RNA Pol I transcription factor UBF          234
  Serum response factor                       235

B. Cytoskeletal and extracellular matrix proteins
  Actin (β, γ)                                88,89,110,125,220,224,225,236-238
  Actin-associated protein                    222
  Actinin (α)                                 135,239
  Fibronectin                                 125,220,240
  Integrin (β1)                               102,220,241
  Talin                                       239
  Tenascin                                    33,242
  Thrombospondin-1, -2                        240,243,244
  Tropomyosin (α)                             220
  Tubulin (α)                                 110
  Vimentin                                    245
  Vinculin                                    239,241,246

C. Enzymes involved in:
 1. Energy metabolism and transport
  Glyceraldehyde-3-phosphate dehydrogenase    33,203
  Lactate dehydrogenase                       33,110
  Nonneuronal α-enolase                       110,203
 2. Nucleotide and DNA synthesis
  Adenine phosphoribosyltransferase           35
  CAD (CPSase, ATCase, DHOase)                247
  Dihydrofolate reductase                     248,249
  Primase subunit p49                         250
  Ribonucleotide reductase R1, R2             251
  Thymidine kinase                            252-254
  Thymidylate synthase                        255
 3. Protein posttranslational modifications
  Cdc2 kinase                                 256
  FGF-inducible kinase                        131
  MAP kinase phosphatase-1                    76-80
  Polo-like kinase                            210,211
  Protein tyrosine phosphatase35              257
  PRG1 phosphatase                            258
  Serum-inducible kinase                      259
 4. Other metabolic pathways
  Cyclooxygenase-2                            223,260
  Galactosyltransferase (β-1,4)               261
  Liver thioltransferase                      33
  Nitric oxide synthase                       262
  Ornithine decarboxylase                     33,263-265
  S-Adenosylmethionine decarboxylase          265

D. Secreted proteins
  Cathepsin L                                 46
  Cyr61                                       69,71
  Fibroblast-inducible cytokine               101
  Fisp-12                                     72,73
  KC                                          17,33,93-96,114
  Macrophage-colony stimulating factor        221
  Monocyte chemoattractant protein-1          17,90,99,100,113,115
  Osteopontin                                 266
  Plasminogen activator inhibitor-1           90
  Proliferin                                  43-45
  Tissue inhibitor of metalloproteinase       267
  Urokinase-type plasminogen activator        14

E. Miscellaneous
  ADP/ATP carrier                             268
  c-Ha-ras                                    191
  Ca2+-binding protein                        39
  Calcyclin                                   4441
  Carbohydrate-binding protein 35             269
  CHIP28 water channel                        35
  Cyclin D1(c)                                35,201-203
  Glucose transporter                         270-272
  Poly(A)-binding protein                     273
  Proliferating cell nuclear antigen          254,274
  Ribosomal protein L32                       110
  Tissue factor                               27,53,90

(a) Only those genes regulated by serum and/or multiple polypeptide growth factors are listed. Genes identified by either Northern blot hybridization experiments using cloned, previously identified cDNA probes or the cDNA cloning experiments listed in Table I are included. This list, and the literature citations in the right-hand column, are not intended to be comprehensive.
(b) Immediate-early response genes are in bold print.
(c) Reported to have properties typical of an immediate-early (202) and a delayed-early (35) gene.
in NIH 3T3 cells using antisense RNA has been reported to decrease the growth rate of these cells (142) and also inhibit serum- or PDGF-stimulated DNA synthesis (143). However, antibody microinjection experiments have demonstrated that, although c-Fos, Fos-B, and c-Jun may play a role in the G0 → S transition, this process can still occur in a significant percentage (~30-60%) of the injected cells (144, 145). Neutralization of c-Jun appears to have the greatest impact on cell cycle progression, inhibiting serum-stimulated DNA synthesis by ~70% (144). The observation that c-Fos function is not essential for cell growth per se is consistent with studies demonstrating that c-Fos-deficient ES cells, primary embryonic fibroblasts, or immortalized fibroblast cell lines have normal growth rates and can re-enter the cell cycle following serum stimulation (146-148). Indeed, c-Fos (149, 150) and Fos-B (151) homozygous null mice appear normal at birth and are viable, although they do display several distinct phenotypic abnormalities during postnatal development. The c-jun proto-oncogene has also been inactivated in mice by homologous recombination; in this case, homozygous mutant embryos die at midgestation (~12.5 days postcoitum) (152, 153). Although one group has reported that c-Jun-deficient ES cells exhibited the same growth rate as wild-type cells (154), another study found that primary embryonic fibroblasts derived from 11.5-day-postcoitum mutant embryos had greatly reduced growth rates in serum-containing culture medium (152). Furthermore, serum-starved mutant fibroblasts had a diminished proliferative response to several mitogens, including PDGF and FGF-2 (basic FGF). These studies indicate that, in contrast to c-Fos or Fos-B, the c-Jun protein appears to be required for optimal fibroblast growth in vitro and embryonic viability.
2. c-MYC
The three highly related Myc oncoproteins (c-, N-, and L-Myc) are transcription factors that form heterodimers with the helix-loop-helix leucine zipper phosphoprotein Max (the murine homolog is called Myn). These complexes bind to a consensus binding site (CACGTG) found within the promoter region of target genes (reviewed in 155). Ectopic overexpression of c-Myc protein in quiescent cells can stimulate a subpopulation (~40-50%) of these cells to re-enter the cell cycle, progress through G1, and enter S phase (33, 156, 157). Microinjection of c-Myc into the nuclei of quiescent Swiss 3T3 fibroblasts has a similar effect (158). However, it has also been shown that c-Myc-deficient ES cells have normal growth properties in vitro (159). Nevertheless, c-Myc expression, like c-Jun expression (see earlier), is necessary for embryonic survival beyond mid-gestation (159). These results suggest that Myc family members, like Fos (see earlier) or Egr (see later) family members, may have partially overlapping functions so that the loss of any one member may not have dramatic consequences on cell growth in vitro or early embryonic development.
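The consensus binding site given above for Myc-containing complexes (CACGTG) is palindromic: it reads the same on both strands, so a scan of one strand also covers the complementary strand. As a minimal illustrative sketch only (the function names and the example sequence are hypothetical and are not taken from any of the studies cited here), candidate sites in a promoter sequence could be located as follows:

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_consensus_sites(seq, motif="CACGTG"):
    """Return 0-based start positions of exact matches to motif in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Resume the search one base downstream of the last hit.
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical promoter fragment, used only for illustration.
example_fragment = "ttCACGTGaaCACGTG"
print(reverse_complement("CACGTG") == "CACGTG")  # checks that the motif is palindromic
print(find_consensus_sites(example_fragment))
```

An exact-match scan of this kind identifies candidate sites only; it says nothing about whether a given site is occupied or functional in cells.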
3. EGR-1, EGR-2
The zinc finger proteins Egr-1 (also called Zif268, Krox-24, or NGFI-A) and Egr-2 (also called Krox-20) are members of a family of transcription factors. All of these related proteins bind to the same GC-rich consensus sequence and can act as positive or negative regulators of gene transcription (see Sections II,A,2 and II,A,3). Antisense strategies have indicated that the Egr-1 protein is necessary for macrophage differentiation (160) and T lymphocyte proliferation (161) in vitro. However, Egr-1-deficient ES cells proliferate normally and are capable of differentiating into several cell lineages (162). Furthermore, Egr-1-homozygous mutant mice are viable and exhibit no obvious developmental defects (162), although females are infertile (163). In contrast, mice homozygous for a targeted mutation in the Egr-2 gene have several phenotypic abnormalities and die shortly after birth (164, 165). Taken together, these results indicate that Egr-1 is not required for cellular growth and differentiation (perhaps due to functional redundancy of Egr family proteins), but Egr-2 appears to have a more specific role in cellular growth and differentiation in vivo.
4. NUR77
The immediate-early response gene Nur77 (also called NGFI-B) encodes a member of the steroid-thyroid-retinoid superfamily of receptors that can bind a specific DNA sequence motif and thereby activate gene transcription. The Nur77 ligand has not been identified (see Section II,A,2). Nur77 knockout mice develop normally without overt changes in size, growth rate, or behavior (166); thus, it appears that the Nur77 transcription factor is not required for cellular proliferation or differentiation in vivo.
5. SRF
The ~67-kDa SRF protein is a transcription factor that binds a DNA sequence motif called the serum response element. This element is present in the promoters of several immediate-early response genes (reviewed in 167, 168). SRF interacts with Ets domain-containing ternary complex factors (e.g., SAP-1, Elk-1), which become phosphorylated following stimulation of the mitogen-activated protein kinase pathway (167). The role of SRF in serum-stimulated cell cycle progression has been investigated using an antibody microinjection approach (169). These experiments demonstrated that microinjection of anti-SRF antibodies into the cytoplasm of either G0 or G1 phase (up to 8 h post-serum stimulation) cells inhibited DNA synthesis in ~80-90% of the injected cells. These results indicate that SRF function is required for entry into S phase.
B. Miscellaneous Proteins
1. p53
The p53 tumor suppressor gene encodes a multifunctional DNA-binding transcription factor that can interact with a diverse group of cellular as well as viral proteins (reviewed in 170). Several p53 response genes have been identified, including GADD45, cyclin G, and IGF-binding protein 3. Approximately 50% of the major forms of human cancer contain p53 missense mutations. Serum stimulation of quiescent fibroblasts increases p53 gene expression in late G1 (33, 171). Although several early studies indicated that p53 was a positive regulator of cellular proliferation (172-175), more recent reports have clearly demonstrated its antiproliferative functions. For example, transfection experiments have shown that p53 overexpression can block cell cycle progression in the G1 phase (170, 176-180); furthermore, it can also induce apoptosis in several cell types (170, 181, 182). Also, fibroblasts derived from p53-deficient mouse embryos have several altered growth characteristics relative to wild-type fibroblasts, including a significantly shorter doubling time, increased growth capacity in low-density conditions, and escape from replicative senescence (183). Similar results have been reported by another group using fibroblast-like cells derived from several tissues of 2-month-old p53 knockout mice (184). It should be noted that, although several of the knockout studies have indicated that p53 is not required for normal embryonic development (185-187), other groups have found that p53 deficiency results in specific developmental abnormalities and/or embryonic lethality (188, 189). In any case, it is clear that a p53 null mutation leads to tumorigenesis in mice. Taken together, the results reported to date indicate that p53 is a critical negative regulator of cell proliferation in vitro and in vivo.
2. c-HA-RAS
The ras p21 proto-oncogenes encode four members (H-ras, N-ras, K-rasA, K-rasB) of a superfamily of small GTPases regulated by guanine-nucleotide exchange factors, guanine-nucleotide dissociation inhibitors, and GTPase-activating proteins (reviewed in 190). Ras is involved in many aspects of cell growth and differentiation and in particular has been identified as a major component of several mitogenic signal transduction pathways. The c-Ha-ras gene is expressed at elevated levels following serum, insulin, IGF-1, or EGF treatment of quiescent mouse fibroblasts (191). Microinjection (192) and transfection (193) studies using NIH 3T3 cells have demonstrated that c-Ha-ras overexpression can promote the G0 → S transition and induce cellular transformation. Additionally, cells that overexpress a dominant inhibitory c-Ha-ras mutant protein are growth inhibited (194, 195) and show weak mitogenic stimulation by serum or individual growth factors (195). Furthermore, microinjection experiments using either neutralizing anti-ras (H-ras, K-ras) antibodies (196, 197) or two distinct dominant inhibitory c-Ha-ras protein mutants (198) have indicated that ras activity is required during both early and late G1 for serum-stimulated DNA synthesis in murine fibroblasts. These results indicate that c-Ha-ras function is critical for cell cycle progression.
3. CYCLIN D1
The synthesis and assembly of various holoenzymes composed of a regulatory subunit (the cyclins) and a catalytic subunit (the cyclin-dependent kinases) occurs during cell cycle progression (reviewed in 199, 200). One of the cyclins, cyclin D1, is encoded by a serum- and polypeptide growth factor-inducible gene (35, 201-203). It is currently unclear whether cyclin D1 should be classified as an immediate-early or delayed-early response gene since serum-stimulated cyclin D1 transcription has been reported to occur in both a protein synthesis-dependent (35) and -independent (202) manner. The role of cyclin D1 in cellular proliferation has been investigated using several approaches. Overexpression studies in fibroblasts using cyclin D1 cDNA plasmid constructs driven by either a constitutive (204) or a tetracycline-regulated (205, 206) promoter have demonstrated that cyclin D1 can contract the G1 phase and thereby accelerate both the G0 → S and G1 → S transitions. Transfected cell lines expressing cyclin D1 were also less dependent on serum for growth in vitro and were of smaller size than control cells (204, 205). Additionally, microinjection of anti-cyclin D1 antibodies into human lung fibroblasts (207), rodent fibroblasts (204), or human tumor cell lines (208) during the early to mid-G1 phase has been shown to prevent S phase entry. Similar results were also obtained by electroporation of the antibodies (208) or microinjection of a cyclin D1 antisense RNA expression plasmid (207). Taken together, these studies indicate that cyclin D1 function is required for G1 progression and thus cellular proliferation in vitro. Cyclin D1-deficient mice (generated by gene targeting in ES cells) can develop to term, but a majority of these animals show reduced body size, have neurological abnormalities, and die within 1 month of birth (209). A detailed analysis of these knockout mice has indicated that cyclin D1 function is critical for retinal precursor cell proliferation during embryonic development as well as mammary epithelial cell proliferation during pregnancy (209).
4. POLO-LIKE KINASE
The polo-like kinase protein is a serine-threonine-specific kinase encoded by a serum-inducible late response gene (210, 211). Microinjection of in vitro transcribed polo-like kinase mRNA into serum-starved NIH 3T3 cells can promote DNA synthesis in a subpopulation of these cells (212). Also, microinjection of antisense polo-like kinase RNA reduces the percentage of cells able to re-enter the cell cycle following serum stimulation (212). These results indicate that this kinase plays an important role in the G0 → S transition. However, a recent study has shown that the microinjection of anti-polo-like kinase antibodies has no effect on S phase progression in HeLa or Hs68 cells (275).
5. MKP-1
MKP-1 (also called externally regulated phosphatase or CL100) is a dual-specificity mitogen-activated protein kinase phosphatase encoded by an immediate-early gene (see Section II,A,2). Transfected NIH 3T3 cells that express MKP-1 in a constitutive manner show a significant decrease in growth rate compared with parental cell lines (78). However, MKP-1 knockout mice are phenotypically normal and MKP-1-deficient fibroblasts grow in a similar manner as control fibroblasts (276). These results indicate that MKP-1 function is not critical for cell cycle progression.
IV. Conclusions
Research conducted during the past 15 years on serum- and growth factor-regulated gene expression in murine fibroblasts has provided significant insight into mitogenic signal transduction and cell growth control. It is now well documented that growth factor stimulation of quiescent cells promotes the sequential expression of a large family of nuclear genes comprising immediate-early, delayed-early, and late response gene members. The proteins encoded by this gene family include DNA-binding transcription factors, cytoskeletal and extracellular matrix proteins, metabolic enzymes, secreted chemokines, and serine-threonine kinases. In addition, a significant number of growth factor-regulated genes encode proteins that do not contain recognizable structural motifs or have amino acid sequence similarity to previously identified gene products. The cellular functions of these "novel" proteins are of particular interest, and it is likely that many future studies will focus on this group of genes.
Once a particular growth factor-regulated gene is identified, subsequent efforts to characterize the gene usually proceed in either an "upstream" direction (analysis of the signal transduction pathways, transcription factors, or promoter elements responsible for gene activation) or a "downstream" direction (analysis of protein structure-function or the specific role of the protein in cellular proliferation). Studies attempting to determine whether a particular growth factor-inducible gene product plays a critical role in the G0 → S transition have been emphasized in this review. Several experimental approaches, including the generation of knockout mice, have been employed in these studies. The results reported to date indicate that, while some proteins appear to be important positive or negative effectors of cell cycle progression, others are not required for this process. In some cases, this latter finding is probably due to the fact that many of the specific gene products under investigation (e.g., immediate-early transcription factors) are members of a family of proteins with partially overlapping cellular functions. In any case, the realization that certain growth factor-induced genes do encode proteins that mediate the transition of cells from a resting to a growing state has validated research efforts in this field. It is anticipated that future studies will provide additional information on how extracellular signaling molecules can regulate eukaryotic cell proliferation.
ACKNOWLEDGMENTS
I thank Pat Donohue and Debbie Hsu for performing the Northern blot hybridization experiments presented here and for their helpful comments on this manuscript. I am also very grateful to Kim Peifley for help with reference management and Kit9 Wawzinski and Debi Weber for excellent secretarial assistance. The studies cited from my own laboratory were supported by research grants from the National Institutes of Health (HL39727, HL54710) and the American Heart Association (96014230).
REFERENCES
1. P. van der Geer and T. Hunter, Annu. Rev. Cell Biol. 10, 251 (1994).
2. W. J. Fantl, D. E. Johnson, and L. T. Williams, Annu. Rev. Biochem. 62, 453 (1993).
3. G. T. Williams, A. S. Abler, and L. F. Lau, in "Molecular and Cellular Approaches to the Control of Proliferation and Differentiation" (G. S. Stein and J. B. Lian, eds.), p. 115. Academic Press, Orlando, FL, 1992.
4. R. Muller, D. Mumberg, and F. C. Lucibello, Biochim. Biophys. Acta 1155, 151 (1993).
5. K. Malarkey, C. M. Belham, A. Paul, A. Graham, A. McLees, et al., Biochem. J. 309, 361 (1995).
6. S. B. McMahon and J. G. Monroe, FASEB J. 6, 2707 (1992).
7. R. Hofbauer and D. T. Denhardt, Crit. Rev. Eukaryotic Gene Expression 1, 247 (1991).
8. B. J. Rollins and C. D. Stiles, Adv. Cancer Res. 53, 1 (1989).
9. R. Bravo, Cell Growth Differ. 1, 305 (1990).
10. H. R. Herschman, Annu. Rev. Biochem. 60, 281 (1991).
11. C. D. Scher, R. C. Shephard, H. N. Antoniades, and C. D. Stiles, Biochim. Biophys. Acta 560, 217 (1979).
12. A. B. Pardee, Science 246, 603 (1989).
13. R. Muller, R. Bravo, J. Burckhardt, and T. Curran, Nature (London) 312, 716 (1984).
14. G. Grimaldi, P. Di Fiore, E. K. Locatelli, J. Falco, and F. Blasi, EMBO J. 5, 855 (1986).
15. L. F. Lau and D. Nathans, in "The Hormonal Control Regulation of Gene Transcription" (P. Cohen and J. G. Foulkes, eds.), p. 257. Elsevier Science Publishers, Cambridge, England, 1991.
16. D. I. H. Linzer and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 80, 4271 (1983).
17. B. H. Cochran, A. C. Reffel, and C. D. Stiles, Cell 33, 939 (1983).
18. G. T. Williams and L. F. Lau, Mol. Cell. Biol. 13, 6124 (1993).
19. R. R. Freter, J. A. Alberta, G. Y. Hwang, A. L. Wrentmore, and C. D. Stiles, J. Biol. Chem. 271, 17417 (1996).
20. R. R. Freter, J.-C. Irminger, J. A. Porter, S. D. Jones, and C. D. Stiles, Mol. Cell. Biol. 12, 5288 (1992).
21. L. F. Lau and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 84, 1182 (1987).
22. J. M. Almendral, D. Sommer, H. MacDonald-Bravo, J. Burckhardt, J. Perera, et al., Mol. Cell. Biol. 8, 2140 (1988).
23. G. Shaw and R. Kamen, Cell 46, 659 (1986).
24. C.-Y. A. Chen and A.-B. Shyu, Mol. Cell. Biol. 14, 8471 (1994).
25. C. A. Lagnado, C. Y. Brown, and G. J. Goodall, Mol. Cell. Biol. 14, 7984 (1994).
26. A. M. Zubiaga, J. G. Belasco, and M. E. Greenberg, Mol. Cell. Biol. 15, 2219 (1995).
27. G. Ranganathan, S. P. Blatti, M. Subramaniam, D. N. Fass, N. J. Maihle, et al., J. Biol. Chem. 266, 496 (1991).
28. L. C. Mahadevan and D. R. Edwards, Nature (London) 349, 747 (1991).
29. C. Bello-Fernandez, G. Packham, and J. L. Cleveland, Proc. Natl. Acad. Sci. U.S.A. 90, 7804 (1993).
30. K. E. Tobias, J. Shor, and C. Kahana, Oncogene 11, 1721 (1995).
31. G. Molnar, A. Crozat, and A. B. Pardee, Mol. Cell. Biol. 14, 5242 (1994).
32. J. C. Groskopf and D. I. H. Linzer, Mol. Cell. Biol. 14, 6013 (1994).
33. S. V. Tavtigian, S. D. Zabludoff, and B. J. Wold, Mol. Biol. Cell 5, 375 (1994).
34. M. Wick, C. Burger, S. Brusselbach, F. C. Lucibello, and R. Muller, J. Cell Sci. 107, 227 (1994).
35. A. Lanahan, J. B. Williams, L. K. Sanders, and D. Nathans, Mol. Cell. Biol. 12, 3919 (1992).
36. K. L. Mohn, T. M. Laz, J.-C. Hsu, A. E. Melby, R. Bravo, et al., Mol. Cell. Biol. 11, 381 (1991).
37. P. F. Zipfel, S. G. Irving, K. Kelly, and U. Siebenlist, Mol. Cell. Biol. 9, 1041 (1989).
38. A. Gashler and V. P. Sukhatme, Prog. Nucleic Acid Res. Mol. Biol. 50, 191 (1995).
39. L. L. Jackson-Grusby, J. Swiergiel, and D. I. H. Linzer, Nucleic Acids Res. 15, 6677 (1987).
40. B. Calabretta, R. Battini, L. Kaczmarek, J. K. de Riel, and R. Baserga, J. Biol. Chem. 261, 12628 (1986).
41. X. Guo, A. F. Chambers, C. L. J. Parfett, P. Waterhouse, L. C. Murphy, et al., Cell Growth Differ. 1, 333 (1990).
42. D. I. H. Linzer and E. L. Wilder, Mol. Cell. Biol. 7, 2080 (1987).
43. D. I. H. Linzer and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 81, 4255 (1984).
44. M. Nilsen-Hamilton, R. T. Hamilton, and E. Alvarez-Azaustre, Gene 51, 163 (1987).
45. C. L. J. Parfett, R. T. Hamilton, B. W. Howell, D. R. Edwards, M. Nilsen-Hamilton, et al., Mol. Cell. Biol. 5, 3289 (1985).
46. M. Nilsen-Hamilton, Y. Jang, M. Delgado, J. Shim, K. Bruns, et al., Mol. Cell. Endocrinol. 77, 115 (1991).
47. S. Lee, F. Talamantes, E. Wilder, D. I. H. Linzer, and D. Nathans, Endocrinology 122, 1761 (1988).
48. E. L. Wilder and D. I. H. Linzer, Mol. Cell. Biol. 9, 430 (1989).
49. D. Jackson, O. V. Volpert, N. Bouck, and D. I. H. Linzer, Science 266, 1581 (1994).
50. L. F. Lau and D. Nathans, EMBO J. 4, 3145 (1985).
51. C. H. Charles, J. S. Simske, T. P. O'Brien, and L. F. Lau, Mol. Cell. Biol. 10, 6769 (1990).
52. C. H. Charles, J. K. Yoon, J. S. Simske, and L. F. Lau, Oncogene 8, 797 (1993).
53. S. Hartzell, K. Ryder, A. Lanahan, L. F. Lau, and D. Nathans, Mol. Cell. Biol. 9, 2567 (1989).
54. T. G. Hazel, D. Nathans, and L. F. Lau, Proc. Natl. Acad. Sci. U.S.A. 85, 8444 (1988).
55. R.-P. Ryseck, H. MacDonald-Bravo, M.-G. Mattei, S. Ruppert, and R. Bravo, EMBO J. 8, 3327 (1989).
56. J. Milbrandt, Neuron 1, 183 (1988).
57. T. E. Wilson, T. J. Fahrner, M. Johnston, and J. Milbrandt, Science 252, 1296 (1991).
58. R. E. Paulsen, C. A. Weaver, T. J. Fahrner, and J. Milbrandt, J. Biol. Chem. 267, 16491 (1992).
59. I. J. Davis, T. G. Hazel, R.-H. Chen, J. Blenis, and L. F. Lau, Mol. Endocrinol. 7, 953 (1993).
60. I. J. Davis, T. G. Hazel, and L. F. Lau, Mol. Endocrinol. 5, 854 (1991).
61. B. A. Christy, L. F. Lau, and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 85, 7857 (1988).
62. B. Christy and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 86, 8737 (1989).
63. V. P. Sukhatme, S. Kartha, F. G. Toback, R. Taub, R. G. Hoover, et al., Oncogene Res. 1, 343 (1987).
64. P. Lemaire, O. Revelant, R. Bravo, and P. Charnay, Proc. Natl. Acad. Sci. U.S.A. 85, 4691 (1988).
65. J. Milbrandt, Science 238, 797 (1987).
66. B. A. Christy, L. K. Sanders, L. F. Lau, N. G. Copeland, N. A. Jenkins, et al., Proc. Natl. Acad. Sci. U.S.A. 88, 1815 (1991).
67. K. Ryder, L. F. Lau, and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 85, 1487 (1988).
68. R. N. DuBois, M. W. McLane, K. Ryder, L. F. Lau, and D. Nathans, J. Biol. Chem. 265, 19185 (1990).
69. T. P. O'Brien, G. P. Yang, L. Sanders, and L. F. Lau, Mol. Cell. Biol. 10, 3569 (1990).
70. G. P. Yang and L. F. Lau, Cell Growth Differ. 2, 351 (1991).
71. D. L. Simmons, D. B. Levy, Y. Yannoni, and R. L. Erikson, Proc. Natl. Acad. Sci. U.S.A. 86, 1178 (1989).
72. R.-P. Ryseck, H. MacDonald-Bravo, M.-G. Mattei, and R. Bravo, Cell Growth Differ. 2, 225 (1991).
73. D. M. Bradham, A. Igarashi, R. L. Potter, and G. R. Grotendorst, J. Cell Biol. 114, 1285 (1991).
74. V. Joliot, C. Martinerie, G. Dambrine, G. Plassiart, M. Brisac, et al., Mol. Cell. Biol. 12, 10 (1992).
75. M. L. Kireeva, F. Mo, G. P. Yang, and L. F. Lau, Mol. Cell. Biol. 16, 1326 (1996).
76. C. H. Charles, H. Sun, L. F. Lau, and N. K. Tonks, Proc. Natl. Acad. Sci. U.S.A. 90, 5292 (1993).
77. H. Sun, C. H. Charles, L. F. Lau, and N. K. Tonks, Cell 75, 487 (1993).
78. T. Noguchi, R. Metz, L. Chen, M.-G. Mattei, D. Carrasco, et al., Mol. Cell. Biol. 13, 5195 (1993).
79. S. M. Keyse and E. E. Emslie, Nature (London) 359, 644 (1992).
80. D. R. Alessi, C. Smythe, and S. M. Keyse, Oncogene 8, 2015 (1993).
81. G. L. Johnson and R. R. Vaillancourt, Curr. Opin. Cell Biol. 6, 230 (1994).
82. T. Hunter, Cell 80, 225 (1995).
83. P. R. Clarke, Curr. Biol. 4, 647 (1994).
84. V. P. Sukhatme, X. Cao, L. C. Chang, C. Tsai-Morris, D. Stamenkovich, et al., Cell 53, 37 (1988).
85. L. J. Joseph, M. M. Le Beau, G. A. Jamieson, Jr., S. Acharya, T. B. Shows, et al., Proc. Natl. Acad. Sci. U.S.A. 85, 7164 (1988).
86. P. Chavrier, M. Zerial, P. Lemaire, J. Almendral, R. Bravo, et al., EMBO J. 7, 29 (1988).
87. S. Patwardhan, A. Gashler, M. G. Siegel, L. C. Chang, L. J. Joseph, et al., Oncogene 6, 917 (1991).
88. A. S. Masibay, P. K. Qasba, D. N. Sengupta, G. P. Damewood, and T. Sreevalsan, Mol. Cell. Biol. 8, 2288 (1988).
89. E. Boeggeman, A. S. Masibay, P. K. Qasba, and T. Sreevalsan, J. Cell. Physiol. 145, 286 (1990).
90. R. Bravo, M. Zerial, L. Toschi, M. Schurmann, R. Muller, et al., Cold Spring Harbor Symp. Quant. Biol. 53, 901 (1988).
91. J. J. Oppenheim, C. O. C. Zachariae, N. Mukaida, and K. Matsushima, Annu. Rev. Immunol. 9, 617 (1991).
92. M. D. Miller and M. S. Krangel, CRC Crit. Rev. Immunol. 12, 17 (1992).
93. R. P. Ryseck, H. MacDonald Bravo, M. G. Mattei, and R. Bravo, Exp. Cell Res. 180, 266 (1989).
94. A. Anisowicz, L. Bardwell, and R. Sager, Proc. Natl. Acad. Sci. U.S.A. 84, 7188 (1987).
95. A. Richmond, E. Balentien, H. G. Thomas, G. Flaggs, D. E. Barton, et al., EMBO J. 7, 2025 (1988).
96. K. Watanabe, K. Konish, M. Fujioka, S. Kinoshita, and H. Nakagawa, J. Biol. Chem. 264, 19559 (1989).
97. J. N. Heinrich, E. C. O'Rourke, L. Chen, H. Gray, K. S. Dorfman, et al., Mol. Cell. Biol. 14, 2849 (1994).
98. S. A. Lira, P. Zalamea, J. N. Heinrich, M. E. Fuentes, D. Carrasco, et al., J. Exp. Med. 180, 2039 (1994).
99. T. Yoshimura, N. Yuhki, S. K. Moore, E. Appella, M. I. Lerman, et al., FEBS Lett. 244, 487 (1989).
100. B. J. Rollins, P. Stier, T. Ernst, and G. G. Wong, Mol. Cell. Biol. 9, 4687 (1989).
101. J. N. Heinrich, R.-P. Ryseck, H. MacDonald-Bravo, and R. Bravo, Mol. Cell. Biol. 13, 2020 (1993).
102. S. Tominaga, FEBS Lett. 238, 315 (1988).
103. S. Tominaga, FEBS Lett. 258, 301 (1989).
104. T. Takagi, K. Yanagisawa, T. Tsukamoto, T. Tetsuka, S. Nagata, et al., Biochim. Biophys. Acta 1178, 194 (1993).
105. R. Klemenz, S. Hoffmann, and A.-K. Werenskiold, Proc. Natl. Acad. Sci. U.S.A. 86, 5708 (1989).
106. A.-K. Werenskiold, S. Hoffmann, and R. Klemenz, Mol. Cell. Biol. 9, 5207 (1989).
107. K. Yanagisawa, T. Tsukamoto, T. Takagi, and S. Tominaga, FEBS Lett. 302, 51 (1992).
108. K. Yanagisawa, T. Takagi, T. Tsukamoto, T. Tetsuka, and S. Tominaga, FEBS Lett. 318, 83 (1993).
109. G. Bergers, A. Reikerstorfer, S. Braselmann, P. Graninger, and M. Busslinger, EMBO J. 13, 1176 (1994).
110. T. Nikaido, D. W. Bradley, and A. B. Pardee, Exp. Cell Res. 192, 102 (1991).
111. J. B. Williams and A. A. Lanahan, Biochem. Biophys. Res. Commun. 213, 325 (1995).
112. C.-H. Heldin and B. Westermark, Cell Regul. 1, 555 (1990).
113. B. J. Rollins, E. D. Morrison, and C. D. Stiles, Proc. Natl. Acad. Sci. U.S.A. 85, 3738 (1988).
114. P. Oquendo, J. Alberta, D. Wen, J. L. Graycar, R. Derynck, et al., J. Biol. Chem. 264, 4133 (1989).
115. R. S. Kawahara and T. F. Deuel, J. Biol. Chem. 264, 679 (1989).
116. Y. Furutani, H. Nomura, M. Notake, Y. Oyamada, T. Fukui, et al., Biochem. Biophys. Res. Commun. 159, 249 (1989).
117. B. J. Rollins, A. Walz, and M. Baggiolini, Blood 78, 1112 (1991).
118. C. A. Ernst, Y. J. Zhang, P. R. Hancock, B. J. Rutledge, C. L. Corless, et al., J. Immunol. 152, 3541 (1994).
119. T. Yoshimura, E. A. Robinson, S. Tanaka, E. Appella, J. Kuratsu, et al., J. Exp. Med. 169, 1449 (1989).
120. M. Woldemar-Carr, S. J. Roth, E. Luther, S. S. Rose, and T. A. Springer, Proc. Natl. Acad. Sci. U.S.A. 91, 3652 (1994).
121. I. F. Charo, S. J. Myers, A. Herman, C. Franci, A. J. Connolly, et al., Proc. Natl. Acad. Sci. U.S.A. 91, 2752 (1994).
122. B. J. Rutledge, H. Rayburn, R. Rosenberg, R. J. North, R. P. Gladue, et al., J. Immunol. 155, 4838 (1995).
123. G. N. Gill, P. J. Bertics, and J. B. Santon, Mol. Cell. Endocrinol. 51, 169 (1987).
124. J. Massague and A. Pandiella, Annu. Rev. Biochem. 62, 515 (1993).
125. S. P. Blatti, D. N. Foster, G. Ranganathan, H. L. Moses, and M. J. Getz, Proc. Natl. Acad. Sci. U.S.A. 85, 1119 (1988).
126. J. I. Jones and D. R. Clemmons, Endocrine Rev. 16, 3 (1995).
127. P. Zumstein and C. D. Stiles, J. Biol. Chem. 262, 11252 (1987).
128. W. H. Burgess and J. A. Winkles, in "Cell Proliferation in Cancer: Regulatory Mechanisms of Neoplastic Cell Growth" (L. Pusztai, C. E. Lewis, and E. Yap, eds.), p. 154. Oxford University Press, Oxford, England, 1996.
129. D. K. W. Hsu, P. J. Donohue, G. F. Alberts, and J. A. Winkles, Biochem. Biophys. Res. Commun. 197, 1483 (1993).
130. M. McClelland, F. Mathieu-Daude, and J. Welsh, Trends Genet. 11, 241 (1995).
131. P. J. Donohue, G. F. Alberts, Y. Guo, and J. A. Winkles, J. Biol. Chem. 270, 10351 (1995).
132. P. J. Donohue, G. F. Alberts, B. S. Hampton, and J. A. Winkles, J. Biol. Chem. 269, 8604 (1994).
133. D. K. Wilson, T. Nakano, J. M. Petrash, and F. A. Quiocho, Biochemistry 34, 14323 (1995).
134. D. K. W. Hsu, Y. Guo, G. F. Alberts, N. G. Copeland, D. J. Gilbert, et al., J. Biol. Chem. 271, 13786 (1996).
135. D. K. W. Hsu, Y. Guo, G. F. Alberts, K. A. Peifley, and J. A. Winkles, J. Cell. Physiol. 167, 261 (1996).
136. P. J. Donohue, S.-L. Y. Feng, G. F. Alberts, Y. Guo, K. A. Peifley, et al., Biochem. J. 319, 9 (1996).
137. P. Angel and M. Karin, Biochim. Biophys. Acta 1072, 129 (1991).
138. A. D. Miller, T. Curran, and I. M. Verma, Cell 36, 51 (1984).
139. R. Wisdom and I. M. Verma, Mol. Cell. Biol. 13, 2635 (1993).
140. M. Castellazzi, G. Spyrou, N. La Vista, J.-P. Dangy, F. Piu, et al., Proc. Natl. Acad. Sci. U.S.A. 88, 8890 (1991).
141. H. Okuno, T. Suzuki, T. Yoshida, Y. Hashimoto, T. Curran, et al., Oncogene 6, 1491 (1991).
142. J. T. Holt, T. Venkat-Gopal, A. D. Moulton, and A. W. Nienhuis, Proc. Natl. Acad. Sci. U.S.A. 83, 4794 (1986).
143. K. Nishikura and J. M. Murray, Mol. Cell. Biol. 7, 639 (1987).
144. K. Kovary and R. Bravo, Mol. Cell. Biol. 11, 4466 (1991).
145. K. Kovary and R. Bravo, Mol. Cell. Biol. 12, 5015 (1992).
146. S. J. Field, R. S. Johnson, R. M. Mortensen, V. E. Papaioannou, B. M. Spiegelman, et al., Proc. Natl. Acad. Sci. U.S.A. 89, 9306 (1992).
147. E. Hu, E. Mueller, S. Oliviero, V. E. Papaioannou, R. Johnson, et al., EMBO J. 13, 3094 (1994).
148. S. Brusselbach, U. Mohle-Steinlein, Z.-Q. Wang, M. Schreiber, F. C. Lucibello, et al., Oncogene 10, 79 (1995).
149. R. S. Johnson, B. M. Spiegelman, and V. Papaioannou, Cell 71, 577 (1992).
150. Z.-Q. Wang, C. Ovitt, A. E. Grigoriadis, U. Mohle-Steinlein, U. Ruther, et al., Nature (London) 360, 741 (1992).
151. J. R. Brown, H. Ye, R. T. Bronson, P. Dikkes, and M. E. Greenberg, Cell 86, 297 (1996).
152. R. S. Johnson, B. van Lingen, V. E. Papaioannou, and B. M. Spiegelman, Genes Dev. 7, 1309 (1993).
153. F. Hilberg, A. Aguzzi, N. Howells, and E. F. Wagner, Nature (London) 365, 179 (1993).
154. F. Hilberg and E. F. Wagner, Oncogene 7, 2371 (1992).
155. M. D. Cole, Cell 65, 715 (1991).
156. H. A. Armelin, M. C. S. Armelin, K. Kelly, T. Stewart, P. Leder, et al., Nature (London) 310, 655 (1984).
157. F. Cavalieri and M. Goldfarb, Mol. Cell. Biol. 7, 3554 (1987).
158. L. Kaczmarek, J. K. Hyland, R. Watt, M. Rosenberg, and R. Baserga, Science 228, 1313 (1985).
159. A. C. Davis, M. Wims, G. D. Spotts, S. R. Hann, and A. Bradley, Genes Dev. 7, 671 (1993).
160. H. Q. Nguyen, B. Hoffman-Liebermann, and D. A. Liebermann, Cell 72, 197 (1993).
161. A. Perez-Castillo, C. Pipaon, I. Garcia, and S. Alemany, J. Biol. Chem. 268, 19445 (1993).
162. S. L. Lee, L. C. Tourtellotte, R. L. Wesselschmidt, and J. Milbrandt, J. Biol. Chem. 270, 9971 (1995).
163. S. L. Lee, Y. Sadovsky, A. H. Swirnoff, J. A. Polish, P. Coda, et al., Science 273, 1219 (1996).
164. S. Schneider-Maunoury, P. Topilko, T. Seitanidou, G. Levi, M. Cohen-Tannoudji, et al., Cell 75, 1199 (1993).
165. P. J. Swiatek and T. Gridley, Genes Dev. 7, 2071 (1993).
166. P. A. Crawford, Y. Sadovsky, K. Woodson, S. L. Lee, and J. Milbrandt, Mol. Cell. Biol. 15, 4331 (1995).
167. R. Treisman, Curr. Opin. Genet. Dev. 4, 96 (1994).
168. R. Treisman, EMBO J. 14, 4905 (1995).
169. C. Gauthier-Rouviere, J.-C. Cavadore, J.-M. Blanchard, N. J. C. Lamb, and A. Fernandez, Cell Regul. 2, 575 (1991).
170. L. J. Ko and C. Prives, Genes Dev. 10, 1054 (1996).
171. N. C. Reich and A. J. Levine, Nature (London) 308, 199 (1984).
172. W. E. Mercer, D. Nelson, A. B. DeLeo, L. J. Old, and R. Baserga, Proc. Natl. Acad. Sci. U.S.A. 79, 6309 (1982).
173. W. E. Mercer, C. Avignolo, and R. Baserga, Mol. Cell. Biol. 4, 276 (1984).
174. L. Kaczmarek, M. Oren, and R. Baserga, Exp. Cell Res. 162, 268 (1986).
175. O. Shohat, M. Greenberg, D. Reisman, M. Oren, and V. Rotter, Oncogene 1, 277 (1987).
176. L. Diller, J. Kassel, C. E. Nelson, M. A. Gryka, G. Litwak, et al., Mol. Cell. Biol. 10, 5772 (1990).
177. W. E. Mercer, M. T. Shields, M. Amin, G. J. Sauve, E. Appella, et al., Proc. Natl. Acad. Sci. U.S.A. 87, 6166 (1990).
178. S. J. Baker, S. Markowitz, E. R. Fearon, J. K. V. Willson, and B. Vogelstein, Science 249, 912 (1990).
179. J. Martinez, I. Georgoff, and A. J. Levine, Genes Dev. 5, 151 (1991).
180. W. E. Mercer, M. Amin, G. J. Sauve, E. Appella, S. J. Ullrich, et al., Oncogene 5, 973 (1990).
181. E. Yonish-Rouach, D. Resnitzky, J. Lotem, L. Sachs, A. Kimchi, et al., Nature (London) 352, 345 (1991).
182. P. Shaw, R. Bovey, S. Tardy, R. Sahli, B. Sordat, et al., Proc. Natl. Acad. Sci. U.S.A. 89, 4495 (1992).
183. M. Harvey, A. T. Sands, R. S. Weiss, M. E. Hegi, R. W. Wiseman, et al., Oncogene 8, 2457 (1993).
184. T. Tsukada, Y. Tomooka, S. Takai, Y. Ueda, S. Nishikawa, et al., Oncogene 8, 3313 (1993).
185. L. A. Donehower, M. Harvey, B. L. Slagle, M. J. McArthur, C. A. Montgomery, Jr., et al., Nature (London) 356, 215 (1992).
186. C. A. Purdie, D. J. Harrison, A. Peter, L. Dobbie, S. White, et al., Oncogene 9, 603 (1994).
187. T. Jacks, L. Remington, B. O. Williams, E. M. Schmitt, S. Halachmi, et al., Curr. Biol. 4, 1 (1994).
188. C. J. Nicol, M. L. Harrison, R. R. Laposa, I. L. Gimelshtein, and P. G. Wells, Nature Genet. 10, 181 (1995).
189. V. P. Sah, L. D. Attardi, G. J. Mulligan, B. O. Williams, R. T. Bronson, et al., Nature Genet. 10, 175 (1995).
190. M. S. Boguski and F. McCormick, Nature (London) 366, 643 (1993).
191. K. Lu, R. A. Levine, and J. Campisi, Mol. Cell. Biol. 9, 3411 (1989).
192. D. W. Stacey and H.-F. Kung, Nature (London) 310, 508 (1984).
193. E. H. Chang, M. E. Furth, E. M. Scolnick, and D. R. Lowy, Nature (London) 297, 479 (1982).
194. L. A. Feig and G. M. Cooper, Mol. Cell. Biol. 8, 3235 (1988).
195. H. Cai, J. Szeberenyi, and G. M. Cooper, Mol. Cell. Biol. 10, 5314 (1990).
196. L. S. Mulcahy, M. R. Smith, and D. W. Stacey, Nature (London) 313, 241 (1985).
197. S. Dobrowolski, M. Harter, and D. W. Stacey, Mol. Cell. Biol. 14, 5441 (1994).
198. D. W. Stacey, M. Roudebush, R. Day, S. D. Mosser, J. B. Gibbs, et al., Oncogene 6, 2297 (1991).
199. X. Grana and E. P. Reddy, Oncogene 11, 211 (1995).
200. C. J. Sherr, Cell 79, 551 (1994).
201. E. Surmacz, K. Reiss, C. Sell, and R. Baserga, Cancer Res. 52, 4522 (1992).
202. J. T. Winston and W. J. Pledger, Mol. Biol. Cell 4, 1133 (1993).
203. M. A. Guthridge, M. Seldin, and C. Basilico, Oncogene 12, 1267 (1996).
204. D. E. Quelle, R. A. Ashmun, S. A. Shurtleff, J. Kato, D. Bar-Sagi, et al., Genes Dev. 7, 1559 (1993).
205. D. Resnitzky, M. Gossen, H. Bujard, and S. I. Reed, Mol. Cell. Biol. 14, 1669 (1994).
206. D. Resnitzky and S. I. Reed, Mol. Cell. Biol. 15, 3463 (1995).
207. V. Baldin, J. Lukas, M. J. Marcote, M. Pagano, and G. Draetta, Genes Dev. 7, 812 (1993).
208. J. Lukas, M. Pagano, Z. Staskova, G. Draetta, and J. Bartek, Oncogene 9, 707 (1994).
209. P. Sicinski, J. L. Donaher, S. B. Parker, T. Li, A. Fazeli, et al., Cell 82, 621 (1995).
210. R. J. Lake and W. R. Jelinek, Mol. Cell. Biol. 13, 7793 (1993).
211. K. S. Lee, Y. O. Yuan, R. Kuriyama, and R. L. Erikson, Mol. Cell. Biol. 15, 7143 (1995).
212. R. Hamanaka, S. Maloid, M. R. Smith, C. D. O'Connell, D. L. Longo, et al., Cell Growth Differ. 5, 249 (1994).
213. D. R. Edwards and D. T. Denhardt, Exp. Cell Res. 157, 127 (1985).
214. C. L. J. Parfett, R. Hofbauer, K. Brudzynski, D. R. Edwards, and D. T. Denhardt, Gene 82, 291 (1989).
215. R. R. Hirschhorn, P. Aller, Z. Yuan, C. W. Gibson, and R. Baserga, Proc. Natl. Acad. Sci. U.S.A. 81, 6004 (1984).
216. S. Vincent, L. Marty, L. LeGallic, P. Jeanteur, and P. Fort, Oncogene 8, 1603 (1993).
217. L. M. Matrisian, G. Rautmann, B. E. Magun, and R. Breathnach, Nucleic Acids Res. 13, 711 (1985).
218. M. Zerial, L. Toschi, R.-P. Ryseck, M. Schuermann, R. Muller, et al., EMBO J. 8, 805 (1989).
219. R.-P. Ryseck, S. I. Hirai, M. Yaniv, and R. Bravo, Nature (London) 334, 535 (1988).
220. R.-P. Ryseck, H. MacDonald-Bravo, M. Zerial, and R. Bravo, Exp. Cell Res. 180, 537 (1989).
221. R.-P. Ryseck, H. MacDonald-Bravo, and R. Bravo, New Biol. 3, 151 (1991).
222. J. M. Almendral, J. F. Santaren, J. Perera, M. Zerial, and R. Bravo, Exp. Cell Res. 181, 518 (1989).
223. R.-P. Ryseck, C. Raynoschek, H. MacDonald-Bravo, K. Dorfman, M.-G. Mattei, et al., Cell Growth Differ. 3, 443 (1992).
224. M. E. Greenberg and E. B. Ziff, Nature (London) 311, 433 (1984).
225. M. E. Greenberg, A. L. Hermanowski, and E. B. Ziff, Mol. Cell. Biol. 6, 1050 (1986).
226. K. Ryder and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 85, 8464 (1988).
227. K. Kelly, B. H. Cochran, C. D. Stiles, and P. Leder, Cell 35, 603 (1983).
228. P. Bull, T. Hunter, and I. M. Verma, Mol. Cell. Biol. 9, 5239 (1989).
229. R. J. Grumont and S. Gerondakis, Cell Growth Differ. 1, 345 (1990).
230. A. J. DeLisle, R. A. Graves, W. F. Marzluff, and L. F. Johnson, Mol. Cell. Biol. 3, 1920 (1983).
231. G. C. Prendergast, D. Lawe, and E. B. Ziff, Cell 65, 395 (1991).
232. J. Svaren, B. R. Sevetson, E. D. Apel, D. B. Zimonjic, N. C. Popescu, et al., Mol. Cell. Biol. 16, 3545 (1996).
233. K. Ito, K. Tsutsumi, T. Kuzumaki, P. F. Gomez, K. Otsu, et al., Nucleic Acids Res. 22, 2036 (1994).
234. M. Glibetic, L. Taylor, D. Larson, R. Hannan, B. Sells, et al., J. Biol. Chem. 270, 4209 (1995).
235. R. P. Misra, V. M. Rivera, J. M. Wang, P. Fan, and M. E. Greenberg, Mol. Cell. Biol. 11, 4545 (1991).
236. P. K. Elder, L. J. Schmidt, T. Ono, and M. J. Getz, Proc. Natl. Acad. Sci. U.S.A. 81, 7476 (1984).
237. E. B. Leof, J. A. Proper, M. J. Getz, and H. L. Moses, J. Cell. Physiol. 127, 83 (1986).
238. S. M. Rybak, R. R. Lobb, and J. W. Fett, J. Cell. Physiol. 136, 312 (1988).
239. U. Gluck, J. L. R. Fernandez, R. Pankov, and A. Ben-Ze'ev, Exp. Cell Res. 202, 477 (1992).
240. R. P. Penttinen, S. Kobayashi, and P. Bornstein, Proc. Natl. Acad. Sci. U.S.A. 85, 1105 (1988).
241. R. E. Bellas, R. Bendori, and S. R. Farmer, J. Biol. Chem. 266, 12008 (1991).
242. R. P. Tucker, J. A. Hammarback, D. A. Jenrath, E. J. Mackie, and Y. Xu, J. Cell Sci. 104, 69 (1993).
243. D. B. Donoviel, S. L. Amacher, K. W. Judge, and P. Bornstein, J. Cell. Physiol. 145, 16 (1990).
244. C. D. Laherty, K. O'Rourke, F. W. Wolf, R. Katz, M. F. Seldin, et al., J. Biol. Chem. 267, 3274 (1992).
245. S. Ferrari, R. Battini, L. Kaczmarek, S. Rittling, B. Calabretta, et al., Mol. Cell. Biol. 6, 3614 (1986).
246. A. Ben-Ze'ev, R. Reiss, R. Bendori, and B. Gorodecki, Cell Regul. 1, 621 (1990).
247. G. N. Rao and R. L. Church, Exp. Cell Res. 178, 449 (1988).
248. S. L. Hendrickson, J.-S. R. Wu, and L. F. Johnson, Proc. Natl. Acad. Sci. U.S.A. 77, 5140 (1980).
249. C. Santiago, M. Collins, and L. F. Johnson, J. Cell. Physiol. 118, 79 (1984).
250. B. Y. Tseng, C. E. Prussak, and M. T. Almazan, Mol. Cell. Biol. 9, 1940 (1989).
251. S. Bjorklund, S. Skog, B. Tribukait, and L. Thelander, Biochemistry 29, 5452 (1990).
252. D. L. Coppock and A. B. Pardee, Mol. Cell. Biol. 7, 2925 (1987).
253. J. M. Gudas, G. B. Knight, and A. B. Pardee, Proc. Natl. Acad. Sci. U.S.A. 85, 4705 (1988).
254. D. Jaskulski, C. Gatti, S. Travali, B. Calabretta, and R. Baserga, J. Biol. Chem. 263, 10175 (1988).
255. C. Jenh, P. K. Geyer, and L. F. Johnson, Mol. Cell. Biol. 5, 2527 (1985).
256. E. Surmacz, P. Nugent, Z. Pietrzkowski, and R. Baserga, Exp. Cell Res. 199, 275 (1992).
257. G. Magistrelli, N. Covini, M. Mosca, G. Lippoli, and A. Isacchi, Biochem. Biophys. Res. Commun. 217, 154 (1995).
258. R. H. Diamond, D. E. Cressman, T. M. Laz, C. S. Abrams, and R. Taub, Mol. Cell. Biol. 14, 3752 (1994).
259. D. L. Simmons, B. G. Neel, R. Stevens, G. Evett, and R. L. Erikson, Mol. Cell. Biol. 12, 4164 (1992).
260. D. A. Kujubu, B. S. Fletcher, B. C. Varnum, R. W. Lim, and H. R. Herschman, J. Biol. Chem. 266, 12866 (1991).
261. A. S. Masibay, G. P. Damewood, E. Boeggeman, and P. K. Qasba, Biochim. Biophys. Acta 1090, 230 (1991).
262. R. S. Gilbert and H. R. Herschman, J. Cell. Physiol. 157, 128 (1993).
263. C. Kahana and D. Nathans, Proc. Natl. Acad. Sci. U.S.A. 81, 3645 (1984).
264. A. Katz and C. Kahana, Mol. Cell. Biol. 7, 2641 (1987).
265. E. Stimac and D. R. Morris, J. Cell. Physiol. 133, 590 (1987).
266. J. H. Smith and D. T. Denhardt, J. Cell. Biochem. 34, 13 (1987).
267. D. R. Edwards, P. Waterhouse, M. L. Holman, and D. T. Denhardt, Nucleic Acids Res. 14, 8863 (1986).
268. R. Battini, S. Ferrari, L. Kaczmarek, B. Calabretta, S.-T. Chen, et al., J. Biol. Chem. 262, 4355 (1987).
269. N. Agrwal, J. L. Wang, and P. G. Voss, J. Biol. Chem. 264, 17236 (1989).
270. B. J. Rollins, E. D. Morrison, P. Usher, and J. S. Flier, J. Biol. Chem. 263, 16523 (1988).
271. Y. Hiraki, O. M. Rosen, and M. J. Birnbaum, J. Biol. Chem. 263, 13655 (1988).
272. T. Kitagawa, M. Tanaka, and Y. Akamatsu, Biochim. Biophys. Acta 980, 100 (1989).
273. W. E. Mercer, D. Jaskulski, and M. T. Shields, Exp. Cell Res. 181, 531 (1989).
274. J. M. Almendral, D. Huebsch, P. A. Blundell, H. MacDonald-Bravo, and R. Bravo, Proc. Natl. Acad. Sci. U.S.A. 84, 1575 (1987).
275. H. A. Lane and E. A. Nigg, J. Cell Biol. 135, 1701 (1996).
276. K. Dorfman, D. Carrasco, M. Gruda, C. Ryan, S. A. Lira, and R. Bravo, Oncogene 13, 925 (1996).
Regulation of Translational Initiation during Cellular Responses to Stress
CHARLES O. BROSTROM AND MARGARET A. BROSTROM
Department of Pharmacology, Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, Piscataway, New Jersey 08854
I. Stress Responses and Stress Proteins of Eukaryotic Cells
   A. The Heat Shock Response
   B. ER Function and the ER Stress Response
   C. GRP78, HSC70, and HSP70: Functional and Structural Considerations
II. Regulation of Translational Initiation
   A. Inhibition of Translational Initiation in Response to ER Stressors
   B. Inhibition of Translational Initiation in Response to Cytoplasmic Stressors
   C. Mammalian Enzymes Catalyzing the Phosphorylation of eIF-2α
   D. Translation in Reticulocytes as Compared to Nucleated Cells
III. Translational Accommodation to ER or Cytoplasmic Stress
   A. Accommodation to Depletion of ER Ca2+ Stores
   B. The Role of GRP78
   C. Signaling Systems: Translational Initiation versus the Induction of GRP78
   D. Translational Accommodation to Cytoplasmic Stress
   E. Relationships between the ER and Cytoplasmic Stress Response Systems
   F. HSV-1 Infection and Translational Tolerance to ER Stress
   G. Physiological Relevance of Translational Accommodation
IV. Perspectives and Speculation
References
Abbreviations: [Ca2+]i, cytosolic free Ca2+ concentration; Cbz-Gly-Phe-NH2, benzyloxycarbonyl-glycyl-phenylalanyl-amide; eEF, eukaryotic elongation factor; EGF, epidermal growth factor; eIF, eukaryotic initiation factor; ER, endoplasmic reticulum; GCN2, general control eIF-2α kinase of yeast; GRP, glucose-regulated stress protein; GRP78/BiP, glucose-regulated stress protein 78 or immunoglobulin heavy chain-binding protein; HRI, heme-regulated protein kinase of erythroid cells; HSE, heat shock response elements; HSF, heat shock factor; HSP, heat shock protein; HSV-1, herpes simplex virus 1; IP3, myo-inositol 1,3,4-trisphosphate; Ire1p, Ser-Thr protein kinase activated in response to protein unfolding in the yeast ER; IRF-1, interferon regulatory factor 1; PKR, double-stranded (ds) RNA-activated eIF-2α kinase; PMA, phorbol 12-myristate 13-acetate.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 58. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
Chemicals and conditions that damage proteins, promote protein misfolding, or inhibit protein processing trigger the onset of protective homeostatic mechanisms resulting in "stress responses" in mammalian cells. Included in these responses are an acute inhibition of mRNA translation at the initiation step, a subsequent induction of various protein chaperones, and the recovery of mRNA translation. Separate, but closely related, stress response systems exist for the endoplasmic reticulum (ER), relating to the induction of specific "glucose-regulated proteins" (GRPs), and for the cytoplasm, pertaining to the induction of the "heat shock proteins" (HSPs). Activators of the ER stress response system, including Ca2+-mobilizing and thiol-reducing agents, are discussed and compared to activators of the cytoplasmic stress system, such as arsenite, heavy metal cations, and oxidants. An emerging integrative literature is reviewed that relates protein chaperones associated with cellular stress response systems to the coordinate regulation of translational initiation and protein processing. Background information is presented describing the roles of protein chaperones in the ER and cytoplasmic stress response systems and the relationships of chaperones and protein processing to the regulation of mRNA translation. The role of chaperones in regulating eIF-2α kinase activities, eIF-2 cycling, and ribosomal loading on mRNA is emphasized. The putative role of GRP78 in coupling rates of translation to processing is modeled, and functional relationships between the HSP and GRP chaperone systems are discussed. © 1998 Academic Press
This review deals with an emerging literature that relates protein chaperones associated with cellular stress response systems to the coordinate regulation of translational initiation and protein processing. The editors have offered us an exceptional opportunity to present a focused and personal perspective of this area as opposed to more conventional reviews that require a comprehensive survey of the literature. We have attempted to balance our text so as to provide sufficient perspective on the issues of interest while avoiding entanglement in endless details or erring by being too superficial. Reviewing an integrative set of relationships against the backdrop of an enormous literature has provided a wonderful opportunity to commit errors of omission. Readers may therefore wish to consult various recent reviews pertaining to protein chaperones (1-6), translational control (7-11), and protein processing (12-15) for more extensive information on these subjects. Our interest in the regulation of translation stems from experiments conducted in mid-1982 exploring the turnover of the calmodulin-dependent (type I) form of cyclic nucleotide phosphodiesterase in C6 glial tumor cells. We had previously established procedures utilizing EGTA-buffered media that allowed cytosolic free Ca2+ ([Ca2+]i) to be lowered within 1 min as determined by measurements of Ca2+-dependent cAMP accumulation in C6 glial tumor cells (16). In initial attempts to explore the effect of Ca2+ depletion on the intracellular stability of the phosphodiesterase, it was found that
TRANSLATIONAL INITIATION REGULATION IN STRESS
amino acid incorporations of Ca2+-depleted cells were sharply reduced. Mysteriously, the inhibitory effect of EGTA developed slowly over a period of 15-45 min. This slow onset was ultimately localized to the mobilization of sequestered intracellular Ca2+ from the endoplasmic reticulum (ER). Two-dimensional gel electrophoresis of pulse-labeled proteins indicated that the synthesis of all of the proteins of the cell was markedly inhibited except for a protein now identified as the chaperone, GRP78/BiP. A survey of the literature prior to 1982 revealed a scattering of papers indicating that protein synthesis in various tissues and cell cultures was stimulated by Ca2+ (17-20). The earliest of these studies appeared in 1969 (21), in which 1.3 mM Ca2+ was reported to provide a sixfold stimulation of leucine incorporation in isolated liver cells pretreated in Ca2+-free Hanks solution. Protein synthesis, if viewed as a continuum from mRNA transcription to mRNA translation to posttranslational modification and processing, is an extremely complex process that is sharply suppressed by traumatic conditions. As measured by amino acid incorporation, protein synthesis is strongly inhibited by cell damage; low concentrations of detergents and organic solvents; alterations of osmolarity, pH, and ionic strength; oxidants; elevated temperatures; hypoxia; and various perturbants of ER function. Much of the emphasis of the present review relates to the ER, which is the site of early processing as well as the assembly of multimeric proteins destined either for secretion or targeting to various intracellular structures of the cells of higher organisms. This organelle is also a repository for Ca2+ that is releasable to the cytosol in response to various extracellular stimuli, including neurotransmitters, autacoids, and hormones as well as various chemicals, including ionophores or inhibitors of ER Ca2+ accumulation.
Release of ER Ca2+ is associated with slowed rates of processing of some, but not all, proteins. The ER is also notable for maintaining a redox environment that promotes the formation of disulfide bonds during protein folding. Mild reducing agents abolish this environment and inhibit the processing of proteins that acquire such bonds. The inhibition of protein processing associated with Ca2+ release or with mild reducing conditions results in an acute (within minutes) reduction of amino acid incorporation ranging from 80 to 95% without loss of cellular viability or deterioration of ATP content. Continued (2-3 h) inhibition of processing results in the induction of the ER resident chaperone, GRP78, and a resumption of incorporation. In the following sections, these relationships are highlighted and the putative role of GRP78 in coupling rates of translation to processing is modeled. Evidence is also summarized that one or more HSP chaperones function comparably in the cytosol to regulate mRNA translation. Functional relationships between the HSP and GRP chaperone systems are discussed.
I. Stress Responses and Stress Proteins of Eukaryotic Cells
All cells, including those of multicellular organisms exerting considerable control over their internal environment, are capable of adapting quite rapidly to moderate alterations of metabolic, physical, or environmental conditions. Nucleated cells are recognized to possess protective homeostatic mechanisms that are widely termed the “stress response” and that are activated by a broad range of trauma, including exposure to elevated temperature, heavy metals, viral infection, oxidative free radicals, glucose deprivation, ischemia, ion imbalances, or various toxic chemicals (22). The most rapid aspect of these protective mechanisms is a general suppression of mRNA translation at the initiation step, the details of which are discussed in subsequent sections of this review. The stress response is also integrally associated with the independent induction of two small groups of evolutionarily conserved proteins that are generally termed the “heat shock proteins” (HSPs) and the “glucose-regulated proteins” (GRPs). Induction of the HSPs (the “heat shock response”) is provoked by thermal or chemical insults that damage or cause formation of aberrant cytosolic, nuclear, or mitochondrial proteins, whereas the GRPs are synthesized in response to conditions that foster accumulation of abnormally or poorly processed proteins within the lumen of the ER. The term “proteotoxic” is frequently applied to conditions or agents that promote formation of these unnatural proteins (1). Stress responses are considered diagnostic for determining the extent of proteotoxic damage to tissues or organs, such as heart and brain, during ischemia and reperfusion as well as for identifying cells likely to survive stress. The conferring of protective effects against subsequent proteotoxic damage has been repeatedly associated with induction of particular stress proteins (1, 22-24).
It has been suggested that exploitation of such inductions could reduce injury from trauma during ischemia-reperfusion that accompanies surgery or organ transplantation (22).
A. The Heat Shock Response
An increased synthesis of the HSPs or their mRNAs following exposure to various environmental poisons is considered to be a sensitive biomarker of proteotoxicity. Induction of the HSPs commonly follows the production of damaged or misfolded cytoplasmic (cytosol plus mitochondria) proteins in response to elevated temperature, oxidative stress, and heavy metals or from the synthesis of aberrant proteins in response to amino acid analogs (22, 25, 26). Proteins containing sulfhydryl groups are particularly sensitive to modification. Sodium arsenite, a prominent inducer of the HSPs that produces minimal suppression of viability on short-term exposures, is thought to act largely
by inactivating sulfhydryl groups (27). Mammalian HSPs induced during stress typically include proteins with masses of 110, 90, 72/73, 70, 60, and 30 kDa that have differing subcellular distributions. Latent monomeric heat shock factors (HSFs 1, 2, and 3) are thought to be complexed with HSPs 70 and/or 72/73 in nonstressed cells (28, 29). The depletion of preexisting HSP70s during stress, believed to occur as a consequence of binding to eccentric protein structures, permits the trimerization of HSF to a form that binds to heat shock response elements (HSEs) on DNA such that induction of heat shock mRNAs ensues. HSF is also phosphorylated under certain conditions that signal transcription of the hsp genes, but the role of this modification in signaling is unclear. Activation of heat shock gene transcription also requires factors bound to GAGA and TATA box sequences and a paused RNA polymerase II complex. Particular structural features of mRNAs for the HSPs permit their selective translation in stressed mammalian cells (25). As compared with normal transcripts, mRNAs for the HSPs manifest unusually long 5'-untranslated leader sequences rich in adenosine residues, have little secondary structure, and possess conserved sequences centrally and at their 5' ends. The mammalian HSPs perform essential protein chaperoning functions in both stressed and nonstressed cells (3-5). For example, HSP70 and a closely related form of the protein expressed constitutively in nonstressed cells, variously termed HSC70 or HSP72/73, prevent incorrect folding of polypeptides during synthesis, promote protein assembly, and facilitate delivery of newly synthesized peptides to the mitochondria in an unfolded state for translocation. Current findings support the hypothesis that HSP70 and HSC70 function during proteotoxic stresses to solubilize or refold denatured or aberrant proteins, to deliver them to a degradative system, or both.
The “chaperonin” HSP60 facilitates mitochondrial translocation and subsequent protein folding and maturation, whereas HSP90 stabilizes the high-affinity ligand-binding conformation of steroid hormone and dioxin receptors. An HSP60 homolog (TRiC) and HSP40 are proposed to assist HSC70 in the cotranslational folding of selected proteins in the mammalian cytosol (30, 31). A highly organized chaperone machinery is required for folding of newly synthesized firefly luciferase in the reticulocyte lysate (30). HSPs can be recognized by the immune system as major antigens produced by stressed infectious organisms, but may also serve to chaperone autoantigens during their presentation in certain autoimmune disorders (32).
B. ER Function and the ER Stress Response
1. THE ER AS AN INTEGRATING ORGANELLE
Morphologically, the ER is composed of a convoluted, bilayer membrane sheathing a continuous luminal or cisternal space that occupies up to 10% or
more of the total cell volume. The ER is conventionally recognized to consist primarily of the “rough” or ribosomally decorated region and the “smooth” or ribosome-free region. The rough ER is acknowledged to be continuous with the outer membrane of the nucleus and therefore accesses the deepest recesses of the cell. Much of the rough ER occurs in platelike folds extending throughout much of the cytosol, with an intraluminal space of 20-30 nm. Periodically these folds taper into elements of the smooth ER, a dynamic network of anastomosing tubules 30-60 nm in diameter that tend to localize to the more superficial regions of the cell (33). In addition to these traditional divisions, various subregions of the ER have been described possessing cisternae associated with glycogen particles, mitochondria, cytoskeletal components, and the plasmalemma. Vesicles that transport proteins for processing to the Golgi are derived from and represent another ER compartment. Specialized cell types differ markedly in their ER content and structure. Acinar cells, for example, possess almost no smooth ER but extraordinary amounts of rough ER, whereas neurons have a relatively high content of smooth ER. Functions commonly associated with the ER include early protein processing, phospholipid and membrane synthesis, and the early steps of steroid biosynthesis, as well as the oxidative metabolism of hydrophobic molecules, including many drugs and toxic substances. In addition, the ER is an intracellular repository for Ca2+ releasable in response to extracellular stimuli. It is apparent, therefore, that the ER is a highly complex organelle that infiltrates and organizes the cytoplasmic space of the cell and interfaces with and produces much of the structural material comprising other organelles. Overall, the ER possesses the structural and functional properties that would be expected for an organelle supporting the integration and coordination of major cellular processes.
A wealth of literature supports the central role of ER releasable Ca2+ in intracellular signaling related to stimulus-response coupling (34-43). Prominent processes regulated by the cation include secretion, membrane transport and permeability, glycogen metabolism, and muscle contraction. As [Ca2+]i in response to a stimulus rises severalfold from resting values near 0.1 µM, the cation binds to high-affinity Ca2+ receptor proteins, such as calmodulin, permitting the activation of various enzymatic processes. Contributions of Ca2+ to the cytosolic pool are derived from the extracellular fluid and from intracellular sites of storage. Ca2+ entry across the plasmalemma involves voltage- and/or ligand-gated Ca2+ channels (37, 39) and is driven by concentration gradients on the order of 10⁴. Ca2+ efflux is supported by a Na+/Ca2+ antiport and by active transport by Ca2+-selective ATPases. The relative contribution of ER-sequestered Ca2+ from one cell type to another during Ca2+ signaling is a subject of considerable uncertainty. ER
stores of Ca2+ are well established to be released by myo-inositol 1,4,5-trisphosphate (IP3) generated in response to hormonal or other stimuli (44-46). Total ER Ca2+ is commonly estimated to range from 1 to 5 mM. The free Ca2+ concentration of the ER is much lower, however, since much of the cation is bound to matrix proteins of high capacity, but of relatively low affinity. It is reasonable to suspect that localized concentrations of Ca2+-binding proteins may support differences in Ca2+ distribution in various parts of the ER. Local release of Ca2+ to the cytosol in response to hormonal action is thought to occur at superficial layers of the cell from sites of concentration adjacent to the plasmalemma (47). Ca2+ accumulation, however, occurs throughout the organelle (48). It is tempting to speculate that Ca2+ penetrating to the deeper layers of the cell is recovered by the rough ER for subsequent return to concentrating sites associated with the smooth ER. Release of Ca2+ from binding sites would buffer ER free Ca2+ during periods of Ca2+ mobilization to the cytosol. In GH3 pituitary cells approximately 40% of the total cell-associated Ca2+ is released in response to IP3 and other Ca2+-mobilizing agents that appear to act selectively on the ER (49).
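The concentrations quoted in this subsection can be put on a common scale with a few lines of arithmetic. The following Python sketch is illustrative only: the constant and function names are hypothetical, the extracellular value is merely what the quoted resting concentration and gradient imply when multiplied together, and the ER figure is total (largely protein-bound) Ca2+, so its ratio to cytosolic free Ca2+ should not be read as a free-ion gradient.

```python
# Values taken from the text; names are hypothetical and chosen for illustration.
CYTOSOLIC_FREE_CA_M = 0.1e-6        # resting [Ca2+]i, about 0.1 micromolar
PLASMALEMMA_GRADIENT = 1e4          # "on the order of 10^4"
ER_TOTAL_CA_RANGE_M = (1e-3, 5e-3)  # total ER Ca2+, 1 to 5 mM (mostly bound)


def implied_extracellular_ca(cytosolic_m, gradient):
    """Extracellular Ca2+ implied by a resting cytosolic value and a fold gradient."""
    return cytosolic_m * gradient


def fold_over_cytosol(concentration_m, cytosolic_m):
    """Ratio of any Ca2+ concentration to the resting cytosolic free concentration."""
    return concentration_m / cytosolic_m


extracellular = implied_extracellular_ca(CYTOSOLIC_FREE_CA_M, PLASMALEMMA_GRADIENT)
er_low, er_high = (fold_over_cytosol(c, CYTOSOLIC_FREE_CA_M) for c in ER_TOTAL_CA_RANGE_M)

print(f"implied extracellular Ca2+: {extracellular * 1e3:.2f} mM")
print(f"total ER Ca2+ relative to resting cytosol: {er_low:.0f}- to {er_high:.0f}-fold")
```

On these figures, total ER Ca2+ lies in the same millimolar range as the extracellular concentration implied by the plasmalemmal gradient, which is consistent with the statement that ER free Ca2+ must be much lower than the total.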
2. PROTEIN PROCESSING IN THE ER
The ER functions centrally in the co-translational translocation, folding, and processing of newly synthesized secretory, lysosomal, and integral membrane proteins. While both the rough and smooth ER overlap considerably in their protein content, proteins concerned with ribosomal docking and protein processing localize to the rough ER (34, 35). After entry into the ER lumen, proteins must fold correctly in order to exit the organelle. Most secretory and transmembrane proteins are processed and assembled into large complexes in the ER prior to export to other subcellular compartments. Posttranslational modifications occurring in the organelle include proteolytic cleavage of signal sequences, transfer of core oligosaccharide to selected asparagine residues, trimming of oligosaccharide side chains, formation of disulfide linkages, peptidyl-prolyl cis-trans isomerization, and oligomerization. Protein folding, however, is generally thought to comprise the rate-limiting step for ER-to-Golgi transport (2, 50). The ER lumen provides a milieu highly conducive to protein folding. A moderately oxidizing environment is achieved through maintenance of increased concentrations of oxidized relative to reduced glutathione moieties (51). Co-translational formation of disulfide bonds in the ER, catalyzed by protein disulfide isomerase, consists of rapid nonenzymatic oxidation followed by slower thiol-disulfide oxidoreductions and requires the oxidizing environment of the ER lumen (2, 50). Ca2+ sequestered by the ER, while not conventionally regarded as regulatory, is established to sustain such early protein processing events as oligomerization of viral glycoproteins (52), folding of asialoglycoprotein receptor subunits (53, 54), and trimming of oligosaccharide side chains (55-57). Degradation of incompletely assembled or abnormal proteins in the ER lumen is also strongly influenced by stored Ca2+ (54, 58, 59). Chaperones that expedite co- and posttranslational protein folding, including GRP78/BiP, GRP94, calnexin, and calreticulin, are present at high concentrations in the ER (23, 24, 60-62). Most chaperones are capable of binding Ca2+ at the relatively high free concentrations believed to prevail in the ER, and some appear to require the cation in promoting protein processing (60).
3. THE MAMMALIAN ER STRESS RESPONSE
Identifiable by enhanced expression of genes encoding the GRPs, the ER stress response is ordinarily provoked by conditions that retard or disrupt normal ER protein processing. Such conditions include glucose deprivation or tunicamycin treatment, depletion of stored Ca2+, introduction of a reducing environment, viral infection, and overexpression of normal or abnormal secretory proteins (22-24). Depletion of ER Ca2+ stores impairs removal of mannose residues from glycoproteins, assembly of viral glycoproteins, and folding of receptor molecules (52, 53, 57). Tunicamycin and glucose deprivation limit protein glycosylation (63), and thiol-reducing agents prevent appropriate disulfide bond formation (51, 56). Disulfide-bonded intermediates observed during the processing of certain glycoproteins are not formed when Ca2+-dependent folding is suppressed (53,Sa). It is reasonable to suspect that Ca2+ should exert additional actions on protein folding and oligomerization beyond those selectively affecting glycoproteins. The most prominently induced of the GRPs are 78- and 94-kDa species that share amino acid homology with HSP70 and HSP90, respectively, but localize to the ER lumen. ERp72, which possesses proteolytic and protein disulfide isomerase activity (64, 65), and GRP58, an ER component of unknown function, are induced less extensively. GRP78, also termed BiP, has been hypothesized to function in the correct folding and assembly of proteins at the earliest site of protein processing (66) and in the retention of improperly folded proteins that accumulate within the ER lumen when processing is distressed (50). It is thought to facilitate the co-translational translocation of peptides from the ribosome to the ER (67, 68). GRP94 is a glycoprotein proposed to serve as a luminal chaperone for partially oxidized intermediates and as a low-affinity, high-capacity Ca2+-binding protein (24, 69).
GRP78 is essential for the maintenance of viability in yeast (70), and the induction of the GRPs appears necessary for mammalian cell survival during persistent ER stress (71). Genes for the mammalian GRPs possess highly conserved promoter regions that confer ER stress inducibility and bind specific transcription factors (72-74). Two regulatory elements of the rat grp78 promoter have been identified, one containing the grp core element conserved from yeast to human and a second containing a CCAAT motif and proximal to the TATA element. Both mediate GRP78 inducibility in response to mobilization of sequestered Ca2+ stores (75). Specific changes in factor occupancy occurring after stress are observed within the grp core element (76). The factor that binds to this region under nonstressed conditions either undergoes a conformational change or dissociates from inhibiting elements during stress, resulting in the alterations observed in the in vivo footprinting pattern. In certain, but not all, cultured cell types, elevation of cAMP or treatment with phorbol esters amplifies the effects of ER stress on grp78 expression (77-79). Amplification involves increased grp78 transcriptional rates rather than enhanced message stability (78). A nucleotide sequence homologous to the cAMP-responsive element consensus potentially exists in the grp78 promoter region. Activation of the transcription of either the hsp or the grp group of genes does not preclude activation of the other group. Conditions have been described wherein both groups of genes are transcribed concurrently (22, 80). Following induction of the HSPs, mammalian cells have been shown to remain susceptible to induction of the GRPs and vice versa (79). Unlike the heat shock response, however, the ER stress response exhibits a protein synthesis requirement. Induction of grp78 mRNA by a Ca2+ ionophore or thiol-reducing agent is decreased in cells pretreated with cycloheximide or puromycin (81-83). Maximal degrees of transcriptional suppression require that translation be fully (>99%) inhibited for approximately 1 h prior to imposition of an ER stressor (83). In contrast, release from a prolonged (17-h) elongation blockade results in a severalfold induction of grp78 mRNA without the addition of ER stressors (83).
Such cells behave as though protein processing capacity has eroded to the point of being inadequate to support the resumption of translation. The accumulation of underprocessed intermediates under this condition must therefore be signaling a stress response. The unusually long 5'-untranslated and unique 3'-untranslated regions of grp78 mRNA may confer unique regulatory properties on GRP78 expression at the translational step (84). The chaperone is induced at the translational level during poliovirus infection (85), a condition under which initiation on other cellular mRNAs cannot occur. Sucrose density gradient analyses (78) also revealed that grp78 mRNA in stressed cells is largely associated with mono- and polysomal fractions rather than ribonucleoprotein particles, a distribution different from actin and tubulin mRNAs. These findings are entirely consistent with the finding (86) that initiation on grp78 mRNA can occur by a cap-independent, internal ribosome-binding mechanism.
4. THE “UNFOLDED PROTEIN RESPONSE”
The mechanisms through which mammalian cells monitor the status of protein folding in the ER and then convey information regarding this status to the nucleus are poorly understood. In Saccharomyces cerevisiae, accumulation of unfolded proteins in the ER lumen signals increased expression of genes encoding GRP78 (87, 88), protein disulfide isomerase (89, 90), and peptidyl-prolyl cis-trans isomerase (91). A 22-base pair element, referred to as the unfolded protein response element, of a cis-acting promoter was recently found to be necessary and sufficient to activate transcription of genes encoding yeast GRP78 (92) and peptidyl-prolyl cis-trans isomerase (91). Yeast mutants unable to induce the unfolded protein response have now been isolated and analyzed. A defective gene, named IRE1 due to identity with a gene required for inositol autotrophy and ERN1 for ER-to-nucleus signaling pathway, was identified independently in two laboratories (90, 93). IRE1 encodes a transmembrane protein of 1115 amino acids with a luminally oriented N-terminal region, a short domain spanning the ER membrane once, and a C-terminal region facing the cytosol. A region of the cytosolic domain exhibits significant homology to catalytic domains of known Ser-Thr protein kinases, particularly CDC28 (94). Ire1p is similar in structure to class I growth factor receptor kinases. Its cytosolic domain was recently found to possess intrinsic Ser-Thr kinase activity and to contain Ser-Thr phosphorylation sites; oligomerization and trans-phosphorylation were essential for kinase activation in vitro (95). Ire1p was therefore proposed to function as the proximal sensor for unfolded proteins in the yeast ER lumen. It is thought to phosphorylate a trans-acting factor that binds to the unfolded protein response element to induce transcription of yeast genes encoding the GRPs. The mammalian counterpart of IRE1 has not yet been characterized.
Current models (70), supported by studies in both yeast and mammalian cells (96-98), propose that Ire1p is inhibited by binding GRP78 in competition with the binding of the chaperone by unfolded ER proteins. Competitive binding of GRP78 by an accumulation of unfolded proteins within the ER therefore permits the activation of Ire1p and the signaling of grp78 transcription. The observation that translational inhibitors suppress the transcription of mammalian grp78 in response to ER perturbants is explicable by these models. Cycloheximide would be anticipated to prevent the accumulation of unfolded peptide intermediates capable of binding GRP78 in the stressed ER. Alternatively, the hypothesis has not been excluded that grp78 transcription requires the continuous synthesis of a cytosolic protein factor possessing a rapid turnover rate. Hypothetically this factor would undergo modification at the ER membrane, as, for example, through phosphorylation by Ire1p, and subsequently transport the signal for grp78 induction to the
nucleus. A third possibility is that signaling for the induction of grp78 may focus on covalently modified form(s) of GRP78 (see later) that accumulate in the mammalian ER during periods of reduced processing requirements, as, for example, when cells are exposed to inhibitors of elongation.
C. GRP78, HSC70, and HSP70: Functional and Structural Considerations
The most highly expressed of the mammalian stress proteins, GRP78 and HSC70, associate with “partner proteins” (99) that permit the occurrence of specific chaperone functions within their respective subcellular compartments. GRP78 cannot substitute for HSC70 in binding to peptide sequences that target proteins for lysosomal degradation (100) or in facilitating in vitro translocation of peptides into mammalian microsomes (101). Although the amino acid sequences of inducible HSP70 and HSC70 differ by only 2-3%, their clathrin uncoating activity is markedly different (102). Such differences indicate that members of the HSP70 family of chaperones are not functionally interchangeable. In vitro, however, GRP78 and HSC70 exhibit both common and exclusive peptide-binding specificities that are highly sensitive to peptide sequence (103). GRP78, HSC70, and HSP70 interact with protein substrates in a nucleotide-dependent manner and can oligomerize. Extended peptides enriched in hydrophobic residues are bound and released rapidly from chaperones complexed with ATP, as compared to a slow release when complexed with ADP (104-106). An ATP-dependent conformational change, rather than ATP hydrolysis, fosters peptide release (107). ATP and ADP also affect the oligomerization of HSP70 chaperones. In vitro, ADP promotes formation of HSC70 dimers and oligomers while ATP favors monomerization (108-110). Binding, but not hydrolysis, of ATP is necessary and sufficient for stabilization of HSC70 monomers (110). Oligomerization of HSC70 was suggested to occur via the peptide-binding site in view of the inhibition of oligomer formation by protein substrates (110). Monomeric HSC70 was recently found to bind peptide and protein substrates more than 10-fold more strongly than oligomeric chaperone (111). It was proposed that, in vivo, oligomerization favors storage of the HSP70 chaperones in the inactive form.
It is presently unclear to what degree oligomerized GRP78 retains the ability to bind peptide substrates in vitro and to undergo monomerization by ATP. In cells accumulating nonprocessible proteins, GRP78 was found to be primarily monomeric, and only the monomeric form was found in association with substrate (112). In resting cells, however, GRP78 was mostly oligomerized and substrate free. Resting cells and cells induced to overexpress GRP78 nonetheless maintained similar amounts of free chaperone.
GRP78 differs from its homologs in that it is subject to modification in vivo by ADP ribosylation and phosphorylation (113, 114). The percentage of GRP78 in the modified state is increased by cycloheximide or by amino acid starvation and is decreased by tunicamycin, glucose analogs, and growth factors (112, 114-116). Both ADP ribosylation and phosphorylation are restricted to oligomeric chaperone; GRP78 bound to other proteins is not covalently modified (112). Posttranslational modifications occur upon release of GRP78 from associated proteins and are reversed upon accumulation in the ER of nontransportable proteins. GRP78 has therefore been proposed (112) to exist in complexes of the monomer with protein substrates, as free unmodified monomers, or as free modified oligomers. It is notable that certain conditions, such as treatment with cycloheximide, promote covalent modification of the chaperone and suppression of grp78 transcription. Other conditions, such as tunicamycin treatment, result in the removal of existing modifications and increased GRP78 expression. These findings raise the fascinating, but currently unexplored, possibility that modification of GRP78 serves as part of a unique mechanism through which the cell detects alterations in the functional status of the ER.
II. Regulation of Translational Initiation
Translational initiation involves a thoroughly reviewed series of steps designed to assemble and position the ribosome at the AUG start codons of mRNA (7-11, 117, 118). These events ordinarily proceed through mRNA m7GpppG 5'-end cap-dependent association and more rarely by internal loading of ribosomes. Internal ribosomal loading is thought to occur on certain mRNAs with unusually long 5' untranslated regions, such as that for GRP78 (84-86), various HSPs (119), and some virally derived messages (103). Internal loading would be anticipated to avoid much of the regulation of translational initiation encountered by mRNAs that load ribosomes through the cap-dependent mechanism. Cap-dependent loading involves binding of an initiator Met-tRNAi to the 40S ribosomal subunit, subsequent association of this complex with the mRNA 5'-m7GpppG cap, migration to (scanning) an AUG initiator codon, and complexing with a 60S ribosomal subunit to form an 80S ribosome capable of polypeptide chain elongation. These events are catalyzed through the agency of a dozen or more initiation factors (eIFs) that associate with and disassociate from the preinitiation complex at various points of the process. A number of these factors are heteromeric and are subject to alterations of catalytic activity through protein phosphorylation. Control of translational initiation is concerned with the selection frequency of specific mRNAs to be utilized and with alterations in overall rates
of ribosomal loading onto the messages. Selection frequency governs the relative amounts of each protein to be synthesized and depends on such factors as the relative abundance of each mRNA population, which is a function of rates of transcript turnover; on the ease of mobilizing the mRNA from ribonuclear particles into polysomes; and on the rate of recruitment of additional ribosomes onto preexisting polysomes. A primary consideration in both mRNA utilization and ribosomal loading relates to localized loss of secondary structure (melting) of mRNA in the region of the cap sufficient to expose ribosomal binding sites (7-11, 117, 118). Melting is associated with the overall action of the eIF-4 factors 4E, 4G, 4A, and 4B. Interaction of the cap-binding protein eIF-4E is influenced by the 5' leader structure of mRNA such that differential rates of utilization occur with different species of mRNA; eIF-4E is also thought to promote internal ribosomal loading. The activities of various components of the eIF-4 complex, including 4E and 4G, are increased by protein phosphorylation occurring in response to mitogens and hormonal growth factors. Two factors, 4E-BP1 and 4E-BP2, have been reported to exist that modify the activity of eIF-4E (118). The dephosphorylated form of 4E-BP1 is an inhibitor of eIF-4E. Phosphorylation and dissociation of 4E-BP1 occur in response to growth promoters such as insulin and EGF, whereas certain viral infections, including polio and encephalomyocarditis virus, result in dephosphorylation of this protein (119). A number of protein kinase activities participate in these phosphorylations in cell-free preparations, including protein kinase C, cAMP-dependent protein kinase, S6 ribosomal protein kinase, and certain kinases for which casein serves as a substrate, but the endogenous protein kinase remains to be clearly identified. Phosphorylation of either 4E or 4G is associated with increased rates of initiation (10).
A second major regulatory point in translational initiation relates directly to ribosomal loading and involves the priming of the 40S ribosomal subunit with initiator tRNA. In conjunction with GTP, eIF-2 mediates binding of the initiator tRNA (Met-tRNAf) to the 40S ribosomal subunit. The resultant 43S preinitiation complex joins with the 60S ribosomal subunit to form a monosome capable of translation; the eIF-2-associated GTP is converted to GDP, and the factor dissociates from the ribosome. The binary eIF-2-GDP complex is inactive in supporting further initiation until eIF-2-GTP is reformed via the catalytic exchange of GDP for GTP. This exchange is accomplished by eIF-2B, a factor typically present at low stoichiometric ratios with respect to eIF-2. The affinity of eIF-2B for eIF-2 is greatly increased by phosphorylation of Ser-51 of the α subunit of eIF-2 by specific eIF-2 kinase activities. Increased phosphorylation of eIF-2α is recognized to mediate the translational repression occurring in mammalian cells subjected to a wide variety of physical, chemical, and nutritional stresses (7-11). A
CHARLES O. BROSTROM AND MARGARET A. BROSTROM
20-30% increase in the phosphorylation of eIF-2α is frequently adequate for sequestration of eIF-2B into an inactive complex such that recycling of eIF-2 cannot occur (120, 121). However, cells vary in their relative contents of eIF-2B and eIF-2 and, therefore, in the degree of eIF-2α phosphorylation required for translational suppression. For example, the ratio of eIF-2B to eIF-2 is reportedly 0.6 in liver as opposed to 0.3 in reticulocytes (122). There is reason to suspect that eIF-2B may be subject to additional regulatory inputs. The catalytic activity of the factor may be affected by phosphorylation of the ε subunit or allosterically through the binding of various adenine nucleotides, including ATP and NAD (122, 123).
The rate of ribosomal loading in conjunction with the rate of peptide bond formation determines the overall rate of translation and the polysomal content of cells. Polysome content and size tend to increase whenever the rate of peptide chain elongation is rate limiting with respect to initiation. While some additional recruitment of mRNA from mRNP into polysomes may occur through the phosphorylation and activation of eIF-4 in response to mitogens or growth factors, it is unclear whether sufficient initiation factors and ribosomes are generally available to support large increases in initiation through this mechanism. Growth factors also have the potential to alter peptide chain elongation rates through the phosphorylation and activation of eEF-1 and the phosphorylation of S6 ribosomal protein (8). Overall promotion of translation by these substances, as measured by amino acid incorporation, is on the order of 30-100% in intact cells depending on cell type. In contrast, the phosphorylation of eIF-2α in response to various chemical stressors and viral infections inhibits initiation by approximately 85-95%.
The eIF-2 input into the regulation of translational initiation represents an adjustable braking system whereby polypeptide synthesis can be slowed across a range of values extended to extremely slow basal rates. This control point therefore constitutes a logical site for coordinating rates of synthesis with posttranslational protein processing. Potential regulation emanating through protein phosphatase activities for the various initiation factors remains largely unexplored.
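The stoichiometric argument above (a modest degree of eIF-2α phosphorylation sufficing to tie up a scarce pool of eIF-2B) can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: it supposes that each phosphorylated eIF-2 molecule sequesters one eIF-2B molecule essentially irreversibly, and it treats the quoted percentages as the fraction of eIF-2α that is phosphorylated. The function name and the 1:1 tight-binding model are hypothetical and are not taken from the cited studies; only the eIF-2B:eIF-2 ratios (0.6 for liver, 0.3 for reticulocytes) come from the text.

```python
def free_eif2b_fraction(ratio_2b_to_2: float, phospho_fraction: float) -> float:
    """Fraction of eIF-2B left free for GDP/GTP exchange.

    Assumption (illustrative only): every phosphorylated eIF-2 molecule
    traps one eIF-2B molecule until the eIF-2B pool is exhausted.
    ratio_2b_to_2    -- molar ratio of eIF-2B to eIF-2 in the cell
    phospho_fraction -- fraction of eIF-2 alpha that is phosphorylated (0..1)
    """
    if ratio_2b_to_2 <= 0:
        raise ValueError("eIF-2B:eIF-2 ratio must be positive")
    if not 0.0 <= phospho_fraction <= 1.0:
        raise ValueError("phosphorylated fraction must lie between 0 and 1")
    # Amount of eIF-2B sequestered, expressed per molecule of eIF-2.
    sequestered = min(phospho_fraction, ratio_2b_to_2)
    return 1.0 - sequestered / ratio_2b_to_2


# Ratios quoted in the text for liver and reticulocytes (122).
for tissue, ratio in (("liver", 0.6), ("reticulocyte", 0.3)):
    for phospho in (0.1, 0.2, 0.3):
        free = free_eif2b_fraction(ratio, phospho)
        print(f"{tissue:<13} eIF-2(alpha-P) = {phospho:.0%}  free eIF-2B = {free:.0%}")
```

Under these simplifying assumptions, phosphorylation of 30% of eIF-2α would sequester the entire eIF-2B pool of a reticulocyte-like cell but only half of the pool of a liver-like cell, which is the sense in which cells differ in the degree of phosphorylation required for translational suppression.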
A. Inhibition of Translational Initiation in Response to ER Stressors
A substantial body of evidence links depletion of ER-sequestered Ca2+ to the inhibition of translational initiation in a variety of cell types (48). Depletion of ER-sequestered cation as measured by 45Ca efflux occurs in response to (a) hormones generating IP3; (b) chelating agents, such as EGTA, acting as extracellular extractants; (c) thapsigargin, a sesquiterpene lactone that specifically blocks active transport of Ca2+ into the organelle (124); and
(d) various agents fostering the passage of Ca2+ across the ER membrane to the cytosol. These latter agents include the divalent cation ionophores ionomycin and A23187, arachidonic acid, and various hydrophobic peptide analogs such as Cbz-Gly-Phe-NH2 (49, 125-127). Amino acid incorporation as a function of time appears to be unaffected by these agents until the ER becomes substantially depleted of Ca2+.
The effects of EGTA on translation have been studied in particular detail. EGTA effectively reverses Ca2+ concentration gradients such that extracellular free Ca2+ is low in comparison to [Ca2+]i. Cells that are exposed to EGTA-buffered medium normally exhibit an extremely rapid fall (seconds) of [Ca2+]i as monitored with Ca2+ fluorescent dyes (128). This rapid fall is driven by the active transport of Ca2+ from the cytosol by plasmalemmal pumps. The affinity of these pumps for Ca2+ is comparable to that of the ER Ca2+ transport system that accumulates the cation within the organelle (129). With continuing exposure to EGTA, intracellular sequestered Ca2+ is gradually depleted by spontaneous release to the cytosolic pool. Some cell types, such as GH3 pituitary cells, release their Ca2+ stores relatively rapidly and become largely depleted within 15-30 min, whereas other cell types, such as HeLa, HepG2 liver cells, and CHO cells, are relatively resistant to depletion by this procedure. Amino acid incorporation into nearly all protein populations in normal rat hepatocytes (130), C6 glial tumor cells (134), and GH3 pituitary cells (132-136) was found to be inhibited 80-90% upon depletion of intracellular Ca2+ pools with EGTA-buffered media within 30-45 min. The synthesis of peptide hormones destined for secretion was also inhibited (133). Addition of 1 mM Ca2+ in excess of chelator restored the rate of protein synthesis within 7-10 min to that of nondepleted control preparations.
Ca2+ specifically among physiologically occurring cations restored amino acid incorporation over a broad range of Mg2+, Na+, and K+ concentrations, pH, and osmolarity in both minimal and enriched media either with or without sera. The effects of Ca2+ depletion were not traceable to changes in amino acid uptake, aminoacylation of transfer RNA, RNA synthesis, protein catabolism, removal of cells from growth surfaces, or changes in viability as measured by dye exclusion, replating, and determinations of ATP and GTP contents. Ca2+ depletion with EGTA results in the disappearance of polysomes and an accumulation of monosomes and ribosomal subunits typical of slowed rates of translational initiation, as well as a large decrease in 43S preinitiation complex (40S·eIF-2·Met-tRNAf·GTP) (135). The methionylation of tRNAiMet was not altered. Re-introduction of Ca2+ rapidly (minutes) restored cellular contents of 43S ribosomal preinitiation complex and polysomes with corresponding decreases in monosomes and ribosomal subunits. Comparable polysomal profiles were found for Ca2+-depleted and restored cells exposed to cycloheximide, a reversible inhibitor of elongation. The introduction of cycloheximide to Ca2+-depleted cells depressed peptide chain elongation to rate-limiting values with respect to rates of translational initiation. Average ribosomal transit times for both Ca2+-depleted and restored cells were identical and were extended in parallel as a function of increasing cycloheximide concentration, indicating that neither peptide chain elongation nor termination was directly affected by depletion-repletion of the cation. Lysates of C6 or GH3 cells exhibited amino acid incorporation that was proportional to the polysomal contents derived from the original intact cell preparations. These lysates lacked the ability to initiate new peptide synthesis, and their activity was not directly affected by the addition of Ca2+ or EGTA. Lysates derived from either Ca2+-depleted or restored cells that had been treated with cycloheximide possessed identical elongation activities.
Ionomycin and A23187 (133), thapsigargin (138), Cbz-Gly-Phe-NH2 (49, 123), and arachidonate (126) mobilize ER-sequestered Ca2+ by mechanisms different from that of EGTA but produce comparable alterations of translational initiation. These agents mobilize Ca2+ and inhibit initiation relatively more rapidly (within 6 to 12 min) than EGTA. Maximal inhibition of amino acid incorporation occurs more rapidly in cells exposed to combinations of these agents. Cell types such as HeLa, CHO, and HepG2, which are resistant to translational inhibition by EGTA, are responsive to Ca2+ ionophores, especially when added in combination with EGTA. With the exception of thapsigargin (138), which is an irreversible inhibitor of ER Ca2+ accumulation, the effects of all of the various Ca2+-mobilizing agents are reversed within several minutes by the addition of supraphysiological Ca2+ concentrations to the extracellular medium.
For example, inhibitions of protein synthesis produced by low concentrations of ionophore at low extracellular Ca2+ concentrations were reversed by adjustment to high extracellular Ca2+. Hormones that mobilize ER-associated Ca2+ from specific cell types were found to inhibit protein synthesis in a manner reversed by the addition of high extracellular Ca2+. Thyrotropin-releasing hormone, which acts on GH3 cells in part via the generation of IP3 (123), potentiated the inhibitory action of EGTA at early times of exposure (132). Angiotensin II, vasopressin, and α-adrenergic agonists, which mobilize sequestered Ca2+ from the ER to the cytosol in hepatocytes, inhibited amino acid incorporation in isolated hepatocytes and reduced the polysome contents of rat liver perfused at physiological Ca2+ concentrations and pH (130, 139, 140).
While optimal concentrations of the various Ca2+-mobilizing agents confer comparable alterations of translational initiation, such drugs can produce a variety of effects on intact cells as a function of increasing concentrations. EGTA appears to increase the fragility of cells to physical manipulation and, in particular, to collection by centrifugation. Ionophores A23187 and ionomycin promote Ca2+ efflux within 6-8 min and inhibit translational initiation at low (50-1000 nM) concentrations in cultured mammalian cells without reducing ATP or GTP contents (56, 138, 141). At higher concentrations these agents promote Ca2+ influx superimposed on declining ATP and GTP contents (141). Arachidonate at moderate concentrations (~10 µM) inhibits translational initiation within about 4 min with approximately a 40% decline in nucleotide triphosphate contents (126, 138). At twofold higher concentrations, ATP declines by roughly 60% (138) and peptide chain elongation is inhibited. These effects on elongation are mimicked by low concentrations of Lubrol-PX, a nonionic detergent (E. I. Rotman, M. A. Brostrom, and C. O. Brostrom, unpublished). Cbz-Gly-Phe-NH2, originally described as an endopeptidase inhibitor (142), is an effective Ca2+-releasing agent that inhibits initiation at relatively high concentrations (1-2 mM). These concentrations of the drug lower ATP and GTP contents by 15-30%. Thapsigargin is a highly potent inhibitor of initiation (low nanomolar range) that does not lower nucleotide contents.
Disruption of the redox environment of the ER with mild reducing agents such as dithiothreitol, which does not affect Ca2+ sequestration, inhibits translational initiation in a rapid fashion comparable to that of Ca2+-mobilizing agents (143, 144). Inhibition occurs within 10 min at relatively low concentrations of reducing agent (50-500 µM), the synthesis of almost all proteins is suppressed by 80-90%, and polysomal contents are abolished in a cycloheximide-reversible fashion. These concentrations of dithiothreitol have not been found to lower ATP in tissue culture cells incubated with media containing high-concentration (24 mM) glucose (56). Amino acid incorporation returns to the original rate upon restoration of the cells to fresh medium without dithiothreitol, and the cells retain their viability.
As noted earlier, both dithiothreitol and Ca2+-mobilizing agents promote comparable inductions of GRP78 and inhibit the processing of various proteins. The inhibition of translational initiation associated with Ca2+-mobilizing agents or reducing agents corresponds temporally with the phosphorylation of eIF-2α. Incubations of intact GH3 pituitary cells with various Ca2+-mobilizing agents or with dithiothreitol produced an average fivefold increase in the amount of phosphorylated eIF-2α and a 50% reduction in eIF-2B activity (143, 144). Alterations in eIF-2α phosphorylation and translational activity in response to EGTA were reversed by addition of Ca2+ in excess of chelator, while responses to dithiothreitol were reversible by washing. Phosphorylation of eIF-2α in response to ionophores or dithiothreitol was not prevented by conventional inhibitors of translational elongation, including cycloheximide, puromycin, and verrucarin, or the initiation blocker pactamycin. While a flow of processible protein to the ER does not appear to be essential for the phosphorylation of eIF-2α in response to ER stressors,
the sensitivity to these agents is sharply decreased for up to 1 h following addition of cycloheximide (83). Thapsigargin also stimulates eIF-2α phosphorylation and inhibits eIF-2B activity (143, 145). The inhibition of translational initiation in vasopressin-treated livers is also associated with eIF-2α phosphorylation and the inhibition of eIF-2B activity (146). Tunicamycin, an inhibitor of core oligosaccharide biosynthesis, caused eIF-2α to be phosphorylated and translational initiation to be inhibited, whereas sugar analogs that inhibit posttranslocational glycoprotein processing did neither (57, 144). Included among these inhibitors were 1-deoxynojirimycin, an inhibitor of glucosidases I and II; 1-N-methyl-deoxynojirimycin and castanospermine, which inhibit glucosidase I; 1-deoxymannojirimycin, which inhibits ER and Golgi α1,2-mannosidases; and swainsonine, which inhibits Golgi mannosidase II. Brefeldin A, a fungal metabolite that causes rapid transport of cis-, medial-, and trans-Golgi enzymes to the ER, was devoid of effects on eIF-2α phosphorylation for periods of up to 1 h (83).
As noted earlier, the regulatory actions of Ca2+ are conventionally viewed as involving an increase in [Ca2+]i that is derived either from plasmalemmal influx or from mobilization of ER-sequestered cation (34-47). In accord with such modeling, Ca2+-dependent phosphorylation is observed for a variety of cytoplasmic proteins that frequently act as regulatory enzymes for specialized cellular functions. However, phosphorylation in response to Ca2+ depletion by EGTA, as is observed for eIF-2α, is uncommon. A 26-kDa ribosome-associated protein of unknown function also behaves in this manner (147). While protein dephosphorylations accompanying declining [Ca2+]i could conceivably suppress translational initiation, evidence of such regulation is lacking. Phosphorylation of eIF-4E and 4E-BP1 correlates with enhanced rates of protein synthesis and growth in mammalian cells (118, 119).
The phosphorylation state of eIF-4E in Ca2+-depleted GH3 cells was not altered from that of nondepleted controls (147). Phosphorylation of ribosomal protein S6, which is observed to correlate with increased formation of ribosome initiation complexes, with ribosomal entry into elongation, and with increased growth (148), was also unaffected by cellular Ca2+ depletion.
B. Inhibition of Translational Initiation in Response to Cytoplasmic Stressors
As detailed in various reviews (8, 149), eukaryotic cells undergo an acute suppression of mRNA translation during moderate elevations of their optimal ambient temperature, in response to challenge with proteotoxic agents that damage cytoplasmic proteins, or in response to incorporation of amino acid analogs that result in the synthesis of aberrant proteins. Proteotoxic agents include sulfhydryl poisons, heavy metal cations, and chemicals acting
as oxidants or as free radical generators. As discussed earlier, the inhibition of normal protein synthesis is subsequently followed by the induction of HSPs that function in various capacities as protein chaperones. The inhibition of mRNA translation in response to these stressors is accompanied by reduced polysomal content and corresponding increases in monosomes and ribosomal subunits. Various reports have emphasized that eIF-2α phosphorylation is increased by a variety of HSP inducers to uneven degrees, ranging from pronounced phosphorylations with arsenite, to low to high phosphorylations with heat shock, to marginal phosphorylations with iodoacetamide and various amino acid analogs (reviewed in 7, 8). Variable increments in eIF-2α phosphorylation and decrements in the phosphorylation of eIF-4E, eIF-4B, and S6 ribosomal protein have been described (8). Changes in S6 phosphorylation do not appear to produce altered ribosome activity during heat shock and recovery (150). The alterations in the phosphorylation of eIF-2, eIF-4E, and eIF-4B in conjunction with the reduction in polysomal contents have led to the prevailing view that the disruption of translation centers on initiation rather than on elongation. The phosphorylation of eIF-2 has been repeatedly suggested to be a particularly prominent component of the inhibition associated with these stressors. For example, transfection of CHO cells with a cDNA that overexpresses eIF-2α containing an alanine substitution at Ser-51 (the critical phosphorylation site) overcomes much, but not all, of the effects of heat shock (151).
Substantial uncertainty remains regarding the site(s) of action through which cytoplasmic stressors inhibit translation. For example, potential contributions of elongation blockade to the overall degree of translational inhibition by these chemicals are not excluded on the basis of decreased polysomal contents without measures of average ribosomal transit times.
It has proven difficult to achieve good experimental reproducibility utilizing conditions and chemicals that harshly damage cells in a relatively indiscriminate fashion. More stringent conditions that generate relatively complete inhibition of translation tend to reduce cell viability, whereas less rigorous challenge produces only partial suppression. These considerations have largely precluded decisive quantitative studies in which the effects of stressors on translation could be carefully characterized at high degrees of inhibition of translation with good maintenance of cell viabilities. Recently, however, it has proven feasible to employ sodium arsenite to generate reproducible, strong heat shock responses with good recoveries of amino acid incorporation during the period of HSP induction following washout. While arsenite assuredly inactivates many proteins with sulfhydryl groups (27), this damage is apparently more readily repaired than that developed by elevated temperatures or strong oxidants.
Sodium arsenite provokes a strong inhibition of amino acid incorporation
in GH3 cell suspensions (80) or in NIH 3T3 cells in monolayer culture within 30 min. Some previously unpublished data for NIH 3T3 cells are presented at this point to illustrate more graphically the magnitude of the effects under discussion and to highlight the value of arsenite as a cytoplasmic stressor. Comparable acute inhibitions of amino acid incorporation were generated by the ER stressors ionomycin, thapsigargin, and dithiothreitol and the cytoplasmic stressors arsenite and cadmium as a function of increasing concentrations (Table I). Polysomal content was almost completely abolished by either Ca2+ ionophore or arsenite during this period as compared to the content of untreated cells (Fig. 1, traces b, c, and a, respectively). Polyribosomal contents were depleted by either agent in a cycloheximide-reversible manner, indicating that physical damage to the translational apparatus did not occur (79, 137).

TABLE I
SUPPRESSION OF PROTEIN SYNTHESIS IN NIH 3T3 CELLS BY AGENTS PROVOKING THE ER STRESS OR HEAT SHOCK RESPONSES(a)

Agent                        Leucine incorporation (nmol/10^6 cells)
None                         0.79 ± 0.03
Ionomycin, 30 nM             0.29 ± 0.03
Ionomycin, 100 nM            0.18
Ionomycin, 300 nM            0.05 ± 0.01
Ionomycin, 1000 nM           0.02 ± 0.01
Thapsigargin, 1 nM           0.66 ± 0.05
Thapsigargin, 3 nM           0.34 ± 0.05
Thapsigargin, 10 nM          0.04 ± 0.01
Thapsigargin, 100 nM         0.01
Dithiothreitol, 100 µM       0.60
Dithiothreitol, 200 µM       0.08 ± 0.01
Dithiothreitol, 400 µM       0.02
Dithiothreitol, 600 µM       0.01
Sodium arsenite, 20 µM       0.59 ± 0.06
Sodium arsenite, 40 µM       0.27
Sodium arsenite, 60 µM       0.15 ± 0.01
Sodium arsenite, 100 µM      0.05
Cadmium chloride, 10 µM      0.61 ± 0.06
Cadmium chloride, 20 µM      0.28 ± 0.02
Cadmium chloride, 40 µM      0.15 ± 0.01
Cadmium chloride, 100 µM     0.01

(a) Cells in serum-free medium were challenged for 30 min with agents at the indicated concentrations. Pulse incorporation of [3H]leucine into proteins was then determined.

FIG. 1. Cross-resistance of pretreated cells to polyribosome depletion by sodium arsenite or ionomycin. NIH 3T3 cells in medium containing 0.6 µM phorbol 12-myristate 13-acetate were treated for 2 h without further additions (a, b, c) or with 150 µM sodium arsenite (d, e, f), or for 3.5 h with 0.5 µM ionomycin (g, h, i). Cultures were washed twice with drug-free medium containing 2 mg/ml fatty acid-free bovine serum albumin and incubated in medium lacking drugs or albumin for 2 h (a-f) or for 30 min (g-i). Preparations were then challenged for 30 min without further additions (a, d, g), with 0.3 µM ionomycin (b, e, h), or with 150 µM sodium arsenite (c, f, i). Lysates of the cells were subjected to sucrose-density gradient centrifugation for analysis of ribosomal size distributions. The arrow indicates the position of 80S ribosomes.

During longer term exposures suitable for inducing either HSPs or GRPs, most cell types recover approximately 50-100% of their original rates of amino acid incorporation. These recoveries, which depend on new mRNA synthesis, are prevented by inhibitors of transcription such as actinomycin D. NIH 3T3 cells, however, were unable to restore amino acid incorporation when challenged by ER stressors such as thapsigargin unless either fetal bovine serum or phorbol 12-myristate 13-acetate (PMA) was included in the incubation (Table IIA). This requirement could also be satisfied by the addition of epidermal growth factor (EGF) (79). Amino acid incorporation by NIH 3T3 cells declined sharply (84%) during a 3-h exposure to actinomycin D unless either serum or PMA was added to the incubations. When either serum or PMA was added, actinomycin D blocked recovery from thapsigargin inhibition in the predicted fashion. In contrast to the findings with thapsigargin, recovery of amino acid incorporation in cells challenged with arsenite was not dependent on the addition of promoters such as PMA (Table IIB). Addition of PMA was necessary, however, to demonstrate that actinomycin D prevented recovery of amino acid incorporation on longer term arsenite treatment. Recovery of amino acid incorporation following exposure to either type of stressor was associated with the development of resistance to rechallenge by either type of stressor. Polysome contents, for example, were maintained in cells originally exposed to arsenite and rechallenged with either ionophore or arsenite (Fig. 1, traces d, e, and f) or originally challenged with ionophore and then rechallenged with either ionophore or arsenite (Fig. 1, traces g, h, and i). These findings are addressed further in Section III of this review.

TABLE II
RECOVERY FROM TRANSLATIONAL SUPPRESSION BY THAPSIGARGIN OR SODIUM ARSENITE IN NIH 3T3 CELLS: EVIDENCE FOR DISTINCT TRANSCRIPTIONAL REQUIREMENTS(a)

A: Thapsigargin
                                        Leucine incorporation (nmol/10^6 cells)
Pretreatment additives                  10-min pretreatment    3-h pretreatment
None                                    0.84 ± 0.08            0.77 ± 0.01
Thapsigargin                            0.04                   0.07
PMA                                     1.18 ± 0.14            1.18 ± 0.02
Thapsigargin + PMA                      0.04                   0.38 ± 0.01
FBS                                     1.88 ± 0.04            1.88 ± 0.01
Thapsigargin + FBS                      0.06                   0.42 ± 0.03
Actinomycin D                           0.85 ± 0.03            0.16 ± 0.02
Actinomycin D + thapsigargin            0.04                   0.03
Actinomycin D + PMA                     1.13 ± 0.17            0.65 ± 0.03
Actinomycin D + thapsigargin + PMA      0.05                   0.06
Actinomycin D + FBS                     1.84 ± 0.04            1.19 ± 0.11
Actinomycin D + thapsigargin + FBS      0.07 ± 0.01            0.08 ± 0.01

B: Arsenite
                                        Leucine incorporation (nmol/10^6 cells)
Pretreatment additives                  30-min pretreatment    4-h pretreatment
None                                    0.69 ± 0.03            0.74 ± 0.07
Arsenite                                0.16 ± 0.02            0.81 ± 0.02
PMA                                     0.68 ± 0.11            1.04 ± 0.07
Arsenite + PMA                          0.16                   1.03 ± 0.11
Actinomycin D                           0.63 ± 0.12            0.21 ± 0.01
Actinomycin D + arsenite                0.11 ± 0.01            0.10
Actinomycin D + PMA                     0.67 ± 0.06            0.48 ± 0.05
Actinomycin D + arsenite + PMA          0.09 ± 0.01            0.17 ± 0.02

(a) A: Cultures were pretreated for either 10 min or 3 h in F-10 medium containing phorbol 12-myristate 13-acetate (PMA, 0.6 µM), fetal bovine serum (FBS, 10%), actinomycin D (1 µg/ml), or thapsigargin (30 nM) as indicated. Fresh medium lacking drugs or serum was then added, and pulse incorporation of [3H]leucine into proteins was determined. B: Cultures were pretreated for 30 min or 4 h in medium containing PMA (0.6 µM), actinomycin D (1 µg/ml), or sodium arsenite (60 µM) as indicated. Fresh medium was then added to all samples, sodium arsenite (60 µM) was added to the arsenite-pretreated preparations, and pulse incorporation was determined.

Neither the acute inhibition of protein synthesis nor the longer term recovery associated with arsenite challenge was attributable to alterations in Ca2+ homeostasis. The ability of sodium arsenite to release NIH 3T3 cell-associated Ca2+ was compared to that of two established releasers, ionomycin and thapsigargin (Table III). Modest Ca2+ release occurring in response to the addition of sodium arsenite was attributable to the sodium ion and, presumably, involved competitive displacement from superficial plasmalemmal sites. Similar release was achieved by addition of comparable concentrations of sodium chloride. Both thapsigargin and ionomycin released approximately 40% of cell-associated Ca2+ during incubation periods ranging from 90 min to 4 h. Arsenite, when sodium addition was taken into account, did not release Ca2+ or alter the release of Ca2+ occurring in response to either thapsigargin or ionomycin. Arsenite at concentrations up to 150 µM does not alter cellular ATP or GTP contents (C. Brostrom and M. Brostrom, unpublished results).

TABLE III
LACK OF EFFECTS OF ARSENITE ION ON Ca2+ HOMEOSTASIS IN NIH 3T3 CELLS(a)

A: Treatment
Additions                    Cell-associated 45Ca2+ (nmol/10^6 cells)
None                         0.92 ± 0.05
NaCl                         0.74
NaCl + ionomycin             0.28 ± 0.02
NaCl + thapsigargin          0.32
Arsenite                     0.74 ± 0.02
Arsenite + ionomycin         0.30
Arsenite + thapsigargin      0.27 ± 0.03

B: Recovery
                             Cell-associated 45Ca2+ (nmol/10^6 cells)
Additions                    Nonpretreated cells    Arsenite-pretreated cells
None                         0.74 ± 0.06            0.73 ± 0.05
Ionomycin                    0.31 ± 0.01            0.33 ± 0.03
Thapsigargin                 0.31 ± 0.01            0.44 ± 0.05

(a) A: Cell-associated 45Ca2+ following exposure to sodium arsenite in the absence or presence of Ca2+-mobilizing agents. Cultures were equilibrated for 90 min in F-10 medium containing 0.2 mM Ca2+ and 45Ca2+ (0.02 Ci/mmol); challenged with NaCl (150 µM), sodium arsenite (150 µM), ionomycin (1 µM), or thapsigargin (0.1 µM) as indicated; and analyzed for cell-associated 45Ca2+. B: Cell-associated 45Ca2+ during recovery from sodium arsenite treatment. Cultures were pretreated for 2 h in the absence or presence of 150 µM sodium arsenite, washed twice with fresh medium, and equilibrated for 2 h in medium containing 0.2 mM Ca2+ and 45Ca2+ (0.02 Ci/mmol). Preparations were then challenged for 30 min with ionomycin (1 µM) or thapsigargin (0.1 µM) as indicated and cell-associated 45Ca2+ was determined.

Arsenite treatment of NIH 3T3 cells is associated with the phosphorylation of eIF-2α concomitant with the inhibition of amino acid incorporation and disappearance of polysomal contents (79). The concentration dependence for the inhibition of incorporation and the extent of this inhibition correlated closely with the phosphorylation of eIF-2α. Following removal of arsenite, recovery of amino acid incorporation paralleled the dephosphorylation of eIF-2α. A variety of other HSP-inducing chemicals, including Cd2+, Hg2+, t-butylhydroperoxide, menadione, and diamide, also promoted the rapid phosphorylation of eIF-2α. No effects of arsenite were found on peptide chain elongation as determined by analyses of polyribosomal contents and average ribosomal transit times. A striking similarity was apparent between the effects of ER stressors, which clearly act on initiation by promoting the phosphorylation of eIF-2, and the effects of arsenite on translational initiation. No effort has been made as yet, however, to establish whether the phosphorylation of factors associated with the eIF-4F complex or of S6 ribosomal protein is affected by arsenite within the same time frame as eIF-2α.
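The concentration dependence of the acute inhibition can be expressed as percent inhibition relative to untreated cells. The short sketch below is an illustration only: it uses the mean leucine incorporation values reported in Table I for ionomycin and sodium arsenite, ignores the reported errors, and the function and variable names are hypothetical choices made for this example.

```python
# Mean [3H]leucine incorporation (nmol per 10^6 cells) taken from Table I.
CONTROL = 0.79  # untreated NIH 3T3 cells

TABLE_I_MEANS = {
    "ionomycin (nM)": {30: 0.29, 100: 0.18, 300: 0.05, 1000: 0.02},
    "sodium arsenite (uM)": {20: 0.59, 40: 0.27, 60: 0.15},
}


def percent_inhibition(treated: float, control: float = CONTROL) -> float:
    """Inhibition of incorporation as a percentage of the control rate."""
    if control <= 0:
        raise ValueError("control incorporation must be positive")
    return 100.0 * (1.0 - treated / control)


# Print one line per agent and concentration.
for agent, doses in TABLE_I_MEANS.items():
    for dose, mean in sorted(doses.items()):
        print(f"{agent:<22} {dose:>5}  {percent_inhibition(mean):5.1f}% inhibition")
```

Expressed this way, the highest concentrations of either agent listed here suppress incorporation by more than 80%, which matches the near-complete loss of polysomes described for both the ER stressor and the cytoplasmic stressor.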
C. Mammalian Enzymes Catalyzing the Phosphorylation of eIF-2α
The eIF-2α kinase family is composed of a small group of Ser-Thr protein kinases bearing sequence and structural similarities in their catalytic domains. The dsRNA-activated and interferon-inducible member of this family, formerly known under various names, is now termed PKR. Other members of this group are HRI, the heme-regulated protein kinase of erythroid cells, and GCN2, the “general control” eIF-2α kinase of yeast. GCN2 differs markedly in structure from either PKR or HRI and is regulated by a variety of metabolic parameters, including amino acid availability (152-154). The following discussion is restricted to the two mammalian eIF-2α kinases, each of which is subject to regulation under conditions of stress.
1. HRI
This member of the eIF-2α kinase family is expressed in an erythroid-specific manner, is present in reticulocytes at particularly high concentrations, and is found in the postribosomal supernatant fraction of reticulocyte lysates (155, 156). HRI is activated in immature erythroid cells by heme deficiency and is inhibited by heme. Accumulation of the mRNA for HRI in differentiating erythroleukemic cells depends on the presence of heme (155). The enzyme is therefore proposed to function physiologically in adjusting rates of globin synthesis to the availability of heme (156). Removal of heme from lysates promotes autophosphorylation of HRI and eIF-2α phosphorylation in a manner that is reversible by hemin re-addition. These phosphorylations are
thought to be closely related events because a specific point mutation prevents their occurrence. In heme-deficient lysates HRI exists as an active dimer. Intersubunit disulfide bond formation between HRI molecules, accompanied by suppression of kinase activity, occurs in response to hemin. This conversion is neither complete nor irreversible. ATP binding to HRI is increased by hemin. The heme-binding domain has not been identified. Purified HRI homodimers are active and fully responsive to hemin, indicating that other proteins are not essential for regulation by heme. However, HRI is isolated by co-immunoadsorption as a complex with HSP90, HSP70, and another protein termed p59. Under these conditions the enzyme is an inactive dimer (157). The association with HSP90, but not HSP70, is stabilized by hemin and requires Mg2+/ATP (157, 158). It is unclear whether HSP90 influences eIF-2α kinase activity, heme-dependent HRI inactivation, or HRI stability. Heat shock, oxidizing agents, toxic heavy metals, and denatured proteins result in the irreversible activation of HRI (156, 159). The concentration of HSP70, but not HSP90, correlates inversely with the degree of translational inhibition in heme-supplemented lysates during heat shock or oxidative stress (158). HSP70 reduces the concentration of hemin required to suppress the enzyme (160); this effect requires GTP and a reducing environment, consistent with a role for critical sulfhydryl groups in regulation of HRI activity. Competition of denatured protein and HRI for binding of HSP70 is therefore proposed to signal HRI activation in response to thermal and oxidative stress (156-160). In vivo, the association of HRI with HSPs is probably dynamic and influenced by hemin, ATP, GTP, and the redox state.
2. PKR
The unique properties and functions of this eIF-2α kinase are detailed in several reviews (153, 154, 161, 162). PKR was originally discovered as the enzyme responsible for the inhibition of translational activity in reticulocyte lysates by virally encoded dsRNAs. In these preparations as well as in lysates of nucleated mammalian cells, addition of dsRNA results in the phosphorylation of eIF-2α and the inhibition of translational initiation. This suppression is attributable to the activation of PKR, which catalyzes the phosphorylation of eIF-2α at serine residue 51, thereby effecting the sequestration of eIF-2B and the inhibition of eIF-2 recycling. The observation that PKR is induced by interferon is consistent with the proposal that the enzyme functions as an “antiviral” gene product. Since replication of many viruses depends upon production of dsRNAs, PKR induction provides the host cell with a mechanism whereby viral protein synthesis can be repressed. PKR is also strongly implicated in antiproliferative growth control mechanisms, is thought to mediate many of the effects of the interferons, and is activated by stresses affecting the ER or cytoplasm.
a. Properties and Activation. PKR is mainly associated with ribosomes of the rough ER, with approximately 20% localizing to the nucleus. The kinase possesses a molecular mass of 62-68 kDa, clusters of charged residues, two basic amino-terminal dsRNA-binding regulatory domains termed RI and RII, and all sequence motifs conserved in other protein kinases. The RI domain is necessary and sufficient for RNA binding (163), which is thought to enhance affinity of the enzyme for ATP. PKR is somewhat homologous to other RNA-binding proteins but bears the most resemblance to HRI and GCN2, with homology residing exclusively in the catalytic domain. As with other eIF-2α kinases, the regulatory domains are adjacent to the catalytic domain (153, 162). dsRNAs containing 85 base pairs produce full activations; shorter strands cause lesser activations and larger strands or those with excessive secondary structure are inhibitory. PKR lacking the dsRNA-binding motifs is constitutively active when expressed in vivo (169), suggesting that the unoccupied regulatory domains act to suppress catalytic activity. PKR is also activated in vitro by small polyanionic molecules, such as heparin, by a mechanism independent of RI and RII (165). Activation by dsRNA in vitro is associated with an autophosphorylation that correlates directly with dimerization of the enzyme. dsRNA is required for dimerization in vitro. Homodimerization, which is also observed in vivo, is mediated by the RNA-binding regions (166, 167). Dimerization with other dsRNA-binding proteins may also occur in vivo and may be physiologically relevant (167). Autophosphorylation is thought to be necessary, but not sufficient, for activation.
Enzyme possessing point mutations at the amino terminus exhibited optimal autophosphorylation but depressed eIF-2α phosphorylation and dsRNA binding (165). Viral activation of PKR occurring in vivo is believed to require occupation of dsRNA-binding sites and autophosphorylation, but the functional significance of dimerization in this activation has been questioned (168). Whether dsRNA or an endogenous dsRNA-like substance serves as the physiological activator of PKR in noninfected cells is a subject of current controversy. Transfection with certain plasmids promotes PKR activation and eIF-2α phosphorylation (169, 170). In this circumstance translation of the plasmid-derived mRNAs is suppressed selectively in that normal cellular messages continue to be translated. Expression of a nonphosphorylatable eIF-2α mutant overturns this selective translational inhibition. The mechanism underlying the selective inhibition is poorly defined. It was proposed that PKR may directly bind plasmid-derived RNAs to phosphorylate eIF-2α, which then forms a nondissociable translational inhibitor on the mRNA to obstruct its further translation. Under this condition, eIF-2B would not be sequestered in complexes with phosphorylated eIF-2α. Alternatively, eIF-2B may localize preferentially to polysomes, whereas RNA molecules associated with
phosphorylated eIF-2α may be unable to enter the polysomal pool efficiently. Collateral regulatory effects, such as those involving eIF-4, have not been investigated. Several animal viruses have developed mechanisms to escape the antiviral effects of PKR (171). These mechanisms include viral encoding of proteins that block the active site of the enzyme or that sequester activators such as dsRNA, viral activation of host proteins that directly inhibit the enzyme, production of high concentrations of small viral dsRNAs that occupy the RNA-binding sites without inducing activation, and the sequestration and/or degradation of PKR.
b. Role in Growth Control.
Various studies (172-175) in cultured fibroblasts indicate that PKR expression, autophosphorylation, and an increased phosphorylation of eIF-2α are associated with reduced rates of cell proliferation and/or differentiation. Endogenous activators or inhibitors of PKR have been proposed to mediate some of these effects. Overexpression of PKR in selected cell types inhibits translational activity and suppresses growth (176). Translation of the mRNA for PKR is down-regulated during overexpression of the kinase, suggesting that an autoregulatory mechanism normally controls PKR synthesis at the translational step (177). Accordingly, inactive forms of PKR bearing mutations in the catalytic or RNA-binding domains can be expressed in heterologous systems at much higher concentrations than the wild-type protein. Inactive mutants of PKR have been reported to induce malignant transformation in fibroblasts and generate tumors in nude mice (176, 178). Mutant forms of the kinase were shown to act as trans-dominant repressors of the wild-type enzyme, consistent with a role for the wild-type enzyme as a tumor suppressor. NIH 3T3 cells expressing either a trans-dominant negative mutation of PKR resulting in reduced eIF-2α phosphorylation or a Ser-51 → Ala (nonphosphorylatable) mutation in eIF-2α were transformed (179). Expression of the mutant PKR has been proposed to inhibit the endogenous kinase either through the formation of inactive heterodimers or through binding and depletion of potential activators of the endogenous enzyme (168, 180). Interferon regulatory factor 1 (IRF-1), a transcription factor that activates type I interferon and interferon-inducible genes, also manifests tumor suppressor activity. Expression of the PKR gene, which contains a promoter element for IRF-1 binding, is hypothesized to mediate this tumor suppressor action (181, 182).
IRF-1-mediated cell growth inhibition and interferon induction correlate with PKR expression; an inactive dominant negative PKR mutation abolishes both effects of IRF-1. Also, PKR expression is reduced in
leukemias and myelodysplasias associated with a deficiency in the IRF-1 gene. PKR may signal the activation of specific gene transcription. Evidence has been provided for the existence of two distinct PKR-mediated transcriptional signals that vary with cell type and stimulus, one of which involves activation of the transcription factor NF-κB (183-185). PKR has also been observed to interact with cellular and virally encoded proteins that may initiate regulatory cascades controlling cell proliferation and gene expression (161). Phosphorylation of some of these proteins may be catalyzed by PKR (161, 186).
c. Stress Responses and the Activation of PKR. The broad details of the mechanism whereby phosphorylation of eIF-2α occurs in response to ER or cytoplasmic stress are becoming evident. Depletion of sequestered ER Ca2+ was found to activate PKR (145). Analyses of extracts derived from cultured cells that had been pretreated briefly with Ca2+ ionophore A23187 or thapsigargin revealed a two- to threefold increase in eIF-2α kinase activity without detectable changes in eIF-2α phosphatase activity. Direct addition of A23187, EGTA, or thapsigargin to extracts did not signal eIF-2α phosphorylation. A 65- to 68-kDa polypeptide was phosphorylated concurrently with eIF-2α in extracts of pretreated cells. Pretreatment with interferon-α caused a fivefold induction of this polypeptide, which was identified as PKR by immunoblotting. Culture with interferon-α reduced leucine incorporation modestly and increased eIF-2α phosphorylation slightly. After challenge with A23187 or thapsigargin, however, leucine incorporation was inhibited 80-90% in conjunction with the phosphorylation of 40-50% of the eIF-2α subunit regardless of interferon pretreatment. Depletion of ER Ca2+ stores did not affect the extraction of PKR. When incubated with reovirus dsRNA, extracts derived from cells with depleted ER Ca2+ stores displayed greater degrees of phosphorylation of PKR and of eIF-2α than did extracts prepared from untreated cells. The enhanced dsRNA-dependent phosphorylation of PKR was observed regardless of prior induction of the kinase with interferon. Lower concentrations of dsRNA were required for optimal phosphorylation of PKR in extracts of treated as compared to control preparations. Collectively these results indicate that PKR is subject to activation by Ca2+-mobilizing agents that disrupt ER protein processing and inhibit translational initiation.
The role of PKR in the translational suppression resulting from depletion of ER Ca2+ stores was also examined in NIH 3T3 cells overexpressing either wild-type eIF-2α, a mutant eIF-2α (S51A), wild-type PKR, or a dominant negative mutant PKR (K296P) in the catalytic domain (187). Translational inhibition in response to varying concentrations of A23187 was reduced by expression of mutant eIF-2α or mutant PKR but not by expression of wild-type eIF-2α. Overexpression of wild-type PKR increased the sensitivity of translation to inhibition by 1 µM ionophore. Transient expression of the dominant negative PKR mutant in COS-1 monkey cells also decreased the phosphorylation of eIF-2α occurring upon treatment with A23187. Overexpression of the PKR regulatory RNA-binding domain in the absence of the catalytic domain was sufficient to inhibit eIF-2α phosphorylation in response to A23187. Furthermore, overexpression of the HIV transcriptional activation region (TAR) RNA-binding protein also inhibited eIF-2α phosphorylation in response to the ionophore. These findings strongly implicate the RNA-binding regulatory domain(s) of PKR in the mechanism by which this eIF-2α kinase is activated in response to depletion of sequestered Ca2+ stores. Sodium arsenite, a prototype for stressors fostering cytoplasmic protein misfolding and induction of the HSPs, was found to inhibit translational initiation through the activation of PKR in a manner comparable to Ca2+-mobilizing agents (79) (Table IV). When incubated with dsRNA, extracts derived from arsenite-treated cells displayed greater degrees of phosphorylation of PKR and eIF-2α than did control extracts. Cells overexpressing a dominant negative PKR mutation (K296P) in the catalytic domain resisted translational inhibition and eIF-2α phosphorylation in response to ER or cytoplasmic stressors; these effects were partially overturned after induction of endogenous PKR with interferon-α. The present body of available information supports the concept that PKR functions as a common focal point for controlling rates of translational initiation in response to a variety of stimuli, including, but not restricted to, viral infections, ER stress, and cytoplasmic proteotoxic stress.
Currently only two mammalian eIF-2α kinases are known to exist: HRI, which is expressed only by erythroid cells, and PKR, which is ubiquitous to all mammalian cell types.

TABLE IV
EVIDENCE THAT PKR MEDIATES TRANSLATIONAL SUPPRESSION BY Ca2+ IONOPHORES, THAPSIGARGIN, AND SODIUM ARSENITE
1. Translational initiation is suppressed concurrently with phosphorylation of eIF-2α in treated cells; translational suppression and eIF-2α phosphorylation occur at comparable drug dosages. Peptide chain elongation is unaffected.
2. Treatments increase eIF-2α kinase activity while inhibiting eIF-2B activity.
3. Greater phosphorylations of PKR and of eIF-2α occur in extracts of treated, as compared with nontreated, preparations; phosphorylations are concurrent.
4. Phosphorylations of PKR and eIF-2α in extracts of treated cells are further enhanced by addition of dsRNA or culture in the presence of interferon-α.
5. Translational inhibitions and eIF-2α phosphorylations in treated cell preparations are reduced by expression of a dominant negative PKR mutation.

The lack of other eIF-2α kinases and the activation of PKR by stressors suggest that the enzyme may mediate most, if not all, eIF-2α-dependent inhibitions of translational initiation in higher eukaryotes. Putative additional stimuli for PKR activation could include hormonal or nutritional alterations or conditions that damage the plasmalemma. In support of this hypothesis, the induction of apoptosis in NIH 3T3 cells by serum deprivation was recently found to depend on the activation of PKR (S. R. Srivastava and R. J. Kaufman, unpublished results). The possibility that PKR possesses multiple substrates and/or serves more broadly in cellular control mechanisms is also supported by findings discussed previously. The emerging picture suggests that the structure and regulation of this enzyme should prove both interesting and informative.
D. Translation in Reticulocytes as Compared to Nucleated Cells
The development of erythroid stem cells into reticulocytes involves the ejection of the nucleus from the cell in conjunction with the loss of a functional ER. The residual translational apparatus of the reticulocyte is devoted to the accumulation of extremely high intracellular hemoglobin concentrations accompanying maturation to the erythrocyte. Rabbit reticulocytes retain remarkably high rates of translational initiation upon lysis as contrasted to nucleated mammalian cells, which lose the activity upon cell damage or disruption. Reticulocyte lysates have therefore been used extensively in the characterization of the translational apparatus of mammalian cells and for the synthesis of proteins directed by exogenously added mRNA. Various lines of experimental evidence support the hypothesis that reticulocyte lysates retain high degrees of initiation due to an uncoupling of the process from control by vesicular membranes that prevails in other cell types. In contrast to nucleated cells, intact reticulocytes lack Ca2+-dependent initiation but do display inhibition of translational elongation (188). Rates of amino acid incorporation in intact rabbit reticulocytes were unaffected by depletion of Ca2+ with EGTA (188) or by exposure to Cbz-Gly-Phe-NH2 (127), a peptide that releases ER-associated Ca2+ (49). Low concentrations of Ca2+ ionophore A23187 strongly inhibited incorporation in reticulocytes incubated with 1 mM Ca2+ but not with EGTA. Polysomal profiles and the extension of average ribosomal transit times of cells treated with the ionophore at 1 mM Ca2+ were characteristic of translational elongation block. Unfortunately the intense red color of these preparations precluded an analysis of the degree to which [Ca2+]i was increased by the ionophore treatment. Reticulocyte lysates were not affected by the addition of comparable respective concentrations of the peptide, the ionophore, or EGTA.
Translational elongation
in response to Ca2+ addition is clearly inhibited in lysates through the phosphorylation and inhibition of elongation factor 2 (eEF-2) by calmodulin-dependent protein kinase III (189, 190). Currently eEF-2 is the only identified substrate for this particular kinase. The overall body of evidence from reticulocyte lysates indicates that the inhibition of elongation is only loosely coupled to the fractional phosphorylation of the eEF-2 pool. Inhibitions of approximately 45% have been found to occur with eEF-2 phosphorylations approaching 98% (189a). eEF-2 is an abundant protein in most cells. Transient phosphorylation of eEF-2 has been reported for a wide variety of cell types following exposure to Ca2+-mobilizing substances. For example, treatment of fibroblasts with serum, bradykinin, vasopressin, EGF, or Ca2+ ionophores, each of which provoked transient increases in [Ca2+]i, resulted in 2- to 10-fold increases in eEF-2 phosphorylation (191). Phosphorylation of the factor was maximal at 0.5-1 min and attenuated at 5 min. Thrombin and histamine, which elevate [Ca2+]i in umbilical vein endothelial cells, also provoked a rapid and transient phosphorylation of eEF-2, whereas phorbol esters or cAMP-elevating agents were ineffective (192). Phosphorylation of eEF-2 in cells exposed to bradykinin, serum, vasopressin, histamine, or thrombin was not reduced during incubations in Ca2+-depleted medium, whereas phosphorylation in response to EGF or ionophore required the presence of extracellular cation. Phosphorylation of the factor following depolarization of PC-12 pheochromocytoma cells was observed to be overturned by nerve growth factor or cAMP-elevating agents (193). A negative regulatory input occurring through eEF-2 would require either a preferential binding of the phosphorylated form to the ribosome or a relatively large fractional phosphorylation of the protein.
Although eEF-2 is clearly phosphorylated to some extent in various types of intact nucleated cells in response to agents known to increase [Ca2+]i, it has been difficult to demonstrate resultant inhibitions of amino acid incorporation as a function of increasing eEF-2 phosphorylation. Given the transitory nature of eEF-2 phosphorylation in most cells, the consequent inhibition of translation would be expected to be comparably short lived and therefore difficult to measure experimentally by standard techniques. Hormonal treatments to increase [Ca2+]i normally promote ER Ca2+ depletion. The consequent inhibition of initiation developing after 5-7 min complicates the demonstration of an interim slowing of elongation. Sustained elevations in [Ca2+]i can be achieved without depleting ER Ca2+ stores utilizing BAY K 8644, an L-type Ca2+ channel agonist. GH3 pituitary cells exposed to this substance at 1 mM extracellular Ca2+ were found to maintain [Ca2+]i values of 0.5-1 µM for at least 10 min. The phosphorylation of eEF-2 was increased but amino acid incorporation was reduced by no more than 10-20% (194).
The physiological role played by a brief slowing of translational elongation is not immediately obvious. Conceivably inhibition through this mechanism would provide a device whereby cells could rapidly divert energy from protein synthesis to support hormonally activated cell-specific functions. Such model building would be precluded by an alternative proposal that eEF-2 kinase may be active only at mitosis (195). Phosphorylation of eEF-2 in amnion cells is reportedly increased during mitosis, a period during which the rate of translation is believed to decrease and [Ca2+]i to rise briefly (196). Also, the activity of eEF-2 kinase in Xenopus oocytes decreased substantially during the final stages of oogenesis and was absent in fully grown oocytes (197).
III. Translational Accommodation to ER or Cytoplasmic Stress
Mammalian cells possess the ability to adapt to the translational suppression provoked by pharmacological agents eliciting either the heat shock or ER stress responses. This phenomenon, which we have termed “translational accommodation” or tolerance to stress, refers to the recovery of amino acid incorporation that develops within 2-3 h of exposure to the stressor. Accommodation is dependent on transcriptional events, is preserved upon washout of stressor and rechallenge, and differs from “thermotolerance,” the acquired ability to survive thermal stress. Thermotolerance depends on the accumulation of a full complement of HSPs and/or alterations of various other components, but does not involve the induction of GRPs (26, 198). The expression of translational accommodation, which has been studied primarily in our laboratory, is coupled to the inactivation of PKR and is viewed as a component of overall cellular adjustment to chemical or metabolic stress.
A. Accommodation to Depletion of ER Ca2+ Stores
In early experiments, C6 glial tumor cells that were depleted of Ca2+ continued to selectively synthesize an 80-kDa peptide, later identified as GRP78 (131). Ca2+-depleted C6 cells, like the majority of cultured cell types that have been examined, induce this chaperone in conjunction with the development of translational accommodation to depletion of ER Ca2+ stores. Other cell types, including GH3, NIH 3T3, and P3X63Ag8 myeloma cells, do not recover from the acute inhibition of amino acid incorporation imposed by a Ca2+-mobilizing agent unless phorbol esters, cyclic AMP-elevating agents such as forskolin, growth factors, or serum are added to the incubation (77, 78, 136). Translational recoveries from the inhibited state range from four- to eightfold or, alternatively, from 50 to 100% of the incorporation values of the noninhibited controls. Accommodation does not alter cellular Ca2+ storage
capacities as measured by 45Ca accumulation, sensitivity of sequestered Ca2+ to mobilizing agents, or maximal rates of amino acid incorporation upon restoration of Ca2+ with optimal media (77, 78). Cells accommodate identically when exposed to dithiothreitol, an agent that does not affect sequestered Ca2+. Accommodated cells are resistant (“cross-tolerant”) to translational inhibition upon challenge with dithiothreitol or various Ca2+-mobilizing agents as well as such HSP inducers as elevated temperature or arsenite (77, 78).
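The two ways of expressing translational recovery in the preceding paragraph (fold increase over the inhibited state versus percentage of the noninhibited control) can be related by simple arithmetic. The following is a minimal sketch only: the function name is hypothetical, and the residual incorporation of 12.5% of control is an illustrative assumption chosen from within the 10-20% range implied by the 80-90% inhibitions described earlier for cells challenged with A23187 or thapsigargin. It is not a value reported in the cited studies.

```python
def percent_of_control(residual_percent: float, fold_recovery: float) -> float:
    """Return incorporation after recovery as a percentage of the noninhibited control.

    residual_percent: incorporation remaining during acute inhibition (% of control).
    fold_recovery: fold increase over the inhibited state during accommodation.
    """
    return residual_percent * fold_recovery


# Assumed (illustrative) residual incorporation during acute inhibition.
ASSUMED_RESIDUAL_PERCENT = 12.5

for fold in (4, 8):
    recovered = percent_of_control(ASSUMED_RESIDUAL_PERCENT, fold)
    print(f"{fold}-fold recovery -> {recovered:.0f}% of control")
```

Under this assumed residual, a fourfold recovery corresponds to 50% of control and an eightfold recovery to 100%, so the two ranges quoted in the text describe the same span of recoveries.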
B. The Role of GRP78
GH3 cells treated with Ca2+-mobilizing agents develop perceptible degrees of accommodation within 60 min (77, 136). Without exception, the induction of GRP78 synthesis has been found to correlate directly with expression of translational accommodation to ER stress. While most of the induction relates to the increased synthesis of grp78 mRNA, preferential ribosomal loading of preexisting grp78 mRNA may also be involved (78). Induction of grp78 mRNA and GRP78 in GH3 cells is produced by either dithiothreitol or a variety of agents that deplete ER Ca2+ stores (77-79), with inductions dependent on the presence of a promoting factor such as phorbol ester. Total poly(A)+ mRNA is unaffected by these treatments. Actin mRNA content and the incorporation of amino acids into actin tend to decline during extended treatments with agents that mobilize sequestered Ca2+ regardless of the presence of phorbol ester or cAMP elevation (138). The decay in actin message may reflect a lack of protection by polysomes against destabilizing influences or nucleases that degrade this mRNA. Both the induction of grp78 mRNA and the decline in actin mRNA occur independently of changes in [Ca2+]i. Experiments with two established myeloma cell lines also indicate that GRP78 mediates the expression of translational tolerance (77). The NS-1 cell, a nonsecreting derivative of the immunoglobulin-secreting P3X63Ag8 cell, synthesizes kappa, but not gamma, chains and is unable to release the kappa chains from the ER. The nonsecreting myeloma expresses higher contents of GRP78 and its mRNA than does the P3X63Ag8 cell from which it is derived (199). Abnormal accumulation of kappa chains in the ER was proposed to constitute the signal for GRP78 induction. Consistent with a role for GRP78 in tolerance to ER stress, amino acid incorporation by NS-1 cells was found to resist inhibition by EGTA, ionophore, and dithiothreitol.
Antisense oligodeoxynucleotides directed against grp78 mRNA reduced amino acid incorporation in tolerant, but not in nontolerant, preparations of myeloma and GH3 cells (77). CHO cells overexpressing wild-type GRP78, but not other grp genes, have been observed to be protected against translational inhibition in response to ionophore A23187. Cells overexpressing a GRP78 deletion mutant capable of binding peptides and ATP, but defective in ATP hydrolysis and
peptide release, were not protected (199a). These findings offer direct support for the involvement of functional GRP78 molecules in the mechanism by which translational accommodation to ER stress is imposed.
C. Signaling Systems: Translational Initiation versus the Induction of GRP78
PKR and yeast Ire1p exemplify Ser-Thr kinases that are activated in response to protein unfolding in the ER. It is not known, however, whether yeast undergo equivalent ER stress responses to those of mammalian cells or develop accommodation. As noted previously, the eIF-2α kinase of yeast (GCN2) and PKR are subject to different control mechanisms, and a mammalian homolog of Ire1p has not been identified. Substantial evidence indicates that PKR does not signal grp78 gene expression in response to ER stress. The degree of phosphorylation of eIF-2α did not correspond to the induction of grp78 mRNA in GH3 cells exposed to identical degrees of ER stress (83). Grp78 mRNA was induced by low concentrations of ionomycin or dithiothreitol that were insufficient to signal eIF-2α phosphorylation or to inhibit amino acid incorporation. Mobilization of the bulk of cell-associated Ca2+ and the induction of grp78 mRNA occurred at comparable low concentrations of ionomycin, whereas phosphorylation of eIF-2α and the inhibition of protein synthesis required higher ionophore concentrations. Prolonged (17-h) cycloheximide treatment by itself increased the phosphorylation of eIF-2α without affecting relative grp78 mRNA contents. Upon release from cycloheximide blockade, eIF-2α was dephosphorylated in parallel with induction of grp78 mRNA. Additionally, induction of grp78 mRNA preceded eIF-2α phosphorylation during treatment with brefeldin A, a fungal metabolite that fosters retrograde resorption of the Golgi apparatus into the ER and inhibits protein secretion (200). Pretreatment for 1 h with cycloheximide, which lowers processible protein within the ER, suppressed grp78 mRNA induction in response to treatment with either Ca2+ ionophore or dithiothreitol. Simultaneously, the sensitivity of eIF-2α to phosphorylation was reduced such that it occurred only at high doses of either perturbant.
It was concluded that eIF-2α phosphorylation is signaled under conditions wherein GRP78 is insufficient for management of stress within the ER, that signaling of grp78 transcription can occur independently of PKR activation, and that greater degrees of ER stress are required for eIF-2α phosphorylation and translational suppression than for grp78 mRNA induction. Translational accommodation to Ca2+ ionophore or dithiothreitol is associated with the dephosphorylation of eIF-2α and the reversal of the inhibition of eIF-2B activity (143, 144). Neither the dephosphorylation of eIF-2α nor the restoration of eIF-2B activity is observed when actinomycin D is added to the treatments. Rather, the percentage of eIF-2α in the phosphorylated form in
GH3 cells increases significantly with time of incubation in the presence of ER stressor and the transcriptional inhibitor (143, 144). Accommodation to translational inhibition by ionomycin or dithiothreitol, accompanied by dephosphorylation of eIF-2α, was observed only when grp78 mRNA was induced. For example, the eIF-2α dephosphorylation that accompanies translational recovery during extended incubations at high ionophore concentration did not occur when grp78 mRNA induction was suppressed by cycloheximide or by actinomycin D (83). Cumulative findings support a model (78) in which enhanced transcription of the grp78 gene, promoted by cAMP/phorbol ester, in conjunction with preferential polysomal loading and translation of grp78 mRNA, confers translational accommodation to ER stress.
D. Translational Accommodation to Cytoplasmic Stress
GH3 and NIH 3T3 cells responded to heat shock, sulfhydryl poisons such as sodium arsenite, or agents that generate reactive oxygen species with an acute translational inhibition followed by a recovery period in which the HSPs were preferentially synthesized. Synthesis of all polypeptides in GH3 cells recovering from heat shock was resistant to inhibition by agents that deplete ER Ca2+ stores (80). Induction of the HSPs in NIH 3T3 cells recovering from arsenite was accompanied by a reduced phosphorylation of eIF-2 and a partial resumption of mRNA translation (79). These observations provided further evidence that the acute inhibition of initiation by arsenite derived from the phosphorylation of eIF-2α. Induction of HSPs by arsenite coincided temporally with expression of translational cross-tolerance to subsequent rechallenge with either ER or “heat shock” stressor. Cross-tolerance involved continued amino acid incorporation, preservation of polyribosomal contents, and the lack of increased eIF-2α phosphorylation. As with induction of GRP78, translational suppression was not mandatory for the induction of HSP70. Mild thermal stress induced expression of this chaperone in GH3 cells without inhibiting amino acid incorporation (80). The particular HSP(s) that confer translational tolerance to ER and cytoplasmic stress have not been identified. HSC70 is an attractive candidate in view of the observations that this chaperone has been localized to the translational apparatus of reticulocyte lysates, suppresses the activity of HRI in reticulocyte lysates, is rapidly induced, bears homology to GRP78, and stimulates ribonucleoparticle-independent transport of precursor proteins into mammalian microsomes (23, 30, 101, 158, 160). In this regard we have observed the preferential synthesis of both GRP78 and HSC70, but not other conventional stress proteins, in GH3 preparations recovering from prolonged cycloheximide blockade (M. A. Brostrom and C. O. Brostrom, unpublished results). In contrast, CHO cells constitutively expressing high contents of
grp78 antisense RNA (71) exhibited translational cross-tolerance during recovery from arsenite in the absence of GRP78 or HSC70 induction. However, HSP70, a functional homolog of HSC70 (2, 22), was optimally induced.
E. Relationships between the ER and Cytoplasmic Stress Response Systems
The HSP and GRP stress proteins display comparable protective abilities to maintain translational initiation. Induction of HSPs after cytoplasmic
stress or of GRPs after ER stress is associated with translational recovery, reduced eIF-2 phosphorylation, and maintenance of polyribosomal contents. Translational cross-tolerance to inhibition by ER or cytoplasmic stressors invariably coincides with the induction of either class of stress proteins. These relationships are highlighted in Table V, which compares the acute and longer term effects of the cytoplasmic stressor arsenite to those of three activators of the ER stress system. With the exception of their mutual abilities to influence the activity of

TABLE V
PROPERTIES OF CERTAIN PKR-ACTIVATING DRUGS
Drugs compared (columns): Ionomycin; Thapsigargin; Dithiothreitol; Sodium arsenite
Properties compared (rows):
Rapid translational suppression
  At initiation
  At elongation
Depletion of ATP/GTP contents at suppressing dosages
eIF-2α phosphorylation at suppressing dosages
Reversal by treatment with drug-free medium
Polysome disaggregation reversible by cycloheximide
Perturbation of Ca2+ homeostasis
Perturbation of redox status
Activation of gene transcription
  hsp expression
  grp expression
Protein synthesis requirement
Induction of translational tolerance
  To heat shock stressors
  To ER stressors
Activation of HRI
[The individual +/- entries could not be assigned to cells from the extracted layout.]

TRANSLATIONAL INITIATION REGULATION IN STRESS
115
PKR and translational initiation, however, the two stress systems appear to operate independently. For example, anoxia and glucose deprivation were each observed to foster slow increases in GRP synthesis in tumors; upon reoxygenation or restoration of glucose, GRP synthesis was suppressed while HSP synthesis gradually increased (201).Sodium arsenite did not induce detectable amounts of the GRPs, nor did it affect ER function in NIH 3T3 cells. The cells maintained their Ca2+ contents during extended treatment with arsenite and, upon Ca2+ depletion with either ionomycin or thapsigargin, readily induced GRP78 superimposed upon the preexisting HSP induction by arsenite (79).Similarly, following the induction of GRP78 with Ca2+mobilizing agents, the cells remained responsive to HSP induction by subsequent arsenite treatment. Thus, despite the cross-tolerance of protein synthesis developingin response to ER and heat shock stressors,the HSP and GRP stress proteins remain independently inducible.
F. HSV-1 Infection and Translational Tolerance to ER Stress
Viruses interrupt host cell protein synthesis by various mechanisms while utilizing other strategies to reprogram the translational apparatus for synthesis of virally encoded proteins. During early infection of HEp-2 epidermal cells by herpes simplex virus 1 (HSV-1), the synthesis of nearly all proteins, including viral (α) proteins, was sensitive to inhibition by Ca2+ ionophore; by 4-6 h after infection, however, overall polypeptide synthesis in infected cells had become resistant to depletion of Ca2+ stores (B. Pancake, C. R. Prostko, M. A. Brostrom, and C. O. Brostrom, unpublished results). Specific viral mRNAs were readily detected in polysomes, and the synthesis of viral polypeptides of early (β) and late (γ) kinetic classes was found to be insensitive to the effects of ionophore, although maturation (glycosylation and/or transport) of viral glycoproteins was reduced. It is therefore apparent that productive infection of cells by HSV-1, like heat shock or chemical stress, results in a modification (or modifications) that confers accommodation to ER stress and that assures the continued translation of selected mRNAs. Although the nature of this modification is unknown, inactivation of PKR during the later stages of HSV-1 infection is consistent with these findings. The HSV-1 gene product γ₁34.5 is thought to prevent eIF-2α phosphorylation in response to the onset of viral DNA synthesis (186) and may mediate the observed tolerance to stress.
G. Physiological Relevance of Translational Accommodation
The rapid translational shutdown observed in cultured cells in response to a severe proteotoxic stress ensures against further production of misfolded or damaged proteins. In accord with the model proposed in Section IV, we hypothesize that translation can proceed only when HSP70 chaperones are sufficient to manage both existing and impending proteotoxicity. It is not established, however, that translational accommodation comparable to that observed in cultured cells is expressed in normal cells or tissues under conditions that prevail in vivo. For example, it is questionable whether intracellular Ca2+ or serum glucose ever reaches the low concentrations required for induction of GRP78 and tolerance. Relevant to this issue, hepatic grp78 mRNA concentrations in mice were found to be down-regulated after moderate restriction of caloric intake, a condition associated with decreased physiological stress, longer life, and lower cancer incidence (202). Findings were consistent with destabilization of the mRNA but not with effects at grp78 transcription or mRNA translation. High caloric intake, which is associated with protein glycation and oxidation, was proposed to constitute a metabolic stress requiring increased grp78 expression. Also relevant are the findings that HSP70/HSC70 concentrations increase after transient brain, pulmonary, or cardiac ischemia and correlate directly with the ability of these tissues to survive an ischemic trauma (reviewed in 22, 203). Evaluation of the relative importance of translational accommodation in the mechanisms whereby normal tissues adapt to ischemic or high-calorie stress is therefore an issue of considerable interest.
IV. Perspectives and Speculation
The preceding sections sketch the broad literature that exists pertaining to stress-induced proteins and their respective roles as protein chaperones. Despite the existence of sequence homologies between GRP78 and HSP70/HSC70 and between GRP94 and HSP90 (2-5, 22-24), the two sets of stress proteins function quite differently. They are induced in response to different chemicals and conditions in mammalian cells and localize to different subcellular compartments. In this review we have chosen to view ER stressors as those perturbants that inhibit translation and ER protein folding or processing while subsequently inducing the ER resident chaperones GRP78 and GRP94. The ER stress response system is activated by Ca2+-mobilizing or thiol-reducing agents. In contrast, those perturbants that inhibit translation and protein folding in the cytoplasm while inducing the HSPs are viewed as cytoplasmic or “heat shock” stressors. The cytoplasmic stress response system is activated by oxidants, including free radicals, and by heavy metal ions. Comparison of the effects of the cytoplasmic stressor sodium arsenite with those of the ER stressors ionomycin, thapsigargin, and dithiothreitol emphasizes the parallel construction of these two stress systems (Table V; Fig. 2). The two systems are activated and apparently operate independently, with the exception that both suppress translational initiation through a common mechanism involving the activation of an eIF-2α kinase. A lack of other candidates, in conjunction with several lines of persuasive supporting evidence (Table IV), indicates that this kinase is PKR. Induction of either the GRPs or HSPs is associated with the development of equivalent degrees of translational accommodation, cross-tolerance to either class of stressor, and eIF-2α dephosphorylation. The two chaperone systems, by virtue of regulating eIF-2 kinase activity, permit rates of protein processing to be coordinated with rates of translational initiation through alterations in the rate of recycling of eIF-2 and ribosomal loading on mRNA. The interactions of GRP78, as we understand them, in regulating translational initiation are modeled in Fig. 3. GRP78 is generally thought to assist the folding of polypeptide chains entering the ER during co-translational translocation. Two additional putative roles for GRP78 are indicated in the model. First, the chaperone inhibits the activity of PKR, an enzyme associated with the 60S ribosomal subunit, which in turn is associated with the ER outer membrane. Interaction of the chaperone with PKR presumably occurs through a membrane-spanning subunit of the kinase. Second, GRP78 must interact with a protein involved with ER-to-nucleus signaling that sponsors induction of grp78 mRNA synthesis. This protein would perform as the mammalian equivalent of Ire1p.
FIG. 2. Stress response of mammalian cells. [Flow diagram with two parallel pathways. Heat shock response: (1) elevated temperatures, sulfhydryl poisons, reactive oxygen species; (2) protein misfolding, damage, and aggregation in cytosol, mitochondria, and nucleus; (3) activation of HSF; (4) induction of HSP expression (increased transcription, preferential translation). ER stress response: (1) depletion of calcium stores, sulfhydryl reducing agents, abnormal secretory protein expression; (2) protein misfolding in the ER, incomplete processing or subunit assembly, incorrect disulfide bonding; (3) activation of the “unfolded protein response” pathway; (4) induction of GRP expression (increased transcription, preferential translation). Shared by both pathways: global translational suppression, followed by tolerance to translational suppression and recovery of cell functions.]
FIG. 3. Coordination of rates of protein processing and protein synthesis. [Two-panel flow diagram. Back regulation, acute (min), giving decreased available GRP78 and decreased ribosomal loading on mRNA. Events: (1) inhibition of protein folding by ER stressors (calcium mobilizers, dithiothreitol, tunicamycin, overproduction of unprocessible proteins); (2) binding of GRP78 to newly synthesized unfolded proteins; (3) decreased GRP78 availability; (4) increased eIF-2α kinase (PKR) activity; (5) eIF-2α phosphorylation and decreased eIF-2B activity; (6) decreased eIF-2 cycling and decreased initiation rates. Forward regulation, long term (h), giving induction of GRP78 and increased ribosomal loading on mRNA. Events: (1) inhibition of protein folding by ER stressors; (2) binding of GRP78 to newly synthesized unfolded proteins; (3) activation of IRE?, induction of grp78 mRNA, induction of GRP78; (4) increased GRP78 availability; (5) decreased eIF-2α kinase; (6) eIF-2α dephosphorylation and increased eIF-2B activity; (7) increased eIF-2 cycling and increased initiation rates.]
In this model GRP78 is seen to function as a central mediator in the acute inhibition of translational initiation (“back regulation”) and in the subsequent recovery from (accommodation to) this inhibition upon induction of the chaperone (“forward regulation”). Back regulation is viewed as a cascade of events occurring in response to any slowing of protein processing that results in GRP78 binding to unfolded or misfolded protein. Under such circumstances the chaperone would be drawn from binding sites associated with the suppression of PKR. The extent of inhibition of initiation would correspond to the degree of PKR activation. Forward regulation, in contrast, is visualized as the subtraction of GRP78 from binding sites that normally inhibit the induction of the protein. Induction of the chaperone clearly reverses eIF-2α phosphorylation and must involve some suppression of PKR activity. While induction partially restores rates of translational initiation, it does not appear to relieve the block in protein processing mediated by ER stressors. Sustained initiation must require a continued synthesis of GRP78. Similar modeling could also be advanced for the action of cytoplasmic stressors on translational initiation. At this point, however, it is unclear which of the HSP chaperones is (are) actually involved in the suppression of PKR. The most likely candidates, based on homologies with GRP78, would be HSC70 and HSP70. It should also be noted that the preceding model (Fig. 3) should suitably describe dynamics associated with any rapid slowdown in translational initiation associated with decreased protein-processing capacity or misfolded protein formation. Such changes would be expected during hormonal mobilization of ER-sequestered Ca2+, the introduction of amino acid analogs, exposure to various proteotoxic agents, or severe hypoxia, as, for example, from a myocardial infarct or a cerebral blood clot. This model does not describe alterations occurring when translation is slowed relative to protein-processing capacity, for example, when cells are moved from nutrient-rich to minimal media or exposed to various translational inhibitors. It is known, however, that GRP78 undergoes ADP ribosylation and oligomerization to a form that is believed to be inactive (112-116). This modification is fostered by cycloheximide and is reversible. Presumably the ADP-ribosylated form represents a “resting” pool that excess chaperone enters during periods when protein-processing capacity exceeds translational rates. The rapidity and extent to which this modification or its reversal occurs in vivo remain to be determined.
As noted previously, the sensitivity of eIF-2α to phosphorylation in response to ER stressors diminishes rapidly following translational arrest with cycloheximide (83). HSC70 is also believed to oligomerize in the absence of substrates (108-111) but not to undergo covalent modification(s).
We hope that this review focuses attention on the role of GRP78 as a key regulatory component of translational initiation and emphasizes the function of the ER as an integrating organelle. Other open questions in addition to those discussed earlier include the following: How does Ca2+ affect the function of GRP78, protein folding, and the assembly of quaternary protein structures? Does activation of PKR by ER or cytoplasmic stress require autophosphorylation of the kinase, and how is the dsRNA requirement of the enzyme satisfied during the stress activation? What is the nature of the interaction of GRP78 with PKR, and does this interaction result in occupation of the active site of the kinase by an inhibitor with a “pseudosubstrate sequence”? What is the nature of the ER-to-nucleus signaling system in the induction of grp78 mRNA, and why is this induction suppressed by inhibitors of translational elongation such as cycloheximide and puromycin? How is the loading of ribosomes onto grp78 mRNA regulated?
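The qualitative logic of back and forward regulation (Fig. 3) can be made concrete with a small numerical sketch. The Python fragment below is an illustration only and is not taken from the work reviewed here: the hyperbolic dependence of PKR activity on free GRP78, the linear dependence of initiation rate on PKR activity, the pool sizes, and every parameter name (`k_half`, `induction_rate`, `recovery_fraction`) are assumptions chosen for readability. It shows an acute fall in initiation when unfolded protein sequesters GRP78, followed by a slow and only partial recovery as the chaperone is induced.

```python
from dataclasses import dataclass


@dataclass
class CellState:
    grp78_total: float  # total GRP78 pool (arbitrary units, assumed)
    unfolded: float     # unfolded-protein load that sequesters GRP78


def free_grp78(state: CellState) -> float:
    """GRP78 not bound to unfolded protein (never negative)."""
    return max(state.grp78_total - state.unfolded, 0.0)


def pkr_activity(free: float, k_half: float = 1.0) -> float:
    """Relative PKR activity; assumed to fall hyperbolically as free GRP78 rises."""
    return k_half / (k_half + free)


def initiation_rate(pkr: float) -> float:
    """Relative initiation rate; assumed to fall linearly with PKR activity."""
    return 1.0 - pkr


def simulate(stress_load: float, steps: int, induction_rate: float = 0.05,
             recovery_fraction: float = 0.5) -> list[float]:
    """Return the relative initiation rate at each time step."""
    state = CellState(grp78_total=10.0, unfolded=1.0)
    # Forward regulation is assumed to restore only part of the unstressed
    # free-chaperone pool, in line with the partial recovery described above.
    target_free = recovery_fraction * free_grp78(state)
    trace = []
    for t in range(steps):
        if t == 1:
            state.unfolded = stress_load  # ER stressor applied at step 1
        free = free_grp78(state)
        # Back regulation: less free GRP78 means more PKR activity
        # and therefore a lower initiation rate.
        trace.append(initiation_rate(pkr_activity(free)))
        # Forward regulation: induce GRP78 in proportion to the shortfall.
        shortfall = max(target_free - free, 0.0)
        state.grp78_total += induction_rate * shortfall
    return trace


rates = simulate(stress_load=9.5, steps=200)
# Prints: unstressed 0.90, acute 0.33, accommodated 0.82
print(f"unstressed {rates[0]:.2f}, acute {rates[1]:.2f}, accommodated {rates[-1]:.2f}")
```

Because the unfolded load is held constant after step 1, the sketch also reflects the point that induction restores initiation without relieving the underlying block in protein processing. It makes no claim about the actual kinetics or stoichiometry of any of these interactions.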
ACKNOWLEDGMENTS
We thank the American Diabetes Association for recent support of our work and Dr. Randy Kaufman for providing pertinent manuscripts prior to publication.
REFERENCES
1. L. E. Hightower, Cell 66, 191 (1991).
2. M.-J. Gething and J. Sambrook, Nature (London) 355, 33 (1992).
3. E. A. Craig, B. D. Gambill, and R. J. Nelson, Microbiol. Rev. 57, 402 (1993).
4. J. P. Hendrick and F.-U. Hartl, Annu. Rev. Biochem. 62, 349 (1993).
5. W. J. Welch, Phil. Trans. R. Soc. Lond. Ser. B 339, 327 (1993).
6. J. Buchner, FASEB J. 10, 10 (1996).
7. J. W. B. Hershey, Annu. Rev. Biochem. 60, 717 (1991).
8. C. G. Proud, Curr. Topics Cell Regul. 32, 243 (1992).
9. W. C. Merrick, Microbiol. Rev. 56, 291 (1992).
10. R. E. Rhoads, J. Biol. Chem. 268, 3017 (1993).
11. D. R. Morris, Prog. Nucleic Acid Res. Mol. Biol. 51, 339 (1995).
12. J. K. Rose and R. W. Doms, Annu. Rev. Cell Biol. 4, 257 (1988).
13. V. R. Lingappa, J. Clin. Invest. 83, 739 (1989).
14. A. Helenius, Mol. Biol. Cell 5, 253 (1994).
15. T. A. Rapoport, B. Jungnickel, and U. Kutay, Annu. Rev. Biochem. 65, 271 (1996).
16. M. A. Brostrom and C. O. Brostrom, in “Calcium and Cell Function 5” (W. Y. Cheung, ed.), p. 165. Academic Press, New York, 1984.
17. R. V. Farese, Science 173, 447 (1971).
18. E. Kaplan and H. G. Richman, Can. J. Biochem. 51, 1331 (1973).
19. N. Ruiz and M. Krauskopf, Life Sci. 27, 2359 (1980).
20. C. J. Wilde, H. R. Hasan, D. A. White, and R. J. Mayer, Biochem. Biophys. Res. Commun. 103, 934 (1981).
21. D. N. Burton, J. M. Collins, and J. W. Porter, J. Biol. Chem. 244, 1076 (1969).
22. W. J. Welch, Physiol. Rev. 4, 1063 (1992).
23. A. S. Lee, Curr. Opin. Cell Biol. 4, 267 (1992).
24. E. Little, M. Ramakrishnan, B. Roy, G. Gazit, and A. S. Lee, Crit. Rev. Eukaryotic Gene Expression 4, 1 (1994).
25. R. H. Burdon, Biochem. J. 240, 313 (1986).
26. S. Lindquist, Annu. Rev. Biochem. 55, 1151 (1986).
27. W. Levinson, H. Opperman, and J. Jackson, Biochim. Biophys. Acta 606, 170 (1980).
28. R. I. Morimoto, Science 259, 1409 (1993).
29. J. Lis and C. Wu, Cell 74, 1 (1994).
30. J. Frydman, E. Nimmesgern, K. Ohtsuka, and F. U. Hartl, Nature (London) 370, 111 (1994).
31. S. A. Lewis, G. Tian, I. E. Vainberg, and N. J. Cowan, J. Cell Biol. 132, 1 (1996).
32. S. Jindal, TIBTECH 14, 17 (1996).
33. C. Lee and L. B. Chen, Cell 54, 37 (1988).
34. M. Green and R. A. Mazzarella, in “Protein Transfer and Organelle Biogenesis” (R. C. Das and P. W. Robbins, eds.), p. 243. Academic Press, Orlando, FL, 1988.
35. G. L. E. Koch, BioEssays 12, 527 (1990).
36. E. Carafoli, Annu. Rev. Biochem. 56, 395 (1987).
37. J. Meldolesi and T. Pozzan, Exp. Cell Res. 171, 271 (1987).
38. M. J. Berridge, Nature (London) 341, 197 (1989).
39. R. Y. Tsien, Annu. Rev. Cell Biol. 6, 715 (1990).
40. B. Walz and O. Baumann, Prog. Histochem. Cytochem. 20, 1 (1989).
41. J. Meldolesi, L. Madeddu, and T. Pozzan, Biochim. Biophys. Acta 1055, 130 (1990).
42. M. F. Rossier and J. W. Putney, TINS 14, 310 (1991).
43. M. J. Berridge, Cell Calcium 12, 63 (1991).
44. O. Baumann, B. Walz, A. V. Somlyo, and A. P. Somlyo, Proc. Natl. Acad. Sci. U.S.A. 88, 741 (1991).
45. R. D. Burgoyne and T. R. Cheek, TIBS 16, 319 (1991).
46. M. Terasaki and C. Sardet, J. Cell Biol. 115, 1031 (1991).
47. J. Lytton and S. K. Nigam, Curr. Opin. Cell Biol. 4, 220 (1992).
48. M. A. Brostrom and C. O. Brostrom, in “Nutrition and Gene Expression” (C. D. Berdanier and J. L. Hargrove, eds.), p. 117. CRC Press, Boca Raton, FL, 1993.
49. M. A. Brostrom, W. L. W. Ling, D. Gmitter, and C. O. Brostrom, Biochem. J. 304, 499 (1994).
50. H. R. B. Pelham, Annu. Rev. Cell Biol. 5, 1 (1989).
51. C. Hwang, A. J. Sinskey, and J. F. Lodish, Science 257, 1496 (1992).
52. M. S. Poruchynsky, D. R. Maass, and P. H. Atkinson, J. Cell Biol. 114, 651 (1991).
53. H. F. Lodish, N. Kong, and L. Wikstrom, J. Cell Biol. 267, 12753 (1992).
54. L. Wikstrom and H. F. Lodish, J. Biol. Chem. 268, 14412 (1993).
55. H. F. Lodish and N. Kong, J. Biol. Chem. 265, 10893 (1990).
56. G. Kuznetsov, M. A. Brostrom, and C. O. Brostrom, J. Biol. Chem. 267, 3932 (1992).
57. G. Kuznetsov, M. A. Brostrom, and C. O. Brostrom, J. Biol. Chem. 268, 2001 (1993).
58. T. Wileman, L. P. Kane, G. Carson, and C. Terhorst, J. Biol. Chem. 266, 4500 (1991).
59. Y. S. Tsao, N. E. Ivessa, M. Adesnik, D. D. Sabatini, and G. Kreibich, J. Cell Biol. 116, 57 (1992).
60. S. K. Nigam, A. L. Goldberg, S. Ho, M. F. Rohde, K. T. Bush, and M. Y. Sherman, J. Biol. Chem. 269, 1744 (1994).
61. J. J. M. Bergeron, M. B. Brenner, D. Y. Thomas, and D. B. Williams, TIBS 19, 124 (1994).
62. I. G. Haas, Experientia 50, 1012 (1994).
63. A. D. Elbein, Annu. Rev. Biochem. 56, 497 (1987).
64. R. Urade, Y. Takenada, and M. Kito, J. Biol. Chem. 268, 22004 (1993).
65. K. Rupp, U. Birnbach, J. Lundstrom, P. N. Van, and H.-D. Soling, J. Biol. Chem. 269, 2501 (1994).
66. I. G. Haas, Curr. Topics Microbiol. Immunol. 167, 71 (1991).
67. J. P. Vogel, L. M. Misra, and M. D. Rose, J. Cell Biol. 110, 1885 (1990).
68. S. L. Sanders, K. M. Whitfield, J. P. Vogel, M. D. Rose, and R. W. Schekman, Cell 69, 353 (1992).
69. J. Melnick, J. L. Dul, and Y. Argon, Nature (London) 370, 373 (1994).
70. C. E. Shamu, J. S. Cox, and P. Walter, TICB 4, 56 (1994).
71. L.-J. Li, X. Li, A. Ferrario, N. Rucker, E. S. Liu, S. Wong, C. J. Gomer, and A. S. Lee, J. Cell. Physiol. 153, 575 (1992).
72. S. C. Chang, A. E. Erwin, and A. S. Lee, Mol. Cell. Biol. 9, 2153 (1989).
73. X. Li and A. S. Lee, Mol. Cell. Biol. 11, 3446 (1991).
74. S. K. Wooden, L.-J. Li, D. Navarro, I. Qadri, L. Pereira, and A. S. Lee, Mol. Cell. Biol. 11, 5612 (1991).
75. W. W. Li, S. Alexandre, C. Cao, and A. S. Lee, J. Biol. Chem. 268, 12003 (1993).
76. W. W. Li, L. Sistonen, R. I. Morimoto, and A. S. Lee, Mol. Cell. Biol. 14, 5533 (1994).
77. M. A. Brostrom, C. Cade, C. R. Prostko, D. Gmitter-Yellen, and C. O. Brostrom, J. Biol. Chem. 265, 20539 (1990).
78. C. R. Prostko, M. A. Brostrom, E. M. Galuska-Malara, and C. O. Brostrom, J. Biol. Chem. 266, 19790 (1991).
79. C. O. Brostrom, C. R. Prostko, R. J. Kaufman, and M. A. Brostrom, J. Biol. Chem. 271, 24995 (1996).
80. M. A. Brostrom, X. Lin, C. Cade, D. Gmitter, and C. O. Brostrom, J. Biol. Chem. 264, 1638 (1989).
81. E. Resendez, J. Ting, K. S. Kim, S. K. Wooden, and A. S. Lee, J. Cell Biol. 103, 2145 (1986).
82. Y. K. Kim, K. S. Kim, and A. S. Lee, J. Cell. Physiol. 133, 553 (1989).
83. M. A. Brostrom, C. R. Prostko, D. Gmitter, and C. O. Brostrom, J. Biol. Chem. 270, 4127 (1995).
84. J. Ting, S. K. Wooden, R. Kriz, K. Kelleher, R. J. Kaufman, and A. S. Lee, Gene 55, 147 (1987).
85. P. Sarnow, Proc. Natl. Acad. Sci. U.S.A. 86, 5795 (1989).
86. D. G. Macejak and P. Sarnow, Nature (London) 353, 90 (1991).
87. K. Normington, K. Kohno, Y. Kozutsumi, M.-J. Gething, and J. Sambrook, Cell 57, 1223 (1989).
88. M. D. Rose, L. M. Misra, and J. P. Vogel, Cell 57, 1211 (1989).
89. M. LaMantia, T. Miura, H. Tachikawa, H. A. Kaplan, W. J. Lennarz, and T. Mizunaga, Proc. Natl. Acad. Sci. U.S.A. 88, 4453 (1991).
90. J. S. Cox, C. E. Shamu, and P. Walter, Cell 73, 1197 (1993).
91. J. A. Partaledis and V. Berlin, Proc. Natl. Acad. Sci. U.S.A. 90, 5450 (1993).
92. K. Mori, A. Sant, K. Kohno, K. Normington, M.-J. Gething, and J. Sambrook, EMBO J. 11, 2583 (1992).
93. K. Mori, W. Ma, M.-J. Gething, and J. Sambrook, Cell 74, 743 (1993).
94. S. K. Hanks and T. Hunter, FASEB J. 9, 576 (1995).
95. A. A. Welihinda and R. J. Kaufman, J. Biol. Chem. 271, 18181 (1996).
96. K. G. Hardwick, M. J. Lewis, J. Semenza, N. Dean, and H. R. B. Pelham, EMBO J. 9, 623 (1990).
97. D. T. W. Ng, S. S. Watowich, and R. A. Lamb, Mol. Biol. Cell 3, 142 (1992).
98. K. Kohno, K. Normington, J. Sambrook, M.-J. Gething, and K. Mori, Mol. Cell. Biol. 13, 877 (1993).
99. R. Joachim, W. Voos, and N. Pfanner, TICB 5, 297 (1995).
100. S. R. Terlecky, H.-L. Chiang, T. S. Olson, and J. F. Dice, J. Biol. Chem. 267, 9202 (1992).
101. H. Wiech, J. Buchner, M. Zimmerman, R. Zimmerman, and U. Jakob, J. Biol. Chem. 268, 7414 (1993).
102. B. Gao, J. Biosca, E. A. Craig, L. E. Green, and E. Eisenberg, J. Biol. Chem. 266, 19565 (1991).
103. A. M. Fourie, J. F. Sambrook, and M.-J. Gething, J. Biol. Chem. 269, 30470 (1994).
104. K. Prasad, J. Heuser, E. Eisenberg, and L. Greene, J. Biol. Chem. 269, 6931 (1994).
105. L. E. Greene, R. Zinner, S. Naficy, and E. Eisenberg, J. Biol. Chem. 270, 2967 (1995).
106. J. S. McCarty, A. Buchberger, J. Reinstein, and B. Bucau, J. Mol. Biol. 249, 126 (1995).
107. J. Wei, J. R. Gaut, and L. M. Hendershot, J. Biol. Chem. 270, 26677 (1995).
108. D. R. Palleros, K. Reid, L. Shi, and A. L. Fink, FEBS Lett. 336, 124 (1993).
109. H. Toledo, A. Carlino, V. Vidal, B. Redfield, M. Y. Nettleton, J. P. Kochan, N. Brot, and H. Weissbach, Proc. Natl. Acad. Sci. U.S.A. 90, 2505 (1993).
110. N. Benaroudji, F. Triniolles, and M. M. Ladjimi, J. Biol. Chem. 271, 18471 (1996).
111. B. Gao, E. Eisenberg, and L. Greene, J. Biol. Chem. 271, 16792 (1996).
112. P. Freiden, J. Gaut, and L. M. Hendershot, EMBO J. 11, 63 (1992).
113. L. M. Hendershot, J. Ting, and A. S. Lee, Mol. Cell. Biol. 8, 4250 (1988).
114. L. Carlsson and E. Lazarides, Proc. Natl. Acad. Sci. U.S.A. 80, 4664 (1983).
115. G. H. Leno and B. E. Ledford, Eur. J. Biochem. 186, 203 (1989).
116. J. M. Staddon, M. M. Bouzyk, and E. Rozengurt, J. Biol. Chem. 267, 2539 (1992).
117. M. Altmann and H. Trachsel, TIBS 18, 429 (1993).
118. S. Mader and N. Sonenberg, Biochimie 77, 40 (1995).
119. A. C. Gingras, Y. Svitkin, G. J. Belsham, A. Pause, and N. Sonenberg, Proc. Natl. Acad. Sci. U.S.A. 93, 5578 (1996).
120. B. Safer, Cell 33, 7 (1983).
121. A. G. Rowlands, R. Panniers, and E. C. Henshaw, J. Biol. Chem. 263, 5526 (1988).
122. S. R. Kimball, A. M. Karinch, R. C. Feldhoff, H. Mellor, and L. S. Jefferson, Biochim. Biophys. Acta 1201, 473 (1994).
123. S. R. Kimball and L. S. Jefferson, Biochem. Biophys. Res. Commun. 212, 1074 (1995).
124. O. Thastrup, P. J. Cullen, B. K. Drobak, M. R. Hanley, and A. P. Dawson, Proc. Natl. Acad. Sci. U.S.A. 87, 2466 (1990).
125. J. W. Westley, in “Polyether Antibiotics: Naturally Occurring Acid Ionophores: Vol. 1. Biology” (J. W. Westley, ed.), p. 1. Marcel Dekker, New York, 1982.
126. E. I. Rotman, M. A. Brostrom, and C. O. Brostrom, Biochem. J. 282, 487 (1992).
127. M. A. Brostrom, C. R. Prostko, D. Gmitter-Yellen, L. J. Grandison, G. Kuznetsov, W. L. Wong, and C. O. Brostrom, J. Biol. Chem. 266, 7037 (1991).
128. P. R. Albert and A. H. Tashjian, Jr., J. Biol. Chem. 259, 15350 (1984).
129. E. Carafoli and M. Chiesi, Curr. Top. Cell Regul. 32, 209 (1992).
130. C. O. Brostrom, S. B. Bocckino, M. A. Brostrom, and E. M. Galuska, Mol. Pharmacol. 29, 104 (1986).
131. C. O. Brostrom, S. B. Bocckino, and M. A. Brostrom, J. Biol. Chem. 258, 14390 (1983).
132. M. A. Brostrom, C. O. Brostrom, S. B. Bocckino, and S. S. Green, J. Cell. Physiol. 121, 291 (1984).
133. S. E. Wolfe, C. O. Brostrom, and M. A. Brostrom, Mol. Pharmacol. 29, 411 (1986).
134. S. E. Wolfe and M. A. Brostrom, Mol. Pharmacol. 29, 420 (1986).
135. K.-V. Chin, C. Cade, M. A. Brostrom, E. M. Galuska, and M. A. Brostrom, J. Biol. Chem. 262, 16509 (1987).
136. M. A. Brostrom, K.-V. Chin, C. Cade, D. Gmitter, and C. O. Brostrom, J. Biol. Chem. 262, 16515 (1987).
137. C. O. Brostrom, K.-V. Chin, W. L. Wong, C. Cade, and M. A. Brostrom, J. Biol. Chem. 264, 1644 (1989).
138. W. L. Wong, M. A. Brostrom, G. Kuznetsov, D. Gmitter-Yellen, and C. O. Brostrom, Biochem. J. 289, 71 (1993).
139. K.-V. Chin, C. Cade, M. A. Brostrom, and C. O. Brostrom, Int. J. Biochem. 20, 1313 (1988).
140. S. R. Kimball and L. S. Jefferson, Am. J. Physiol. 263, E958 (1992).
141. D. Gmitter, C. O. Brostrom, and M. A. Brostrom, Cell Biol. Toxicol. 12, 101 (1996).
142. W. J. Strittmatter, C. B. Couch, and D. I. Mundy, in “Cell Fusion” (A. E. Sowers, ed.), p. 99. Plenum Press, New York, 1987.
143. C. R. Prostko, M. A. Brostrom, E. M. Malara, and C. O. Brostrom, J. Biol. Chem. 267, 16751 (1992).
144. C. R. Prostko, M. A. Brostrom, and C. O. Brostrom, Mol. Cell. Biochem. 127/128, 255 (1993).
145. C. R. Prostko, J. N. Dholakia, M. A. Brostrom, and C. O. Brostrom, J. Biol. Chem. 270, 6211 (1995).
146. S. R. Kimball and L. S. Jefferson, J. Biol. Chem. 265, 16794 (1990).
147. E. H. Fawell, I. J. Boyer, M. A. Brostrom, and C. O. Brostrom, J. Biol. Chem. 264, 1650 (1989).
148. J. A. Traugh and A. M. Pendergast, Prog. Nucleic Acid Res. Mol. Biol. 33, 195 (1986).
149. R. Panniers, Biochimie 76, 737 (1994).
150. A. S. Olsen, D. F. Triemer, and M. M. Sanders, Mol. Cell. Biol. 3, 2017 (1983).
151. P. Murtha-Riel, M. V. Davies, B. J. Scherer, S. Y. Choi, J. W. B. Hershey, and R. J. Kaufman, J. Biol. Chem. 268, 12946 (1993).
152. A. G. Hinnebusch, Microbiol. Rev. 52, 248 (1988).
153. C. E. Samuel, J. Biol. Chem. 268, 7603 (1993).
154. R. C. Wek, TIBS 19, 491 (1994).
155. J. S. Crosby, K. Lee, I. M. London, and J.-J. Chen, Mol. Cell. Biol. 14, 3906 (1994).
156. J.-J. Chen and I. M. London, TIBS 20, 105 (1995).
157. R. L. Matts, Z. Xu, J. K. Pal, and J.-J. Chen, J. Biol. Chem. 267, 18160 (1992).
158. R. L. Matts and R. Hurst, J. Biol. Chem. 267, 18168 (1992).
159. R. L. Matts, R. Hurst, and Z. Xu, Biochemistry 32, 7323 (1993).
160. M. Gross, A. Olin, S. Hessefort, and S. Bender, J. Biol. Chem. 269, 22738 (1994).
161. R. Jagus and M. M. Gray, Biochimie 76, 779 (1994).
162. C. G. Proud, TIBS 20, 241 (1995).
163. S. J. McCormack, L. G. Ortega, J. P. Doohan, and C. E. Samuel, Virology 198, 92 (1994).
164. S. B. Lee, S. R. Green, M. B. Matthews, and M. Esteban, Proc. Natl. Acad. Sci. U.S.A. 91, 10551 (1994).
165. R. C. Patel, P. Stanton, and G. C. Sen, J. Biol. Chem. 269, 18593 (1994).
166. R. C. Patel, P. Stanton, N. M. J. McMillan, B. R. G. Williams, and G. C. Sen, Proc. Natl. Acad. Sci. U.S.A. 92, 8283 (1995).
167. G. P. Cosentino, S. Venkatesan, F. C. Serluca, S. R. Green, M. B. Matthews, and N. Sonenberg, Proc. Natl. Acad. Sci. U.S.A. 92, 9445 (1995).
168. S. Wu and R. J. Kaufman, J. Biol. Chem. 271, 1756 (1996).
169. R. J. Kaufman and P. Murtha, Mol. Cell. Biol. 7, 1568 (1987).
170. R. J. Kaufman, M. V. Davies, V. K. Pathak, and J. W. B. Hershey, Mol. Cell. Biol. 9, 946 (1989).
171. M. G. Katze, Trends Microbiol. 3, 75 (1995).
172. R. Petryshyn, J.-J. Chen, and I. M. London, Proc. Natl. Acad. Sci. U.S.A. 85, 1427 (1988).
173. L. J. Mundschau and D. V. Faller, J. Biol. Chem. 267, 23092 (1992).
174. T. Ito, R. Jagus, and W. S. May, Proc. Natl. Acad. Sci. U.S.A. 91, 7455 (1994).
175. L. J. Mundschau and D. V. Faller, J. Biol. Chem. 270, 3100 (1995).
176. A. E. Koromilas, S. Roy, G. N. Barber, M. G. Katze, and N. Sonenberg, Science 257, 1685 (1992).
177. D. C. Thomis and C. E. Samuel, Proc. Natl. Acad. Sci. U.S.A. 89, 10837 (1992).
178. E. F. Meurs, J. Galabru, G. N. Barber, M. G. Katze, and A. G. Hovanessian, Proc. Natl. Acad. Sci. U.S.A. 90, 232 (1993).
179. O. Donze, R. Jagus, A. E. Koromilas, J. W. B. Hershey, and N. Sonenberg, EMBO J. 14, 3828 (1995).
180. N. A. J. McMillan, B. W. Carpick, B. Hollis, W. M. Toone, M. Zamanian-Daryoush, and B. R. G. Williams, J. Biol. Chem. 270, 2601 (1995).
181. S. Kirchoff, A. E. Koromilas, F. Schaper, M. Grashoff, N. Sonenberg, and H. Hauser, Oncogene 11, 439 (1995).
182. L. Baretta, M. Gabbay, R. Berger, S. M. Hanash, and N. Sonenberg, Oncogene 12, 1593 (1996).
183. A. Kumar, J. Haque, J. Lacoste, J. Hiscott, and B. R. G. Williams, Proc. Natl. Acad. Sci. U.S.A. 91, 6288 (1994).
184. A. Maran, R. K. Maitra, A. Kumar, B. Dong, W. Xiao, G. Li, B. R. G. Williams, P. F. Torrence, and R. H. Silverman, Science 265, 789 (1994).
185. A. E. Koromilas, C. Cantin, A. W. B. Craig, R. Jagus, J. Hiscott, and N. Sonenberg, J. Biol. Chem. 270, 25426 (1995).
186. J. Chou, J.-J. Chen, M. Gross, and B. Roizman, Proc. Natl. Acad. Sci. U.S.A. 92, 10516 (1995).
187. S. P. Srivastava, M. V. Davies, and R. J. Kaufman, J. Biol. Chem. 270, 16619 (1995).
188. W. L. Wong, M. A. Brostrom, and C. O. Brostrom, Int. J. Biochem. 23, 605 (1991).
189. A. G. Ryazanov and A. S. Spirin, in “Translational Regulation of Gene Expression 2” (J. Ilan, ed.), p. 433. Plenum Press, New York, 1993.
189a. N. T. Redpath, N. T. Price, K. V. Severinov, and C. G. Proud, Eur. J. Biochem. 213, 689 (1993).
190. H. C. Palfrey and A. C. Nairn, Adv. Sec. Mess. Phosphoprotein Res. 30, 191 (1995).
191. H. C. Palfrey, A. C. Nairn, L. L. Muldoon, and M. L. Villereal, J. Biol. Chem. 262, 9875 (1987).
192. K. P. Mackie, A. C. Nairn, G. Hampel, G. Lam, and E. A. Jaffe, J. Biol. Chem. 264, 1748 (1989).
193. A. C. Nairn, R. A. Nichols, M. J. Brady, and H. C. Palfrey, J. Biol. Chem. 262, 14265 (1987).
194. A. Laitusis, K. P a w , A. G. Ryazanov, C. O. Brostrom, and M. A. Brostrom, Mol. Biol. Cell 6, 79a (1995).
195. A. G. Ryazanov and A. S. Spirin, New Biologist 2, 843 (1990).
196. J. E. Celis, P. Madsen, and A. G. Ryazanov, Proc. Natl. Acad. Sci. U.S.A. 87, 4231 (1990).
197. K. V. Severinov, E. G. Melnikova, and A. G. Ryazanov, New Biologist 2, 887 (1990).
198. S. W. Carper, J. J. Duffy, and E. W. Gerner, Cancer Res. 47, 5249 (1987).
199. T. Nakaki, R. J. Deans, and A. S. Lee, Mol. Cell. Biol. 9, 2233 (1989).
199a. J. A. Morris, A. J. Dorner, C. A. Edwards, L. M. Hendershot, and R. J. Kaufman, J. Biol. Chem. 272, 4327 (1997).
200. R. D. Klausner, J. G. Donaldson, and J. Lippincott-Schwartz, J. Cell Biol. 116, 1071 (1992).
201. J. J. Sciandra, J. R. Subjeck, and C. S. Hughes, Proc. Natl. Acad. Sci. U.S.A. 81, 4843 (1984).
202. J. B. Tillman, P. L. Mote, J. M. Dhahbi, R. L. Walford, and S. R. Spindler, J. Nutr. 126, 416 (1996).
203. D. L. Feinstein, E. Galea, D. A. Aquino, G. C. Li, H. Xu, and D. J. Reis, J. Biol. Chem. 271, 17724 (1996).
Lactose Repressor Protein: Functional Properties and Structure
KATHLEEN SHIVE MATTHEWS AND JEFFRY C. NICHOLS
Department of Biochemistry and Cell Biology
Rice University
Houston, Texas 77251
I. Lactose Repressor Protein
   A. Domain Structure
   B. Assembly of Tetrameric Protein
II. DNA Binding
   A. Identification of Operator Sequence
   B. Search for the Operator Site
   C. Loop Formation
   D. Thermodynamics and Kinetics of Operator Binding
   E. Residues Involved in LacI-LacO Interaction
III. Inducer Binding
   A. Sugars-Inducers and Anti-inducers
   B. Thermodynamics and Kinetics
   C. Residues Involved in LacI-Sugar Interaction
IV. Structure and Function
   A. Genetic Studies
   B. Chemical and Spectroscopic Studies
   C. Conformational Change
V. NMR and X-ray Crystallographic Structures
   A. N-Terminal Domain
   B. Core, Intact, and Ligand-Bound Forms
   C. Integrating Physical, Chemical, and Genetic Information
VI. Applications of LacI Control
VII. Conclusion and Prospects for the Future
References
The lactose repressor protein (LacI), the prototype for genetic regulatory proteins, controls expression of lactose metabolic genes by binding to its cognate operator sequences in E. coli DNA. Inducer binding elicits a conformational change that diminishes affinity for operator sequences with no effect on nonspecific binding. The release of operator is followed by synthesis of mRNA encoding the enzymes for lactose utilization. Genetic, chemical and physical studies provided detailed insight into the function of this protein prior to the recent completion of X-ray crystallographic structures. The structural information can now be correlated with the phenotypic data for numerous mutants. These structures also provide the opportunity for physical and chemical studies on mutants designed to examine various aspects of lac repressor structure and function. In addition to providing insight into protein structure-function correlations, LacI has been utilized in a wide variety of applications both in prokaryotic gene expression and in eukaryotic gene regulation and studies of mutagenesis. © 1998 Academic Press

Progress in Nucleic Acid Research and Molecular Biology, Vol. 58. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
Differential gene expression provides the ability to respond to a constantly changing external environment in prokaryotes and to generate signaling cascades for life cycle demands in eukaryotes. The specific constellation of proteins required for optimal function varies with cellular age and context. Thus, protein production is carefully regulated by multiple mechanisms that modulate both transcriptional and translational pathways. Control of transcription initiation by RNA polymerase is a predominant mechanism for regulating expression of specific proteins, presumably because it provides maximal conservation of energy for the cell. Jacob and Monod first postulated the existence of a soluble regulatory agent that controlled expression of the lactose metabolic enzymes in Escherichia coli (1) based on their studies of bacterial sugar metabolism. Their visionary hypothesis was quickly confirmed by experiments that demonstrated the presence of the lactose repressor protein (LacI) (2, 3), its specific binding to operator DNA sequences, and modulation of this DNA binding by inducer sugars (4, 5). Monod and colleagues also proposed an allosteric model that was derived, at least in part, from the properties of this regulatory system (6). These two models, one for controlling expression of genetic information and one for conformational alterations to generate distinct binding properties, form the foundation of our modern view of genetic regulation. Almost all cellular mechanisms employed to determine the rate of transcriptional initiation, whether prokaryotic or eukaryotic, involve recognition of specific sequences in DNA by proteins and alteration of this interaction by ligands and/or by homo- and heteromeric interactions. Our insight into a growing number of such systems has its roots in the experiments that led to the original operon hypothesis.
FIG. 1. Schematic of lactose operon. The symbols correspond to the following: pi, promoter for i gene; i, gene encoding lactose repressor (LacI); p, promoter for lac enzymes; o, operator sequence (LacO); z, gene encoding β-galactosidase; y, gene encoding lac permease; a, gene encoding thiogalactoside transacetylase; I, inducer; RNA pol, RNA polymerase holoenzyme. The i gene product, lactose repressor, binds to operator to preclude transcription by RNA polymerase. In the presence of inducer sugars, LacI undergoes a conformational change that diminishes affinity for LacO but does not affect binding to nonspecific DNA sites, which then compete for binding LacI-inducer. When inducer levels are depleted, LacI resumes its state with high affinity for LacO and inhibits transcription of lac mRNA.

This model predicted a soluble agent with the task of regulating production of the lactose enzymes in response to shifts in the bacterial carbohydrate environment (1). The expression of the lac enzymes is regulated by three basic components: the promoter-operator sequence, a transcriptional inhibitor [lac repressor protein (LacI)], and a transcriptional activator (CAP protein) (1, 6a). A complex of CAP with cAMP recognizes two distinct sequences within the promoter to facilitate binding and initiation by RNA polymerase, and cAMP levels are inversely related to the availability of glucose via effects on adenyl cyclase activity (6a). This arrangement ensures the most direct metabolic route to energy production through preferential utilization of glucose under conditions where lactose is also available. The partially twofold symmetrical operator DNA target sequence for LacI overlaps the promoter sequence (7). RNA polymerase binding (8), initiation (9), and/or elongation (10) are inhibited when LacI occupies this site, precluding production of the mRNA encoding the lac enzymes (Fig. 1). In the absence of lactose, the high affinity of LacI for lactose operator sequence (LacO) allows production of only small quantities of lac mRNA. When lactose is available in the environment, the low constitutive amounts of lac permease transport this sugar into the cell, and the correspondingly low levels of β-galactosidase result in production of the in vivo inducer β-1,6-allolactose (11). Binding of this natural inducer to the repressor elicits a conformational change that diminishes its affinity for the operator sequence without effect on binding to nonspecific DNA (12-16). Excess nonoperator DNA in the cell thereby effectively competes for binding to the protein and sequesters the repressor-inducer complex (12-16), allowing transcription of lac mRNA to proceed as long as lactose is available in the
medium. When lactose levels are decreased, the intracellular store of inducer is depleted by β-galactosidase hydrolysis. Under these conditions, β-1,6-allolactose dissociates from the repressor protein, which resumes its conformation with high affinity for operator DNA, associates with LacO, and shuts down further synthesis of lac mRNA. The lactose regulatory cycle therefore involves association with both specific and nonspecific DNA sequences, binding of sugar molecules, and conformational shifts in response to these ligands. The recent solution of x-ray crystallographic structures of this important genetic regulatory protein, lactose repressor (17, 18), and its close relative, purine repressor (19, 20), makes this a propitious time to look back over the past few decades of study on the lactose repressor. In particular, we wish to examine the structural and functional properties that have been discerned for the lactose repressor by a multiplicity of techniques and to compare this work to the crystallographic structures. Multiple aspects of LacI regulation have been reviewed previously in some detail (e.g., 21-28; because of the vast literature on this system, apologies are made to authors whose work is not cited directly in this chapter due to space restrictions).
I. Lactose Repressor Protein
A. Domain Structure
The lactose repressor is a homotetrameric protein of 150 kDa (2, 3) with binding sites for four inducer molecules (29, 30) and two operator DNA sequences (31, 32, 32a). Each subunit is organized into domains, an arrangement best illustrated by the ability to generate two fragments, a tetramer of 120 kDa and four monomers of ~6 kDa, using a variety of proteases under mild treatment conditions (33-35). The larger domain, termed “core protein,” is missing ~60 residues from the N-terminus, with the specific site of cleavage dependent on the protease employed for digestion. This tetrameric proteolytic product contains the determinants for subunit assembly and has sugar-binding properties similar to the intact protein (33, 36). The smaller, monomeric product consists of the N-terminal ~60 amino acids and exhibits specificity, but low affinity, for operator DNA (37-39). Early in the studies of lac repressor, this region was predicted accurately to bind DNA by a “protrusion” rather than the more conventional enzyme cleft (40). Studies of the trypsin-resistant core protein indicated a low level of operator-specific binding (31, 41) that may derive from the hinge that links the N-terminal and core domains (see later). Furthermore, hybrid tetramers of core and intact protein can be isolated, and operator DNA binding is compromised with the loss of a single N-terminal domain (42, 43).
The domain structure of the protein is also reflected in the phenotypic behavior of LacI mutants, both missense and suppressed nonsense mutations (43a, 44-54). Mutations that alter DNA binding are concentrated in the N-terminal region that can be separated from the core domain by proteolysis, whereas mutations that affect inducer binding are dispersed throughout the core domain and occur in a pattern that suggested folding of a β-sheet structure (52). Secondary structure predictions (55, 56) based on the amino acid sequence of the protein (57-59) indicated significant α-helix in addition to β-structure. Monomeric mutant proteins are produced by a small number of mutations that map in the regions of amino acids 221-227 and 270-285 (60), whereas dimeric mutants are generated by alterations at the extreme C-terminal region (amino acids 340-360) (61-67). The amino acid sequence of the protein is not unusual in character for a globular protein and has been confirmed by DNA sequence analysis (57-59). Amino acid sequence similarity has been found between different segments of the LacI primary sequence and other families of proteins, including other bacterial regulatory proteins (62, 68-71). The N-terminal domain contains a region that is similar in sequence to the helix-turn-helix (HTH) DNA recognition motif found in a number of regulatory proteins (71). The nuclear magnetic resonance (NMR) structure of this region confirms the presence of the predicted helices, and refinements of the structure for the protein-DNA complex demonstrate the structural resemblance to other HTH DNA-binding proteins (72-77), consistent with the predicted DNA binding by a “protrusion” (40). The core domain has significant similarity to the family of periplasmic sugar-binding proteins from E. coli (69, 78), and amino acids 62-323 have been aligned with the arabinose-, glucose-galactose-, and ribose-binding proteins to generate a model for the monomeric core domain (79). As indicated later, the structure of the region predicted by the model is very similar to the solved structure of the LacI tetramer complexed with isopropyl-β-D-thiogalactoside (IPTG) (root mean standard deviation <3.9 Å for residues 62-288; M. Kercher, P. Lu, and M. Lewis, personal communication). The C-terminal region of the protein exhibits sequence similarity to the leucine zipper-containing yeast regulatory protein GCN4 and the Jun family of proteins (62). Thus, this protein was predicted to consist of at least three structurally discrete domains well before crystallographic evidence for this arrangement was provided.
B. Assembly of Tetrameric Protein

FIG. 2. Schematic depicting subunit interfaces in the lactose repressor protein (left) and linked equilibrium of dimer formation with operator binding (lower right). The large dark gray ellipsoids represent the helix-turn-helix DNA-binding domain, and the small black ovals are the sugar-binding site. The coiled regions represent the leucine heptad repeat sequence. The tetramer structure (upper left and right) appears to be two dimers aligned with their N-terminal domains on the same face of the molecule connected by a four-helical bundle at the base (see text). The latter connection abolishes the two potential twofold axes of symmetry. Opening the structure (middle left) allows more facile visualization of the two subunit interfaces and shows the two potential types of dimeric species. Mutation at Y282 results in disruption of both interfaces to produce monomer, while mutation at the C-terminus results in dimeric repressors. Coupling C-terminal extension with Y282 mutation yields long-axis dimer (94). The lower right portion of the figure indicates the thermodynamic coupling of monomer-monomer assembly with operator binding, since monomer cannot bind operator DNA. It should be noted that long-axis dimer and tetramer also bind operator, but there is no thermodynamic linkage of this process with assembly.

A wide variety of experimental approaches contributed to our insight into the assembly mode of LacI (Fig. 2). Powder x-ray diffraction studies of microcrystals suggested that the tetramer is an elongated structure with 222 symmetry (80). The N-termini were placed at the ends of an elongated core domain based on powder x-ray diffraction and low-angle neutron and x-ray scattering studies (80-83). Since NMR linewidths for the isolated 6-kDa N-terminal domain are only slightly narrower than the corresponding values for intact 150-kDa tetramer, these N-terminal domains must be connected to the core protein by a hinge region that has considerable motional flexibility (84). Electron microscopy yielded variations in the size and shape of the molecules imaged, although several studies observed concentration of negative stain in a presumably highly solvated region located between two halves of the structure (85, 86), anticipating the quaternary arrangement later found in the crystallographic studies. Sequence alterations in the C-terminal region, ranging from point mutations to deletions or domain substitutions, resulted in termination of assembly at dimer (62-67). Dimers retained the ability to bind operator DNA, as demonstrated initially by the activity of chimeric proteins and their hybrids (87-89). The C-terminal alterations that generate stable dimeric structures affect a leucine heptad repeat structure with sequence similarity to GCN4 and the Jun family of proteins that was predicted to be a leucine zipper-type structure (62, 63) and to form a four-helical bundle (66). The latter arrangement has been confirmed by the crystallographic structures (see later). The observations on dimer properties generated the hypothesis that the repressor protein contained two subunit interfaces, one termed the “monomer interface” and the other the “dimer interface” (Fig. 2). Mutations that abolished assembly completely were identified in the region of amino acids (aa) 220-230 and 270-285 (60), and at least one monomeric mutant, later identified as Y282D (90), has been isolated and characterized in detail (91). Apolar mutations at K84 serve as second-site revertants for the Y282D mutation (92) and place this residue at the interface between monomers.
Urea denaturation studies of K84A/-11 aa and K84L/-11 aa mutant dimers indicate that introduction of apolar side chains at position 84 significantly stabilizes the monomer interface (93, 93a). This apparent functional separation of the monomer and dimer interfaces was demonstrated directly by coupling mutation Y282D with extension of the leucine heptad repeat (94). The resultant double mutant was dimeric, but displayed DNA and sugar-binding characteristics anticipated for a dimer with a different interface between monomers (Table I). This species is therefore designated as a “long-axis” dimer (Fig. 2). Both subunit interfaces are quite stable, as might be expected for an oligomeric species that binds DNA at very low protein concentrations. In fact, synthetic peptides corresponding to the isolated C-terminal sequence form tetrameric structures with high affinity (95). Stability of the repressor tetramer at subpicomolar concentrations has been demonstrated by DNA binding studies (95a). Even dissociation of tetramer modified with a fluorescent probe has indicated high affinity (Kd < M) between the subunits (96), and the stability of this labeled tetramer is affected by the presence of operator or IPTG or by modification of C281 (adjacent to Y282) (96, 97). We have observed that inducer stabilizes the protein to urea denaturation and facilitates refolding and assembly (J. Barry and K. Matthews, unpublished results).

TABLE I
PROPERTIES OF LHR5-Y282D REPRESSOR(a)

                   IPTG binding, pH 7.5   IPTG binding, pH 9.2   Operator-DNA binding
Protein            Kd (µM)     n          Kd (µM)     n          40-bp operator Kd (nM)   Mr (Da)
Wild-type          1.5         1.0        10.0        1.6        0.03                     158,000
-11 aa deletion    1.8         1.1        14.7        1.7        0.63                     75,000
LHR5-Y282D         1.4         1.0        2.4         0.9        22                       85,600
Y282D              1.2         1.0        1.7         1.0        >100                     38,500

NS DNA, Kd (nM): >100; >100

(a) Data reported in Ref. 94.
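The thermodynamic coupling of monomer-monomer assembly with operator binding (Fig. 2, lower right) can be illustrated numerically. The following is a minimal sketch in Python, assuming a simple 2M ⇌ D equilibrium in which only the dimer binds operator and the operator is present at trace concentration relative to protein. The function names and the constants in the example are hypothetical and are not taken from the studies cited above.

```python
import math


def dimer_concentration(total_monomer, k_dimer):
    """Dimer concentration (M) for 2M <-> D.

    k_dimer is the dimer dissociation constant, [M]^2 / [D], in mol/L.
    Mass balance: total_monomer = [M] + 2[D] = [M] + 2[M]^2 / k_dimer.
    """
    # Positive root of 2[M]^2/k_dimer + [M] - total_monomer = 0
    free_monomer = (k_dimer / 4.0) * (
        math.sqrt(1.0 + 8.0 * total_monomer / k_dimer) - 1.0
    )
    return free_monomer ** 2 / k_dimer


def fraction_operator_bound(total_monomer, k_dimer, k_operator):
    """Fraction of operator bound, assuming only dimer binds (trace operator)."""
    dimer = dimer_concentration(total_monomer, k_dimer)
    return dimer / (dimer + k_operator)


if __name__ == "__main__":
    # Hypothetical constants chosen only to show the trend.
    K_DIMER = 1e-9      # assumed monomer-dimer dissociation constant (M)
    K_OPERATOR = 1e-11  # assumed intrinsic dimer-operator Kd (M)
    for total in (1e-12, 1e-11, 1e-10, 1e-9, 1e-8):
        bound = fraction_operator_bound(total, K_DIMER, K_OPERATOR)
        print(f"total monomer {total:.0e} M -> fraction bound {bound:.3f}")
```

Under these assumptions the apparent operator affinity falls off at protein concentrations below the dimer dissociation constant, because the binding-competent dimer is depleted. No such dependence is expected for a species whose assembly is not linked to operator binding.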
II. DNA Binding
LacI binds with high affinity to operator DNA-containing sequences and with significantly lower affinity to nonspecific sequences (4, 5, 12-16). Nonspecific binding is integral to the process of induction, providing a sink for the repressor-inducer complex and sequestering the free repressor to generate a high local concentration that ensures occupancy of the operator (12-16). The protein contains two sites for DNA binding, and this stoichiometry has been demonstrated to be independent of the size of the operator DNA (31, 32, 32a).
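The role of nonspecific DNA as a sink can be made concrete with a small calculation. The sketch below is illustrative only: it assumes nonspecific sites are in vast excess over repressor, treats the operator as a single site, and uses hypothetical order-of-magnitude concentrations and dissociation constants that are not taken from the text. The only features borrowed from the text are that inducer weakens operator binding by about three orders of magnitude and leaves nonspecific binding unchanged.

```python
def free_repressor(total_repressor, ns_sites, k_ns):
    """Free repressor (M) when nonspecific sites greatly exceed repressor."""
    return total_repressor / (1.0 + ns_sites / k_ns)


def operator_occupancy(total_repressor, ns_sites, k_ns, k_op):
    """Fraction of a single operator occupied by repressor."""
    free = free_repressor(total_repressor, ns_sites, k_ns)
    return free / (free + k_op)


# Hypothetical inputs (mol/L); assumptions for illustration only.
TOTAL_REPRESSOR = 2e-8   # a few tens of molecules in a bacterial cell volume
NS_SITES = 1e-2          # nonspecific sites, roughly one per base pair
K_NS = 1e-4              # nonspecific Kd, assumed unaffected by inducer
K_OP_FREE = 1e-11        # operator Kd without inducer
K_OP_INDUCED = K_OP_FREE * 1000.0  # operator Kd with inducer bound

if __name__ == "__main__":
    repressed = operator_occupancy(TOTAL_REPRESSOR, NS_SITES, K_NS, K_OP_FREE)
    induced = operator_occupancy(TOTAL_REPRESSOR, NS_SITES, K_NS, K_OP_INDUCED)
    print(f"operator occupancy without inducer: {repressed:.3f}")
    print(f"operator occupancy with inducer:    {induced:.3f}")
```

With these assumed numbers most of the repressor is held on nonspecific DNA in both states, and the loss of operator affinity on inducer binding is enough to shift the operator from mostly occupied to mostly free.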
A. Identification of Operator Sequence
To determine the operator site, the sequence of DNA protected from DNase digestion in the presence of LacI was determined (Fig. 3) (7). NMR studies of the isolated N-terminal domain and variant operator sequences indicate that the asymmetrical base pairs (bp) located ±2 bp from the central base pair are key to the differential binding observed for the left and right sides of the operator (98). This asymmetry in interaction is further underscored by the effect of operator-constitutive mutations on protein binding and repression (Fig. 3) (99). A number of genetic and binding studies have examined the influence of sequence on LacI-LacO affinity. Increasing the sequence symmetry (indicated in upper sequence in Fig. 3 by underlined regions) and eliminating the central base pair through which the axis of symmetry passes (indicated by arrow in the upper sequence in Fig. 3) increases LacI affinity by 10-fold, presumably by increasing symmetrical contacts by the protein (100, 101). Sites within the operator DNA sequence affected by or influencing LacI binding have been identified by a variety of studies using methylation, ethylation, nuclease digestion, photochemical cross-linking, and base substitution (shown on the lower sequence in Fig. 3); the affected positions cover ~30 bp of the DNA sequence (102-112). Synthetic methods have been used to generate oligonucleotides with specific base substitutions to identify the functional groups that contribute to operator affinity and specificity (105, 109). In addition to the primary operator sequence, two sequences with similarity to the primary operator, termed “pseudo-operators,” have been identified in the z-gene (Oz, with ~8-fold lower affinity) and in the i-gene (Oi, with ~1000-fold lower affinity). These sequences are required for the high level of repression observed in vivo (113, 114).

5'-TGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGG-3'
3'-ACAACACACCTTAACACTCGCCTATTGTTAAAGTGTGTCC-5'

FIG. 3. Operator DNA sequence with summary of footprint and Oc information. The sequence of 40 bp containing the E. coli lac operator sequence is shown. In the upper sequence, the axis of symmetry is indicated by arrows, and the symmetric bases are under- and overlined. The sequence is numbered relative to the position of the transcription start site (+1). In the lower sequence, exonuclease III digestion endpoints are shown by asterisks (106, 108). Solid lines above and below the sequence indicate the region protected from DNase I digestion by LacI (107, 108), with breaks indicating partial protection. The arrow indicates the site of enhanced DNase I cleavage. DNA methylation protection and enhancement in the presence of LacI are delineated by (-) and (+), respectively (104, 106a, 107a). Thymidine positions protected by repressor against UV-induced breakage when substituted by BrdU are denoted by arrowheads (111). Sites of operator constitutive mutations are shown below the sequence by ▲ at the indicated positions (99).
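The partial twofold symmetry of the operator can be checked directly from the sequence. The sketch below is a minimal illustration, assuming that the 21-bp segment AATTGTGAGCGGATAACAATT (the central part of the 40-bp sequence in Fig. 3, starting at the transcription start site) is taken as the operator. The helper names are hypothetical. Each position on the left of the central base pair is compared with the reverse complement of the right half.

```python
# Central 21 bp of the lac operator region (top strand, 5' to 3'); taking
# exactly this window as "the operator" is an assumption of this sketch.
OPERATOR = "AATTGTGAGCGGATAACAATT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def symmetry_by_distance(operator):
    """Map distance from the central base pair to True if that pair of
    positions obeys the twofold (dyad) symmetry, else False."""
    center = len(operator) // 2          # index of the central base
    left = operator[:center]
    right = operator[center + 1:]
    mirrored = reverse_complement(right)  # right half as read from the left
    return {
        center - i: (left[i] == mirrored[i])
        for i in range(len(left))
    }


if __name__ == "__main__":
    result = symmetry_by_distance(OPERATOR)
    symmetric = sum(result.values())
    asymmetric = sorted(d for d, ok in result.items() if not ok)
    print(f"{symmetric} of {len(result)} positions are symmetric")
    print(f"asymmetric positions (bp from center): {asymmetric}")
```

For this window, 8 of the 10 paired positions are symmetric, and the two exceptions lie 2 and 4 bp from the central base pair.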
B. Search for the Operator Site
The LacI protein must select its ~30-bp operator sequence from 10⁷ bp of DNA in the bacterial cell, posing a significant location and recognition problem. Berg and colleagues (115-119) proposed a two-step mechanism to describe this process, in part to account for the exceptionally high association rate constants observed for regulatory proteins finding their target sequences. In the first step, a complex with a nonspecific DNA site is formed via a diffusion-controlled reaction (116). Once bound to DNA (Fig. 4), the protein searches for operator sites by a series of intramolecular steps that appear to include sliding (one-dimensional diffusion along the DNA backbone), intersegment transfer (formation of a ternary complex between two DNA sites with a single repressor tetramer) with release to another site within the same DNA molecule, or dissociation and reassociation within the same DNA either locally (intradomain) or to a completely different region of the molecule (interdomain). The eventual outcome is occupation of the operator site. In addition, transfer reactions between two different DNA molecules can occur with formation of a ternary structure followed by release to a new site on a different DNA (120). These mechanisms contribute to the high affinity of this regulatory protein for its cognate site by generating a rapid association rate constant for repressor binding to the operator sequence.

FIG. 4. Search for the operator site by the protein. Multiple mechanisms for locating the operator site are depicted (based on Refs. 115-119). Initial association between nonspecific sites and LacI is followed by sliding, which is effectively a one-dimensional search for the operator sequence. In addition, intersegment transfer can transport the repressor tetramer from one site in the DNA to more distant sites without dissociation. Intradomain (local) dissociation-reassociation events may also be part of the search for the operator site. Repressor may also dissociate completely and reassociate at another region of the same DNA (interdomain association-dissociation). For DNAs in which there are multiple operator sequences, both DNA-binding sites in the repressor tetramer may be occupied to form looped structures.
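The benefit of combining random association with sliding can be illustrated with a deliberately simple counting argument. The sketch below is a toy model and not the kinetic treatment of Refs. 115-119: it assumes circular DNA, a uniformly random landing site for each association event, and a fixed number of base pairs scanned on either side of the landing site before dissociation. The function name and the sliding distance are hypothetical.

```python
def expected_landings(genome_bp, slide_bp):
    """Mean number of random landings needed to reach one target site.

    Toy model: each landing covers a window of 2*slide_bp + 1 positions on
    circular DNA, so the chance that a landing covers the target is
    window / genome_bp and the mean number of landings is its reciprocal.
    """
    window = min(2 * slide_bp + 1, genome_bp)
    return genome_bp / window


if __name__ == "__main__":
    GENOME_BP = 10 ** 7  # DNA per cell, as quoted in the text
    for slide in (0, 10, 50, 500):  # hypothetical sliding distances (bp)
        n = expected_landings(GENOME_BP, slide)
        print(f"slide +/-{slide:>3} bp -> about {n:,.0f} landings")
```

In this picture, scanning even a few tens of base pairs per encounter reduces the number of independent association events by about two orders of magnitude, which is the qualitative reason sliding and transfer steps raise the apparent association rate.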
C. Loop Formation
The presence of an intramolecular ternary complex (looped structure) was first inferred from the differential behavior of DNAs containing pseudo-operator sequences (32, 121). This looped form has each of the two DNA-binding sites per tetramer occupied by a single DNA (Fig. 4) and exhibits greater stability than the complex with a single site occupied (32, 121, 122). Interestingly, the stability of looped complex may be influenced by ion type (123). Optimal sequence, spacing, and phasing of operator sites on linear DNA provide significant (>10-fold) enhancement of LacI affinity (122, 124, 125). The relative stability of repressor-mediated loops in linear plasmid DNA appears to correlate with the effect of DNA length on closed circle formation by ligase (124, 126). In contrast, supercoiled DNA shows no similar length correlation (127, 128), and up to >1000-fold increases in stability were noted with the introduction of in vivo supercoil densities (125, 127-130). The substantial stability for LacI complexes with supercoiled DNAs containing multiple operator sites suggested that loop formation involving the additional operator sequences found in E. coli DNA might account for the contribution of these pseudo-operators to the high level of repression observed in this system. The Oi and Oz secondary operator sequences have been found to be occupied in vivo, and maximal repression is not observed in the absence of these secondary operator sequences (130-139). The ability of LacI to form looped structures in vivo has been employed to determine the helical repeat of DNA in E. coli in an effort to generate insight into the physical properties of cellular DNA (129, 136, 137, 139, 140, 142).
The formation of loops in single DNA molecules can be monitored via nanometer-scale Brownian motion of particles linked to the ends of DNA (143). These measurements have shown that the mechanical strain associated with loop formation does not accelerate loop release and that the primary breakdown pathway maintained the tetrameric structure of the protein (143). These results reflect the significant stability noted previously for the tetrameric structure of this repressor protein under a variety of study conditions. Despite this structural stability, the absence of “additivity” of thermodynamic effects for operators with different combinations of specific site changes has suggested that the protein is able to adapt its DNA-binding site to optimize affinity for a specific DNA sequence (144, 145). This mutual accommodation presumably occurs by variation in the extent and nature of the conformational change in both repressor and DNA sites concomitant with their association (145).
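The fold enhancements in stability quoted for looped complexes (>10-fold on linear DNA, up to >1000-fold on supercoiled DNA) can be expressed as free energy differences using ΔΔG = -RT ln(fold change). The sketch below is a minimal conversion, assuming a temperature of 298.15 K, which is not specified in the text. The function name is hypothetical.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal mol^-1 K^-1


def ddg_from_fold_change(fold, temp_k=298.15):
    """Free energy change (kcal/mol) for a fold increase in affinity.

    Negative values mean the complex is stabilized. The default
    temperature is an assumption of this sketch.
    """
    return -GAS_CONSTANT * temp_k * math.log(fold)


if __name__ == "__main__":
    for fold in (10, 1000):
        ddg = ddg_from_fold_change(fold)
        print(f"{fold:>5}-fold tighter binding -> ddG = {ddg:.2f} kcal/mol")
```

On this scale a 10-fold enhancement corresponds to about -1.4 kcal/mol and a 1000-fold enhancement to about -4.1 kcal/mol, so the lower bounds quoted above translate to stabilizations at least this large.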
D. Thermodynamics and Kinetics of Operator Binding
The generation of significant amounts of operator-containing sequences from plasmid DNA (146) provided the means for detailed studies of repressor-operator thermodynamic and kinetic parameters. The mechanisms proposed for association of LacI with operator-containing DNAs could be tested by analysis of length dependence and salt dependence of rate constants. Equilibrium and kinetic constants for binding to operator are only minimally affected by pH (147), while salt and temperature effects are more significant. Record and colleagues (148, 149) demonstrated that ~11 ionic interactions were the primary contribution to the nonspecific DNA binding energy, while binding to operator included significant nonpolar interactions and only 6-7 ionic interactions. This value appears to be independent of DNA length for fragment sizes >170 bp (150), while differences are observed for shorter DNAs at low salt concentrations (121). Significant salt and DNA length dependence is observed for the kinetics of LacI binding to operator DNA. The data are consistent with “sliding” for LacI-operator association-dissociation and indicate an intermediate in the binding mechanism (32, 121, 151-156). LacI binding to single-operator-containing plasmid DNA has been examined in detail (157). Formation of 2:1 operator-repressor complex is anticooperative compared to 1:1 complex at low to moderate salt concentration. Formation of 1:1 complex with plasmid is positively cooperative compared to 40-bp fragment binding under the same conditions.
From these results, mechanisms involving DNA wrapping and looping may be involved in the 1:1 association of LacI with LacO-containing sequences at low salt concentration (157). The presence of pseudo-operator sequences within a DNA sequence results in an increased association rate constant with the repressor protein (32, 121, 150, 156). Large positive entropy contributions were found for repressor binding to multiple operator-containing DNAs (121, 147). Unusual biphasic temperature dependence observed in equilibrium and kinetic rate constants for LacI-LacO and for LacI-inducer was interpreted initially as a structural transition in the protein (121). The reaction is driven entropically at low temperature but by enthalpy above the “transition” temperature, with only subtle differences in free energy (121, 158). Record and colleagues (158, 159) have noted that this thermodynamic signature, corresponding to a large negative ΔCp for the association process, may be a general phenomenon for protein-DNA binding and is indicative of extensive burial of nonpolar surface in conformational changes coupled to binding (158, 159). For lac repressor, coupled folding of unstructured N-terminal domains, including the hinge helix, was postulated to accompany operator DNA binding (145, 159) and demonstrated recently by NMR (159a). Similar temperature-dependent behavior occurs for inducer binding, suggesting that apolar residues in the binding pocket are buried in the interaction or that a conformational change or folding process results in sequestration of apolar residues.
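The thermodynamic signature described above follows from a temperature-independent heat capacity change, for which ΔH(T) = ΔCp (T - T_H), ΔS(T) = ΔCp ln(T/T_S), and ΔG = ΔH - TΔS. The sketch below evaluates these standard relations. The values of ΔCp, T_H (where ΔH = 0), and T_S (where ΔS = 0) are hypothetical and were chosen only to reproduce the qualitative pattern; they are not parameters reported for LacI.

```python
import math


def binding_thermodynamics(temp_k, dcp, t_h, t_s):
    """Return (dH, dS, dG) for binding with constant heat capacity change.

    dcp is in kcal mol^-1 K^-1; t_h and t_s are the temperatures (K) at
    which dH and dS, respectively, pass through zero.
    """
    dh = dcp * (temp_k - t_h)
    ds = dcp * math.log(temp_k / t_s)
    dg = dh - temp_k * ds
    return dh, ds, dg


if __name__ == "__main__":
    # Hypothetical parameters for illustration only.
    DCP, T_H, T_S = -1.0, 290.0, 300.0
    for temp in (280.0, 290.0, 300.0, 310.0):
        dh, ds, dg = binding_thermodynamics(temp, DCP, T_H, T_S)
        print(f"T = {temp:.0f} K: dH = {dh:+6.1f}, "
              f"-T*dS = {-temp * ds:+6.1f}, dG = {dg:+6.2f} kcal/mol")
```

With a negative ΔCp the enthalpy is unfavorable and the entropy favorable below T_H, whereas above it the enthalpy becomes the driving term. Over the same range the free energy changes by well under 1 kcal/mol because the two contributions compensate.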
E. Residues Involved in LacI-LacO Interaction
The helix-turn-helix of the DNA-binding domain contains residues essential to DNA binding. Müller-Hill and colleagues have examined the effects of substitution at specific sites in the recognition helix (amino acids 17-26) on binding to wild-type operator and variants in each of the central positions of the DNA sequence (160-164). This detailed analysis has indicated that specific amino acids (Y17, Q18, S21, R22, N25) contact designated sites within the operator sequence. From these and other studies, it was possible to deduce that the recognition helix contacts the DNA in a different orientation from other HTH-containing proteins (164-166). Site-specific mutagenesis of the protein has confirmed contact between Q18 and G/C7 in the operator sequence (167). Furthermore, photochemical cross-linking of BrdU-substituted operator DNA indicated H29 proximity to symmetrically related bases in the DNA; in addition, a single asymmetrical contact involving Y17 was observed (112). Interestingly, photochemical cross-linking to nonoperator DNA also showed cross-linking to H29 (110). These results are consistent with NMR studies of the intact protein and isolated N-terminal domain that demonstrate the first residue in the recognition helix (Y17) makes contact close to the center of symmetry of the operator, while H29 contacts the more distal portion of the operator sequence (76, 168, 169). Cross-link evidence suggests that the orientation of the recognition helix may be altered in the presence of inducer, thus generating the shift from specific recognition to nonspecific binding (169a,b).
III. Inducer Binding
Site-specific DNA recognition by LacI must be interrupted by the availability of lactose in the environment to allow production of the enzymes necessary for utilizing this carbon source. This modulation of LacI-LacO interaction is fundamental to induction of the lac operon and is accomplished by binding of specific sugars to the repressor protein. Interestingly, lactose itself binds only weakly to LacI and is an anti-inducer (i.e., this sugar stabilizes the LacI-LacO complex; 170, 171), while the native inducer (1,6-allolactose) is a transglucosylation derivative of lactose produced by the action of β-galactosidase (11). This arrangement presumably precludes production of significant amounts of lac mRNA in circumstances where β-galactosidase or lac permease is inactive and the cell is thereby unable to utilize lactose effectively. Inducer sugars elicit a conformational change that diminishes binding to operator sequences and exposes the promoter for transcription initiation by RNA polymerase (Fig. 1).
A. Sugars-Inducers and Anti-inducers
Sugars have been found that both destabilize and stabilize the complex of LacI with LacO: the former are termed “inducers” and the latter “anti-inducers” (172). The effects of inducers are generally more dramatic (>1000-fold decrease in LacI-LacO affinity) than anti-inducers (<5-fold increase in LacI-LacO affinity) (171, 172). A number of sugars that bind to LacI have been identified. Most are galactose derivatives, and their dissociation constants range from molar to micromolar (171, 172). The majority of sugars known to bind LacI are inducers, and only one neutral compound has been identified, o-nitrophenyl-β-D-galactoside (171). Almost all studies of inducer binding have utilized isopropyl-β-D-thiogalactoside (IPTG), a nonmetabolized and potent inducer of the repressor protein. LacI contains four sugar-binding sites that do not exhibit cooperativity at neutral pH (29, 171, 173). Competition between IPTG and anti-inducers for binding to the repressor protein has been interpreted as evidence that these sugars bind at the same site in the protein (171). Circular dichroism (CD) measurements of binding of an inducer and an anti-inducer, each containing o-nitrophenyl chromophores, indicated that the environments of the sugars are similar in the complex (174). Nonetheless, NMR studies demonstrate that changes in the protein’s aromatic amino acids in response to inducer binding differ from those found with anti-inducer (175). Presumably these differences arise from distinct global conformational shifts elicited by binding to the respective sugars.
B. Thermodynamics and Kinetics
The conformational change associated with inducer binding alters the environment for both intrinsic and extrinsic chromophores; in particular, W220 fluorescence is perturbed significantly by occupation of the inducer binding site (see Section IV,B; 176-178). Equilibrium and kinetic constants for IPTG binding to LacI have been measured using a variety of spectral probes (92, 176-179). At neutral pH the association rate constant is ~2 × 10⁵ M⁻¹ s⁻¹, the dissociation rate constant is ~0.2 s⁻¹, and the equilibrium constant is ~1 μM (92, 179). The rate and equilibrium constants are influenced significantly by pH, with a 10-fold decrease in the affinity and the appearance of cooperativity in sugar binding at pH > 9 (178, 180). Calorimetric studies show significant enthalpic effects of inducer binding at elevated pH, indicating differences in protonation of the binding site and hydrogen bonds between the inducer and the protein (181). The absence of substantial hydrophobic contributions to binding energy at neutral pH is indicated by the lack of measurable heat capacity change for the reaction (181).
Inducer binding is perturbed by the presence of operator DNA, as predicted based on the thermodynamic cycle that links these processes (inducer binding diminishes operator binding; operator binding diminishes inducer binding) (30, 182). The concentration of inducer required for half-saturation of the protein is increased ~20-fold in the presence of operator DNA fragments, and the binding becomes cooperative, with a Hill coefficient of ~1.5 (30). This cooperative behavior was described by either Monod-Wyman-Changeux or Koshland models that assumed four inducer and two operator binding sites per tetramer (30). Based on these models, binding the first two inducer molecules accounts for ≥60% of the difference in affinities between free and inducer-bound repressor for operator DNA, a contribution sufficient to elicit induction in vivo (30). The association rate constant for IPTG binding to LacI-operator complexes is decreased ~8-fold, while the dissociation rate constant is increased ~4-fold, accounting for the 20-fold differential in equilibrium constant (182).
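The rate and equilibrium constants quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes simple one-step binding, so that the dissociation constant equals the ratio of the dissociation and association rate constants, and it uses the Hill equation purely as a phenomenological description of the cooperative binding seen with operator present. The function names and the midpoint used for the operator-bound case (the ~1 μM value scaled by the ~20-fold shift) are assumptions made for the example, not values or methods taken from the cited studies.

```python
def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant (M) for one-step binding."""
    return k_off / k_on


def hill_saturation(ligand, midpoint, n_hill=1.0):
    """Fractional saturation from the Hill equation (phenomenological)."""
    return ligand**n_hill / (midpoint**n_hill + ligand**n_hill)


# Rounded values quoted in the text for IPTG binding at neutral pH.
K_ON = 2e5    # association rate constant, M^-1 s^-1
K_OFF = 0.2   # dissociation rate constant, s^-1

kd_free = dissociation_constant(K_ON, K_OFF)  # about 1e-6 M, i.e. ~1 uM

# Assumed midpoint with operator DNA present: ~20-fold higher than free LacI.
midpoint_bound = 20 * kd_free

for iptg in (1e-6, 1e-5, 1e-4):
    free = hill_saturation(iptg, kd_free, n_hill=1.0)
    bound = hill_saturation(iptg, midpoint_bound, n_hill=1.5)
    print(f"[IPTG] = {iptg:.0e} M: free {free:.2f}, operator-bound {bound:.2f}")
```

With these rounded inputs the ratio of rate constants reproduces the ~1 μM equilibrium constant, and the loop shows how a higher midpoint and a Hill coefficient of ~1.5 shift and steepen the saturation curve.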
C. Residues Involved in LacI-Sugar Interaction
The contributions of specific portions of the sugar to binding affinity have been analyzed using kinetic and equilibrium studies of a series of methyl-deoxyfluoro-β-D-galactosides (179). The C-3 and C-6 hydroxyls contribute ~2 kcal/mole to the free energy change associated with inducer binding, whereas the C-4 hydroxyl does not provide a significant energetic contribution to the interaction. Negative ΔH° values were observed for a series of sugars differing at the β-glycosidic position; significant decreases in ΔH° and increases in entropic contributions to binding energy upon replacement of the O-methyl substituent by S-methyl suggest an increase in apolar interactions with the sulfur atom (179). A structural transition in the protein or burial of apolar groups on binding is suggested by the nonlinear Arrhenius plots for kinetic rate constants observed for multiple inducer sugars (179).
Site-specific mutagenesis based on homology to periplasmic sugar-binding proteins identified residues within the protein involved in binding sugar (183-185). R197 contributes significantly to sugar binding; substitution of this side chain by K, G, or L results in ≥50% loss in the free energy of interaction (183). Substitution of D274 with E, N, or A results in a mutant protein with wild-type affinity for LacO but essentially no ability to respond to IPTG or other inducer sugars (184, 185). Other residues within the sugar-binding site have also been examined and result in less dramatic effects on inducer binding energy (184).
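To put the free energy contributions above on the same scale as the fold changes in affinity quoted elsewhere in this section, the standard relation between a free energy difference and a ratio of equilibrium constants can be applied. The following is a minimal sketch; the temperature of 25°C and the function names are assumptions made for the example, not parameters reported in the cited work.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal mol^-1 K^-1


def fold_change_from_ddg(ddg_kcal, temp_k=298.15):
    """Ratio of equilibrium constants for a free energy difference (kcal/mol)."""
    return math.exp(ddg_kcal / (R_KCAL * temp_k))


def ddg_from_fold_change(fold, temp_k=298.15):
    """Free energy difference (kcal/mol) for a given ratio of equilibrium constants."""
    return R_KCAL * temp_k * math.log(fold)


# A ~2 kcal/mole contribution, as cited for the C-3 and C-6 hydroxyls.
print(f"{fold_change_from_ddg(2.0):.0f}-fold change in affinity")

# A 1000-fold change, the lower bound cited for inducer effects on LacI-LacO affinity.
print(f"{ddg_from_fold_change(1000):.1f} kcal/mol")
```

Under the assumed temperature, ~2 kcal/mole corresponds to roughly a 30-fold change in binding constant, and a 1000-fold change corresponds to about 4 kcal/mole.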
IV. Structure and Function
A. Genetic Studies
Genetic studies provided early insight into the separation of functions within the LacI sequence (44, 45, 48, 52). Detailed studies of missense and suppressed nonsense mutations, in particular from Jeffrey Miller’s laboratory, identified regions associated with specific functions: DNA binding, inducer binding, and assembly (43a, 44-54). Nonsense mutations dispersed throughout the protein sequence and including almost half of the residues have been utilized to introduce multiple substitutions at specified sites. Insertion of 14 different amino acids at each position available was followed by characterization of the phenotypic behavior of the LacI products (46, 53, 54). In addition, thousands of missense mutations have been analyzed (47-52) to generate a detailed view of the effect of amino acid substitution on the phenotypic behavior of LacI in vivo.
Substitutions produced by missense mutation or by nonsense suppression in the N-terminal region affect DNA binding exclusively (43a, 44-54). Particularly deleterious effects occur with substitution within the recognition helix of the helix-turn-helix motif and in the segment from H47 to K59 (51-54), a region predicted to form an α-helix and to make contact with operator DNA (186). In addition, mutations that increase affinity for LacO occur primarily in this N-terminal region of the protein (187-189). Expression of N-terminal domains in vivo results in the repression of β-galactosidase synthesis and methylation protection of the operator DNA (141).
In contrast to the N-terminal domain, in which nonfunctional missense mutations occur with high frequency, deleterious substitutions occur less frequently in the core domain, and a number of sites are relatively insensitive to substitution (51-54). However, mutation in this region affects the ability of the protein to respond to sugar or to form the oligomeric structures required for operator DNA recognition (44-48, 52-54).
The periodicity of deleterious mutations in the core domain provided the basis for an accurate prediction of the general fold of the LacI core domain that included a central β-sheet structure (52).
The predicted structure for the monomeric core domain, based on periplasmic sugar-binding proteins, provided a scaffold for examining the genetic data for amino acids 62-323 (Fig. 5; Table II) (79). The placement of mutations that affected assembly, DNA binding, and sugar binding was consistent with the effects of these alterations on function, and this model allowed a more detailed understanding of the interactions that generated the functional character of this protein. In particular, mutations that affected sugar binding clustered in the area surrounding the presumed sugar-binding cleft, while mutations that affected DNA binding were found for the most part in the interior β-sheet structure or near the N-terminus (Fig. 5). The crystallographic structures (see Section V) confirm the monomeric fold and verify the deductions regarding which amino acids participate in assembly and sugar binding (17, 18). More recently, the crystallographic structures have been used as a template for mapping genetic mutations (189a) in a manner similar to that for the predicted model (79).
Mutations that affect dimer assembly map exclusively to the core domain of the protein and were identified in the regions of amino acids 220-230 and 270-285 (60). One of these mutations was identified as Y282D by amplification and sequencing of the DNA encoding this monomeric mutant (90), and this protein has been isolated and characterized in detail (91). Monomeric mutants produced by other substitutions at this site share the properties of Y282D (190), including undetectable DNA binding, IPTG binding parameters similar to the wild-type protein, and minimal influence of pH on the sugar-binding properties (90, 91). The subunit interface of the protein also includes the region surrounding K84, as apolar substitution of this position by A or L stabilizes the monomer interface sufficiently to serve as second-site reversion for the Y282D mutation (92). The tetrameric proteins K84A and K84L are stable in ≥8 M urea (93), and these substitutions have marked effects on both association and dissociation rates for inducer (92). Furthermore, polar substitutions for K84 (K84R and K84E) alter the allosteric properties for inducer binding observed at elevated pH (92). Thus, K84 and Y282 in the N- and C-subdomains of the core define the interfaces of the monomer that interact to form dimer.
Mutations that affect tetramer assembly are found in the C-terminal ~18 amino acids of the protein, which contain a leucine heptad repeat structure (61-63). Alteration of this heptad repeat sequence by amino acid substitution, deletion, or replacement results in dimeric proteins (62-67). For the dimeric mutants of LacI that have been examined, allostery associated with either elevated pH or operator presence is observed to be similar to the wild-type tetrameric protein (65). Thus, the allosteric properties associated with inducer binding involve only the monomer interface of the protein (65). Nonetheless, communication across this interface is sensitive to a variety of
FIG. 5. Location of mutations in the repressor core monomer model (79). The original model based on the sugar-binding proteins was refined for sequences distal to position 274 by using the purine repressor structure (19). This further refined model differed only in the C-terminal region of the core domain and is quite similar in many respects to the corresponding region of the crystallographic structure (18), as shown in the overlay in panel E, in which the lighter backbone trace corresponds to the model structure. The clustering of mutations that affect specific functions is shown in panels A-D. Panel A highlights those residues that occur in the interior β-sheet region; disruption of these residues presumably results in protein misfolding. Panel B indicates residues that occur at or near the monomer interface; alteration at these sites may affect protein assembly, folding, and/or allostery. Panel C depicts residues found near the opening of the sugar-binding cleft; mutations in these residues may result in conformational effects that mimic the inducer-bound state of the repressor. Panel D shows the location of residues in the sugar-binding site for which mutation affects binding of inducer molecules.
TABLE II
IDENTIFICATION OF AMINO ACID CLUSTERS IN MODEL STRUCTURE^a

Decreased operator binding (i^- phenotype)
  N-subdomain
    Group I: Interior residues; disrupt folding; shown in Fig. 5A
      Gly65, Val66, Met98, Val99, Leu122, Ile123, Leu148
    Group II: Close to DNA-binding domain
      Lys108, Leu114, Leu115, Gln117, Arg118, Val119
    Group III: Close to DNA-binding domain
      Leu319, Pro320, Val321
    Group IV: May be involved in subunit assembly; shown in Fig. 5B
      Leu71, His74, Pro76, Ser77, Gln78, Ala81, Lys84^b, Arg86
    Group V: Close to binding cleft; may favor inducer-bound state; shown in Fig. 5C
      Leu128, Gln131
  C-subdomain
    Group VI: On opposite face to Group IX
      Glu164, Thr167, Arg168, Gly170, Val171, Glu172, Leu184, Gly187, Ser191, Ala199, Trp201, Lys203, Tyr204, Ile210
    Group VII: Close to sugar-binding cleft; shown in Fig. 5C
      Gly218, Asp219, Trp220, Ala222, Gly225, Glu227
    Group VIII: Interior residues; disrupt overall folding; shown in Fig. 5A
      Ala241, Leu243, Val244, Asn246, Asp247, Met249, Ala250, Ala266, Ile268, Ser269, Val270, Gly272
    Group IX: May be important in subunit interaction; same face as Group IV; shown in Fig. 5B
      Tyr273, Asp274, Ser279, Ser280, Cys281^c, Tyr282^c,d,e,f, Leu286, Thr287, Thr288
    Group X: Same region as Group IX; may be involved in subunit interactions; shown in Fig. 5B
      Lys290, Asp292, Leu296, Gly297, Ser300, Val301, Arg303, Leu304
    Group XI: Outer α-helix, somewhat close to Group IX; shown in Fig. 5B
      Leu251, Gly252, Ala253, Arg255, Ala256, Ile257, Glu259
    Group XII: Mutations require inducer for DNA binding
      Arg118, Leu128, Asn246, Asp247, Tyr273, Asp292

Decreased response to inducer (i^s phenotype)
  N-subdomain
    Group XIII: Outer α-helix, similar to Group IV; shown in Fig. 5B
      Leu71, Ser77, Gln78, Lys84^b, Ser85, Asp88, Gln89
    Group XIV: In sugar-binding site; shown in Fig. 5D
      Leu128, Asp130
    Group XV: Interior β-sheet residues; disrupt folding; shown in Fig. 5A with Groups I and VIII
      Val95, Val96, Ser97, Met98, Val99
  C-subdomain
    Group XVI: In sugar-binding site; shown in Fig. 5D
      Ser191, Ser193, Arg195, Arg197^g, Trp220^h, Asn246, Asp247, Gln248, Met249, Tyr273, Asp274^i
    Group XVII: Subunit interaction effects; same region as Groups IX and X; shown in Fig. 5B
      Ser280, Cys281^d, Tyr282, Gln291, Asp292, Phe293, Leu296, Gln298

^a Based on genetic data from Refs. 51-54 unless otherwise indicated (79).
^b Reference 92.
^c References 91, 180, and 195.
^d Reference 190.
^e Reference 60.
^f Reference 90.
^g Reference 183.
^h Reference 204.
^i Reference 184.
amino acid substitutions across the protein sequence, and cooperativity can be influenced substantially by subtle structural effects (92, 191).
Amino acid substitutions at sites distant from the DNA-binding region can nonetheless have marked effects on LacI-LacO affinity. Although R326 is near the surface of the protein, it forms multiple hydrogen bonds that contribute to the tertiary fold of the protein; substitution of R326 resulted in diminished operator affinity and decreased stability of the protein (191). The profound phenotypic effects of substitutions at multiple sites in the core domain (53, 54) indicate that the proper tertiary and quaternary structures of LacI are essential for its normal function in the cell. Similarly, significant decreases in apparent DNA binding have been observed for dimeric mutants of LacI (64, 65) and have been attributed to thermodynamic coupling of dimer assembly to operator DNA binding (64) (Fig. 2). Substituting the GCN4 leucine heptad repeat sequence for the wild-type C-terminus produced a dimer that displayed increased stability and wild-type operator affinity (Fig. 6) (66, 67, 192), confirming thermodynamic linkage of assembly and DNA binding and underscoring the contribution of the C-terminus to these functions. Exceptionally stable dimers with wild-type operator binding properties can also be produced by apolar substitution at position 84 into the deletion dimer background (e.g., -11 aa C-terminal deletion coupled with K84A) (93, 93a).
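[Figure 6, panel A: C-terminal sequences, residues 340-360 of the wild-type protein]
Wild-type: RALADSLMQLARQVSRLESGQ
-11 aa: RALADSLMQL
R3: RALEDKVEELLSKNYHLENEVARLKKLESGQ
[Figure 6, panel B: schematic of the assembly surface for wild-type, -11 aa, and R3 proteins]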
FIG. 6. C-terminal sequences for different dimer species of LacI. (Panel A) Sequences of wild-type, -11 aa deletion mutant, and R3 protein with the GCN4 leucine heptad repeat sequence replacing the corresponding wild-type region (66). (Panel B) Schematic of the dimer assembly surface for wild-type, -11 aa deletion mutant, and R3 protein (67). Wild-type protein contains an antiparallel four-helical bundle and the R3 protein a parallel dyadic coiled-coil structure, while the -11 aa protein contains insufficient sequence to form an effective subunit interface.
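The heptad spacing of leucines in these C-terminal sequences can be checked with a short script. This is a minimal sketch under stated assumptions: the sequences are taken to read RALADSLMQLARQVSRLESGQ (wild-type, numbered from residue 340) and RALEDKVEELLSKNYHLENEVARLKKLESGQ (R3), the function name is hypothetical, and residue numbers beyond 360 for R3 are nominal, obtained simply by counting from 340.

```python
# Assumed readings of the Fig. 6A sequences; numbering starts at residue 340.
WILD_TYPE = "RALADSLMQLARQVSRLESGQ"
R3 = "RALEDKVEELLSKNYHLENEVARLKKLESGQ"
FIRST_RESIDUE = 340


def heptad_leucines(sequence, first_residue):
    """Return the longest run of leucine positions spaced exactly 7 residues apart."""
    positions = [first_residue + i for i, aa in enumerate(sequence) if aa == "L"]
    leucines = set(positions)
    best = []
    for start in positions:
        run = [start]
        # Extend the run while another leucine sits one heptad further along.
        while run[-1] + 7 in leucines:
            run.append(run[-1] + 7)
        if len(run) > len(best):
            best = run
    return best


for name, seq in (("wild-type", WILD_TYPE), ("R3", R3)):
    print(name, heptad_leucines(seq, FIRST_RESIDUE))
```

For the assumed wild-type sequence the heptad-spaced leucines fall at positions 342, 349, and 356, within the 340-357 region described in the text; the R3 sequence extends the same register by one further heptad.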
B. Chemical and Spectroscopic Studies
Amino acid side chain participation in repressor function has been examined using chemical modification and spectroscopy employing both intrinsic and extrinsic probes. The cysteine residues of LacI, while not reactive with iodoacetamide and other reagents that target exposed side chains, are reactive with a variety of sulfhydryl-specific reagents. Modification of C107 and C140 does not significantly alter the functional character of the protein (193-196), while reaction of C281 diminishes operator affinity and alters kinetic and allosteric parameters for inducer binding (195). Even reagents that introduce small moieties (e.g., -S-methyl by methyl methane thiosulfonate) elicit significant effects at C281 (195), consistent with the placement of this region near a subunit interface based on its proximity to Y282. C140 of the intact repressor protein reacts selectively with the dansyl derivative of iodoacetamide (IAEDNS), while this selectivity was abolished in the isolated core domain (197). These results indicate that a hydrophobic “pocket” between the core and N-terminal domains may allow concentration of probe and enhance reaction of C140 in this region (197). Modification of C140 with IAEDNS enhanced fluorescence for this probe that could be altered by DNA binding (197, 198). A role for C107 in operator DNA binding was suggested by the effects of fluorescein mercuric acetate on LacI behavior (196). Maleimide probes react primarily with C107 and C140, with minimal effects on binding properties, and spectral behavior indicated that these cysteine residues are neither exposed nor deeply buried (194, 199). The influence of operator on spectral characteristics suggested the proximity of C107 and/or C140 to the DNA-binding site or the influence of conformational changes that accompany DNA binding on these positions (199). Consistent with these observations, C107 oxidation was found to diminish operator binding by LacI (200).
The two tryptophan residues in the LacI sequence are located in close sequence proximity at positions 201 and 220 (57-59). Studies of the wild-type protein and mutants W201Y and W220Y demonstrated that W201 is buried and has a smaller contribution to the fluorescence spectrum than W220, which is partially exposed to solvent in the absence of inducer (201-206). The fluorescence data also revealed anisotropic rotations for the macromolecule (202, 207) consistent with an elongated, flexible protein. In the presence of inducer, the maximum of fluorescence emission shifts to shorter wavelength. Since W220 is protected from oxidation by ultraviolet (uv) irradiation or N-bromosuccinimide treatment in the presence of IPTG (208-210), this blue shift is apparently due to a direct interaction between the indole side chain and the sugar. Because the fluorescence shift has been shown to correlate directly with IPTG occupancy, this method has been employed in many equilibrium and kinetic studies of sugar association with LacI and its variants (see Section III,B).
The involvement of lysine residues in LacI function has been demonstrated by chemical modification using dansyl chloride or trinitrobenzenesulfonate (211, 212). Reactivity of K290 and K327 was affected by IPTG binding, while K33, K37, and K108 were found to be protected by the presence of DNA (211, 212). K37 modification by dansyl chloride correlates with loss of operator DNA-binding activity (212). Arginine involvement in operator binding was indicated by loss of activity by modification with reagents specific for this side chain (2,3-butanedione and phenylglyoxal) (213). Although DNA protected against this loss of activity, the specific side chains involved were not identified due to reversibility of the product under conditions required for mapping the sites of modification (213). Histidine modification with diethylpyrocarbonate resulted in loss of operator affinity, and H29 was found to be the site of reaction (214). Tyrosine involvement in DNA binding was first noted from the effects of iodination of Y7, Y12, and Y17 on operator affinity (215). Residues Y7 and Y17 were further implicated in DNA binding by the pattern of activity loss upon treatment with tetranitromethane, and Y204 reactivity with this reagent was found to be diminished by inducer presence (216, 217). In general, these effects noted by chemical studies correlate well with expected influence of residues based on the structure of LacI (see Section V).
C. Conformational Change
Ultraviolet spectroscopy and sedimentation studies provided the first indications of global conformational shifts in the protein associated with inducer binding (177, 218-220). This rearrangement affects uv absorbance of tyrosine and tryptophan residues (205, 218, 219) and visible absorbance of cysteine residues modified with nitrophenol probes (176, 221). C107 and C281 shift to more polar environments on IPTG binding, while C140 undergoes a change to less polar surroundings (221). NMR studies demonstrate that environments surrounding tyrosine residues in the core domain (Y204 and Y282) are also altered by inducer (222). Temperature-jump studies further suggested a conformational transition between two states in LacI that can be influenced by inducer presence (223). 8-Anilino-1-naphthalenesulfonate (ANS) binding to LacI appears to occur at the interface between the N-terminal and core domains, and its fluorescence is altered by protein association with DNA (224, 225). However, detailed information on the interface between the N-terminal and core domains and the specifics of conformational changes in LacI has awaited the crystallization and structure determination for this protein.
V. NMR and X-ray Crystallographic Structures
The structure of the N-terminal domain has been solved by NMR and structures of the core domain and intact protein in various complexes by x-ray crystallography (17, 18, 72-77). The results of these structural determinations confirm most of the deductions based on genetic, physical, and chemical analysis of LacI, and the detailed arrangement of amino acids provides a framework for understanding behavior of the protein, for comprehending the details of the induction process, and for designing future studies of this system.
A. N-Terminal Domain
The N-terminal domain is sufficiently small to allow determination of its structure using two-dimensional NMR methods, distance geometry, and molecular dynamics (Fig. 7) (72-77). The N-terminal amino acids 1-51 fold into a three-helical structure arranged in a classical helix-turn-helix motif (72-77). In the presence of operator DNA, a specific complex is formed, with the recognition helix oriented in the opposite direction from other helix-turn-helix proteins, confirming the interpretations of previous studies (164-166). The side chains of amino acids Y17, Q18, T19, R22, and H29 make specific contacts with the operator DNA sequence (72-77, 226). Additional contacts are made by L6, S21, N25, Q26, Y47, N50, and R51 (76). For most of the sequence, only small differences are observed between the free N-terminal domain and its complex with operator DNA; however, significant shifts have been observed in the loop between helices 2 and 3 of the helix-turn-helix motif (76, 77).
FIG. 7. N-terminal domain of the lactose repressor protein complexed with operator DNA. The structure of the isolated N-terminus of LacI in complex with operator was determined by two-dimensional NMR methods (76). Coordinates were obtained from PDB file 1LCC. The structure consists of a classical helix-turn-helix motif found in many DNA-binding proteins (71). The DNA structure is not significantly perturbed by binding of this domain. The structure of the free N-terminal domain has also been determined and differs primarily in the loop between helices 2 and 3 (77).
B. Core, Intact, and Ligand-Bound Forms
X-ray crystallographic analyses of the tetrameric core domain (17) and intact protein free and bound to operator or inducer (18) have been completed. In addition, structures of the homologous purine repressor have been determined (19, 20). The features of LacI structure and assembly determined by other methods are apparent in these structures (Fig. 8):
1. The N-terminal helix-turn-helix domain is the DNA-binding site and is linked by a flexible hinge to the core protein.
2. The monomer interface involves a large segment of the core protein that is folded to form a structure similar to the periplasmic sugar-binding proteins.
3. The dimer interface is generated by a relatively short segment containing leucine heptad repeats at the C-terminus of the protein that forms an antiparallel four-helical bundle.
1. N-TERMINAL DOMAIN
The N-terminus in the intact protein complexed with DNA appears to be folded similarly to the structure of the free and complexed domain determined by NMR methods, forming a helix-turn-helix that fits into the major groove of the operator DNA sequence. The orientation of the recognition helix is similar in LacI and purine repressor (PurR) structures, but differs from the known structures of other prokaryotic helix-turn-helix proteins (18, 19). The hinge region that connects this DNA-binding region to the core domain is exposed, an arrangement that presumably accounts for the susceptibility of this sequence to proteolysis in the absence of DNA (33-35). Although the side chains for the N-terminus are not resolved in the x-ray structure completed, the arrangement is consistent with the contacts determined for the isolated N-terminal domain by NMR (76). The mobility of this region indicated by early NMR studies (84) is echoed in the absence of detectable electron density for the N-terminus in the free protein and LacI-IPTG structures (18).
In addition to the contacts made in the major groove and along the backbone by the recognition helix, the x-ray structures of both LacI and PurR proteins indicate a helix formed from amino acids ~50-59 that contacts DNA in the minor groove, generating an ~45° kink in the DNA structure (18, 19). The bend is away from the protein and unwinds the central base pairs of the operator sequence by approximately 40-50°, a value consistent with early studies of DNA unwinding by LacI (227). Not only is the hinge region involved in direct interaction, but this helix may be required for the proper orientation of the recognition helix and therefore for site-specific DNA binding (18). This helix appears to be present only when the protein is complexed with DNA and may account for the coupling of protein folding (removal of apolar residues from solvent) with DNA binding (145, 159, 159a). Studies to examine this concerted folding-binding phenomenon are underway in several laboratories.
FIG. 8. X-ray crystallographic structures of the core domain and the intact protein complexed with operator DNA. (A) Structure of the core domain-IPTG complex determined by Friedman et al. (17). Coordinates were obtained from PDB file 1TLF. The orientation of the structure is along the dimer interface to highlight the C-terminal four-helical bundle. (B) Structure of the LacI-IPTG complex shown in panel A oriented to look down into the “V”-shaped groove as well as to view the monomer interface. (C) Structure of the LacI-IPTG complex determined by Lewis et al. (18). Coordinates were obtained from PDB file 1LBH. The structure is shown in the same orientation as for panel A. Note that the N-terminal domain is not observed in the LacI-IPTG structure, presumably due to the motional freedom of this region of the protein (84). The core domain structures for the isolated core and intact protein (17, 18) are very similar and validate multiple observations and previous modeling efforts (see text). The sugar-binding cleft, sandwiched between the N- and C-subdomains of the core, and the two independent subunit interfaces are readily apparent in the structures. The monomer interface encompasses a broad region of the protein, while the dimer interface involves only the 18 C-terminal residues. (D) Structure of the LacI-LacO complex determined by Lewis et al. (18). Coordinates were obtained from PDB file 1LBG. The N-terminal domains “cross over” for the recognition helices in the helix-turn-helix structure to make contact with the DNA sequence in the major groove. The hinge helices contact the minor groove, resulting in a bend of ~45° in the DNA structure away from the protein. The conformational change with inducer appears to generate closure of the sugar-binding site, reorientation of the N-subdomains of the core, consequent misalignment of the N-terminal recognition helix for high-affinity DNA binding, and alteration of monomer interface contacts. This rearrangement has only minimal effect on the C-subdomain of the core and on the dimer interface. (See Color plate.)
2. CORE DOMAIN
The fold of the central core domain of the monomer is similar to the periplasmic sugar-binding proteins, as initially predicted based on sequence similarity (69, 78, 79). A central six-stranded β-sheet is flanked by four helices in each of two subdomains per monomer; three crossover sequences occur between the two subdomains (17, 18). The deep cleft between these two subdomains is the sugar-binding site, with side chain contacts from both segments of the protein sequence to the sugar (17, 18). Consistent with results from detailed examination of site-specific mutant proteins (183, 184, 204), R197, D274, and W220 are involved in sugar contacts. In addition, side chains from S69, R101, D149, and N246 appear to be involved in hydrogen bond interactions and side chains from L73, A75, P76, I79, F293, and L296 in hydrophobic contacts (17, 18).
The C-terminal region 340-357, encompassing the leucine heptad repeat essential for tetramer formation, forms a four-helical bundle that is independent of the remaining structure (17, 18). As discussed previously, this region is capable of forming stable tetrameric structures when separated from the remainder of the LacI sequence (95). The buried surface area of this C-terminal dimer interface (~3900 Å²) is comparable to that for the extensive monomer interface (~3000 Å²) (17). Thus these two interfaces contribute approximately equivalent free energies to the structure of the tetramer.
The arrangement of monomers in the tetramer was anticipated to be plane rectangular (Fig. 2) based on low-angle x-ray and neutron scattering studies and low-resolution electron microscopic analyses of microcrystals (80-83). However, all forms of the LacI protein were found to assume a “V” shape, bent at the C-terminus to bring the N-termini onto the same side of the protein rather than occurring at opposite ends. Notably, early electron microscopic results suggested a type of “V”-shaped structure (85, 86). A symmetrical tetramerization domain within an overall asymmetrical structure results in different conformations for each of the identical strands that connects the C-terminal oligomerization domain to the core region (17, 18). Interestingly, based on the small surface area (~300 Å²) buried upon formation of the bent structure, the stability of the “V” shape is hypothesized to be low, and the dimers may adopt a variety of relative orientations to facilitate DNA contact by both operator-binding sites in the protein (17, 18).
C. Integrating Physical, Chemical, and Genetic Information
The crystal structures confirm the conclusions reached by many physical, chemical, and genetic studies of assembly, of amino acid side chain participation in binding properties of the protein, and of the role of specific residues in phenotypic behavior. Assembly of the protein via two separable subunit interfaces, one broadly dispersed in the sequence and the other confined to the C-terminus, was deduced based on behavior of specific mutants of the protein (60-67, 90, 190). Further, the confinement of intersubunit communication regarding inducer binding to the dimeric species (65) has been confirmed by the structural analysis. The wide array of amino acids involved in the monomer interaction, both in assembly and in communication between subunits, has been emphasized further by the crystal structures. Multiple contacts are made between subunits across a large interface, and subtle changes in the interactions may influence cooperativity for inducer binding and thereby exert an influence on inducibility (17, 18). The crystal structures identify a number of amino acid pairs that may play a role in the conformational change that accompanies inducer binding and diminishes operator affinity (e.g., K84-E100, Q117-R118, H74-D278) (17, 18). A great deal of further study will be necessary to decipher the specific events that are key to the induction process.
The structures illuminate the effects of inducer binding and consequently the specific mechanism by which inducer binding influences operator affinity. The sugar-binding and DNA-binding sites are distant in the structure, and the profound influence of inducer sugars on DNA binding affinity must be mediated through conformational effects on the protein. Even before the structures were solved, the cracking of LacI crystals by exposure to IPTG reflected the effects of sugar on the protein conformation (228).
Although no electron density is observed for the N-terminal region in free or
IPTG-bound repressor structures, the position of the peptide backbone near the core domain indicates that bound sugar alters the spatial relationship between N-terminal domains (18). Furthermore, the conformational shift changes the orientation of the core N-subdomains with respect to the C-subdomain within each monomer and with respect to the N-subdomain in the adjacent subunit (18). This motion separates the amino acids corresponding to the C-terminal ends of the hinge helices by about 4 Å and presumably results in misalignment of the N-terminal recognition helices, thus precluding high-affinity binding to operator DNA (18). The N-subdomain monomer interface appears to undergo considerable rearrangement as a consequence of inducer binding, while the C-subdomain is largely unaffected (18). Similar structural effects have been observed for PurR binding to corepressor, although only the core domain of this protein has been examined in the ligand-free form (19, 20).
VI. Applications of LacI Control
The LacI-LacO system is used widely to regulate expression of genes in bacterial cells, as reflected in the multiple commercially available vector systems that employ this protein in cloning genes and overexpressing their protein products (e.g., 229, 230). The high affinity of LacI for LacO ensures minimum protein production in the absence of inducing sugars, an arrangement important for a cloned gene for which the protein product may be toxic to the cell. IPTG is used in these cloning systems as a signal to initiate mRNA synthesis at selected times. Where efficient repression is essential for bacterial survival, a second operator site can be introduced to ensure even lower constitutive expression of the cloned gene. In some cases, RNA polymerase of T7 phage is under LacI-LacO control, and the gene to be expressed is placed behind a unique T7 promoter sequence. Because this gene is the only target for T7 polymerase, its mRNA and the consequent protein production can constitute a significant portion of the cellular output, which facilitates purification of the desired protein. The LacI-LacO system has also been employed in eukaryotic systems, both to control expression and in altered form to facilitate transcription. Transfection of LacI expression vectors in mammalian cells, plant cells, or Xenopus oocytes can block transcription of a reporter gene under LacO control, and the effects of operator placement with respect to transcription initiation sites have been examined (231-234, 234a). Similar to applications in bacterial expression, regulated expression of foreign genes in mammalian cells was achieved using phage T3 polymerase under LacI control (235). Fusion of the simian virus 40 nuclear localization signal and a transactivation
domain from herpesvirus protein 16 to the C-terminus of LacI resulted in a protein that served as a transcriptional activator at LacO sites within transfected DNA (236, 237). This protein was undoubtedly dimeric, since the fusion eliminated the C-terminal leucine heptad repeat sequences necessary for tetramer assembly. Although most experiments in eukaryotes have utilized wild-type LacI, the effects of nuclear localization sequences (NLS) on LacI binding activity and on nuclear accumulation were examined in anticipation of in vivo applications of this system in transgenic animals (238). The optimum placement of nuclear localization signals is as an extension to the C-terminus of the protein, and significant nuclear accumulation of LacI-NLS protein is observed, in contrast to the equivalent cytoplasmic and nuclear levels for wild-type protein (238). The ability of mammalian cells, both in culture and in the intact animal, to take up IPTG at levels sufficient for induction demonstrates the utility of this system even for whole animals (239). Initial applications in transgenic LacI mice focused on detecting mutation frequencies and types following exposure to a variety of mutagens and carcinogens (240-242). An interesting note is that the spectrum of mutations observed in human cells is significantly different from that found in E. coli, in which the mutation of LacI has been well studied (243). The information gathered from such analyses will provide significant insight into the mechanisms of mutagenesis and the connection between mutation and carcinogenesis.
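The repression logic described at the start of this section (tight operator binding in the absence of inducer, relieved when IPTG lowers operator affinity) can be illustrated with a simple one-site binding isotherm. The sketch below is not taken from the studies cited here. It assumes a single operator, a free repressor concentration approximately equal to the total, and expression proportional to the fraction of unoccupied operator. The function names and every concentration and dissociation constant are hypothetical values chosen only for illustration.

```python
def operator_occupancy(repressor_nM: float, kd_nM: float) -> float:
    """Fraction of operator bound for simple 1:1 binding: R / (R + Kd)."""
    if repressor_nM < 0 or kd_nM <= 0:
        raise ValueError("repressor_nM must be >= 0 and kd_nM must be > 0")
    return repressor_nM / (repressor_nM + kd_nM)


def relative_expression(repressor_nM: float, kd_nM: float) -> float:
    """Expression relative to the fully derepressed level.

    Assumption: expression is proportional to the free-operator fraction.
    """
    return 1.0 - operator_occupancy(repressor_nM, kd_nM)


if __name__ == "__main__":
    repressor_nM = 10.0      # hypothetical free repressor concentration
    kd_uninduced_nM = 0.001  # hypothetical tight binding without inducer
    kd_induced_nM = 1000.0   # hypothetical weakened binding with inducer bound

    for label, kd in [("uninduced", kd_uninduced_nM), ("induced", kd_induced_nM)]:
        level = relative_expression(repressor_nM, kd)
        print(f"{label}: relative expression = {level:.4f}")
```

In this simplified picture, a lower dissociation constant gives lower basal expression. The additional repression gained from a second operator, mentioned above, would require DNA-looping terms that this sketch does not include.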
VII. Conclusion and Prospects for the Future
As the first system of genetic regulatory control to be elucidated, the lactose repressor protein and operator DNA have served in a variety of pioneering studies, from their initial isolation to generation of targets for mutagenesis in transgenic animals. Many fundamentals of genetic regulation are exhibited by this protein: the requirement of assembly to oligomer for DNA binding, recognition by the oligomer of a specific DNA sequence, and modulation of this site-specific binding by ligand. The lactose repressor protein also serves as a paradigm for allostery. Its structure is an interesting composite of fundamental motifs for folding and assembly: helix-turn-helix DNA-binding domain, “conventional” sugar-binding site formed by β-scaffold and α-helices, extended monomer interface through which cooperativity is communicated, and four-helical bundle for dimer association. Determination of the crystallographic structures (17, 18) opens a new frontier for future experimentation. Detailed elucidation of the induction mechanism, including cooperativity of inducer binding, is among the interesting pathways that can now be explored effectively. Studies of the lac repressor
protein have taken many turns along the convoluted route to our current understanding of its structure and function. No doubt many surprises still await us on this fascinating journey initiated by the keen insight and skillful deduction of Jacob and Monod.
ACKNOWLEDGMENTS
The work reported from this laboratory was supported by grants from the National Institutes of Health (GM22441) and the Robert A. Welch Foundation (C-576) and employed facilities of the Keck Center for Computational Biology. J. C. N. was supported by an NIH Biotechnology Training Grant. We thank M. Thomas Record and colleagues and M. Kercher, P. Lu, and M. Lewis for permission to cite their unpublished results. We appreciate critical input from Catherine Falcon, Nicole Magnasco, Liskin Swint-Kruse, and Diane Wycuff and discussions with other members of the Matthews laboratory.
REFERENCES
1. F. Jacob and J. Monod, J. Mol. Biol. 3, 318 (1961).
2. W. Gilbert and B. Müller-Hill, Proc. Natl. Acad. Sci. U.S.A. 56, 1891 (1966).
3. A. D. Riggs and S. Bourgeois, J. Mol. Biol. 34, 361 (1968).
4. W. Gilbert and B. Müller-Hill, Proc. Natl. Acad. Sci. U.S.A. 58, 2415 (1967).
5. A. D. Riggs, S. Bourgeois, R. F. Newby, and M. Cohn, J. Mol. Biol. 34, 365 (1968).
6. J. Monod, J. Wyman, and J.-P. Changeux, J. Mol. Biol. 12, 88 (1965).
6a. A. Kolb, S. Busby, H. Buc, S. Garges, and S. Adhya, Annu. Rev. Biochem. 62, 749 (1993).
7. W. Gilbert and A. Maxam, Proc. Natl. Acad. Sci. U.S.A. 70, 3581 (1973).
8. P. J. Schlax, M. W. Capp, and M. T. Record, Jr., J. Mol. Biol. 245, 331 (1995).
9. S. B. Straney and D. M. Crothers, Cell 51, 699 (1987).
10. J. Lee and A. Goldfarb, Cell 66, 793 (1991).
11. A. Jobe and S. Bourgeois, J. Mol. Biol. 69, 397 (1972).
12. A. D. Riggs, H. Suzuki, and S. Bourgeois, J. Mol. Biol. 48, 67 (1970).
13. S.-y. Lin and A. D. Riggs, Biochem. Biophys. Res. Commun. 62, 704 (1975).
14. S.-y. Lin and A. D. Riggs, Cell 4, 107 (1975).
15. P. H. von Hippel, A. Revzin, C. A. Gross, and A. C. Wang, Proc. Natl. Acad. Sci. U.S.A. 71, 4808 (1974).
16. Y. Kao-Huang, A. Revzin, A. P. Butler, P. O'Conner, D. W. Noble, and P. H. von Hippel, Proc. Natl. Acad. Sci. U.S.A. 74, 4228 (1977).
17. A. M. Friedman, T. O. Fischmann, and T. A. Steitz, Science 268, 1721 (1995).
18. M. Lewis, G. Chang, N. C. Horton, M. A. Kercher, H. C. Pace, M. A. Schumacher, R. G. Brennan, and P. Lu, Science 271, 1247 (1996).
19. M. A. Schumacher, K. Y. Choi, H. Zalkin, and R. G. Brennan, Science 266, 763 (1994).
20. M. A. Schumacher, K. Y. Choi, F. Lu, H. Zalkin, and R. G. Brennan, Cell 83, 147 (1995).
21. J. H. Miller and W. S. Reznikoff, eds., “The Operon.” Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1980).
21a. J. R. Beckwith and D. Zipser, eds., “The Lactose Operon.” Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1970).
22. B. Müller-Hill, B. Gronenborn, J. Kania, M. Schlotmann, and K. Beyreuther, in “Nucleic Acid-Protein Recognition” (H. J. Vogel, ed.), p. 219. Academic Press, New York, 1977.
23. B. Müller-Hill, Angew. Chem. Internat. Edit. 10, 160 (1971).
24. S. Bourgeois and M. Pfahl, Adv. Prot. Chem. 30, 1 (1976).
25. M. D. Barkley and S. Bourgeois, in “Regulatory Biology” (J. C. Copeland and G. A. Marzluf, eds.), p. 66. Ohio State University Press, Columbus, 1977.
26. B. Müller-Hill, Prog. Biophys. Mol. Biol. 30, 227 (1975).
26a. A. E. Chakerian and K. S. Matthews, Int. J. Biochem. 20, 493 (1988).
26b. A. E. Chakerian and K. S. Matthews, Mol. Microbiol. 6, 963 (1992).
27. H. Sund and G. Blauer, eds., “Protein-Ligand Interactions,” Section III. de Gruyter, New York, 1975.
28. B. Müller-Hill, BioEssays 12, 41 (1990).
29. A. P. Butler, A. Revzin, and P. H. von Hippel, Biochemistry 16, 4757 (1977).
30. R. B. O'Gorman, J. M. Rosenberg, O. B. Kallai, R. E. Dickerson, K. Itakura, A. D. Riggs, and K. S. Matthews, J. Biol. Chem. 255, 10107 (1980).
31. R. B. O'Gorman, M. Dunaway, and K. S. Matthews, J. Biol. Chem. 255, 10100 (1980).
32. P. A. Whitson and K. S. Matthews, Biochemistry 25, 3845 (1986).
32a. F. Culard and J. C. Maurizot, Nucleic Acids Res. 9, 5175 (1981).
33. T. Platt, J. G. Files, and K. Weber, J. Biol. Chem. 248, 110 (1973).
34. N. Geisler and K. Weber, FEBS Lett. 87, 215 (1978).
35. J. G. Files and K. Weber, J. Biol. Chem. 251, 3386 (1976).
36. B. E. Friedman and K. S. Matthews, Biochem. Biophys. Res. Commun. 85, 497 (1978).
37. N. Geisler and K. Weber, Biochemistry 16, 938 (1977).
38. T. M. Jovin, N. Geisler, and K. Weber, Nature (London) 269, 668 (1977).
39. R. T. Ogata and W. Gilbert, Proc. Natl. Acad. Sci. U.S.A. 75, 5851 (1978).
40. K. Adler, K. Beyreuther, E. Fanning, N. Geisler, B. Gronenborn, A. Klemm, B. Müller-Hill, M. Pfahl, and A. Schmitz, Nature (London) 237, 322 (1972).
41. K. S. Matthews, J. Biol. Chem. 254, 3348 (1979).
42. N. Geisler and K. Weber, Proc. Natl. Acad. Sci. U.S.A. 73, 3103 (1976).
43. M. Dunaway and K. S. Matthews, J. Biol. Chem. 255, 10120 (1980).
43a. M. Pfahl, Genetics 72, 393 (1972).
44. J. H. Miller, D. Ganem, P. Lu, and A. Schmitz, J. Mol. Biol. 109, 275 (1977).
45. U. Schmeissner, D. Ganem, and J. H. Miller, J. Mol. Biol. 109, 303 (1977).
46. J. H. Miller, C. Coulondre, M. Hofer, U. Schmeissner, H. Sommer, A. Schmitz, and P. Lu, J. Mol. Biol. 131, 191 (1979).
47. J. H. Miller and U. Schmeissner, J. Mol. Biol. 131, 223 (1979).
48. M. Pfahl, C. Stocker, and B. Gronenborn, Genetics 76, 669 (1974).
49. R. M. Schaaper, B. N. Danforth, and B. W. Glickman, J. Mol. Biol. 189, 273 (1986).
50. J. E. LeClerc, J. R. Christensen, P. V. Tata, R. B. Christensen, and C. W. Lawrence, J. Mol. Biol. 203, 619 (1988).
51. A. J. E. Gordon, P. A. Burns, D. F. Fix, F. Yatagai, F. L. Allen, M. J. Horsfall, J. A. Halliday, J. Gray, C. Bernelot-Moens, and B. W. Glickman, J. Mol. Biol. 200, 239 (1988).
52. J. H. Miller, J. Mol. Biol. 131, 249 (1979).
53. L. G. Kleina and J. H. Miller, J. Mol. Biol. 212, 295 (1990).
54. P. Markiewicz, L. G. Kleina, C. Cruz, S. Ehret, and J. H. Miller, J. Mol. Biol. 240, 421 (1994).
55. P. Y. Chou, A. J. Adler, and G. D. Fasman, J. Mol. Biol. 96, 29 (1975).
56. S. Bourgeois, R. L. Jernigan, S. C. Szu, E. A. Kabat, and T. T. Wu, Biopolymers 18, 2625 (1979).
57. K. Beyreuther, K. Adler, N. Geisler, and A. Klemm, Proc. Natl. Acad. Sci. U.S.A. 70, 3576 (1973).
58. K. Beyreuther, K. Adler, E. Fanning, C. Murray, A. Klemm, and N. Geisler, Eur. J. Biochem. 59, 491 (1975).
59. P. J. Farabaugh, Nature (London) 274, 765 (1978).
60. A. Schmitz, U. Schmeissner, J. H. Miller, and P. Lu, J. Biol. Chem. 251, 3359 (1976).
61. J. H. Miller, T. Platt, and K. Weber, in “The Lactose Operon” (J. R. Beckwith and D. Zipser, eds.), p. 343. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1970.
62. A. E. Chakerian, V. M. Tesmer, S. P. Manly, J. K. Brackett, M. J. Lynch, J. T. Hoh, and K. S. Matthews, J. Biol. Chem. 266, 1371 (1991).
63. S. Alberti, S. Oehler, B. von Wilcken-Bergmann, H. Krämer, and B. Müller-Hill, New Biologist 3, 57 (1991).
64. M. Brenowitz, N. Mandal, A. Pickar, E. Jamison, and S. Adhya, J. Biol. Chem. 266, 1281 (1991).
65. J. Chen and K. S. Matthews, J. Biol. Chem. 267, 13843 (1992).
66. S. Alberti, S. Oehler, B. von Wilcken-Bergmann, and B. Müller-Hill, EMBO J. 12, 3227 (1993).
67. J. Chen, S. Alberti, and K. S. Matthews, J. Biol. Chem. 269, 12482 (1994).
68. M. J. Weickert and S. Adhya, J. Biol. Chem. 267, 15869 (1992).
69. B. Müller-Hill, Nature (London) 302, 163 (1983).
70. B. von Wilcken-Bergmann and B. Müller-Hill, Proc. Natl. Acad. Sci. U.S.A. 79, 2427 (1982).
71. R. G. Brennan and B. W. Matthews, J. Biol. Chem. 264, 1903 (1989).
72. R. Kaptein, E. R. P. Zuiderweg, R. M. Scheek, R. Boelens, and W. F. van Gunsteren, J. Mol. Biol. 182, 179 (1985).
73. R. M. J. N. Lamerichs, R. Boelens, G. A. Van Der Marel, J. H. Van Boom, and R. Kaptein, Eur. J. Biochem. 194, 629 (1990).
74. J. de Vlieg, R. M. Scheek, W. F. van Gunsteren, H. J. C. Berendsen, R. Kaptein, and J. Thomason, Proteins 3, 209 (1988).
75. J. de Vlieg, H. J. C. Berendsen, and W. F. van Gunsteren, Proteins 6, 104 (1989).
76. V. P. Chuprina, J. A. C. Rullmann, R. M. J. N. Lamerichs, J. H. van Boom, R. Boelens, and R. Kaptein, J. Mol. Biol. 234, 446 (1993).
77. M. Slijper, A. M. J. J. Bonvin, R. Boelens, and R. Kaptein, J. Mol. Biol. 259, 761 (1996).
78. C. F. Sams, N. K. Vyas, F. A. Quiocho, and K. S. Matthews, Nature (London) 310, 429 (1984).
79. J. C. Nichols, N. K. Vyas, F. A. Quiocho, and K. S. Matthews, J. Biol. Chem. 268, 17602 (1993).
80. T. A. Steitz, T. J. Richmond, D. Wise, and D. Engelman, Proc. Natl. Acad. Sci. U.S.A. 71, 593 (1974).
81. M. Charlier, J. C. Maurizot, and G. Zaccai, Nature (London) 286, 423 (1980).
82. I. Pilz, K. Coral, O. Kratky, R. P. Bray, N. G. Wade-Jardetzky, and O. Jardetzky, Biochemistry 19, 4087 (1980).
83. D. B. McKay, C. A. Pickover, and T. A. Steitz, J. Mol. Biol. 156, 175 (1982).
84. N. Wade-Jardetzky, R. P. Bray, W. W. Conover, O. Jardetzky, N. Geisler, and K. Weber, J. Mol. Biol. 128, 259 (1979).
85. Y. Ohshima, T. Horiuchi, and M. Yanagida, J. Mol. Biol. 91, 515 (1975).
86. H. P. Zingsheim, N. Geisler, F. Mayer, and K. Weber, J. Mol. Biol. 115, 565 (1977).
87. B. Müller-Hill and J. Kania, Nature (London) 249, 561 (1974).
88. J. Kania and D. T. Brown, Proc. Natl. Acad. Sci. U.S.A. 73, 3529 (1976).
89. J. Kania and B. Müller-Hill, Eur. J. Biochem. 79, 381 (1977).
90. J. Chen and K. S. Matthews, Gene 111, 145 (1992).
91. T. J. Daly and K. S. Matthews, Biochemistry 25, 5474 (1986).
92. W.-I. Chang, J. S. Olson, and K. S. Matthews, J. Biol. Chem. 268, 17613 (1993).
93. J. C. Nichols, Ph.D. Thesis, Rice University, 1995.
93a. J. C. Nichols and K. S. Matthews, J. Biol. Chem., in press (1997).
94. J. Chen, R. Surendran, J. C. Lee, and K. S. Matthews, Biochemistry 33, 1234 (1994).
95. R. Fairman, H.-G. Chao, L. Mueller, T. B. Lavoie, L. Shen, J. Novotny, and G. R. Matsueda, Prot. Sci. 4, 1457 (1995).
95a. M. M. Levandoski, O. V. Tsodikov, D. E. Frank, S. E. Melcher, R. M. Saecker, and M. T. Record, Jr., J. Mol. Biol. 260, 697 (1996).
96. C. A. Royer, G. Weber, T. J. Daly, and K. S. Matthews, Biochemistry 25, 8308 (1986).
97. C. A. Royer, A. E. Chakerian, and K. S. Matthews, Biochemistry 29, 4959 (1990).
98. F. Rastinejad, P. Artz, and P. Lu, J. Mol. Biol. 233, 389 (1993).
99. J. L. Betz, H. M. Sasmor, F. Buck, M. Y. Insley, and M. H. Caruthers, Gene 50, 123 (1986).
100. A. Simons, D. Tils, B. von Wilcken-Bergmann, and B. Müller-Hill, Proc. Natl. Acad. Sci. U.S.A. 81, 1624 (1984).
101. J. R. Sadler, H. Sasmor, and J. L. Betz, Proc. Natl. Acad. Sci. U.S.A. 80, 6785 (1983).
102. R. Ogata and W. Gilbert, Proc. Natl. Acad. Sci. U.S.A. 74, 4973 (1977).
103. W. Gilbert, A. Maxam, and A. Mirzabekov, in “Control of Ribosome Synthesis, Alfred Benzon Symposium IX” (N. C. Kjeldgaard and O. Maaløe, eds.), p. 139. Academic Press, New York, 1976.
104. R. T. Ogata and W. Gilbert, J. Mol. Biol. 132, 709 (1979).
105. D. V. Goeddel, D. G. Yansura, and M. H. Caruthers, Proc. Natl. Acad. Sci. U.S.A. 75, 3578 (1978).
106. D. Shalloway, T. Kleinberger, and D. M. Livingston, Cell 20, 411 (1980).
106a. S. P. Manly, G. N. Bennett, and K. S. Matthews, Proc. Natl. Acad. Sci. U.S.A. 80, 6219 (1983).
107. A. Schmitz and D. J. Galas, Nucleic Acids Res. 8, 487 (1980).
107a. S. P. Manly and K. S. Matthews, J. Mol. Biol. 179, 315 (1984).
108. S. P. Manly, G. N. Bennett, and K. S. Matthews, J. Mol. Biol. 179, 335 (1984).
109. M. H. Caruthers, S. L. Beaucage, J. W. Efcavitch, E. F. Fisher, R. A. Goldman, P. L. deHaseth, W. Mandecki, M. D. Matteucci, M. S. Rosendahl, and Y. Stabinsky, Cold Spring Harbor Symp. Quant. Biol. 47, 411 (1983).
110. B. Barbier, M. Charlier, and J.-C. Maurizot, Biochemistry 23, 2933 (1984).
111. K. L. Wick and K. S. Matthews, J. Biol. Chem. 266, 6106 (1991).
112. T. D. Allen, K. L. Wick, and K. S. Matthews, J. Biol. Chem. 266, 6113 (1991).
113. W. S. Reznikoff, R. B. Winter, and C. K. Hurley, Proc. Natl. Acad. Sci. U.S.A. 71, 2314 (1974).
114. M. Pfahl, V. Gulde, and S. Bourgeois, J. Mol. Biol. 127, 339 (1979).
115. O. G. Berg and C. Blomberg, Biophys. Chem. 8, 271 (1978).
116. O. G. Berg, R. B. Winter, and P. H. von Hippel, Trends Biochem. Sci. 7, 52 (1982).
117. P. H. von Hippel and O. G. Berg, J. Biol. Chem. 264, 675 (1989).
118. O. G. Berg, R. B. Winter, and P. H. von Hippel, Biochemistry 20, 6929 (1981).
119. O. G. Berg and P. H. von Hippel, Annu. Rev. Biophys. Biophys. Chem. 14, 131 (1985).
120. T. Ruusala and D. M. Crothers, Proc. Natl. Acad. Sci. U.S.A. 89, 4903 (1992).
121. P. A. Whitson, J. S. Olson, and K. S. Matthews, Biochemistry 25, 3852 (1986).
122. H. Krämer, M. Niemöller, M. Amouyal, B. Revet, B. von Wilcken-Bergmann, and B. Müller-Hill, EMBO J. 6, 1481 (1987).
123. M. Brenowitz and E. Jamison, Biochemistry 32, 8693 (1993).
124. W.-T. Hsieh, P. A. Whitson, K. S. Matthews, and R. D. Wells, J. Biol. Chem. 262, 14583 (1987).
125. E. R. Eismann and B. Müller-Hill, J. Mol. Biol. 213, 763 (1990).
126. M. Brenowitz, A. Pickar, and E. Jamison, Biochemistry 30, 5986 (1991).
127. P. A. Whitson, W.-T. Hsieh, R. D. Wells, and K. S. Matthews, J. Biol. Chem. 262, 4943 (1987).
128. P. A. Whitson, W.-T. Hsieh, R. D. Wells, and K. S. Matthews, J. Biol. Chem. 262, 14592 (1987).
129. H. Krämer, M. Amouyal, A. Nordheim, and B. Müller-Hill, EMBO J. 7, 547 (1988).
130. J. A. Borowiec, L. Zhang, S. Sasse-Dwight, and J. D. Gralla, J. Mol. Biol. 196, 101 (1987).
131. M. Besse, B. von Wilcken-Bergmann, and B. Müller-Hill, EMBO J. 5, 1377 (1986).
132. M. C. Mossing and M. T. Record, Jr., Science 233, 889 (1986).
133. S. Sasse-Dwight and J. D. Gralla, J. Mol. Biol. 202, 107 (1988).
134. Y. Flashner and J. D. Gralla, Proc. Natl. Acad. Sci. U.S.A. 85, 8968 (1988).
135. S. Oehler, E. R. Eismann, H. Krämer, and B. Müller-Hill, EMBO J. 9, 973 (1990).
136. G. R. Bellomy and M. T. Record, Jr., “Structural and Organizational Aspects of Metabolic Regulation,” p. 307. Alan R. Liss, New York, 1990.
137. S. M. Law, G. R. Bellomy, P. J. Schlax, and M. T. Record, Jr., J. Mol. Biol. 230, 161 (1993).
138. S. Oehler, M. Amouyal, P. Kolkhof, B. von Wilcken-Bergmann, and B. Müller-Hill, EMBO J. 13, 3348 (1994).
139. J. Müller, S. Oehler, and B. Müller-Hill, J. Mol. Biol. 257, 21 (1996).
140. G. R. Bellomy, M. C. Mossing, and M. T. Record, Jr., Biochemistry 27, 3900 (1988).
141. A. M. Khoury, H. S. Nick, and P. Lu, J. Mol. Biol. 219, 623 (1991).
142. G. R. Bellomy and M. T. Record, Jr., Prog. Nucleic Acid Res. Mol. Biol. 39, 81 (1990).
143. L. Finzi and J. Gelles, Science 267, 378 (1995).
144. M. C. Mossing and M. T. Record, Jr., J. Mol. Biol. 186, 295 (1985).
145. D. E. Frank, R. M. Saecker, J. P. Bond, M. W. Capp, O. V. Tsodikov, S. E. Melcher, M. M. Levandoski, and M. T. Record, Jr., J. Mol. Biol. 267, 1186 (1997).
146. O. B. Kallai, J. M. Rosenberg, M. L. Kopka, T. Takano, R. E. Dickerson, J. Kan, and A. D. Riggs, Biochim. Biophys. Acta 606, 113 (1980).
147. A. D. Riggs, S. Bourgeois, and M. Cohn, J. Mol. Biol. 53, 401 (1970).
148. P. L. deHaseth, T. M. Lohman, and M. T. Record, Jr., Biochemistry 16, 4783 (1977).
149. M. T. Record, Jr., P. L. deHaseth, and T. M. Lohman, Biochemistry 16, 4791 (1977).
150. R. B. Winter, O. G. Berg, and P. H. von Hippel, Biochemistry 20, 6948 (1981).
151. T. M. Lohman, P. L. deHaseth, and M. T. Record, Jr., Biophys. Chem. 8, 281 (1978).
152. A. M. Khoury, H. J. Lee, M. Lillis, and P. Lu, Biochim. Biophys. Acta 1087, 55 (1990).
153. M. D. Barkley, Biochemistry 20, 3833 (1981).
154. M. D. Barkley, P. A. Lewis, and G. E. Sullivan, Biochemistry 20, 3842 (1981).
155. M. G. Fried and D. M. Crothers, J. Mol. Biol. 172, 263 (1984).
156. R. B. Winter and P. H. von Hippel, Biochemistry 20, 6961 (1981).
157. M. M. Levandoski, O. V. Tsodikov, D. E. Frank, S. E. Melcher, R. M. Saecker, and M. T. Record, Jr., J. Mol. Biol. 260, 697 (1996).
158. J.-H. Ha, R. S. Spolar, and M. T. Record, Jr., J. Mol. Biol. 209, 801 (1989).
159. R. S. Spolar and M. T. Record, Jr., Science 263, 777 (1994).
159a. C. A. E. M. Spronk, M. Slijper, J. H. van Boom, R. Kaptein, and R. Boelens, Nature Struct. Biol. 3, 916 (1996).
160. N. Lehming, J. Sartorius, M. Niemöller, G. Genenger, B. von Wilcken-Bergmann, and B. Müller-Hill, EMBO J. 6, 3145 (1987).
161. J. Sartorius, N. Lehming, B. Kisters, B. von Wilcken-Bergmann, and B. Müller-Hill, EMBO J. 8, 1265 (1989).
162. N. Lehming, J. Sartorius, B. Kisters-Woike, B. von Wilcken-Bergmann, and B. Müller-Hill, EMBO J. 9, 615 (1990).
163. J. Sartorius, N. Lehming, B. Kisters-Woike, B. von Wilcken-Bergmann, and B. Müller-Hill, J. Mol. Biol. 218, 313 (1991).
164. N. Lehming, J. Sartorius, B. Kisters-Woike, B. von Wilcken-Bergmann, and B. Müller-Hill, Nucleic Acids Mol. Biol. 5, 114 (1991).
165. N. Lehming, J. Sartorius, S. Oehler, B. von Wilcken-Bergmann, and B. Müller-Hill, Proc. Natl. Acad. Sci. U.S.A. 85, 7947 (1988).
166. J. A. Shin, R. H. Ebright, and P. B. Dervan, Nucleic Acids Res. 19, 5233 (1991).
167. R. H. Ebright, Proc. Natl. Acad. Sci. U.S.A. 83, 303 (1986).
168. H. Nick, K. Arndt, F. Boschelli, M. A. Jarema, M. Lillis, H. Sommer, and P. Lu, J. Mol. Biol. 161, 417 (1982).
169. H. Nick, K. Arndt, F. Boschelli, M. A. C. Jarema, M. Lillis, J. Sadler, M. Caruthers, and P. Lu, Proc. Natl. Acad. Sci. U.S.A. 79, 218 (1982).
169a. D. E. Kamashev, N. G. Esipova, K. K. Ebralidse, and A. D. Mirzabekov, FEBS Lett. 375, 27 (1995).
169b. D. Kamashev, K. Ebralidze, N. G. Esipova, and A. D. Mirzabekov, Mol. Biol. 29, 553 (1995).
170. A. Jobe and S. Bourgeois, J. Mol. Biol. 75, 303 (1973).
171. M. D. Barkley, A. D. Riggs, A. Jobe, and S. Bourgeois, Biochemistry 14, 1700 (1975).
172. A. D. Riggs, R. F. Newby, and S. Bourgeois, J. Mol. Biol. 51, 303 (1970).
173. Y. Ohshima, T. Mizokoshi, and T. Horiuchi, J. Mol. Biol. 89, 127 (1974).
174. J.-C. Maurizot and M. Charlier, Eur. J. Biochem. 79, 395 (1977).
175. F. Boschelli, M. A. C. Jarema, and P. Lu, J. Biol. Chem. 256, 11595 (1981).
176. B. E. Friedman, J. S. Olson, and K. S. Matthews, J. Biol. Chem. 251, 1171 (1976).
177. S. L. Laiken, C. A. Gross, and P. H. von Hippel, J. Mol. Biol. 66, 143 (1972).
178. B. E. Friedman, J. S. Olson, and K. S. Matthews, J. Mol. Biol. 111, 27 (1977).
179. A. E. Chakerian, J. S. Olson, and K. S. Matthews, Biochemistry 26, 7250 (1987).
180. T. J. Daly and K. S. Matthews, Biochemistry 25, 5479 (1986).
181. J. Donnér, M. H. Caruthers, and S. J. Gill, J. Biol. Chem. 257, 14826 (1982).
182. M. Dunaway, J. S. Olson, J. M. Rosenberg, O. B. Kallai, R. E. Dickerson, and K. S. Matthews, J. Biol. Chem. 255, 10115 (1980).
183. R. O. Spotts, A. E. Chakerian, and K. S. Matthews, J. Biol. Chem. 266, 22998 (1991).
184. W.-I. Chang, P. Barrera, and K. S. Matthews, Biochemistry 33, 3607 (1994).
185. W.-I. Chang and K. S. Matthews, Biochemistry 34, 9227 (1995).
186. B. von Wilcken-Bergmann and B. Müller-Hill, Proc. Natl. Acad. Sci. U.S.A. 79, 2427 (1982).
187. M. Pfahl, J. Mol. Biol. 147, 175 (1981).
188. M. Pfahl, J. Mol. Biol. 147, 1 (1981).
189. A. Schmitz, C. Coulondre, and J. H. Miller, J. Mol. Biol. 123, 431 (1978).
189a. J. Suckow, P. Markiewicz, L. G. Kleina, J. Miller, B. Kisters-Woike, and B. Müller-Hill, J. Mol. Biol. 261, 509 (1996).
190. A. E. Chakerian and K. S. Matthews, J. Biol. Chem. 266, 22206 (1991).
191. L. Li and K. S. Matthews, J. Biol. Chem. 270, 10640 (1995).
192. J. Chen and K. S. Matthews, Biochemistry 33, 8728 (1994).
193. D. S. Yang, A. A. Burgum, and K. S. Matthews, Biochim. Biophys. Acta 493, 24 (1977).
194. R. D. Brown and K. S. Matthews, J. Biol. Chem. 254, 5128 (1979).
195. T. J. Daly, J. S. Olson, and K. S. Matthews, Biochemistry 25, 5468 (1986).
196. A. A. Burgum and K. S. Matthews, J. Biol. Chem. 253, 4279 (1978).
197. J. M. Schneider, C. I. Barrett, and S. S. York, Biochemistry 23, 2221 (1984).
198. D. E. Kelsey, T. C. Rounds, and S. S. York, Proc. Natl. Acad. Sci. U.S.A. 76, 2649 (1979).
199. R. D. Brown and K. S. Matthews, J. Biol. Chem. 254, 5135 (1979).
200. S. P. Manly and K. S. Matthews, J. Biol. Chem. 254, 3341 (1979).
201. H. Sommer, P. Lu, and J. H. Miller, J. Biol. Chem. 251, 3774 (1976).
202. C. A. Royer, J. A. Gardner, J. M. Beechem, J.-C. Brochon, and K. S. Matthews, Biophys. J. 58, 363 (1990).
203. J. C. Brochon, P. Wahl, M. Charlier, J. C. Maurizot, and C. Hélène, Biochem. Biophys. Res. Commun. 79, 1261 (1977).
204. J. A. Gardner and K. S. Matthews, J. Biol. Chem. 265, 21061 (1990).
205. R. B. O'Gorman and K. S. Matthews, J. Biol. Chem. 252, 3572 (1977).
206. P. K. Bandyopadhyay and C.-W. Wu, Arch. Biochem. Biophys. 195, 558 (1979).
207. P. K. Bandyopadhyay, F. Y.-H. Wu, and C.-W. Wu, J. Mol. Biol. 145, 363 (1981).
208. R. B. O'Gorman and K. S. Matthews, J. Biol. Chem. 252, 3565 (1977).
209. M. Charlier, F. Culard, J.-C. Maurizot, and C. Hélène, Biochem. Biophys. Res. Commun. 74, 690 (1977).
210. M. Spodheim-Maurizot, M. Charlier, and C. Hélène, Photochem. Photobiol. 42, 353 (1985).
211. P. A. Whitson, A. A. Burgum, and K. S. Matthews, Biochemistry 23, 6046 (1984).
212. W.-T. Hsieh and K. S. Matthews, Biochemistry 24, 3043 (1985).
213. P. A. Whitson and K. S. Matthews, Biochemistry 26, 6502 (1987).
214. C. F. Sams and K. S. Matthews, Biochemistry 27, 2277 (1988).
215. T. C. Fanning, Biochemistry 14, 2512 (1975).
216. M. E. Alexander, A. A. Burgum, R. A. Noall, M. D. Shaw, and K. S. Matthews, Biochim. Biophys. Acta 493, 367 (1977).
217. W.-T. Hsieh and K. S. Matthews, J. Biol. Chem. 256, 4856 (1981).
218. K. S. Matthews, H. R. Matthews, H. W. Thielmann, and O. Jardetzky, Biochim. Biophys. Acta 295, 159 (1973).
219. K. S. Matthews, Biochim. Biophys. Acta 359, 334 (1974).
220. Y. Ohshima, M. Matsuura, and T. Horiuchi, Biochem. Biophys. Res. Commun. 47, 1444 (1972).
221. C. F. Sams, B. E. Friedman, A. A. Burgum, D. S. Yang, and K. S. Matthews, J. Biol. Chem. 252, 3153 (1977).
222. M. A. C. Jarema, P. Lu, and J. H. Miller, Proc. Natl. Acad. Sci. U.S.A. 78, 2707 (1981).
223. F. Y.-H. Wu, P. Bandyopadhyay, and C.-W. Wu, J. Mol. Biol. 100, 459 (1976).
224. S. S. York, R. C. Lawson, Jr., and D. M. Worah, Biochemistry 17, 4480 (1978).
225. D. M. Worah, K. M. Gibboney, L.-M. Yang, and S. S. York, Biochemistry 17, 4487 (1978).
226. K. Arndt, H. Nick, F. Boschelli, P. Lu, and J. Sadler, J. Mol. Biol. 161, 439 (1982).
227. R. Kim and S.-H. Kim, Cold Spring Harbor Symp. Quant. Biol. 47, 451 (1983).
228. H. C. Pace, P. Lu, and M. Lewis, Proc. Natl. Acad. Sci. U.S.A. 87, 1870 (1990).
229. M. J. R. Stark, Gene 51, 255 (1987).
230. J. W. Dubendorff and F. W. Studier, J. Mol. Biol. 219, 45 (1991).
231. M. C.-T. Hu and N. Davidson, Cell 48, 555 (1987).
232. J. Figge, C. Wright, C. J. Collins, T. M. Roberts, and D. M. Livingston, Cell 52, 713 (1988).
233. M. C.-T. Hu and N. Davidson, Gene 62, 301 (1988).
234. H.-S. Liu, H. Scrable, D. B. Villaret, M. A. Lieberman, and P. J. Stambrook, Cancer Res. 52, 983 (1992).
234a. R. J. Wilde, D. Shufflebottom, S. Cooke, I. Jasinska, A. Merryweather, R. Beri, W. J. Brammar, M. Bevan, and W. Schuch, EMBO J. 11, 1251 (1992).
235. U. Deuschle, R. Pepperkok, F. Wang, T. J. Giordano, W. T. McAllister, W. Ansorge, and H. Bujard, Proc. Natl. Acad. Sci. U.S.A. 86, 5400 (1989).
236. M. A. Labow, S. B. Baim, T. Shenk, and A. J. Levine, Mol. Cell. Biol. 10, 3343 (1990).
237. S. B. Baim, M. A. Labow, A. J. Levine, and T. Shenk, Proc. Natl. Acad. Sci. U.S.A. 88, 5072 (1991).
238. A. Fieck, D. L. Wyborski, and J. M. Short, Nucleic Acids Res. 20, 1785 (1992).
239. D. L. Wyborski and J. M. Short, Nucleic Acids Res. 19, 4647 (1991).
240. S. W. Kohler, G. S. Provost, A. Fieck, P. L. Kretz, W. O. Bullock, J. A. Sorge, D. L. Putman, and J. M. Short, Proc. Natl. Acad. Sci. U.S.A. 88, 7958 (1991).
241. D. Gunz, S. E. Shephard, and W. K. Lutz, Environ. Mol. Mutagen. 21, 209 (1993).
242. S. E. Shephard, C. Sengstag, W. K. Lutz, and C. Schlatter, Mutat. Res. 302, 91 (1993).
243. H. C. Hsia, J. S. Lebkowski, P.-M. Leong, M. P. Calos, and J. H. Miller, J. Mol. Biol. 205, 103 (1989).
Copper-Regulatory Domain Involved in Gene Expression
DENNIS R. WINGE
Departments of Medicine and Biochemistry
University of Utah Health Sciences Center
Salt Lake City, Utah 84132
I. Copper Ion Sensing in Prokaryotes
II. Copper Sensing in Eukaryotes
III. Copper Metalloregulation in Yeast
A. Copper-Responsive Trans-acting Factors
B. Presence of a Polycopper-Thiolate Cluster in Ace1 and Amt1
C. Dissection of Ace1 and Amt1 into Functional Domains
D. Mechanism of Cu-Mediated Transcriptional Activation
IV. Metal Clusters in Regulation
V. Summary and Perspective
References
Copper ion homeostasis in yeast is maintained through regulated expression of genes involved in copper ion uptake, Cu(I) sequestration, and defense against reactive oxygen intermediates. Positive and negative copper ion regulation is observed, and both effects are mediated by Cu(I)-sensing transcription factors. The mechanism of Cu(I) regulation is distinct for transcriptional activation versus transcriptional repression. Cu(I) activation of gene expression in S. cerevisiae and C. glabrata occurs through Cu-regulated DNA binding. The activation process involves Cu(I) cluster formation within the regulatory domain in Ace1 and Amt1. Cu(I) binding stabilizes a specific conformation capable of high-affinity interaction with specific DNA promoter sequences. Cu(I)-activated transcription factors are modular proteins in which the DNA-binding domain is distinct from the domain that mediates transcriptional activation. The all-or-nothing formation of the polycopper cluster permits a graded response of the cell to environmental copper. Cu(I) triggering may involve a metal exchange reaction converting Ace1 from a Zn(II)-specific conformer to a clustered Cu(I) conformer. The Cu(I) regulatory domain occurs in transcription factors from S. cerevisiae and C. glabrata. Sequence homologs are also known in Y. lipolytica and S. pombe, although no functional information is available for these candidate regulatory molecules. The presence of the Cu(I) regulatory domain in four distinct yeast strains suggests that this Cu-responsive domain may occur in other eukaryotes. Cu-mediated repression of gene expression in S. cerevisiae occurs through Cu(I) regulation of Mac1. Cu(I) binding to Mac1 appears to inhibit the transactivation domain. The Cu(I) specificity of this repression is likely to arise from formation of a polycopper thiolate cluster. © 1998 Academic Press
Progress in Nucleic Acid Research and Molecular Biology, Vol. 58
Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
DENNIS R. WINGE
Cells in the natural world experience a changing environment. Physiological responses to such changes enable cells to survive variation in nutrient concentrations or environmental factors. The classical work of Jacob and Monod in 1961 demonstrated that expression of a set of genes is coupled to the concentration of a nutrient, lactose (1). This work led to the concept of regulator proteins that function as cellular sensors in detecting nutrient or environmental changes. Signals from the sensors are transduced into physiological responses. A common primary response is the transcriptional regulation of genes whose products have protective functions or participate in nutrient metabolism. In the case of the genes regulated by the concentration of lactose, the regulator protein (lac repressor) represses transcription when the regulator is not bound by lactose or another inducer (2). Sugar binding to the lac repressor propagates a conformational change within the DNA-binding domain, thereby diminishing the affinity of the protein for DNA (3, 4).

A myriad of sensory mechanisms have been uncovered to detect changes in the availability of a nutrient. A common theme in sensory mechanisms involves signal transduction through conformational dynamics of sensory molecules (5, 6). Some sensory molecules share structural similarities. For example, a family of bacterial sensory proteins that mediate sugar-induced chemotaxis are unified by common structural elements (6).

We are interested in cellular mechanisms for sensing metal ions. Cells experience changes in the availability of metal ions. Under conditions of limiting levels of a particular metal ion, genes encoding proteins involved in uptake of copper, iron, and zinc ions are derepressed in yeast (7-10). In contrast, conditions of excess metal ions in the environment result in the repression of metal ion uptake genes as well as the induced expression of a different subset of genes (11-14).
Genes activated by metal ions encode proteins that typically have protective roles (11-14). How do cells sense a specific metal ion and transduce a physiological response? This review focuses on mechanisms by which cells specifically sense the intracellular copper ion concentration and transduce the signal into the physiological regulation of the cellular copper levels. The basis for the copper sensory response in yeast lies in the formation of specific polycopper clusters. The biology and chemistry of the copper signal transduction pathway are addressed. The existence of other metal clusters in general nutrient sensing is also discussed.

Copper is an essential nutrient and is now known to be an essential cofactor in nearly 20 enzymes. However, excess accumulation of copper ions results in toxicity. In fact, copper salts have historically been used as fungicides, molluscicides, and algicides (15). Copper-induced toxicity may arise, in
CU-REGULATORY DOMAIN IN GENE EXPRESSION
part, from cell damage caused by reactive oxygen intermediates. Transition metal ions (e.g., copper) can catalyze the formation of highly reactive hydroxyl radicals through the Fenton reaction (16). Hydroxyl radicals react with most biomolecules at diffusion-limited rates, resulting in polypeptide bond cleavage and DNA base and sugar oxidation (17, 18). Homeostatic mechanisms exist in all cells to regulate the cellular concentration of copper ions, thus maintaining copper balance and minimizing the deleterious effects of copper ions.

Maintenance of cellular copper homeostasis is expected to have three features. First, cells must be able to sense the intracellular concentration of copper ions or, alternatively, the concentration within the immediate environment of a cell. Second, a mechanism must exist to transduce the copper ion concentration signal into a physiological response. Third, the processes are expected to be specific for copper ions such that the regulation of the intracellular concentration of copper ions does not affect the nutritional status of other essential metal ions.

The range of the copper concentration affected by homeostatic mechanisms is not known. However, by comparison, Zn(II) homeostasis operates within a narrow concentration range (19, 20). Cultured hamster kidney cells fail to propagate when the intracellular Zn(II) concentration falls below about 0.25 fmol Zn(II)/cell due to Zn(II) deficiency or rises above 0.6 fmol Zn(II)/cell due to Zn(II) toxicity (19, 20). Cells can tolerate Zn(II) levels in excess of 0.6 fmol/cell if they are able to sequester Zn(II) (19, 20). Clearly, these cells are able to sense the intracellular Zn(II) level and respond to maintain homeostasis. The mechanism of Zn(II) sensing in mammalian cells is not understood.

Cellular sensing of a metal ion signal is best understood with respect to calcium ions.
Fluctuations in the intracellular Ca(II) concentration between lo-* and M correlate with cellular responses (21). Physiological stimuli that increase the free Ca(II) concentration upward of 10⁻⁵ M result in propagation of a signal. Sensors that couple the free Ca(II) concentration with physiological responses include calmodulin, troponin C, and S100 proteins (21, 22). Ca(II) binding to calmodulin and troponin C induces a conformational change, creating a functional response (21-23). The structural motif in these two proteins that permits reversible Ca(II) binding is the EF hand. In essence, the EF hand motif serves as a molecular switch enabling the cell to detect a stimulatory influx of Ca(II), thereby transducing the signal into a cellular response (21). If parallels exist between Ca(II) regulation and copper regulation, one may predict that cellular sensing of copper ions is achieved through conformational dynamics of a copper regulatory domain.

Information gleaned from studies on bacteria, yeast, and animal cells suggests that cells can sense the cellular copper ion concentration and respond to copper ions by regulating uptake or biosynthesis of copper-buffering polypeptides. Copper ion sensing in different species is discussed first, but the review focuses on Cu activation of gene expression in yeast, as the yeast sensors that mediate copper regulation are well characterized.
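As a point of reference for the copper toxicity discussed in this introduction, the Fenton-type chemistry by which copper ions generate hydroxyl radicals is commonly written as the redox cycle below. This is the standard textbook form rather than a scheme taken from reference (16); the superoxide-driven reduction step is included as one plausible route for regenerating Cu(I) and is an assumption made here for illustration.

```latex
\begin{align}
\mathrm{Cu^{+} + H_2O_2} &\longrightarrow \mathrm{Cu^{2+} + OH^{-} + {}^{\bullet}OH} \\
\mathrm{Cu^{2+} + O_2^{\bullet -}} &\longrightarrow \mathrm{Cu^{+} + O_2}
\end{align}
```

Summing the two steps gives the net conversion of superoxide and hydrogen peroxide to O2, OH-, and a hydroxyl radical, with the copper ion acting catalytically.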
I. Copper Ion Sensing in Prokaryotes

Diverse mechanisms of copper regulation are known in prokaryotes. Pseudomonas regulates the intracellular Cu ion concentration by controlling Cu uptake through regulated synthesis of periplasmic and outer membrane Cu-binding proteins (24). Cu metalloregulation is achieved through a two-component system involving CopR and CopS (24). Copper ions trigger the intrinsic CopS membrane protein to activate CopR through transmembrane signal-mediated phosphorylation. Phospho-CopR is an active transcription factor that stimulates expression of the cop operon. Two-component kinase regulatory systems are common mechanisms in prokaryotes for regulation of genes that respond to environmental stimuli (5). Based on homology to other signal transduction systems, CopS is functionally the copper sensor and CopR the signal transducer.

A similar, but less well characterized, two-component system exists in Escherichia coli. The physiological response in E. coli is limitation of intracellular Cu levels through energy-dependent Cu export during initial copper exposure, but copper accumulation in late stages of exposure (25). The candidate E. coli regulatory molecules, PcoS and PcoR, are homologous to the candidate sensor and transducer in Pseudomonas as well as to other signal transduction molecules (5). The candidate sensor molecules in Pseudomonas and E. coli appear to be intrinsic membrane proteins, based on the presence of hydrophobic segments in the sequences (24, 25).

In Enterococcus hirae, Cu-regulated expression of two P-type ATPases is critical for Cu homeostasis (26). One ATPase, CopA, is believed to be important in Cu uptake, whereas the second protein, CopB, appears to be a Cu-exporting ATPase (27). An interplay of two regulatory proteins, one a Cu-regulated repressor and the second a Cu-dependent activator, appears critical for mediating Cu regulation of CopA and CopB biosynthesis (26). The mechanisms of Cu sensing and signal transduction remain unresolved.
Cu-dependent regulation of alternate metalloproteins is well established in green algae (28). The availability of copper in Chlamydomonas reinhardtii is the determining factor of whether cells synthesize the Cu-containing plastocyanin or the heme-containing cytochrome c6 to mediate electron transfer
in the reaction center of photosystem I. If Cu ions are available, plastocyanin is the preferred molecule synthesized. Cu-deficient conditions result in activation of cytochrome c6 transcription. However, neither the identity of the Cu sensor nor the mechanism of Cu regulation is understood (29).
II. Copper Sensing in Eukaryotes

Copper metalloregulation is poorly understood in animal cells. One of the most intensively studied metal-regulated gene systems involves the metallothionein family of proteins (30). Mammalian cells have multiple metallothioneins (MT) that can buffer the intracellular concentration of multiple metal ions. Human disorders that impair copper egress from cells (e.g., Wilson disease) result in copious amounts of cellular CuMT complexes (31). Although targeted disruptions of the MT1 and MT2 genes in the mouse failed to produce any copper hypersensitivity phenotype, the essentiality of MT is apparent in mice lacking the Menkes copper effluxer (32, 33). MT genes are metalloregulated in most animal cells (30). One transcription factor that mediates MT gene expression is metal-responsive transcription factor 1 (MTF1) (34, 35). The role of MTF1 as a metal ion sensor remains unclear. In addition, it is unknown whether Cu-induced expression of MT genes is a direct effect of interaction of Cu ions with a cellular sensor or whether Cu ions work by altering intracellular Zn pools (35).

The yeast Saccharomyces cerevisiae provides the most complete picture of eukaryotic copper metalloregulation. Many components of copper regulation, including the sensors, have been identified in yeast. The advances in yeast may be significant and highly relevant to human biology. An expanding number of situations exist in which the combination of homology between yeast and human genes and available mechanistic information on the yeast gene function permits predictions to be made on structure-function relationships in human gene products. For example, studies on yeast MEC1 and TEL1 provided insight into the human gene that predisposes individuals to ataxia-telangiectasia (36). In fact, of 51 cloned human genes associated with disease, 13 show similarities to yeast genes (37). Yeast studies have provided clues to human cell copper homeostasis.
The cloning of genes predisposing humans to Menkes and Wilson diseases provided insight into human copper homeostasis. The Menkes and Wilson gene products are homologous to each other as well as to cation transporters of the P-type ATPase class (31). Both proteins appear to be localized within vesicular membranes and therefore function in transmembrane copper transport. Gene amplification of the Menkes gene in CHO cells confers a
phenotype of hyperresistance to copper salts through enhanced copper ion efflux (38). The contribution of yeast to understanding the physiology of the Menkes and Wilson gene products came with the discovery of two yeast homologs. One yeast homolog, designated Ccc2, was shown to be critical for iron transport (39). Disruption of CCC2 in yeast results in defects in respiration and iron transport (39). Ccc2 is critical for copper insertion into the Fet3 ferrooxidase, which functions in conjunction with Ftr1 in iron transport across the plasma membrane (9, 39). Ccc2 is predicted to function as a copper ion transporter pumping Cu ions into a vesicle, perhaps the trans-Golgi bodies, where Cu insertion into Fet3 occurs (9). Based on homology with ceruloplasmin, the active Fet3 molecule would contain a trinuclear copper cluster. The role of Ccc2 in Fet3 biosynthesis may be analogous to the role of the Wilson gene product in biosynthesis of ceruloplasmin, a mammalian ferrooxidase (9). In the absence of a functional Wilson Cu transporter, Cu insertion into ceruloplasmin is impaired, resulting in low ferrooxidase activity. One classical characteristic of Wilson disease patients is low ceruloplasmin activity (40). The yeast CCC2 gene is iron regulated through the Aft1 transcriptional activator (41). The question arises whether the Wilson gene is iron regulated in human cells.
III. Copper Metalloregulation in Yeast

Copper metalloregulation exists in yeast, as copper ions are required for at least three key enzymes (Fig. 1). The ability of cells to grow on nonfermentable carbon sources is dependent on having an active cytochrome oxidase complex that requires Cu ions as cofactors (42). Oxidative growth requires defense molecules against reactive oxygen intermediates. Superoxide dismutase is a Cu metalloenzyme that dismutes superoxide anions (43). A third key Cu metalloenzyme is Fet3, which is a ferrooxidase critical for uptake of Fe(II) (39, 44). A myriad of other oxidases and oxygenases require Cu(II) as a functional cofactor, but the presence of these enzymes in the yeast Saccharomyces cerevisiae is unclear.

Copper ions regulate the biosynthesis of several proteins in S. cerevisiae. Genes encoding proteins involved in copper ion uptake are repressed in cells grown in medium containing 10 µM Cu(II) (7, 45). The high-affinity plasma membrane transporters Ctr1 and Ctr3, as well as a metalloreductase, Fre1, are Cu repressed (7, 45-47). Genes encoding these molecules are fully expressed only in conditions of low environmental Cu(II). Thus, expression of this Cu uptake system appears to be a cellular response to inadequate intracellular Cu levels. A number of other gene products may be important in Cu
FIG. 1. Scheme of copper-dependent activation of gene expression in yeast. The transcription factor Ace1 is the Cu(I) sensor in Saccharomyces cerevisiae and mediates Cu activation in expression of CUP1, CRS5, and SOD1. The activation of Ace1 involves formation of a tetracopper cluster in a Cu-regulatory domain.
uptake and therefore may be copper regulated. These include three homologs to the Fre1 metalloreductase and a candidate low-affinity Cu transporter designated Ctr2 (42, 47, 48).

The sensor that mediates Cu repression of CTR1 and FRE1 is Mac1. Cells harboring a dominant MAC1 mutation, MAC1up1, are unable to repress Cu-uptake genes (45, 49). These cells are hypersensitive to copper salts in the growth medium (49). Cells lacking a functional MAC1 product show reduced copper transport (45). Mac1 appears to function as a transcription factor necessary for basal expression of CTR1 and FRE1. The mechanism of Cu repression appears to consist of Cu inactivation of the Mac1 transactivation domain (50). Mechanistic details of the Cu inactivation are being pursued.

A different subset of genes is transcriptionally activated when the extracellular copper concentration exceeds 10 µM. These include CUP1, CRS5, and SOD1 (11-14). CUP1 and CRS5 encode cysteinyl-rich polypeptides in the metallothionein family (12, 13, 14). CUP1 is the dominant locus
that confers the ability of yeast cells to propagate in medium containing copper salts (51-53). Cells highly resistant to copper salts contain a CUP1 locus with tandem arrays of genes encoding the Cup1 metallothionein (51). The Cup1 metallothionein buffers the intracellular copper ion concentration by binding Cu(I) ions within a heptacopper-thiolate cluster (54, 55). The effectiveness of Cup1 metallothionein in copper ion buffering is dramatically enhanced by the cellular coupling of the biosynthesis of Cup1 to the cellular concentration of Cu(I) ions (11). The copper regulation of Cup1 biosynthesis occurs at the level of transcriptional activation (56, 57).

All genes have promoter elements that are recognized by factors that bind DNA specifically and provide a nucleation site for assembly of transcription components (58). Transcriptional activators of genes transcribed by RNA polymerase II contain at least two domains, one for specific DNA binding and another for activation (58, 59). Interaction of the activation domain with proteins that interact with the TATA element allows for RNA polymerase binding, creating the preinitiation complex (58, 59).

Cis-acting promoter elements in the 5' sequence of CUP1 were shown to be critical for Cu-mediated transcriptional activation (11, 60). The cis-acting element (UAScu: Cu-responsive upstream activation sequence) is specifically regulated by copper and silver ions (11). The UAScu in CUP1 consists of 16 base pairs (bp) and therefore spans a turn and a half of a B-form DNA helix (61, 62). The CUP1 MT gene contains four candidate UAScu elements, two of which exist within a palindrome (60). The palindromic UAScu can function independently as a Cu-responsive element (11, 63). The other two candidate UAScu sites lack any dyad symmetry and have not been demonstrated to be independently functional. Alignment of the four candidate UAScu sites from CUP1 reveals a series of conserved nucleotides (Fig. 2).
A core GCTG sequence is present in each element.

Saccharomyces cerevisiae contains a second metallothionein gene, CRS5, whose expression is copper regulated. CRS5 is present as a single-copy gene, unlike the tandem array of CUP1 metallothionein genes (13). CUP1 is the dominant metallothionein locus in yeast (53). Targeted disruption of CUP1 in cells with a wild-type CRS5 locus confers hypersensitivity to copper salts (13). An additional targeted disruption of CRS5 exacerbates the hypersensitivity (13). The dominance of CUP1 in copper buffering could not be rigorously inferred from these studies for two reasons. First, the CUP1 locus is normally amplified, leading to a major difference in gene copy number. Second, the CUP1 MT genes contain multiple Cu-responsive UAScu elements in the 5' sequences. CRS5 contains only a single such element (13) (Fig. 2). Mutations within the CRS5 UAScu element abolished the effectiveness of Crs5 in copper ion buffering (13). To compare the two metallothionein genes under comparable conditions,
[Figure 2]

CuZnAce1 Binding Sites (Saccharomyces cerevisiae)

  CUP1   GCGTCTTTTCCGCTGA
  CUP1   GTAGTCTTTTTGCTGG
  CUP1   TATATTCTTTTGCTGG
  CUP1   AAGACATTTTTGCTGT
  SOD1   GCGGCATTTGCGCTGT
  CRS5   GCGTCTTTATTGCTGT

CuZnAmt1 Binding Sites (Candida glabrata): AMT1, MTI, and MTII promoter elements

FIG. 2. DNA-binding sites for Ace1 and Amt1. The dashes above the sequences refer to nucleotides determined to be critical by footprinting analyses. The asterisks above the Ace1-binding sites designate nucleotides in the UAScu palindrome found to be essential by mutagenesis (11). The proximal site of the binding site is defined as the core, conserved GCTG sequence.
disruptions were carried out in a background containing only a single CUP1 MT gene (53). A crs5Δ strain was only slightly more copper sensitive than CRS5 cells (53). To evaluate the role of cis-acting elements in the dominance of CUP1, hybrid genes were constructed with each MT open reading frame (ORF) placed under either the CUP1 or CRS5 5' sequences (53). The dominance of the CUP1 promoter sequence was clearly established (53). The modest effects of CRS5 MT in copper buffering may imply that Crs5 has a major physiological function distinct from copper ion buffering.
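The UAScu features described for Fig. 2 can be checked with a few lines of Python. The sketch below takes the six Ace1-binding sites as transcribed from Fig. 2 (reading characters that appear as Q in the scan as G, which is an assumption), confirms that each element carries the core GCTG sequence, and estimates the number of helical turns a site spans. The figure of 10.5 bp per turn for B-form DNA and all identifiers (`ACE1_SITES`, `core_offset`, `helical_turns`) are illustrative choices and are not part of the original work.

```python
# Ace1-binding sites transcribed from Fig. 2 (hypothetical variable names).
# Assumption: "Q" in the scanned figure is an OCR misreading of "G".
ACE1_SITES = [
    ("CUP1", "GCGTCTTTTCCGCTGA"),
    ("CUP1", "GTAGTCTTTTTGCTGG"),
    ("CUP1", "TATATTCTTTTGCTGG"),
    ("CUP1", "AAGACATTTTTGCTGT"),
    ("SOD1", "GCGGCATTTGCGCTGT"),
    ("CRS5", "GCGTCTTTATTGCTGT"),
]

CORE = "GCTG"          # conserved core sequence named in the text
BP_PER_TURN = 10.5     # assumed helical repeat of B-form DNA


def core_offset(site: str) -> int:
    """Return the 0-based index of the core GCTG sequence, or -1 if absent."""
    return site.find(CORE)


def helical_turns(site: str, bp_per_turn: float = BP_PER_TURN) -> float:
    """Estimate how many helical turns a site of this length spans."""
    return len(site) / bp_per_turn


if __name__ == "__main__":
    for gene, site in ACE1_SITES:
        print(gene, site, len(site), core_offset(site),
              round(helical_turns(site), 2))
```

Under these assumptions every site is 16 bp long with GCTG beginning at position 12, and 16 bp corresponds to roughly 1.5 helical turns, in line with the description of UAScu in the text.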
CUP1 and CRS5 MT polypeptides are not homologous, but both contain multiple Cys-X-Cys sequence motifs and bind multiple Cu(I) ions (13). Cup1 binds 7 Cu(I) ions in a single Cu7S10 copper-thiolate cluster (54). Crs5 binds 12 Cu(I) ions, presumably within two separate polycopper clusters, as is the case with mammalian MTs (53).

The third known gene activated by copper ions is SOD1, encoding Cu,Zn-superoxide dismutase (12, 64). A single UAScu exists in the 5' promoter sequence of SOD1 (12) (Fig. 2). Cup1 and Sod1 exhibit cross-functions. CuCup1 exhibits limited superoxide dismutase activity, and CUP1 can functionally suppress the oxygen sensitivity of a sod1Δ strain (65). The mechanism of superoxide dismutation by CuCup1 involves a thiyl radical species (66, 67). Likewise, Sod1 contributes to copper buffering in S. cerevisiae (68). Thus, Cup1 and Sod1 each contribute to copper and oxygen radical homeostasis (68).
A. Copper-Responsive Trans-acting Factors

The trans-acting factor that mediates Cu(I) activation in S. cerevisiae was identified as Ace1 (also designated Cup2) (69, 70). The mechanism of Cu(I) activation through Ace1 was shown to be Cu-dependent Ace1 binding to the UAScu elements in CUP1 (11, 71). Ace1 formed a specific complex with UAScu DNA only in the presence of Cu(I) or Ag(I) ions (11). The DNA-binding domain of Ace1 was shown to map to the N-terminal 122 residues (11). The intact Ace1 molecule consists of 225 residues. Twelve cysteinyl residues are present in the N-terminal 122 residues of Ace1, 11 of which are critical for Cu-induced expression of CUP1 (72). Ten of the 11 critical cysteinyl residues are present in Cys-X(1-2)-Cys sequence motifs that are commonly found in metal-binding proteins such as metallothionein. These results led Furst et al. to postulate in 1988 that Cu(I) binding to Ace1 triggered a conformational change to a fold that was poised for DNA binding (11). According to the model, binding of Ace1 to UAScu elements upstream of CUP1 allows the transactivation domain of Ace1 to function in the assembly of the preinitiation transcription complex.

Ace1 binds to the UAScu elements in SOD1. Mutations at conserved nucleotides in UAScu of CUP1 or SOD1 preclude Ace1 binding (12, 64). In a systematic mutagenesis study of the critical nucleotides in the dominant half of the CUP1 UAScu palindrome, 12 nucleotides were critical (marked by asterisks in Fig. 2) (11). All nucleotides conserved between UAScu elements in CUP1, SOD1, and CRS5 were shown to be critical for copper inducibility in the mutagenesis study (11).

Footprinting analyses of Ace1 binding to UAScu revealed major groove base contacts at the two ends of UAScu and minor groove contacts in the middle A/T-rich region (Fig. 2) (61, 62). The prediction was made that Ace1
lies atop the minor groove, contacting the major groove on both sides (61). Ace1 was suggested to contain a bipartite DNA-binding domain, one module for contact of the proximal, core GCTG sequence and a second for contact of the distal GCG sequence (61, 62). Consistent with the bipartite theory is the observation that a mutant Ace1 with a Cys11Tyr substitution made contact with only the core GCTG sequence and the A/T minor groove region (61). There were no distal contacts. As expected, the in vitro binding affinity of the mutant protein for DNA was reduced nearly 10-fold (61).

Additional insights on the mechanism of copper metalloregulation in yeast come from studies on a related system in the yeast Candida glabrata. Candida glabrata is an opportunistic pathogen that usually occurs in association with the more virulent Candida albicans (73). Both yeasts can cause systemic infections in immunocompromised patients (74). Candida glabrata contains a family of MT genes, as is observed in S. cerevisiae (75). The MT genes of C. glabrata, designated MTI, MTIIa, and MTIIb (75, 76), are specifically copper responsive in their expression. The MTIIa locus contains a tandem array of MTIIa genes, analogous to the CUP1 locus in S. cerevisiae (77). Copper tolerance of C. glabrata is related to the MTIIa copy number (77). MTIIa and MTIIb have identical coding sequences but differ slightly in 5' and 3' sequences (76). MTII molecules encoded by MTIIa and MTIIb confer greater copper tolerance than MTI (76). MTI is functionally and structurally analogous to Crs5 in S. cerevisiae. The implications of these similarities remain unclear.

The trans-acting factor that mediates Cu-induced expression of MT genes in C. glabrata is Amt1 (78). AMT1 expression is itself copper responsive, and this transcriptional autoregulation is mediated by a single Cu-responsive promoter element in the AMT1 5' sequences (79). The promoter element resembles the S. cerevisiae UAScu
element in the presence of a core GCTG sequence preceded by an A/T-rich region (80). Cells harboring a mutant AMT1 in which the two guanine bases within the core GCTG sequence in the AMT1 promoter were substituted with adenines were compromised in copper tolerance (80). The clear implication of this copper sensitivity is that autoregulation is critical for normal function of Amt1 (79). Curiously, autoregulation is not observed in ACE1.

AMT1 can functionally suppress the copper sensitivity of ace1-1 S. cerevisiae cells by mediating copper-induced expression of CUP1 (81). In contrast, ACE1 cannot suppress the copper sensitivity of C. glabrata amt1Δ cells (82). The cross-species function of Amt1, but not Ace1, may arise from the observation that the Amt1 DNA-binding site consists of only a subset of the DNA contacts that exist in the Ace1 DNA-binding site (Fig. 2).

Ace1 and Amt1 are structurally homologous. The N-terminal half of Amt1 is 50% identical to the corresponding region of Ace1, with complete
conservation of sequence positions of the 11 critical cysteinyl residues if one gap is added in the sequence alignment in Ace1 (Fig. 3). Both proteins contain two CysXaaXaaCys sequence motifs, three CysXaaCys motifs, and an isolated Cys residue. Amt1 is a 265-residue polypeptide, unlike the 225 residues in Ace1. A portion of the length difference occurs in the N-terminal DNA-binding domain. That domain in Amt1 is 10 residues longer than the corresponding domain in Ace1. Homology between Ace1 and Amt1 ends after residue 100 in Ace1. The C-terminal region (residues 120-225) of Ace1 has been shown to contain the transactivation domain (83). Thus, Ace1, like many yeast transcription factors, is modular in nature, with a specific DNA-binding domain and a separate and independently acting transactivation domain (84).

The transactivation domains of Ace1 and Amt1 are acidic in nature. The pI values of the C-terminal regions of Ace1 (111-225) and Amt1 (111-265) are near 4 for both. In contrast, the pI values for the N-terminal DNA-binding domains of Ace1 and Amt1 are around 9.8 for both. Many yeast transcriptional activators contain acidic transactivation domains (84). The acidic nature of transactivation domains is only a descriptive characteristic, rather than a functional feature. The negative charge of an acidic transactivation domain does not
[Figure 3]

Residues 1-40
  Ace1:  MVVINGVKYA CETC IRGHRAAQ CTH TDGPLQMIRRKGRPS
  Amt1:  MVVINGVKYA CDSC IKSHKAAQ CEH NDRPLKILKPRGRPP
  Lpz8:  MVLINGIKYA CERC IRGHRVTT CNH TDQPLMMIKPKGRPS

Residues 41-80
  Ace1:  TT CGHC KELRRTKNFNPSGG CMC AS-AR-RPAVGSK-EDE
  Amt1:  TT CDHC KDMRKTKNVNPSGS CNC SKLEKIRQEKGITIEED
  Lpz8:  TT CDYC KQLRKNKNANPEGV CTC GRLEKKKLAQKAKEEAR

Residues 81 onward
  Ace1:  ---------R CRC DEGEP CKC HT-KRKSSRKSKGGSCH.
  Amt1:  MLMSGNMDM- CLC VRGEP CRC HA-RRKRTQKSNKKDNL.
  Lpz8:  AKAKEKQRKQ CTC GTDEV CKY HAQKRHL-RKSPSSSQK.

FIG. 3. Sequence alignment of amino-terminal domains of Ace1, Amt1, and Lpz8. Lpz8 was identified from the yeast genome sequencing project. The critical metal ligands are present in bold letters. The colons and dots shown below the alignment indicate sequence identities and conservative changes, respectively.
appear to be critical (84). In addition, acidic activators do not appear to exhibit a strict structural dependency in function (84).

Additional candidate Ace1-like copper sensors have been identified through genomic sequencing projects (Fig. 3). One of the candidate sensors exists in S. cerevisiae. This candidate open reading frame (ORF), designated Lpz8 (accession #U31900), is homologous to Ace1 in the N-terminal 120 residues but is distinctive in the C-terminal segment (Fig. 3). The ORF in Lpz8 consists of 694 residues, unlike the 225 residues in Ace1. Lpz8 contains 10 of the 11 critical cysteinyl residues in Ace1. The cysteine-rich N-terminal region of Lpz8 is highly basic, as in Ace1 and Amt1 (pI = 9.8). The last critical cysteine in Ace1 (Cys90) is a Tyr in Lpz8. Mutation of this cysteinyl codon (codon 90) in Ace1 abolishes in vivo function (72). We have shown that mRNA encoding this sequence is present in wild-type S. cerevisiae (L. J. Martins and D. R. Winge, unpublished observation). Cells containing a disrupted Lpz8 locus are viable and exhibit no copper hypersensitivity. The function of Lpz8 remains unresolved. Sequence homology suggests that Lpz8 is a copper sensor in S. cerevisiae, but the absence of an apparent copper phenotype at present precludes proof.

A related ORF has been reported from Yarrowia lipolytica. This candidate molecule (Crf1, accession #P45815) of 412 residues contains the 11 conserved cysteinyl residues found in Ace1 and Amt1 within a highly basic N-terminal segment. Crf1 deviates from Ace1 and Amt1 in the N-terminal 120 residues, primarily in the one region that distinguishes Ace1 and Amt1, namely, the peptide segment separating the first and second Cys-X-Cys motifs. The significance of the length variation is discussed later. No functional information on Crf1 has been reported to date.
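The cysteine content of the three aligned N-terminal domains can be tallied directly from Fig. 3. The following Python sketch strips the alignment gaps and counts cysteines, Cys-X-Cys motifs, and Cys-X-X-Cys motifs. The sequences are transcribed from the scanned figure and may contain residual transcription errors (for example, the Ace1 segment KELRRTKNFNPSGG is a reading of a garbled stretch), and the function and variable names are hypothetical.

```python
import re

# N-terminal sequences transcribed from Fig. 3; "-" marks an alignment gap.
# Assumption: transcription from a scanned figure, so individual residues
# may be misread.
ALIGNED = {
    "Ace1": ("MVVINGVKYACETCIRGHRAAQCTHTDGPLQMIRRKGRPS"
             "TTCGHCKELRRTKNFNPSGGCMCAS-AR-RPAVGSK-EDE"
             "---------RCRCDEGEPCKCHT-KRKSSRKSKGGSCH"),
    "Amt1": ("MVVINGVKYACDSCIKSHKAAQCEHNDRPLKILKPRGRPP"
             "TTCDHCKDMRKTKNVNPSGSCNCSKLEKIRQEKGITIEED"
             "MLMSGNMDM-CLCVRGEPCRCHA-RRKRTQKSNKKDNL"),
    "Lpz8": ("MVLINGIKYACERCIRGHRVTTCNHTDQPLMMIKPKGRPS"
             "TTCDYCKQLRKNKNANPEGVCTCGRLEKKKLAQKAKEEAR"
             "AKAKEKQRKQCTCGTDEVCKYHAQKRHL-RKSPSSSQK"),
}


def ungap(aligned: str) -> str:
    """Remove alignment gap characters."""
    return aligned.replace("-", "")


def count_motif(seq: str, spacer: int) -> int:
    """Count Cys-(X)n-Cys motifs with n = spacer non-Cys residues.

    A lookahead is used so that overlapping motifs are all counted.
    """
    pattern = "(?=C[^C]{%d}C)" % spacer
    return len(re.findall(pattern, seq))


def summarize(name: str) -> dict:
    """Tally cysteines and the two motif classes for one protein."""
    seq = ungap(ALIGNED[name])
    return {
        "cys": seq.count("C"),
        "CxC": count_motif(seq, 1),
        "CxxC": count_motif(seq, 2),
    }


if __name__ == "__main__":
    for name in ALIGNED:
        print(name, summarize(name))
```

With the sequences as transcribed, the tallies are 12, 11, and 10 cysteines for Ace1, Amt1, and Lpz8, respectively, and both Ace1 and Amt1 show two Cys-X-X-Cys and three Cys-X-Cys motifs, which agrees with the counts stated in the text.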
B. Presence of a Polycopper-Thiolate Cluster in Ace1 and Amt1

The DNA-binding activity of Ace1 was shown to be dependent on the presence of copper ions (11, 85). The addition of Cu(I) to an Ace1 peptide consisting of residues 1-122 resulted in a specific protein-DNA complex (11, 85). The only metal ions that facilitated protein-DNA complex formation were Cu and Ag ions (11, 85).

Expression of recombinant Ace1 in bacteria enabled the isolation of a CuAce1 complex (85, 86). The CuAce1 complex exhibited transitions in the ultraviolet range consistent with Cu-thiolate charge-transfer bands (85). This was the first indication that cysteinyl thiolates may be important in Cu ion binding. The complex was luminescent in aqueous solutions, indicative of Cu(I) binding in an environment shielded from solvent interactions (85). Irradiation of the complex with ultraviolet light resulted in emission yielding
an orange hue. The implication of the Cu(I) emission was that Cu(I) ions were bound within the interior of the Ace1 molecule.

Initial attempts to evaluate the stoichiometry of Cu binding focused on in vitro reconstitution, as native isolates exhibited a variable bound Cu content (85). Titration studies of apo-Ace1 with Cu(I) were carried out, monitoring emission as the indicator of Cu(I) binding (85). Titration studies revealed a biphasic rise in emission with an inflection point at 4 mol eq. and maximal emission at 6 mol eq. (85). Subsequently, improvements in the purification protocol of the recombinant Ace1 and Amt1 molecules (N-terminal 122 and 110 residues, respectively) resulted in the reproducible isolation of each molecule with 4 mol eq. Cu(I) bound (86, 87). In addition, each complex contained a single bound Zn(II) ion. Mass spectrometry of the Cu4ZnAmt1 complex revealed a single major complex exhibiting a mass consistent with the mass of the Cu4Zn1Amt1 complex (87). Copper reconstitution studies of apo-Amt1 revealed an all-or-nothing formation of a tetracopper species (87). Copper ion titration studies with apo-Amt1 revealed maximal emission at 4 mol eq. Cu(I), unlike the maximal emission at 6 mol eq. in Ace1 (87). It appears that Cu(I) can bind in the Zn(II) site in Ace1 and perhaps an additional site in Ace1, but not Amt1. The Cu(I) ions were tenaciously bound. Acidification to pH values below 1 was required to dissociate the bound Cu(I) ions, whereas pH 4 conditions dissociated the single Zn(II) ion (87).

X-ray absorption spectroscopy was carried out on the CuAce1 (prepared by in vitro reconstitution) and Cu4ZnAmt1 complexes (85, 86, 88). In this technique, an analysis is made of photoelectrons ejected upon absorption of x-ray radiation by the copper atom being analyzed. The technique is called extended x-ray absorption fine structure (EXAFS) and, for copper analyses, transitions of the copper 1s electron are monitored.
The photoelectron wave is backscattered by neighboring atoms. Analysis of the scattering pattern yields radial structural information. The arrangement of neighboring atoms is not as important as the distance of such atoms from the absorbing Cu atom. Copper K-edge EXAFS provides an accurate picture of neighboring atoms within a 4- to 5-Å distance from the absorbing copper atom. Since Ace1 and Amt1 each contain four Cu(I) ions, EXAFS analyses will provide only an average picture of the four ions. Copper K-edge EXAFS of CuAce1 and Cu4Zn1Amt1 revealed a dominant first shell scatter peak that was best fit by three sulfurs at 2.26 Å. A coordination number of three is consistent with known model Cu(I)-thiolate complexes (89). An outer shell scatterer was present in both samples, and this interaction was only fit with the inclusion of a heavy atom scatterer at 2.7 Å from the absorbing Cu atom (87, 90). The detection of the outer shell heavy
atom scatterer is suggestive of at least one additional Cu atom located 2.7 Å away from the absorbing Cu atom. The observation of the short Cu-Cu distance in Cu(I) complexes of Ace1 and Amt1 is consistent with a clustering of Cu(I) ions in each molecule. A number of small synthetic polycopper-thiolate clusters have been crystallographically characterized by synthetic inorganic chemists (89). One such cluster that has been studied in a number of laboratories is a tetracopper cluster with thiophenolate ligands (89). The four Cu(I) ions within the [Cu4(SPh)6]2- complex exhibit trigonal coordination by the sulfurs, and all sulfurs bridge two Cu(I) ions. The bridging thiolates maintain the integrity of the polycopper cluster. The mean Cu-S bond distance in the cluster is 2.287 Å (89). The four Cu atoms are separated by 2.74 Å (89). These distances are similar to those observed in the CuAce1 and Cu4Zn1Amt1 complexes. To prove that the outer shell heavy atom scatterer in CuAce1 and Cu4Zn1Amt1 represented a Cu-Cu backscattering, we carried out EXAFS analyses of the crystallographically defined [Cu4(SPh)6]2- complex (90). The transformed EXAFS of the synthetic cluster revealed the expected Cu-S distance of 2.28 Å and an outer shell interaction at 2.74 Å (90). The outer shell interaction was at the same distance as the mean Cu-Cu distance observed crystallographically. Thus, it appears that the Cu(I) complexes of Ace1 and Amt1 consist of a Cu(I)-thiolate polycopper cluster as in the defined synthetic clusters. EXAFS provides no clear information about the nuclearity of the Cu(I)-thiolate cluster in Cu4Zn1Amt1. The EXAFS data are consistent with a cluster, but do not establish whether all four Cu(I) ions exist within the same polycopper cluster. There are several lines of evidence suggesting that the cluster nuclearity is four. The all-or-nothing formation of a 4-Cu(I) bound species and the linear rise in Cu(I) emission in titrations of apo-Amt1 with Cu(I) peaking at 4 mol eq.
are consistent with a nuclearity of 4 in the polycopper cluster. The similar mean Cu-S bond distances and Cu-Cu distances in Cu4Zn1Amt1 and the synthetic [Cu4(SPh)6]2- cluster support the existence of a tetracopper center in Cu4Zn1Amt1 and Cu4Zn1Ace1. The spectroscopic similarities of the synthetic tetracopper cage structure to the Cu(I) clusters in Cu4Zn1Amt1 and Cu4Zn1Ace1 lead to the following predictions about the transcription factors. First, each Cu(I) ion is coordinated by three thiolates. Second, the polycopper cluster is held together largely by μ-bridging thiolates in which a given cysteinyl thiolate is coordinated to two Cu(I) ions (89). Third, Cu-Cu bonding is a minor energetic factor in the stability of the cluster (89). One obvious difference between the tetracopper clusters in Amt1 and Ace1 and the synthetic [Cu4(SPh)6]2- is in the symmetry of the clusters. The tetracopper clusters in the transcription factors deviate from symmetry more than the synthetic cage cluster.
The polycopper cluster in Ace1 and Amt1 resembles the polycopper cluster in metallothioneins in certain regards. The Cup1 metallothionein consisting of 53 residues enfolds a single heptacopper-thiolate cluster (54, 55). The Cu(I) cluster in Cup1 spectroscopically resembles the cluster in Cu4Zn1Ace1. The Cu(I) clusters in Cup1 and Ace1 exhibit S → Cu charge transfer bands in the ultraviolet, luminesce, and exhibit a prominent Cu-Cu scatter interaction by Cu K-edge EXAFS (90). The CuCup1 solution structure recently resolved by NMR consists of two large parallel polypeptide loops separated by a deep cleft containing the heptacopper-thiolate cluster (55). Ten of the 12 cysteinyl thiolates in Cup1 participate in the formation of the heptacopper cluster. Bridging thiolates contribute to the integrity of the cluster structure. Thus, the Ace1-Cup1 system in yeast is curious in that the regulatory factor (Ace1), like the product of the pathway (Cup1), contains a polycopper cluster. In this regard, the Cu(I) regulatory system resembles the Ca(II) regulatory system in that the sensors have metal binding motifs similar to those of the buffering components.
C. Dissection of Ace1 and Amt1 into Functional Domains

Mapping studies were carried out on Amt1 and Ace1 to determine which cysteinyl thiolates serve as ligands for Zn(II) and the tetracopper cluster (86). Site-directed mutagenesis of AMT1 revealed that the single Zn(II) ion is coordinated within the N-terminal 40 residues (86). Ligands for the bound Zn(II) ion include the thiolates of Cys11, Cys14, Cys23, and the imidazole of His25 (86, 91). The spacing of Zn(II) ligands is C-x2-C-x8-C-x-H. Synthetic peptides encompassing residues 1-42 of Amt1 and Ace1 form stably folded complexes in the presence of 1 mol eq. Zn(II) or Cd(II). Zn(II) has been detected in Ace1 and Amt1 only in studies with recombinant proteins expressed in bacteria. The question arises whether these transcription factors function in yeast with Zn(II) populating the N-terminal metal site. There are two lines of evidence to support that the N-terminal site is Zn(II) occupied in yeast. First, this domain exhibits a high affinity for divalent metal ions. The Cd(II) dissociation constant calculated from proton titration studies is 2 X M at pH 7 (91). Second, substitution of Cys11 in Ace1 with a carboxylate (Asp), which is a good Zn(II) ligand but poor Cu(I) ligand, preserved limited function of Ace1. Substitution of Cys11 with a nonliganding residue, Ser or Tyr, abolishes in vivo function. Thus it is highly likely that the N-terminal segment in Ace1 and Amt1 exists as a functional Zn module. The Zn(II) module in Ace1 and Amt1 is conserved in two other known yeast proteins (Mac1 and Lpz8) and two additional ORFs (86) (Fig.
3). Comparison of the six sequences reveals a consensus sequence of: KxACxxCIxxHxxxxCxHd. In addition to the conserved Zn(II) ligands, there are five additional sequence identities and multiple conserved changes among the six proteins. The sequence similarity of Mac1 from S. cerevisiae and Ace1 ends at residue 40, suggesting that the N-terminal 40 residues form an independent domain. The Zn(II) module is essential for in vivo function of both Ace1 and Amt1 (61, 91). One candidate function involves DNA binding in the A/T-rich region of UAScu. A short sequence [(R/K)GRP] exists in the Zn module that is homologous to a minor groove DNA binding motif found in various nuclear proteins from animals, plants, insects, yeast, and bacteria (92, 93). The conserved motif, designated the A-T hook motif, is important for preferential binding to A/T-rich DNA sequences (93). The consensus sequence for the A-T hook motif is RGRP (92, 93). The consensus sequence is often flanked by basic and prolyl residues (92). The A-T hook motif was originally described in the high-mobility group HMG-I/Y protein family (93-95). The motif in HMG-I/Y proteins is predicted to have an extended conformation similar to the structure of the antibiotic netropsin, which also binds to the minor groove of A/T sequences (95). Short peptides containing the RGRP sequence have the same DNA binding characteristics as the intact protein (94, 95). In the peptide-DNA complex, the arginine side chains are buried deep in the minor groove of the A/T sequence (94). The side chains of the two arginines establish specific contacts with the A/T sites (94). HMG-I/Y contains three repeats of the RGRP motif spaced by approximately 30 residues (93).
Two additional DNA minor groove binding proteins, Pax and Hin recombinase, have similar motifs, although the conserved sequence is only GRP and the sequence is not embedded in a basic-rich segment (96, 97). The motif in the Pax paired domain exists within a β turn that fits directly into the minor groove of DNA (97). Specific DNA base contacts are made by the glycine and arginine residues in the GRP β turn. In contrast, the RGRP sequence in Hin recombinase occurs in an extended polypeptide conformation and is responsible for minor groove DNA binding (96). The RGRP sequence (residues 36-39) in Amt1 was shown to be responsible for minor groove DNA binding within the A/T-rich region (80). An Arg38Lys codon mutation in Amt1 greatly attenuated its in vivo function (80). Likewise, a Gly37Glu codon mutation in Ace1 compromised the ability of Ace1 to mediate Cu-induced expression of CUP1 (61). The substitutions in each protein diminished in vitro DNA binding affinity four- to fivefold (61, 80). The RGRP motif in Ace1 and Amt1 resembles the A-T hook motif in HMG-I/Y, yet it is not clear whether the motif in Ace1 or Amt1 will exist in an extended conformation like HMG-I/Y or a reverse turn as in Pax.
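The sequence motifs discussed above lend themselves to a simple pattern search. The following is a minimal sketch, assuming that "x" in the ligand spacing C-x2-C-x8-C-x-H means any residue; the function name and the toy peptide are hypothetical illustrations (the peptide is not the Ace1 or Amt1 sequence, it only places the motifs at the residue positions quoted in the text).

```python
import re
from typing import List, Tuple

# Patterns transcribed from the text; "x" (any residue) becomes "." in a regex.
AT_HOOK = re.compile(r"[RK]GRP")            # (R/K)GRP minor groove motif
ZN_LIGANDS = re.compile(r"C.{2}C.{8}C.H")   # C-x2-C-x8-C-x-H ligand spacing


def find_motifs(seq: str, pattern: "re.Pattern") -> List[Tuple[int, str]]:
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]


# Hypothetical toy peptide, NOT a real Ace1/Amt1 sequence. It is assembled so
# that Cys falls at positions 11, 14, 23, His at 25, and RGRP at 36-39.
toy = (
    "MGSTAEKLVA"   # residues 1-10, filler
    "CDSC"         # Cys11 ... Cys14
    "IKSAKAAQ"     # eight spacer residues
    "CEH"          # Cys23, spacer, His25
    "NDEPLKILKP"   # residues 26-35, filler
    "RGRP"         # residues 36-39
    "TTA"          # residues 40-42, filler
)

if __name__ == "__main__":
    print(find_motifs(toy, AT_HOOK))
    print(find_motifs(toy, ZN_LIGANDS))
```

A scan of this kind only locates candidate motifs; whether a hit adopts an extended conformation or a reverse turn cannot be inferred from sequence alone.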
The RGRP motif in Ace1 and Amt1 is part of the conserved Zn module. Thus it is conceivable that one function of the Zn module is to stabilize a conformation that presents the RGRP sequence in proper juxtaposition to contact DNA. An additional function of the Zn(II) module in Ace1 is its role in major groove contacts with DNA bases (61, 62). As mentioned, Cu4Zn1Ace1 contacts bases separated by one and one-half turns of the DNA helix (61). The Zn(II) module of Ace1 appears responsible for the major groove contacts in the distal GCG region of UAScu (61, 62). The evidence for such major groove contacts is twofold. First, footprinting studies with the ace1-1 mutant molecule containing a Cys11Tyr substitution revealed a loss of contacts in the distal region (61). Second, these same contacts were also lost in footprinting studies using a modified Ace1 in which the three cysteinyl residues in the N-terminal 40 residues were alkylated (62). As mentioned, the DNA-binding site for Amt1 involves only the core GCTG sequences and the A/T-rich region (80). There are no Amt1 major groove contacts in the region corresponding to the Ace1 distal site. However, this does not imply that the Zn module in Amt1 is nonfunctional. Mutations within codons of the Zn module of Amt1 (Cys11Tyr, Cys14Ser, Cys23Ser) reduce in vivo function of Amt1 (91). The tetracopper domain of Amt1 is enfolded by residues 41-110 (50). Expression of a truncated Amt1 (residues 37-110) in bacteria resulted in the isolation of a Cu(I)-containing complex with 4 mol eq. Cu(I) bound (50). Spectral analyses of the truncated Cu4Amt1 complex suggested that the four Cu(I) ions were bound similarly as in the intact Cu4Zn1Amt1 complex. The extinction coefficient of the S → Cu charge transfer bands and relative quantum yield of luminescence of the truncated CuAmt1 complex were equivalent to those of the Cu4Zn1Amt1 complex.
Cu K-edge EXAFS analyses of the truncated CuAmt1 complex yielded a mean Cu-S bond distance and Cu-Cu separation similar to those of the intact Cu4Zn1Amt1 complex (50). The truncated CuAmt1 complex exhibits high-affinity and specific DNA binding. The binding affinity is reduced only by a factor of 10 (50). The truncated CuAmt1 complex may be expected to contact only the core GCTG sequence (proximal site in Fig. 2) and not the A/T-rich region. The reduced DNA binding affinity of the truncated CuAmt1 may be related to disruption of the minor groove contacts by the RGRP motif. The DNA binding affinity of a mutant Amt1 complex with an R → K substitution in the HMG-like RGRP sequence is also reduced by 5- to 10-fold (80). Thus, Amt1 and Ace1 appear to consist of three separate domains: residues 1-40, the Zn(II) domain; residues 41-110 (100 in Ace1), the tetracopper domain; and residues 110-C-terminal end, the transactivation domain. A model for the organization of Ace1 and Amt1 is shown in Fig. 4.
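The three-domain organization can be written down as a small lookup table, which makes it easy to see where individual residues mentioned in the text fall. This is only a sketch: the names are hypothetical, and because the text lists residue 110 in both the tetracopper and transactivation ranges, assigning it to the tetracopper domain here is an assumption made to keep the ranges disjoint.

```python
from typing import List, Optional, Tuple

# Amt1 domain boundaries as quoted in the text. Residue 110 appears in two
# ranges in the source; placing it in the tetracopper domain is an assumption.
AMT1_DOMAINS: List[Tuple[int, Optional[int], str]] = [
    (1, 40, "Zn(II) module"),
    (41, 110, "tetracopper Cu-regulatory domain"),
    (111, None, "transactivation domain"),  # None = extends to the C terminus
]


def domain_of(residue: int, domains=AMT1_DOMAINS) -> str:
    """Return the name of the domain containing a 1-based residue number."""
    if residue < 1:
        raise ValueError("residue numbers start at 1")
    for start, end, name in domains:
        if residue >= start and (end is None or residue <= end):
            return name
    raise ValueError(f"residue {residue} is not covered by any domain")


if __name__ == "__main__":
    # Zn ligands (Cys11, Cys14, Cys23, His25) and the RGRP motif (36-39)
    for res in (11, 14, 23, 25, 36, 39, 41, 110):
        print(res, domain_of(res))
```

On this map, the truncated Amt1 construct (residues 37-110) retains the whole tetracopper domain but only the last few residues of the Zn module, so the Zn ligands and most of the RGRP motif are absent.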
FIG. 4. Model of the Cu-activated Ace1 and Amt1 factors. Both factors appear to consist of three independent domains: an amino-terminal Zn module, the Cu-regulatory domain, and the carboxyl-terminal transactivation domain.
The activation of Amt1 and Ace1 to become DNA-binding proteins occurs by formation of the tetracopper cluster. Thus, the tetracopper domain is functionally a Cu-regulatory domain (CuRD). This Cu-regulatory domain is also found in the S. cerevisiae Lpz8p and Yarrowia lipolytica Crf1 sequences. From these four sequences a consensus sequence can be derived: C-X,-C(X)12,,-C-X-C-(X)l,-,,-C-X-C-X,-C-X-C. The CuRD contains eight cysteinyl residues. Only six thiolates are needed to form a tetracopper cluster based on the known Cu4S6 synthetic cage clusters. The tetracopper clusters in Ace1 and Amt1 may not contain all bridging sulfurs as is observed in the Cu4S6 cage clusters. One model is that the Cu-thiolate cage cluster in Amt1 and Ace1 is stabilized by four terminal and four bridging thiolates (Fig. 5). Bridging thiolates may be the predominant stabilizing force in the integrity of the tetracopper clusters in Ace1 and Amt1. In the candidate consensus sequence, there is only one place that exhibits length variability: the segment separating the two halves containing four cysteines each. The variable spacing of 10-27 residues between each half-segment suggests that the CuRD may resemble the structure of CuCup1. As mentioned, the CuCup1 fold consists of two lobes separated by a cleft containing the polycopper cluster. The CuRD in Amt1 and Ace1 may exist as a two-lobe structure with the tetracopper center located within the interior of a cleft. Each lobe may contribute four thiolates to the cluster. Cu(I) activation of Amt1 or Ace1 appears to consist of conversion of this 70-residue CuRD from an apo-conformer or inactive Zn(II) conformer to a
FIG. 5. Model of the tetracopper-thiolate cluster in Ace1 and Amt1. The Cu(I) regulatory domain contains eight critical cysteinyl thiolates. We predict that the polycopper cluster will contain four terminal thiolates and four bridging thiolates. Bridging thiolates are essential for cluster integrity (89).
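The thiolate bookkeeping behind this model can be checked with simple arithmetic. The sketch below assumes that every Cu(I) ion is three-coordinate (as the averaged EXAFS data suggest) and that each bridging thiolate binds exactly two Cu(I) ions while each terminal thiolate binds one; the function names are hypothetical. Under these assumptions, four trigonal Cu(I) ions require 4 × 3 = 12 Cu-S bonds, which are supplied either by six bridging thiolates (the synthetic cage) or by four terminal plus four bridging thiolates (the proposed eight-cysteine model).

```python
def cu_s_bonds(terminal: int, bridging: int) -> int:
    """Total Cu-S bonds: a terminal thiolate gives 1, a doubly bridging one gives 2."""
    return terminal + 2 * bridging


def satisfies_coordination(n_cu: int, terminal: int, bridging: int,
                           coordination: int = 3) -> bool:
    """True if the thiolates supply exactly n_cu * coordination Cu-S bonds."""
    return cu_s_bonds(terminal, bridging) == n_cu * coordination


if __name__ == "__main__":
    # Synthetic cage cluster: six thiolates, all bridging
    print(satisfies_coordination(4, terminal=0, bridging=6))
    # Proposed model: four terminal and four bridging thiolates
    print(satisfies_coordination(4, terminal=4, bridging=4))
    # Eight thiolates that all bridge would over-coordinate trigonal Cu(I)
    print(satisfies_coordination(4, terminal=0, bridging=8))
```

This count says nothing about geometry; it only shows that eight cysteines cannot all be bridging if each Cu(I) remains three-coordinate.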
structure containing the tetracopper cluster. Because Amt1 is isolated from cultures grown in the absence of added CuSO4 as a Zn protein, the inactive state of the CuRD is expected to be a Zn(II)-stabilized conformer (M. C. Posewitz and D. R. Winge, unpublished observation). If the basal state of the regulatory domain is a Zn(II) conformer, then Cu activation would occur through the following metal exchange reaction:

ZnMT + Cu4Zn1Ace1 ⇌ CuMT + ZnAce1

A model for such metal exchange is the facile metal exchange kinetics observed in metallothioneins (98). The significance of a tetracopper cluster as the structural unit within the activated transcription factors is threefold. First, a polycopper cluster formed by eight cysteinyl residues organizes and stabilizes a larger structural unit than a single bound metal ion. A single Cu(I) site is expected to be three or four coordinate, and therefore would anchor the polypeptide in only three or four places rather than the eight anchor sites in the candidate CuRD of Ace1 and Amt1. A second significant aspect of a tetracopper cluster in the CuRD is that a polycopper cluster provides metal ion specificity. Polymetal clusters are also known for Zn(II) and Cd(II) ions, but these clusters are structurally distinct
from the polycopper clusters (89). Polycopper-thiolate clusters coordinate Cu(I) ions in either digonal or trigonal geometry. Zn(II)-thiolate clusters are characterized by tetrahedral Zn(II) coordination (89). In both cases, bridging thiolates are key features of cluster stability. Mammalian metallothionein isoforms 1 and 2 consist of two polymetal-thiolate clusters that are distinct depending on whether Zn(II) or Cu(I) ions are bound (99). The distinct clusters translate into metal-dependent structures. The activation of Ace1 is not strictly Cu(I) specific, as Ag(I) ions are effective in conferring DNA binding activity in Ace1 (11, 85). Structurally similar [Cu4(SPh)6]2- and [Ag4(SPh)6]2- metal-thiolate cage clusters exist (100). Subtle structural differences observed between AgAce1 and CuAce1 (101) may relate to volume differences of the metal-thiolate cages for the two monovalent ions. The mean Cu-S bond distance for a trigonally bound Cu(I) ion is 2.27 Å, whereas the mean Ag-S bond distance is 2.50 Å (100). Thus volume constraints as well as cluster geometry may be important factors in dictating metal ion specificity in Ace1 and Amt1. Cluster volume was implicated as a critical factor in metal ion binding within clusters in metallothionein (102). The third important feature is the observed cooperativity in cluster formation. The tetracopper center in Amt1 was shown to form in an all-or-nothing manner (87). Cooperativity in Cu(I) binding was also reported for Ace1 in Cu(I) titration studies (85, 103). Results of the Cu(I) titration studies, monitored by the luminescence of the Cu(I)-thiolate center, were biphasic, with an inflection point at 4 mol eq. Cu(I). A Hill coefficient of 6 was calculated for the overall process (103). Thus the Hill coefficient for formation of the tetracopper center is not known. Addition of Cu(I) to apo-Ace1 followed by a DNA binding assay with UAScu revealed that DNA binding was a sigmoidal function of copper concentration (101).
A Hill coefficient of 4 was derived from the binding data. Cooperative formation of polymetal clusters in metallothioneins has also been observed (104). Cooperativity in cluster formation may be significant in that it permits a direct coupling of the intracellular exchangeable Cu ion concentration to transcriptional activation of CUP1 and to a lesser extent CRS5 and SOD1. Cells can respond to small increases in copper ion concentration to activate Ace1 and therefore enhance MT biosynthesis.
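The sharpness implied by a Hill coefficient of 4 can be illustrated with the standard Hill equation, fraction = [Cu]^n / (K^n + [Cu]^n). The sketch below is not the fitting procedure used in the cited studies; the half-saturation constant `k_half`, its arbitrary units, and the function names are illustrative assumptions.

```python
def hill(conc: float, k_half: float, n: float) -> float:
    """Fractional response predicted by the Hill equation."""
    x = conc ** n
    return x / (k_half ** n + x)


def fold_change_10_to_90(n: float) -> float:
    """Ratio of concentrations giving 90% versus 10% response.

    From fraction / (1 - fraction) = (conc / k_half) ** n, the ratio is
    (9 / (1 / 9)) ** (1 / n) = 81 ** (1 / n), independent of k_half.
    """
    return 81.0 ** (1.0 / n)


if __name__ == "__main__":
    k_half = 1.0  # arbitrary units, assumed for illustration only
    for n in (1, 4):
        responses = [round(hill(c, k_half, n), 3) for c in (0.5, 1.0, 2.0)]
        print(f"n={n}: responses={responses}, "
              f"10%-90% fold change={fold_change_10_to_90(n):.1f}")
```

With n = 4, moving from 10% to 90% of the maximal response takes only a threefold change in concentration, compared with 81-fold for a noncooperative (n = 1) process, which is the sense in which cells can respond to small increases in copper ion concentration.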
D. Mechanism of Cu-Mediated Transcriptional Activation

Transcription in eukaryotes requires that the relevant RNA polymerase associate with a promoter region and form a stable initiation complex (58). All genes have promoter elements that are recognized by factors that bind
DNA specifically and provide a nucleation site for complex formation. Trans-acting factors consist of at least two domains: one for DNA binding and one for assembly of the preinitiation complex (58, 59). For genes transcribed by RNA polymerase II, the transactivation domain of a trans-acting factor associates either directly or indirectly with a multiprotein complex associated with the TATA element typically localized 30 nucleotides upstream of the transcription start site (58, 59). These interactions recruit RNA polymerase, forming the preinitiation complex. Transcriptional activators are capable of stimulating transcription from naked DNA templates, but activators increase expression by overcoming repression caused by nucleosomes (58, 105). Activation of gene expression therefore requires clearing of nucleosomes and the establishment of transcription complexes at the promoter region. Nucleosome clearing appears to be a key component in the function of Ace1 (106). Nucleosome loss induced by depletion of histone H4 results in activation of CUP1 expression from a CUP1-lacZ fusion gene (106). The induced level is comparable to the Cu-induced level of expression. The activation of CUP1-lacZ induced by nucleosome loss occurred independently of the CUP1 UAScu (106). Thus, nucleosome clearing may be a key component in the activation of CUP1 by Cu4Zn1Ace1. The model for Ace1 activation of expression of CUP1, CRS5, and SOD1 appears to involve the following components:
1. Initially, Cu(I) binds to Ace1 within the nucleus to form the tetracopper cluster in the regulatory domain. The basal state of Ace1 is expected to be a ZnAce1 conformer. As such, Cu(I) binding involves Cu(I)-Zn(II) exchange reactions. The process is expected to be a cooperative metal exchange process based on metal exchange reactions in metallothionein. These processes most likely occur within the nucleus. An Ace1-β-galactosidase fusion localizes in noninduced cell nuclei (107). Ace1 is the Cu(I) sensor and transducer. This is unlike copper-mediated transcriptional regulation in prokaryotes, in which a cascade process exists involving distinct sensors and transducers (5, 25). Regulation of DNA binding in Ace1 by Cu(I) ions makes Ace1 activation distinct from regulation of many eukaryotic transcription factors. Among the known mechanisms for modulation of DNA binding are interaction with inhibitory proteins (109, 110), phosphorylation (108), and regulation by subcellular localization (111-113). The silent state of several transcription factors exhibits coordinate repression of DNA binding and transactivation activities (111). Coordinate regulation may prevent squelching of general transcription factors or inappropriate activation of transcription when a factor binds DNA nonspecifically (111). We are currently addressing whether basal
ZnAce1 exhibits a repressed transactivation domain in addition to reduced DNA affinity.
2. A second event involves binding of Cu4Zn1Ace1 to UAScu elements upstream of Cu-responsive genes. DNA binding by Ace1 may relieve repressive effects of chromatin structure and/or may result in interactions with components of the basal transcription complex to form the preinitiation complex. The DNA within chromatin is packaged within nucleosomes. The stability and positioning of assembled nucleosomes in chromatin determine whether nucleosomes repress or activate transcription (114, 115). The variation in the assembly of nucleosomes is related, in part, to histone acetylation (114). Hyperacetylation of histones correlates with transcriptional activation (114). The yeast protein Rpd3 is required for regulation of inducible genes responding to external signals (116). Rpd3 is homologous to a known histone deacetylase (117), implying that regulation of histone acetylation may be important in gene activation. The actual mechanism of Rpd3 as a deacetylase in transcriptional activation is not understood (114). The activation of CUP1 by nucleosome loss implies that clearing of nucleosomes is a major effect of DNA binding by Ace1 (106). Under conditions of nucleosome loss, the UAScu is nonessential for transcription activation of CUP1 (102). In normal chromatin, Ace1 binding to UAScu may be critical for nucleosome clearing of Cu-responsive genes. The HMG-I molecule containing an RGRP DNA binding motif is known to mediate displacement of histone H1 from chromatin (118). The RGRP motif in Ace1 and Amt1 may likewise contribute to nucleosome clearing. Alternatively, DNA binding by Ace1 may promote intermolecular interactions that facilitate association of TATA-binding proteins with the TATA DNA element in an early step of assembly of the preinitiation complex. Basal, unactivated Ace1 may be less effective in nucleosome clearing due to slow on-rates for DNA binding (101).
3.
In any regulated response, the mechanism of signal transduction includes not only signal propagation but also the return to the silent state. The activation of Ace1 by formation of a tetracopper cluster suggests that silencing may involve dissociation of Ace1-bound Cu(I) ions or, alternatively, degradation of activated Ace1. Cu induction of CUP1 expression is autoregulated by the metallothionein product (119, 120). The presence of the CUP1 metallothionein or a heterologous MT inhibits Cu activation of CUP1 (120, 121). Metallothioneins may repress CUP1 expression by lowering the available Cu(I) ion concentration as well as inactivating CuAce1 through a Cu-Zn exchange reaction with ZnMT. If Cu-mediated induction of Cup1 MT results in production of
MT polypeptides in molar excess of the concentration of exchangeable Cu(I), the excess apo-MT or ZnMT may provide a ligand for the following exchange, resulting in Cu(I) dissociation from Ace1 (Fig. 1). The autoregulation studies reported for CUP1 MT do not provide any clear evidence that an MT-dependent CuAce1 inactivation process occurs in vivo. An alternative mechanism of silencing of Cu signal transduction involves degradation of activated Ace1. Preliminary studies suggest that this mechanism of silencing is unlikely. Cells transformed with an epitope-tagged ACE1 failed to exhibit any Cu-dependent proteolytic degradation (L. T. Jensen and D. R. Winge, unpublished observation). Three genes are presently known to be Cu-activated in their expression through Ace1. The three genes encode molecules involved in cellular defenses against metal-mediated toxicity. Two encode metallothionein polypeptides that sequester Cu(I) ions, thereby reducing the free Cu(I) ion concentration capable of promoting radical formation. The third gene encodes superoxide dismutase, which functions in disproportionation of superoxide anions. It is conceivable that a family of other genes is Cu regulated through Ace1. Other candidate genes include those whose products function in lowering the cytoplasmic Cu ion concentration or in oxidative stress defense. Saccharomyces cerevisiae is an excellent organism to catalog the genes regulated by Cu ions since the sequence of the genome is known. Saccharomyces cerevisiae is the first eukaryote to be sequenced in its entirety. It is estimated that S. cerevisiae will contain nearly 6000 genes (37). Technology is available to detect genes differentially expressed by environmental changes. In one elegant approach, differential expression of yeast genes is being monitored by hybridization on a microarray prepared by robotic printing of all yeast cDNAs on glass (122; P. O. Brown, personal communication).
Differential expression studies have the potential to catalog all genes differentially expressed by changes in external copper concentrations.
IV. Metal Clusters in Regulation

A polycopper cluster appears to be an important feature in the Cu-dependent repression of genes in S. cerevisiae. As mentioned earlier, copper-dependent repression of CTR1 and FRE1 occurs at the level of transcription in an Ace1-independent manner (7, 40). The trans-acting factor responsible for Cu repression of CTR1 and FRE1 is Mac1 (45). Mac1 contains two cysteine-rich motifs consisting of CxCxxxxCxCxxCxxH sequence repeats that resemble Cu(I) binding cysteinyl sequence motifs found in Ace1 and metallothioneins (49).
The Cys-rich motifs in Mac1 are adjacent to candidate transactivation domains in Mac1. A segment of Mac1 consisting of residues 201-340 containing the two cysteine-rich motifs is quite acidic, exhibiting a pI of 3.9. The pI of the full-length Mac1 molecule is 6.9. The proximity of the Mac1 Cys-rich motifs to a candidate acidic TAD led us to postulate that the activity of Mac1 TAD(s) may be copper regulated. We demonstrated that copper regulation of Mac1 is achieved through regulation of the Mac1 TAD activity by using a Gal4-Mac1 fusion protein in which the DNA-binding domain of Gal4 (residues 1-147) was fused in frame to residues 42-417 of Mac1. The first 40 residues of Mac1 were deleted since they likely contribute to DNA binding. The first 40 residues of Mac1 are homologous to the Zn modules of Ace1 and Amt1 and contain the A-T hook motif (49). The Gal4 fragment does not contain a TAD; therefore, it fails to trans-activate expression of reporter genes cloned downstream of a promoter element containing Gal4 binding sites. Expression of the Gal4-Mac1 fusion protein in a cell containing pGAL4-lacZ resulted in expression of β-galactosidase, implying that residues 42-417 of Mac1 contain transcriptional activation activity (132). Expression of β-galactosidase is copper regulated (132). Under conditions of limited copper ion uptake into cells, β-galactosidase activity was markedly elevated, whereas cells from copper-supplemented medium exhibited low β-galactosidase levels. Cu(I) ions bind to Mac1 in Cu(I)-thiolate coordination similar to that observed in Ace1 and MT. Reversible Cu(I) binding may be important in regulating the activity of the TAD domain(s) in Mac1. Regulation of a transactivation domain is a novel form of cellular regulation. Our working model is that Mac1 Cu-specific inactivation occurs through formation of a polycopper cluster. The physiological significance of Cu-dependent repression of CTR1 and FRE1 expression may be explained in one of two ways.
In the first model, growth conditions leading to an intracellular Cu(I) ion concentration that exceeds demand would result in down-regulation of gene products involved in high-affinity copper uptake. Such Cu-dependent repression may be a mechanism to minimize potential deleterious effects of excess intracellular copper. However, the observation that CTR1 expression is only maximal under conditions of extremely low copper ion concentrations in the growth medium may be more consistent with a second model in which Mac1 is fully functional only under Cu-deficient growth conditions. According to this model, the normal state of Mac1 is the Cu-repressed form. Exposure of cells to inadequate copper growth conditions would result in derepression of Mac1 and subsequent enhanced expression of genes whose products are involved in copper ion uptake. Thus activation of Mac1 by copper limitation may be a copper starvation response. Metal clusters may be important in other metal ion regulatory pathways.
Iron repression of gene expression in S. cerevisiae is mediated by the factor Aft1 (8, 41). Aft1 mediates iron regulation of the Ccc2 copper transporter, the Fet3 ferrooxidase, the Ftr1 plasma membrane transporter, and two Fre metalloreductases (41). It was shown that Aft1 binds a conserved DNA sequence in the 5' promoter region of the mentioned genes and that DNA binding is abolished in Fe-treated cells (41). AFT1 was originally cloned as a partially dominant AFT1 mutation that failed to mediate Fe repression (8). The mutation resulted in a Cys → Phe substitution in a Cys-Xaa-Cys sequence motif adjacent to the candidate DNA-binding domain (8). A candidate mechanism of Fe repression is inhibition of DNA binding by an iron-bound Aft1 complex. Iron-selective repression may occur by iron binding within an iron-sulfur cluster. Iron homeostasis in animal cells is regulated in part by regulated synthesis of the iron-buffering protein ferritin and the cellular receptor for diferric transferrin (123). Iron regulation occurs posttranscriptionally and is mediated by two iron-responsive factors, designated IRP1 and IRP2 (124-126). The mechanism of iron regulation in IRP1 has been reported to involve reversible formation of an Fe-S cluster (127). Translation of mRNA for ferritins and the transferrin receptor is repressed by IRP1 in the absence of the Fe-S cluster. Formation of the Fe-S cluster inhibits the ability of IRP1 to bind to mRNA, and translation is unimpaired (127). Metal clusters are functional in other sensory mechanisms. Iron-sulfur clusters are important in cellular sensing of oxygen and metabolic byproducts (127-131). Changes in oxygen tension in cells result in activation of expression of one subset of genes and repression of a different subset.
The Fe-S centers in the prokaryotic SoxR transcription factor and ferrochelatase have been implicated as cellular sensors for superoxide anions and nitric oxide, respectively (130, 131). In prokaryotes, oxygen levels are sensed by the transcription factor FNR (128, 129). FNR activates transcription of a number of bacterial genes under anaerobic growth conditions. The active form of FNR contains an iron-sulfur (Fe-S) cluster (128). In the presence of oxygen, the Fe-S cluster is disrupted, resulting in an inactive apo-protein. The Fe-S cluster of FNR appears to be a direct oxygen sensor (129).
V. Summary and Perspective

Copper ion homeostasis in yeast is maintained through regulated expression of genes involved in copper ion uptake, Cu(I) sequestration, and defense against reactive oxygen intermediates. Positive and negative copper ion regulation is observed, and both effects are mediated by Cu(I)-sensing transcription factors. The mechanism of Cu(I) regulation is distinct for transcriptional activation versus transcriptional repression.
Cu(I) activation of gene expression in S. cerevisiae and C. glabrata occurs through Cu-regulated DNA binding. The activation process involves Cu(I) cluster formation within the regulatory domain in Ace1 and Amt1. Cu(I) binding stabilizes a specific conformation capable of high-affinity interaction with specific DNA promoter sequences. Cu(I)-activated transcription factors are modular proteins in which the DNA-binding domain is distinct from the domain that mediates transcriptional activation. The all-or-nothing formation of the polycopper cluster permits a graded response of the cell to environmental copper. Cu(I) triggering appears to involve a metal exchange reaction converting Ace1 from a Zn(II)-specific conformer to a clustered Cu(I) conformer. Besides Ace1 of S. cerevisiae and Amt1 of C. glabrata, the Cu(I) regulatory domain occurs in sequence homologs from Y. lipolytica and S. pombe. The presence of a conserved Cu(I) regulatory sequence in four yeast strains suggests that this Cu-responsive domain may occur in other eukaryotes. Ongoing sequencing efforts in plant and animal species may reveal candidate Cu(I) regulatory domains in other phyla. Although much current research supports the notion that yeast may be a Rosetta stone for the understanding of animal physiological processes, a significant question arises of how applicable yeast Cu regulation mechanisms will be to animal cells.

Several major questions concerning intracellular copper homeostasis persist that will stimulate future research. One question in yeast is how Cu(I) ions are routed between cellular compartments. How are Cu(I) ions presented to Ace1 for activation of gene expression or to Mac1 for repression of gene expression? The channeling and presentation of Cu(I) ions to export transporters, such as the Wilson and Menkes Cu transporters, are unresolved. In addition, the identification of metal ion sensors in animal cells is a significant goal.
ACKNOWLEDGMENTS

I am grateful to the people in my laboratory, past and present, who contributed to our efforts on these projects (alphabetical order): Charles Dameron, Rohan Farrell, Janet Graden, Laran Jensen, Laura Martins, Rajesh Mehra, Matt Posewitz, Andrew Sewell, John Simon, and Joanne Thorvaldsen. I appreciate excellent collaborations with Graham George in EXAFS, John Peltier in mass spectrometry, Val Culotta for Crs5 work, Ian Dance for synthetic Cu(I)-thiolate clusters, Ian Armitage for NMR on Cup1, and Mike Summers for NMR on Amt1.
REFERENCES

1. F. Jacob and J. Monod, J. Mol. Biol. 3, 318 (1961).
2. W. Gilbert and B. Muller-Hill, Proc. Natl. Acad. Sci. U.S.A. 56, 1891 (1966).
3. A. M. Friedman, T. O. Fischman, and T. A. Steitz, Science 268, 1721 (1995).
4. M. Lewis, G. Chang, N. C. Horton, M. A. Kercher, M. C. Pace, M. A. Schumacher, R. G. Brennan, and P. Lu, Science 271, 1247 (1996).
5. J. S. Parkinson and E. C. Kofoid, Annu. Rev. Genet. 26, 71 (1992).
6. F. A. Quiocho, Curr. Opin. Struct. Biol. 1, 922 (1991).
7. A. Dancis, D. Haile, D. S. Yuan, and R. D. Klausner, J. Biol. Chem. 269, 25660 (1994).
8. Y. Yamaguchi-Iwai, A. Dancis, and R. D. Klausner, EMBO J. 14, 1231 (1995).
9. R. Stearman, D. S. Yuan, Y. Yamaguchi-Iwai, R. D. Klausner, and A. Dancis, Science 271, 1552 (1996).
10. Zhao and D. Eide, Proc. Natl. Acad. Sci. U.S.A. 93, 2452 (1996).
11. P. Furst, R. Hu, R. Hackett, and D. Hamer, Cell 55, 705 (1988).
12. E. B. Gralla, D. J. Thiele, P. Silar, and J. S. Valentine, Proc. Natl. Acad. Sci. U.S.A. 88, 8558 (1991).
13. V. C. Culotta, W. H. Howard, and X. F. Liu, J. Biol. Chem. 269, 25295 (1994).
14. D. H. Hamer, Annu. Rev. Biochem. 55, 913 (1986).
15. I. H. Scheinberg and I. Sternlieb, in “Trace Elements in Human Health and Disease” (A. S. Prasad and D. Oberleas, eds.), p. 415. Academic Press, New York, 1976.
16. I. Fridovich, Science 201, 875 (1978).
17. S. B. Farr and T. Kogoma, Microbiol. Rev. 55, 561 (1991).
18. J. A. Imlay and S. Linn, Science 240, 1302 (1988).
19. R. D. Palmiter and S. D. Findley, EMBO J. 14, 639 (1995).
20. R. D. Palmiter, T. B. Cole, and S. D. Findley, EMBO J. 15, 1784 (1996).
21. M. Ikura, Trends Biol. Sci. 21, 14 (1996).
22. B. W. Schafer and C. W. Heizmann, Trends Biol. Sci. 21, 134 (1996).
23. C. A. McPhalen, N. C. J. Strynadka, and M. N. G. James, Adv. Prot. Chem. 42, 77 (1991).
24. S. D. Mills, C. K. Lim, and D. A. Cooksey, Mol. Gen. Genet. 244, 341 (1994).
25. N. L. Brown, S. R. Barrett, J. Camakaris, B. T. O. Lee, and D. A. Rouch, Mol. Microbiol. 17, 1153 (1995).
26. A. Odermatt and M. Solioz, J. Biol. Chem. 270, 4349 (1995).
27. A. Odermatt, R. Krapf, and M. Solioz, Biochem. Biophys. Res. Commun. 202, 44 (1994).
28. K. L. Hill and S. Merchant, EMBO J. 14, 857 (1995).
29. J. M. Quinn and S. Merchant, Plant Cell 7, 623 (1995).
30. R. D. Palmiter, Experientia 52, 63 (1987).
31. P. C. Bull and D. W. Cox, Trends Genet. 10, 246 (1994).
32. B. A. Masters, E. J. Kelly, C. J. Quaife, R. L. Brinster, and R. D. Palmiter, Proc. Natl. Acad. Sci. U.S.A. 91, 584 (1994).
33. E. J. Kelly and R. D. Palmiter, Nature Genet. 13, 219-222 (1996).
34. R. Heuchel, F. Radtke, O. Georgiev, G. Stark, M. Aguet, and W. Schaffner, EMBO J. 13, 2870 (1994).
35. R. D. Palmiter, Proc. Natl. Acad. Sci. U.S.A. 91, 1219 (1994).
36. D. M. Morrow, D. A. Tagle, Y. Shiloh, F. S. Collins, and P. Hieter, Cell 82, 831 (1995).
37. N. Williams, Science 272, 481 (1996).
38. J. Camakaris, M. J. Petris, L. Bailey, P. Shen, P. Lockhart, T. W. Glover, C. L. Barcroft, J. Patton, and J. F. B. Mercer, Hum. Mol. Genet. 4, 2117 (1995).
39. D. S. Yuan, R. Stearman, A. Dancis, T. Dunn, T. Beeler, and R. D. Klausner, Proc. Natl. Acad. Sci. U.S.A. 92, 2632 (1995).
40. M. L. Schilsky, Hepatology 20, 530 (1994).
41. Y. Yamaguchi-Iwai, R. Stearman, A. Dancis, and R. D. Klausner, EMBO J. 15, 3377 (1996).
42. T. Tsukihara, H. Aoyama, E. Yamashita, T. Tomizaki, H. Yamaguchi, K. Shinzawa-Itoh, R. Nakashima, R. Yaono, and S. Yoshikawa, Science 269, 1069 (1995).
43. I. Fridovich, J. Biol. Chem. 264, 7761 (1989).
44. C. Askwith, D. Eide, A. Van Ho, P. S. Bernard, L. Li, S. Davis-Kaplan, D. M. Sipe, and J. Kaplan, Cell 76, 403 (1994).
45. R. Hassett and D. J. Kosman, J. Biol. Chem. 270, 128 (1995).
46. A. Dancis, D. S. Yuan, D. Haile, C. Askwith, D. Eide, C. Moehle, J. Kaplan, and R. D. Klausner, Cell 76, 393 (1994).
47. E. Georgatsou and D. Alexandraki, Mol. Cell. Biol. 14, 3065 (1994).
48. K. Kampfenkel, S. Kushnir, E. Babiychuk, D. Inze, and M. V. Montagu, J. Biol. Chem. 270, 28479 (1995).
49. J. Jungmann, H. A. Reins, J. Lee, A. Romeo, R. Hassett, D. Kosman, and S. Jentsch, EMBO J. 12, 5051 (1993).
50. J. A. Graden, L. T. Jensen, A. K. Sewell, and D. R. Winge, Biochemistry 35, 14583 (1996).
51. S. Fogel and J. W. Welch, Proc. Natl. Acad. Sci. U.S.A. 79, 5342 (1982).
52. D. H. Hamer, D. J. Thiele, and J. E. Lemontt, Science 228, 685 (1985).
53. L. T. Jensen, W. R. Howard, J. Strain, D. R. Winge, and V. C. Culotta, J. Biol. Chem. 271, 18514 (1996).
54. S. S. Narula, R. K. Mehra, D. R. Winge, and I. M. Armitage, J. Am. Chem. Soc. 113, 9354 (1991).
55. C. W. Peterson, S. S. Narula, and I. M. Armitage, FEBS Lett. 379, 85 (1996).
56. T. R. Butt, E. J. Sternberg, J. A. Gorman, P. Clark, D. Hamer, M. Rosenberg, and S. T. Crooke, Proc. Natl. Acad. Sci. U.S.A. 81, 3332 (1984).
57. M. Karin, R. Najarian, A. Haslinger, P. Valenzuela, J. Welch, and S. Fogel, Proc. Natl. Acad. Sci. U.S.A. 81, 337 (1984).
58. L. Zawel and D. Reinberg, Annu. Rev. Biochem. 64, 533 (1995).
59. P. J. Mitchell and R. Tjian, Science 245, 371 (1989).
60. D. J. Thiele and D. H. Hamer, Mol. Cell. Biol. 6, 1158 (1986).
61. C. Buchman, P. Skroch, W. Dixon, T. D. Tullius, and M. Karin, Mol. Cell. Biol. 10, 4778 (1990).
62. A. Dobi, C. T. Dameron, S. Hu, D. Hamer, and D. R. Winge, J. Biol. Chem. 270, 10171 (1995).
63. V. C. Culotta, T. Hsu, S. Hu, P. Furst, and D. Hamer, Proc. Natl. Acad. Sci. U.S.A. 86, 8377 (1989).
64. M. T. Carri, F. Galiazzo, M. R. Ciriolo, and G. Rotilio, FEBS Lett. 278, 263 (1991).
65. K. T. Tamai, E. B. Gralla, L. M. Ellerby, J. S. Valentine, and D. J. Thiele, Proc. Natl. Acad. Sci. U.S.A. 90, 8013 (1991).
66. D. Deters, H.-J. Hartmann, and U. Weser, Biochim. Biophys. Acta 1208, 344 (1994).
67. C. Sievers, D. Deters, H.-J. Hartmann, and U. Weser, J. Inorg. Biochem. 62, 199 (1996).
68. V. C. Culotta, H.-D. Joh, S.-J. Lin, K. H. Slekar, and J. Strain, J. Biol. Chem. 270, 29991 (1995).
69. D. J. Thiele, Mol. Cell. Biol. 8, 2745 (1988).
70. C. Buchman, P. Skroch, J. Welch, S. Fogel, and M. Karin, Mol. Cell. Biol. 9, 4091 (1989).
71. C. Buchman, P. Skroch, W. Dixon, T. D. Tullius, and M. Karin, Mol. Cell. Biol. 10, 4778 (1990).
72. S. Hu, P. Furst, and D. Hamer, New Biol. 2, 544 (1990).
73. W. F. Hickey, L. H. Sommerville, and F. J. Schoen, Am. J. Clin. Pathol. 80, 724 (1983).
74. W. L. Whelan, CRC Crit. Rev. Microbiol. 14, 99 (1987).
75. R. K. Mehra, J. R. Garey, T. R. Butt, W. R. Gray, and D. R. Winge, J. Biol. Chem. 264, 19747 (1989).
76. R. K. Mehra, J. L. Thorvaldsen, I. G. Macreadie, and D. R. Winge, Gene 114, 75 (1992).
77. R. K. Mehra, J. R. Garey, and D. R. Winge, J. Biol. Chem. 265, 6369 (1990).
78. P. Zhou and D. J. Thiele, Proc. Natl. Acad. Sci. U.S.A. 88, 6112 (1991).
79. P. Zhou and D. J. Thiele, Genes Dev. 7, 1824 (1993).
80. K. A. Koch and D. J. Thiele, Mol. Cell. Biol. 16, 724 (1996).
81. J. L. Thorvaldsen, A. K. Sewell, C. L. McCowen, and D. R. Winge, J. Biol. Chem. 268, 12512 (1993).
82. J. L. Thorvaldsen, R. K. Mehra, W. Yu, A. K. Sewell, and D. R. Winge, Yeast 11, 1501 (1995).
83. T. Munder and P. Furst, Mol. Cell. Biol. 12, 2091 (1992).
84. S. Hahn, Cell 72, 481 (1993).
85. C. T. Dameron, D. R. Winge, G. N. George, M. Sansone, S. Hu, and D. Hamer, Proc. Natl. Acad. Sci. U.S.A. 88, 6127 (1991).
86. R. A. Farrell, J. L. Thorvaldsen, and D. R. Winge, Biochemistry 35, 1571 (1996).
87. J. L. Thorvaldsen, A. S. Sewell, A. M. Tanner, J. M. Peltier, I. J. Pickering, G. N. George, and D. R. Winge, Biochemistry 33, 9566 (1994).
88. K. H. Nakagawa, C. Inouye, B. Hedman, M. Karin, T. D. Tullius, and K. O. Hodgson, J. Am. Chem. Soc. 113, 3621 (1991).
89. I. G. Dance, Polyhedron 5, 1037 (1986).
90. I. J. Pickering, G. N. George, C. T. Dameron, B. Kurz, D. R. Winge, and I. G. Dance, J. Am. Chem. Soc. 115, 9498 (1993).
91. M. C. Posewitz, J. R. Simon, R. A. Farrell, and D. R. Winge, J. Bioinorg. Chem. 1, 560 (1996).
92. F. J. Nicolas, M. L. Cayuela, I. M. Martinez-Argudo, R. M. Ruiz-Vazquez, and F. J. Murillo, Proc. Natl. Acad. Sci. U.S.A. 93, 6881 (1996).
93. M. Bustin and R. Reeves, Prog. Nucl. Acid Res. Mol. Biol. 54, 35 (1996).
94. B. H. Geierstanger, B. F. Volkman, W. Kremer, and D. E. Wemmer, Biochemistry 33, 5347 (1994).
95. R. Reeves and M. S. Nissen, J. Biol. Chem. 265, 8573 (1990).
96. J.-A. Feng, R. C. Johnson, and R. E. Dickerson, Science 263, 348 (1994).
97. W. Xu, M. A. Rould, S. Jun, C. Desplan, and C. O. Pabo, Cell 80, 639 (1995).
98. J. D. Otvos, D. H. Petering, and C. F. Shaw, Comments Inorg. Chem. 9, 1 (1989).
99. K. B. Nielson, C. L. Atkin, and D. R. Winge, J. Biol. Chem. 260, 5342 (1985).
100. I. G. Dance, Aust. J. Chem. 31, 2195 (1978).
101. P. Furst and D. Hamer, Proc. Natl. Acad. Sci. U.S.A. 86, 5267 (1989).
102. M. Good, R. Hollenstein, and M. Vasak, Eur. J. Biochem. 197, 655 (1991).
103. J. R. Casas-Finet, S. Hu, D. Hamer, and R. L. Karpel, Biochemistry 31, 6617 (1992).
104. M. Good, R. Hollenstein, P. J. Sadler, and M. Vasak, Biochemistry 27, 7163 (1988).
105. M. Han and M. Grunstein, Cell 55, 1137 (1988).
106. L. K. Durrin, R. K. Mann, and M. Grunstein, Mol. Cell. Biol. 12, 1621 (1992).
107. M. S. Szczypka and D. J. Thiele, Mol. Cell. Biol. 9, 421 (1989).
108. M. Karin and T. Hunter, Curr. Biol. 5, 747 (1995).
109. D. Picard, V. Kumar, P. Chambon, and D. R. Yamamoto, Cell Regul. 1, 291 (1990).
110. L. Zhang and L. Guarente, EMBO J. 14, 313 (1995).
111. X.-Y. Li and M. R. Green, Genes Dev. 10, 517 (1996).
112. S. T. Whiteside and S. Goodbourn, J. Cell. Sci. 104, 949 (1993).
113. E. M. O'Neill, A. Kaffman, E. R. Jolly, and E. K. O'Shea, Science 271, 209 (1996).
114. A. P. Wolffe, Science 272, 371 (1996).
115. C. Schild, F.-X. Claret, W. Wahli, and A. P. Wolffe, EMBO J. 12, 423 (1993).
116. M. Vidal and R. F. Gaber, Mol. Cell. Biol. 11, 6317 (1991).
117. J. Taunton, C. A. Hassig, and S. L. Schreiber, Science 272, 408 (1996).
118. K. Zhao, E. Kas, E. Gonzalez, and U. Laemmli, EMBO J. 12, 3237 (1993).
119. D. H. Hamer, D. J. Thiele, and J. E. Lemontt, Science 228, 685 (1985).
120. C. F. Wright, D. H. Hamer, and K. McKenney, J. Biol. Chem. 263, 1570 (1988).
121. D. J. Thiele, M. J. Walling, and D. H. Hamer, Science 231, 854 (1986).
122. M. Schena, D. Shalon, R. W. Davis, and P. O. Brown, Science 270, 467 (1995).
123. J. L. Casey, D. M. Koeller, V. C. Ramin, R. D. Klausner, and J. B. Harford, EMBO J. 8, 3693 (1989).
124. B. Guo, J. D. Phillips, Y. Yu, and E. A. Leibold, J. Biol. Chem. 270, 21645 (1995).
125. F. Samaniego, J. Chin, K. Iwai, T. A. Rouault, and R. D. Klausner, J. Biol. Chem. 269, 30904 (1994).
126. B. R. Henderson and L. C. Kuhn, J. Biol. Chem. 270, 20509 (1995).
127. T. A. Rouault and R. D. Klausner, Trends Biol. Sci. 21, 174 (1996).
128. N. Khoroshilova, H. Beinert, and P. J. Kiley, Proc. Natl. Acad. Sci. U.S.A. 92, 2499 (1995).
129. B. A. Lazazzera, H. Beinert, N. Khoroshilova, M. C. Kennedy, and P. J. Kiley, J. Biol. Chem. 271, 2762 (1996).
130. E. Hidalgo and B. Demple, EMBO J. 13, 138 (1994).
131. V. M. Sellers, M. K. Johnson, and H. A. Dailey, Biochemistry 35, 2699 (1996).
132. J. A. Graden and D. R. Winge, Proc. Natl. Acad. Sci. U.S.A. 94, in press (1997).
Molecular Biology of Trehalose and the Trehalases in the Yeast Saccharomyces cerevisiae¹

SOLOMON NWAKA AND HELMUT HOLZER²
Institut für Biochemie und Molekularbiologie
Universität Freiburg
Freiburg D-79104, Germany

I. Metabolism of Trehalose in Yeast
   A. Turnover of Trehalose: Enzymes, Control of Synthesis and Hydrolysis
   B. Assay of Trehalose
II. Biological Functions of Trehalose in Yeast
   A. Trehalose in the Life Cycle of Yeast
   B. Trehalose Accumulation during Growth on Glucose and Under Stress
III. Characterization and Localization of the Yeast Trehalases
   A. Localization of the Acid and Neutral Trehalases
   B. Biochemical Characterization of the Acid Trehalase
   C. Biochemical Characterization of the Neutral Trehalase
IV. Molecular Analysis of the Yeast Trehalases
   A. Molecular Analysis of the ATH1, NTH1, and NTH2 (YBR0106) Genes
   B. Alignment of Trehalases from Various Organisms
   C. Regulation of the Expression of Trehalases by Heat and Other Stress Conditions
   D. Regulation of Trehalose Concentration and Expression of the Trehalases by Nutrients (Catabolite Repression)
V. Biological Functions of the Trehalase Genes
   A. Role in Trehalose Hydrolysis
   B. Role in Stress Response
   C. The Acid Trehalase and Trehalose Transport
   D. Role of Trehalose Hydrolysis in Spore Germination
VI. Trehalases and Heat Shock Proteins
   A. Stress Regulation by Heat Shock Element and Stress Responsive Element
VII. Outlook on the Biotechnological Importance of Trehalose and the Trehalases
References

¹ Abbreviations and gene nomenclature: aa, amino acid(s); kb, kilobase pair(s); bp, base pair(s); FBPase, fructose-1,6-bisphosphatase; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; MES, 4-morpholineethanesulfonic acid; HEPES, 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid; EDTA, ethylenediamine-tetraacetic acid; TI", tetrachloroisophthalonitrile; CDEs, centromeric elements; HSE, heat shock element; STRE, stress responsive element; HSF, heat shock factor; ATH1 gene, encodes the vacuolar acid trehalase; NTH1 gene, encodes the cytosolic neutral trehalase; NTH2 (YBR0106), homolog of the NTH1 gene, encodes a putative trehalase Nth2p; TPS1 (CIF1 or GGS1 or TSS1) gene, encodes the 56-kDa subunit of the trehalose-6-phosphate synthase-phosphatase complex that is the synthase; TPS2 gene, encodes the 102-kDa subunit of the trehalose-6-phosphate synthase-phosphatase complex that is the phosphatase; TPS3 (TSL1) gene, encodes the 123-kDa subunit of the trehalose-6-phosphate synthase-phosphatase complex whose function is unclear.
² To whom correspondence may be addressed: Telephone: +49-(0)761-2035250; Fax: +49-(0)761-2035253; e-mail: [email protected].

Progress in Nucleic Acid Research and Molecular Biology, Vol. 58
Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
The present state of knowledge of the role of trehalose and trehalose hydrolysis catalyzed by trehalase (EC 3.2.1.28) in the yeast Saccharomyces cerevisiae is reviewed. Trehalose is believed to function as a storage carbohydrate because its concentration is high during nutrient limitations and in resting cells. It is also believed to function as a stress metabolite because its concentration increases during certain adverse environmental conditions, such as heat and toxic chemicals. The exact way trehalose may perform the stress function is not understood, and conditions exist under which trehalose accumulation and tolerance to certain stress situations cannot be correlated. Three trehalases have been described in S. cerevisiae: 1) the cytosolic neutral trehalase encoded by the NTH1 gene, and regulated by a cAMP-dependent phosphorylation process, nutrients, and temperature; 2) the vacuolar acid trehalase encoded by the ATH1 gene, and regulated by nutrients; and 3) a putative trehalase Nth2p encoded by the NTH2 gene (homolog of the NTH1 gene) and regulated by nutrients and temperature. The neutral trehalase is responsible for intracellular hydrolysis of trehalose, in contrast to the acid trehalase, which is responsible for utilization of extracellular trehalose. The role of the putative trehalase Nth2p in trehalose metabolism is not known. The NTH1 and NTH2 genes are required for recovery of cells after heat shock at 50°C, consistent with their heat inducibility and sequence similarity. Other stressors, such as toxic chemicals, also induce the expression of these genes. We therefore propose that the NTH1 and NTH2 genes have stress-related function and the gene products may be called stress proteins. Whether the stress function of the trehalase genes is linked to trehalose is not clear, and possible mechanisms of stress protective function of the trehalases are discussed. © 1998 Academic Press
Trehalose (α-D-glucopyranosyl-(1-1)-α-D-glucopyranoside) is a nonreducing disaccharide of glucose that was discovered in the ergot of rye in 1832 by Wiggers (1). Subsequently, the French chemist Berthelot found this sugar in trehala (a desert manna from Asia Minor that is produced by the weevil Larinus nidificans) and called it trehalose (2). In yeast cells, the presence of trehalose was first shown by Koch and Koch (3) and Tanret (4). Trehalose is ubiquitously distributed in nature and can be found in a great variety of organisms, including bacteria, fungi, plants, insects, and some other invertebrates (5). The exact function of trehalose in these organisms is not clear; however, it is thought to be an important reserve carbohydrate in bacteria and fungi. In the yeast Saccharomyces cerevisiae, trehalose is also thought to function as a stress metabolite because its concentration in the cell increases during certain environmental or physiological stresses such as heat (reviewed in 6, 7).

An enzyme hydrolyzing trehalose was first found in Aspergillus niger by Bourquelot (8) and then in Saccharomyces cerevisiae by Fischer (9). Since then, trehalase (α,α-trehalose-1-D-glucohydrolase, EC 3.2.1.28) has been detected in many other organisms of the plant and animal kingdom (5). In mammals, the exact function of this enzyme is not clear. It is thought, however, that this enzyme, which is present in the intestine of certain mammals, is responsible for hydrolysis of ingested trehalose, because patients deficient in intestinal trehalase have been reported to show diarrhea upon ingestion of trehalose-containing mushrooms (10-12). In the yeast S. cerevisiae, three trehalases have been described, but the exact role of these enzymes in trehalose hydrolysis in intact cells, as well as their regulation, is not well understood. Trehalose hydrolysis is believed to be an important biochemical process during the various life functions of yeast, for example, fungal spore germination and resumption of growth in resting cells (reviewed in 13).

In the past years, work in our laboratory has focused on the role of trehalose and its metabolism in the life cycle of yeast. This review therefore gives an overview of the state of knowledge in this field but discusses in more detail the characterization, regulation, and molecular analysis of the trehalose-hydrolyzing enzymes and the corresponding genes.
The potential biotechnological applications of trehalose are highlighted. For further reading, readers are advised to consult appropriate references mentioned herein.
I. Metabolism of Trehalose in Yeast

A. Turnover of Trehalose: Enzymes, Control of Synthesis and Hydrolysis

Trehalose biosynthesis is catalyzed by the sequential action of trehalose-6-phosphate synthase and trehalose-6-phosphate phosphatase activities called the trehalose synthase complex (14, 15) using UDP-glucose and glucose 6-phosphate as substrates (Fig. 1). The protein complex purified in an intact form contained polypeptides of 123, 102, and 56 kDa (14, 15). The genes encoding these proteins have all been cloned and sequenced (16-20). A deletion of the TPS1 gene (also called CIF1, GGS1, and TSS1) encoding the 56-kDa protein (trehalose-6-phosphate synthase) leads to inability of cells to synthesize trehalose and prevents growth on glucose. The inability of the TPS1 mutant to grow on glucose suggested a role for this gene as a sensor for glucose influx into the cell: GGS1 stands for “general glucose sensor” (19). A disruption of the TPS2 gene encoding the 102-kDa polypeptide eliminates the trehalose-6-phosphatase activity (18), while the TPS3 (TSL1) gene encoding the 123-kDa polypeptide seems to regulate the trehalose-6-phosphate synthase activity of the trehalose synthase (20).

FIG. 1. Trehalose metabolism in S. cerevisiae. Trehalose biosynthesis is catalyzed by trehalose-6-phosphate synthase (Tps1p encoded by the TPS1 gene) and trehalose-6-phosphate phosphatase (Tps2p encoded by the TPS2 gene). The role of Tps3p encoded by the TPS3 gene is not clear (indicated with ?). The hydrolysis of trehalose is catalyzed by the trehalases (Nth1p for neutral trehalase encoded by the NTH1 gene, Nth2p for putative protein of the NTH2 gene, and Ath1p for acid trehalase encoded by the ATH1 gene). The various pH optima for activity are shown. The role of Nth2p is not clear (indicated with ?).

The hydrolysis of trehalose is catalyzed by trehalase (EC 3.2.1.28) (see Fig. 1). In yeast, trehalose-hydrolyzing enzyme activity was first described in 1895 by Emil Fischer (9). An inactive (zymogen) form of trehalase, which is activated by cyclic AMP-dependent phosphorylation, was reported (21). In 1982, Wiemken and co-workers demonstrated that the phosphorylatable trehalase was localized in the cytosol, whereas a second, permanently active, trehalase was found in the vacuoles (22). Londesborough and Varimo (23) separated these two activities and determined pH optima for the two enzymes. The phosphorylatable enzyme localized in the cytosol had its maximal activity
at pH 7 and was therefore designated the “neutral trehalase,” while the vacuolar trehalase, which has its maximal activity at pH 4.5, was designated the “acid trehalase.” It was also demonstrated that cAMP, ATP, and Mg²⁺ activate neutral trehalase (23, 24). The neutral trehalase was isolated and partially characterized (24) and the corresponding gene, called NTH1, was cloned and sequenced (25, 26). The localization of the phosphorylated trehalase (neutral trehalase) in the cytosol complements the fact that trehalose is a cytoplasmic compound, a finding that led to early speculation of a role for trehalose as a protective agent (6, 22). The acid trehalase was isolated and partially characterized (27). In an attempt to clone the acid trehalase-encoding gene, a gene called YGP1, whose product gp37 is highly glycosylated, was identified (28). However, gp37 does not represent a trehalose-hydrolyzing activity (28). Subsequently, the ATH1 gene was isolated; its deletion leads to loss of acid trehalase activity (29). A gene called NTH2 (formerly called YBR0106) was described (30). Because of the high identity of the NTH1 and NTH2 gene products, as well as the high homology of the NTH2 gene product to other trehalase sequences from a variety of prokaryotes and eukaryotes, the NTH2 gene was designated a trehalase gene. However, a role for NTH2 in trehalose metabolism (i.e., trehalose hydrolysis) has not yet been demonstrated.

Trehalose hydrolysis is an important feature of many developmental processes in fungi (13). In S. cerevisiae, it seems to be an important phenomenon during spore germination, and for resumption of growth on ethanol and other nonfermentable carbon sources (13, 31). The biological function of trehalase consists of the control of trehalose concentration via degradation of trehalose into glucose units. Using mutants of the neutral trehalase-encoding gene, NTH1, we have demonstrated that the neutral trehalase is the major enzyme responsible for trehalose hydrolysis in vivo (25, 32, 33). The acid trehalase has so far no known trehalose-hydrolyzing activity in vivo, but it recognizes trehalose as a substrate in vitro. Based on the specificity of the acid trehalase for trehalose, a sensitive assay for trehalose using purified acid trehalase from S. cerevisiae was developed (34). In a search for the role of acid trehalase in trehalose metabolism, we found that an acid trehalase-deficient mutant does not grow on trehalose as a carbon source, in contrast to wild-type and a Δnth1 mutant. This suggests a role of the acid trehalase in trehalose utilization in a manner that is different from the neutral trehalase (35). Furthermore, a growth defect of the acid trehalase mutant on glycerol, similar to the neutral trehalase mutant, presents possible evidence for the involvement of the two enzymes in trehalose hydrolysis in vivo (35).
B. Assay of Trehalose

Different methods for assaying trehalose in intact yeast cells have been described. These methods include the paper chromatographic method (36); the anthrone method (37), which, however, is not specific for trehalose; the color reaction and thin-layer chromatography method (38), which is specific but time consuming; and the enzymatic method (34, 39). In contrast to the enzymatic “end point” method (34), which is based on quantitative reaction of the substrate to be assayed, the kinetic method (39) uses the trehalose concentration-dependent variation in the rate of the trehalase reaction as an indicator for the concentration of trehalose. Such a “kinetic” method is sensitive to disturbances in the samples to be assayed and depends strongly on the purity of the enzyme used for the assay.

The enzymatic endpoint assay method uses acid trehalase purified from the yeast suc2 mutant (27, 34). After quantitative hydrolysis of trehalose by acid trehalase, the resulting glucose is assayed with the commercially available glucose oxidase-peroxidase dye system (40). When intact cells are analyzed for trehalose, preexisting glucose can be washed out with ice-cold water without reducing the trehalose content. A convenient method for extraction of trehalose from intact cells is heating at 95°C for 20 min followed by centrifugation. The specificity of this assay is high because acid trehalase prepared from a suc2 mutant, which is deficient in external invertase, was found to hydrolyze no other disaccharide than trehalose (27). The sensitivity of the assay is high because the commercially available glucose oxidase-peroxidase assay allows quantitative determination of as little as 1 µg (i.e., 5 nmol) of glucose. A similar endpoint method was described using a crude extract preparation of extracellular conidial trehalase from Humicola grisea (41). The disadvantage of this method (41) as compared to that using suc2 acid trehalase (34) is the lack of specificity of the crude enzyme preparation used.
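The arithmetic behind the endpoint assay can be sketched in a few lines. This is only an illustration under stated assumptions, not the published protocol: the function names, the blank-subtraction step, the cell-water volume, and the sample numbers are all hypothetical. The sketch relies on two facts: each trehalose molecule yields two glucose molecules on hydrolysis, and the molar mass of glucose is about 180.16 g/mol, so 1 µg of glucose is about 5.5 nmol.

```python
# Minimal sketch of endpoint-assay arithmetic (hypothetical helper names).
GLUCOSE_MOLAR_MASS = 180.16  # g/mol for D-glucose


def glucose_ug_to_nmol(glucose_ug: float) -> float:
    """Convert a glucose mass in micrograms to nanomoles."""
    # ug / (g/mol) gives umol; multiply by 1000 to obtain nmol.
    return glucose_ug / GLUCOSE_MOLAR_MASS * 1000.0


def trehalose_nmol(glucose_nmol: float, blank_nmol: float = 0.0) -> float:
    """Trehalose hydrolysed, from glucose measured after trehalase treatment.

    blank_nmol is glucose found in an untreated control (an assumed
    correction step). One trehalose gives two glucose molecules.
    """
    released = glucose_nmol - blank_nmol
    if released < 0:
        raise ValueError("blank exceeds the measured glucose")
    return released / 2.0


def trehalose_mM(amount_nmol: float, cell_water_ul: float) -> float:
    """Intracellular concentration; nmol per microliter equals mmol/L."""
    if cell_water_ul <= 0:
        raise ValueError("cell water volume must be positive")
    return amount_nmol / cell_water_ul


if __name__ == "__main__":
    # Hypothetical sample: 40 nmol glucose after hydrolysis, 4 nmol blank,
    # extracted from cells containing 1.5 uL of cell water.
    amount = trehalose_nmol(40.0, blank_nmol=4.0)
    print(amount, trehalose_mM(amount, cell_water_ul=1.5))
```

Because the endpoint method converts all trehalose to glucose before measurement, the result depends only on this stoichiometry and not on the reaction rate, which is why it is less sensitive to sample disturbances than the kinetic method.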
II. Biological Functions of Trehalose in Yeast

A. Trehalose in the Life Cycle of Yeast

Saccharomyces cerevisiae can synthesize and degrade trehalose and, depending on the environmental conditions and the stage of the life cycle, trehalose can represent less than 1%, or more than 23%, of the dry weight of cells (37, 42, 43). These variations in trehalose content, and the large amounts that can be accumulated, suggest that it plays an important role during the yeast life cycle. Studies correlating trehalose levels with the physiological and developmental activities of the cells have suggested that this disaccharide functions as an important carbon and energy reserve in starving cells (44, 45), in cells undergoing respiratory adaptation (46), in germinating spores (47), in vegetative cells during emergence from stationary phase (48), and in cells traversing the mitotic cell cycle under conditions of carbon and energy limitation (43). The observation that yeast cells accumulate trehalose when deprived of glucose, nitrogen, sulfur, or phosphorus suggests that reserve carbohydrate accumulation is a general response to various types of nutrient limitation (37). Glycogen seems to play a similar role. However, the fact that glycogen and trehalose display nonidentical patterns of accumulation and utilization raises the possibility that they may play distinct roles in the cellular economy.
B. Trehalose Accumulation during Growth on Glucose and Under Stress

As shown in Fig. 2, yeast cells growing exponentially on glucose or other fermentable carbon sources (rich carbon sources), such as fructose and galactose, have very low trehalose levels (about 0.1-0.5 mM). As these cells exhaust their carbon source and enter the respiratory phase of growth (diauxie), trehalose accumulation increases until stationary phase (when nutrients become limiting) to about 25 mM (34). The extreme increase of trehalose begins after consumption of glucose. It was suggested that repression of the trehalose-6-phosphate synthase (trehalose-synthesizing enzyme complex) by glucose and derepression of the synthesis of this enzyme after consumption of glucose might be responsible for the drastic change in concentration of trehalose from exponentially growing cells to stationary cells (49). In contrast to fermentable carbon sources, cells growing on nonfermentable carbon sources (poor carbon sources), such as glycerol, ethanol, and acetate, show high levels of trehalose during both exponential and stationary growth (50, 51). Nutrient stress therefore supports the accumulation of trehalose in the cell.

FIG. 2. Trehalose content of yeast cells during growth on glucose and during heat stress (39°C). Growth measured as absorbance at 600 nm (triangles). Trehalose concentration (mM) during growth (squares). Inset shows reversible trehalose accumulation at heat stress (39°C/40 min) and return to normal growth temperature (30°C/40 min) in a wild-type strain. (Adapted from I. Kienle, M. Burgert, and H. Holzer, Yeast 9, 607 (1993). Copyright 1993 by John Wiley & Sons Ltd.)

In recent years, much attention has been drawn to a possible function of trehalose as a stress protectant, mainly based on the remarkable stress-protective properties in vitro and on the strong correlation between trehalose content and stress resistance in vivo (6, 7). Transfer of exponentially growing cells on glucose from the normal growth temperature of 30°C to heat stress temperature of 39°C for 40 or 60 min causes a rapid increase in the concentration of trehalose from 0.1 to >12 mM (see the inset in Fig. 2) (34). This effect of heat on trehalose concentration has also been reported by other authors under slightly different heat stress temperatures (37-45°C) for different times (32, 38, 52-54). It is shown in Fig. 2 (inset) that the temperature-dependent increase in the concentration of trehalose is reversible: shifting the 39°C-treated cells back to 30°C for 40 or 60 min causes a drop in the concentration of trehalose back to low trehalose concentration.
The accumulation of trehalose during heat stress has been shown to result from an increase in both the activity of the trehalose-6-phosphate synthase (55) and the concentration of the substrates UDP-glucose and glucose 6-phosphate (54). In addition to heat stress, trehalose accumulates in cells exposed to dehydration, pressure, hazardous chemicals, and the like (38, 52, 53, 56). The parallelism between trehalose accumulation and tolerance to various stress conditions suggested that the major function of trehalose is not storage but protection against adverse environmental conditions (6, 7). Doubts as to the universal validity of this relationship arose when Winkler et al. (54) demonstrated that a mutant deficient in the synthesis of the heat shock protein 104 (57) does not exhibit thermotolerance after heat stress, even though the accumulation of trehalose was as high as in the corresponding wild-type strain. Since then several authors, using mutants that are altered in trehalose metabolism and in the synthesis of certain heat shock proteins, have shown a lack of correlation between trehalose accumulation and the ability of cells to acquire thermotolerance (32, 54, 56, 58), suggesting that under some conditions trehalose does not mediate thermotolerance. In fact, it was shown that trehalose is important for thermoprotection only on nonfermentable carbon sources and not on fermentable carbon sources (50). These observations with various strains are summarized in Table I.
Although a lack of correlation between trehalose and the acquisition or induction of thermotolerance has been shown, one cannot exclude that trehalose accumulation during heat stress may play a role in thermotolerance under certain conditions. The biological usefulness of trehalose accumulation during heat stress at 37-45°C may consist of aiding the cells in surviving such a preconditioning during the time heat shock proteins are synthesized (54). This so-called fire brigade function of trehalose has been supported by other workers (32, 58), who also suggested that the final "long-term" protective effect against exposure to critical (damaging) temperature (50°C) after preconditioning might be mediated by heat shock proteins and other stress proteins or factors. However, it seems that the mechanism of heat protection is different from the protection against oxidative stress and chemicals. While heat exposure results in immediate accumulation of trehalose in parallel to induction of heat shock proteins, exposure to some toxic chemicals does not immediately result in trehalose accumulation, in contrast to the induction of some stress proteins (S. Nwaka et al., submitted). A similar pattern of stress protein induction without trehalose accumulation was seen when yeast cells were exposed to the potential mutagen TPN (tetrachloroisophthalonitrile) (56). These authors showed that the smallest concentration of TPN that leads to induction of heat shock protein and acquisition of tolerance to 51°C exposure did not result in trehalose accumulation.
However, the concentration of TPN that leads to trehalose accumulation was too toxic for the cells; they could not survive 51°C exposure. This clearly shows that the stress response is a complicated phenomenon, dependent on the kind of stressor and the conditions, and that no single pathway has the ultimate function of protecting the cell from stress.
After the first report on the role of trehalose as a membrane stabilizer in anhydrobiotic organisms (60), many papers documenting the impressive and specific protective effect of trehalose against stress treatment of biological structures in vitro (e.g., 61-64) have been published. Restriction and modifying enzymes have been dried and stored on trehalose, even after exposure to very high temperatures, without loss of activity after rehydration. Furthermore, trehalose can be used to preserve and dry foodstuffs, as well as antibodies in bedside blood typing (61, 65, 66). The protective effect of trehalose is not well understood at the molecular level. Two hypotheses (the water replacement hypothesis and the glass transition hypothesis) have been proposed, but there is no straightforward explanation, particularly for the superiority of trehalose in conferring stress protection compared to molecules with similar structures (61, 62, 64, 67-70). Although the function of trehalose as a membrane stabilizer in anhydrobiotic organisms (60) has recently been challenged (71), care should be taken in correlating in vitro functions of trehalose as a stabilizer to its suggested in vivo function as a thermoagent in yeast.

TABLE I
LACK OF CORRELATION BETWEEN TREHALOSE ACCUMULATION AND THERMOTOLERANCE IN S. CEREVISIAE

| Strains | Growth stage and carbon source (25-30°C) | Treatment before heat shock and trehalose levels (a) | Thermotolerance (50-54°C) | Reference(s) |
|---|---|---|---|---|
| W303 (WT) | Exponential cells on YPglucose | 40°C; trehalose high | high | 54 |
| Δhsp104 mutant | Exponential cells on YPglucose | 40°C; trehalose high | low | 54, 57 |
| DF5a (WT) | Exponential cells on YPglucose | Trehalose low | low | 32 |
| Δubc4 ubc5 mutant | Exponential cells on YPglucose | Trehalose low | high | 32, 59 |
| YS18 (WT) | Exponential cells on YPglucose | 40/30°C; trehalose low | low | 32 |
| Δnth1 mutant | Exponential cells on YPglucose | 40/30°C; trehalose high | low | 32 |
| YS18 (WT) | Stationary cells on YPglucose | Trehalose high | high | 33 |
| Δnth1 mutant | Stationary cells on YPglucose | Trehalose higher | low | 33 |
| YSH6.36.3B (WT) | Exponential cells on YPglycerol | +100 mM glc; trehalose low | low | 50 |
| Δnth1 mutant | Exponential cells on YPglycerol | +100 mM glc; trehalose high | low | 50 |
| Δtps1 mutant | Exponential cells on YPglycerol | +100 mM glc; trehalose low | low | 50 |
| W303 (WT) | Exponential cells on YPgalactose | 42°C; trehalose high | high | 58 |
| Δtps1 mutant | Exponential cells on YPgalactose | 42°C; trehalose low | high | 58 |
| IFO-0224 (WT) | Exponential cells on YPglucose | +0.1 mum1 TPN; trehalose low | high | 56 |

(a) +, addition of; glc, glucose; TPN, tetrachloroisophthalonitrile.
III. Characterization and Localization of the Yeast Trehalases
A. Localization of the Acid and Neutral Trehalases
The idea of intracellular compartmentalization of trehalase and its substrate in yeast (72, 73) arose from the finding that starved yeast cells that contain high concentrations of trehalose also contain high trehalase activity. Further investigations after disruption and centrifugation of protoplasts indicated that trehalase is found in the soluble fraction, whereas trehalose remains in the sediment (i.e., protoplast) (31). From these data, it was concluded that trehalose is separated from trehalase through its binding to special sites on the cytoplasmic membrane. In 1974, it was reported that trehalase activity increased upon the initiation of growth of stationary yeast cultures (74). This increased activity of trehalase appeared to involve the activation of a preexisting trehalase zymogen by cAMP-dependent protein kinase (21). In 1982, Wiemken and co-workers (22, 75) demonstrated the localization of the trehalase zymogen in the cytosol and of an active trehalase in the vacuole. These two activities were separated from the respective compartments by protein fractionation, and their different physical and catalytic properties were analyzed using partially purified enzymes (23). This presented clear evidence for the existence of two different activities in yeast, one with maximal activity at pH 4-5 that was confined to vacuoles and the other with maximal activity at pH 7 that was located in the cytosol and interconverted by phosphorylation-dephosphorylation (21, 76, 77; reviewed in 13).
B. Biochemical Characterization of the Acid Trehalase
Acid trehalase co-purifies with the secreted enzyme invertase (27). Therefore the vacuolar acid trehalase was purified to SDS gel homogeneity from a suc2 mutant deficient in invertase. After a five-step purification procedure, these authors found a 38% yield with an approximately 7000-fold purification. The purified enzyme exhibited a broad smear on SDS-PAGE corresponding to a molecular weight in the range of 167-265
kDa. A molecular weight of 218 kDa was estimated for the purified enzyme by high-performance gel filtration, in agreement with previously published data for the partially purified enzyme (23). As with invertase (78, 79), the broad smear observed on SDS-gel electrophoresis of the acid trehalase was believed to be due to the high carbohydrate content of this protein (27). Furthermore, incubation of the purified acid trehalase with concanavalin A-Sepharose removed all acid trehalase activity from the supernatant and confirmed that acid trehalase is a glycoprotein (27). The pH dependence of purified or crude-extract acid trehalase was studied, and activity was shown to be maximal at pH 4.5 (27). Earlier investigations under different conditions (23, 80) showed pH optima between 4 and 5.7, similar to the pH of the content of vacuoles (81, 82). EDTA (6.5 mM) in the tested pH range from 2.5 to 7.7 had no effect on the acid trehalase but completely inhibited the neutral trehalase of crude extracts (23, 27). The purified acid trehalase showed high specificity for trehalose at pH 4.5. Other disaccharides, such as sucrose, maltose, lactose, cellobiose, and melibiose, showed no detectable glucose formation on incubation with acid trehalase. The specificity of the acid trehalase for trehalose is the basis for the trehalose assay discussed earlier (34). The Km for trehalose was about 4.7 mM at pH 4.5, with a corresponding Vmax of 99 µmol min⁻¹ mg⁻¹. An isoelectric point (pI) of 4.7 was estimated for acid trehalase, similar to three other vacuolar enzymes from yeast: proteinase A, carboxypeptidase Y, and aminopeptidase I (83-85). The Km and Vmax data from other groups (23, 31, 80) are at variance with this and with each other and may reflect the different conditions used in the experiments (27).
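The yield and fold-purification figures quoted for the acid trehalase preparation (38% yield, about 7000-fold) follow from standard purification-table arithmetic. A minimal Python sketch of that arithmetic is given here; the activity and protein totals are hypothetical values chosen only so that the result matches the reported figures, and they are not data from reference 27.

```python
def specific_activity(total_units, total_protein_mg):
    """Specific activity in units per mg of protein."""
    if total_protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return total_units / total_protein_mg


def purification_summary(crude_units, crude_protein_mg, pure_units, pure_protein_mg):
    """Return (yield in percent, fold purification) relative to the crude extract."""
    yield_percent = 100.0 * pure_units / crude_units
    fold = (specific_activity(pure_units, pure_protein_mg)
            / specific_activity(crude_units, crude_protein_mg))
    return yield_percent, fold


# Hypothetical totals (assumed for illustration, NOT taken from the cited work).
crude_units, crude_protein_mg = 500.0, 35000.0   # crude extract
pure_units, pure_protein_mg = 190.0, 1.9         # final purified fraction

yield_percent, fold = purification_summary(
    crude_units, crude_protein_mg, pure_units, pure_protein_mg
)
print(f"yield = {yield_percent:.0f}%, purification = {fold:.0f}-fold")
```

Yield tracks recovered activity only, whereas fold purification tracks the enrichment of activity per mg of protein, so a preparation can combine a modest yield with a very high fold purification.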
Incubation of the purified acid trehalase with 0.1 and 10 mM concentrations of each of several classical inhibitors (EDTA disodium salt, iodoacetic acid, o-phenanthrolinium chloride, and phenylmethylsulfonyl fluoride) or with 0.1 mM HgCl₂ showed no significant effect on the enzyme activity, in contrast to 10 mM HgCl₂, which causes significant inhibition. Furthermore, incubation of the enzyme with 0.1 or 10 mM CaCl₂, MnCl₂, MgCl₂, or ZnSO₄ has no effect on enzyme activity (23, 27). This suggests that the active site contains no sulfhydryl, serine, or threonine hydroxyl groups and that acid trehalase activity is independent of bivalent metal ions. As a result of the purification and characterization of the vacuolar "acid trehalase" as a glycoprotein (carbohydrate content of about 86%), the biosynthesis and processing of this enzyme was studied in vivo using mutants conditionally defective in the secretory pathway (27, 86). These authors showed that the acid trehalase is synthesized in a Sec61-, Sec18-, and Sec7-protein-dependent manner, similar to invertase. However, they did not show how this protein enters the vacuole. It should be noted that the migration pattern of the purified acid trehalase is typical of proteins that undergo heterogeneous
glycosylation, such as invertase; however, this type of glycosylation is not seen with any previously characterized vacuolar hydrolase. Vacuolar glycoproteins generally undergo limited glycosyl modification, resulting in sharp, defined bands on SDS-PAGE gels (87-89). We therefore suggest that acid trehalase may be an unusual vacuolar enzyme (29; B. Mechler et al., in preparation). Deglycosylation of the purified acid trehalase was achieved using endoglycosidase H and N-glycosidase F treatment in vitro. SDS-PAGE analysis using antisera raised against the purified acid trehalase revealed a single band of about 41 kDa and a doublet band of approximately 100 kDa (86). From the finding that the sec61 mutant (blocked in passage of newly synthesized protein into the endoplasmic reticulum, and thus defective in the capacity to core-glycosylate secretory proteins) also shows a distinct band at about 41 kDa, it was concluded that this 41-kDa fragment is the carbohydrate-free acid trehalase (86). These authors therefore discussed the 100-kDa doublet found after deglycosylation as probably representing a partially deglycosylated form of acid trehalase. Subsequently, using antisera raised against the purified acid trehalase (27), a different glycosylated protein called gp37 (molecular mass 37 kDa), encoded by the YGP1 gene, was identified (28). It was therefore concluded that what was thought to be the deglycosylated acid trehalase (86) is a different protein that may have co-purified with acid trehalase. In a further search for the acid trehalase gene, the ATH1 gene, required for acid trehalase activity, was identified (29). The amino acid sequence deduced from the ATH1 gene predicts a protein of about 117 kDa, which approximates the roughly 100-kDa band discussed in the previous publications (27, 86). The molecular analysis of the acid trehalase gene ATH1 is discussed in detail in Section IV,A.
C. Biochemical Characterization of the Neutral Trehalase
The neutral trehalase was purified to homogeneity (with poor yield) from stationary cells of the ABYS1 mutant, which is deficient in vacuolar proteinases A and B and carboxypeptidases Y and S (90), and was characterized by App and Holzer (24). The ABYS1 mutant was used because neutral trehalase proved highly sensitive to proteinases when partially purified from other strains (24). Purification from the ABYS1 mutant was started from a stationary culture grown on glucose for 18-24 hours, the growth phase at which the specific activity of neutral trehalase was found to be highest, irrespective of whether the enzyme is phosphorylated or not. These authors achieved a 1500-fold purification of the neutral trehalase with 2% yield in preparative electrophoresis. The purified, electrophoretically homogeneous preparation
of phosphorylated neutral trehalase exhibited a molecular mass of 160 kDa on nondenaturing gel electrophoresis and of 80 kDa on SDS-PAGE. The pH dependence of the activity of this enzyme was demonstrated (with maximal activity of about 114 µmol of trehalose min⁻¹ mg⁻¹ at 37°C) at pH 6.8 to 7, using 50 mM concentrations of acetate, MES, and HEPES buffers or using 50 mM imidazole HCl, which is routinely used for the neutral trehalase assay at pH 7. The pH optimum at 6.8-7.0 justifies the designation of the enzyme as "neutral trehalase," and its cytosolic localization was also confirmed. The apparent Km of the enzyme for trehalose was shown to be 34.5 mM, and, among seven oligosaccharides studied (trehalose, cellobiose, lactose, maltose, melibiose, sucrose, and raffinose), the enzyme formed glucose only from trehalose. Van Assche and Carlier (91) published a similar Km value; however, other authors published Km values around 5-10 mM measured with partially purified and probably only partially phosphorylated enzyme preparations (23, 73, 92). The high concentration of trehalose observed in yeast under certain conditions (up to 23% of the dry weight, or 0.7 mol/liter of the soluble space of the yeast cells) (37, 52) may explain why a trehalase with a high Km may be of physiological significance. Polyclonal rabbit antiserum raised against neutral trehalase precipitates the enzyme in the presence of protein A-Sepharose and does not react with acid trehalase. Mg²⁺, Mn²⁺, and Ca²⁺ at a 1.5 mM concentration have no effect on the purified neutral trehalase; however, 3 mM Mn²⁺ and 5 mM Ca²⁺ led to an increase in enzyme activity from crude extracts (24, 93). EDTA or EGTA at 0.1 mM decreases neutral trehalase activity to about 50%, but 1 mM EDTA completely inhibits the enzyme activity (23, 24, 93), allowing the measurement of the acid trehalase activity.
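The argument about the high Km can be made concrete with the Michaelis-Menten relation v/Vmax = [S]/(Km + [S]), evaluated at trehalose concentrations quoted in this chapter. The Python sketch below is illustrative only: it assumes simple Michaelis-Menten behavior and treats the quoted cellular concentrations as the substrate concentration seen by the enzyme, neither of which is established by the cited work. The function name and labels are hypothetical.

```python
def fractional_velocity(substrate_mM, km_mM):
    """Return v/Vmax for an enzyme obeying Michaelis-Menten kinetics."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


KM_NEUTRAL_MM = 34.5  # apparent Km reported for the purified neutral trehalase

# Trehalose concentrations quoted in the text (mM); labels are descriptive only.
concentrations_mM = {
    "exponential cells on glucose": 0.5,
    "stationary cells": 25.0,
    "maximal accumulation (0.7 mol/liter)": 700.0,
}

for label, s in concentrations_mM.items():
    ratio = fractional_velocity(s, KM_NEUTRAL_MM)
    print(f"{label}: [S] = {s} mM, v/Vmax = {ratio:.2f}")
```

Under these assumptions the enzyme would operate far below saturation in exponentially growing cells and would approach saturation only at the highest reported trehalose levels, so its rate would remain responsive to substrate over most of the physiological range.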
Some overlap between the acid and neutral trehalase activities nevertheless exists at their respective pH optima: the activity of purified neutral trehalase at pH 4.5 is <1% of its activity at pH 7, while the activity of the purified acid trehalase at pH 7 is <5% of its activity at pH 4.5 (24, 27). The neutral trehalase is sensitive to sulfhydryl reagents, whereas acid trehalase is not (27). Polycations (polyethyleneimine, poly-L-lysine-L-phenylalanine, or histones) activate neutral trehalase from crude extracts but inhibit the activity of the purified phosphorylated enzyme. The activation by polycations was found to result from precipitation of RNA and polyphosphates, which both inhibit neutral trehalase. It was therefore concluded that, in addition to phosphorylation-mediated activation of neutral trehalase, a second activation mechanism exists, namely, removal of an inhibitor by polycations (24, 93). Dephosphorylation of the purified phosphorylated neutral trehalase by alkaline phosphatase from Escherichia coli resulted in more than 90% inactivation, and rephosphorylation by incubation with the catalytic subunit of beef heart protein kinase resulted in reactivation (activation was also effected by yeast cAMP-dependent protein kinase) and incorporation of 0.85 mol of phosphate/mol subunit (80 kDa). The phosphorylated amino acid residue was identified as phosphoserine (24). Microsequences of the purified protein were difficult to determine because the protein was N-terminally blocked. Further studies on the regulation of this enzyme were based not on the protein sequence, but on the use of the structural gene cloned by complementation of chemical mutants of the neutral trehalase. The molecular analysis of the neutral trehalase-encoding gene is discussed in Section IV,A.
The numerous differences found between acid and neutral trehalase (such as differences in intracellular compartmentation, carbohydrate content, response to phosphorylation, immunological response, and pH dependence of activity) suggest that these two enzymes are encoded by different genes and that their regulation may vary considerably.
The in vitro enzymatic assay for acid and neutral trehalase (from crude extracts and for the purified enzymes) is based on the measurement of glucose released from trehalose, using the commercially available GOD/POD reagent, which contains glucose oxidase, peroxidase, and ABTS (40). The different pH optima of these enzymes, and the inhibition of neutral trehalase but not acid trehalase by EDTA, are taken into account in the choice of buffers. Phosphorylation of the neutral trehalase from crude extracts is achieved by addition of a cAMP, ATP, and Mg²⁺ mixture to the reaction mixture (21). One unit of acid trehalase is defined as the amount of enzyme that catalyzes the hydrolysis of 1 µmol of trehalose/min at 37°C and pH 4.5, while one unit of the neutral trehalase is defined as the amount of enzyme that catalyzes the hydrolysis of 1 µmol of trehalose/min at 37°C and pH 7.0. Details of the assay method are described elsewhere (24, 27, 33, 76).
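Because the assay measures glucose while the unit is defined per µmol of trehalose hydrolyzed, a conversion is needed: hydrolysis of one trehalose molecule releases two glucose molecules. The Python sketch below shows one way to do that conversion. The function names and the sample reading are hypothetical, and the published protocols (24, 27, 33, 76) remain the reference for the actual procedure.

```python
GLUCOSE_PER_TREHALOSE = 2  # trehalose is a disaccharide of two glucose units


def trehalase_units(glucose_umol, minutes):
    """Units of trehalase: µmol of trehalose hydrolyzed per minute."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    trehalose_umol = glucose_umol / GLUCOSE_PER_TREHALOSE
    return trehalose_umol / minutes


def specific_activity(glucose_umol, minutes, protein_mg):
    """Specific activity in units per mg of protein."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return trehalase_units(glucose_umol, minutes) / protein_mg


# Hypothetical reading: 3.0 µmol glucose formed in 15 min by 0.05 mg protein.
units = trehalase_units(glucose_umol=3.0, minutes=15.0)
activity = specific_activity(glucose_umol=3.0, minutes=15.0, protein_mg=0.05)
print(f"{units:.3f} units, {activity:.2f} units/mg")
```

The same calculation would apply to either enzyme; only the assay conditions (pH 4.5 or pH 7.0, with or without EDTA) decide which activity is being measured.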
IV. Molecular Analysis of the Yeast Trehalases
A. Molecular Analysis of the ATH1, NTH1, and NTH2 (YBR0106) Genes
Following many years of research without a clear picture of the role of trehalose in yeast, Lillie and Pringle (37) summarized their work by stating the following: "Our results suggest several possible roles for trehalose during the yeast life cycle and indicate several constraints on the complex regulatory mechanisms that govern trehalose metabolism. However, it must be emphasized that correlations of the type reported here and in previous studies are not sufficient to establish cause-and-effect relations. We hope that genetic studies will allow more definitive conclusions to be drawn." This statement was followed by an extensive biochemical and molecular characterization of
the enzymes involved in trehalose metabolism with a view to understanding the role and regulation of trehalose metabolism in yeast. The search for the genes that encode enzymes of trehalose metabolism started in the late 1980s, with no true success until the early 1990s. As stated in the introduction, the genes TPS1, TPS2, and TPS3 encode the three enzymes of the trehalose-6-phosphate synthase-phosphatase complex. Although the regulation of these genes with regard to trehalose synthesis is not well understood, their study has contributed immensely to our efforts to understand trehalose biosynthesis in yeast. It is also thought that this enzyme complex may have other functions in yeast, such as sensing the influx of glucose into the cell and participating in the stress response (16, 18, 19, 94). Whether these additional functions of the synthase are linked to the function of trehalose is not clear. Concerning the trehalases, information deduced from the biochemical characterization of the acid and neutral trehalases was very useful for the subsequent molecular analysis. In order to clone the structural gene for the acid trehalase, two independent approaches were used (28, 29). First, peptide sequences obtained from the purification of the acid trehalase (27) were used to prepare degenerate oligonucleotides, which were subsequently used in a polymerase chain reaction with genomic DNA as template. This resulted in a 0.5-kb fragment used to screen a library and to isolate a gene called YGP1, whose expression responds to general nutrient limitation and to repression by glucose (28). The YGP1 gene codes for a highly glycosylated, secreted protein called gp37, with an as yet unknown function. Disruption of the YGP1 gene has no effect on acid trehalase activity, and it was concluded that YGP1 does not encode the acid trehalase. Furthermore, it was concluded that the highly glycosylated protein with high acid trehalase activity purified previously (27) co-purified with the gp37 protein.
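The degeneracy of an oligonucleotide pool designed from a peptide is the product of the number of codons available for each residue, which is why primer design usually favors stretches rich in Met and Trp. The Python sketch below illustrates this with the standard genetic code. The peptide is a made-up example, not a sequence from acid trehalase or gp37, and the window-selection helper is only one simple, assumed way to pick a region.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode the peptide."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


def least_degenerate_window(peptide, length):
    """Return the first window of the given length with the lowest degeneracy."""
    if not 0 < length <= len(peptide):
        raise ValueError("window length must be between 1 and len(peptide)")
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(windows, key=degeneracy)


example_peptide = "LSRMWKD"  # hypothetical peptide, for illustration only
best = least_degenerate_window(example_peptide, 3)
print(best, degeneracy(best))
```

In practice other considerations (codon usage of the organism, primer length, and the position of the degenerate bases) would also enter the choice; the sketch covers only the counting step.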
Antisera raised against gp37 using synthetic peptides detect a high-molecular-weight smear on SDS-PAGE similar to that of the glycosylated acid trehalase preparation reported before (27, 86). Deglycosylation of gp37 produced a discrete species that migrated at approximately 41 kDa, in agreement with the molecular mass calculated from the YGP1 nucleotide sequence. This same 41-kDa protein was believed to be the deglycosylated acid trehalase (86). From this, it is likely that Mittenbühler and Holzer (86) studied the biosynthesis and processing of gp37 instead of the acid trehalase. However, it should be noted that, after deglycosylation of the purified acid trehalase preparation, these authors (86) observed a second band of about 100 kDa in addition to the 41-kDa band on SDS-PAGE (cf. Fig. 1 in (86)). This 100-kDa band was discussed as the partially deglycosylated protein of the acid trehalase preparation.
In a second attempt to isolate the structural gene for the acid trehalase, Destruelle et al. (29) relied on the observation that overproduction of vacuolar proteins leads to their expression at the cell surface (95, 96). Secreted proteins can then be identified by immunoblotting with specific antibodies or by their enzymatic activity. For this reason these authors developed a specific enzymatic overlay assay for the acid trehalase that allowed screening of many transformants. A similar enzymatic overlay assay had been developed for the neutral trehalase (25). For the acid trehalase assay, the neutral trehalase activity was inhibited by the addition of 5 mM EDTA and by the acid pH of 4.5 (24, 27). To identify the putative acid trehalase-encoding gene, yeast strain SEY6210 was transformed with plasmid DNA from a YEp24-based genomic library (97), and positive clones that resulted in secretion of acid trehalase activity were identified. Two plasmids resulted in an approximately 10-fold increase in acid trehalase activity concomitant with the appearance of activity at the cell surface. Sequence analysis of these plasmids showed overlapping genomic inserts, which were used to define the limits of the DNA sequences leading to acid trehalase secretion. This led to the isolation of the ATH1 gene causing the secretion of acid trehalase activity. Disruption of the ATH1 gene leads to loss of acid trehalase activity, and reintroduction of the gene by yeast transformation restores the acid trehalase activity and results in overproduction of the acid trehalase when a high-copy vector is used. Northern blot analysis revealed that the ATH1 gene is expressed in stationary-phase cells, while no expression could be detected in yeast cells growing logarithmically on glucose. This expression pattern corresponds with the activity profile of the acid trehalase (29, 33, 98). While the ATH1 gene is required for acid trehalase activity, two lines of evidence, at the time, suggested that the ATH1 gene may not be the structural gene for the acid trehalase, and favored it as a regulator of synthesis of acid trehalase.
First, the genes encoding the neutral trehalases from yeast and trehalases from certain other organisms have been isolated, and these trehalases exhibit strong homology in some conserved domains that may correspond to the catalytic site (25, 99). The predicted ATH1 gene product does not show any homology with these trehalases. As suggested earlier, it is possible that the product of an acid trehalase-encoding gene uses a different catalytic mechanism than that used by the other known trehalases, such as neutral trehalase. Second, the ATH1 gene product (Ath1p) lacks certain molecular characteristics that are associated with proteins that transit through the secretory pathway, such as a signal sequence at the amino terminus for soluble secretory pathway proteins, hydrophobic internal signal sequences, and consensus cleavage sites based on the rule of von Heijne (100). Localization of the acid trehalase to the vacuole may therefore occur by a mechanism independent of the secretory pathway (29, 101). However, the acid trehalase has been characterized as a glycoprotein that transits to the vacuole in a sec-dependent manner, suggesting movement through the secretory pathway (23, 86, 102).
Recent work from our laboratory showed that deletion of the ATH1 gene leads to inability of yeast cells to grow on trehalose as carbon source, and demonstrated a gene dosage effect of the ATH1 gene for growth on trehalose and for acid trehalase activity using strains homozygous and heterozygous for the Δath1 mutant allele (35). This was the first observation demonstrating the participation of acid trehalase in trehalose utilization, and it points to the ATH1 gene being the structural gene for acid trehalase. In a further study aimed at clarifying the relationship of the ATH1 gene to the acid trehalase (103), the acid trehalase was purified from an overexpression strain and from a strain disrupted in the YGP1 and SUC2 genes, as described (27). These authors achieved a 37% yield with an approximately 600-fold purification, in contrast to the roughly 7000-fold purification achieved previously (27). The purified glycosylated protein showed a broad smear corresponding to the molecular mass estimated previously (27). Deglycosylation of the protein resulted in a doublet band of 85 kDa and an additional band at ~37 kDa on SDS-PAGE. Analysis of the acid trehalase preparation from the Δygp1 strain showed the 85-kDa but not the 37-kDa species, suggesting that the 37-kDa species is the deglycosylated YGP1 gene product (27, 86). The 85-kDa doublet band was further characterized and found to be the deglycosylated acid trehalase: the pH optimum and isoelectric point of this enzyme were similar to previously published data (23, 27). The peptide sequence of part of the 85-kDa protein corresponds to the protein predicted from the ATH1 gene, and antiserum raised to Ath1p specifically recognizes the purified acid trehalase. Therefore, the ATH1 gene encodes the acid trehalase, and the enzyme is glycosylated. Another question that remains unanswered is how the acid trehalase gains access to its substrate trehalose, as well as how it gets into the vacuole.
Endocytosis and autophagocytosis are mechanisms used for delivery of extracellular and cytoplasmic constituents to the vacuole prior to degradation. For endocytosis, signals that trigger inclusion of proteins into endocytic vesicles are partly understood for only a limited number of proteins. Acid trehalase secreted to the periplasm and bound to trehalose could be internalized from the cell surface and delivered to the vacuole via endocytosis, and cytoplasmic trehalose may also enter the vacuole by autophagocytosis (35, 103, 104). The necessity of the acid trehalase for growth on trehalose supports this hypothesis and suggests endocytosis as a mechanism of uptake (35). Furthermore, unpublished results from our laboratory (B. Mechler et al., in preparation) support this idea: (1) similar to the Δath1 mutant, the end1 mutant (105), which is defective in the later steps of endocytosis and biogenesis of the vacuole, does not grow on trehalose as a carbon source; (2) the Δath1 mutant cannot take up trehalose from medium into the cell; (3) although growth on trehalose is extremely slow, it is stimulated by low pH; and (4) the acid trehalase appears to be an unusual vacuolar enzyme whose processing
does not depend on proteinases A and B. A truncated secretory pathway for the processing and entry of acid trehalase and trehalose to the vacuole is proposed, as shown in Fig. 3. Interestingly, some of the kinetic parameters reported for a high-affinity trehalose transporter (107) do not contradict the role of acid trehalase in trehalose transport. Also, the pH optimum for trehalose transport is in the acidic region, similar to that of acid trehalase. Furthermore, experiments using the secretory mutants have demonstrated that several permeases (108) and the plasma membrane H⁺-ATPase (109) are externalized by the same secretory vesicles that transport periplasmic enzymes. However, the amino acid sequences of the permeases and the ATPase do not contain amino-terminal signal sequences (110). Therefore, membrane insertion is probably mediated by internal sequences. These characteristics may also apply to the acid trehalase.
FIG. 3. Proposed pathway for entry of acid trehalase Ath1p and extracellular trehalose into vacuoles. The processing and biosynthesis of Ath1p through the secretory pathway is shown (86, 102). Existence of periplasmic and vacuolar Ath1p is shown. Entry of periplasmic Ath1p and extracellular trehalose into the vacuole through endocytosis is shown. A link between the secretory and endocytic pathways was proposed earlier (106). PM, plasma membrane; SV, secretory vesicle; EV, endocytic vesicle.
The cytosolic neutral trehalase-encoding gene NTH1 was cloned by complementation of a neutral trehalase-deficient yeast mutant obtained by EMS mutagenesis (25). Three neutral trehalase mutants shown to belong to the same complementation group were transformed with the Saccharomyces cerevisiae genomic library in YEp24 (97) and screened by enzymatic overlay assay for neutral trehalase. Two overlapping plasmids that restored the defect in neutral trehalase activity were isolated and sequenced. An NTH1 gene initially believed to have an open reading frame of 2079 base pairs (bp), encoding a protein of 693 amino acids with a molecular mass of about 80 kDa, was identified (25). Subsequently, it was found that the NTH1 gene has an open reading frame of 2253 bp, corresponding to a protein of molecular mass 86 kDa, in agreement with Western blot analysis of purified neutral trehalase and of crude extracts (24-26). Northern blot analysis yielded a single mRNA species of approximately 2.3 kb, in agreement with the size of the gene. Disruption or deletion of the NTH1 gene leads to loss of neutral trehalase activity; to a higher steady-state concentration of trehalose in exponentially growing, heat-stressed, or stationary cells; and to inability of these mutants to hydrolyze trehalose after return from the heat stress temperature of 40°C to the normal growth temperature of 30°C (25, 32). Polyclonal neutral trehalase antiserum (24) as well as synthetic peptide antiserum based on the sequence of Nth1p do not precipitate neutral trehalase in the NTH1 mutant strains. Amino acid sequence comparison of Nth1p from S. cerevisiae points to significant similarity in certain domains with the product of the osmoregulated treA gene, which encodes the periplasmic trehalase from E.
coli (111) and with the trehalase from rabbit small intestine (10); these regions may participate in the formation of the catalytic domain of these trehalases (25). Nth1p has two putative cAMP-dependent phosphorylation consensus sequences (Arg-Arg-X-Ser) (112). It was therefore suggested that the previously described activation of neutral trehalase by cAMP-dependent phosphorylation is due to phosphorylation of one or both of these sites (25, 26). The NTH1 gene therefore encodes yeast neutral trehalase, and the neutral trehalase, in contrast to acid trehalase, is responsible for intracellular trehalose hydrolysis (25, 32, 33).
Nucleotide sequence analysis indicated a location of the NTH1 gene near the centromeric elements CDEI, CDEII, and CDEIII on chromosome IV (113). Because of this proximity to the centromere, the NTH1 gene is believed to be a useful tool for determining the distance of unknown genes from their centromere. Transformation of yeast with pNTH (YEp24 plus NTH1 and its centromeric flanking sequences) does not lead to significant overexpression of neutral trehalase activity. This may be due to a depressing influence of the centromere on the plasmid copy number per cell (25, 99). However, when the open reading frame of the NTH1 gene was put under the control of the GAL1 promoter in a 2μ-based high-copy plasmid, a high overexpression of neutral trehalase activity was observed (33), suggesting that the flanking sequences of the gene that contain the centromeric elements may be responsible for the depressing effect on plasmid copy number proposed earlier (25, 99). In the promoter of the NTH1 gene, a putative binding sequence was found for the Mig1 protein, a multicopy inhibitor of the GAL1 promoter (26, 114). Mig1 protein also binds the promoters of the SUC2 and FBP1 genes (114-116), and it is thought to be not only a glucose-sensitive repressor of GAL1, but also an inducer of the glucose-regulated metabolic pathway. The neutral trehalase, a glucose-forming enzyme, might therefore be repressed by glucose under the control of Mig1 protein (25, 99).
As mentioned earlier, a putative trehalase-encoding gene, NTH2, located on the right arm of CENII in chromosome II, has been described (30). Its predicted protein sequence shows 77% identity to the product of the neutral trehalase-encoding gene, NTH1, located on the right arm of CENIV in chromosome IV (26, 113). The putative protein encoded by the NTH2 gene (Nth2p) has a longer N-terminal extension than the NTH1 gene product (Nth1p) (25, 26, 30). The N-terminal regions of Nth1p and Nth2p each contain two putative cAMP-dependent phosphorylation sites. Nth1p contains Arg17-Arg-Lys-Ser20 with phosphorylatable Ser20, and Arg80-Arg-Gly-Ser83 with phosphorylatable Ser83 (25, 26), while Nth2p contains Arg49-Arg-Lys-Ser52 and Arg109-Arg-Gly-Ser112 (26). Therefore, similar to the neutral trehalase encoded by the NTH1 gene, the NTH2 gene product Nth2p may be regulated by phosphorylation.
Trehalase phosphorylation participates in the regulation of trehalose concentration, which is thought to be involved in stress response and control of the yeast life cycle (13). The similarity in sequence and localization of the NTH1 and NTH2 genes supports the idea that the centromeric regions of both chromosomes were formed through duplication of a centromere region of an ancestral chromosome (26, 30). Northern blot analysis of the YBR0106 (NTH2) gene in the background of the Δnth1 mutant showed that NTH2 is expressed at a low level in cells growing exponentially on glucose and at a high level in stationary cells after glucose exhaustion (33). This result suggested that NTH2 is repressed by glucose, similar to neutral trehalase activity (25, 54, 99, 104). Whether a catalytically active trehalose-degrading enzyme (i.e., a trehalase) is formed by translation of this mRNA is still an open question: neither disruption or deletion of the NTH2 gene nor its overexpression changes the trehalose degradation rate or the trehalose levels as compared to the wild-type strain under different conditions. However, an increase in acid trehalase activity was reported for the NTH2 disruption mutant strain (33). Whether this implies a regulatory effect of NTH2 on acid trehalase is yet to be studied. However, the necessity of the NTH1 and NTH2 genes for recovery of yeast cells after heat
shock has been demonstrated by a heat shock survival assay of the respective mutants on solid media (33, 104). This is a strong indication that the two genes have something in common functionally, in accordance with the strong structural homology between them. This phenotype of the NTH2 mutant also points to the formation of a protein from the NTH2 gene and to other functions of the trehalases in yeast. Studies with antiserum raised against Nth2p using a synthetic peptide of the N-terminus have not yet yielded a clear result. Furthermore, Nth2p appears to have a signal sequence in its N-terminal region; however, this sequence lacks the characteristics of a typical sequence used for import into organelles (see Fig. 4).
B. Alignment of Trehalases from Various Organisms
Amino acid sequence comparison of the available trehalase sequences, from Saccharomyces cerevisiae neutral trehalases (Sc-Nth1p and Sc-Nth2p), Kluyveromyces lactis neutral trehalase (Kl-Nth1p) (GenBank accession number X81421), E. coli periplasmic trehalase (E-Trea) and cytoplasmic trehalase (E-Tref) (111; GenBank accession number S47739), rabbit small intestine trehalase (R-Tre) (10), and the insect trehalases from Tenebrio molitor (T-Tre) (117) and Bombyx mori (B-Tre) (GenBank accession number D13763), reveals an N-terminal extension of the yeast trehalases (Sc-Nth1p and Sc-Nth2p from S. cerevisiae and Kl-Nth1p from K. lactis) in relation to the other trehalases (Fig. 4). Such an N-terminal extension has also been observed for the FBP1 gene product from S. cerevisiae as compared to FBPases from pig and sheep (118), as well as for the FBPase from S. pombe as compared to that from pig kidney (119). In yeast malate dehydrogenase, an N-terminal extension has been observed in the cytosolic enzyme in contrast to the mitochondrial isoenzyme (120). The N-terminal extension of the neutral trehalases from yeast (Sc-Nth1p, Sc-Nth2p, and Kl-Nth1p) contains the putative cAMP-dependent protein kinase recognition site Arg-Arg-X-Ser (112) at two different positions. One or two phosphorylation sites may therefore mediate the cAMP-dependent phosphorylation of Nth1p (13, 21, 24) and possibly also of Nth2p (26). Such an N-terminal extension seems to be important for the regulation of Saccharomyces cerevisiae neutral trehalase (Nth1p) and K. lactis neutral trehalase (Kl-Nth1p) by a cAMP-dependent phosphorylation process. This fits with the dependence of Nth1p activity on the N-terminal extension of the protein (33). In Fig. 4, it can also be seen that the eight trehalases from various organisms show homology in two distinct domains corresponding to the central core and the C-terminal region.
This homology is much more significant when three or six trehalases are compared (25, 99). The regions of homology may therefore be part of the catalytic domain of the trehalases, similar to the situation in the FBPase genes from E. coli, rabbit, and S. cerevisiae. Furthermore, the relationship of the trehalases from different sources can be
seen from the dendrogram in Fig. 5. The more closely related the organisms are, the more similar the trehalases appear to be. While the trehalases from the yeasts S. cerevisiae and K. lactis share a high percentage of identity, inclusion of the trehalase from any of the other organisms reduces the identity. This is also true for the E. coli trehalases and the insect trehalases when each group is compared separately. It should be noted, however, that while the dendrogram gives a first impression of the sequence similarity between different proteins, it may not reflect the evolutionary relationship between the proteins. One argument in favor of the ATH1 gene not being the structural gene for acid trehalase is the lack of homology between the ATH1 gene product and the other known trehalases from Saccharomyces cerevisiae and other organisms (29). This lack of homology may be justified if one considers the localization of the acid trehalase in the vacuole and its pH optimum of 4.5, in contrast to the neutral trehalase, which is located in the cytosol and has a pH optimum of 7. The evolutionary relationship as well as the catalytic mechanisms of the two enzymes may be different.
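The statement that identity drops when more distantly related trehalases are included rests on counting identical residues in aligned columns. The sketch below shows one common way to compute pairwise percent identity from two rows of an alignment. It is an illustration only, not the Pileup procedure; the function name is hypothetical, and the two 20-residue strings are modeled on the conserved central block of Fig. 4, with their exact residues and their assignment to particular enzymes being assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap_chars=".-"):
    """Percent of identical residues over columns where neither row has a gap.

    Assumption: identity is normalised by the number of gap-free columns;
    other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns that contain a gap in either row
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Illustrative fragments only (assumed, see lead-in).
    yeast_like = "GKILNANRSYYLCRSQPPFL"
    bacterial_like = "GHIPNGNRSYYLSRSQPPFF"
    print(f"{percent_identity(yeast_like, bacterial_like):.1f}% identity")
```

Because the value depends on the normalisation and on how gaps are treated, identities computed this way are comparable only within one alignment and one convention.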
C. Regulation of the Expression of Trehalases by Heat and Other Stress Conditions
Cells respond to environmental or physiological stresses by altering the expression of specific genes encoding proteins that serve a protective or adaptive role. In the yeast Saccharomyces cerevisiae, stress induces the expression of heat shock genes, as well as genes involved in protein degradation, glycolysis, plasma membrane function, antioxidative defense, metal homeostasis, trehalose biosynthesis (121), and trehalose hydrolysis (33, 104). In contrast, the synthesis of many proteins [300 out of 500 analyzed in one study (122)] transiently decreases after a heat shock, mainly due to cessation of RNA synthesis and degradation of preexisting mRNA (123, 124). The expression of a set of the stress-induced genes is also induced when cells enter stationary phase. This can be considered a stress response triggered by nutrient limitation (125, 126). Nutrient regulation of the trehalases is discussed in Section IV,D. Although the exact roles of all the induced proteins are not well understood, it is evident that stress tolerance requires the coordinated activity of a number of gene products involved in diverse cellular functions.
Trehalose concentration increases in cells exposed to heat stress. This has been shown to result from increased enzymatic activity of the trehalose-6-phosphate synthase and increased concentration of its substrates UDP-glucose and glucose 6-phosphate (54, 55). In fact, an increase in mRNA of the TPS1 and TPS2 genes has been reported, and these genes appear to have stress-related functions (20, 127). The additional finding that heat stress also leads to increased mRNA of the NTH1 and NTH2 genes, as well as increased enzymatic activity and immunoreactivity of the NTH1 gene product, is surprising (104; S. Nwaka et al., submitted). This implies that these genes are regulated by temperature at both the transcriptional and translational levels, and their protein products can therefore be called heat shock proteins or, better, stress proteins. This fits with the role of these genes in the recovery of yeast cells after heat shock, as mentioned earlier (33, 104). The increase in neutral trehalase activity seems to result from increased de novo protein synthesis due to the high NTH1 mRNA level during heat stress and not from a posttranslational modification of existing protein. As shown in Table II, cAMP-dependent phosphorylation is not responsible for the increase in activity during heat stress because the activity of neutral trehalase both before and after the heat stress did not respond to phosphorylation by a cAMP-ATP mixture in vitro. This shows that the temperature regulation may be independent of the cAMP regulation pathway, at least in glucose-grown cells. Furthermore, when heat-stressed cells growing exponentially on glucose were shifted to normal growth temperature, the neutral trehalase activity increased further instead of decreasing as expected (see Table II). This seems to be typical for glucose-grown exponential cells and may be related to a role of the NTH1 gene in recovery of cells after heat shock (33, 104).
In addition to regulation of the expression of the NTH1 and NTH2 genes by temperature, unpublished results from our laboratory show that oxidative stress (hydrogen peroxide) and other toxic chemicals (such as CuSO4) lead to increased neutral trehalase activity and immunoreactivity, as well as increased mRNA expression of the NTH1 and NTH2 genes. However, the patterns of mRNA increase differ between these chemicals, suggesting different regulation of the trehalases in their presence. Interestingly, these chemicals do not lead to trehalose accumulation under the same conditions, in contrast to the situation with heat, which increases trehalose and the trehalases at the same time. This strongly suggests that the mechanism of stress protection by the trehalases can be independent of trehalose. In contrast to the NTH1 and NTH2 genes, the ATH1 gene has been shown not to be necessary for heat shock survival, and the activity of the acid trehalase from cells growing on glucose does not increase significantly during heat stress (104).
[Figure 4: multiple sequence alignment of Sc-Nth1p, Sc-Nth2p, Kl-Nth1p, E-Trea, E-Tref, B-Tre, T-Tre, and R-Tre, alignment positions 1-810, with a consensus line.]
FIG. 4. Alignment of the predicted amino acid sequences of eight known trehalases from various organisms. Sc-Nth1p and Sc-Nth2p represent neutral trehalases from Saccharomyces cerevisiae. Kl-Nth1p represents neutral trehalase from Kluyveromyces lactis. E-Trea and E-Tref represent E. coli periplasmic and cytoplasmic trehalases, respectively. B-Tre represents trehalase from the insect Bombyx mori, while T-Tre represents trehalase from the insect Tenebrio molitor. R-Tre represents trehalase from rabbit small intestine. Consensus sequences are indicated. Alignment was produced using the Pileup program (University of Wisconsin).
FIG. 5. Dendrogram of the eight trehalases from various organisms. The names of the trehalases are as described in the legend to Fig. 4. The relationship between the various proteins is depicted. The yeast trehalases appear to be more related to each other than to the other trehalases from E. coli, insects, and rabbit. This dendrogram was produced using the Pileup program (University of Wisconsin).
TABLE II. Nth1p ACTIVITY IN EXPONENTIALLY GROWING CELLS ON GLUCOSE BEFORE AND AFTER HEAT STRESS AT 40°C (a)
                                                   Nth1p specific activity (mU/mg protein)
Strain(s)                  Growth/heat treatment   Without phosphorylation   With phosphorylation
YS18 (wild-type)           30°C                      9                        10
                           40°C/40 min              30                        28
                           30°C/40 min              70                        74
YSN1/pNTH (Δnth1/pNTH)     30°C                     24                        22
                           40°C/40 min              54                        52
                           30°C/40 min             131                       139
(a) Assay method is as described (24, 25, 33); Nth1p, neutral trehalase encoded by the NTH1 gene; pNTH, YEp24 plasmid containing a 6-kb SalI fragment of the NTH1 gene (25).
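The two conclusions drawn from Table II (no response to in vitro phosphorylation, and a continued rise in activity after return to 30°C) can be checked with simple ratios. The following Python sketch does this arithmetic. The function names are hypothetical, and the assignment of the two value series to the "without" and "with" phosphorylation columns is an assumption, since the column order is not unambiguous in the table as extracted.

```python
# Specific activities (mU/mg protein) read from Table II.
# Assumption: third field = without, fourth field = with phosphorylation.
TABLE_II = [
    ("YS18", "30C", 9, 10),
    ("YS18", "40C/40min", 30, 28),
    ("YS18", "30C/40min", 70, 74),
    ("YSN1/pNTH", "30C", 24, 22),
    ("YSN1/pNTH", "40C/40min", 54, 52),
    ("YSN1/pNTH", "30C/40min", 131, 139),
]


def phosphorylation_ratios(rows):
    """Activity with phosphorylation divided by activity without it."""
    return {(strain, cond): with_p / without_p
            for strain, cond, without_p, with_p in rows}


def fold_change_vs_30c(rows):
    """Unphosphorylated activity relative to the same strain's 30C value."""
    baseline = {strain: without_p
                for strain, cond, without_p, _ in rows if cond == "30C"}
    return {(strain, cond): without_p / baseline[strain]
            for strain, cond, without_p, _ in rows}


if __name__ == "__main__":
    folds = fold_change_vs_30c(TABLE_II)
    for key, ratio in phosphorylation_ratios(TABLE_II).items():
        print(key, f"with/without = {ratio:.2f}", f"fold vs 30C = {folds[key]:.1f}")
```

Under the stated column assumption, every with/without ratio stays close to 1, whereas the activity relative to the 30°C value rises severalfold after heat stress and further after the shift back to 30°C.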
D. Regulation of Trehalose Concentration and Expression of the Trehalases by Nutrients (Catabolite Repression)
Yeast cells growing exponentially on glucose exhibit low trehalose concentration (32, 34, 45, 54) and basal activity of the trehalose-synthesizing and -hydrolyzing enzymes (55). However, as the cells pass through the diauxic shift (switching from fermentation to respiratory metabolism), they start to accumulate trehalose until they reach stationary phase (34, 44), where they have a high trehalose content. It was suggested that repression of the trehalose-6-phosphate synthase (trehalose-synthesizing enzyme complex) by glucose might be responsible for the drastic change in concentration of trehalose from exponential to stationary growth phase (49). It has also been shown that the activity of neutral and acid trehalase is low in cells growing exponentially on glucose and high in stationary phase after glucose consumption (24, 25, 33, 54). Furthermore, it has been reported that the expression of the NTH2 gene and possibly the NTH1 gene is low in exponentially growing cells as compared to stationary phase, suggesting repression of these genes by glucose (33, 54, 99, 104). The dependence of gene expression on the presence of glucose in the culture medium is typical for genes that are under control of "catabolite repression," also called "glucose repression" (128-130). Repression by glucose generally points to a biologically significant function of the respective enzymes in glucose metabolism. In yeast, besides several key enzymes of gluconeogenesis (130), the NTH1, ATH1, and YBR0106 (NTH2) genes are subject to catabolite repression. A biological function of the glucose-regulated "silent trehalase" NTH2 may consist of making glucose available from trehalose or other carbohydrates under special conditions. From what is known for certain genes, the mechanism of glucose repression involves at least three possibilities in the regulatory pathway (131, 132).
First, glucose can inhibit transport or formation of the inducer molecule. Second, glucose can inhibit an activator molecule required for gene expression. Third, the promoters of glucose-repressed genes can contain operator sites that are targets for a repression mechanism independent of the positive activator protein. In addition to glucose repression, some proteins repressed by glucose may also be inactivated by glucose, a phenomenon called "catabolite inactivation" (130, 133). Catabolite inactivation is believed to be a consequence of proteolysis. Some glucose-repressed enzymes are subject to catabolite inactivation (130). However, almost nothing is known about a possible regulation of the trehalases by catabolite inactivation.
Nutrient starvation in the yeast Saccharomyces cerevisiae leads to a number of physiological changes that accompany entry into stationary phase. Two changes that have been well documented are strengthening of the cell wall
and accumulation of trehalose (13, 134). As cells enter stationary phase, the transcription of most genes and the levels of mRNA decrease (126), followed by a substantial drop in overall protein synthesis (135). In contrast, some genes are expressed at higher levels in stationary phase as a response to nutrient limitation (125, 126, 136, 137), and the gene products in some cases appear to be necessary during stress conditions. We can therefore say that the availability of nutrients in the medium is a signal for the expression of the trehalases. This might imply a regulation of the trehalases both by catabolite repression and by nutrient limitation, resulting in stress response. It may also suggest a common regulatory circuit for the trehalases by nutrients, similar to the MAL, SUC, or GAL gene families (131). The high expression of the trehalases in stationary phase, in parallel with a high level of trehalose, may contribute to the high stress tolerance of these cells.
The regulation of the trehalases on nonfermentable carbon sources such as ethanol, glycerol, and acetate has not received much attention. The acid trehalase activity is high in cells growing exponentially on glycerol or ethanol, in contrast to nondetectable activity in cells growing exponentially on glucose (98; S. Nwaka, unpublished results), although the neutral trehalase is measurable under these conditions. Interestingly, Δnth1 and Δath1 mutants have been shown to grow very poorly on glycerol compared to wild-type and the Δnth2 mutant. However, no difference in growth is noticeable during growth on ethanol under laboratory conditions (99, 104).
A clear explanation for this growth behavior is lacking; however, it was suggested that glucose delivery from trehalose may be necessary during growth on glycerol (104). It has been shown that the Δath1 mutant produces more ethanol and more trehalose and grows faster than wild-type and the Δnth1 mutant when 20% glucose is used for cultivation (138). However, in other studies (25, 50), the Δnth1 mutant was shown to accumulate more trehalose than wild-type under similar conditions.
cAMP-dependent protein kinase plays a central role in the response to starvation (reviewed in 139). cAMP levels are controlled by modulation of adenylate cyclase and cAMP phosphodiesterase activities. Adenylate cyclase (encoded by CYR1) is regulated by Ras1p and Ras2p, GTP-binding proteins that are homologs of the mammalian Ras oncogene proteins. When the Ras proteins are activated, intracellular levels of cAMP increase, causing the negative regulatory subunit of the cAMP-dependent protein kinase (encoded by the BCY1 gene) to dissociate from, and thus activate, the catalytic subunits (encoded by TPK1-3). cAMP levels fall under conditions of nutrient limitation, resulting in the inactivation of cAMP-dependent protein kinase. Addition of glucose to stationary-phase cells leads to rapid hydrolysis of trehalose by activation of trehalase via the Ras/cAMP pathway (13, 32, 140). This cAMP-dependent activation of trehalase has been shown to be a posttranslational modification of existing enzyme (140, 141). Using mutant strains
of S. cerevisiae harboring lesions in the cAMP protein kinase cascade, it was established that trehalase is a substrate for the cAMP-dependent protein kinase (76). The presence of two possible phosphorylation consensus sequences in the predicted protein of the NTH1 gene supports such a posttranslational mechanism. The predicted Nth2p also has two cAMP-dependent protein kinase consensus sequences, suggesting that it may also be regulated by phosphorylation via the same pathway. As shown in Table II, neutral trehalase activity in extracts of cells growing exponentially on glucose does not respond to phosphorylation by a cAMP-ATP mixture, in contrast to that of stationary cells grown on glucose (24, 26, 33, 54, 99) or of cells growing exponentially on galactose (99). A possible explanation for this might be that the basal level of neutral trehalase in cells growing exponentially on glucose already exists in phosphorylated form because of the high cAMP and ATP content of these cells (142, 143).
V. Biological Functions of the Trehalase Genes
A. Role in Trehalose Hydrolysis
By definition, trehalase is any enzyme that can specifically bind the substrate trehalose and hydrolyze it into two glucose units. In the yeast S. cerevisiae, the idea of two trehalases (a cytosolic neutral trehalase and a vacuolar acid trehalase) was established on the basis of in vitro assay systems that exploit the typical characteristics of these two enzymes, such as pH optimum for activity and localization. The availability of the genes that encode these enzymes, namely the NTH1 gene for neutral trehalase, the ATH1 gene for the acid trehalase, and the NTH2 gene for a putative trehalase, made it possible to study the role of these enzymes in trehalose hydrolysis in intact cells. It was shown using the Δnth1 mutant that the neutral trehalase encoded by the NTH1 gene is responsible for intracellular hydrolysis of cytosolic trehalose, in contrast to the acid trehalase encoded by the ATH1 gene and a putative third trehalase, Nth2p, encoded by NTH2 (25, 30, 33, 50, 104). Furthermore, the acid trehalase Ath1p has been shown to be responsible for utilization of extracellular trehalose, probably by endocytosis through the vacuole, in contrast to the neutral trehalases (35). A similar pathway has been proposed for hydrolysis and utilization of extracellular trehalose by the acid trehalase of Neurospora crassa (144).
B. Role in Stress Response
The finding that the trehalase genes NTH1 and NTH2 may play an important role in the recovery of yeast cells after heat shock suggests a second
function of these trehalases in addition to trehalose hydrolysis. This thermotolerance-enhancing role of the trehalases does not seem to occur via trehalose accumulation; otherwise one would have seen better survival in the NTH1-defective mutant (which does not hydrolyze trehalose) than in the wild-type or in the NTH2 mutant, which behaves like wild-type in terms of trehalose hydrolysis (33). Though stationary-phase cells are thermotolerant and show high trehalose concentration (32, 38, 53), they also exhibit high neutral trehalase activity and high mRNA expression of the YBR0106 (NTH2) gene. Therefore, the ability of the wild-type to recover after heat shock, as compared to the NTH1 and NTH2 mutant strains, can only be correlated to the trehalases and not to trehalose concentration. This is supported by the complementation of the "poor-heat-shock-recovery" phenotype of the Δnth1 strain with plasmids containing the NTH1 gene and its promoter, as shown in Fig. 6. There exist more examples of lack of
FIG. 6. The NTH1 gene complements the heat shock survival defect of the Δnth1 mutant. Two wild-type strains (YS18); two independent Δnth1 mutant strains (Δnth1::LEU2 and Δnth1::URA3), also called YSN1 and YSN2, respectively; and two Δnth1 strains transformed with plasmids containing a 6-kb Sun fragment of the NTH1 gene (Δnth1/pNTH and Δnth1/pNTH2) were used. pNTH was derived from YEp24 (25, 97) and pNTH2 (99) was derived from pSEY8 (Invitrogen). All strains are isogenic except for the Δnth1 mutation and the introduction of pNTH and pNTH2. Growth and assay conditions are as described (33, 104).
correlation between trehalose concentration and increase in thermotolerance (Table I). The NTH1 and NTH2 genes not only are involved in protecting cells from heat shock, but may also participate in protection from other forms of stress, such as toxic chemicals and oxidative injury, in a manner that is independent of trehalose accumulation (S. Nwaka et al., submitted). We therefore propose that the neutral trehalase encoded by the NTH1 gene and the putative product of the NTH2 gene (Nth2p) are stress proteins. Although the stress-protective function of the NTH1 and NTH2 genes has been shown to be unrelated to trehalose under some conditions, one cannot eliminate the possibility that certain stress protection afforded by the trehalases (such as protection from heat stress, resulting in heat shock survival) may be linked to trehalose metabolism (i.e., delivery of glucose from trehalose). This may explain why the cell increases the production of trehalose-synthesizing enzymes and at the same time increases production of the trehalose-hydrolyzing enzymes. Heat shock may damage components of the membrane and limit the uptake of glucose or other nutrients. In this situation, an internal supply of glucose for energy production, derived from trehalose breakdown, might be necessary (104). This may apply to stress conditions that involve the accumulation of trehalose and an increase in the expression of the trehalases at the same time. For some stress response-inducing chemicals, which do not favor the production of trehalose and the trehalases at the same time, other mechanisms, such as prevention of damage to cellular components during stress or resolution of these components after stress, may be used. This would favor a role of these proteins as chaperones.
The ATH1 gene appears not to be involved in stress protection such as heat shock protection (104); however, it has been shown that disruption of the ATH1 gene confers better survival after dehydration, freezing, and ethanol shock, probably due to an enhanced trehalose content under special conditions (138).
C. The Acid Trehalase and Trehalose Transport
A trehalose carrier that is assumed to move trehalose in both directions across the yeast cell membrane has been described (145). Nothing is known about the trehalose carrier gene; however, the trehalose carrier protein is repressed by glucose, and its mutant (derived from chemical mutagenesis) does not grow on trehalose as a carbon source, similar to the acid trehalase mutant. However, the trehalose carrier mutant shows high sensitivity to dehydration even in the presence of high trehalose concentration (145), in contrast to the acid trehalase mutant, which shows high dehydration tolerance (138). This discrepancy between the trehalose carrier and the acid trehalase makes it unlikely that the two proteins are the same. Two trehalose
transport systems in S. cerevisiae have been biochemically characterized: a high-affinity H+ symporter (Km 4 mM) and a low-affinity transport activity (Km > 100 mM) (107). The high-affinity transporter was shown to have a pH optimum of 5.0, to be repressed by glucose, and to be inhibited by uncouplers such as organic mercury compounds. Some of these characteristics are similar to what has been described for the acid trehalase (27, 103). In support of this similarity are the findings that the acid trehalase mutant cannot transport trehalose from the medium into the cell and that trehalose transport is stimulated by acidic pH (B. Mechler et al., in preparation). These findings indicate that the acid trehalase may mediate the transport of extracellular trehalose into the cell (i.e., a second function of the acid trehalase in addition to trehalose hydrolysis).
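To give a feel for what the two Km values quoted above imply, the following minimal sketch compares the fractional saturation of the two transport activities under simple Michaelis-Menten kinetics. The assumptions are ours, not the cited study's: both activities are treated as simple saturable carriers, Vmax is normalized to 1 for each, and the low-affinity Km is set to its stated lower bound of 100 mM, so the low-affinity values are upper limits.

```python
def uptake_rate(substrate_mM, vmax, km_mM):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_mM / (km_mM + substrate_mM)


HIGH_AFFINITY_KM_MM = 4.0    # Km of the H+ symporter quoted in the text
LOW_AFFINITY_KM_MM = 100.0   # lower bound; the text gives Km > 100 mM

# Vmax is normalized to 1.0 for both systems (an illustrative assumption).
for trehalose_mM in (1.0, 4.0, 20.0, 100.0):
    high = uptake_rate(trehalose_mM, 1.0, HIGH_AFFINITY_KM_MM)
    low = uptake_rate(trehalose_mM, 1.0, LOW_AFFINITY_KM_MM)
    print(f"{trehalose_mM:6.1f} mM  high-affinity {high:.2f}  "
          f"low-affinity at most {low:.2f}")
```

At an external trehalose concentration equal to the high-affinity Km (4 mM), the symporter runs at half of its maximal rate, whereas the low-affinity activity is below 4% of its own maximum.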
D. Role of Trehalose Hydrolysis in Spore Germination
Trehalose hydrolysis is thought to be a major event during early germination of fungal spores, suggesting the importance of trehalose and the trehalases in spore germination (13, 146). However, germination was shown to be possible in yeast in the absence of trehalose mobilization (147), and it was concluded that the amount of energy produced during trehalose breakdown in S. cerevisiae was small compared with that derived from the sugar in the medium (148). Using diploids homozygous for the double mutations Δnth1Δath1 as well as Δnth1Δnth2, no defect in sporulation or in germination of spores on fermentable carbon sources was seen (99, 104).
VI. Trehalases and Heat Shock Proteins
A. Stress Regulation by Heat Shock Element and Stress Responsive Element
In the yeast Saccharomyces cerevisiae, temperature and other stress conditions induce the expression of genes through at least two promoter elements, the heat shock element (HSE) and the stress responsive element (STRE) (127, 149-151). The transcriptional induction of many of the heat- and stress-induced genes is mediated by the heat shock factor (HSF) by binding to the HSE (152-154). This transcriptional activator is constitutively bound as a trimer to a cis promoter DNA sequence (GAANNTTC or longer, where N is any nucleotide). Binding activity of HSF to HSE is induced upon stress (154). A novel element that mediates stress-induced transcriptional activity was identified having the consensus sequence CCCCT, called STRE (127, 149-151). This element is known to mediate the stress-induced expression of 1) the CTT1 gene encoding the cytosolic catalase T (150), 2) the
TPS2 gene encoding the 102-kDa subunit of the trehalose-6-phosphate synthase complex (127), and 3) the DDR2 gene (149). It is also present in the promoter region of other stress-induced genes such as UBI4, HSP12, HSP26, HSP104, and PTP2 (151). The observation of this motif in the promoter regions of the NTH1 and NTH2 genes, but not in the ATH1 gene, explains the heat stress-induced expression of the genes NTH1 and NTH2 as well as the requirement of these genes (not of the ATH1 gene) for the recovery of yeast cells after heat shock (104). Although the exact roles of all the stress-induced proteins are not well understood, it is evident that stress response requires the coordination of a number of gene products involved in diverse cellular functions. An update of known stress genes with location of the STREs is presented in Table III. In addition to the lack of correlation between trehalose concentration and thermotolerance shown in Table I, it was proposed that increased amounts of trehalose are not sufficient for stress tolerance (127). However, the induction of the TPS2 gene (and possibly other genes participating in trehalose synthesis) and trehalose accumulation were shown to be required for Yap1p-mediated pleiotropic drug resistance through STRE (127). The YAP1 gene encodes a yeast transcriptional activator that mediates multiple drug resistance. The regulation of the NTH1 and NTH2 genes by heat stress, the presence of the STRE in their respective promoters, and the requirement of these genes for recovery of cells after heat shock suggest that not only trehalose biosynthesis but its turnover as a whole may be necessary for effective stress response in yeast. One may reflect that expression of the trehalose synthase complex, expression of the trehalase genes (NTH1 and NTH2), and the concentration of trehalose all increase as yeast cells are subjected to heat stress. Different lines of evidence have led to the proposal of a protective cooperation between some heat shock proteins and trehalose (158, 159). Here we propose that the trehalose-hydrolyzing machinery represented by the neutral trehalases is also essential for stress tolerance. Though the acid trehalase is not regulated by temperature, it is expressed at a high level in stationary-phase cells, similar to NTH1 and NTH2. This suggests a common role of the three trehalases in response to nutrient regulation.

TABLE III
STRE-RELATED SEQUENCES (CCCCT OR GGGGA OR REVERSE ORIENTATION) IN THE PROMOTER OF SOME STRESS PROTEINS

Gene      Number of STRE   Position of STRE in promoter(a)                  Reference(s)
          in promoter
HSP104    3                -172, -220, -252                                 151, 155
HSP78     3                -126, -159, -191                                 156
SSA1      2                -160, -211                                       155
SSA4      3                -179, -432, -467                                 155
SSC1      1                -250                                             157
HSP26     4                -328, -466, -484, -659                           151, 155
HSP12     7                -190, -232, -377, -414, -435, -652, -679         151, 155
CTT1      3                -100, -345, -330                                 151, 155
DDR2      4                -175, -203, -248, -472                           149, 151, 155
GAC1      1                -659                                             151, 155
UBI4      2                -252, -655                                       151, 155
GPD1      4                -34, -286, -330, -791                            155
HAL1      1                -399                                             155
ENA1      1                -651                                             155
PTP2      2                -114, -105                                       155
TPS1      6                -239, -249, -278, -305, -359, -471               155
TPS2      5                -308, -421, -441, -490, -523                     127, 155
NTH1      3                -148, -335, -343                                 33, 104
NTH2      3                -267, -454, -459                                 33, 104

(a) Numbers with - indicate position of STRE relative to the ATG.
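The promoter elements discussed in this section are short enough to be found by direct pattern matching. The following minimal sketch scans an upstream sequence for the STRE core (CCCCT, or its reverse complement AGGGG on the opposite strand) and for the minimal HSE (GAANNTTC). Several things here are assumptions for illustration: the function names are hypothetical, the promoter string is assumed to end immediately before the A of the ATG, a hit is reported by the position of its first base on the given strand (the table does not state which base its positions refer to), and the sequences are made up rather than taken from any yeast gene.

```python
import re

STRE_CORE = "CCCCT"
STRE_CORE_REVERSE = "AGGGG"      # reverse complement of CCCCT
HSE_MINIMAL = "GAA[ACGT]{2}TTC"  # GAANNTTC, N = any nucleotide


def _scan(pattern, promoter):
    """Yield positions relative to the ATG (A = +1, so the last upstream
    base is -1); the lookahead also reports overlapping matches."""
    length = len(promoter)
    for match in re.finditer("(?=%s)" % pattern, promoter):
        yield match.start() - length


def find_stre(promoter):
    """Return sorted (position, strand) pairs for STRE core motifs."""
    promoter = promoter.upper()
    hits = [(pos, "+") for pos in _scan(STRE_CORE, promoter)]
    hits += [(pos, "-") for pos in _scan(STRE_CORE_REVERSE, promoter)]
    return sorted(hits)


def find_hse(promoter):
    """Return positions of minimal HSE matches (GAANNTTC)."""
    return list(_scan(HSE_MINIMAL, promoter.upper()))


# Made-up upstream sequences, for illustration only.
print(find_stre("TTCCCCTAAAGGGGTT"))
print(find_hse("AAGAACGTTCAA"))
```

Because a five-base motif occurs frequently by chance, hits of this kind are only candidates; the functional STREs listed in Table III rest on the experimental work cited there.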
VII. Outlook on the Biotechnological Importance of Trehalose and the Trehalases
Trehalose accumulation has important consequences for various commercial applications. The trehalose content of commercial baker's yeast is widely believed to be a critical factor for its stress resistance, and this has received much attention, especially with respect to the production of "instant active dry yeast" (160, 161). The culture conditions of commercial baker's yeast have been optimized over the years in order to obtain high trehalose content, and, at present, trehalose levels of 15-20% of the dry weight are common, with 10% considered as a critical threshold (161, 162). Accumulation of trehalose correlates with increased survival of yeast cells following dehydration or freezing. This means that fewer yeast cells are required for dehydrated or frozen dough preparations, resulting in reduced costs without compromising proofing times (162-166). Initiation of fermentation upon the mixing of nutrients with yeast cells during dough preparation triggers a rapid mobilization of trehalose and a rapid loss of freezing resistance. To minimize both processes, the preparation of frozen dough is generally carried out at low temperature (50); however, the genetic elimination of all the trehalases in baker's yeast would be a useful approach. Published data with the single Δnth1 and Δath1 mutants indicate higher freezing tolerance under pilot-scale conditions (50, 138). Furthermore, the Δath1 mutant has been shown to be more tolerant to ethanol than wild-type, making it an important strain
for brewing (138). Ethanol resistance is a critical feature of yeasts used for brewing; alcohol levels in wine and sake, for instance, are in part limited by ethanol toxicity (167, 168). The ability of trehalose to preserve and stabilize certain labile biological molecules (enzymes, antibodies, therapeutics) and foodstuffs in vitro makes it an important sugar, especially in developing tropical countries where refrigeration and power supply are unreliable (60, 64). This brings hope for solving many economic, agricultural, and health problems in these countries. Because of the increasing industrial interest in trehalose, some patents have claimed the need to increase the trehalose content of organisms by transforming them with combinations of the structural genes for trehalose synthase from yeast (169), as well as preservation of resistance to freezing by elimination or reduction of trehalose mobilization through genetic engineering of the trehalase genes (170). In Japan, efforts have been made to synthesize trehalase inhibitors with the aim of using them as fungal insecticides (171). This is a consequence of the importance of trehalose and its hydrolysis in the life function of insects (172, 173). In vertebrates, including humans, trehalase has also been found in the renal proximal tubules and can therefore be used as a diagnostic marker for some diseases of the kidney, in which case the enzyme is found in urine (174-177). Independent of this potential use of trehalase in renal diseases, the stress-responsive function of the NTH1 and NTH2 genes, similar to other stress proteins, may open new applications in managing some human diseases.
ACKNOWLEDGMENTS
We thank members of our institute for stimulating discussions and cooperation. We also thank Dr. N. Pfanner, Dr. B. Mechler, Dr. W. Heinemeyer, and H. Zähringer for critically reading the manuscript, and W. Frib and M. Burgert for help with the figures. The assistance of Dr. T. Jentsch in preparation and discussion of Figures 4 and 5 is hereby acknowledged. Work in our laboratory is supported by grants from the Deutsche Forschungsgemeinschaft (H074/27-1); the Fonds der Chemischen Industrie, Frankfurt; and the Wissenschaftliche Gesellschaft, Freiburg.
Note Added in Proof: While this work was undergoing editorial review, an article on the molecular characterization of the acid trehalase from Aspergillus nidulans appeared (178). This work showed that the acid trehalase of A. nidulans is required for growth on trehalose, similar to the acid trehalase of S. cerevisiae (35). Accordingly, sequence comparison of the two trehalases revealed significant similarity to each other. On the other hand, data from our laboratory show that the TPK genes encoding the catalytic subunit of cAMP-dependent protein kinase are required for the stability of neutral trehalase during heat stress in a cAMP-independent manner (Zaehringer et al., in preparation).
REFERENCES 1. H. A. L. Wiggers, Ann. P h r m . Poxnu 1,129 (1832). 2. M. Berthelot, Acad. Sci. Puris 46, 1276 (1858). 3. E. M. Koch and F. C. Koch, Sciace 61,510 (1925). 4. M. G. Tanret, C. R. Acud. Sci. 180,598 (1932). 5. A. D. Elbein, Ado. Carbohydrate Chern. Biochem. 30,227 (1974). 6. A. Wiemken, Ant. uan Leeuwenhoek 58,209 (1990). 2 A. Van Laere, FEMS Mimhiol. Rev. 63,201 (1989). 8. M. E. M. Bourquelot, C. R. Acad. Sci. 116,826 (1893). 9. E. Fischer, Ba: Dtsch. Chem. Ges. 28,1429 (1895). 10. J. Ruf, H. Wacker, P. James, M. Maffia, P. Seiler, G. Galand, A. v. Kiekebusch, G. Semenza, and N. Mantei,]. Biol. Chem. 265, 15034 (1990). 11. R. Bergoz, Gastroenterology 60,909 (1971). 12. J. Madzarovova-Nohejlova, Gastroenterology 65,130 (1973). 13. J. M. Thevelein, Microbiol. Rev. 48,42 (1984). 14. J. Londesborough and 0.E. Vuorio,]. Gen. Microbiol. 137,323 (1991). 15. J. Londesborough and 0. E. Vuorio, Eur. J. Biochem. 216,841 (1993). 16. W. Bell, P. Klaasen, M. Ohnacker, T. Boller, M. Henveijer, P. Schoppink, P. van der Zee, and A. Wiemken, Eur. 1.Biochem. 209,951 (1992). 1% M. I. Gonzales, H. Stucka, H. Feldmann, and C. Gancedo, Yeast 8,183 (1992). 18. C. De Virg&o, N. Biirkert, W. Bell, T. Boller, and A. Wiemken, Eur. J. Biochem. 212,315 (1993). 19. L. Van Aelst, S. Hohmann, B. Bulaya, W. de Koning, L. Sierkstra, M. J. Neves, K. Luyten, R. Alijo, J. Ramos, P. Cocceti, E. Martegani, N. M. de Magahaes-Rochas,R. L. Brandao, P. Van Dijck, M. Vanhalewyn, P. Dumez, A. W. H. Jans, and J. M. Thevelein, Mol. Microbiol. 8,927 (1993). 20. 0.E. Vuorio, N. Kalkkinen, and J. Londesborough, Eur. ]. Biochem. 216,849 (1993). 21. P. Van Solingen and J. B. Van der Plaat, Biochem. Biophys. Res. Cummun. 62,553 (1975). 22. F. KeUer, M. Schellenberg, and A. Wiemken, Arch. Microbiol. 131,298 (1982). 23. J. Londesborough and K. Varimo, Biochem.].219,511 (1984). 24. H. App and H. Holzer,]. Biol. Chem. 264,17583 (1989). 25. M. Kopp, H. Miiller, and H. Holzer,]. Biol. Chem. 268,4766 (1993). 
26. M. Kopp, S. Nwaka, and H. Holzer, Gene 150,403 (1994). 22 K. Mittenbiihler and H. Holzer,]. Biol. Chem. 263,8537 (1988). 28. M. Destruelle, H. Holzer, and D. Klionsky, Mol. Cell. Biol. 14,2740 (1994). 29. M. Destruelle, H. Holzer, and D. Klionsky, Yeast 11,1015 (1995). 30. K. H. Wolfe and A. J. E. Lohan, Yeast 10, S41(1994). 31. N. 0.Souza and A. D. Panek, Arch. Biochem. Biophys. 125,22 (1968). 32. S. Nwaka, M. Kopp, M. Burgert, I. Deuchler, I. Kienle, and H. Holzer, FEBS Lett.344,225 (1994). 33. S. Nwaka, M. Kopp, and H. Holzer, J. Biol. Chern. 270,10193 (1995). 34. I. Kienle, M. Burgert, and H. Holzer, Yeast 9,607 (1993). 35. S. Nwaka, B. Mechler, and H. Holzer, FEBS Lett. 386,235 (1996). 36. W. E. Trevelyan, D. P. Proctor, and J. S. Harrison, Nature (London) 166, 444 (1950). 32 S. H. Lillie and J. R. Pringle,]. Bactaiol. 143,1384 (1980). 38. T. Hottiger, P. Schmutz, and A. Wiemken, J. Bacteriol. 169,5518 (1987). 39. P. S. Araujo, A. C. Panek, R. Ferreira, and A. d. Panek, Anal. Biochem. 176,432 (1989). 40. E. Bemt and R. Lachernicht, in “Methoden der Enzymatischen Analyse” (H. U. Bergmeyer, ed.) Vol. 2, p. 1260. Verlag Chemie, Weinheim/Bergstrasse,West Germany, 1974.
41. A. L. S. Zimmermann, H. F. Terenzi, and J. A. Jorge, Biochim. Biophys. Acta 1036, 41 (1990).
42. M. T. Küenzi and A. Fiechter, Arch. Mikrobiol. 64, 154 (1969). 43. M. T. Küenzi and A. Fiechter, Arch. Mikrobiol. 84, 254 (1972). 44. V. E. Chester, Biochem. J. 86, 153 (1963). 45. A. Panek, Arch. Biochem. Biophys. 98, 349 (1962). 46. A. D. Panek and J. R. Mattoon, Arch. Biochem. Biophys. 183, 306 (1977). 47. S. M. Kane and R. Roth, J. Bacteriol. 118, 8 (1974). 48. A. Panek, Arch. Biochem. Biophys. 100, 422 (1963). 49. J. Francois, M. J. Neves, and H. G. Hers, Yeast 7, 575 (1991). 50. P. Van Dijck, D. Colavizza, P. Smet, and J. M. Thevelein, Appl. Environ. Microbiol. 61, 109 (1995).
51. B. Elliot and B. Futcher, Yeast 9, 33 (1993). 52. T. Hottiger, T. Boller, and A. Wiemken, FEBS Lett. 220, 113 (1987). 53. P. V. Attfield, FEBS Lett. 225, 259 (1987). 54. K. Winkler, I. Kienle, M. Burgert, J. C. Wagner, and H. Holzer, FEBS Lett. 291, 261 (1991). 55. M. J. Neves and J. Francois, Biochem. J. 288, 859 (1992). 56. K. Fujita, H. Iwahashi, O. Kodama, and Y. Komatsu, Biochem. Biophys. Res. Commun. 216, 1041 (1995).
57. Y. Sanchez and S. Lindquist, Science 248, 1112 (1990). 58. J. C. Argüelles, FEBS Lett. 350, 266 (1994). 59. W. Seufert and S. Jentsch, EMBO J. 9, 543 (1990). 60. J. H. Crowe, L. M. Crowe, and D. Chapman, Science 223, 701 (1984). 61. C. Colaco, S. Sen, M. Thangavelu, S. Pinder, and B. Roser, Bio/Technology 10, 1007 (1992). 62. J. H. Crowe, J. F. Carpenter, L. M. Crowe, and T. J. Anchordoguy, Cryobiology 27, 219 (1990).
63. B. J. Roser, Biopharmacology 5, 44 (1991). 64. B. J. Roser, Trends Food Sci. Technol. 2, 166 (1991). 65. D. Blakeley, B. Tolliday, C. Colaco, and B. Roser, Lancet 336, 854 (1990). 66. T. Hottiger, C. De Virgilio, M. N. Hall, T. Boller, and A. Wiemken, Eur. J. Biochem. 219, 187 (1994).
67. M. J. Burke, in “Membranes, Metabolism and Dry Organisms” (C. Leopold, ed.) p. 358. Cornell University Press, Ithaca, NY,1985. 68. J. S. Clegg, in “Membranes, Metabolism and Dry Organisms” (C. Leopold, ed.), p. 169. Cornell University Press, Ithaca, NY,1985. 69. J. L. Green and C. A. Angell,]. Phys. Chem. 93,2880 (1989). 70. H. Levine and L. Slade, Biophafinocology 5,36 (1992). 71. L. M. Higa and C. Z. Womersley,]. Exp. Zool. 267,120 (1993). 72. K. Myrback and B. Ortenblad, Biochem. Z. 291,61 (1937). 73. G. Avigad, Biochem. J. 97,715 (1965). 74. J. B. van der Plaat, Biochem. Biophys. Res. Commun. 56,580 (1974). 75. A. Wiemken and M. Schellenberg, FEBS Lett. 150,329 (1982). 76. I. Uno, K. Matsumoto, K. Adachi, and T. Ishikawa,]. Biol. C h a . 258,10867 (1983). 7% C. H. Ortiz, J. C. C. Maia, M. N. Tenan, G . R.Braz-Padrao,J. R.Mattoon, and A. D. Panek, J. Boctaiol. 153,644 (1983). 78. R. B. Trimble and F. Maley,]. Bwl. Chem. 252,4409 (1977). 79. R.S. Williams, R.J. Tmmbly, R. MacColl, R.B. Trimble, and F. Maley,]. Biol. Chem. 260, 13334 (1985).
80. P. J. Kelley and B. J. Catley, Anal. Biochem. 72, 353 (1976). 81. J. Slavik, FEBS Lett. 140, 22 (1982).
82. J. A. den Hollander, K. Ugurbil, T. R. Brown, and R. G. Shulman, Biochemistry 20,5871 (1981). 83. F.Meussdoerffer,P. Tortora, and H. Holzer,]. Biol. Chem. 255,12087 (1980). 84. S. Aibara, R. Hayashi, and T.Hata, A@. Bwl. C h . 35,658 (1971). 85. G. Metz and K. Rohm, Biochim. Biophys. Acta 429,933 (1976). 86. K. Mittenbiihler and H. Holzer, Arch. Microbial. 155,217 (1991). 8Z D. J. Klionsky, L. M. Banta, and S. D. Emr, Mol. Cell. Biol. 8,2105 (1988). 88. D. J. Klionsky and S. D. Emr, EMBO]. 8,2241 (1989). 89. T. Stephens, B. Esmon, and R. Schekman, CeZl30,439 (1982). 90. T.Achstetter, 0. Emter, C. Ehmann, and D. H. Wolf,]. Biol. Chem. 259,13334 (1984). 91. J. A. van Assche and A. R. Carlier, Biochim.Biophys. A& 391,154 (1975). 92. G. M. Dellamora-Ortiz, C. H. D. Ortiz, J. C. C. Maia, and A. D. Panek, Arch. Biochem.Bwphys. 251,205 (1986). 93. H. App and H. Holzer, Z. Lebensm. Untets. Forsch. 181,276 (1985). 94. J. M. Thevelein and S. Hohmann, TIBS 20,3 (1995). 95. J. H. Rothman, C. P. Hunter, L. A. Vall, andT. H. Stevens, Proc. Nutl. Acad. Sci. U.S.A.83, 3248 (1986). 96. T. H. Stevens, J. H. Rothman, G. S. Pagne, and R. Schekman,]. Cell Bwl. 102,1551 (1986). 97. M. Carlson and D. Botstein, Cell 28,145 4982). 98. P. F. San Miguel and J. C. Argueles, Biochim. Biophys. Ada 1200,155 (1994). 99. S. Nwaka, Ph.D. Thesis, Faculty of Biology, University of Freiburg, Germany (1995). 100. G. von Heijne, Nwleic A& Res. 14,4683 (1986). 101. D. J. Klionsky, R. Cueva, and D. S. Yaver,]. Cell Biol. 119,287 (1992). 102. S. D. Harris and D. A. Cotter, Can.]. Microbial. 34,835 (1988). 103. P. Alizadeh and D. J. Klionsky, FEBS Lett.391,273 (1996). 104. S. Nwaka, B. Mechler, M. Destruelle, and H. Holzer, FEBS Lett.360,286 (1995). 105. C. Dulic and H. Rieman, EMBO]. 8,1349 (1989). 106. H. Riezman, Cell 40,1001 (1985). 107. B. U. Stambuk, P. S. de Araujo, A. D. Panek, and R. Serrano, Eur.]. Biocha. 237,876 (1996). 108. J. Tschopp, P. C. Esmon, and R. Schekman,]. Boderiol. 
160,966 (1984). 109. C. L. Holcomb, W. J. Hansen, T Etcheveny, and R. Schekmann,]. Cell BwZ. 106, 641 (1988). 110. R. Serrano, in “The Molecular and Cellular Biology of the Yeast Saccharomyces cereuisiae” (J. R. Broach, J. Pringle, and E. Jones, eds.), Vol. 1, p. 523. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1991. 111. C. Gutierrez,M. Ardourel, E. Bremer, A. Middenhorf, W. Boos, and U. Ehmann, Mol. Gen. Genet. 217,347 (1989). 112. J. Rittenhouse, P. B. Harrsch, J. N. Kim, and D. Mecke,]. Bwl. C h 261,3939 (1986). 113. C. Mann and R. W. Davis, Mol. Cell. Bwl. 6,241 (1986). 114. J. 0.Nehlin and H. Ronne, EMBOJ. 9,2891 (1990). 115. J. J. Mercado, 0.Vincent, and J. M. Gancedo, FEBS Lett. 291,97 (1991). 116. D. W. Griggs and M. Johnston, h c . Nud. A d . Sci. U.S.A.88,8597 (1991). 117. M. Takiguchi, T. Niimi, Z. Su, and T. Yaginuma, B h c h . I. 288,19 (1992). 118. K. D. Entian, R. F.Vogel, M. Rose, L. Hoffman, and D. Mecke, FEBS Lett. 236, 195 (1988). 119. D. T. Rogers, E. Hiller, L.Mitstock, and E. Orr,]. Biol. Chem. 263,6051 (1988). 120. E. Kopetzki, K. D. Entian, F. Lottspeich, and D. Mecke, Biochim.Biophys. Ada 912,398 (1987). 121. W. H. Mager and P. Moradas Ferreira, B i o c h . ] . 290,1(1993).
122. M. J. Miller, N. H. Xuong, and E. P. Geiduschek, Proc. Natl. Acod. Sci. U.S.A. 76, 5222 (1979). 123. S. Lindquist, Nature (London) 293,311 (1981). 124. J. Plesset, J. J. Foy, L. L. Chia, and L. S. McLaughlin, in “Interaction of Translation and Transcriptional Controls in the Regulation of Gene Expression” (M. Grunberg-Manago and B. Safer, eds.), p. 495. Elsevier, New York, 1982. 125. M. Wemer-Washbume,J. Becker,J. Kosic-Smithers, and E. A. Craig,]. Bodetiol. 171,2680 (1989). 126. M. Werner-Washbum,E. Braun, G. C. Johnston, and R. A. Singer,Microbwl. Rev. 57,4683 (1993). 12% N. Guonalaki and G. Thireos, EMBO]. 13,4036 (1994). 128. J. Monod, Growth 11,223 (1947). 129. B. Magasanik, Cold Spring Harbor Symp. Quunt. Biol. 26,249 (1961). 130. H. Holzer, Cell Bid. Reo. 212,283 (1989). 131. M. Johnston and M. Carlson, in “The Molecular and Cellular Biology of the Yeast Saccharomyces cmevisiae” (E. Jones, J. Pringle, and J. Broach, eds.), Vol. 2, p. 193. Cold Spring Habor Laboratory Press, Cold Spring Harbor, NY, 1992. 132. M. Johnston, J. S . Flick, and T.Pexton, Mol. Cell. Biol. 14,3834 (1994). 133. H. Holzer, Trends Biochem. Sci. 1,178 (1976). 134. C. E. Deutch and J. M. Parry,]. Gen. Microbiol. 80,259 (1974). 135. H. Boucherie,]. Bacteriol. 161,385 (1985). 136. U. Teichert, B. Mechler, H. Miiller, and D. H. Wolf,]. Biol. Chem. 264,16037 (1989). 13% Y. Sanchez,J. Taulien, K. A. Borkovich, and S . Lindquist, EMBOJ. 11,2357 (1992). 138. J. Kim, P. Alizadeh, T. Harding, A. Hefner-Gravik, and D. J. Klionsky, Appl. Enoiron. Microbiol. 62, 1563 (1996). 139. J. B. Gibbs and M. Marshall, Microbiol. Reo. 53,171 (1989). 140. J. M. Thevelein, Mol. Microbiol. 5, 1301 (1991). 141. C. Punvin, F. Leidig, and H. Holzer, Biochem. Biophys. Res. Commun. 107,1482 (1982). 142. P. Eraso and J. M. Gancedo, Eur. 1.Bwchem. 14,195 (1984). 143. J. M. Gancedo, Eur.]. Biochem. 206,297 (1992). 144. B. M. Bonini, M. J. Neves, J. A. Jorge, and H. F. Terrenzi, Biochim. Biuphys. A& 1245,339 (1995). 
145. E. C. A. Eleutherio, P. S. Araujo, and A. D. Panek, Biochirn. Biophys. Acta 1156,263 (1993). 146. P. Rosseau, H. 0.Halvorson,L. A. Bulla, Jr., and G. St. Julian,]. Bacteriol. 109,1232 (1972). 14% C. DoniN, P. P. Puglisi, A. Vecli, and N. Marmiroli,]. Bactaiol. 170,3789 (1988). 148. S. L. Campbell-Burk and R.G. Shulman, Annu. Rm. Microbiol. 41,595 (1987). 149. N. Kobayashi and K. McEntee, Mol. Cell. Biol. 13,248 (1993). 150. G. Marchler, C. Schiiller, G. Adam, and H. Ruis, EMBOJ. 12,1997 (1993). 151. C. Schiiller,J. L. Brewster, M. R.Alexander, M. C. Gustin, and H. Ruis,EMBO]. 13,4382 (1994). 152. P. K. Sorger and H. R. B. Pelham, EMBOJ. 6,3035 (1987). 153. G. Weiderrecht, D. Shuey, W. Kibbe, and C. Parker, Cell 48,507 (1987). 154. P. K. Sorger and H. C. M. Nelson, Cell 59,807 (1989). 155. W. H. Mager and A. J. J. De Kruijff,Microbial. Rev. 59,506 (1995). 156. S. A. Leonhardt K. Fearon, P. N. Danese, and T. L. Mason, Mol. Cell. Biol. 13,6304 (1993). 15% E. A. Craig,J. Kramer,J. Shilling,M. Werner-Washbume,S. Holmes,J. Kosic-Smithers, and C. M. Nicolet, Mol. Cell. Biol. 9,3000 (1989). 158. T. Hottiger, C. De Virgiho, W. Bell, T. Boller,.and A. Wiemken, Eur. J. Biochem. 210,125 (1992). 159. P. W. &per, FEMS Microbwl. Reu. 11,339 (1993).
160. G. E. Pollock and C. D. Holmstrom, Cereal Chem. 28, 498 (1951). 161. N. B. Trivedi and G. Jacobson, Prog. Indust. Microbiol. 23, 45 (1986). 162. P. Gelinas, G. Fiset, A. LeDuy, and J. Goulet, Appl. Environ. Microbiol. 55, 2453 (1989). 163. G. M. Gadd, K. Chalmers, and R. H. Reed, FEMS Microbiol. Lett. 48, 249 (1987). 164. K. Lorenz, Baker's Dig. 48, 14 (1974). 165. Y. Oda, K. Uno, and S. Ohta, Appl. Environ. Microbiol. 52, 941 (1986). 166. G. W. Sanderson, Cereal Foods World 30, 770 (1985). 167. M. J. Bekers, B. E. Dambertga, I. J. Krause, E. J. Ventina, and J. G. Kontakevich, in "Current Developments in Yeast Research" (G. G. Stewart and I. Russel, eds.), p. 117. Pergamon Press, Toronto, 1981. 168. J. J. C. Mansure, A. D. Panek, L. M. Crowe, and J. H. Crowe, Biochim. Biophys. Acta 1191, 309 (1994). 169. J. Londesborough and O. Vuorio, International patent application PCT/FI93/00049 (1993). 170. M. Driessen, K. A. Osinga, and M. A. Herweijer, European patent application 91200686.3 (1991). 171. S. Ogawa and C. Uchida, J. Chem. Soc. Perkin Transaction 1, 1939 (1992). 172. G. R. Wyatt, Adv. Insect Physiol. 4, 287 (1967). 173. A. Becker, P. Schloder, J. E. Steele, and G. Wegener, Experientia 52, 433 (1996). 174. S. J. Berger and B. Sacktor, J. Cell Biol. 47, 637 (1970). 175. M. Nakano, J. Histochem. Cytochem. 30, 1243 (1982). 176. T. Niwa, T. Katsuzaki, T. Yazawa, N. Tatemichi, Y. Miyazaki, and K. Maeda, Nephron 63, 423 (1993). 177. G. Semenza, Annu. Rev. Cell Biol. 2, 255 (1986). 178. C. d'Enfert and T. Fontaine, Mol. Microbiol. 24, 203 (1997).
Molecular and Structural Features of the Proton-Coupled Oligopeptide Transporter Superfamily

YOU-JUN FEI, VADIVEL GANAPATHY, AND FREDERICK H. LEIBACH¹
Department of Biochemistry and Molecular Biology
Medical College of Georgia
Augusta, Georgia 30912-2100
I. Two Different Peptide Transporter Subfamilies: A Comparison between the Members of the ABC Peptide Transporter Subfamily and the POT Subfamily
II. Molecular Cloning Procedures Employed for Identification of the POT Family Members
A. Expression Cloning
B. Cloning by Homologous Hybridization
C. Cloning by Use of Degenerate Oligonucleotides and RT-PCR
D. Cloning by Functional Complementation
III. Comparison of Amino Acid Sequences of the Members of the POT Family
IV. Topological Features of the POT Subfamily
V. Conclusion
References
Work in the area of molecular biology of transport proteins has unveiled the presence of a distinct peptide transporter superfamily whose members extend from the prokaryotic to the eukaryotic kingdom. There are two subgroups within this superfamily, one subgroup harnessing the energy necessary for active transport from a transmembrane H+ gradient and the other subgroup relying directly on ATP hydrolysis. In addition to the use of different driving forces, the two subgroups are also distinguishable with regard to molecular structure and operational mechanism. This review is intended to analyze critically the molecular nature of the members of the H+ gradient-dependent peptide transporter subgroup, with emphasis on the cloning strategies utilized in the isolation of the individual transporter cDNAs or genes; on the structural patterns, motifs, and conserved amino acid residues common to constituent members of the subgroup; and on the characteristic topological features of the individual members. © 1998 Academic Press

¹ To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 58. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
All living cells are separated from their external environment by lipid bilayer membranes. Selective translocation of specific molecules across the membrane to acquire necessary nutrients and regulatory elements, and to discharge metabolic waste products, is mandatory for cell survival and growth. Peptide transport across the membrane, one of many transmembrane activities, is mediated by specific integral membrane proteins (carriers) and has been demonstrated to be a widely distributed phenomenon throughout nature in both prokaryotes and eukaryotes, such as bacteria, yeast, plants, and animals (1-8). Peptides may serve as the sole source to satisfy carbon and/or nitrogen requirements for some of these species. In bacteria, the peptide transport system is also involved in the recycling of cell wall peptides released from peptidoglycan during growth (9), in addition to the function of transporting nutrient peptides. It is also involved in sporulation and chemotaxis (10-12). In plants, peptide transport systems play an important role in providing peptides as a source of amino acids and nitrogen for development and growth during germination (3). In animals, including humans, peptide transport has been demonstrated unequivocally in the intestine and the kidney (13-21). The peptide transport system in the mammalian intestinal epithelial cells is responsible for the absorption of the digestion end products of dietary proteins, mainly di- and tripeptides. The system is unusual in that its driving force is derived from an inwardly directed transmembrane proton gradient. A closely related H+-dependent peptide transport system expressed in the mammalian kidney is responsible for the tubular uptake of filtered di- and tripeptides. In the past several years, research in the area of peptide transport has attracted considerable attention due to the fascinating functional features of the peptide transport process.
In addition to its natural substrates, the mammalian peptide transport system is also capable of transporting several pharmacologically active compounds, including β-lactam antibiotics, the antitumor agent bestatin, angiotensin-converting enzyme inhibitors, and renin inhibitors (22-24). The interaction of the intestinal peptide transport system with these compounds is therefore considered a major determinant of their therapeutic efficacy after oral administration. The potential for nutritional, agricultural, clinical, and therapeutic applications of the proton-coupled peptide transport process necessitates a clear understanding of the biochemical and molecular aspects of the transport proteins responsible for this process.
I. Two Different Peptide Transporter Subfamilies: A Comparison between the Members of the ABC Peptide Transporter Subfamily and the POT Subfamily

The ABC (for ATP-binding cassette) transporter superfamily is ubiquitous, present in both prokaryotic and eukaryotic kingdoms, and is involved in many important biological processes, such as transport of nutrients (e.g., amino acids, sugars, and oligopeptides) and ions (e.g., Cl-, K+), secretion of metabolic waste products, antigen presentation, and development of multiple drug resistance in cancer cells (25, 26). The members of the ABC transporter superfamily operate by directly “burning” ATP as the driving force to pump substrates across the plasma membrane against a concentration gradient. These integral membrane proteins contain a hydrophobic 12-transmembrane-segment profile. These transporters possess two nucleotide-binding domains, called ATP-binding cassettes, which interact with ATP. Although some prokaryotic ABC transporters are composed of an assembly of multiple protein subunits in the form of a homodimer, a heterodimer, or even a tetramer, a typical ABC transporter, in general, consists of four membrane-associated domains: two integral hydrophobic transmembrane domains and two peripherally located cytoplasmic domains. Each of the two transmembrane domains consists of six membrane-spanning segments to form a “two-times-six” structure, which constitutes the pathway for transferring substrates across the membrane. The other two peripheral domains, located at the cytoplasmic surface of the plasma membrane, are responsible for binding ATP and coupling ATP hydrolysis to the transport process. Several peptide transport systems found in bacteria and the peptide transport system involved in the transport of antigenic peptides in antigen-presenting cells in animals are members of this ABC transporter superfamily.
The bacterial peptide transporters that belong to this superfamily are present in the plasma membrane and operate in coordination with peptide-binding proteins in the periplasm (Fig. 1). The animal cell peptide transporter that is associated with antigen presentation is present not in the plasma membrane but in the endoplasmic reticulum. In the present review, only the former are included as the members of the ABC peptide transporter subfamily. The members of the proton-coupled oligopeptide transporter (POT) subfamily are all present in the plasma membrane and are found not only in bacteria but also in yeast, plants, and animals. Despite the similarity in substrate specificity between the ABC peptide transporter subfamily and the POT subfamily, these two groups are essentially unrelated from a structural and operational point of view. The most fundamental difference lies in the energetics. In contrast to the ABC peptide transporter family members, which rely upon ATP hydrolysis for their driving force, the members of the POT family are not directly energized by ATP hydrolysis, but driven by a transmembrane electrochemical proton gradient (Fig. 1). This proton motive force is generated by different mechanisms, depending upon the individual cell type: either proton pumps coupled to the electron transport chain in the case of bacterial cells or a Na+-H+ exchanger in the case of animal cells. From the structural point of view, unlike the ABC peptide transporter subfamily members, the POT subfamily members do not contain nucleotide-binding domains. There is also substantial difference in substrate specificity between these two groups. The ABC peptide transporters exhibit substrate specificity that is more restricted than that of the POT family members. Individual ABC peptide transporters handle only a narrow range of peptides. Some of them recognize specific di- or tripeptides, and others deal with larger oligopeptides and even proteins. In contrast, the POT family members have a broad spectrum of substrate specificity and are usually capable of handling a wide variety of small peptides, mainly di- and tripeptides and, to a lesser degree, tetrapeptides, regardless of the composition, charge, and hydrophobicity of the amino acid residues in the peptide substrates. However, the chain length of peptide substrates for the POT family members does not extend beyond tetrapeptides.

FIG. 1. Operational models for the ABC peptide transporter subfamily and the POT subfamily. This cartoon is intended to demonstrate the structural and operational differences between the two subfamilies of peptide transporters: the ABC peptide transporter subfamily (top panel) and the POT subfamily (bottom panel). In the case of the ABC peptide transporter subfamily, the two larger wheels represent the two hydrophobic regions, each consisting of six transmembrane domains, and the two smaller wheels represent the two nucleotide-binding domains, which are hydrophilic and are located on the cytoplasmic surface. The nucleotide-binding domains bind and hydrolyze ATP to provide the driving force for the transport process. This transporter subfamily operates in coordination with specific binding proteins (BP) that bind peptide substrates and make them available to the transporter. In the case of the POT subfamily, the driving force is a transmembrane H+ gradient, generated by the Na+-H+ exchanger.

II. Molecular Cloning Procedures Employed for Identification of the POT Family Members

cDNA cloning of functional cellular proteins remains one of the most arduous challenges in the contemporary molecular biology field, especially when the transcripts from the genes are present in low abundance. In recent years, several investigators have used different molecular biological techniques to clone more than a dozen members of the proton-dependent oligopeptide transporter (POT) family. Here we adopt the nomenclature of Paulsen and Skurray (27), who initially coined the term “POT” for this family of transport proteins. This section provides a brief description of the various experimental approaches utilized in the cloning of the POT family members.
A. Expression Cloning

Xenopus laevis oocytes possess the complete set of translational machinery (e.g., ribosomes, enzymes, transfer RNAs, and translation cofactors) for subsequent use in early embryonic development. Xenopus oocytes are widely used for heterologous expression of cRNAs encoding cytosolic enzymes, immunoglobulins, membrane receptors, ion channels, ion pumps, and membrane transporters. The widespread use of the oocyte expression system is based upon the fact that oocytes can not only translate exogenous RNA from various sources efficiently and faithfully, but also carry out posttranslational modifications, such as precursor processing, phosphorylation, glycosylation, intracellular trafficking and protein targeting into the plasma membrane, complex subunit assembly, and even export of secretory proteins. One of the most important aspects is that expressed foreign proteins always preserve their native biological activities, thus allowing functional studies of the expressed proteins. The system is also suitable for functional analysis by a variety of electrophysiological techniques due to the large size of oocytes, which permits introduction of electrodes into the cell with considerable ease. The expression cloning approach does not require nucleotide or amino acid sequence information for the desired genes or gene products and can, theoretically, be applied to clone any gene encoding a protein of detectable function. Microinjection of size-fractionated mRNA from the tissue sources, followed by functional assays, narrows down the potential target mRNAs to a smaller subpopulation. This particular subpopulation of mRNA can then be used to construct an expression cDNA library for subsequent screening using the same oocyte expression technique. This approach effectively reduces the size of the cDNA library to be screened and increases the probability of successful isolation of the target cDNA.
This strategy has been successfully employed in cloning the proton-coupled peptide transporter from rabbit intestine (PEPT1) (28). The same strategy was applied by different investigators to successfully clone rabbit PEPT1 and rabbit PEPT2 from the rabbit intestinal and renal cDNA libraries, respectively (29, 30).
B. Cloning by Homologous Hybridization

Functional similarity among related transporter systems in certain functions, such as substrate specificity and driving forces, co-transported inorganic ion species, and electrogenic nature, may imply the existence of considerable homology among the proteins responsible for these transport systems in terms of the amino acid sequences and therefore in terms of the nucleotide sequences of the corresponding mRNAs. This rationale has allowed human intestinal PEPT1 and renal PEPT2 (31, 32), and rat intestinal PEPT1 (33) to be cloned by screening the respective cDNA library, using as the probe the rabbit PEPT1 cDNA either derived from the parental cDNA, as for hPEPT1 and hPEPT2, or generated by reverse transcription-polymerase chain reaction (RT-PCR), as for rat PEPT1. For PCR-generated cDNA probes, targeted cDNA fragments are usually chosen in a region with relatively higher similarity in the parental template cDNA if several homologous sequences are known. In the case of the mammalian peptide transporter family members, the amino acid sequences in the first four to five membrane-spanning regions show much higher homology compared to other regions.
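The choice of a probe region "with relatively higher similarity" can be illustrated with a sliding-window identity scan over two sequences that have already been aligned. The following is a minimal sketch under stated assumptions: the function names, the window width, and the toy sequences are hypothetical and are not taken from the studies cited above, and simple percent identity stands in for whatever similarity measure the original investigators used.

```python
def window_identity(seq_a, seq_b, width):
    """Fraction of identical positions in every window of two pre-aligned sequences.

    Gap characters ("-") never count as a match. Both sequences must have the
    same (aligned) length. Returns one score per window start position.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if width <= 0 or width > len(seq_a):
        raise ValueError("window width must be between 1 and the alignment length")
    matches = [x == y and x != "-" for x, y in zip(seq_a, seq_b)]
    running = sum(matches[:width])          # matches in the first window
    scores = [running / width]
    for i in range(width, len(matches)):    # slide the window one column at a time
        running += matches[i] - matches[i - width]
        scores.append(running / width)
    return scores


def best_window(seq_a, seq_b, width):
    """Return (start, identity) of the first window with the highest identity."""
    scores = window_identity(seq_a, seq_b, width)
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]


# Toy, hypothetical aligned sequences (not real transporter cDNA).
toy_a = "AAAAACCCCCGGGGG"
toy_b = "AAAAATTTTTGGGTG"
start, identity = best_window(toy_a, toy_b, width=5)
print(f"best 5-column window starts at column {start} with identity {identity:.2f}")
```

In practice the window would be much wider than five columns and the input would come from an alignment of the known homologs; the sketch only shows the bookkeeping.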
C. Cloning by Use of Degenerate Oligonucleotides and RT-PCR

This technique has been used to clone rat PEPT1 and PEPT2 (34, 35). Degenerate pools of short oligonucleotides that can code for a given stretch of amino acids have been used as polymerase chain reaction (PCR) primers to isolate some tissue-specific cDNA clones, especially for mammalian plasma membrane transporter genes. For instance, degenerate oligonucleotides corresponding to regions of high sequence identity between the norepinephrine and the GABA transporters have been successfully used to generate probes for screening cDNA libraries, leading to cloning of other members of the (Na+, Cl-)-coupled neurotransmitter transporter superfamily (36, 37). During the cloning work on rat PEPT1 and PEPT2, the degenerate oligonucleotides were designed based upon the cDNA sequence information of rabbit PEPT1. Locations of the oligonucleotide primers that were used in these studies correspond to the amino acid sequences in TM1 (transmembrane domain 1) and TM4, where the amino acid similarity is conserved across several species. The mRNA from “provider” tissues, the rat intestine or kidney, was reverse transcribed into cDNA, which was taken as a template for the subsequent polymerase chain reaction (RT-PCR) using the designed degenerate primers to obtain the tissue-specific double-stranded DNA fragments. The resultant DNA fragment was subcloned and sequenced to verify its identity. This cDNA represented a small fragment of the rat intestinal or renal peptide transporter cDNA. The fragment was then used to screen the rat intestinal or kidney cDNA library to isolate the full-length clone.
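To make the idea of a "degenerate pool" concrete, the sketch below counts and enumerates every oligonucleotide that can encode a short stretch of amino acids under the standard genetic code. It is an illustration only: the peptide `MWFK` is a hypothetical example, not a sequence from TM1 or TM4 of any PEPT protein, and the function names are made up for this sketch. Real primer design would also weigh codon usage, primer length, and melting temperature, none of which is modeled here.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest). "*" marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter amino acid code to the list of codons that encode it.
CODONS_FOR = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(b1 + b2 + b3)


def degeneracy(peptide):
    """Number of distinct oligonucleotides that can encode the peptide."""
    count = 1
    for aa in peptide:
        count *= len(CODONS_FOR[aa])
    return count


def primer_pool(peptide):
    """Enumerate every coding-strand oligonucleotide that encodes the peptide."""
    codon_choices = (CODONS_FOR[aa] for aa in peptide)
    return ["".join(codons) for codons in product(*codon_choices)]


# Hypothetical peptide chosen only because its pool is small enough to print.
example = "MWFK"
print(degeneracy(example))
for oligo in primer_pool(example):
    print(oligo)
```

Stretches rich in Met and Trp (one codon each) give small pools, whereas Leu, Ser, and Arg (six codons each) inflate the pool quickly, which is one reason conserved regions with low codon degeneracy are attractive primer sites.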
D. Cloning by Functional Complementation

1. THE PRINCIPLE OF FUNCTIONAL COMPLEMENTATION CLONING STRATEGY

First, a host strain deficient in peptide transport activity has to be established or selected by genetic recombination knockout or physical irradiation. The defective host strain cannot grow in a selective medium that provides the auxotroph with the required amino acid in the form of a dipeptide as the sole source. A suitable genomic DNA library or cDNA library that contains the target peptide transporter DNA or cDNA in a transforming plasmid is then used to prepare DNA, which is then introduced into the peptide transport-deficient host strain by a chemical transformation method or by infection (e.g., by infecting a host plant through a bacterium harboring the transforming plasmid). The transformants are subsequently subjected to growth selection and screened in a discriminating environment by assaying whether the deficient function of the host strain has been corrected by the introduced gene contained in the transforming vector. Survivors among the transformants in the selective environment are the recipients of the sought-for gene. The gene to be cloned is thereby identified and isolated.
2. CHL1 GENE CLONING METHOD

First, a chlorate-nitrate transport-deficient Arabidopsis thaliana mutant, chl1-5, was created by gamma-irradiation of the wild-type plant (38). The phenotype of the mutant was revealed by its resistance to the herbicide chlorate. The CHL1 gene was isolated from a genomic library of Arabidopsis thaliana by genetic complementation of the mutant chl1-5, conferred by a genomic DNA fragment, 11 kb in length, from the wild-type plant. A plasmid containing the 11-kb DNA fragment, in which the CHL1 gene was embedded, was first introduced into Agrobacterium tumefaciens. This bacterial strain containing the CHL1 gene was used to infect the root explant of the mutant chl1-5. The resultant complementation was demonstrated by sensitivity of the mutant plant to the herbicide chlorate. The CHL1 gene was thus identified and isolated.

3. DTPT GENE CLONING METHOD

Lactococcus lactis possesses a di- and tripeptide transporter with a broad substrate specificity. This transporter is different from the binding-protein-dependent, ATP-driven ABC peptide transporter. This transporter is energized by the electrochemical proton gradient across the membrane and thus belongs to the POT family. The DtpT system was cloned by complementation of a dipeptide transport-deficient and proline auxotrophic Escherichia coli strain (39). Lactococcus lactis chromosomal DNA was partially digested and ligated into an expression vector pTAQI. The newly created plasmid DNA, containing numerous L. lactis genomic DNA fragments, was then used to transform the peptide transport-deficient E. coli strain E1772, which was spread on a selective medium containing a dipeptide (Pro-Gly) as sole source of proline. Only those colonies of the transformed E. coli E1772 that are functionally complemented by the di-tripeptide transport gene of L. lactis can survive on this selective medium. The DNA fragment containing the DtpT gene in the cloning vector was narrowed down by restriction endonuclease digestion and subsequent screening for the ability to functionally complement the E. coli strain E1772 on the same selective medium. The DtpT gene of L. lactis was trimmed down to ~3.5 kb, in which an open reading frame (ORF) of 1389 bp encoding a protein (DtpT) of 463 amino acids was found. Afterward, a stable dipeptide transport-deficient mutant L. lactis strain (AG300) was established by deleting the DtpT gene from the chromosome via homologous recombination. The phenotype of the DtpT deletion strain, characterized by growth experiments, showed resistance to a toxic dipeptide analog and failure to grow on a medium containing dipeptide as sole source of certain amino acids. These experiments demonstrated that the gene DtpT is responsible for the di- and tripeptide transport activity in Lactococcus lactis (39).

4. NTR1 GENE CLONING METHOD

Heterologous complementation in yeast mutants was used to clone the peptide transporter gene NTR1 of Arabidopsis thaliana (40). A mutant Saccharomyces cerevisiae strain defective in the histidine permease gene was transformed with an expression cDNA library derived from Arabidopsis seedlings. Transformants were selected on a medium supplemented with histidine at a suboptimal concentration. The cDNA clone NTR1 was identified from the cDNA library by its ability to restore the growth of the mutant on the selective medium. NTR1 was at first suspected to be a histidine permease. However, functional analysis was hampered by the fact that the uptake rate was not sufficient to be distinguishable from the background when the NTR1 gene was expressed in the mutant yeast. Interestingly, the deduced amino acid sequence showed homology to the known members of the peptide transporter family, such as PTR2 (from yeast) and PEPT1 (from rabbit).
This prompted further work to determine whether the NTR1 gene is functionally related to peptide transport (41). A yeast mutant strain (LR2) defective in the PTR2 gene was constructed by integrating a disruption cassette into the PTR2 gene. NTR1 cDNA was subcloned into a yeast expression vector under control of a highly active yeast promoter and used to transform the mutant yeast strain (LR2). The transformants containing the NTR1 gene were found to survive and grow on a selective medium supplemented with dipeptides as sole source of amino acids. Peptide transport activity of NTR1 was determined directly by measuring the uptake of radiolabeled dipeptides into the transformed yeast cells.

5. YSCPTR2/CAPTR2/ATPTR2 CLONING METHOD

The yeast Saccharomyces cerevisiae peptide transporter gene yscPTR2 was cloned by functional complementation of the peptide transport-deficient phenotype (ptr2-) (42). A peptide transport-deficient yeast strain (PBIX-9B) was transformed with a yeast-E. coli shuttle vector (YCp50) containing yeast genomic DNA fragments derived from a Saccharomyces cerevisiae genomic DNA library. The transformed cells were selected on a selective medium supplemented with histidine as peptide transport system inducer and a dipeptide as the sole source of the amino acids required auxotrophically by the host strain. Following identification and isolation, the PTR2 gene was trimmed down to a 3.1-kb DNA fragment by restriction endonuclease digestion and confirmed by the frameshift insertion technique. The yeast transformants harboring the PTR2 plasmid displayed the expected wild-type phenotype: they were capable of accumulating radiolabeled dipeptides, grew on defined dipeptides, and were sensitive to toxic dipeptides. CaPTR2, a Candida albicans peptide transporter, and AtPTR2, an Arabidopsis thaliana peptide transporter, were also cloned from a C. albicans genomic library (43) and an A. thaliana cDNA library (44, 45), respectively, by a similar functional complementation technique using a peptide transport-deficient mutant of S. cerevisiae.
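The relation between the ORF sizes and the protein lengths quoted for the cloned transporters can be checked with simple codon arithmetic. The sketch below is an illustration under stated assumptions: the numbers are copied from the text above (DtpT, 1389 bp and 463 amino acids) and from Table I, while the function name and the interpretation (whether a quoted ORF length counts the stop codon) are inferences of this sketch, not statements made by the original authors.

```python
def classify_orf(orf_bp, n_residues):
    """Relate an ORF length in base pairs to a protein length in residues.

    Each residue needs one 3-bp codon; an ORF length may or may not have been
    quoted with the terminating stop codon included.
    """
    if orf_bp % 3 != 0:
        return "not a multiple of 3"
    codons = orf_bp // 3
    if codons == n_residues + 1:
        return "includes stop codon"
    if codons == n_residues:
        return "excludes stop codon"
    return "inconsistent"


# (ORF length in bp, number of amino acids) as listed in Table I.
TABLE_I_ORFS = {
    "hPEPT1": (2127, 708),
    "rat PEPT1": (2133, 710),
    "rabbit PEPT1": (2124, 707),
    "hPEPT2": (2190, 729),
    "rat PEPT2": (2190, 729),
    "rabbit PEPT2": (2190, 729),
    "CHL1": (1773, 590),
    "NTR1": (1761, 586),
    "AtPTR2": (1833, 610),
    "yscPTR2": (1806, 601),
    "CaPTR2": (1872, 623),
    "DtpT": (1392, 463),
}

for name, (orf_bp, n_residues) in TABLE_I_ORFS.items():
    print(f"{name}: {orf_bp} bp, {n_residues} aa -> {classify_orf(orf_bp, n_residues)}")

# The DtpT ORF is quoted as 1389 bp in the text of Section II,D,3.
print("DtpT (text):", classify_orf(1389, 463))
```

Read this way, the 1389-bp figure in the text and the 1392-bp figure in Table I for DtpT describe the same 463-residue protein and differ only by one codon, which is consistent with one value counting the stop codon and the other not.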
III. Comparison of Amino Acid Sequences of the Members of the POT Family

General structural features of individual members of the POT subfamily, the size and distinct structural profile of the transporter genes or cDNAs, predicted amino acid composition, molecular mass of the protein core, isoelectric point (pI), consensus glycosylation sites, and the potential phosphorylation sites (Ser or Thr) for protein kinase C and protein kinase A (for the mammalian POT members only) of the transporter proteins are listed in Table I. The gene symbols and generic names are kept the same as described in the respective original publications. All of the transporter proteins listed in Table I have been demonstrated to transport small peptides as measured by the uptake of radiolabeled substrate, by the sensitivity to toxic peptide agents, or by growth in a medium containing peptide substrates as sole source of required auxotrophic amino acids. The only exception is CHL1, isolated from Arabidopsis thaliana as a chlorate-nitrate transporter. The peptide transport activity of this protein has not been investigated yet.

TABLE I. CHARACTERISTICS OF POT SUBFAMILY MEMBERS

Each entry lists: Transporter; Gene or cDNA (ORF); Transporter protein (no. of amino acids, estimated molecular size, isoelectric point); Topology.

hPEPT1 (Human intestinal peptide transporter)
    Gene or cDNA (ORF): ~3.1 kb (2127 bp)
    Transporter protein: 708 aa, ~79 kDa, pI = ~8.58
    Topology: 12 TM. N-glycosylation sites: N50, N404, N408, N409, N509, N514, N562. PKC sites: S357, S704. PKA sites: absent

Rat PEPT1 (Rat intestinal peptide transporter)
    Gene or cDNA (ORF): ~3.0 kb (2133 bp)
    Transporter protein: 710 aa, ~79 kDa, pI = ~7.39
    Topology: 12 TM. N-glycosylation sites: N415, N439, N510, N532, N539. PKC site: S357. PKA site: T362

Rabbit PEPT1 (Rabbit intestinal peptide transporter)
    Gene or cDNA (ORF): ~2.7 kb (2124 bp)
    Transporter protein: 707 aa, ~79 kDa, pI = ~7.47
    Topology: 12 TM. N-glycosylation sites: N50, N439, N513. PKC site: S357. PKA site: T362

hPEPT2 (Human kidney peptide transporter)
    Gene or cDNA (ORF): ~2.7 kb (2190 bp)
    Transporter protein: 729 aa, ~82 kDa, pI = ~8.26
    Topology: 12 TM. N-glycosylation sites: N435, N472, N528. PKC sites: S264, S376, S607, S640, S724. PKA site: S34

Rat PEPT2 (Rat kidney peptide transporter)
    Gene or cDNA (ORF): ~3.9 kb (2190 bp)
    Transporter protein: 729 aa, ~82 kDa, pI = ~7.99
    Topology: 12 TM. N-glycosylation sites: N435, N448, N528, N587. PKC sites: S376, S640, T708. PKA site: absent

Rabbit PEPT2 (Rabbit kidney peptide transporter)
    Gene or cDNA (ORF): ~4.2 kb (2190 bp)
    Transporter protein: 729 aa, ~82 kDa, pI = ~7.39
    Topology: 12 TM. N-glycosylation sites: N435, N472, N508, N528, N587. PKC sites: S264, S376, S640, T724. PKA site: absent

CHL1 (Arabidopsis thaliana chlorate-nitrate transporter)
    Gene or cDNA (ORF): ~2.0 kb (1773 bp)
    Transporter protein: 590 aa, ~65 kDa, pI = ~8.48
    Topology: 12 TM. Two groups of 6 segments flanking a hydrophilic region with charged amino acids

NTR1 (Arabidopsis thaliana peptide transporter)
    Gene or cDNA (ORF): ~2.1 kb (1761 bp)
    Transporter protein: 586 aa, ~68 kDa, pI = ~5.07
    Topology: 12 TM. N-glycosylation sites: N81, N86

AtPTR2 (Arabidopsis thaliana peptide transporter)
    Gene or cDNA (ORF): ~2.8 kb (1833 bp)
    Transporter protein: 610 aa, ~68 kDa, pI = ~5.16
    Topology: Multiple transmembrane domains

yscPTR2 (Saccharomyces cerevisiae peptide transporter)
    Gene or cDNA (ORF): Gene ~3.1 kb; cDNA 1.8 kb (1806 bp)
    Transporter protein: 601 aa, ~68 kDa, pI = ~5.14
    Topology: An 883-bp leader sequence preceding the 1806-bp ORF. Canonical elements at positions -208 and -65, which contain a TATA box and a CCAATCA box, and an inverted repeat element in the promoter region. Intronless

CaPTR2 (Candida albicans peptide transporter)
    Gene or cDNA (ORF): Gene ~5.1 kb (1872 bp)
    Transporter protein: 623 aa, ~70 kDa, pI = ~6.40
    Topology: -

DtpT (Di- and tripeptide transporter from Lactococcus lactis)
    Gene or cDNA (ORF): ~2.1 kb (1392 bp)
    Transporter protein: 463 aa, ~51 kDa, pI = ~10.57
    Topology: 12 TM α-helix. Both N- and C-termini located on the exoplasmic surface of the membrane

A multiple sequence alignment was carried out by computer-assisted analysis, using the PileUp program (version 8.1) from the Genetics Computer Group, Inc. (Madison, WI). The alignment was created according to the algorithm of Feng and Doolittle (46), using a simplification of the progressive alignment method. Multiple alignment was initiated with a pairwise comparison of the two most similar sequences, generating a cluster for the first two aligned sequences. The remaining sequences with relatively less homology were then added for comparison, by aligning this cluster with the next most related incoming sequence. The final alignment was established by a series of progressive stepwise comparisons, which gradually incorporated increasingly dissimilar sequences and clusters until all sequences had been included in the final pairwise alignment (Fig. 2). A dendrogram describing the comparative relationship in terms of homology in amino acid sequence among the members of the POT subfamily from different species was also constructed (Fig. 3). The following parameters were used to create the alignment: gap creation penalty = 3.00, gap extension penalty = 0.10. For the PRETTY program, parameter plurality = 9 was stipulated to vote for the consensus amino acids,
[FIG. 2, alignment panel: aligned sequences Ptr2, yscPtr2-1, CaPtr2, Hpept2, Rabbitpept2, Ratpept2, Hpept1, Ratpept1, Rabbitpept1, Chl1, Ntr1, DtpT, and Consensus; alignment positions 1-350.]
FIG. 2. A multiple protein sequence alignment of POT family members. The software PileUp (version 8.1) in the GCG package (from the Genetics Computer Group, Inc., Madison, WI) was used to establish this multiple sequence alignment. Gaps are introduced to make the optimum alignment and are indicated by dots. Hpept1, ratpept1, and rabbitpept1 represent the low-affinity peptide transporters from human, rat, and rabbit, respectively; Hpept2, ratpept2, and rabbitpept2 represent the high-affinity peptide transporters from these three respective species; Chl1, Ntr1, and Ptr2 represent the chlorate-nitrate transporter and the two different peptide transporters from A. thaliana, respectively; yscPtr2-1 is the peptide transporter from Saccha-
YOU-JUN FEI ET AL.
252 351 Ptr2 vpalfv.lvk cmalllktnL Yamti2 1 iQd....kvi akafkvawiL q5qvmtnVvk i l a v l f s g n F S m i w f k CiwfaianrF CiWfaianrF S m i w i k CiWfalcnrF qQnimgkVak a i g f a i k n r p qQnimgkVak a i r f a i k n r F qQnilakVvk a i c f a i k n r F iQapmtwaa V i v a . . .W gQapitrIaq v w a a . . . F
. .
420 .hlalll...
..................................
iakkln tknkfd... ikrlwn.. knragdipkr knracdipkr rnragdlpkr rhrskafpkr rhrakafpkr rhrakqfpkr rnrklelpad rkamhvped repanpmdak
............................... F n a a k p a . . . ............................ . g t f W d h G a h m ............................ d Wl.dwaaeky q. ........................... Wl.dwaaaky q. ........................... Wl.dwaaeky Wl.dwakaky e ............................ W l .dwakeky n . ........................... a ............................ Wl.dwakeky
E3
paylyiaaegamlgk q k l p h t e q f r a l . d k d P 3 atllyetqd. .knaaiaga r k i e h t d d a q Yl.dkaavia a k r n f i i t l t ivlivaligf f l i y q a a p a n F i n n f i n v l a
.
.................. ---------p --___----_ ---------- --------__--____---w--------4.~ 90 l y v L d ~ a PIdltlKralr a a k t F l f y p i WOYgQmtn nLiaQagqMq t g ........ aPitQasmMe l h ........ ........vh peknypwndk WWdEMrala ackvFifypi YwtqYgtmia r l D I K q t f d sckiFlyyii Fnladnglga VctaligaMk Id ........ argtiyynak kkaaitwadq pkq. ................ LImOVl(lltr vlflYiplpm EWalLdQqga W l Q a h M n ....r n l Q f . pkq ................. L I M t l t r vlflYiplpm RJalLd!2qga W l Q a t k M n ... .gnlQf. vlflYiplpm EWalLdQqga WlQankMn .... gdlQf. pkh . . . . . . . . . . . . . . . . . LI?QVKaltr der ................. L1aq-r vmflyiplpn malmaqga W l Q a t t M a ... .gkiQa. der ................. LIEqWDtk WPflYiplpl. E W a l m . WlQattMt ....g k i Q t . der. ................ LIaqqT1pmvrr v l f l Y i p l p r EWalFdQqga W l Q a t t M a ....g r i Q i . aeaav-tarlvf nkw. tl8tlt dVeWKaivr mlpiwatcil m a Q l t t InnQaatLd ....raias. 10
-Q-----V--
_-........ l a 60.
~
&;kmgdya nawadLcM weE1Kilir u&iWaagii i i g i v v p i i y fmmftakkv e a d l k r k l t a y i p l F 1 a a i v LI-EM-------y----491 .nvanDlFQa F d n i a l i I F I PiaDnIiYpL l r K . . .Ynip .gipnDILPa W . i a l i I F 1 P i p u f v Y p F i r r . . .Y.tp .gvpnDlbhn F n p l t I i I L I PiLEyglYpL InK.. . a i d fvlqPDqMW LnpllVlIFI P1R)fWYrL vaX...Cgln iax.. .cgin fvlqmL ~ ~ U V ~ IPlrnlviYrL FI fvlqPDc@Qv LnpflVlIFI PlmlViYrL iax.. .Crin 1eiqPDqMQtvnailIvIMV PiPDaVlYpL i a K . . .cgfn isiqPDcp(Pt v n a i l 1 v I W PivDaVvYpL i a K . . .C#n 1eiqPDwQtv n t i l I i I L V PimDaVvYpL i s . ..Cgln feipPaaMav Fyvggllltt a v Y D N a I r L ckKlfnYphg fqlpPaaLgt FdtaaViIW PlYDrfivpL arKftgvdkg fhidPawyQl LnplfIvlLa PiFvrIwnkL gA)-MQF----I-IFI p-A)-V-y-L --K---C--561 rgpcyanf.. agpwynep.. qqcgyya.. mapaqpgpqe Vflqvlnlad dsv*vhnrgn e n n a l l i e a i mappqpgaqs i l l q v l n l a d davkltvlgn n n n a l l a d s i mihumamaa iflavlnlad rrrtvlvtvlsm r n n a l l v s r v tlpi&kgie vqilivlnlgn 6tmnialp& m.. .vtl tlpvfpagnq vqikvlnlgn ndmavyfpgk n . .vtv tlpvfpkane vqikvlnvga enmiialpgq t . vtl . . l k r l r t a h ahg ..iirlhman d l g l v
_--------- ----------
.....
----
...
.ckiQa. FaavYaQIMt mRrqQ9raUn EwaieeQaat i i a w g e a r s nlnptwfQft W----Q--+--Q---M-------Q-560
Wp Lkp Fkp Fa8 Bts Fan Fta Fta
W P a t a d y Mvlqaklyq FmFgaFaXkW AAvlqBfvyL RnraaFaqia gfvlqkqvye KiLacLafav ARNeikIne MvLacLafaa M t v a i k I n e MiLaaLafav A a l m t k I n g MvLasMafw Mivqveldk WfLaaMafnr mvqv'mldk Fta MfLaanafva AtdlqveIdk LfFgaMalnav m m . . Lrp LfvavLcmaa Mi-.. Fte .at W t g a m y l i mtlpg F--------Q M-L------M-----I-630
...
... .....
............................................................
............................................................ ............................................................ kafatphya klhlktkaqd kafqktphya k i h l n t k a q d aafqntthys klhlaakaqd g p u i q t n a f m tfdvnllt.-r aqmaqtdtfm tfdvdqlt. a nqmaqtnefm t f n e d t l t . 8
... .................... .......... ....... .......... ..... .................... .................... .............................. ---------_-__--------------- ---------- ___------631 .............................. .................... .................... .............................. .............................. k.................... mlvtahsvae k n w s l v i r e danaiaamv d t a m r t t n s m t m t l ....
.....
ai&ah&
rnwjyaliiru &kaiaaimv k-atty;
avhndhsvw vtavtddfkq vttvahefap vtplitpalca
knayqllihq gqrhtllw. ghrhtllw. gqrhtllw.
mtairfintl d g o a i a d v kdtgikpang m a a i r f i n t l apnhyqvvk dglnqkpekg s n g i r f v n t f .gpnlyrvv)r dglnqkpakg e n g i r f v a t l .apnnyrvvn dglt*adkg engirfvnty
.
fhfhlkyhnl fyfhlkyhnl lhfhlkynsl inia&ap. invaapgapg inita..gaq
.......... .......... .......... .......... .......... ___------.......... ---------700
.......... .......... .......... ..........
.......... d.......... talnwedy
hkdvnia1.t qeuwnialgt hkdlnialdt nolititmag nardtikmg aqpinvtmag
dialnv+ny daplavgkdy kvyania. ay kvyenvt. ah kvyehia.ay
...................................................................... ...................................................................... ...................................................................... ----______ _----_-------------- -__________-----_-_ _____--------------
FIG.2. (continued romyces cereuisiae;Captr2 is the peptide transporter from Cundida ulbicans; Dtpt is the di- and tripeptide transporter from L. lactis. Highly conserved amino acids (upper case letters) occurring at least nine times in the compared protein sequences are written in the consensus sequence. 11 , , and 111) are highlighted. The amino acid residues Three POT-specificconsensus sequences (I conserved in all of the transporter proteins, the three conserved histidyl residues in the mammalian peptide transporter subfamily, and the RGD tripeptide motif in the mammalian highaffinity transporter subgroup are also highlighted.
253
PROTON-COUPLED OLIGOPEPTIDE TRANSPORTERS 701
770 t d tavandiavw 1 g hntpmvhva t n adspapitaw ylfvltnntn qy.lqawkie dipankmsir ylfvltnatk qg.lqawkm dipankvaia y l M t n i t a qg.1qawkae d i p v n k l a i a ytyivq.rkn &apwltvfs diaantvnma y t y v i r a r a a d g a l a v k e f r dippntvnma y t y l l t a q a t . g e e dippntwma p tV¶ct1p1g€y e agqmpiavl 11 ng-tagraaal
........................... .................... ........................... .................... ..........................................................
gvaayrtvqr gvaayrtvpr gvaayrtvlr naatyqffpa aaanyqffpm naaeyqffta
poypavhart grypavhakt gkypavhaet pikgftiaat qqkdytintt gvkgftvaaa
mdknfa...l n l g l l d f g a a &dfa...l nlglldfgaa mdkvfa.. .1 dlgqldfgtt e i w w n f ntfylefgaa riapnaaadf kaanldfgaa glaeqorrcw espylefgaa
.................................................. ....................... .............. ................................ ............. ---------- ------_--- ---------- ---__---------------------___ ___------840 LEFaFtlrlrpp aMKS11tAlF LFl'nAfQaJl a i a I a a t a v . .......... Q LEYaYakAPa aMKSflm1F LLTnAfQaai gaalrrpvhr. .......... a yELaYtraFp aLKslvyAlF LvrmAfaaal a l a I t p a l k . .......... Q LEFaXac#&'a aMKSvlqAaW LLTiAMOii v l w a q f s g l v.qwae.. .. LZFaYaqAPa aMKSv1LLTvAlQnil VlWaqfagl v.qvae.. SVTO LtraYaqh?'a aMcsvlqAaW LLTvAvQnil vlvVaqfagl a.qwae.. BVTQ LCFaYaqhPa nWEv1qAgW LLTvAvmil vlivagagqf akqwae.. Q LIiFaYaqRPa nWSVlqAgW LLTvAiWi VliVaeaghf dkqww.. 0 LtFaYaqhPa I M T S V l M LLRrAvQni1 VllVagagql nkqwae..
LDFfLmoElr -tglL LEFiYdqaW aMrSloaAla LavatklAW aPq8qmArM IpF-y--Ap- -----A+ 841
......npkl ......dpkf
LaTWQfff a N l v t i M * f t g k . . ahpr LLTnAlQnyl a a 1 I l t l v t y fttrngqegw PLadataqal n a q I t p i f k a a t e . . . . . . . ,,&=-A+----v-----910 Qqaqla.f7x rndilltkkdv akLavhd.yam 0 l m n d . y a a d e f d l n p i aapkandiei M t n e m r l d -anrg ihdvQlpirr aDMgpadkh lphiqgnmik l e t k k t k l * . 0l)lqgpedkq iphuqgnudn l e t k k t k l . . eDtreatdkq ipavqgnmin l a t k n t r l + . aEiaaqfd6d ekknrlekan pyihuganaq aEieaqfded ekkkgvyken pymalspvsq aEieaqfeed &kknpaLnd lypalapvaq rlamvyiald dapaipmgh. kaaa......
__________
t m Y t q i a v t aF1agimRN afhhYda.m twlprglava aFImgol8wl afrkYnd.to dpnl h n r P l a i g l a gFlaaiVMa qfwnldkwm F i l f a a l 1LVlalIFaI mgyyYvpvLt F v l f a a l 1LUmlIFaI mgyyYipika r v l f a o l 1 L U m l I F ~ Vmaypvplka Y i l f a a l 1LVvavIFaI marfYtyinp Yvlfaal 1LUmiIBaI marfYtyinp Y i l f a a l l L m v I F a 1 marfYtyvnp ynflrlvavl v a l n f l I F l V fakwYvykek i a d n l n a g h l d y f s r l l a g l aLVnmawff aaar..ykqk VhfFaitgii glIVgiIL11 ikkpilk1mg dvr ---p------Lv---Ip-I ----y-----p-------911 923 adeaqynlek ana l a p m e a l r a t tky ivaika.
......
.............
............. ............. ... ... ...
.. .. .. .. ..
.......... ....................
........................... ---------- ----------
.......... __________
......
............. .............
............. _______----FIG.2. (continued)
FIG. 3. An evolutionary dendrogram. The dendrogram was created from the multiple protein sequence comparison using the Tree program in the GCG package. The mammalian subfamily is highlighted.

which means that an amino acid was included in the consensus sequence only if it occurred at least nine times among the aligned protein sequences. From the multiple transporter protein sequence comparison, two subgroups can be sorted out. Subgroup I consists of six mammalian peptide transporters, of which three are low-affinity peptide transporters (PEPT1s) and the other three are high-affinity peptide transporters (PEPT2s). Subgroup II consists of peptide transporters of nonmammalian origin, including bacterial DtpT from L. lactis, three yeast peptide transporters, and the plant peptide transporters from A. thaliana. A separate parallel multiple sequence comparison within each subgroup indicated that the sequence homology is higher within a subgroup than between subgroups. Sequence identity/similarity is more frequent in the transmembrane domains than in the intra- or extracellular loops; the preserved amino acids are especially more frequent in the first six transmembrane domains (the first hydrophobic cluster) than in the last six transmembrane domains (the second hydrophobic cluster). The amino acid sequence is most divergent at the N-terminus and the C-terminus. Three highly conserved homologous amino acid stretches among the aligned sequences have been identified, namely consensus sequences I, II, and III, which are indicated by three boxes in Fig. 2. The identical amino acids preserved in all of the 12 transporter proteins included in the multiple alignment are highlighted by the heavily stippled boxes. The amino acids conserved in more than nine family members are written in capital letters in the row describing the consensus sequences, as well as in the individual sequences. The nonconserved amino acids are denoted by lowercase letters. Consensus sequence I, GX(2)IADXWLGXFXTIX(5)VX(3)G, consists of the part of the second transmembrane (TM) domain near the cytoplasmic surface and the whole third TM, including the short loop between the two TMs. This sequence corresponds to amino acid positions 168-192 in the multiple aligned
sequences. Consensus sequence II, LGTGGIKPXV, encompasses the cytoplasmic half of TM4 and occurs at amino acid positions 238-247. Consensus sequence III, FX(2)FYLXINXGSL, at amino acid positions 283-295, corresponds to the whole of TM5. The third consensus sequence is identical to the "peptide-transport family signature motif" described by Steiner et al. (8). Since all three consensus sequences are located between TM2 and TM5, it appears that the first half of these transporter proteins is highly preserved during evolution and is associated with some hitherto unrecognized important function that is common to all these transporters. The specificity of the consensus sequences was tested using the Blast programs (BlastN and BlastP, according to Lipman's algorithm (47)) against the protein sequence database in the GenBank, EMBL, Tags (a subset of the GenBank), GenPept (protein sequences from the GenBank coding sequence translations), PIR (Protein Information Resource), and Swiss-Prot databases. The protein database search results showed that all three consensus sequences are POT specific and do not occur in any non-POT protein. One exception, however, is that consensus sequence II is found in an insert of Caenorhabditis elegans cosmid C06G8. The deduced cDNA from a contiguous nucleotide sequence 2.2 Mb in length is located on chromosome III of C. elegans (48), and the corresponding protein, predicted using the Genefinder software, exhibits similarity to human intestinal PEPT1. It is very likely that the deduced protein is, indeed, a peptide transporter in C. elegans and is a member of the POT family. Another homologous protein (589 amino acids), identified by querying the GenPept database with consensus sequence I, is a plant nitrate transporter from Brassica napus (49), which shows 91% identity and 97% similarity to the CHL1 nitrate transporter from A. thaliana. Whether or not this transporter is capable of peptide transport function is not known.
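The consensus notation used above, where X stands for any residue and X(n) for a run of n arbitrary residues, maps directly onto a regular expression. The following is only a minimal sketch of that translation, not the search procedure used in the review (which relied on Blast). The string for consensus sequence I is the reading adopted here from the printed pattern, the helper names are hypothetical, and the test peptide in the usage note is made up. The length check simply confirms that each pattern spans the alignment positions quoted in the text.

```python
import re

# Consensus patterns and alignment positions as quoted in the text.
# The pattern for consensus I is an assumed reading of the printed string.
CONSENSUS = {
    "I": ("GX(2)IADXWLGXFXTIX(5)VX(3)G", (168, 192)),
    "II": ("LGTGGIKPXV", (238, 247)),
    "III": ("FX(2)FYLXINXGSL", (283, 295)),
}

# One token is a residue letter optionally followed by a repeat count, e.g. X(5).
_TOKEN = re.compile(r"([A-Z])(?:\((\d+)\))?")


def pattern_to_regex(pattern: str) -> str:
    """Translate e.g. 'FX(2)FY' into the regular expression 'F.{2}FY'."""
    parts = []
    for residue, count in _TOKEN.findall(pattern):
        n = int(count) if count else 1
        if residue == "X":
            parts.append("." if n == 1 else f".{{{n}}}")
        else:
            parts.append(residue * n)
    return "".join(parts)


def pattern_length(pattern: str) -> int:
    """Number of alignment columns covered by a consensus pattern."""
    return sum(int(count) if count else 1 for _, count in _TOKEN.findall(pattern))


def find_motif(pattern: str, sequence: str) -> list[int]:
    """Return the 1-based start position of every (possibly overlapping) match."""
    regex = re.compile(f"(?=({pattern_to_regex(pattern)}))")
    return [m.start() + 1 for m in regex.finditer(sequence)]


if __name__ == "__main__":
    for name, (pattern, (first, last)) in CONSENSUS.items():
        span = last - first + 1
        print(name, pattern_to_regex(pattern), pattern_length(pattern), span)
```

For example, `find_motif("LGTGGIKPXV", "AAALGTGGIKPQVAA")` reports a single match starting at position 4 of that invented peptide. A scan of this kind only finds exact matches to the fixed positions, so it is stricter than a similarity search.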
From the Protein Information Resource (PIR) database, a third protein, Escherichia coli hypothetical protein 0489, was identified that shares 30% identity and 58% similarity with the DtpT from L. lactis. Again, the peptide transport function of this E. coli protein has not been investigated.

An RGD tripeptide motif is found exclusively in the high-affinity peptide transporters (PEPT2) of the mammalian POT subgroup. The proposed membrane topology indicates that the RGD motif is located in the extracellular loop between TM5 and TM6. The involvement of the RGD motif in the physiological functions of the high-affinity peptide transporters remains to be elucidated. Many adhesion proteins present in extracellular matrices, such as fibronectin, vitronectin, collagens, and fibrinogen, contain the tripeptide motif arginine-glycine-aspartic acid (RGD) as a binding ligand for the cell surface receptors belonging to the integrin superfamily. The RGD sequence in each of the adhesion proteins is recognized by cell adhesion receptors. The tripeptide RGD sequence motif and its conformation in the individual RGD-containing adhesion-promoting proteins may play an important role in the recognition specificity between the cell surface receptor and extracellular matrices. Such recognition is a critical signal for cell position, migration, and anchorage during cell differentiation and growth (50). Since PEPT2 has been found in several mammalian tissues, including kidney, pancreas, lung, and brain, the RGD motif identified in the PEPT2 subgroup could be involved in some important, not yet recognized, function.

Another important structural feature in the mammalian POT subgroup is the presence of at least three conserved histidyl residues in the transporter proteins. A multiple sequence comparison demonstrates that His-57, His-121, and His-260 (corresponding to the positions in hPEPT1) are conserved across the mammalian peptide transporter members; His-57 and His-121 are located near the extracellular side of TM2 and TM4, respectively. These conserved histidyl residues are highlighted in Fig. 2. The imidazole group in histidine has a pKa value of 6.7, making it feasible for the group to accept or donate protons at physiological pH. This characteristic becomes very relevant to the transport systems that are influenced by transmembrane proton gradients and use protons as one of the cotransported substrates. The essential role of histidyl residues in the catalytic function of the lactose-H+ cotransporter in E. coli, a prototype for H+-coupled transport systems, has been investigated in detail (51, 52). There is evidence that mammalian peptide transporters, which are driven by a transmembrane H+ gradient, possess histidyl groups that are obligatory for their catalytic function (53-55).
Studies in our laboratory using site-directed mutagenesis have revealed that His-57 is absolutely essential for the catalytic function of human PEPT1 (56). Similarly, the conserved histidyl residue in the corresponding position in human PEPT2 (His-87) is obligatory for PEPT2 function (56). Mutation of the other two conserved histidyl residues does not influence the function of PEPT1 and PEPT2 significantly. The molecular mechanism by which this particular histidyl residue participates in the binding and translocation of H+ remains to be determined.
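The statement above that an imidazole group with a pKa of 6.7 can readily accept or donate protons near physiological pH can be made concrete with the Henderson-Hasselbalch relation, in which the protonated fraction is 1 / (1 + 10^(pH - pKa)). The sketch below is an illustration only: the pKa is the value quoted in the text, while the pH values are assumed examples and are not measurements from the review.

```python
def fraction_protonated(ph: float, pka: float = 6.7) -> float:
    """Fraction of a titratable group that carries its proton at a given pH.

    Follows the Henderson-Hasselbalch relation; the default pKa is the
    value quoted in the text for the imidazole group of histidine.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Illustrative pH values only (assumed, not taken from the review).
    for ph in (5.5, 6.0, 6.7, 7.4):
        print(f"pH {ph:.1f}: {fraction_protonated(ph):.2f} protonated")
```

With these assumptions, half of the groups are protonated at pH 6.7 and roughly one in six at pH 7.4, so a modest acidification shifts the protonation state substantially. In a folded protein the local environment can move the effective pKa away from the quoted value.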
IV. Topological Features of the POT Subfamily
The POT subfamily members are present in prokaryotic as well as in eukaryotic cells and share certain common structural and functional characteristics. It is believed that all of the carrier-mediated transport systems, including facilitative transporters, antiporters, and ion-coupled symporters, operate by similar mechanisms. This solute transport family, called the major facilitator superfamily (MFS), has an ancient evolutionary history and probably dates back more than 3.5 billion years (57). Amino acid sequence
analysis indicates that constituent members of the MFS contain multiple hydrophobic transmembrane domains arranged in two clusters; six transmembrane helical segments are placed in each cluster, forming a six-plus-six structure with both the amino and the carboxyl termini located on the cytoplasmic surface of the plasma membrane. All of these proteins may have descended from a common ancient precursor protein of half this size by duplication during the evolutionary process. Hydrophobicity analysis demonstrates that the POT members follow the characteristic six-plus-six transmembrane domain topology observed in other members of the MFS. Figure 4 shows the 12-transmembrane domain topology in some of the representative members of the POT family. The transmembrane domains are indicated by peaks in the plot. The hydrophilic loops dividing the two tentative clusters are highlighted. There are slight variations in the hydrophobicity plots of different members of the POT family as reported by different investigators, depending upon the computer program and the plotting parameters used in each case, but a common theme, the six-plus-six structure, appears to hold in each case. It is proposed that the 12 transmembrane domains form a transmembrane pathway for substrate passage through the plasma membrane (57). A number of questions remain as to how the substrates bind to the transporter protein, how the transport process is coupled to the proton translocation, how the proton motive force drives the transporter, what kind of conformational changes occur during the transport process, and others. In the mammalian peptide transporter subgroup, a large hydrophilic loop (~200 amino acids) intervenes in the second cluster and divides the cluster into two segments, each containing three membrane-spanning domains. This phenomenon may be a much later event in evolution. Functions associated with this extracellular loop remain to be elucidated.
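A hydrophobicity plot of the kind described above is a sliding-window average of per-residue hydropathy values, in which peaks mark candidate membrane-spanning segments. The sketch below is a minimal illustration under stated assumptions. It uses the Kyte-Doolittle scale as a stand-in, since the text does not say which scale PEPPLOT applied. The 21-residue window follows the figure legend, the cutoff of 1.6 is an assumed illustrative threshold, and the function names and toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 21) -> list[float]:
    """Mean hydropathy of every full window; entry i covers residues i+1..i+window."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(total / window)
    return profile


def hydrophobic_runs(profile: list[float], window: int = 21,
                     threshold: float = 1.6) -> list[tuple[int, int]]:
    """Merge consecutive windows above the cutoff into 1-based residue ranges."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start + 1, i - 1 + window))
            start = None
    if start is not None:
        runs.append((start + 1, len(profile) - 1 + window))
    return runs


if __name__ == "__main__":
    # Made-up sequence: one hydrophobic stretch flanked by charged residues.
    toy = "D" * 30 + "L" * 21 + "D" * 30
    print(hydrophobic_runs(hydropathy_profile(toy)))
```

Counting the runs returned for a real sequence gives a rough estimate of the number of transmembrane segments. As the text notes, the result depends on the program and plotting parameters, so window size and cutoff should be treated as adjustable.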
V. Conclusion
There is clear evidence from structural and functional studies that the peptide transporters that are driven by a transmembrane H+ gradient (POT family) are distinct from the peptide transporters that are directly energized by ATP (ABC peptide transporter family). The occurrence of H+-coupled peptide transport is much more widespread in nature than the occurrence of ATP-driven peptide transport. Several members of the POT family have been cloned from bacteria, yeast, plants, and animal tissues. The physiological function of these transporters is obviously the transport of peptides into the cells to provide amino acids for cellular metabolism. However, the characteristic broad substrate specificity of these transporters indicates that there is potential for the use of these transporters as a vehicle to deliver various biologically active peptidomimetic agents into the cells. It was believed for several years that the phenomenon of peptide transport was restricted to the small intestine and kidney in animals and humans. More recent evidence, however, suggests that the occurrence of this phenomenon is much more widespread in animal tissues than previously thought. Tissues such as the brain, pancreas, lung, and liver express peptide transporter-specific mRNAs. While the physiological relevance of the peptide transporters in the small intestine and the kidney is readily apparent from the well-known absorptive function of these two tissues, the role of these transporters in other tissues remains to be investigated. Considerable progress has been made in recent years in the elucidation of the structural and functional features of H+-coupled peptide transporters not only from animal tissues but also from bacteria, plants, and yeast. This should pave the way for further research into various other aspects of these transporters, including structure-function relationships, membrane topology, three-dimensional structure, operational mechanism, and energy coupling.

ACKNOWLEDGMENTS
This work was supported by NIH grant DK28389. We thank Ms. Lisa Young for excellent secretarial assistance.
FIG. 4. Hydrophobicity plot for the POT subfamily members. Using the PEPPLOT program (Genetics Computer Group Inc., Madison, WI), the protein sequences of all members of the POT subfamily have been replotted with a window size of 21 amino acids for each putative transmembrane domain. Due to similarity in the hydrophobic profiles of the closely related transporter proteins in different subgroups, only selected members are shown. Human intestinal peptide transporter (hPEPT1) is taken as a representative for the three mammalian low-affinity peptide transporters. Human renal peptide transporter (hPEPT2) is taken as a representative for the three mammalian high-affinity peptide transporters. DtpT, capTR, chl1 and NTR1, and yscPTR2 represent the peptide transporters from bacteria (L. lactis), fungi (C. albicans), plants (A. thaliana), and yeast (S. cerevisiae), respectively.

REFERENCES
1. C. F. Higgins and M. M. Gibson, Meth. Enzymol. 125, 365 (1986).
2. C. F. Higgins, M. P. Gallagher, M. L. Mimmack, and S. R. Pearce, BioEssays 8, 111 (1988).
3. C. F. Higgins and J. W. Payne, in "Nucleic Acids and Proteins in Plants" (D. Boulter and B. Parthier, eds.), p. 438. Springer Verlag, Berlin, 1989.
4. D. M. Matthews, "Peptide Absorption: Development and Present State of the Subject." Wiley-Liss, New York, 1991.
5. V. Ganapathy, M. Brandsch, and F. H. Leibach, in "Physiology of the Gastrointestinal Tract" (L. R. Johnson, ed.), p. 1773. Raven Press, New York, 1994.
6. D. Meredith and C. A. R. Boyd, J. Membr. Biol. 145, 1 (1995).
7. F. H. Leibach and V. Ganapathy, Annu. Rev. Nutr. 16, 99 (1996).
8. H. Y. Steiner, F. Naider, and J. M. Becker, Mol. Microbiol. 16, 825 (1995).
9. E. W. Goodell and C. F. Higgins, J. Bacteriol. 169, 3861 (1987).
10. M. Perego, C. F. Higgins, and S. R. Pearce, Mol. Microbiol. 5, 173 (1991).
11. M. D. Manson, V. Blank, G. Brade, and C. F. Higgins, Nature (London) 321, 253 (1986).
12. W. N. Abouhamad, M. Manson, M. M. Gibson, and C. F. Higgins, Mol. Microbiol. 5, 1035 (1991).
13. V. Ganapathy and F. H. Leibach, Life Sci. 30, 2137 (1982).
14. D. M. Matthews, Physiol. Rev. 55, 537 (1975).
15. V. Ganapathy and F. H. Leibach, Am. J. Physiol. 249, G153 (1985).
16. V. Ganapathy and F. H. Leibach, Am. J. Physiol. 251, F945 (1986).
17. V. Ganapathy and F. H. Leibach, Curr. Opin. Cell Biol. 3, 695 (1991).
18. V. Ganapathy, Y. Miyamoto, and F. H. Leibach, Contrib. Infusion Ther. Clin. Nutr. 17, 54 (1987).
19. V. Ganapathy, Y. Miyamoto, and F. H. Leibach, Adv. Biosci. 65, 91 (1987).
20. G. K. Grimble, Annu. Rev. Nutr. 14, 419 (1994).
21. S. Silbernagl, Physiol. Rev. 68, 911 (1988).
22. A. Tsuji and I. Tamai, Pharmaceut. Res. 13, 963 (1996).
23. G. L. Amidon and H. J. Lee, Annu. Rev. Pharmacol. Toxicol. 34, 321 (1994).
24. A. Tsuji, in "Peptide-Based Drug Design" (M. D. Taylor and G. L. Amidon, eds.), p. 101. American Chemical Society, Washington, DC, 1995.
25. C. F. Higgins, Annu. Rev. Cell Biol. 8, 67 (1992).
26. C. F. Higgins, Res. Microbiol. 141, 353 (1990).
27. I. T. Paulsen and R. A. Skurray, Trends Biochem. Sci. 19, 404 (1994).
28. Y. J. Fei, Y. Kanai, S. Nussberger, V. Ganapathy, F. H. Leibach, M. F. Romero, S. K. Singh, W. F. Boron, and M. A. Hediger, Nature (London) 368, 563 (1994).
29. M. Boll, D. Markovich, W. M. Weber, H. Korte, H. Daniel, and H. Murer, Pflügers Arch. Eur. J. Physiol. 429, 146 (1994).
30. M. Boll, M. Herget, M. Wagener, W. M. Weber, D. Markovich, J. Biber, W. Clauss, H. Murer, and H. Daniel, Proc. Natl. Acad. Sci. U.S.A. 93, 284 (1996).
31. R. Liang, Y. J. Fei, P. D. Prasad, S. Ramamoorthy, H. Han, T. L. Yang-Feng, M. A. Hediger, V. Ganapathy, and F. H. Leibach, J. Biol. Chem. 270, 6456 (1995).
32. W. Liu, R. Liang, S. Ramamoorthy, Y. J. Fei, M. E. Ganapathy, M. A. Hediger, V. Ganapathy, and F. H. Leibach, Biochim. Biophys. Acta 1235, 461 (1995).
33. K. Miyamoto, T. Shiranga, K. Morita, H. Yamamoto, H. Haga, Y. Taketani, I. Tamai, Y. Sai, A. Tsuji, and E. Takeda, Biochim. Biophys. Acta 1305, 34 (1996).
34. H. Saito, M. Okuda, T. Terada, S. Sasaki, and K. I. Inui, J. Pharmacol. Exp. Ther. 275, 1631 (1995).
35. H. Saito, T. Terada, M. Okuda, S. Sasaki, and K. I. Inui, Biochim. Biophys. Acta 128, 173 (1996).
36. S. G. Amara and M. J. Kuhar, Annu. Rev. Neurosci. 16, 73 (1993).
37. S. Shafqat, M. Velaz-Faircloth, A. Guadano-Ferraz, and R. T. Fremeau, Jr., Mol. Endocrinol. 7, 1517 (1993).
38. Y. F. Tsay, J. I. Schroeder, K. A. Feldmann, and N. M. Crawford, Cell 72, 705 (1993).
39. A. Hagting, E. R. S. Kunji, K. J. Leenhouts, B. Poolman, and W. N. Konings, J. Biol. Chem. 296, 11391 (1994).
40. W. B. Frommer, S. Hummel, and D. Rentsch, FEBS Lett. 347, 185 (1994).
41. D. Rentsch, M. Laloi, I. Rouhara, E. Schmelzer, S. Delrot, and W. B. Frommer, FEBS Lett. 370, 264 (1995).
42. J. R. Perry, M. A. Basrai, H. Y. Steiner, F. Naider, and J. M. Becker, Mol. Cell. Biol. 14, 104 (1994).
43. M. A. Basrai, M. A. Lubkowitz, J. R. Perry, D. Miller, E. Krainer, F. Naider, and J. M. Becker, Microbiology 141, 1147 (1995).
44. H. Y. Steiner, W. Song, L. Zhang, F. Naider, J. M. Becker, and G. Stacey, Plant Cell 6, 1289 (1994).
45. W. Song, H. Y. Steiner, F. Naider, G. Stacey, and J. M. Becker, Plant Physiol. 110, 171 (1996).
46. D. F. Feng and R. F. Doolittle, J. Mol. Evol. 25, 351 (1987).
47. S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, J. Mol. Biol. 215, 403 (1990).
48. R. Wilson, R. Ainscough, K. Anderson, C. Baynes, et al., Nature (London) 368, 32 (1994).
49. I. Muldin and B. Ingemarsson, Plant Physiol. 108, 1341 (1995).
50. E. Ruoslahti and M. D. Pierschbacher, Science 238, 491 (1987).
51. H. R. Kaback, Harvey Lect. 83, 77 (1987).
52. M. F. Varela and T. H. Wilson, Biochim. Biophys. Acta 1276, 21 (1996).
53. Y. Miyamoto, V. Ganapathy, and F. H. Leibach, J. Biol. Chem. 261, 16133 (1996).
54. M. Kato, H. Maegawa, T. Okana, K. I. Inui, and R. Hori, J. Pharmacol. Exp. Ther. 251, 745 (1989).
55. W. Kramer, F. Girbig, E. Petzoldt, and I. Leipe, Biochim. Biophys. Acta 943, 288 (1988).
56. Y. J. Fei, W. Liu, P. D. Prasad, R. Kekuda, T. C. Oblak, V. Ganapathy, and F. H. Leibach, Biochemistry 36, 452 (1997).
57. M. H. Saier, Jr., BioEssays 16, 23 (1994).
Double-Strand Break-Induced Recombination in Eukaryotes
FEKRET OSMAN* AND SURESH SUBRAMANI†,1

*Department of Biochemistry, University of Oxford, OX1 3QU, United Kingdom
†Department of Biology, University of California, San Diego, La Jolla, California 92093-0322

I. Models of Double-Strand Break-Induced Recombination
II. Double-Strand Break-Induced Mitotic Recombination
A. Recombination Events Associated with DSBs Induced in Vitro
B. Recombination Events Associated with Artificial Site-Specific DSBs Induced in Vivo
C. Biological Systems Utilizing Naturally Occurring Site-Specific DSBs Induced in Vivo
III. Double-Strand Break-Induced Meiotic Recombination
IV. The Genetic Control of Double-Strand Break-Induced Recombination
A. The Genetic Control of DSB-Induced Mitotic Recombination in S. cerevisiae
B. The Genetic Control of DSB-Induced Mitotic Recombination in S. pombe
V. Concluding Remarks
References
Genetic recombination is of fundamental importance for a wide variety of biological processes in eukaryotic cells. One of the major questions in recombination relates to the mechanism by which the exchange of genetic information is initiated. In recent years, DNA double-strand breaks (DSBs) have emerged as an important lesion that can initiate and stimulate meiotic and mitotic homologous recombination. In this review, we examine the models by which DSBs induce recombination, describe the types of recombination events that DSBs stimulate, and compare the genetic control of DSB-induced mitotic recombination in budding and fission yeasts. © 1998 Academic Press
To whom correspondence may be addressed: Department of Biology, Room 3230 Bonner Hall, 9500 Gilman Drive, University of California, San Diego, La Jolla, CA 92093-0322; telephone: 619-534-2327; fax: 619-534-0053; e-mail: [email protected].
Progress in Nucleic Acid Research and Molecular Biology, Vol. 58. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
Genetic recombination is ubiquitous in living organisms. It can be defined as the exchange of information between DNA sequences. Recombination can occur between DNA sequences on two or more DNA molecules, or within a single DNA molecule. For the purposes of this review, three types of recombination are relevant: homologous, site-specific, and illegitimate recombination. Homologous recombination typically occurs between DNA sequences with extended regions of homology. It can occur anywhere along the length of homology and is not restricted to specific sites, although some sites may be preferentially used. Site-specific recombination typically occurs between precisely prescribed sites on two partner DNA sequences that otherwise bear no overall homology to each other. These sites typically comprise short recognition sequences for a particular DNA-binding protein(s) that acts on these binding sequences to catalyze recombination. Illegitimate or nonhomologous recombination occurs between DNA sequences with no prescribed sites and with no homology, or at best only a few base pairs of homology. Illegitimate events are typically nonconservative in that they result in the loss or gain of a small number of nucleotides at the site of recombination in the DNA. This review focuses mainly on homologous recombination.

Recombination is of fundamental importance for a wide variety of biological processes in eukaryotic cells, including meiosis, vegetative chromosome stability and segregation, antigenic variation and immunoglobulin gene rearrangements, maintenance of copy number and sequence homogeneity in repeated gene families, and control of gene expression. Recombination also plays a role in neoplastic transformation. In addition, in many organisms, the modification of specific chromosomal genes in a predetermined manner by gene targeting or gene replacement relies on homologous recombination between chromosomal and newly introduced DNA sequences.
Accumulating evidence highlights the central role of DNA double-strand breaks (DSBs) in these recombinational processes. DSBs are important DNA lesions that can arise in mitotic cells spontaneously or in response to certain DNA-damaging agents. Some DNA-damaging agents, such as x-rays, can produce DNA DSBs directly, whereas others generate DSBs or gaps following processing of initial lesions by repair enzymes. The repair of DSBs and gaps also occurs via recombinational mechanisms. The DNA molecular structures generated during meiotic and mitotic recombination of chromosomes are similar to those occurring during recombinational repair of DSBs. Given this overlap, it is not surprising that the genes involved in DSB repair have important roles in mitotic and meiotic chromosome metabolism. An understanding of recombination involves two aspects: (i) a description in molecular detail of the stepwise structural changes of DNA as parental sequences undergo recombination to give rise to recombinant products, and (ii) the elucidation of the precise roles of enzymes and other molecules that
DSB-INDUCED RECOMBINATION IN EUKARYOTES
catalyze or facilitate these structural changes and regulate their stability, duration, and extent at each step. In eukaryotes, the best studied organism with respect to recombination is the budding yeast Saccharomyces cerevisiae, though much has also been learned from studies in other fungi and from Drosophila, Xenopus, and mammalian cells. Several experimental approaches have been employed to elucidate the underlying molecular mechanisms and genetic control of recombination in yeast and other eukaryotes. The first approach has relied on classical genetic analysis to examine the segregation of linked genes, or of alleles within one gene, by characterizing the products of genetically detectable exchange events resulting from recombination between DNA sequences during both meiotic and mitotic development. Such genetic studies have examined interchromosomal, intrachromosomal, and extrachromosomal recombination, as well as recombination between chromosomal and extrachromosomal sequences. These analyses have been aided by the ability to use in vitro molecular biological techniques to engineer defined chromosomal and extrachromosomal recombination substrates. Defined DSBs and double-strand gaps (DSGs) can be introduced in vitro within extrachromosomal sequences prior to introduction into cells. This has allowed the analysis of in vivo recombination events, both between extrachromosomal sequences and between chromosomal and extrachromosomal sequences, in many eukaryotic organisms. In recent years, the genetic analysis of DSB-induced recombination has been aided by the use of biological tools that allow a single site-specific DSB to be induced in vivo within defined recombination substrates that are either chromosomal or extrachromosomal. Typically, the substrate includes a recognition site for a site-specific endonuclease, and the expression of the corresponding endonuclease is under the control of an inducible promoter.
The second major approach has been to analyze the genetic control of recombination by the isolation and characterization of mutants defective in some aspect of recombination. Using the recombination substrates in which in vivo site-specific DSBs are induced, existing mutants have been characterized and new mutants are now being isolated on the basis of a direct defect in DSB-induced recombination. The isolation and molecular characterization of the genes involved in recombination has played an important role in gaining insights into the recombination process and the role of DSBs. In most cases, recombination genes have been isolated by complementation of the mutant phenotype with DNA, particularly in lower eukaryotes. In recent years some of the corresponding genes from higher eukaryotes have been defined following the identification of conserved sequences of homologs of recombination genes from lower eukaryotes. Biochemical studies are also having an increasingly important impact on
our understanding of recombination. In vitro recombination systems have been developed in S. cerevisiae (1) and mammalian cells (2). In addition, from what is known regarding biochemical activities of recombination enzymes in bacteria, and from genetic and molecular studies in eukaryotes, several enzymatic activities have been postulated to be required for eukaryotic recombination. A subset of these can be assayed in vitro, many using substrates with DSBs. These systems have been used to purify (partially or completely) specific recombination activities from eukaryotes. However, there has been only limited success in characterizing the precise biochemical activities of individual gene products involved in eukaryotic recombination. A different biochemical approach to studying both the pathways and the role of the gene products of eukaryotic recombination has been the physical monitoring of DNA undergoing recombination following the introduction of a single in vivo DSB by the site-specific endonucleases mentioned earlier (reviewed in 3). The correlation of mutant phenotype to the loss of a biochemical activity in vivo, and of gene product to biochemical activity in vitro, will be the critical criteria by which the underlying molecular mechanisms of recombination will be elucidated.
This review focuses on our knowledge of DSB-induced homologous recombination in S. cerevisiae, the best studied eukaryotic organism for this subject, and the one in which most of the more informative studies have been conducted. It also includes work on other eukaryotes, and our own work on DSB-induced intrachromosomal homologous recombination in the fission yeast Schizosaccharomyces pombe.
I. Models of Double-Strand Break-Induced Recombination
The extensive genetic studies on meiotic recombination in fungi have been reviewed by many authors (4-7). The properties of fungi make them particularly amenable to such analyses. Following a sexual cross between two strains differing in one or more marker genes, the isolation and analysis of the spores of intact tetrads and octads permits the genetic constitution of each DNA strand of each chromatid present after meiosis to be determined. As well as demonstrating reciprocal exchange, or crossing over, between homologous chromatids, these studies also established the occurrence of nonreciprocal exchanges, gene conversion, and postmeiotic segregation (PMS). PMS, the segregation of alleles at the division following meiosis, was taken as evidence that the chromatids that were segregated at the second division of meiosis consisted of heteroduplex DNA (hDNA), that is, a DNA duplex in which the two strands contain different information for
the segregating marker. These studies also showed that there was a definite correlation between nonreciprocal gene conversion within a gene and reciprocal crossing over of flanking markers. The existence of hot spots of meiotic recombination and the phenomenon of polarity of gene conversion (that gene conversion frequency along the length of a gene is polarized) led to the hypothesis that meiotic recombination events are initiated at preferred specific sites on the DNA. Models were postulated to explain the genetic data on meiotic homologous recombination in fungi in molecular terms. Homologous recombination is envisaged as a multistep process catalyzed by many gene products. The models defined a mechanism for the association of gene conversion and crossing over. In the initial models, it was proposed that recombination was initiated by single-strand nicks (8-10; Fig. 1). It was postulated that, following a single-strand nick, a Holliday junction and hDNA intermediates were generated by strand exchange between homologous duplexes. In these models gene conversion is the result of the formation and mismatch repair of hDNA. A Holliday junction could be resolved into a crossover or noncrossover event according to which strands were cut at the junction. In Meselson and Radding's model (9), the molecule on which the initiating single-strand nick is made becomes the donor of genetic information. However, numerous pieces of evidence suggested that the initiating event occurred on the molecule that was ultimately the recipient of genetic information (4-7), and Radding (10) proposed a modified model (Fig. 1) to account for this. Although there is good evidence for the role of hDNA as a recombination intermediate, evidence for the role of single-strand nicks as initiators of eukaryotic recombination has not been forthcoming.
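The link between hDNA, mismatch repair, and the segregation classes seen in octads can be illustrated with a small Python sketch. This is a toy bookkeeping model introduced here for illustration, not a method from the studies cited above: the function name octad_ratio, the allele labels "A" and "a", and the three repair options are all assumptions. It assumes a cross of A x a in which asymmetric hDNA forms on one recipient ("a") chromatid.

```python
from collections import Counter


def octad_ratio(repair=None):
    """Return (A spores, a spores) for one octad in a toy A x a cross.

    Each of the four meiotic chromatids is modelled as a duplex of two
    strands. One recipient 'a' chromatid receives a single 'A' strand,
    giving asymmetric heteroduplex DNA (hDNA).

    repair: None        -> the mismatch is left unrepaired
            "donor"     -> repaired to the invading (donor) information
            "recipient" -> repaired back to the original information
    """
    chromatids = [["A", "A"], ["A", "A"], ["a", "a"], ["a", "a"]]
    chromatids[2] = ["A", "a"]  # hDNA on one recipient chromatid

    if repair == "donor":
        chromatids[2] = ["A", "A"]  # gene conversion
    elif repair == "recipient":
        chromatids[2] = ["a", "a"]  # restoration
    elif repair is not None:
        raise ValueError("repair must be None, 'donor' or 'recipient'")

    # At the postmeiotic mitosis every strand templates one spore,
    # so eight strands give the eight spores of the octad.
    spores = [strand for duplex in chromatids for strand in duplex]
    counts = Counter(spores)
    return counts["A"], counts["a"]


for mode in (None, "donor", "recipient"):
    print(mode, octad_ratio(mode))
```

Under these assumptions, unrepaired hDNA gives a 5:3 octad (PMS), repair toward the donor strand gives 6:2 (gene conversion), and repair toward the recipient strand restores normal 4:4 segregation.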
Although attempts have been made (11), a problem in assessing the importance of single-strand nicks as initiators has been the absence of an efficient experimental system to study them. In contrast, the development of such systems for the study of DSBs has produced compelling evidence for their role as initiators of recombination. Studies on x-ray-induced DNA DSB repair in S. cerevisiae led Resnick (12) to propose a model in which recombination was initiated by DSBs (Fig. 2). In Resnick's model, exonucleolytic degradation of one strand on each side of the DSB gives rise to two 3' single-strand tails, but only one recombines with homologous sequences of the intact duplex (a one-sided invasion). In the model, DNA synthesis primed from the 3' end of the invading 3' tail replaces the missing information at the break. As in the previous models, gene conversion is the result of the formation and mismatch repair of hDNA. Resnick's DSB repair model for recombination involves either no, or only one, Holliday junction. Experiments on DSB repair and DSG filling of plasmids transformed into S. cerevisiae (13, 14) were, in part, responsible for the development of the
DSB-gap repair model (15), which allows for gene conversion without an absolute requirement for an hDNA intermediate. One version of this model is shown in Fig. 3. The initiation of recombination involves a DSB that is enlarged by exonucleolytic degradation of both strands of each end to produce a DSG. Both 3' ends of the gap are involved in strand invasion (two-sided invasion) of the undamaged homologous duplex to produce two Holliday junctions. The invading 3' ends prime DNA synthesis, using the undamaged template to fill the gap. In this model, gene conversion can occur by one of two mechanisms. Most conversions occur by DSG filling, but conversion is also possible by repair of hDNA adjacent to gaps, formed during strand invasion and branch migration of Holliday junctions. Again, Holliday junction resolution accounts for conversion with or without associated crossovers. The DSB-gap repair model suggests that most gene conversion results from double-strand gap filling, but evidence suggested that chromosomal gene conversion resulted mainly from the mismatch correction of asymmetrical heteroduplexes (16, 17). Also, although DSGs produced in vitro are repaired in vivo with information donated by an endogenous duplex (13), there is no evidence that DSBs are processed into gaps in vivo. Instead, evidence suggests that both mitotic and meiotic DSBs are processed not to a gap but rather to long (>1 kb) 3' single-stranded tails by the action of a unidirectional (5'→3') exonuclease (18-20). This led to modifications of the DSB-gap repair model to produce variants, one of which is shown in Fig. 4 (18), in which the DSB is processed to 3' single-strand tails, both of which are involved in the invasion of the homologous duplex (two-sided event) to produce hDNA that is mismatch repaired to result in gene conversion.
However, evidence suggests that the two 3' tails act independently of each other in finding and invading a homologous duplex (21, 22) and, moreover, there is no requirement for both ends to invade, as repair synthesis primed from one end can produce a region complementary to the opposite 3' extension. This, together with studies on recombination of extrachromosomal molecules with chromosomal DNA in mammalian cells (23-26), led to models of DSB-induced recombination based on one-sided invasion events (reviewed in Ref. 27). Models have been proposed for both nonconservative and conservative one-sided invasion events. A model for conservative one-sided invasion (27) is shown in Fig. 5. Recombination intermediates consistent with one-sided invasion events have also been observed in DSB-induced recombination in S. cerevisiae (1, 21, 28). The model proposed that following strand invasion by one of the 3' tails, DNA synthesis was primed from the 3' end using the invaded strand as template. Annealing between the newly synthesized strand and the noninvading single-stranded end would lead to the formation of hDNA on one side of the DSB, and a short heteroduplex could also be formed on the other side. A Holliday junction could be generated by cutting at the front end of the D-loop with subsequent annealing with the noninvading end and DNA synthesis. As in previous models, associated crossovers would depend on Holliday junction resolution. These one-sided invasion models are similar to Resnick's model (Fig. 2), but the events leading to hDNA formation differ in the two sets of models. Detailed analysis has shown that processing of DSBs at the MAT locus in S. cerevisiae occurs at only one end, with asymmetrical strand transfer resulting in hDNA only in the recipient molecule, as predicted from one-sided models. However, Schwacha and Kleckner described double Holliday junctions in S. cerevisiae meiotic recombination intermediates, indicating that these events involve two-ended invasions (29).
Detailed genetic analysis of homologous recombination induced by transposable P-elements in Drosophila melanogaster has led to proposals for another model for DSB-induced recombination (30, 31). The model, referred to as the synthesis-dependent strand-annealing model, is shown in Fig. 6. It proposes that the DSB is processed to give two 3' single-strand tails that behave independently of each other during homology search. It postulates that strand invasion and hDNA formation between invading 3' single-strand tails and the template DNA strand of the homologous duplex are transient.
FIG. 1. Radding's model for recombination initiated by a single-strand nick (10). (a) Recombination is initiated by a single-strand nick in one of the duplexes that is extended into a single-strand gap. (b) The gap is then invaded by a strand derived from the homologous chromatid, resulting in a D-loop. (c) The D-loop is nicked at one end and the noninvasive 3' end acts as a primer for DNA synthesis. (d) Branch migration and ligation of the nicks results in a structure that isomerizes to form a Holliday junction. (e) Symmetrical hDNA can be formed by branch migration of the Holliday junction. The Holliday junction can then be resolved to give either (f) a crossover or (g) a noncrossover configuration. The arrowhead indicates the 3' end of the DNA strand.
FIG. 2. Resnick's model (12). (a) Recombination is initiated by a DSB in one duplex. (b) 3' OH single-strand overhanging tails are exposed on either side of the break by 5'→3' exonucleolytic digestion. (c) One of the two 3' ends invades the intact homologous duplex. (d) The homologous duplex is now cut on one chain. There are two alternative pathways for processing the intermediate in (d). In one pathway (e) the invading 3' end is extended by DNA synthesis using the intact homologous strand as template. (f) This is followed by release of the invading fragment and its annealing with the other fragment. (g) Repair synthesis and ligation results in one parental chromosome and one that has a patch of information derived from the homologous duplex. Alternatively, (h) the second 3' end invades the intact homologous duplex as did the first, forming a Holliday junction. (i) The intact strand of the invaded duplex is cut, resolving the Holliday junction. (j) Repair synthesis and ligation restores the duplexes, giving crossover molecules, each with a segment of hDNA.
FIG. 3. DSB-gap repair model (15). (a) Recombination is initiated by a DSB in one duplex. (b) Both ends of the DSB are processed by exonucleases to form a double-strand gap with 5' and 3' single-strand overhangs on the same chain. (c) Both 5' and 3' overhangs invade the intact homologous duplex, displacing a D-loop. (d) Repair synthesis using both intact strands of the homologous duplex as template fills the gap, creating two Holliday junctions. One of the junctions of the double Holliday structure is resolved by cutting the outer strands (open up and down arrowheads). The other Holliday junction can then be resolved to give either (e) a crossover or (f) a noncrossover configuration.
FIG. 4. Modified DSB repair model (18). (a) Recombination is initiated by a DSB in one duplex. (b) Extensive 3' OH single-strand overhanging tails are exposed on either side of the DSB by 5'→3' exonucleolytic digestion. (c) One of the two 3' ends invades the intact homologous duplex, displacing a D-loop. (d) The D-loop is enlarged by DNA repair synthesis primed from the invading 3' end, using the intact strand of the invaded homologous duplex. It anneals to the second 3' end single-stranded DNA. (e) Repair synthesis from the second 3' end takes place, and two Holliday junctions are formed. The Holliday junctions can then be resolved to give either (f) a crossover or (g) a noncrossover configuration.
FIG. 5. One-sided invasion model for conservative homologous recombination (27). (a) Recombination is initiated by a DSB in one duplex. (b) 3' OH single-strand overhanging tails are exposed on either side of the break by 5'→3' exonucleolytic digestion. (c) One of the two 3' ends invades the intact homologous duplex, displacing a D-loop. (d) DNA repair synthesis primed from the invading 3' end, using the intact strand of the invaded homologous duplex, extends the D-loop. (e) Cutting at the front end of the D-loop with subsequent annealing with the noninvading end and DNA synthesis generates a Holliday junction. The other Holliday junction can then be resolved to give either (f) a noncrossover or (g) a crossover configuration.
FIG. 6. Synthesis-dependent strand-annealing model (30). (a) Recombination is initiated by a DSB in one duplex. (b) Both ends of the DSB are processed by exonucleases that digest 5' ends quicker than 3' ends to form a double-strand gap with 3' single-strand overhangs. (c) The 3' ends independently invade a homologous duplex and transiently displace only a local loop or "bubble" of DNA. hDNA formation is also only transient. (d) Primed from the invading 3' ends, new single strands are synthesized via a "bubble migration" mechanism using the intact strands of the invaded homologous duplex as template. The bubble migration mechanism proposes that the bubble is collapsed behind the DNA polymerase by rapid displacement of the newly synthesized single strands from the template. (e) Following synthesis, the new strands are completely displaced and anneal to one another. (f) The gap is then completed by extension of the ends of the annealed strands using each other as templates.
FIG. 7. Single-strand annealing model (33, 34). (a) Recombination is initiated by a DSB between flanking directly repeated homologous sequences (shaded rectangles). (b) Both ends of the DSB are subjected to extensive single-strand 5'→3' exonuclease digestion until flanking homologous single-strand regions are exposed. (c) The complementary single strands anneal. (d) The nonhomologous tails are removed. (e) DNA repair synthesis and ligation yield a deletion product in which the intervening sequence is lost.
Newly synthesized DNA, spanning the DSB and primed from the invading 3' ends, is released from the templates and reanneals to form a duplex. Such transient formation of hDNA means that there is no Holliday junction intermediate. This accounts for P-element recombination only generating gene conversion-type recombinants, and a similar model (32) has been proposed for mating-type switching in S. cerevisiae, which also involves only conversion-type recombinants.
An alternative model for DSB-induced recombination, based on transformation experiments in mammalian cells (33, 34), has been proposed for DNA molecules in which the DSB is flanked by homologous sequences. The single-strand annealing (SSA) model (Fig. 7) proposes that the ends of a DSB undergo extensive bidirectional 5'→3' exonuclease digestion, until complementary regions in direct repeats are exposed and can reanneal. Removal of nonhomologous tails, followed by DNA repair synthesis and ligation, completes the recombination process. The SSA model is inherently nonconservative, as the DNA between the two flanking homologous sequences is degraded and the two parental homologous sequences give rise to one recombinant duplex. It does not involve strand invasion or formation of a Holliday junction, but does involve hDNA formation at the annealed junctions (Fig. 7e) that is subject to mismatch repair. This model is supported by the analysis of products of DNA injected into Xenopus oocytes (35, 36), and by the analysis of the kinetics of DSB-induced recombination between direct repeats in S. cerevisiae (19, 37). It should be noted that these recombination events between repeated sequences could also be accounted for by a nonconservative one-sided recombination mechanism (27).
None of the models for homologous recombination in eukaryotes is consistent with all the available data from the extensive studies on recombination in a variety of systems. This is most likely because recombination in eukaryotes, like that in prokaryotes, involves multiple pathways utilizing different mechanisms. Nevertheless, most models postulate at least six stages: (i) initiation involving formation of a single-strand nick, a DSB, or a DSG, followed by formation of single-stranded DNA; (ii) presynapsis involving activation of the single strand to allow homology searching; (iii) the search for homology and homologous DNA pairing; (iv) strand exchange leading to hDNA formation; (v) Holliday junction formation and branch migration; and (vi) resolution of Holliday junctions to yield recombinant products.
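The nonconservative outcome of the single-strand annealing model can be made concrete with a short Python sketch. It is a string-level illustration only, under stated assumptions: the duplex is represented by its top strand, the two repeat copies are assumed to be identical, and the function name single_strand_annealing and the example sequences are hypothetical. Resection, annealing, and tail removal are collapsed into one step that keeps a single repeat copy and discards everything between the two copies.

```python
def single_strand_annealing(seq, repeat, break_pos):
    """Return the SSA deletion product for a DSB at index break_pos.

    seq       : top strand of the substrate (string)
    repeat    : sequence of the direct repeat flanking the break
    break_pos : index at which the DSB is made

    Returns None if the break is not flanked by a repeat copy on
    both sides, in which case SSA cannot occur in this toy model.
    """
    # Nearest repeat copy lying entirely to the left of the break.
    upstream = seq.rfind(repeat, 0, break_pos)
    # Nearest repeat copy starting at or to the right of the break.
    downstream = seq.find(repeat, break_pos)
    if upstream == -1 or downstream == -1:
        return None
    # Annealing of the exposed repeats leaves one copy; the intervening
    # sequence and the second copy are lost from the product.
    return seq[:upstream] + repeat + seq[downstream + len(repeat):]


substrate = "TTT" + "ACGTAC" + "GGGGCCCC" + "ACGTAC" + "AAA"
product = single_strand_annealing(substrate, "ACGTAC", break_pos=13)
print(product, len(substrate) - len(product))
```

In this example the product retains one repeat between the original flanking sequences, and the loss equals one repeat plus the intervening sequence, which is the deletion-type recombinant described for the model.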
II. Double-Strand Break-Induced Mitotic Recombination
As mentioned previously, mitotic recombination is important for the repair of DNA DSBs that can arise naturally during the life cycle of a cell or in response to a DNA-damaging agent. The consequences of unprocessed DSBs are blockage of DNA replication and loss of genome integrity leading to lethality. Moreover, DSB-induced chromosomal mitotic recombination is involved in a number of basic cellular processes seen in a wide variety of eukaryotic systems, including mating-type switching in S. cerevisiae (38) and S. pombe (39); transposition of P-elements in Drosophila melanogaster (31); and mammalian site-specific V(D)J gene rearrangements that give rise to immunoglobulin and T-cell receptor diversity (40). There are several different fates of a mitotic DSB: homologous recombination, nonhomologous (illegitimate) recombination, or the addition of new telomeres at the break site. All of these types of mechanisms have been observed in eukaryotes as diverse as yeasts and mammals, although the relative
efficiencies of these events vary considerably in different eukaryotic cells. In S. cerevisiae, DNA DSBs are primarily processed by homologous recombination pathways (reviewed in 41). In contrast, in mammalian cells evidence suggests that an illegitimate recombination mechanism, DSB end-joining, rather than homologous recombination is the prevailing mechanism (42, 43). Other eukaryotes appear to lie in between these two extremes with respect to relative efficiencies of homologous and illegitimate recombinational processing of DSBs. It is not clear whether these differences reflect additional capacity for nonhomologous DNA DSB joining or diminished pathways for homologous recombination in other eukaryotes compared to S. cerevisiae. DNA-damaging agents that induce DNA DSBs, such as ionizing radiation, have been shown to stimulate mitotic chromosomal recombination in a wide variety of organisms. This could lead to conversions, crossovers, deletions, duplications, inversions, and translocations. Although such studies have been very informative, and were in part responsible for Resnick's model of DSB-induced recombination (12; Fig. 2), the damage-induced DSBs are difficult to study because they are randomly distributed and infrequent at biologically relevant doses. Generalized spontaneous mitotic recombination also occurs randomly at low frequency. The genetic and physical consequences of a mitotic DSB are best examined at a defined DSB.
Three approaches to studying the fate of a defined mitotic DSB have been employed in eukaryotes: (i) using extrachromosomal substrates with defined DSBs introduced in vitro prior to delivery into cells; (ii) examining recombination associated with the cellular processes involving a defined DSB mentioned earlier (i.e., mating-type switching in yeasts, P-element transposition in Drosophila melanogaster, and V(D)J recombination in mammalian cells); and (iii) using components of some of these endogenous systems to induce site-specific DSBs elsewhere in the genome. This section focuses on current studies of defined DSB-induced mitotic homologous recombination in S. cerevisiae, but also includes studies on nonhomologous recombination and studies in other eukaryotes.
A. Recombination Events Associated with DSBs Induced in Vitro
Extrachromosomal circular DNA molecules linearized with DSBs and DSGs produced in vitro have been introduced into cells of many eukaryotic organisms. This has allowed the fates of the in vitro-produced DSBs and DSGs to be monitored in vivo. In these experiments, recombination efficiency is inferred from transformation efficiency. During transformation of S. cerevisiae, it was found that DSBs and DSGs stimulate transformation frequencies by as much as 3000-fold (13, 14). The integration of linearized nonreplicative plasmids bearing DSBs or DSGs within sequences homologous
to chromosomal sequences occurs by a homologous recombination mechanism (13, 14). Integration is accompanied by the repair of the termini through a gene conversion event copying the missing plasmid-borne information from homologous chromosomal sequences. Replicative plasmids with DSGs undergo similar events, with equal numbers of transformants containing integrated (gap repair with crossing over) and nonintegrated (gap repair without crossing over) plasmids. Recircularization during transformation of a replicative plasmid linearized by a DSB was found to occur at high efficiency by recombination using either a homologous chromosomal sequence (13) or a homologous sequence on a co-transformed plasmid (44). Reducing homology also reduced the frequency of recombination events associated with DSBs and DSGs (45). As mentioned previously, these observations were, in part, responsible for the DSB-gap repair model for homologous recombination (15; Fig. 3). Replicative plasmids linearized by a DSB and bearing homologous repeats could also recircularize to yield deletion-type recombinants associated with the loss of one repeat and the intervening sequences (46, 47). It was suggested (47) that these recombinants could arise via three pathways: gene conversion associated with reciprocal exchange, nonconservative one-sided invasion events, and SSA. These and numerous other such studies (reviewed in 6 and 48) suggest that in vitro-produced, plasmid-borne DSBs and DSGs are primarily processed by homologous recombination mechanisms in S. cerevisiae. However, nonhomologous recombination events occurring at low frequency (at least 100-fold less efficient than homologous recombination events) have also been described. When S.
cerevisiae cells are transformed with linearized replicative plasmids lacking homology to genomic DNA, recircularization can occur via direct rejoining of the ends (46), or via interaction of the ends of two linear molecules, resulting in the formation of head-to-head plasmid dimers (49). These illegitimate recombination events are associated with short deletions or insertions around the ends of the DSB. Nonhomologous integration of linearized plasmids utilizing little or no (4 bp or less) end-sequence homology can also occur at low frequency in S. cerevisiae and is also associated with small deletions or insertions at the join sites (50, 51). Illegitimate recombination in S. cerevisiae may reflect the existence of a separate end-to-end joining mechanism; alternatively, it has been suggested (52, 53) that it occurs via mechanisms similar to SSA or one-sided events involving small stretches of overlapping, locally homologous sequences (microhomology). Studies with extrachromosomal substrates in several other eukaryotes have defined similar homologous and illegitimate recombination activities, although illegitimate events are more efficient in other eukaryotes than in S. cerevisiae. For example, in S. pombe, although linearized self-replicating plasmids are efficiently recircularized by homologous recombination in
the presence of homology, 1 in 26 (compared to less than 1 in 100 for S. cerevisiae) are recircularized by nonhomologous end-joining (54). In the absence of homology, recircularization by nonhomologous end-joining pathways is efficient in S. pombe (54). As with S. cerevisiae, end-joining involved interaction of short patches (1-4 bp) of sequence homology and generated deletions at the ligation points. Similarly, integration of linearized plasmids by homologous recombination is efficient in S. pombe, but there is a higher frequency of nonhomologous integration compared to S. cerevisiae (55, 56). In mammalian cells, illegitimate events are the primary mechanism for processing DSBs and DSGs in extrachromosomal substrates. Despite the great desire for efficient gene targeting in mammalian cells, less than 1 in 100 DSB-induced integrative transformation events involves homologous recombination. Nevertheless, DSBs and DSGs do stimulate homologous integration events in mammalian cells, and strategies have been developed to select for homologous integration against a large background of nonhomologous integrations (e.g., 57, 58). In many respects these rare homologous recombination events strongly resemble the well-characterized events in S. cerevisiae. However, there are some notable differences, since homologous DSB-induced integrative transformation can be accomplished in mammalian cells by one-sided events when only one end of the transforming DNA integrates by homologous recombination (reviewed in 27). DSBs and DSGs induce extrachromosomal homologous recombination between co-injected or co-transfected DNA molecules in mammalian cells (reviewed in 59, 60), albeit at a reduced frequency compared to S. cerevisiae. Again, these homologous recombination events, involving gap filling, conversion, and reciprocal exchange, strongly resemble those in S.
cmeuisiae.As mentioned previously, it was also shown that SSA is a major homologous recombination pathway for linearized extrachromosomal molecules containing direct repeats transformed into mammalian cells (33,34) or injected into Xenopus oocyte nuclei (35,36).SSA has also been invoked to explain some types of recombination between linearized plasmids in the smut fungus Ustilago muydis (61) and plant cells (62). Efficient illegitimate recombination events, end-joining, and end-to-end joining of linearized extrachromosomal DNA molecules has been observed in Xenopus (63, 6 4 , mammalian cells (42, 43) and other eukaryotes. In contrast to the similarities of recombination pathways for processing DSBs and DSGs among eukaryotes,there are also notable exceptions. For example, in a study in U. muydis (65) it was found that recombinational repair of a plasmid-borne DSG gap using chromosomal sequences was only very rarely accompanied by crossing over. Processing of the DNA ends flanking the gap was unequal, and a migrating D-loop model was proposed (65), similar to the synthesis-dependent strand annealing model shown in Fig. 6.
B. Recombination Events Associated with Artificial Site-Specific DSBs Induced in Vivo

1. RECOMBINATION EVENTS IN S. CEREVISIAE

HO endonuclease initiates mating-type switching in S. cerevisiae by producing a DSB at its target site, the Y/Z junction within the recipient MAT locus (66, 67). I-Sce-I is a mitochondrial intron-encoded, site-specific endonuclease that produces a DSB at its target site to initiate insertion of the intron into a new site (68). A modified version of I-Sce-I can be expressed in the nucleus. In S. cerevisiae both of these site-specific endonucleases, under the control of inducible promoters, have been used to study recombination initiated by a single, site-specific in vivo DSB by inserting their respective recognition sites at specific locations within defined DNA sequences (reviewed in 3). In all cases DSB induction stimulated recombination. Only a few studies have utilized the I-Sce-I system (69, 70) compared to the numerous studies with HO. The studies with I-Sce-I used recombination substrates similar to those used in the studies with HO endonuclease. An important consideration is that the I-Sce-I-induced and the HO-initiated events, described later, are indistinguishable. This argues that HO and I-Sce-I endonucleases play no role in the recombination events under study other than DSB induction. HO endonuclease has been used extensively to initiate DSB-induced intramolecular recombination events between repeated sequences in S. cerevisiae, both intrachromosomal and intraplasmid, by inserting the HO recognition site within artificially created duplications. Typically, the duplication consists of either two different alleles of the same gene or two overlapping segments of a gene. The repeated sequences can be in direct or inverted orientation and are typically separated by unique DNA, commonly with a marker gene also present within the intervening sequence.
The HO recognition site has been placed either within duplicated DNA in one of the repeat elements (19, 28, 37, 71-73), or within unique DNA in the intervening sequence between the repeats (20, 72, 74). The use of such substrates has allowed a genetic analysis of DSB-induced recombination by the recovery and analysis of recombinants arising from interaction between the repeated elements. Two main classes of recombinants were recovered: conversion-type recombinants, which still have two copies of the repeat element and have presumably arisen by nonreciprocal transfer of information from one element to the other without loss of the intervening sequences; and deletion-type recombinants, which have a single copy of the repeat element with accompanying loss of the intervening sequences. The relative frequencies of conversion-type and deletion-type recombinants depended on whether the DSB was induced within the homologous sequences of one of the repeated sequences, or within the unique sequences
between the repeats. In studies in which the in vivo DSB was made in duplicated DNA, both conversion-type and deletion-type events were stimulated, for both intrachromosomal (28, 71-73) and intraplasmid (19, 28, 37) recombination substrates. The spectrum of DSB-induced recombination events depended on the particular substrate employed but was the same as that of spontaneous events, indicating that the pathways involved in spontaneous and DSB-induced mitotic recombination may be the same. For both direct and inverted repeats, in the induced conversion-type recombinants the cleaved repeat sequence acted almost exclusively as the recipient of genetic information, and these recombinants could be accounted for by a DSB-gap repair pathway. With the repeats in inverted orientation, it was shown that deletion-type recombination was also consistent with a conservative DSB-gap repair pathway, that is, a conversion event associated with reciprocal crossing over (28). For repeats in direct orientation, spontaneous and induced deletion-type recombinants could result by any of the following means: gene conversion associated with crossing over; an unequal sister chromatid exchange at G2 (for chromosomal substrates); a nonconservative one-sided strand invasion pathway; or SSA. If gene conversion is accompanied by reciprocal crossing over, the segment of DNA that is internal to the two halves of the repeat will be excised as a circle. However, the majority of spontaneous and induced conversion-type events were not accompanied by crossing over, and for chromosomal substrates unequal sister chromatid exchange only occurred at very low levels (19, 28, 37, 71-73, 74), suggesting that deletion-type recombinants arose via the SSA pathway and/or via a nonconservative one-sided strand invasion pathway. At present it is not clear whether deletion-type recombinants arise via SSA or nonconservative one-sided events, or via both mechanisms, as suggested by Prado and Aguilera (47).
For intrachromosomal and extrachromosomal recombination substrates in which the DSB was induced within unique DNA between direct repeats, there was a predominance (>99%) of deletion-type mitotic recombinants (20, 71, 72, 74). Their production was also shown to be consistent with an SSA mechanism. In addition to the genetic analysis, these DSB-induced recombination events could also be monitored physically. The use of inducible promoters to express HO endonuclease allows the in vivo DSBs to be produced synchronously. Subsequent steps in recombination, and the appearance of recombination intermediates and final products, could then be followed over time by physically analyzing DNA extracted from cells. A physical analysis of the kinetics of both DSB-induced conversion-type and deletion-type product formation provided evidence that DSB-gap repair and SSA (and/or one-sided events) are two independent competing pathways of DSB-induced recombination with a common intermediate (19, 28). There was a distinct difference
in the time of appearance of gene conversion- and deletion-type products, and the appearance of deletion-type recombinants could be delayed by increasing the distance between repeats without affecting the appearance time of conversion-type recombinants (19). The likelihood of conversion-type recombination increased with increasing distance between repeats. These studies identified 3' single-stranded DNA on both sides of the DSB as recombination intermediates (19, 20). These are intermediates common to the modified DSB repair model (18) and the SSA model (33, 34). Although nonconservative, one-sided strand-invasion models envisage extensive single-strand tails on one side of the break, it is possible nevertheless that such pathways still contribute to recombination between repeated sequences to give deletions. Deletion-type recombination between regions flanking a DSB, presumably by SSA, was linearly dependent on the length of flanking homology, and appeared to have a minimum homology requirement of 65-90 bp (20). This requirement may reflect the minimum length needed for homology searching, or it may be the length needed to form a stable intermediate structure. SSA as an alternative recombination pathway appears to be as efficient as DSB-gap repair (19). HO-induced DSBs introduced in the ribosomal DNA or in the CUP1 tandem gene arrays also stimulated recombination events resulting in loss of one or more repeat units (76), consistent with the observations for the artificial duplications. In vivo HO endonuclease-induced DSBs have also been used to investigate recombination in other types of substrate in S. cerevisiae. For example, recombination between plasmid and chromosomal homologous sequences initiated by an HO DSB was used to investigate conversion tract length and directionality (unidirectional or bidirectional) and the effects of nonhomology or homeology at the ends (77, 78).
Interchromosomal mitotic recombination was also stimulated by an HO-induced DSB in one of the participating chromosomes in diploids of S. cerevisiae (79, 80). An HO-induced DSB stimulated biparental recombination between his3 heteroalleles on heterologous chromosomes (79). The DSB was made at a site 8.6 kb from one of the his3 heteroalleles. The cleaved chromosome acted as recipient of genetic information, and recombination was accompanied by repair of the DSB. In most cases the DNA between the break site and the his3 heteroallele was intact and did not show the enhanced recombination that most recombination models would predict, prompting the suggestion of a discontinuous hDNA model (79). In the other study, interchromosomal recombination in diploids was monitored between homologous sequences at allelic sites (80). In most recombinants the chromosome with the DSB acted as recipient of genetic information. In general most of the data were consistent with DSB-gap repair, with the formation of DSGs from a few hundred to a few thousand bases in size that are repaired by information from the uncut chromosome. However, heteroduplexes flanking the DSB could also be generated, resulting in discontinuous conversion tracts, and in some cases the cut chromosome acted as the donor of genetic information. An inducible HO endonuclease was also used to show that, in the absence of homologous recombination, in vivo DSBs at the MAT locus could be processed by nonhomologous end-joining, resulting in small deletions or insertions at the join sites (52). Evidence suggests that these deletions and insertions are formed by different nonhomologous end-joining pathways in S. cerevisiae (81). Finally, HO endonuclease was used to investigate the formation of new telomeres at chromosome break sites in S. cerevisiae (82, 83).

2. RECOMBINATION EVENTS IN OTHER EUKARYOTES

a. Intrachromosomal Recombination in S. pombe. It has been shown that DSBs at the mating-type locus can initiate mitotic recombination in S. pombe (39). These studies illustrated some of the general features of the DSB-gap repair model. No other studies had examined DSB-induced mitotic intrachromosomal recombination at loci other than the mating-type loci in S. pombe. In our laboratory we sought to determine whether the pathways of DSB-induced intrachromosomal recombination in S. cerevisiae were conserved in S. pombe. We showed that the S. cerevisiae HO endonuclease, expressed from an inducible S. pombe promoter, and its MATa target site could successfully be used to introduce site-specific DNA DSBs in vivo within intrachromosomal recombination substrates in S. pombe (84). The recombination substrates were similar to those used in S. cerevisiae and consisted of nontandem direct repeats of ade6 heteroalleles. The MATa cutting site was located either within duplicated DNA in the left-hand ade6 heteroallele or within unique DNA between the ade6 repeats.
Induction of DSBs resulted in a 2000-fold stimulation in the frequency of recombinants compared to spontaneous events. The DSB-induced recombination frequency was high enough that it was not necessary to select for recombinants and all cells were analyzed, permitting an unbiased evaluation of all the different fates of the recombination substrate. Analysis of the recombinants illustrated that DSB-induced intrachromosomal mitotic recombination in S. pombe was very similar to that in S. cerevisiae. When the DSB was located in duplicated DNA in one of the ade6 heteroalleles, both conversion-type and deletion-type recombinants were induced, and in the same relative proportions as spontaneous recombinants. This suggested that the majority of spontaneous recombinants could also be arising due to spontaneous DSBs within duplicated DNA. For DSB-induced
conversion-type recombinants, the copy of ade6 in which the DSB was made was the recipient of genetic information, which is a prediction of the DSB-gap repair model for recombination (15). Several different types of conversion-type recombinants were observed: those in which only the MATa site was lost; those that co-converted both the MATa site and the ade6 mutation to wild type; and those that converted all the information of the recipient heteroallele to that of the donor heteroallele. When the DSB was situated within unique DNA between the ade6 heteroalleles, over 99.8% of DSB-induced recombinants were deletion types. No ade6 triplications, which are diagnostic of sister-chromatid reciprocal exchanges, were observed in our study, regardless of whether the DSB was made in duplicated or unique DNA. DSB-gap repair and SSA (and/or nonconservative one-sided events) could account for the data. DSB-induced conversion-type recombinants, in which all the information of the recipient heteroallele is converted to that of the donor heteroallele, show that the DSG or hDNA postulated by the DSB-gap repair model (Fig. 3), or the 3' single-strand tails postulated by the modified DSB repair model (Fig. 4), could be extensive, covering almost the entire ade6 locus. Our results also suggested that, during SSA, 5' → 3' exonuclease digestion on both sides of the DSB exposed extensive, more than 1-kb, complementary homologous 3' single-strand regions in the two ade6 repeats. Annealing would result in extensive hDNA formation covering almost the entire ade6 locus, with hybrid DNA at both the ade6 heteroallelic mutation sites (located 1.3 kb apart on the ade6 locus). The mutated bases of these two hybrid sites are in trans with respect to which strand of the duplex they are located on (i.e., +/- and -/+), and a careful analysis of Ade- deletion-type recombinants revealed the absence of final recombinants that retained both mutations.
This suggested that either of the single strands of the hDNA covering the entire ade6 locus was subject to unidirectional mismatch repair. In addition, genetic intermediates in the form of half-sectored colonies were isolated, analyzed, and interpreted as evidence of hDNA formation during the SSA pathway.
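The genetic classes used throughout this section (conversion-type, deletion-type, and triplication) are defined by how many copies of the repeat remain and whether the intervening unique DNA is retained. The following is a minimal sketch of that bookkeeping, not the scoring procedure used in any of the cited studies; the function name, the argument names, and the extra "parental" and "unclassified" categories are assumptions introduced here for illustration.

```python
from enum import Enum


class RecombinantClass(Enum):
    PARENTAL = "parental (no change detected)"
    CONVERSION = "conversion-type"
    DELETION = "deletion-type"
    TRIPLICATION = "triplication"
    UNCLASSIFIED = "unclassified"


def classify(repeat_copies: int, intervening_present: bool,
             allele_changed: bool) -> RecombinantClass:
    """Assign a direct-repeat substrate to one of the classes described in the text.

    repeat_copies       -- number of copies of the repeated element recovered
    intervening_present -- True if the unique DNA between the repeats is retained
    allele_changed      -- True if information was transferred between the repeats
    (All three inputs are hypothetical observations, e.g. from blots or markers.)
    """
    if repeat_copies == 1 and not intervening_present:
        # Single copy with loss of the intervening sequence.
        return RecombinantClass.DELETION
    if repeat_copies == 2 and intervening_present:
        # Two copies retained; a conversion only if information was transferred.
        if allele_changed:
            return RecombinantClass.CONVERSION
        return RecombinantClass.PARENTAL
    if repeat_copies == 3:
        # Described in the text as diagnostic of sister-chromatid reciprocal exchange.
        return RecombinantClass.TRIPLICATION
    return RecombinantClass.UNCLASSIFIED


print(classify(1, False, False).value)
print(classify(2, True, True).value)
```

The classes are deliberately defined by structure alone: as the text notes, a deletion-type product does not by itself reveal whether it arose by SSA, a one-sided event, or conversion with crossing over.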
b. Recombination Events in Higher Eukaryotes. Previously, induction of in vivo DSBs in higher eukaryotic cells relied on introducing bacterial restriction endonucleases into the cells by electroporation (85, 86). However, introduction of restriction endonucleases into higher eukaryotic cells causes wholesale genomic breakage that may induce cellular responses to global damage and obscure the effect of a single DSB. In recent years, S. cerevisiae I-Sce-I endonuclease has been introduced into higher eukaryotic cells to induce a single in vivo site-specific DSB into DNA containing the I-Sce-I recognition site. The I-Sce-I recognition site, which is 18 bp in length (87), is unlikely to occur randomly in the genomes of higher eukaryotes. I-Sce-I endonuclease has been introduced into the cells either by electroporation (88, 89) or by in vivo expression (90-93). In vivo I-Sce-I-induced DSBs have been used to study extrachromosomal, extrachromosomal-chromosomal, and chromosomal recombination in higher eukaryotic cells. I-Sce-I endonuclease-induced DSBs stimulated both intramolecular and intermolecular homologous recombination in extrachromosomal substrates in mammalian cells (91), plant cells (93), and Xenopus oocyte nuclei (94). DSB-induced recombination between direct repeats seemed to proceed via an SSA mechanism (91, 93). In contrast, DSBs in extrachromosomally replicating plasmids, generated in vivo by electroporation of restriction enzymes, were processed primarily by nonhomologous end-joining in mammalian cells (85). Given the desire to effect efficient, precise gene targeting in mammalian cells, and the fact that most integrative transformation events occur randomly (even if the extrachromosomal molecule has a DSB), attention has been focused on recombination between sequences on extrachromosomal DNA molecules and homologous chromosomal sequences. In previous targeting experiments, the exogenous vector DNA had a DSB, but the chromosomal target did not. It was of great interest to determine whether a site-specific in vivo chromosomal DSB could stimulate homologous recombination with a homologous extrachromosomal sequence. Two reports (90, 92) have described the expression of the yeast endonuclease I-Sce-I in mouse cell lines to create site-specific in vivo chromosomal DSBs. The DSBs stimulated, by two to three orders of magnitude, homologous recombination between the two chromosomal sequences flanking the break and two homologous regions on a transfecting circular targeting vector, resulting in targeted integration of the vector. A DSB repair pathway, an SSA mechanism, or one-sided homologous recombination could account for the events.
Nonhomologous end-joining between the cleaved chromosomal ends was also stimulated, either by direct ligation or associated with small deletions resulting from joining through short sequence homologies. One report (88) described the direct electroporation of I-Sce-I enzyme together with a targeting vector into mouse cells. Although cleavage and repair of the chromosomal target took place, no homologous recombination between the targeting vector and the chromosomal target was detected, in contrast to the other studies. Recombination in Xenopus oocyte nuclei was monitored between homologous sequences on two extrachromosomal molecules, a linear DNA molecule and a circular one containing an I-Sce-I recognition site, in a system designed to mimic a gene-targeting experiment based on an SSA mechanism (94). The linear DNA molecule contained two regions each homologous to the sequences flanking the DSB in the circular DNA molecule. In vivo DSB cleavage of the circular DNA stimulated homologous recombination with the linear DNA, resulting in a joint molecule. In plant cells, I-Sce-I-induced
chromosomal DSBs stimulated homologous integration of extrachromosomal targeting vectors by two different pathways, a DSB-gap repair mechanism involving both homologous ends and one-sided events involving one homologous end (95). In mammalian cells, using intrachromosomal recombination substrates consisting of nontandem direct repeats, it was shown that in vivo I-Sce-I- or restriction enzyme-induced DSBs located within the duplicated homologous regions stimulated predominantly homologous recombination events (10-fold increase), with only a minority involving nonhomologous end-joining (89). Restriction enzyme-induced DSBs outside of the repeated regions, or between them, produced no change in recombination frequency. Godwin et al. (96) examined the effect on interchromosomal recombination between homologous sequences of in vivo DSBs induced by electroporation of a restriction enzyme. In their experiments they detected only nonhomologous end-joining events. Similarly, chromosomal deletions associated with nonhomologous end-joining have been shown to result from electroporation of mammalian cells with restriction enzymes (86). Thus in vivo DSBs have been shown to stimulate predominantly homologous, nonhomologous, or both types of recombination depending on the systems used. The differences between these various results may have to do with variations in recombination substrates, the endonucleases chosen, or the methods used to introduce endonucleases into the cell.
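The remark in this subsection that an 18-bp recognition site is unlikely to occur randomly in the genomes of higher eukaryotes, whereas restriction endonucleases cause wholesale genomic breakage, can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a random sequence with equal base frequencies and independent positions, and uses an illustrative genome size of about 3 x 10^9 bp and a 6-bp restriction site; neither figure comes from the text, and real genomes depart from these assumptions.

```python
def expected_site_count(genome_bp: float, site_len: int,
                        both_strands: bool = True) -> float:
    """Expected number of exact matches to one fixed site in a random sequence.

    Assumes equal base frequencies (probability 1/4 per base) and independence
    between positions; both are simplifying assumptions.
    """
    per_position = 0.25 ** site_len
    positions = max(genome_bp - site_len + 1, 0)
    # A non-palindromic site can be read from either strand; a palindromic
    # site reads identically on both, so it should be counted once.
    strands = 2 if both_strands else 1
    return positions * per_position * strands


GENOME_BP = 3e9  # illustrative size only (assumption, not from the text)

# 18-bp non-palindromic site: expectation well below one per genome.
print(expected_site_count(GENOME_BP, 18))

# 6-bp palindromic restriction site: several hundred thousand expected sites.
print(expected_site_count(GENOME_BP, 6, both_strands=False))
```

Under these assumptions the 18-bp site is expected fewer than 0.1 times per genome, while a 6-bp site is expected roughly 7 x 10^5 times, which is the quantitative sense in which a restriction enzyme obscures the effect of a single DSB.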
C. Biological Systems Utilizing Naturally Occurring Site-Specific DSBs Induced in Vivo

As mentioned, several endogenous recombination systems in mitotic cells are associated with site-specific DSBs. An investigation of these systems has provided additional insights into the molecular mechanisms of DSB-induced recombination. It is beyond the scope of this review to describe studies with these systems in detail, but readers are directed to the reviews cited. Homothallic switching of the mating-type genes in S. cerevisiae occurs by a highly regulated site-specific homologous recombination event (for reviews see 32, 38). HO endonuclease makes a DSB in the MAT locus at its recognition site near the MAT Y/Z border. This stimulates gene conversion to replace DNA at the MAT locus with sequences copied from one of two unexpressed donor loci, HML or HMR, located on the same chromosome. MAT conversion is not accompanied by reciprocal crossing over. The expression of HO endonuclease is normally confined to the G1 phase of the cell cycle and to cells that have previously divided. However, the use of an inducible HO gene made it possible to produce a DSB in the MAT locus at any time and follow the process kinetically. An early intermediate was a single
long 3' single-strand tail beginning at one end of the induced DSB, followed by its invasion of the donor site and elongation by copying of the donor sequences (21). Strand invasion generated hDNA that was rapidly mismatch repaired, nearly always in favor of the donor sequences (97). Mating-type switching in S. pombe is similar to that in S. cerevisiae in that it involves initiation by a DSB at the mat1 locus that stimulates gene conversion to replace DNA at the mat1 locus with sequences copied from one of two unexpressed donor loci, mat2-P or mat3-M, located on the same chromosome (reviewed in 39). Detailed analysis of P-element transposition in Drosophila (reviewed in 31) showed that it also involves a site-specific DNA DSB resulting in conversion without an associated crossover. It was shown that DSB-induced recombination events described in S. cerevisiae also occurred in Drosophila. These studies complement those from lower eukaryotes and reveal new aspects of recombination, giving rise to the synthesis-dependent strand-annealing model of homologous recombination (Fig. 6). In vertebrates, V(D)J recombination occurs during B and T lymphocyte development and is responsible for the tremendous diversity in antibody and T-cell receptor specificities (reviewed in 40). During V(D)J recombination, three gene segments, the variable (V), joining (J), and diversity (D) elements, occurring at distinct locations in germ cells, become rearranged into a contiguous exon. V(D)J recombination is initiated by site-specific DSBs, acts between specific signal sequences, and does not require extensive sequence homology, although short sequence homologies have been implicated in coding joint formation. Attention has been focused on the mechanism of V(D)J recombination, since it was shown that there is significant overlap between V(D)J recombination and the ubiquitous nonhomologous end-joining mechanism for processing of DSBs in all other cell types (reviewed in 98 and 99).
III. Double-Strand Break-Induced Meiotic Recombination

Homologous recombination is a major feature of meiosis in sexually reproducing plants and animals. Meiosis is the central vehicle for the exchange of genetic information in eukaryotes, and it is in meiosis that recombination in eukaryotes achieves its highest frequency. Genetic studies in many organisms have shown that recombination between homologous chromosomes, as well as being responsible for the reassortment of the genetic material, is necessary for proper disjunction during meiosis I in organisms in which recombination usually occurs (reviewed in 100). In the classical view of meiosis, the process that brings about homologous alignment of chromosomes in close apposition (chromosome synapsis) during meiotic prophase I, culminating in the formation of the synaptonemal complex, occurs prior to, and is required for, meiotic recombination. However, despite inconsistencies and exceptions, alternative ideas have developed in which DSB formation in early meiosis I prophase initiates genome-wide searches for homology and DNA-DNA exchanges, which precede and mediate homologous chromosome pairing and synaptonemal complex formation (reviewed in 101-103). One interpretation is that mechanisms for the repair of DSBs have been recruited from somatic cells to function in meiosis as a homology-seeking mechanism (104). Most of what we know about the molecular biology and genetics of meiosis comes from work in lower eukaryotes, and in the last few years many of the details have come from work in S. cerevisiae. This section focuses on current compelling evidence that DSBs initiate meiotic recombination in S. cerevisiae. Meiotic recombination events, either crossovers or gene conversions, occur at high frequencies in certain regions (hot spots) of the S. cerevisiae genome (reviewed in 105). These hot spots are associated with elevated levels of meiosis-specific DSBs. Hot spots associated with DSBs have been localized near the ARG4, HIS4, HIS2, and CYS3 loci, a centromere-linked region of chromosome III, and a Tn3-derived transposable element (see 105 and references therein). A prominent DSB has also been associated with a high level of recombination observed in an artificial hot spot, HIS4-LEU2, created by the insertion of a LEU2-containing fragment distal to the HIS4 gene on chromosome III (see 105 and references therein). As well as occurring at these hot spots, meiosis-specific DSBs have also been detected at a number of preferred sites on every chromosome assayed in S. cerevisiae (104, 106, 107). Several lines of evidence suggest that these DSBs are responsible for the initiation of meiotic recombination. First, these DSBs appear at the time of commitment to recombination.
Second, mutations that alter recombination frequencies at hot spots also alter the frequency of nearby meiosis-specific DSBs in a directly correlated way. Third, the position of the DSB correlates with gene conversion polarity. Meiotic DSBs are not DNA sequence specific, but occur preferentially in intergenic regions that contain transcription promoters and are hypersensitive to nuclease digestion in chromatin isolated from both meiotic and vegetative cells (107-109), indicating that chromatin structure plays a major role in determining the sites of meiotic DSBs and transcriptional regulation. Meiotic DSB cleavage at these sites must be catalyzed either by a meiotically induced endonuclease or by a constitutive endonuclease that is somehow activated or recruited by meiosis-specific gene products. Evidence suggests an interaction between the meiosis-specific endonuclease and transcription factors (109). DSBs at hot spots are processed by 5' → 3' DNA exonuclease activity to
generate DNA molecules with 3' overhangs several hundred base pairs in length (18, 110), similar to those observed during mitotic recombination. These tails are presumably used to form strand-exchange products. Double Holliday junction intermediates in S. cerevisiae meiotic recombination have been described (29). The M26 point mutation of the ade6 gene is the best studied meiotic hot spot in S. pombe (111). In vitro mutational analysis showed that M26 creates a specific 7-bp sequence that is crucial for hot spot activity (112), although this sequence is not sufficient to create a hot spot when inserted into other chromosomal locations (112, 113). Proteins binding to this heptanucleotide sequence have been purified (114). Meiotic and mitotic recombination in ade6 can also be increased by fusing the gene to a strong ADH1 promoter (115). However, to date, DSBs have not been shown physically to be associated with meiotic recombination in S. pombe, as they have in S. cerevisiae. DSBs have also not been shown to be associated with the M26 hot spot physically, and genetic evidence is consistent with M26 creating an initiation (or termination) site for gene conversion by the introduction of a single-strand break in its vicinity (116). In contrast to most other eukaryotes, meiotic recombination in S. pombe occurs in the absence of detectable synaptonemal complexes and crossover interference (reviewed in 117). Given this fundamental difference, we were interested in whether DSBs are the initiators of meiotic recombination in S. pombe. It had previously been shown that mitotically induced DSBs at the mat1 mating-type locus of S. pombe persisted during mating-stimulated homologous meiotic recombination (118). However, DSBs as stimulators of meiotic recombination at other loci in S. pombe had not been investigated. In S. cerevisiae, HO-induced, site-specific in vivo DSBs were shown to stimulate meiotic intrachromosomal recombination between nontandem direct repeats of &4 heteroalleles (73). In our experiments, crosses involved two S. pombe strains of opposite mating type. One strain contained direct repeats of ade6 heteroalleles, with the HO recognition site either in unique or duplicated DNA, but with no HO gene. The other strain contained the HO gene downstream of an inducible S. pombe nmt1 promoter, but no HO recognition site. The experiments were thus designed so that the HO endonuclease and HO recognition site would only come together during meiotic nuclear fusion, ensuring that the DSBs were induced during meiosis. However, no stimulation of meiotic recombination was observed (F. Osman and S. Subramani, unpublished results). This implies either that DSBs did not induce detectable meiotic recombination events, or that functional HO endonuclease was not expressed from the nmt1 promoter during meiosis. This is a potentially interesting result that requires additional experiments, perhaps involving fusion of the HO gene to a meiosis-specific S. pombe promoter.
IV. The Genetic Control of Double-Strand Break-Induced Recombination

An investigation of the genetic control of the pathways of recombination in eukaryotes depends on the isolation of recombination-defective mutants. The analysis of such mutants can identify genes whose products are required for recombination and can define the components of the different recombination pathways. The subsequent cloning of these genes, and the determination of the biochemical activities of the encoded gene products, are crucial steps in elucidating the molecular mechanisms of recombination. The genetic control of recombination has been most extensively analyzed in S. cerevisiae. Numerous mutants affecting meiotic and mitotic homologous recombination in S. cerevisiae have been isolated and characterized, and several excellent, detailed reviews have been published (6, 41, 119, 120). Similarly, reviews have dealt in depth with the genetic control of nonhomologous recombination (end-joining) in mammalian cells (121-123). Homologs of mammalian nonhomologous recombination genes have been described in S. cerevisiae (124, 125). In the presence of homologous recombination, they play only a minor role in processing DSBs in S. cerevisiae. On the other hand, despite initial doubt that homologous recombination pathways in S. cerevisiae were even conserved in mammalian cells, mammalian homologs of S. cerevisiae homologous recombination genes have now been identified, and an understanding of the significance of homologous recombination for processing DSBs in mammalian cells is developing rapidly (reviewed in 126). We are interested in the genetic control of DSB-induced mitotic intrachromosomal recombination between direct repeats in S. pombe compared to S. cerevisiae, and this is the main focus of this section.
A. The Genetic Control of DSB-Induced Mitotic Recombination in S. cerevisiae

As discussed earlier (Section III,B,1), genetic and physical analyses of DSB-induced mitotic intramolecular recombination between direct repeats in S. cerevisiae suggested that there are at least two independent competing pathways with a common intermediate (19, 28). The two proposed pathways are based on a DSB-gap repair mechanism and the SSA mechanism. Additional evidence for at least two distinct pathways of homologous recombination between nontandem intramolecular repeats in S. cerevisiae comes from an examination of its genetic control. The best-studied mutants affecting recombination in S. cerevisiae are those in the RAD52 epistatic group, isolated mainly on the basis of their sensitivity to x-rays (reviewed in 6, 41, 119, 120). The RAD52 epistatic group includes at least eight genes: RAD50, -51, -52, -54, -55, and -57, MRE11, and XRS2. The rad52 mutant has been extensively studied and manifests a pleiotropic phenotype with regard to deficiency in meiotic and mitotic DSB-induced homologous recombination involving a variety of substrates (6, 41, 119, 120). Several studies have shown that, for DSB-induced recombination between direct repeats, DSB-gap repair to give conversion-type products is RAD52 dependent (19-21, 72, 74, 76, 127). The effect of rad52 mutations on the formation of deletion-type recombinants during DSB-induced recombination between direct repeats is more complex: it occurs at a reduced efficiency in rad52 cells but is RAD52 independent (19-21, 72, 74, 76). Experiments in which HO-induced DSBs were introduced into the ribosomal DNA cluster or an 18-fold repeated CUP1 gene locus demonstrated that deletion formation, resulting in loss of one or more repeat units, can occur by a RAD52-independent mechanism (76). However, when the number of CUP1 repeats was reduced from 18 to 3, the events were RAD52 dependent. Consistent with the physical analyses, this suggested an additional pathway(s) for deletion formation. Surprisingly, although other members of the RAD52 epistatic group play crucial roles in both meiotic and mitotic recombination, they are not required for either conversion-type or deletion-type product formation during DSB-induced recombination between direct repeats (3, 127). This was despite the observation that single-stranded DNA formation was slower in rad50 cells than in wild type (20, 127). RAD1 and RAD10 belong to the RAD3 epistatic group and are required for nucleotide excision repair of ultraviolet (uv)-damaged DNA in S. cerevisiae (for a review see 128). They have also been shown to have a role in mitotic recombination. Mutations in RAD1 and RAD10 decreased the efficiency of integration of circular and linearized plasmids (129, 130).
They also reduced the frequency of recombinants, primarily deletion types, during spontaneous intrachromosomal recombination between direct repeats (47, 129-132). Analysis of double mutants showed that RAD1 and RAD10 functioned in the same recombination pathway. For rad52-rad1 and rad52-rad10 double mutants, the reduction in mitotic intrachromosomal recombination between direct repeats was greater than with the single mutants, suggesting that RAD52 and RAD1-RAD10 functioned in separate recombination pathways (47, 129-132). Together with the kinetic data on DSB-induced recombination between repeats (19), this suggested that RAD1-RAD10 functioned in the SSA pathway. It was subsequently shown that rad1 and rad10 strains were deficient in DSB-induced repeat recombination in which the DSB was introduced by HO cleavage at HO recognition sequences within duplicated DNA. This was due to the inability to remove small regions of nonhomologous DNA (the HO recognition sequences) at the site of the induced DSB (37, 133). This suggested that RAD1 and RAD10 coded for an endonuclease
activity that was required to remove nonhomologous DNA from the 3’ ends of recombining DNA, a process required for SSA (Fig. 7) and analogous to
excision of photodimers during repair of uv-damaged DNA. Subsequently, it was shown that Rad1 and Rad10 proteins form a complex that functions as a single-strand DNA-specific endonuclease that can cleave at a junction between duplex DNA and 3’ single-stranded DNA tails (134), which explains the role of these proteins in excision repair and in the removal of nonhomologous DNA during SSA. This function would be required for recombination between direct repeats if the initiating DSB was made in unique DNA between the repeats or in nonhomologous regions within duplicated sequences. The Rad10 protein has also been shown to promote renaturation of complementary DNA strands (135), so it may also function in other steps of SSA. Other NER genes (RAD2, -3, -4, -14, -16, and -25) were not required for spontaneous intrachromosomal recombination (129, 130) or for removal of nonhomologous DNA during DSB-induced repeat recombination (133). However, it has been shown that two mismatch repair genes, MSH2 and MSH3, are also required in the RAD1-RAD10 pathway for spontaneous intrachromosomal recombination between direct repeats and for homologous integration of linearized plasmids (136). Mutations in two other mismatch repair genes, PMS1 and MLH1, had no effect. It is not clear how many pathways there are for DSB-induced intramolecular mitotic homologous recombination between direct repeats in S. cerevisiae, but studies indicate that there may be at least three pathways: a RAD52-dependent DSB-gap repair pathway, a RAD1-RAD10-dependent SSA pathway, and a third alternative pathway for the generation of deletion-type recombinants. The rad1-rad52 and rad10-rad52 strains still exhibit a residual capacity for intrachromosomal mitotic recombination between direct repeats (47, 129-132). One possible explanation is that additional alternative recombination pathways exist. Studies on the genetic control of direct-repeat and inverted-repeat recombination in S. 
cerevisiae (47, 137, 138) suggest that, as well as a RAD52-dependent DSB-gap repair pathway for the generation of conversion-type recombinants and a RAD1-RAD10-dependent SSA pathway for the generation of deletion-type recombinants, there exists a third pathway based on nonconservative one-sided recombination events that requires RAD52, RAD1, and RAD10 and generates deletion-type recombinants. Using an approach similar to that taken to identify multiple recombination pathways in Escherichia coli, a mutation in the RFA1 gene, a gene encoding a single-stranded DNA-binding protein, was isolated in a screen for mutations that increased intrachromosomal recombination between direct repeats in a rad52-rad1 background (139). The rfa1 mutation on its own
caused an increase in recombination that was, unlike most other hyperrecombination mutants, independent of RAD52 function. Additionally, the rfa1 mutant strain was uv sensitive and exhibited decreased levels of interchromosomal mitotic recombination in diploids. These results indicate that RFA1 may function in an alternative recombination pathway for direct-repeat recombination. Interestingly, a novel allele of RFA1 was also isolated in a screen for mutants that decreased homologous recombination between plasmid and chromosomal sequences stimulated by an in vivo HO-induced DSB within the plasmid sequences (140). Finally, a series of S. cerevisiae mutants has been isolated directly on the basis of a deficiency in intrachromosomal mitotic recombination between repeated ura3 heteroalleles stimulated by an in vivo HO-induced DSB within duplicated sequences (141). To date only one of these mutations has been described; it was found to be an allele of the essential CDC1 gene (141). The mutation completely eliminated DSB-induced recombination to yield Ura+ recombinants, and this was shown not to be due to stimulation of sister-chromatid exchanges. The mutation also caused moderate sensitivity to methylmethane sulfonate and ionizing radiation, but did not affect spontaneous recombination or cell viability. Although the precise effect of the cdc1 mutation on DSB-induced recombination is not known, its effect could be due to the blocking of the normal pathways of recombination to allow an alternative pathway (141).
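The double-mutant reasoning used in this section (a double mutant no more defective than its single mutants suggests one pathway, whereas a double mutant more defective than either suggests separate pathways) can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the `tolerance` margin, and all frequencies are hypothetical and are not taken from the cited studies.

```python
def classify_interaction(wild_type, single_a, single_b, double, tolerance=0.25):
    """Compare a double mutant with its two single mutants.

    All arguments are recombination frequencies in the same units.
    `tolerance` is a hypothetical relative margin for deciding that the
    double mutant is "about as defective" as the worse single mutant.
    """
    if wild_type <= 0:
        raise ValueError("wild-type frequency must be positive")

    # Express every strain relative to wild type.
    rel_a = single_a / wild_type
    rel_b = single_b / wild_type
    rel_double = double / wild_type

    # The more defective of the two single mutants sets the reference level.
    most_severe_single = min(rel_a, rel_b)

    # Double mutant no worse than the worse single mutant (within the margin):
    # consistent with both genes acting in the same pathway.
    if rel_double >= most_severe_single * (1 - tolerance):
        return "epistatic (same pathway)"

    # Double mutant clearly more defective than either single mutant:
    # consistent with the genes acting in separate pathways.
    return "more severe than either single mutant (separate pathways)"


# Invented frequencies, for illustration only.
print(classify_interaction(100.0, 20.0, 25.0, 19.0))
print(classify_interaction(100.0, 20.0, 25.0, 2.0))
```

A real analysis would also need replicate measurements and a statistical test rather than a fixed margin; the sketch only captures the qualitative logic.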
B. The Genetic Control of DSB-Induced Mitotic Recombination in S. pombe

Intrachromosomal direct-repeat recombination substrates have been used to investigate the genetic control of DSB-induced mitotic recombination in S. pombe. Earlier studies had shown that these substrates undergo gene conversion and deletion events by the DSB-gap repair and SSA pathways of recombination, respectively (84). While recombinants generated by the DSB-gap repair pathway in S. cerevisiae are RAD52 dependent, the production of similar recombinants in S. pombe is independent of rad22 (homolog of S. cerevisiae RAD52) (142). Similarly, while the production of deletion-type recombinants in S. cerevisiae is dependent on the RAD1-RAD10 genes, the generation of analogous recombinants in S. pombe is independent of the S. pombe rad10 (homolog of S. cerevisiae RAD1). Neither the deletion nor the gene conversion events were affected by mutations in the S. pombe rad5 (homolog of the S. cerevisiae RAD3 gene), rad21, rad1, or rad3 genes (142). These results suggest that, although the pathways of DSB-induced recombination may be similar in S. cerevisiae and S. pombe, their genetic control is likely to be different. In this context, it is worth noting that the gene for a uv endonuclease has been cloned from S. pombe (143). This endonuclease, like the S. cerevisiae Rad1-Rad10 proteins, cleaves 5’ to the uv damage, and might function in intrachromosomal recombination in an S. pombe rad10 mutant. We have isolated a number of mutants deficient in intrachromosomal, DSB-induced mitotic recombination in S. pombe. The analysis of the phenotypes of these mutants and the genes that complement them should yield additional insights regarding the DSB-gap repair and SSA models of recombination.
V. Concluding Remarks

It should be evident from the preceding account that considerable progress has been made in the last decade in elucidating the general mechanisms by which mitotic recombination proceeds in lower and higher eukaryotes. There is evidence for multiple pathways that appear to be common among all eukaryotes, but their prevalence and genetic control might vary from organism to organism. In the near future, most of the progress will come in the identification and characterization of the genes and proteins involved in these processes. The yeast systems are likely to be at the forefront because they are amenable to genetic screens for recombination-deficient mutants, and these mutant phenotypes can be complemented easily with DNA libraries. A complete understanding of the enzymatic activities of the proteins involved, and of the possible redundancies, in these pathways will keep investigators in this field busy for some time to come.
ACKNOWLEDGMENTS

This work was supported by NIH grant GM31253 to S.S. and by a Wellcome Trust Project Grant (043822/2/95/2) awarded to Shirley McCready.
REFERENCES

1. L. S. Symington, EMBO J. 10, 987 (1991).
2. R. Jessberger, V. Podust, U. Hubscher, and P. Berg, J. Biol. Chem. 268, 15070 (1993).
3. J. E. Haber, BioEssays 17, 609 (1995).
4. T. L. Orr-Weaver and J. W. Szostak, Microbiol. Rev. 49, 33 (1985).
5. P. J. Hastings, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 397. ASM Press, Washington, DC, 1988.
6. T. D. Petes, R. E. Malone, and L. S. Symington, in “The Molecular and Cellular Biology of the Yeast Saccharomyces cerevisiae” (J. R. Broach, J. R. Pringle, and E. W. Jones, eds.), Vol. 1, p. 407. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1991.
7. P. J. Hastings, Mutat. Res. 284, 97 (1992).
8. R. Holliday, Genet. Res. 5, 282 (1964).
9. M. S. Meselson and C. M. Radding, Proc. Natl. Acad. Sci. U.S.A. 72, 358 (1975).
10. C. M. Radding, Annu. Rev. Genet. 16, 405 (1982).
11. J. N. Strathern, K. G. Weinstock, D. R. Higgins, and C. B. McGill, Genetics 127, (1991).
12. M. A. Resnick, J. Theor. Biol. 59, 97 (1976).
13. T. L. Orr-Weaver, J. W. Szostak, and R. J. Rothstein, Proc. Natl. Acad. Sci. U.S.A. 78, 6354 (1981).
14. T. L. Orr-Weaver and J. W. Szostak, Proc. Natl. Acad. Sci. U.S.A. 80, 4417 (1983).
15. J. W. Szostak, T. L. Orr-Weaver, R. J. Rothstein, and F. W. Stahl, Cell 33, 25 (1983).
16. D. K. Nag, M. A. White, and T. D. Petes, Nature (London) 340, 318 (1989).
17. M. Lichten, C. Goyen, N. P. Schultes, D. Treco, J. W. Szostak, J. E. Haber, and A. Nicholas, Proc. Natl. Acad. Sci. U.S.A. 87, 7653 (1990).
18. H. Sun, D. Treco, and J. W. Szostak, Cell 64, 1155 (1991).
19. J. Fishman-Lobell, J. N. Rudin, and J. E. Haber, Mol. Cell. Biol. 12, 1292 (1992).
20. N. Sugawara and J. E. Haber, Mol. Cell. Biol. 12, 563 (1992).
21. C. I. White and J. E. Haber, EMBO J. 9, 663 (1990).
22. P. J. Hastings, C. McGill, B. Shafer, and J. N. Strathern, Genetics 135, 973 (1993).
23. G. M. Adair, R. S. Nairn, J. H. Wilson, M. M. Seidman, K. A. Brotherman, C. MacKinnon, and J. B. Scheerer, Proc. Natl. Acad. Sci. U.S.A. 86, 4574 (1989).
24. A. Belmaaza, J. C. Wallenberg, S. Brouillette, N. Gusew, and P. Chartrand, Nucleic Acids Res. 18, 6385 (1990).
25. J. Ellis and A. Bernstein, Mol. Cell. Biol. 9, 1621 (1989).
26. J. S. Mudgett and W. D. Taylor, Mol. Cell. Biol. 10, 37 (1990).
27. A. Belmaaza and P. Chartrand, Mutat. Res. 314, 199 (1994).
28. N. Rudin, E. Sugarman, and J. E. Haber, Genetics 122, 519 (1989).
29. A. Schwacha and N. Kleckner, Cell 83, 783 (1995).
30. N. Nassif, J. Penney, S. Pal, W. R. Engels, and G. B. Gloor, Mol. Cell. Biol. 14, 1613 (1994).
31. D. H. Lankenau, Chromosoma 103, 659 (1995).
32. J. N. Strathern, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 445. ASM Press, Washington, DC, 1988.
33. F.-L. Lin, K. Sperle, and N. Sternberg, Mol. Cell. Biol. 4, 1020 (1984).
34. F.-L. Lin, K. Sperle, and N. Sternberg, Mol. Cell. Biol. 10, 103 (1990).
35. E. Maryon and D. Carroll, Mol. Cell. Biol. 11, 3268 (1991).
36. E. Maryon and D. Carroll, Mol. Cell. Biol. 11, 3278 (1991).
37. J. Fishman-Lobell and J. E. Haber, Science 258, 480 (1992).
38. J. E. Haber, Trends Genet. 8, 446 (1992).
39. A. J. S. Klar, in “The Molecular and Cellular Biology of the Yeast Saccharomyces: Gene Expression” (E. W. Jones, J. R. Pringle, and J. R. Broach, eds.), p. 745. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1992.
40. S. M. Lewis, Adv. Immunol. 56, 27 (1994).
41. M. A. Resnick, C. Bennett, E. Perkins, G. Porter, and S. D. Priebe, in “The Yeasts, Vol. 6” (A. H. Rose, A. E. Wheals, and J. S. Harrison, eds.), p. 357. Academic Press, New York, 1995.
42. M. K. Derbyshire, L. H. Epstein, C. S. H. Young, P. L. Munz, and R. Fishel, Mol. Cell. Biol. 14, 156 (1994).
43. D. Roth and J. Wilson, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 621. ASM Press, Washington, DC, 1988.
44. H. Ma, S. Kunes, P. J. Schatz, and D. Botstein, Gene 58, 201 (1987).
45. C. Mezard, D. Pompon, and A. Nicholas, Cell 70, 659 (1992).
46. C. Mezard and A. Nicholas, Mol. Cell. Biol. 14, 1278 (1994).
47. F. Prado and A. Aguilera, Genetics 139, 109 (1995).
48. D. M. Livingston, Plasmid 20, 97 (1988).
49. S. Kunes, D. Botstein, and M. S. Fox, Genetics 124, 67 (1990).
50. R. H. Schiestl, M. Dominska, and T. D. Petes, Mol. Cell. Biol. 13, 2697 (1993).
51. J. Zhu and R. H. Schiestl, Mol. Cell. Biol. 16, 1805 (1996).
52. K. M. Kramer, J. A. Brock, K. Bloom, J. K. Moore, and J. E. Haber, Mol. Cell. Biol. 14, 1293 (1994).
53. A. L. Nicholas, P. L. Munz, and C. S. H. Young, Nucleic Acids Res. 23, 1038 (1995).
54. W. Goedecke, P. Pfeiffer, and W. Vielmetter, Nucleic Acids Res. 22, 2094 (1994).
55. C. Grimm, J. Kohli, J. Murray, and K. Maundrell, Mol. Gen. Genet. 215, 81 (1988).
56. K. Tatebayashi, J. Kato, and H. Ikeda, Mol. Gen. Genet. 244, 111 (1994).
57. V. Valencius and O. Smithies, Mol. Cell. Biol. 11, 4389 (1991).
58. C. Deng, K. R. Thomas, and M. R. Cappechi, Mol. Cell. Biol. 13, 2134 (1993).
59. R. J. Bollag, A. S. Waldman, and R. M. Liskay, Annu. Rev. Genet. 23, 199 (1989).
60. S. Subramani and B. L. Seaton, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 549. ASM Press, Washington, DC, 1988.
61. S. Fotheringham and W. R. Holloman, Genetics 129, 1053 (1991).
62. H. Puchta, S. Kocher, and B. Hohn, Mol. Cell. Biol. 12, 3372 (1992).
63. W. Goedecke, W. Vielmetter, and P. Pfeiffer, Mol. Cell. Biol. 12, 811 (1992).
64. C. W. Lehman, M. Clemens, D. K. Worthylake, J. K. Trautman, and D. Carroll, Mol. Cell. Biol. 13, 6897 (1993).
65. D. O. Ferguson and W. K. Holloman, Proc. Natl. Acad. Sci. U.S.A. 93, 5419 (1996).
66. J. N. Strathern, A. J. Klar, J. B. Hicks, J. A. Abraham, J. M. Ivy, K. Nasmyth, and C. McGill, Cell 31, 183 (1982).
67. R. Kostriken and F. Heffron, Cold Spring Harbor Symp. Quant. Biol. 49, 89 (1984).
68. K. Nakagawa, N. Morishima, and T. Shibata, EMBO J. 11, 2707 (1992).
69. A. Plessis, A. Perrin, J. E. Haber, and B. Dujon, Genetics 130, 451 (1992).
70. C. Fairhead and B. Dujon, Mol. Gen. Genet. 240, 170 (1993).
71. J. A. Nickoloff, E. Y. Chen, and F. Heffron, Proc. Natl. Acad. Sci. U.S.A. 83, 7831 (1986).
72. J. A. Nickoloff, J. D. Singer, M. F. Hoekstra, and F. Heffron, J. Mol. Biol. 207, 527 (1989).
73. A. Ray, I. Siddiqi, A. L. Kolodkin, and F. W. Stahl, J. Mol. Biol. 2, 247 (1988).
74. N. Rudin and J. E. Haber, Mol. Cell. Biol. 8, 3918 (1988).
75. B. J. Thomas and R. Rothstein, Cell 56, 619 (1989).
76. B. A. Ozenberger and G. S. Roeder, Mol. Cell. Biol. 11, 1222 (1991).
77. D. B. Sweetser, H. Hough, J. F. Whelden, M. Arbuckle, and J. A. Nickoloff, Mol. Cell. Biol. 14, 3863 (1994).
78. H. H. Nelson, D. B. Sweetser, and J. A. Nickoloff, Mol. Cell. Biol. 16, 2951 (1996).
79. A. Ray, N. Machin, and F. W. Stahl, Proc. Natl. Acad. Sci. U.S.A. 86, 6225 (1989).
80. C. B. McGill, B. R. Shafer, L. K. Derr, and J. N. Strathern, Curr. Genet. 23, 305 (1993).
81. J. K. Moore and J. E. Haber, Mol. Cell. Biol. 16, 2164 (1996).
82. K. M. Kramer and J. E. Haber, Genes Dev. 7, 2345 (1993).
83. L. L. Sandell and V. A. Zakian, Cell 75, 729 (1993).
84. F. Osman, E. A. Fortunato, and S. Subramani, Genetics 142, 341 (1996).
85. R. A. Winegar, J. W. Philippes, J. K. Youngblom, and W. F. Morgan, Mutat. Res. 225, 49 (1989).
86. J. W. Phillips and W. F. Morgan, Mol. Cell. Biol. 14, 5794 (1994).
87. F. S. Gimble and J. Thorner, Nature (London) 357, 301 (1992).
88. T. Lukacsovich, D. Yang, and A. S. Waldman, Nucleic Acids Res. 22, 5649 (1994).
89. M. Brenneman, F. S. Gimble, and J. H. Wilson, Proc. Natl. Acad. Sci. U.S.A. 93, 3608 (1996).
90. A. Choulika, A. Perrin, B. Dujon, and J. F. Nicholas, Mol. Cell. Biol. 15, 1968 (1994).
91. P. Rouet, F. Smith, and M. Jasin, Proc. Natl. Acad. Sci. U.S.A. 91, 6064 (1994).
92. P. Rouet, F. Smith, and M. Jasin, Mol. Cell. Biol. 14, 8096 (1994).
93. H. Puchta, B. Dujon, and B. Hohn, Nucleic Acids Res. 21, 5034 (1993).
94. D. J. Segal and D. Carroll, Proc. Natl. Acad. Sci. U.S.A. 92, 806 (1995).
95. H. Puchta, B. Dujon, and B. Hohn, Proc. Natl. Acad. Sci. U.S.A. 93, 5055 (1996).
96. A. R. Godwin, R. J. Bollag, D. M. Christie, and R. M. Liskay, Proc. Natl. Acad. Sci. U.S.A. 91, 12554 (1994).
97. J. E. Haber, B. L. Ray, J. M. Kolb, and C. I. White, Proc. Natl. Acad. Sci. U.S.A. 90, 3363 (1993).
98. P. A. Jeggo, G. E. Tacciolo, and S. P. Jackson, BioEssays 17, 949 (1995).
99. M. A. Oettinger, Curr. Opin. Genet. Dev. 6, 141 (1996).
100. R. S. Hawley, in “Genetic Recombination” (R. Kucherlapati and G. R. Smith, eds.), p. 497. ASM Press, Washington, DC, 1988.
101. R. S. Hawley and T. Arbel, Cell 72, 301 (1993).
102. P. B. Moens, BioEssays 16, 101 (1994).
103. G. S. Roeder, Proc. Natl. Acad. Sci. U.S.A. 92, 10450 (1995).
104. J. C. Game, Dev. Genet. 13, 485 (1992).
105. M. Lichten and A. S. H. Goldman, Annu. Rev. Genet. 29, 423 (1995).
106. J. C. Game, K. C. Sitney, V. E. Cook, and R. K. Mortimer, Genetics 123, 695 (1989).
107. T.-C. Wu and M. Lichten, Science 263, 515 (1994).
108. K. Ohta, T. Shibata, and A. Nicholas, EMBO J. 13, 5754 (1994).
109. Q. Fan and T. Petes, Mol. Cell. Biol. 16, 2037 (1996).
110. L. Cao, E. Alani, and N. Kleckner, Cell 61, 1089 (1990).
111. H. Gutz, Genetics 69, 317 (1971).
112. P. Schuchert, M. Langsford, E. Kaslin, and J. Kohli, EMBO J. 10, 2157 (1991).
113. A. S. Ponticelli and G. R. Smith, Proc. Natl. Acad. Sci. U.S.A. 89, 227 (1992).
114. W. P. Wahls and G. R. Smith, Genes Dev. 8, 1693 (1994).
115. C. Grimm, P. Schaer, P. Munz, and J. Kohli, Mol. Cell. Biol. 11, 289 (1991).
116. P. Schar and J. Kohli, EMBO J. 13, 5212 (1994).
117. J. Kohli and J. Bahler, Experientia 50, 295 (1994).
118. A. J. S. Klar and L. M. Miglio, Cell 46, 725 (1986).
119. J. C. Game, Semin. Cancer Biol. 4, 73 (1993).
120. A. Shinohara and T. Ogawa, TIBS 20, 387 (1995).
121. P. Jeggo, Mutat. Res. 239, 1 (1990).
122. A. R. Collins, Mutat. Res. 293, 99 (1993).
123. M. Z. Zdzienicka, Mutat. Res. 336, 203 (1995).
124. W. Siede, A. A. Fried, I. Dianova, F. Eckardt-Schupp, and E. C. Friedberg, Genetics 142, 91 (1996).
125. G. J. Mages, H. M. Feldmann, and E.-L. Winnacker, J. Biol. Chem. 271, 7910 (1996).
126. L. H. Thompson, Mutat. Res. 363, 77 (1996).
127. N. Sugawara, E. L. Ivanov, J. Fishman-Lobell, B. Ray, X. Wu, and J. E. Haber, Nature (London) 373, 84 (1995).
128. S. Prakash, P. Sung, and L. Prakash, Annu. Rev. Genet. 27, 33 (1993).
129. R. H. Schiestl and S. Prakash, Mol. Cell. Biol. 8, 3619 (1988).
130. R. H. Schiestl and S. Prakash, Mol. Cell. Biol. 10, 3619 (1990).
131. H. L. Klein, Genetics 120, 367 (1988).
132. B. J. Thomas and R. Rothstein, Genetics 123, 725 (1989).
133. E. L. Ivanov and J. E. Haber, Mol. Cell. Biol. 15, 2245 (1995).
134. A. J. Bardwell, L. Bardwell, A. E. Tomkinson, and E. C. Friedberg, Science 265, 2082 (1994).
135. P. Sung, L. Prakash, and S. Prakash, Nature (London) 355, 743 (1992).
136. M. Saparbaev, L. Prakash, and S. Prakash, Genetics 142, 727 (1996).
137. A. J. Rattray and L. S. Symington, Genetics 139, 45 (1995).
138. H. Santos-Rosa and A. Aguilera, Genetics 139, 57 (1995).
139. J. Smith and R. Rothstein, Mol. Cell. Biol. 15, 1632 (1995).
140. A. A. Firmenich, M. Elias-Arnanz, and P. Berg, Mol. Cell. Biol. 15, 1620 (1995).
141. J. Halbrook and M. F. Hoekstra, Mol. Cell. Biol. 14, 8037 (1994).
142. E. A. Fortunato, F. Osman, and S. Subramani, Mutat. Res. DNA Repair 364, 14 (1996).
143. M. Takao, R. Yonemasu, K. Yamamoto, and Y. Yasui, Nucleic Acids Res. 24, 1267 (1996).
Impaired Folding and Subunit Assembly as Disease Mechanism: The Example of Medium-Chain acyl-CoA Dehydrogenase Deficiency

PETER BROSS, BRAGE S. ANDRESEN, AND NIELS GREGERSEN

Center for Medical Molecular Biology
Aarhus University Hospital and Faculty of Health Sciences
Skejby Sygehus
and
Danish Centre for Human Genome Research
Aarhus, Denmark

I. Protein Folding and Its Disturbance by Missense Mutations
   A. Basic Principles of Protein Folding
   B. The Role of Chaperones for Protein Folding in the Cell
   C. Mitochondrial Chaperones
   D. Mutations Affecting Folding Kinetics
II. The Role of MCAD in Mitochondrial β-Oxidation of Fatty Acids
   A. MCAD - the Enzyme
   B. MCAD Deficiency - the Disease
III. Studies on the Molecular Pathology of MCAD Deficiency
   A. Characterization of the K304E Mutant Variant
   B. Characterization of Other Missense Mutations in MCAD Deficiency
IV. Conclusions
   A. Characterization of the Molecular Effects of Missense Mutations - Choice of the Expression System
   B. Future Lines of Research To Improve the Ability To Predict Effects of Mutations
   C. Impaired Folding Resulting in Decreased Stability - a Common Theme in Genetic Diseases
   D. Protein Quality Control Systems
References
Abbreviations: FAD, flavin adenine dinucleotide; LCAD, long-chain acyl-CoA dehydrogenase; MCAD, medium-chain acyl-CoA dehydrogenase; NMR, nuclear magnetic resonance; PCR, polymerase chain reaction; SCAD, short-chain acyl-CoA dehydrogenase; VLCAD, very-long-chain acyl-CoA dehydrogenase.

Progress in Nucleic Acid Research and Molecular Biology, Vol. 58
Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
Rapid progress in DNA technology has made it possible to detect mutations in disease genes readily. In contrast to this, techniques to characterize the effects of mutations are still very time consuming. It has turned out that many of the mutations detected in disease genes are missense mutations. Characterization of the effect of these mutations is particularly important in order to establish that they are disease causing and to estimate their severity. We use the experiences with investigation of medium-chain acyl-CoA dehydrogenase deficiency as an example to illustrate that (i) impaired folding is a common effect of missense mutations occurring in genetic diseases, (ii) increasing the level of available chaperones may augment the level of functional mutant protein in vivo, and (iii) one mutation may have multiple effects. The interplay between the chaperones assisting folding and proteases that attack folding intermediates is decisive for how large a proportion of a mutant polypeptide impaired in folding acquires the functional structure. This constitutes a protein quality control system, and the handling of a given mutant protein by this system may vary due to environmental conditions or genetic variability in its components. The possibility that intraindividual differences in the handling of mutant proteins may be a mechanism accounting for phenotypic variability is discussed. © 1998 Academic Press
Research in human genetic diseases has developed explosively in the last years due to the availability of gene sequences through the Human Genome Project and the advent of the PCR technique. This has facilitated the investigation of the molecular basis for genetic diseases and has produced a wealth of data on mutations in disease genes of many monogenic diseases. Identification of sequence variations as such, however, does not necessarily establish the mutation-disease relationship, nor does it reveal the molecular mechanism. In the case of gross gene rearrangements and mutations that cause premature termination of translation, the disease-causing effect is usually directly evident, but the effects of other mutation types are less obvious. In many genes a large proportion of the mutations detected are mutations that result in substitution of single amino acids. In these cases, the disease-causing nature of the mutations and the underlying molecular mechanism are usually difficult to predict. Characterization of mutant variants by heterologous expression experiments and biochemical analysis of the mutant proteins is still rather time consuming. Many newly detected disease-associated missense mutations are therefore currently published without further characterization. Theoretical hypotheses regarding the mutation’s effect may be suggested from the three-dimensional structure of the respective protein or if residues with known function are affected. However, even these cases are subject to error and must be considered merely speculative until the disease-causing nature of a mutation has been proven experimentally. It has emerged that many of the missense mutations detected in genetic
deficiencies mainly affect folding or assembly of the polypeptide. We here use the experiences with the investigation of the molecular pathology of medium-chain acyl-CoA dehydrogenase (MCAD) deficiency as an example to illustrate this finding and to draw some conclusions that may be of general interest. The importance of molecular chaperones for folding in the mitochondria is emphasized. We show that the proportion of a given folding mutant that acquires the native structure can be increased by supplying higher levels of chaperones. The balance between the chaperone system supervising folding and the proteolytic systems that degrade misfolded proteins thus constitutes a protein quality control mechanism that is decisive for the relative yield of mutant proteins.
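The idea that the yield of native protein reflects a balance between folding and degradation can be illustrated with a deliberately simple kinetic partitioning sketch. It assumes, purely for illustration, that a folding intermediate is either folded or degraded in two competing first-order processes; the function name and all rate constants are hypothetical and are not measurements from this chapter.

```python
def folded_fraction(k_fold, k_degrade):
    """Fraction of a folding intermediate that reaches the native state.

    Assumes two competing first-order processes (an illustrative
    simplification): productive folding with rate constant k_fold and
    proteolytic degradation with rate constant k_degrade.
    """
    if k_fold < 0 or k_degrade < 0 or (k_fold + k_degrade) == 0:
        raise ValueError("rate constants must be non-negative and not both zero")
    return k_fold / (k_fold + k_degrade)


# Invented rate constants (arbitrary units), for illustration only.
wild_type = folded_fraction(k_fold=10.0, k_degrade=1.0)
mutant = folded_fraction(k_fold=0.5, k_degrade=1.0)
# Hypothetical effect of extra chaperones: a doubled effective folding rate.
mutant_more_chaperone = folded_fraction(k_fold=1.0, k_degrade=1.0)

print(wild_type, mutant, mutant_more_chaperone)
```

In this toy model, anything that raises the effective folding rate or lowers the degradation rate shifts the partition toward native protein, which is the qualitative behavior described above.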
I. Protein Folding and Its Disturbance by Missense Mutations

Much of the progress in the understanding of the molecular pathology underlying MCAD deficiency and other genetic diseases has only been possible due to the development of the concept of chaperone-assisted protein folding in vivo. Although protein folding in general, and folding in vivo in particular, are still far from being fully understood, considerable progress has been made and concepts regarding the basic mechanisms have been developed. In this section we briefly discuss those aspects of the basic principles of protein folding, on the one hand, and of chaperone-assisted folding in vivo, on the other, that have particular relevance for our research. Although conflicting hypotheses exist on many aspects, we try to extract the information that can contribute to the understanding of the molecular pathology underlying many genetic defects caused by missense mutations in the affected genes. For a more detailed discussion of all aspects of folding and the involvement of the various chaperones, reviews should be consulted (1, 2).
A. Basic Principles of Protein Folding

Beginning with work pioneered by Anfinsen (3) more than 30 years ago, it has emerged that all the information determining the three-dimensional structure of proteins is encoded in the primary structure, the linear sequence of amino acids. Experimentally denatured proteins will, under appropriate conditions, spontaneously refold to the unique native conformation. The information for the three-dimensional structure encoded in the primary structure has also been termed the “second genetic code” or “folding code.” As the second landmark, theoretical calculations by Levinthal (4) demonstrated that proteins cannot explore all theoretically possible conformations (conformers), as this would take astronomically long time spans. Protein folding both in vivo and in vitro occurs in much shorter time intervals (milliseconds to seconds for small single-domain proteins), and it is therefore necessary to assume folding pathway(s) in which conformational rearrangements during the folding process are channeled in order to acquire the native conformation in a time frame acceptable for a living cell. Although much skilled effort has been put into the study of the protein folding problem, the second genetic code is still largely unknown. As Oleg Ptitsyn put it in an overview to a series of reviews addressing protein folding, “each protein knows how to fold but nobody understands how this is done” (5). Approaches addressing the structural transitions occurring in in vitro folding and unfolding reactions have led to the identification of a set of intermediate states with common characteristics termed the “molten globule” (6). The molten globule state is characterized by a compact conformation with substantial native-like secondary structure. This state has been proposed to be similar to the collapsed state that arises by a so-called hydrophobic collapse. The collapse, by which most of the hydrophobic side chains are buried in the inside of the molecule, is very fast and constitutes the first observable process in in vitro folding reactions. Such an obligatory intermediate state would provide an explanation for the apparent channeling of the folding pathway and thus overcome the Levinthal paradox (7). It has been proposed that the molten globule state is a third thermodynamic state of polypeptides besides the unfolded and the native states (8).
The presence of many different conformers at a given time in a folding reaction and the difficulty in studying the conformations of short-lived folding intermediates have been the major challenges for the techniques applied to observe the order of events and the conformations of the intermediates. Such a descriptive view of the folding reaction would form the basis for understanding its principles. NMR and mass spectrometric techniques applied to in vitro folding reactions have been very fruitful in the recent past (9, 10). The folding process of small proteins like chymotrypsin inhibitor 2 and lysozyme could be observed at good resolution with such methods. Based on the studies with chymotrypsin inhibitor 2, a so-called nucleation condensation mechanism has been suggested for this 64-amino-acid single-domain protein (11). According to this model, folding proceeds in a cooperative two-state (unfolded-native) process passing through a transition state. In the transition state the so-called nucleation site is formed, a native-like arrangement of key amino acids and their side chains that triggers the cooperative formation of the native structure. Extrapolated to larger proteins, a similar mechanism may be expected to occur for each subdomain, followed by slower rearrangements resulting in domain merging and docking, and finally the acquisition of the native state.
In spite of the progress made, the current understanding of protein folding is still far away from enabling prediction of the folding pathway from the mere knowledge of the primary structure. Due to this, it is not possible yet to pinpoint residues that are particularly important for the folding process. Furthermore, even when the three-dimensional structure of the native form of a given protein has been determined, it is still impossible to predict with reasonable confidence whether a given mutation might interfere with the folding pathway.
B. The Role of Chaperones for Protein Folding in the Cell
Protein folding in the living cell has to occur under physicochemical conditions that are grossly different from the conditions usually applied in order to achieve optimal folding in vitro. The in vitro reactions are typically carried out at very low protein concentrations and low temperatures. In order to cope with an environment unfavorable for polypeptide folding, biological systems have evolved mechanisms that allow efficient folding also under these conditions. Helper proteins exist in the cells that assist folding by counteracting competing side reactions such as irreversible aggregation of folding intermediates. These helper proteins have been termed “molecular chaperones,” and they are defined as “a functional class of unrelated families of proteins that assist the correct non-covalent assembly of other polypeptide-containing structures in vivo, but are not components of these assembled structures when they are performing their biological functions” (12). Many molecular chaperones are abundant proteins that had been known for a long time before their actual chaperone function was established. The chaperones are subdivided into families. Each family is defined by sequence similarity between its members. The sequence conservation between chaperones within the families is particularly high; for example, Escherichia coli DnaK shares 53% identical amino acid residues with its human mitochondrial homolog mHsp70 (13). Many chaperones are stress proteins that are up-regulated in response to cellular stresses such as heat shock.
C. Mitochondrial Chaperones
As far as mitochondria are concerned, three distinct chaperone types are known, each of which works in concert with other cofactors or co-chaperones: (i) mitochondrial Hsp70 (mHsp70), a homolog of E. coli DnaK; (ii) mitochondrial Hsp60, a homolog of E. coli GroEL; and (iii) homologs of the E. coli Clp chaperones. Hsp70 chaperones cooperate with two other proteins, Hsp40 (homologs of E. coli DnaJ), which also has chaperone properties, and GrpE, a nucleotide exchange factor. These three polypeptides constitute the so-called Hsp70 chaperone machine (14). Homologs of the Hsp70 type of
chaperone machine are present in the cytoplasm, in mitochondria, and in the endoplasmic reticulum and constitute the most ubiquitous type of chaperones known. They are involved in protein trafficking, folding, translocation, and gene regulation (15). Hsp70 chaperones bind to hydrophobic segments of extended (i.e., unstructured) polypeptides, and a given polypeptide can bind several copies of Hsp70 (16-18). Native polypeptides will usually not interact with Hsp70 because such segments with hydrophobic amino acids are buried in the inside of the molecule. Hsp70 has a nucleotide-binding domain with ATPase activity and a peptide-binding domain (19). The polypeptide exists in two conformations, one with high substrate affinity (ADP form) and one with low substrate affinity (ATP form). Hsp40 is considered to aid in recruiting polypeptides for binding, and GrpE mediates exchange of ADP with ATP, resulting in substrate release. The Hsp70 chaperone machine performs cycles of binding and release of segments of unfolded polypeptides. This keeps the polypeptides in an unfolded conformation and ensures release of the chains in a controlled way for folding and/or interaction with other cellular factors (e.g., the chaperonins or import factors of organelles). Hsp60 and its helper Hsp10 constitute a subgroup of chaperones that has been termed chaperonins in order to emphasize their specific properties. Mammalian Hsp60 and Hsp10 are homologs of the E. coli chaperonins GroEL and GroES, respectively. Chaperonins with high sequence similarity to E. coli GroEL have been detected in eubacteria, mitochondria, and chloroplasts, and a more distant group with somewhat different characteristics is present in archaebacteria and the eukaryotic cytosol (20). Most of the present knowledge of chaperonins is based on E. coli GroEL and GroES, and the three-dimensional structure of both complexes has been determined (21-23).
Fourteen subunits of GroEL form a barrel-like structure consisting of a double ring with seven subunits each. Polypeptides are bound in the inner cavities of both half-rings of the barrel. GroES also forms a seven-mer ring that may bind to one (or both) end(s) of the GroEL barrel. Binding of GroES results in the extension of the cavity to a domelike structure in which the bound polypeptide is sequestered from the bulk solution. Theoretical calculations suggest that the volume of the domelike cavity can accommodate a compact folding intermediate with a maximum size of approximately 40,000 Da (24). Binding, release, and rebinding of the polypeptide substrate are triggered by recurrent cycles of ATP binding, hydrolysis, and release of ADP by the GroEL subunits that mediate complex conformational rearrangements and cooperative communication between the two half-rings. The coordinated process includes a timed mechanism, which causes release of the GroES seven-mer from GroEL after approximately 15 s, thereby opening the lid for the polypeptide bound in the cavity (25). It is disputed whether the polypeptide substrate folds to its native conformation while it is sequestered in the “dome” structure, or whether it is released into the bulk solution after each cycle (2, 26). The GroES/L complex does not catalyze folding of the polypeptide; rather, a mechanism has been suggested by which kinetically trapped folding intermediates are partially unfolded by iterative annealing to the Hsp60 chaperonin complex so that they can undertake a new round of folding (27). In this model the binding energy used for unfolding and release is acquired at the expense of ATP hydrolysis. In contrast to Hsp70, GroEL chaperonin preferentially binds polypeptides that are more structured and compact, similar to the so-called molten globule (18). Chaperones of the Clp family (28, 29) were first detected in E. coli and are subdivided into two groups based on sequence similarity between their nucleotide-binding domains and some functional characteristics: one group has E. coli ClpB and ClpA as typical representatives and the other has E. coli ClpX as a representative. In yeast mitochondria a ClpB homolog has been described (Hsp78). Yeast Hsp78 has been shown to be able to partially substitute for mitochondrial Hsp70, and it has been proposed to have a role under stress conditions when mHsp70 is limited (30). In E. coli both ClpA and ClpX are found associated with ClpP, a protease without chaperone activity. It has been suggested that the biological function of the ClpA and ClpX chaperones is to present polypeptide substrates in an unfolded conformation to ClpP, resulting in degradation of the polypeptide substrate. It has been established that both ClpX and ClpA are bona fide chaperones (31, 32). ClpA and ClpX appear to confer different substrate specificities when in complex with ClpP (33, 34). We have cloned the cDNA for a human ClpP that is presumably localized in mitochondria (35).
Human expressed sequence tags with significant homology toward ClpX/A are present in the EMBL/GenBank databases, indicating that this chaperone-protease system has a counterpart in human mitochondria. The current picture of intramitochondrial folding of a typical mitochondrial matrix protein like MCAD can be outlined as illustrated in Fig. 1. The nascent precursor polypeptide chain emerging from the ribosome becomes associated with cytoplasmic Hsp70 and possibly other proteins (not shown), thus arresting premature folding in the cytoplasm and directing the polypeptide to mitochondrial import sites. The interaction of this complex with the mitochondrial transport machinery triggers a complex mechanism, which results in import of the precursor chain into the mitochondrial matrix. Mitochondrial Hsp70 receives the unfolded protein emerging through the transport channel. It contributes to the movement of the chain through the transport channel, and keeps the protein in an unfolded conformation (36-40). Removal of the transit peptide by mitochondrial preprotein protease
FIG. 1. Model for chaperone-assisted folding in mitochondria. The interactions of a typical mitochondrial matrix protein like MCAD, which uses both the Hsp70 and Hsp60 chaperone machines, are shown. MPP, matrix processing protease; mpTM, mitochondrial protein transport machinery. For details see text.
and binding and release cycles of the mitochondrial Hsp70 chaperone machine allow initiation of folding, resulting in structural conformers of the polypeptide that possess lower binding affinity for Hsp70 and higher affinity toward Hsp60. Folding proceeds further through binding and release cycles with the Hsp60 chaperone machine. Finally, folded monomers assemble to the native tetrameric form. It remains to be established how general this pathway is. Calculations from E. coli indicate that the amounts of GroES/L present in the cytoplasm are limiting, so that only a small proportion of the newly synthesized proteins can use GroES/L for folding (41). Inactivation of GroEL in E. coli is lethal, demonstrating that this chaperonin is essential under all growth conditions (42). It is known that some mitochondrial matrix proteins require Hsp60, whereas others do not but rather fold while interacting with the Hsp70 chaperone machine (43, 44). Investigation of a patient with systemic mitochondrial encephalomyopathy with multiple deficiencies of mitochondrial enzymes suggested that the deficiency is due to decreased levels of mitochondrial Hsp60 (45, 46), corroborating the importance of mitochondrial Hsp60 in human cells.
Recent evidence showed that a matrix-resident peptidyl-prolyl cis-trans isomerase accelerates folding of an artificially constructed fusion protein (47, 48). Peptidyl-prolyl cis-trans isomerases have been known for a long time; their role is to catalyze the isomerization of proline residues from the trans to the cis form. This process occurs much more slowly without catalysis and may thus slow down folding of proteins containing cis-prolines in their native structure. Cis-prolines are typically found with a frequency of 10-30% in native structures, whereas only 0.1% or less of the free amino acid is in the cis conformation (49). It can be expected that further factors will be identified that contribute to folding and assembly of mitochondrial matrix proteins and that different proteins may take different routes.
D. Mutations Affecting Folding Kinetics
The investigation of effects of point mutations on folding has been used extensively as an approach to study the mechanisms of protein folding. In Jonathan King's laboratory, mutations in the bacteriophage P22 coat and tailspike proteins that affect kinetics of folding but not the stability of the native structure have been isolated by a genetic selection procedure (for reviews see 50, 51). Such mutations have been termed temperature-sensitive folding (tsf) mutations. Tsf mutations destabilize certain folding intermediates, thus acting on the folding pathway. It has been proven directly for some tsf mutants in the P22 tailspike protein that thermolabile folding can be observed both in vivo (in the presence of chaperones) and in vitro (in the absence of chaperones), demonstrating that these mutations act intrinsically on the folding information contained in the primary structure rather than on some interactions of the folding polypeptide with chaperones (52). King and co-workers were also able to isolate secondary mutations (suppressor mutations) that relieved the effect of tsf mutations and abolished the temperature-sensitive folding phenotype. Surprisingly, these suppressor mutations worked globally, that is, they compensated the tsf phenotype of different tsf mutations (53, 54). This result may suggest that there exist amino acids in a given protein that have a strategic function for the folding pathway, as they contribute to the stability of certain key structures formed during the folding pathway. Such key structures might be the nucleation sites proposed for the nucleation condensation mechanism (see Section I.A). Further evidence for the notion that certain residues in an amino acid chain are particularly important for folding while others may be replaced almost at will without changing the folding efficiency comes from large-scale mutagenesis experiments with bacteriophage T4 lysozyme (55).
Extended segments of the polypeptide chain could be substituted with alanine residues without abolishing the protein's capability to form a native structure. From these experiments it appears that only about 50% or less of the amino acids
of a protein are essential for the acquisition of the native structure. Furthermore, from analysis of missense mutations in factor IX, it has been suggested that a considerable proportion of the amino acids in this protein may be substituted without affecting function, whereas every replacement within a minority of amino acid positions will in all cases affect the protein (56).
II. The Role of MCAD in Mitochondrial β-Oxidation of Fatty Acids
Mitochondrial fatty acid β-oxidation (reviewed in 57) provides an important energy source for the organism. It is particularly important in muscle tissues. It has been estimated that 60-70% of the ATP in heart tissue derives from mitochondrial β-oxidation (58, 59). Degradation of fatty acids in mammals can also be achieved through two other pathways: peroxisomal β-oxidation and ω-oxidation in microsomes. The quantitatively most important pathway for fuel production from fatty acids, however, is the mitochondrial system. The mitochondrial β-oxidation pathway (57) comprises four enzyme-catalyzed reactions, which per cycle shorten the acyl chain by two carbons and yield one molecule of acetyl-CoA and four reduction equivalents, two in the form of FADH2 and two in the form of NADH. At least two different enzymes with different chain-length specificity have been identified for each of the four steps. It has emerged that two interlinked β-oxidation systems exist. One is membrane associated and processes long-chain fatty acids (>C10), whereas the other is localized in the matrix space and processes fatty acyl-CoAs of shorter chain length (60, 61). The dehydrogenation step in the membrane-associated system is apparently exclusively catalyzed by very-long-chain acyl-CoA dehydrogenase (VLCAD) and in the matrix system by long-chain acyl-CoA dehydrogenase (LCAD), medium-chain acyl-CoA dehydrogenase (MCAD), or short-chain acyl-CoA dehydrogenase (SCAD) depending on the chain length. All four dehydrogenases interact with electron-transferring flavoprotein, which takes over the two hydrides from the dehydrogenation reaction and feeds them into the respiratory chain via electron-transferring flavoprotein-ubiquinone oxidoreductase (62).
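The per-cycle stoichiometry described above can be turned into a small bookkeeping sketch. It assumes a saturated acyl-CoA with an even number of carbons and counts one FADH2 and one NADH per cycle (that is, two reduction equivalents each); the function name is hypothetical, and the octanoyl-CoA example is chosen only because it is a medium-chain substrate.

```python
def beta_oxidation_yield(n_carbons):
    """Products of complete beta-oxidation of a saturated, even-chain acyl-CoA.

    Assumes textbook stoichiometry: each cycle removes two carbons as
    acetyl-CoA and produces one FADH2 and one NADH.
    """
    if n_carbons < 4 or n_carbons % 2:
        raise ValueError("expects an even chain length of at least 4 carbons")
    # The final cycle splits the remaining C4 unit into two acetyl-CoA,
    # so the number of cycles is one less than the number of acetyl-CoA.
    cycles = n_carbons // 2 - 1
    return {
        "cycles": cycles,
        "acetyl_coa": n_carbons // 2,
        "fadh2": cycles,
        "nadh": cycles,
    }


# Octanoyl-CoA (C8), a medium-chain substrate
print(beta_oxidation_yield(8))
```

For octanoyl-CoA this gives three cycles, four acetyl-CoA, three FADH2, and three NADH.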
A. MCAD: The Enzyme
MCAD is a flavoprotein and, like all other known β-oxidation enzymes, is encoded in the nucleus. The structure and catalytic mechanism of MCAD have been investigated in detail (63). MCAD belongs to the acyl-CoA dehydrogenase gene family. Members of this family are found throughout phylogeny from bacteria to humans (64). In human cells seven representatives
are known (65): the four straight-chain acyl-CoA dehydrogenases (VLCAD, LCAD, MCAD, and SCAD), and glutaryl-CoA dehydrogenase, isovaleryl-CoA dehydrogenase, and short-branched-chain acyl-CoA dehydrogenase, which are involved in amino acid metabolism. All enzymes contain one molecule of noncovalently bound flavin adenine dinucleotide (FAD) per subunit. Except for VLCAD, which is dimeric and membrane associated, all are tetramers composed of four identical subunits. Sequence similarity also exists between the mitochondrial acyl-CoA dehydrogenases and peroxisomal acyl-CoA oxidase, which catalyzes the first step in peroxisomal β-oxidation (65, 66). Like VLCAD, the peroxisomal oxidase is dimeric, with two identical subunits. The three-dimensional structure of porcine MCAD has been determined (67). The tetrameric enzyme displays a 2-2-2 symmetry with three distinct sets of intersubunit contacts. Each subunit is organized in an amino-terminal α-helix domain, a central β-sheet domain, and a carboxyl-terminal α-helix domain. The active site with the noncovalently bound FAD is buried between the three domains. A catalytically active glutamic acid residue has been identified biochemically (68) and characterized by site-directed mutagenesis experiments (69).
B. MCAD Deficiency: The Disease
The first patient with MCAD deficiency was described in 1976 (70). The disease was studied in the following years and characterized biochemically in 1982-1983 (71-73). Thereafter an increasing number of MCAD-deficient patients were discovered and described. MCAD deficiency is an autosomal recessively inherited defect due to mutations in the MCAD gene. MCAD deficiency presents with life-threatening attacks, which typically are triggered by fasting stress, often in connection with feverish viral infections (74, 75). The clinical phenotype is characterized by nonketotic hypoglycemia and lethargy, potentially leading to coma and death. The patients are usually clinically asymptomatic in the periods between the attacks. In these asymptomatic periods, urine metabolites that indicate the disease are difficult to detect (76). The cDNA sequence for human MCAD was elucidated in 1987 (77). Although this provided the means to detect mutations in the MCAD coding region, it took 3 more years until the first mutation in MCAD deficiency was described. Four groups reported the G985 mutation at about the same time (78-81). It is a replacement of an adenine nucleotide at position 985 of the cDNA sequence with a guanine, altering the codon for lysine at position 304 of the amino acid sequence for mature MCAD to a glutamic acid codon (K304E).
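The coordinates of the G985 mutation can be checked with simple codon arithmetic. The sketch below assumes that cDNA numbering starts at the first base of the initiator codon of the precursor; this is an assumption, not something stated above. The helper names are hypothetical, and only the four relevant entries of the standard genetic code are included.

```python
# Subset of the standard genetic code needed for this example.
CODON_TABLE = {"AAA": "K", "AAG": "K", "GAA": "E", "GAG": "E"}


def locate(cdna_pos):
    """Return (codon number, 0-based offset within the codon) for a cDNA position.

    Assumes position 1 is the first base of the initiator codon.
    """
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3


def mutate(codon, offset, new_base):
    """Replace the base at `offset` in `codon` with `new_base`."""
    return codon[:offset] + new_base + codon[offset + 1:]


codon_number, offset = locate(985)
for wild_type in ("AAA", "AAG"):  # both lysine codons
    mutant = mutate(wild_type, offset, "G")
    print(wild_type, CODON_TABLE[wild_type], "->", mutant, CODON_TABLE[mutant])

# Difference between precursor codon number and mature residue number 304
print(codon_number, codon_number - 304)
```

Under this numbering assumption, position 985 is the first base of precursor codon 329, and an A-to-G change there converts either lysine codon into a glutamic acid codon. The difference of 25 between the precursor codon number and mature residue 304 would correspond to the length of the transit peptide.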
In the following years it turned out that the G985 mutation is by far the most frequent mutation in MCAD deficiency. It is present in about 90% of the alleles of MCAD-deficient patients (82, 83). Studies in many different countries revealed that the G985 mutation is rather frequent in northwest European Caucasians, among whom it occurs with an average allele frequency of approximately 1 in 100 individuals (84, 85). The frequency is particularly high in the Netherlands and England and lower in southern Europe, and no G985 alleles could be detected in samples from 500 Japanese individuals. The homozygote frequency in northwest Europeans is about 1 in 10,000 to 1 in 40,000 newborn babies, which is in the same range as that for phenylketonuria. A series of different non-G985 mutations has also been detected, and these mutations have so far been found in only a few patients each (83, 86-93). MCAD deficiency may be considered to be a conditional disorder since clinical disease only appears under specific environmental conditions, namely fasting stress, often in connection with fever. This means that it is uncertain whether individuals with the disorder will ever experience clinical symptoms if not exposed to these stress conditions. Cases are well known where MCAD deficiency in asymptomatic individuals was diagnosed only after younger siblings manifested the disease and thus triggered clinical investigation of other family members. An anecdotal case of a 30-year-old individual who had his first attack after strenuous exercise in the cold with insufficient nutrition was reported (94).
It has been suggested that retrospective investigation after detection of the disorder will reveal symptomatic episodes in most if not all of the cases (74). In spite of the often mild presentation and the long asymptomatic periods, MCAD deficiency is a potentially lethal disease and may cause developmental disturbances if unrecognized (74). In addition to the conditional expression, there may also be genetic factors that render some individuals more susceptible to attacks than others, as is discussed further later.
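The G985 frequencies quoted above can be related to each other with a short calculation. The sketch assumes Hardy-Weinberg proportions, which the text does not state, and reads the figure of "1 in 100" in two ways: as an allele frequency, and as a carrier frequency from which the allele frequency is approximated as half. Both readings and the function names are illustrative assumptions.

```python
from fractions import Fraction


def homozygote_frequency(allele_freq):
    """Expected homozygote frequency q^2 under Hardy-Weinberg proportions."""
    return allele_freq ** 2


def allele_freq_from_carriers(carrier_freq):
    """Approximate allele frequency q from carrier frequency 2pq (valid for small q)."""
    return carrier_freq / 2


one_in_100 = Fraction(1, 100)

# Reading 1: 1 in 100 is the allele frequency.
print(homozygote_frequency(one_in_100))

# Reading 2: 1 in 100 individuals is a carrier.
print(homozygote_frequency(allele_freq_from_carriers(one_in_100)))
```

The two readings give 1/10000 and 1/40000, which bracket the homozygote frequency range cited for northwest Europeans.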
III. Studies on the Molecular Pathology of MCAD Deficiency
A. Characterization of the K304E Mutant Variant
As mentioned earlier, the first mutation detected in MCAD deficiency was the prevalent G985 mutation, which results in replacement of lysine with glutamic acid at position 304 (K304E) of the mature polypeptide (78-81). Lysine-304 is localized in helix H, which forms part of the subunit interface
(67, 95, 96). It is distant from the active site and neither directly nor indirectly involved in the mechanism of catalysis. One of the problems at the time of discovery of the mutation was that no simple explanation was evident based on the x-ray structure or other known properties of the enzyme.
1. DEGRADATION AND AGGREGATION
The first expression studies with the K304E mutant variant were performed by us using an E. coli expression system. We used a derivative of a construct that we had produced in order to express and study active-site mutants (69); it contains the part of the human MCAD cDNA encoding the mature protein, preceded by an artificial initiator methionine. The results were as follows. The total amount of K304E MCAD antigen present in the bacterial cells was similar to that of the wild type. However, there was a significantly reduced amount of soluble K304E MCAD as compared to wild-type MCAD expressed under the same conditions, and K304E MCAD enzyme activity was undetectable in the cell extracts (79). This led us to suggest that the mutation has an effect on the acquisition of the native structure rather than on the stability of the tetramer. It is well known that many proteins upon heterologous expression in E. coli do not fold correctly but rather form insoluble aggregates, so-called inclusion bodies. In this case, however, a protein that folds rather well in the E. coli environment is converted to an “inclusion body former” by a single point mutation. Expression studies in eukaryotic COS-7 cells confirmed the disease-causing nature of the K304E mutation (97), and similar results were obtained later with CHO cells (98). These experiments indicated further that the K304E mutant protein is capable of acquiring the active native conformation. Overexpression in COS-7 cells resulted in a relative MCAD activity level augmented to more than 20-fold for wild-type MCAD and approximately 2-fold for the K304E variant as compared to controls. In the case of the K304E mutant this was not an unequivocal proof that the elevated activity was due to the presence of active mutant enzyme. We could later show by two-dimensional gel electrophoresis that K304E MCAD antigen was present in COS-7 cells in amounts corresponding to the 2-fold increased activity (99).
No indications of significant amounts of persistent aggregates of K304E MCAD in COS-7 cells were observed (97). Thus, in the eukaryotic system, the level of K304E antigen roughly coincided with the amount of MCAD enzyme activity measured. Earlier experiments with pulse-labeling-immunoprecipitation of fibroblast MCAD from cells from normal individuals and MCAD-deficient patients had shown that transcription and translation of MCAD were unaffected by the G985 mutation (100). Taken together, this suggested that a considerable proportion of K304E MCAD is degraded in
the eukaryotic system. Using two-dimensional gel electrophoresis we could detect K304E MCAD antigen in lymphoblast cells derived from patient lymphocytes (99). The level of antigen was significantly decreased in comparison to the wild-type polypeptide, as could be observed directly in the blots derived from cells with one G985 allele and one normal allele. However, in cultured fibroblasts we were only able to detect wild-type but not K304E mutant MCAD antigen using the two-dimensional gel electrophoresis system. This is in line with Western blotting analysis performed in other laboratories on patient fibroblasts, which showed that K304E MCAD antigen was undetectable (101, 102). The relative level of wild-type MCAD in fibroblast cells is lower than in lymphoblast cells, and K304E antigen in fibroblast cells may thus be below the detection level. Altogether, this indicated that the amounts of K304E MCAD are dramatically decreased in eukaryotic cells, probably due to degradation of the polypeptide.
2. IMPAIRED FOLDING-SUBUNIT ASSEMBLY AND THE INVOLVEMENT OF CHAPERONINS
The E. coli expression approach appeared to be in a deadlock as far as further characterization of the K304E mutant protein was concerned. Most of the K304E mutant protein expressed under the conditions applied was present in insoluble aggregates. However, there were indications that a minor fraction of the mutant protein was present in a tetrameric form. Low amounts of K304E MCAD could be detected in a form eluting at the same position as wild-type tetramers in gel filtration chromatography (P. Bross, unpublished observations). We attempted to circumvent the aggregation problem by using the experience from expression of heterologous proteins in E. coli, which had shown that adjustment of the expression conditions sometimes resulted in avoidance of inclusion bodies. It had been shown that, for example, ribulose bisphosphate carboxylase from plant chloroplasts could only be expressed in an active form in E. coli when the chaperonins GroEL and GroES were co-overexpressed (103). Co-overexpression of the chaperonins GroES/L together with mature wild-type or K304E mutant MCAD had a dramatic effect (104): Upon GroES/L overexpression, active tetrameric K304E MCAD appeared in clearly detectable amounts (Fig. 2). The enzyme activity level for the K304E mutant enzyme was more than 10-fold higher than the background level in the E. coli cells, thus establishing that K304E MCAD is capable of forming active tetrameric enzyme. GroES/L co-overexpression had only a minor effect on the relative amount of active wild-type MCAD produced. The relative level of K304E MCAD enzyme activity in the extracts was approximately 7% of the levels of wild-type MCAD expressed under the same conditions. This
[Plot: K304E; x-axis, time after induction (hours): 0, 1, 3, 6, 25]
FIG. 2. The effect of chaperonin co-overexpression on the amounts of active K304E MCAD enzyme produced in bacteria. Escherichia coli cells transformed with the MCAD K304E expression plasmid together with the plasmid encoding the GroES/L chaperonins (+GroESL) or a control plasmid (-GroESL) were induced for different time intervals and MCAD enzyme activity was measured in the soluble extracts. (Reprinted from P. Bross, B. S. Andresen, V. Winter, et al., Biochim. Biophys. Acta 1182, 264 (1993) with kind permission of Elsevier Science-NL, Sara Burghartstraat 25, 1055 KV Amsterdam, The Netherlands.)
made it possible to partially purify recombinant K304E MCAD from the E. coli cells and perform a rough estimation of its specific enzyme activity toward octanoyl-CoA. The specific activity turned out to be in the range of that of the wild-type enzyme, probably somewhat lower. At this point it was evident that K304E MCAD was capable of forming active enzyme, that the amino acid replacement impaired the process of acquisition of the native conformation, and that this impairment could be partially overcome by supplying a higher level of chaperonins. As most chaperones are regulated and strongly induced under cellular stress conditions, this indicated that the residual activity of the K304E mutant variant might depend on the availability of cellular chaperonins and thus be subject to interindividual differences due to genetic variations in these proteins and their regulation. Analysis of the influence of chaperonin co-overexpression on the total amount of MCAD antigen and the partition between soluble and insoluble species was performed. Cells transformed with the respective MCAD plasmid and either the GroES/L plasmid or a derivative where the GroES/L
genes had been deleted were grown under identical conditions (i.e., growth medium with antibiotics selecting for both plasmids, temperature, point of induction). This allowed us to achieve very similar results in repeated experiments. There was a minor effect on partition of the wild-type polypeptide, indicating that the level of chaperonins in the cells not overexpressing GroES/L is limiting. For the K304E mutant, high chaperonin levels shifted the partition of antigen between soluble and insoluble species, resulting in clearly increased levels of soluble K304E mutant protein. Gel filtration chromatography of the soluble extracts from the co-overexpression experiments followed by detection of MCAD antigen in the fractions by Western blotting showed that two MCAD species appeared upon GroES/L co-overexpression, one with an apparent molecular mass corresponding to the tetramer and another one with a higher molecular mass. The same result was obtained by native gel electrophoresis and Western blotting (Fig. 3A): Two bands could be observed, one corresponding to the tetramer and another one with a higher mass that co-migrated with the GroEL complex as shown by using anti-GroEL antibodies (Fig. 3B). The amount of MCAD co-migrating with GroEL was distinctly higher in the samples where GroES/L had been co-overexpressed, strongly suggesting that this MCAD species was complexed with GroES/L. This is in accordance with the notion that the availability of higher amounts of chaperonins that can complex with labile structural MCAD intermediates prevents them from irreversible aggregation. Evidence for the presence of an intermediate high-molecular-weight complex of MCAD also came from a different approach, in this case using a eukaryotic system.
FIG. 3. Native polyacrylamide gel electrophoresis of wild-type and mutant MCAD expressed in E. coli cells. Cells co-transformed with the control plasmid (1) or a plasmid encoding wild-type (2), K304E (3), or K304Q (3) MCAD together with the GroES/L plasmid (+) or a control plasmid lacking the GroES/L genes (-) were grown and induced for 3 hours. Cells were harvested, disrupted, and subjected to 4-15% polyacrylamide gel electrophoresis in the absence of detergent. The gels were blotted and probed with anti-MCAD antibodies (A) or anti-GroEL antibodies (B). The positions of tetrameric MCAD and the GroES/L complex are indicated. (Reprinted from P. Bross, B. S. Andresen, V. Winter, et al. Biochim. Biophys. Acta 1182, 264 (1993) with kind permission of Elsevier Science-NL, Sara Burghartstraat 25, 1055 KV Amsterdam, The Netherlands.)
The group of Kay Tanaka performed in vitro translation experiments of wild-type or K304E MCAD mRNA in the presence of rat mitochondria (105). By pulse labeling with 35S-methionine followed by different time intervals of chase with unlabeled methionine and subsequent sizing of the products by gel filtration chromatography, the dynamics of the process of acquisition of the native tetrameric structure could be monitored. In this way it could be observed that an intermediate high-molecular-weight complex was formed that was the precursor form for the tetrameric form. Very small amounts of monomers were observed, but they appeared with kinetics that indicated that they might be direct precursors for the high-molecular-weight form rather than the tetrameric species. Experiments with wild-type MCAD and the K304E mutant demonstrated that the formation of tetramers was dramatically impaired for K304E mutant MCAD. The K304E mutant polypeptide was retained for longer time periods in the high-molecular-weight complex. The presence of a high-molecular-weight complex of MCAD had been observed in similar experiments already in 1987 (106) and had been interpreted in terms of aggregates. The new results together with the notion that this species apparently was the precursor for the tetrameric form suggested immediately the involvement of chaperones/chaperonins because the concept of chaperone-assisted folding and assembly had taken shape. In following up this route, Saijo et al. (107) refined their in vitro translation system by immunoprecipitation using anti-Hsp70 and anti-Hsp60 antibodies. It could be shown that, after import into mitochondria, wild-type MCAD intermittently formed complexes first with mitochondrial Hsp70 and subsequently with Hsp60 before tetrameric forms appeared. This established for MCAD, a genuine mitochondrial protein, that its intramitochondrial chaperone-assisted folding and assembly proceeded along the pathway found for model proteins in mitochondria and in E. coli (see Fig. 1). For the
K304E mutant these authors could demonstrate that, while its kinetics of interaction with Hsp70 were unaffected, the mutant polypeptide remained associated with Hsp60 for much longer time intervals than did wild-type chains. After 18 min chase of the in vitro translation products, low amounts of tetrameric K304E species appeared while the major part of the labeled product was still in the complex. The wild-type polypeptide, in contrast, was to a large degree converted to tetramers after this chase period. The native MCAD enzyme carries a prosthetic group, flavin adenine dinucleotide (FAD). One molecule of FAD is buried between the three domains of each MCAD monomer. Each FAD molecule also forms contacts with side chains of the neighboring subunit in the tetramer (67). Analysis of tissues from rats fed a riboflavin-deficient diet has shown that the amount of antigen of MCAD and the other acyl-CoA dehydrogenases (that is, the steady-state amount of the proteins) critically depended on the presence of FAD (108). When the in vitro translation experiments described previously were performed in the presence of mitochondria derived from riboflavin-deficient rat liver tissue, the picture was very similar to that observed upon translation of K304E MCAD in the presence of normal rat mitochondria. MCAD polypeptide was retained in complex with Hsp60 (109). Disruption of the riboflavin-deficient mitochondria and subsequent incubation with FAD in the presence of ATP released the block on folding and subunit assembly, allowing production of tetramers. This showed that lack of FAD has an effect similar to that of the K304E missense mutation: blocking of the transfer of Hsp60-associated folding intermediates to assembled tetramers. Altogether, the in vitro translation experiments and the chaperonin co-overexpression experiments in E. coli demonstrated that the K304E mutation caused a severe disturbance of the process(es) occurring during interaction of the polypeptide with the Hsp60 chaperonin.
Analysis of the dynamics of the process by the translation experiments revealed that association with Hsp60 persisted, and the co-overexpression experiments indicated that conversion of a larger proportion of the K304E polypeptide into native tetramers could be accomplished by supplying higher levels of chaperonins. This indicates that the pool of available Hsp60 sets a limit to the yield of correctly folded and assembled mutant tetramers.
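The prolonged retention of the K304E polypeptide in the Hsp60 complex can be illustrated with a minimal kinetic sketch. It assumes, purely for illustration, an irreversible two-step first-order scheme (imported chain, then Hsp60-bound intermediate, then released chain competent for assembly). The function name and all rate constants are hypothetical and were not measured in the experiments described; tetramer assembly itself is not modeled.

```python
import math


def pathway_fractions(t_min, k_bind, k_release):
    """Return (free, hsp60_bound, released) fractions at time t_min.

    Analytical solution of the sequential first-order scheme
    free -> hsp60_bound -> released, starting with all chains free.
    """
    free = math.exp(-k_bind * t_min)
    if math.isclose(k_bind, k_release):
        # Degenerate case of equal rate constants.
        bound = k_bind * t_min * math.exp(-k_bind * t_min)
    else:
        bound = k_bind / (k_release - k_bind) * (
            math.exp(-k_bind * t_min) - math.exp(-k_release * t_min)
        )
    released = 1.0 - free - bound
    return free, bound, released


# Hypothetical rate constants (per minute), chosen only to illustrate
# a mutant whose release from Hsp60 is slowed while binding is unchanged.
K_BIND = 0.5
K_RELEASE_WILD_TYPE = 0.2
K_RELEASE_MUTANT = 0.02

if __name__ == "__main__":
    for label, k_release in (("wild type", K_RELEASE_WILD_TYPE),
                             ("slow-release mutant", K_RELEASE_MUTANT)):
        free, bound, released = pathway_fractions(18.0, K_BIND, k_release)
        print(f"{label}: bound={bound:.2f}, released={released:.2f}")
```

With these assumed values, most wild-type chains have left the complex after an 18 min chase, whereas most of the slow-release variant is still bound, which is the qualitative pattern reported for K304E.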
3. IMPAIRED FOLDING ACCOUNTS FOR PART OF THE EFFECT
We further exploited the bacterial chaperonin co-overexpression system to study the effect of the K304E mutation on folding in more detail. As a second manipulatable parameter besides the level of chaperonins, we varied the growth temperature. Hydrophobic interactions increase in strength with temperature, and this parameter is therefore an important factor for polypeptide
folding because the probability of side reactions such as inappropriate intra- or intermolecular interactions of hydrophobic stretches leading to aggregation will significantly increase with increasing temperature. A set of experiments with these two variables was performed and the influence on solubility, amount of MCAD antigen, and level of active MCAD enzyme was studied comparing the K304E mutant with wild-type MCAD, the R28C mutant found in patients, and the artificial K304Q mutant (110). Three important results could be observed in these experiments (Fig. 4):
1. Lower temperature in itself had a positive effect on folding of the mutant proteins and it further added to the rescuing effect of increased chaperonin levels.
2. At low temperature and high chaperonin levels, K304E MCAD could be rescued to a level of 40-50% of the wild-type enzyme, whereas both the R28C and the K304Q mutant could be rescued to near 100% wild-type levels.
3. Low chaperonin levels resulted in disposal of a large proportion of the mutant variants into insoluble aggregates at high growth temperatures, whereas degradation was the consequence at lower growth temperature (compare, e.g., R28C expressed without GroES/L co-overexpression at 31/34°C to 37/41°C).
To rule out an effect of temperature on the stability of the folded enzyme variants, we measured the temperature stability profiles of the active enzymes. There was no difference observed between wild-type MCAD and the R28C and K304Q mutant enzymes. This showed that the R28C and K304Q mutants may be counted among the temperature-sensitive folding mutants (see Section I,D), which are impaired in folding but not in the stability of the native state. For the K304E mutation, the picture was more complex: The temperature stability profile was shifted to lower temperature, demonstrating that, besides its effect on folding-assembly, the K304E mutation also decreases the thermal stability of the native form.
The relevance of the results obtained with the co-overexpression experiments in bacteria was checked by analysis of the relative amounts of K304E MCAD antigen present in lymphoblast cells grown at different temperatures in order to vary the intrinsic cellular folding conditions. Lymphoblast cells carrying one K304E and one wild-type MCAD allele were analyzed by two-dimensional gel electrophoresis followed by Western blotting with anti-MCAD antibodies. The ratio between wild-type MCAD and the K304E mutant proteins was monitored. The relative level of K304E compared to wild-type MCAD decreased with higher temperature (110). This was in agreement with the results obtained with recombinant bacterial expression
and a good indication for the notion that the formation of active K304E MCAD enzyme is temperature sensitive in human cells.
FIG. 4. Expression of wild-type and mutant MCAD with and without chaperonin co-overexpression at various growth temperatures. E. coli cells co-transformed with plasmids encoding the MCAD variant genes indicated, together with a plasmid encoding the GroES/L genes (+) or a control vector (-), were grown at the culture temperatures indicated and induced for 3 hours. Cells were harvested and disrupted and the proteins were split into soluble (s) and insoluble (p) species by centrifugation. Left panel: MCAD enzyme activity was measured in the soluble fraction (open columns, without GroES/L co-overexpression; closed columns, with GroES/L co-overexpression). Right panel: Aliquots of the soluble and insoluble fraction were subjected to SDS-PAGE followed by immunoblotting with anti-MCAD antibodies. (Reprinted from P. Bross, C. Jespersen, T. G. Jensen, et al., J. Biol. Chem. 270, 10284 (1995) by kind permission of the American Society for Biochemistry and Molecular Biology.)
4. EFFECT OF THE CHARGE REPLACEMENT
Lysine-304 is localized in helix H at that surface of the monomer that forms part of the subunit interface. It was thus natural to suspect an effect of the K304E mutation on subunit assembly. Inspection of the three-dimensional structure showed that the side chain of lysine-304 does not directly interact with parts of neighboring subunits. However, two charged side chains in the close vicinity of lysine-304 were identified: glutamic acid-346 and aspartic acid-300 (Fig. 5). The distance between the charged groups is in both cases larger than expected for a typical salt bridge interaction. Lysine-304 forms a hydrogen bond with glutamine-342 from the neighboring helix I in the same subunit. This hydrogen bond can potentially also form between glutamic acid and glutamine-342 in the K304E mutant (67). Aspartic acid-300 forms a salt bridge with arginine-383 of the neighboring subunit. Introduction of a negative charge at position 304 might result in repulsion effects between the charged groups, altering the topology of the subunit interface. This in turn might impair subunit docking. There is evidence that assembly of subunits in oligomeric proteins in general occurs through the formation of contacts involving hydrophobic, charge, and hydrogen bond interactions closely equivalent to those occurring during domain docking and merging in the folding process of multidomain proteins (111). Oligomer assembly may thus be considered to be a late folding event. According to theoretical calculations only one subunit of MCAD can be accommodated in the cavity of the Hsp60 chaperonin complex (24). Folded MCAD monomers must therefore be expected to leave the cavity in order to assemble with other monomers in free solution. It has been indicated that chaperonins do not actively promote subunit assembly (2). The failure of chaperonin co-overexpression and growth at low temperature to fully rescue the effect of the K304E mutation may thus be due to a distinct effect of the mutation on subunit assembly.
FIG. 5. Enlarged view of the vicinity of lysine-304 in the crystal structure of porcine MCAD. For details see text. (Reprinted from P. Bross, C. Jespersen, T. G. Jensen, et al., J. Biol. Chem. 270, 10284 (1995) by kind permission of the American Society for Biochemistry and Molecular Biology.)
The effect of the charge replacement in the K304E mutant variant was investigated experimentally in several ways. Yokota and co-workers (105) had, parallel to their experiments with wild-type and K304E mutant MCAD, analyzed the artificially constructed mutant variants K304D and K304R. The K304R mutant, which carries a positively charged side group at position 304, behaved like the wild-type MCAD while the variant with the negatively
charged side chain (K304D) was retained in the high-molecular-weight complex in the same way as observed for the K304E mutant. This demonstrated that the sign of the charge was particularly important. We constructed a variant with an uncharged side chain (K304Q). The side chain of glutamine is very similar in shape to that of glutamic acid but lacks the negative charge. In the bacterial expression system with and without chaperonin co-overexpression, the K304Q mutation turned out to be less severe than K304E. Expressed without co-overexpression of GroES/L, amounts of active K304Q mutant enzyme significantly higher than that of K304E and lower than wild-type MCAD were produced (see Fig. 4). At lower growth temperature and with chaperonin co-overexpression, the amounts of active K304Q mutant enzyme increased to levels similar to those of the wild type (104). The effect of the neutral-for-positive substitution on folding was thus milder and could be rescued almost fully by supplying higher levels of chaperonins. Furthermore, the thermal stability of the K304Q mutant was similar to the wild type, indicating that the observed effect of the K304E mutation on this property was due to introduction of a negative charge rather than the elimination of the positive one. In order to investigate this question further, we attempted to compensate for the potential charge repulsion effect between glutamic acid-304 and the neighboring charged groups in the vicinity (see Fig. 5) by introducing secondary mutations at position 300 or 346 (110). The study of the double mutant K304E-D346K revealed that this mutant produced significantly higher levels of active enzyme in the E. coli expression system than both the K304E and D346K single mutants did. The secondary mutation thus partially rescued the detrimental effect of the K304E mutation.
The relative amounts of active K304E-D346K mutant protein increased significantly when the GroES/L chaperonins were co-overexpressed, indicating that the effect on folding persisted. In contrast to this, the D300K mutation further decreased the amounts of active enzyme produced when present together with the K304E mutation, suggesting that there is no relevant charge interaction between these two residues. On the one hand, the K304E-D346K double mutant polypeptide behaved very similarly to the K304Q mutant: It could be rescued to almost wild-type level by chaperonin co-overexpression and production at low temperature. However, its temperature stability profile was shifted to even lower temperatures than that for the K304E mutant, whereas the profile of the K304Q mutant closely resembled the wild type. This suggested that the D346K replacement interferes with the stability of the enzyme, although it renders subunit assembly more efficient; that is, assembly kinetics are enhanced but at the same time the thermodynamic stability of the assembled
tetramer is decreased. It is interesting to note that the D346K mutation in itself displayed an even lower thermal stability than the K304E-D346K double mutant. The thermostability thus decreases in the order wild type > K304E > K304E-D346K > D346K. This suggests that both the K304E and D346K mutations individually decrease the stability, and that the combination of the D346K with the K304E mutation partially rescues the very strong effect of the D346K replacement on this parameter. The conclusions obtained with investigation of the molecular effects of the K304E mutation and the key experiments are summarized in Table I. The results with the K304E mutant and artificial mutants scrutinizing the role of lysine-304 for folding, assembly, and stability strongly suggest that the replacement of lysine-304 with glutamic acid has three distinct effects: it (i) causes impaired folding, (ii) causes impaired tetramer assembly, and (iii) compromises the thermal stability of the enzyme. It is difficult to assess the contribution of each effect to the dramatically decreased amounts of K304E MCAD enzyme observed in patient fibroblasts. The effect of chaperonin co-overexpression and low growth temperature in the bacterial system suggests that impaired folding may account for up to 40-50% of the decrease.
TABLE I
MAJOR CONCLUSIONS IN THE INVESTIGATION OF THE MOLECULAR PATHOLOGY OF MCAD DEFICIENCY DUE TO THE K304E MUTATION
Conclusions | Key experiments | Reference(s)
Synthesis-import normal | Pulse chase-immunoprecipitation using patient cells | 100
Missense mutation (G985/K304E) | Sequence analysis of patient cDNA | 78-81
Effect on folding | Expression in E. coli ± chaperonin co-overexpression; in vitro translation-import immunoprecipitation | 104, 109
No/minor effect on specific enzyme activity | Partial purification of mutant variant expressed in E. coli | 104
Effect on tetramer assembly | Site-directed mutagenesis-expression of artificial mutants | 110
Effect on tetramer stability | Biochemical analysis of overexpressed mutant enzyme | 110
B. Characterization of Other Missense Mutations in MCAD Deficiency
At present, nine non-K304E missense mutations in MCAD have been characterized by expression in E. coli (92, 110, 112, 112a) and some also in COS-7 cells (86, 89, 112). As shown in Table II, six of the mutant variants possess residual enzyme activity. Moreover, five of these six respond to chaperonin co-overexpression, suggesting that in these at least part of the molecular mechanism is due to impaired folding.
TABLE II
MCAD MISSENSE MUTATIONS CHARACTERIZED BY EXPRESSION WITH AND WITHOUT CHAPERONIN CO-OVEREXPRESSION IN E. COLI
MCAD mutation | Residual enzyme activity | Chaperonin effect | Corresponding mutation in other acyl-CoA dehydrogenases
R28C | + | + |
M124I | + | + |
T168A | + | - |
G170R | - | - | G170V
G242R | + | + |
M301T | - | - |
K304E | + | + |
S311R | - | - |
Y327C | + | + |
The R28C mutant variant was investigated in parallel in the co-overexpression experiments discussed in Section III,A (110). From these results, the R28C mutation appears to affect only monomer folding and not the stability of the assembled enzyme. This mutation may thus be designated as a pure folding mutation (110). The R28C mutation has also been expressed in COS-7 cells, and amounts of active mutant enzyme between approximately 50% and wild-type level were detected in this system (89, 112). Our data show that the mutation potentially is particularly sensitive to factors deteriorating the folding conditions in the cell. Arginine-28 is conserved in the human acyl-CoA dehydrogenases SCAD, LCAD, and glutaryl-CoA dehydrogenase. Short-branched-chain acyl-CoA dehydrogenase has a lysine at this position and VLCAD and isovaleryl-CoA dehydrogenase carry valine and alanine, respectively (65). A mutation of the corresponding residue in SCAD has been detected in a SCAD-deficient patient (113). Inspection of the three-dimensional structure of MCAD reveals that arginine-28 is localized in helix A and forms a salt bridge with glutamic acid-86 in helix D. The latter residue is conserved in all known human acyl-CoA dehydrogenases except VLCAD. Both side chains protrude toward the surface of the tetramer (89). An interaction of arginine-28 with glutamic acid-86 or other residue(s) in the polypeptide may thus be a kinetically relevant but not essential element in the folding process.
The M124I, G242R, and Y327C mutations all affect folding and can, to variable degrees, be rescued by chaperonin co-overexpression (112a). This indicates that, like the K304E and the R28C mutations, at least part of the effect of these mutations is on polypeptide folding.
Two mutations located in the same helix as the K304E mutation, the M301T and S311R mutations, were expressed in both bacteria and COS-7 cells. Neither of these proteins produced any detectable amount of tetramer (86, 112). The side chain of methionine-301 points toward the inside of the monomer and is buried in a hydrophobic pocket. Introduction of an amino acid with a sterically different polar side chain in this tightly packed hydrophobic environment may conceivably disturb the architecture of this part of the protein in such a way that acquisition of the native structure is not possible. Serine-311, like lysine-304, is localized close to the subunit interface. The S311R mutation introduces a charge and a much longer side chain and may therefore disturb oligomer assembly more dramatically than the K304E mutation does.
The T168A mutation is the only mutation detected in patients so far that affects a residue directly interacting with the substrate or co-factor. Threonine-168 potentially forms a hydrogen bond with the N5 of the bound flavin (67). Expression experiments in E.
coli indicate that significantly lower amounts of active enzyme are produced. However, co-overexpression of chaperonins does not significantly increase these levels, indicating that the T168A mutation affects folding not at all or only to a minor extent (B. S. Andresen et al., unpublished). The influence of the T168A substitution on enzyme function is currently being analyzed. The G170R mutation is localized in the same turn as threonine-168; however, its side chain does not interact with the bound FAD. Prokaryotic expression of this mutant variant (92; B. S. Andresen et al., unpublished) does not produce any active enzyme. The introduction of a bulky charged side chain at this tightly packed position of the molecule plausibly disturbs the domain to such an extent that the structure of the whole enzyme cannot be formed. A mutation at the corresponding position in isovaleryl-CoA dehydrogenase (G170V) has been detected in a patient with isovaleryl-CoA dehydrogenase deficiency (114).
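The reasoning used in this section to sort the mutations can be written as a small decision rule. The sketch below is an illustration only: the `Variant` type, the `classify` function, the category labels, and the reduction of each observation to a yes/no value are assumptions introduced here. The entries encode only what the text states for each variant in the E. coli system.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical record of two observations from E. coli expression."""
    name: str
    residual_activity: bool   # any active enzyme detected
    chaperonin_effect: bool   # activity raised by GroES/L co-overexpression


def classify(variant: Variant) -> str:
    """Assign a coarse category following the argument in the text."""
    if not variant.residual_activity:
        return "no active enzyme detected"
    if variant.chaperonin_effect:
        return "folding impaired (at least in part)"
    return "effect not primarily on folding"


# Observations as described in the text for the E. coli system.
VARIANTS = [
    Variant("R28C", True, True),
    Variant("M124I", True, True),
    Variant("T168A", True, False),
    Variant("G170R", False, False),
    Variant("G242R", True, True),
    Variant("M301T", False, False),
    Variant("K304E", True, True),
    Variant("S311R", False, False),
    Variant("Y327C", True, True),
]

if __name__ == "__main__":
    for variant in VARIANTS:
        print(f"{variant.name}: {classify(variant)}")
```

A two-valued scheme of this kind cannot capture partial effects, such as the additional assembly and stability defects described for K304E, so it should be read as a first sorting step rather than a full characterization.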
IV. Conclusions
A. Characterization of the Molecular Effects of Missense Mutations: Choice of the Expression System
Our experience with the characterization of the effects of mutations using recombinant expression of MCAD mutant variants in both prokaryotic and eukaryotic systems shows that the choice of the expression system is important. The mutations affecting folding were particularly suited for analysis in the E. coli system as the chaperonin levels could be readily manipulated. Compelling evidence suggests that mitochondria originate from eubacteria that through an endosymbiont intermediate have been integrated into eukaryotic cells (20). This is in accordance with many homology comparisons between bacterial proteins and mitochondrial and cytoplasmic proteins of eukaryotes (e.g., the Hsp70 family of chaperones) (13, 115, 116). As far as chaperone-assisted folding including the Hsp70 and Hsp60 chaperone machineries is concerned, homologs to all bacterial components have been identified in yeast mitochondria and to most of them in human mitochondria. Chaperones are promiscuous in their substrate specificities. Comparison of the binding specificities between Hsp70 chaperone homologs from E. coli and the eukaryotic cytosol and endoplasmic reticulum showed that the general specificity of all three was identical, with minor differences regarding particular positions in the overall consensus. The differences between folding in the mitochondrial matrix space and the bacterial cytoplasm thus arise from the fact that the nascent chain in bacteria emerges from the ribosome, whereas it protrudes into the matrix space through the mitochondrial transport machinery. However, in both cases the unfolded polypeptide chain initially forms a complex with the respective Hsp70 so that the starting point for folding is similar. Escherichia coli may thus be a relevant expression system for at least some mitochondrial matrix proteins.
We have compared the results obtained with expression of MCAD mutants in the E. coli system and eukaryotic COS-7 cells (112). As far as folding mutations are concerned, the folding environment in E. coli appears to be less permissive, but similar residual levels of enzyme activity were obtained when the GroES/L chaperonins were co-overproduced. The advantage of the bacterial system is thus that a given mutant variant may be expressed at two different chaperonin levels, thus revealing whether the mutation in question has an effect on folding. It would be desirable to develop a similar manipulatable eukaryotic expression system so that the protein in question can be analyzed in an environment more similar to its normal occurrence. When using heterologous expression approaches for characterization of
mutant proteins, one must be aware that the results sometimes are dependent on the particular expression system and the conditions used. A particular system may lead to exaggeration of effects of mutations on the one hand or to overlooking of effects on the other.
B. Future Lines of Research To Improve the Ability To Predict Effects of Mutations
The active site of a typical enzyme is formed by a very limited number of the residues of the protein. Furthermore, as discussed in Section I,D, only 50% or less of the amino acid positions may be essential for the formation of the functional structure. This means that mutations with an effect on the functional activity of a protein or its ability to acquire the native state will typically lie within these essential residues. At present the capabilities to predict the effect of a mutation on folding are still very limited. If the three-dimensional structure of a protein is known, as in the case of MCAD, one rather general rule may be applied. Replacements of amino acids whose side chains are buried in the inside of the protein are subject to steric restrictions that allow only amino acids with side chains of similar size to be accommodated. Much larger, bulkier side chains will be impossible to accommodate and therefore strongly disturb the structure of the affected domain. A typical example for this type is the G170R mutation in MCAD deficiency (see Section III,B). As apparently many mutations in genetic deficiencies interfere with folding, a better understanding of the second genetic code would be very helpful for making reliable predictions. The exponential growth of detected mutations in genetic diseases through the availability of fast sequence analysis methods will cause the understanding of the molecular effects of mutations to lag further and further behind because empirical characterization of the mutant variant is much more time consuming. The enormously growing knowledge base of mutations affecting the life cycle of affected proteins provided by genetic analysis of patients may be very useful to pinpoint sites in the primary sequence that are particularly important for folding.
Such experimental tools are otherwise only available by genetic selection experiments in yeast or bacteria. Mutual fruitful research cooperation on selected model proteins where many different mutations are known may thus contribute to better understanding the structural information contained in a coding sequence and to predicting the consequences of mutations for protein folding, assembly, stability, and function.
C. Impaired Folding Resulting in Decreased Stability: Common Theme in Genetic Diseases
Characterization of the effects of missense mutations detected in patients with MCAD deficiency shows that such mutations rather commonly affect
folding. Furthermore, many of the mutations detected in VLCAD deficiency are missense mutations and deletions of single amino acids (65, 117-119). The affected residues are located away from the active site as deduced from homology comparison with the MCAD structure. The mutations result in decreased levels of VLCAD antigen as monitored by analysis of cultured fibroblast cell lines from patients (65) and, for some, by recombinant expression in CHO cells (118). Only two disease-causing missense mutations have been identified in SCAD deficiency so far (113). One of the mutations (R22W) substitutes the residue corresponding to arginine-28 in MCAD that is affected by the R28C mutation discussed in Section III,B. Moreover, one (G170V) of the two missense mutations detected in isovaleryl-CoA dehydrogenase deficiency affects the residue corresponding to the one replaced by the G170R mutation in MCAD deficiency (114). This coincidence of the mutation site in two cases again suggests that there exists a subset of residues that are particularly important. In the past it has generally been recognized that impaired folding caused by mutations is an often-encountered phenomenon in genetic diseases (120-122). In two of the best investigated and frequently occurring genetic diseases, α-1-antitrypsin deficiency and cystic fibrosis, the respective major mutations have been investigated in detail. Both affect proteins passing through the endoplasmic reticulum. For the Z mutation in α-1-antitrypsin, experimental evidence suggests that the mutant protein is impaired in folding, resulting in the accumulation of aggregation-prone intermediates (123). Accumulation of the aggregates results in cell damage.
The phenotype is particularly severe in a subgroup of patients, and experiments with fibroblast cells suggest that the severe phenotype correlates with a lower capacity for degradation (124). In cystic fibrosis, temperature-sensitive processing of the prevalent Δ508 mutant variant has been observed (125), and it has been shown that folding of the polypeptide is impaired when it harbors the Δ508 mutation (126, 127). The functional activity of the mutant variant appears not to be affected; however, its stability after insertion into the plasma membrane is decreased (128). These examples could be extended and many new ones may be expected in the future. One important aspect of folding mutations is that they may provide an explanation for variable clinical expression of defects due to one specific mutation in an affected gene. Such variability can be observed in MCAD-deficient patients harboring the K304E (G985) mutation as well as in other genetic diseases, for example, retinitis pigmentosa due to mutations in the rhodopsin gene (129). As folding mutants often possess a residual capability to acquire the native conformation, the level of functional mutant variant formed may depend on factors in the cellular environment, such as the availability of chaperones. The cellular environment for folding
may in turn be determined by genetic diversity. It remains to be established to what extent phenotypic variability observed in specific instances is due to this mechanism.
D. Protein Quality Control Systems
Impairment of the acquisition of the native structure often results in decreased levels of the mutant proteins. The handling of a folding mutant by the cellular folding and assembly assistants on the one side and the degradation system on the other is decisive for the proportion of the given protein that acquires the functional structure. For the endoplasmic reticulum, a quality control mechanism has been proposed that ensures that only correctly folded and assembled proteins leave this compartment for their destination in other compartments of the cell or secretion (130). Misfolded proteins exposing hydrophobic surfaces that may interact nonspecifically with other folding intermediates present a danger for every cellular compartment. Therefore, it must be assumed that protein quality control systems exist in every compartment where unfolded proteins occur. The factors assisting intracellular folding have been investigated in some detail in recent years, but there is still only limited knowledge about the proteolytic systems that secure the removal of malfolded and aggregated proteins and the cross-talk between these two systems. For controlling the protein levels, degradation is just as important as synthesis. It is known that proteins have widely different half lives (131, 132). Specific degradation of cytoplasmic and nuclear regulatory proteins through the proteasome system has been investigated (133-135). As far as mitochondria are concerned, a process called macroautophagy accounts for lysosomal degradation of areas of the cytoplasm that may contain whole organelles like mitochondria (132). This process appears to be nonspecific and degradation is independent of whether the proteins are folded or unfolded. In addition, there appear to be several degradative pathways both inside and outside mitochondria that provide for the specific degradation of mitochondrial proteins.
The specific degradation of certain short-lived proteins and malfolded polypeptides occurs through specific proteases and cofactors that are present inside the mitochondrion. Most of the knowledge on proteases in the mitochondrial matrix space stems from analogy with proteases in E. coli and in some cases yeast (136, 137). Homologs of the soluble E. coli proteases Lon and ClpP have been detected in human cells (35, 138). It has been demonstrated that the yeast analog of Lon protease (PIM1) cooperates with mitochondrial Hsp70 and Hsp40 for degradation of misfolded proteins (139). Lon protease consists of an ATPase domain and a proteolytic domain (140). For an artificial model protein that is unable to fold, it has been shown that its degradation in E. coli requires
both ClpP and the chaperonins GroES/L but not the cofactors ClpA or ClpX (141, 142). Whether ClpP in certain cases directly interacts with GroEL and thus receives the substrate protein for degradation is not clear. The proteolytically active unit of ClpP protease consists of a tetradecameric double ring of ClpP subunits to which one or two rings composed of ClpA or ClpX subunits are attached (143). This structure is reminiscent of the architecture of the proteasome (143) and GroEL. A direct interaction of the double rings of GroEL and ClpP without the involvement of ClpA or ClpX can be imagined. Due to their links to the Hsp70 and Hsp60 chaperone machines, respectively, the Lon and ClpP proteolytic systems appear to attack folding intermediates at different steps in the folding pathway. This may be due to different substrate specificities of the inherent (Lon) or associated (ClpP) chaperones that, like Hsp70 and Hsp60, prefer extended or more compact structural intermediates, respectively.
A flow scheme illustrating our current model of the intramitochondrial route that a typical soluble matrix polypeptide using both the Hsp70 and the Hsp60 chaperone machines may take is depicted in Fig. 6. In the productive pathway the polypeptide acquires the native structure supervised by the chaperones. Polypeptides that for some reason fold slowly or become kinetically trapped may be channeled to the proteolytic systems by an unknown mechanism. Alternatively, they may be released by the chaperones, and kinetically trapped conformers are then either captured by the proteolytic systems or recaptured by one of the chaperone machines in order to reinitiate folding. There is thus a competition for the folding polypeptide between the chaperone machines and the respective linked proteolytic systems. Folding mutations increase the time necessary for folding and thus expose the polypeptide for longer time intervals in a conformation that is accessible to the proteases. The decision between another chance for folding and degradation may be simply statistical: that is, the ratio between the levels of available chaperones supervising folding on one side and the levels of the components of the degradative systems on the other side is the determining factor. Variation in the handling of folding mutants by the quality control system, due to varying environmental conditions or to genetic diversity in the genes encoding its components, is a mechanism that may account for some of the phenotypic variation observed in genetic diseases. More understanding of the basic mechanisms of the protein quality control systems is necessary to estimate whether this is a common reason for phenotypic variability.
FIG. 6. Flow scheme for the fate of a typical mitochondrial matrix protein during folding in mitochondria.
ACKNOWLEDGMENTS
This work was supported by grants from the Danish Medical Research Council and the Danish Center for Human Genome Research.
REFERENCES
1. Different authors, FASEB J. 10, 1 (1996).
2. F. U. Hartl, Nature (London) 381, 571 (1996).
3. C. B. Anfinsen, Science 181, 223 (1973).
4. C. Levinthal, J. Chim. Phys. 65, 44 (1968).
5. O. B. Ptitsyn, FASEB J. 10, 3 (1996).
6. O. B. Ptitsyn, Adv. Protein Chem. 47, 83 (1995).
7. C. M. Dobson, Nature Struct. Biol. 2, 513 (1995).
8. O. B. Ptitsyn and V. N. Uversky, FEBS Lett. 341, 15 (1994).
9. A. R. Fersht, Curr. Opin. Struct. Biol. 5, 79 (1995).
10. A. Miranker, C. V. Robinson, S. E. Radford, and C. M. Dobson, FASEB J. 10, 93 (1996).
11. A. R. Fersht, Proc. Natl. Acad. Sci. U.S.A. 92, 10869 (1995).
12. R. J. Ellis, Philos. Trans. R. Soc. Lond. Ser. B 339, 257 (1993).
13. R. S. Gupta, G. B. Golding, and B. Singh, J. Mol. Evol. 39, 537 (1994).
14. C. Georgopoulos and W. J. Welch, Annu. Rev. Cell Biol. 9, 601 (1993).
15. J. Rassow, W. Voos, and N. Pfanner, Trends Cell Biol. 5, 207 (1995).
16. A. M. Fourie, J. F. Sambrook, and M. J. H. Gething, J. Biol. Chem. 269, 30470 (1994).
17. G. C. Flynn, J. Pohl, M. T. Flocco, and J. E. Rothman, Nature (London) 353, 726 (1991).
18. S. J. Landry, R. Jordan, R. McMacken, and L. M. Gierasch, Nature (London) 355, 455 (1992).
19. X. T. Zhu, X. Zhao, W. F. Burkholder, A. Gragerov, C. M. Ogata, M. E. Gottesman, and W. A. Hendrickson, Science 272, 1606 (1996).
20. R. S. Gupta, Mol. Microbiol. 15, 1 (1995).
21. K. Braig, Z. Otwinowski, R. Hegde, D. C. Boisvert, A. Joachimiak, A. L. Horwich, and P. B. Sigler, Nature (London) 371, 578 (1994).
22. D. C. Boisvert, J. M. Wang, Z. Otwinowski, A. L. Horwich, and P. B. Sigler, Nature Struct. Biol. 3, 170 (1996).
23. J. F. Hunt, A. J. Weaver, S. J. Landry, L. Gierasch, and J. Deisenhofer, Nature (London) 379, 37 (1996).
24. A. R. Clarke and P. A. Lund, in “Chaperonins” (R. J. Ellis, ed.), p. 167. Academic Press, San Diego, 1996.
25. J. S. Weissman, H. S. Rye, W. A. Fenton, J. M. Beechem, and A. L. Horwich, Cell 84, 481 (1996).
26. W. A. Fenton, J. S. Weissman, and A. L. Horwich, Chem. Biol. 3, 157 (1996).
27. M. J. Todd, G. H. Lorimer, and D. Thirumalai, Proc. Natl. Acad. Sci. U.S.A. 93, 4030 (1996).
28. C. Squires and C. L. Squires, J. Bacteriol. 174, 1081 (1992).
29. A. K. Clarke, J. Biosci. 21, 161 (1996).
30. M. Schmitt, W. Neupert, and T. Langer, EMBO J. 14, 3434 (1995).
31. S. Wickner, S. Gottesman, D. Skowyra, J. Hoskins, K. McKenney, and M. R. Maurizi, Proc. Natl. Acad. Sci. U.S.A. 91, 12218 (1994).
32. A. Wawrzynow, D. Wojtkowiak, J. Marszalek, B. Banecki, M. Jonsen, B. Graves, C. Georgopoulos, and M. Zylicz, EMBO J. 14, 1867 (1995).
33. S. Gottesman, W. P. Clark, V. de Crecy-Lagard, and M. R. Maurizi, J. Biol. Chem. 268, 22618 (1993).
34. I. Levchenko, L. Luo, and T. A. Baker, Genes Dev. 9, 2399 (1995).
35. P. Bross, B. S. Andresen, I. Knudsen, T. A. Kruse, and N. Gregersen, FEBS Lett. 377, 249 (1995).
36. R. A. Stuart, D. M. Cyr, E. A. Craig, and W. Neupert, Trends Biochem. Sci. 19, 87 (1994).
37. N. Pfanner, E. A. Craig, and M. Meijer, Trends Biochem. Sci. 19, 368 (1994).
38. C. Ungermann, W. Neupert, and D. M. Cyr, Science 266, 1250 (1994).
39. N. G. Kronidou, W. Oppliger, L. Bolliger, K. Hannavy, B. S. Glick, G. Schatz, and M. Horst, Proc. Natl. Acad. Sci. U.S.A. 91, 12818 (1994).
40. C. Ungermann, B. Guiard, W. Neupert, and D. M. Cyr, EMBO J. 15, 735 (1996).
41. G. H. Lorimer, FASEB J. 10, 5 (1996).
42. A. L. Horwich, K. B. Low, W. A. Fenton, I. N. Hirshfield, and K. Furtak, Cell 74, 909 (1993).
43. M. Y. Cheng, F. U. Hartl, J. Martin, R. A. Pollock, F. Kalousek, W. Neupert, E. M. Hallberg, R. L. Hallberg, and A. L. Horwich, Nature (London) 337, 620 (1989).
44. S. Rospert, R. Looser, Y. Dubaquie, A. Matouschek, B. S. Glick, and G. Schatz, EMBO J. 15, 764 (1996).
45. A. Huckriede and E. Agsteribbe, Biochim. Biophys. Acta Mol. Basis Dis. 1227, 200 (1994).
46. E. Agsteribbe, A. Huckriede, M. Veenhuis, M. H. Ruiters, K. E. Niezen Koning, O. H. Skjeldal, K. Skullerud, R. S. Gupta, R. Hallberg, O. P. van Diggelen, et al., Biochem. Biophys. Res. Commun. 193, 146 (1993).
47. A. Matouschek, S. Rospert, K. Schmid, B. S. Glick, and G. Schatz, Proc. Natl. Acad. Sci. U.S.A. 92, 6319 (1995).
48. J. Rassow, K. Mohrs, S. Koidl, I. B. Barthelmess, N. Pfanner, and M. Tropschug, Mol. Cell. Biol. 15, 2654 (1995).
49. F. X. Schmid, L. M. Mayr, M. Mucke, and E. R. Schonbrunner, Adv. Protein Chem. 44, 25 (1993).
50. J. King, C. Haase Pettingell, A. S. Robinson, M. Speed, and A. Mitraki, FASEB J. 10, 57 (1996).
51. A. Mitraki and J. King, FEBS Lett. 307, 20 (1992).
52. A. Mitraki, M. Danner, J. King, and R. Seckler, J. Biol. Chem. 268, 20071 (1993).
53. A. Mitraki, B. Fane, C. Haase Pettingell, J. Sturtevant, and J. King, Science 253, 54 (1991).
54. B. Fane, R. Villafane, A. Mitraki, and J. King, J. Biol. Chem. 266, 11640 (1991).
55. B. W. Matthews, FASEB J. 10, 35 (1996).
56. C. D. Bottema, R. P. Ketterling, S. Ii, H. S. Yoon, J. A. Phillips III, and S. S. Sommer, Am. J. Hum. Genet. 49, 820 (1991).
57. W. H. Kunau, V. Dommes, and H. Schulz, Prog. Lipid Res. 34, 267 (1995).
58. C. D. Moyes, Comp. Biochem. Physiol. [A] 113, 69 (1996).
59. J. R. Neely and H. E. Morgan, Annu. Rev. Physiol. 36, 413 (1974).
60. K. Izai, Y. Uchida, T. Orii, S. Yamamoto, and T. Hashimoto, J. Biol. Chem. 267, 1027 (1992).
61. M. A. Nada, W. J. Rhead, H. Sprecher, H. Schulz, and C. R. Roe, J. Biol. Chem. 270, 530 (1995).
62. C. Thorpe, in “Chemistry and Biochemistry of Flavoenzymes” (F. Muller, ed.), p. 471. CRC Press, Boca Raton, FL, 1991.
63. C. Thorpe and J. J. P. Kim, FASEB J. 9, 718 (1995).
64. A. Nandy, B. Küchler, and S. Ghisla, Biochem. Soc. Trans. 24, 105 (1996).
65. B. S. Andresen, P. Bross, C. Vianey-Saban, P. Divry, M. T. Zabot, C. R. Roe, M. A. Nada, A. Byskov, T. A. Kruse, S. Neve, K. Kristiansen, I. Knudsen, M. J. Corydon, and N. Gregersen, Hum. Mol. Genet. 5, 461 (1996).
66. K. Tanaka and Y. Indo, in “New Developments in Fatty Acid Oxidation” (P. M. Coates and K. Tanaka, eds.), p. 95. Wiley-Liss, New York, 1992.
67. J. J. Kim, M. Wang, and R. Paschke, Proc. Natl. Acad. Sci. U.S.A. 90, 7523 (1993).
68. P. J. Powell and C. Thorpe, Biochemistry 27, 8022 (1988).
69. P. Bross, S. Engst, A. W. Strauss, D. P. Kelly, I. Rasched, and S. Ghisla, J. Biol. Chem. 265, 7116 (1990).
70. N. Gregersen, R. Lauritzen, and K. Rasmussen, Clin. Chim. Acta 70, 417 (1976).
71. S. Kølvraa, N. Gregersen, E. Christensen, and N. Hobolth, Clin. Chim. Acta 126, 53 (1982).
72. W. J. Rhead, B. A. Amendt, K. S. Fritchman, and S. J. Felts, Science 221, 73 (1983).
73. C. A. Stanley, D. E. Hale, P. M. Coates, C. L. Hall, B. E. Corkey, W. Yang, R. I. Kelley, E. L. Gonzales, J. R. Williamson, and L. Baker, Pediatr. Res. 17, 877 (1983).
74. A. K. Iafolla, R. J. Thompson, and C. R. Roe, J. Pediatr. 124, 409 (1994).
75. C. R. Roe and P. M. Coates, in “The Metabolic and Molecular Basis of Inherited Disease” (C. R. Scriver, A. L. Beaudet, W. S. Sly, and D. Valle, eds.), p. 1501. McGraw-Hill, New York, 1995.
76. N. Gregersen, V. Winter, S. Lyonnet, J. M. Saudubray, U. Wendel, T. G. Jensen, B. S. Andresen, S. Kølvraa, W. Lehnert, L. Bolund, E. Christensen, and P. Bross, J. Inherit. Metab. Dis. 17, 169 (1994).
77. D. P. Kelly, J. J. Kim, J. J. Billadello, B. E. Hainline, T. W. Chu, and A. W. Strauss, Proc. Natl. Acad. Sci. U.S.A. 84, 4068 (1987).
78. I. Yokota, Y. Indo, P. M. Coates, and K. Tanaka, J. Clin. Invest. 86, 1000 (1990).
79. N. Gregersen, B. S. Andresen, P. Bross, V. Winter, N. Rudiger, S. Engst, E. Christensen, D. Kelly, A. W. Strauss, S. Kølvraa, L. Bolund, and S. Ghisla, Hum. Genet. 86, 545 (1991).
80. Y. Matsubara, K. Narisawa, S. Miyabayashi, K. Tada, P. M. Coates, C. Bachmann, L. J. Elsas, R. J. Pollitt, W. J. Rhead, and C. R. Roe, Biochem. Biophys. Res. Commun. 171, 498 (1990).
81. D. P. Kelly, A. J. Whelan, M. L. Ogden, R. Alpers, Z. F. Zhang, G. Bellus, N. Gregersen, L. Dorland, and A. W. Strauss, Proc. Natl. Acad. Sci. U.S.A. 87, 9236 (1990).
82. N. Gregersen, A. I. Blakemore, V. Winter, B. Andresen, S. Kølvraa, L. Bolund, D. Curtis, and P. C. Engel, Clin. Chim. Acta 203, 23 (1991).
83. I. Yokota, P. M. Coates, D. E. Hale, P. Rinaldo, and K. Tanaka, Am. J. Hum. Genet. 49, 1280 (1991).
84. N. Gregersen, V. Winter, D. Curtis, T. Deufel, M. Mack, J. Hendrickx, P. J. Willems, A. Ponzone, T. Parella, R. Ponzone, J. H. Ding, W. Zhang, Y. T. Chen, S. Kahler, C. R. Roe, S. Kølvraa, K. Schneiderman, B. S. Andresen, P. Bross, and L. Bolund, Hum. Hered. 43, 342 (1993).
85. Y. Matsubara, K. Narisawa, K. Tada, H. Ikeda, Y. Q. Yao, D. M. Danks, A. Green, and E. R. McCabe, Lancet 338, 552 (1991).
86. B. S. Andresen, T. G. Jensen, P. Bross, I. Knudsen, V. Winter, S. Kølvraa, L. Bolund, J. H. Ding, Y. T. Chen, J. L. K. Vanhove, D. Curtis, I. Yokota, K. Tanaka, J. J. P. Kim, and N. Gregersen, Am. J. Hum. Genet. 54, 975 (1994).
87. K. Tanaka, I. Yokota, P. M. Coates, A. W. Strauss, D. P. Kelly, Z. Zhang, N. Gregersen, B. S. Andresen, Y. Matsubara, D. Curtis, and Y.-T. Chen, Hum. Mutat. 1, 271 (1992).
88. B. S. Andresen, S. Kølvraa, P. Bross, L. Bolund, D. Curtis, H. Eiberg, Z. F. Zhang, D. P. Kelly, A. W. Strauss, and N. Gregersen, Hum. Mol. Genet. 2, 488 (1993).
89. B. S. Andresen, P. Bross, T. G. Jensen, V. Winter, I. Knudsen, S. Kølvraa, U. B. Jensen, L. Bolund, M. Duran, J. J. Kim, D. Curtis, P. Divry, C. Vianey-Saban, and N. Gregersen, Am. J. Hum. Genet. 53, 730 (1993).
90. J. H. Ding, B. Z. Yang, Y. Bao, C. R. Roe, and Y. T. Chen, Am. J. Hum. Genet. 50, 229 (1992).
91. A. A. M. Morris, R. W. Taylor, R. N. Lightowlers, A. Aynsleygreen, K. Bartlett, and D. M. Turnbull, Hum. Mol. Genet. 4, 747 (1995).
92. J. C. Brackett, H. F. Sims, R. D. Steiner, M. Nunge, E. M. Zimmerman, B. Demartinville, P. Rinaldo, R. Slaugh, and A. W. Strauss, J. Clin. Invest. 94, 1477 (1994).
93. R. Ziadeh, E. P. Hoffman, D. N. Finegold, R. C. Hoop, J. C. Brackett, A. W. Strauss, and E. W. Naylor, Pediatr. Res. 37, 675 (1995).
94. W. Ruitenbeek, P. J. E. Poels, D. M. Turnbull, B. Garavaglia, R. A. Chalmers, R. W. Taylor, and F. J. M. Gabreels, J. Neurol. Neurosurg. Psychiatry 58, 209 (1995).
95. J. J. Kim, M. Wang, S. Djordjevic, and R. Paschke, Prog. Clin. Biol. Res. 375, 111 (1992).
96. J. J. Kim and J. Wu, Prog. Clin. Biol. Res. 321, 569 (1990).
97. T. G. Jensen, B. S. Andresen, P. Bross, U. B. Jensen, E. Holme, S. Kølvraa, N. Gregersen, and L. Bolund, Biochim. Biophys. Acta 1180, 65 (1992).
98. A. J. Whelan, A. W. Strauss, D. E. Hale, N. J. Mendelsohn, and D. P. Kelly, Pediatr. Res. 34, 694 (1993).
99. P. Bross, T. G. Jensen, B. S. Andresen, M. Kjeldsen, A. Nandy, S. Kølvraa, S. Ghisla, I. Rasched, L. Bolund, and N. Gregersen, Biochem. Med. Metab. Biol. 52, 36 (1994).
100. Y. Ikeda, D. E. Hale, S. M. Keese, P. M. Coates, and K. Tanaka, Pediatr. Res. 20, 843 (1986).
101. I. Ogilvie, S. Jackson, K. Bartlett, and D. M. Turnbull, Biochem. Med. Metab. Biol. 46, 373 (1991).
102. P. M. Coates, Y. Indo, D. Young, D. E. Hale, and K. Tanaka, Pediatr. Res. 31, 34 (1992).
103. P. Goloubinoff, A. A. Gatenby, and G. H. Lorimer, Nature (London) 337, 44 (1989).
104. P. Bross, B. S. Andresen, V. Winter, F. Krautle, T. G. Jensen, A. Nandy, S. Kølvraa, S. Ghisla, L. Bolund, and N. Gregersen, Biochim. Biophys. Acta 1182, 264 (1993).
105. I. Yokota, T. Saijo, J. Vockley, and K. Tanaka, J. Biol. Chem. 267, 26004 (1992).
106. Y. Ikeda, S. M. Keese, W. A. Fenton, and K. Tanaka, Arch. Biochem. Biophys. 252, 662 (1987).
107. T. Saijo, W. J. Welch, and K. Tanaka, J. Biol. Chem. 269, 4401 (1994).
108. M. Nagao and K. Tanaka, J. Biol. Chem. 267, 17925 (1992).
109. T. Saijo and K. Tanaka, J. Biol. Chem. 270, 1899 (1995).
110. P. Bross, C. Jespersen, T. G. Jensen, B. S. Andresen, M. J. Kristensen, V. Winter, A. Nandy, F. Krautle, S. Ghisla, L. Bolund, J. J. P. Kim, and N. Gregersen, J. Biol. Chem. 270, 10284 (1995).
111. R. Jaenicke, Prog. Biophys. Mol. Biol. 49, 117 (1987).
112. T. G. Jensen, P. Bross, B. S. Andresen, T. B. Lund, T. J. Kristensen, U. B. Jensen, V. Winther, S. Kølvraa, N. Gregersen, and L. Bolund, Hum. Mutat. 6, 226 (1995).
112a. B. S. Andresen, P. Bross, S. Udvari, J. Kirk, G. Gray, S. Kmoch, N. Chamoles, I. Knudsen, V. Winter, B. Wilcken, I. Yokota, K. Hart, S. Packman, J. P. Harpey, J. M. Saudubray, D. E. Hale, L. Bolund, S. Kølvraa, and N. Gregersen, Hum. Mol. Genet. 6, 695 (1997).
113. E. Naito, Y. Indo, and K. Tanaka, J. Clin. Invest. 85, 1575 (1990).
114. J. Vockley, B. Parimoo, and K. Tanaka, Am. J. Hum. Genet. 49, 147 (1991).
115. R. S. Gupta and G. B. Golding, J. Mol. Evol. 37, 573 (1993).
116. W. R. Boorstein, T. Ziegelhoffer, and E. A. Craig, J. Mol. Evol. 38, 1 (1994).
117. A. W. Strauss, C. K. Powell, D. E. Hale, M. M. Anderson, A. Ahuja, J. C. Brackett, and H. F. Sims, Proc. Natl. Acad. Sci. U.S.A. 92, 10496 (1995).
118. M. Souri, T. Aoyama, K. Orii, S. Yamaguchi, and T. Hashimoto, Am. J. Hum. Genet. 58, 97 (1996).
119. B. S. Andresen, C. Vianey-Saban, P. Bross, P. Divry, C. R. Roe, M. A. Nada, I. Knudsen, and N. Gregersen, J. Inherit. Metab. Dis. 19, 169 (1996).
120. P. J. Thomas, B. H. Qu, and P. L. Pedersen, Trends Biochem. Sci. 20, 456 (1995).
121. R. N. Sifers, Nature Struct. Biol. 2, 355 (1995).
122. V. E. Bychkova and O. B. Ptitsyn, FEBS Lett. 359, 6 (1995).
123. M. H. Yu, K. N. Lee, and J. Kim, Nature Struct. Biol. 2, 363 (1995).
124. Y. Wu, I. Whitman, E. Molmenti, K. Moore, P. Hippenmeyer, and D. H. Perlmutter, Proc. Natl. Acad. Sci. U.S.A. 91, 9014 (1994).
125. G. M. Denning, M. P. Anderson, J. F. Amara, J. Marshall, A. E. Smith, and M. J. Welsh, Nature (London) 358, 761 (1992).
126. B. H. Qu and P. J. Thomas, J. Biol. Chem. 271, 7261 (1996).
127. I. Yike, J. Ye, Y. Zhang, P. Manavalan, T. A. Gerken, and D. G. Dearborn, Protein Sci. 5, 89 (1996).
128. C. Li, M. Ramjeesingh, E. Reyes, T. Jensen, X. Chang, J. M. Rommens, and C. E. Bear, Nature Genet. 3, 311 (1993).
129. E. L. Berson, Proc. Natl. Acad. Sci. U.S.A. 93, 4526 (1996).
130. S. M. Hurtley and A. Helenius, Annu. Rev. Cell Biol. 5, 277 (1989).
131. H. P. Jennissen, Eur. J. Biochem. 231, 1 (1995).
132. A. J. Tanner and J. F. Dice, Biochem. Mol. Med. 57, 1 (1996).
133. T. Tamura, I. Nagy, A. Lupas, F. Lottspeich, Z. Cejka, G. Schoofs, K. Tanaka, R. Demot, and W. Baumeister, Curr. Biol. 5, 766 (1995).
134. A. L. Goldberg, Chem. Biol. 2, 503 (1995).
135. S. Jentsch and S. Schlenker, Cell 82, 881 (1995).
136. M. R. Maurizi, W. P. Clark, Y. Katayama, S. Rudikoff, J. Pumphrey, B. Bowers, and S. Gottesman, J. Biol. Chem. 265, 12536 (1990).
137. A. L. Goldberg, Eur. J. Biochem. 203, 9 (1992).
138. N. Wang, S. Gottesman, M. C. Willingham, M. M. Gottesman, and M. R. Maurizi, Proc. Natl. Acad. Sci. U.S.A. 90, 11247 (1993).
139. I. Wagner, H. Arlt, L. Vandyck, T. Langer, and W. Neupert, EMBO J. 13, 5135 (1994).
140. A. L. Goldberg, R. P. Moerschell, C. H. Chung, and M. R. Maurizi, Methods Enzymol. 244, 350 (1994).
141. O. Kandror, L. Busconi, M. Sherman, and A. L. Goldberg, J. Biol. Chem. 269, 23575 (1994).
142. O. Kandror, M. Sherman, M. Rhode, and A. L. Goldberg, EMBO J. 14, 6021 (1995).
143. M. Kessel, M. R. Maurizi, B. Kim, E. Kocsis, B. L. Trus, S. K. Singh, and A. C. Steven, J. Mol. Biol. 250, 587 (1995).
Interaction of Retroviral Reverse Transcriptase with Template-Primer Duplexes during Replication
ERIC J. ARTS AND STUART F. J. LE GRICE¹
Center for AIDS Research and Division of Infectious Diseases
Case Western Reserve University School of Medicine
Cleveland, Ohio 44106-4984
I. Human Immunodeficiency Virus Reverse Transcriptase
A. The HIV Replication Cycle
B. Biogenesis of Reverse Transcriptase
C. Structural Features of p66-p51 HIV-1 RT
II. tRNALys,3-Mediated Initiation of (-) Strand DNA Synthesis
A. Packaging of tRNALys,3 into HIV-1 and Its Interaction with the Viral Genome
B. tRNA-Viral RNA Interactions Outside the PBS
C. Initiation and Synthesis of HIV-1 (-) Strand DNA from tRNALys,3
D. Heterodimer-Associated p51 Mediates tRNA-Primed Events in HIV-1
E. Recognition of tRNALys,3-PBS Duplexes by Heterologous RTs
III. Interaction of RT with the Template-Primer Duplex
A. Primer and Template-Grip Motifs
B. Chemical and Enzymatic Footprinting Studies
C. Mutagenesis of Structural Elements Involved in Template-Primer Binding
IV. The RNase H Domain and Hydrolysis of RNA-DNA Hybrids
A. Structure of the RNase H Domain
B. Polymerization-Dependent and -Independent RNase H Activities
C. RNase H-Dependent Steps in Retroviral Replication
D. Polymerization-Independent RNase H Activity and Strand Transfer
V. The Polypurine Tract and Second-Strand Synthesis
A. Selection and Initiation from the 3' Polypurine Tract Primer
B. Mutations in RT Influencing PPT Selection and Extension
C. Central PPT and Central Termination Sequences of Lentiviruses
VI. Conclusions
References
¹ To whom correspondence may be addressed. Telephone: 216-368-6989; Fax: 216-368-2034; e-mail: [email protected].
Progress in Nucleic Acid Research and Molecular Biology, Vol. 58. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0079-6603/98 $25.00
Conversion of the single-stranded RNA of an invading retrovirus into double-stranded proviral DNA is catalyzed in a multi-step process by a single virus-coded enzyme, reverse transcriptase (RT). Achieving this requires a combination of DNA polymerase and ribonuclease H (RNase H) activities, which are located at the amino and carboxy terminus of the enzyme, respectively. Moreover, proviral DNA synthesis requires that three structurally distinct nucleic acid duplexes are accommodated by this enzyme, namely (a) A-form RNA (initiation of minus strand synthesis), (b) non-A, non-B RNA-DNA hybrid (minus strand synthesis and initiation of plus strand synthesis), and (c) B-form duplex DNA (plus strand synthesis). This review summarizes our current understanding of the manner in which retroviral RT interacts with this diverse array of nucleic acid duplexes, exploiting in many cases mutants unable to catalyze a specific event. These studies illustrate that seemingly 'simple' events such as tRNA-primed initiation of minus strand synthesis are considerably more complex, involving intermolecular tRNA-viral RNA interactions outside the primer binding site. Moreover, RNase H activity, generally thought to catalyze non-specific degradation of the RNA-DNA replicative intermediate, is required for highly specialized events including DNA strand transfer and polypurine tract selection. Finally, a unique structure near the center of HIV proviral DNA, the central termination sequence, serves to halt the replication machinery in a manner analogous to termination of transcription. As these highly specialized events are better understood at the molecular level, they may open new avenues of therapeutic intervention in the continuing effort to stem the progression of HIV infection and AIDS. © 1998 Academic Press
Conversion of the single-stranded RNA genome into double-stranded DNA, mediated by virus-coded reverse transcriptase (RT; deoxynucleoside-triphosphate:DNA deoxynucleotidyltransferase, RNA-directed; EC 2.7.7.49), is an obligatory event in the replication cycle of all retroviruses (1). Achieving this requires a combination of RNA- and DNA-dependent DNA polymerase activities, in addition to a degradative function that hydrolyzes the RNA component of the RNA-DNA replication intermediate (ribonuclease H, or RNase H). During the reverse transcription cycle, this single enzyme must accommodate A-form RNA and B-form DNA duplexes, and a non-A, non-B RNA-DNA hybrid, between its catalytic centers. Since documentation in 1970 of an enzyme catalyzing DNA synthesis on an RNA template (2, 3), advances in molecular biology have made recombinant RT from several retroviruses available over the last decade (4-10), allowing a detailed analysis of subunit and subdomain interactions controlling catalytic events. This is currently most pertinent for the enzymes of human immunodeficiency virus types 1 and 2 (HIV-1, HIV-2) (11, 12), due to the unabating spread of HIV infection and the devastating consequences of acquired immunodeficiency syndrome (AIDS). Although therapeutic strategies have been hampered by the rapid emergence of drug-resistant virus (13-15), this should not render RT an inappropriate target for antiviral agents, but rather should serve as a challenge
to more precisely dissect its multiple functions and determine their susceptibility to more rationally designed therapeutic agents. Our understanding of HIV RT would also benefit from a comparative analysis with its counterparts from other retroviruses, many of which are now available in recombinant form (5, 7, 9, 10). Intriguingly, these functionally equivalent enzymes display considerable structural diversity. For example, while the biologically significant form of the HIV-1 and HIV-2 enzymes is a heterodimer of subunits derived from a single gene (16-18), RT from murine leukemia virus (MLV) is a 75-kDa monomer (7). Although the avian counterpart is also a heterodimer, the larger subunit retains the integrase (IN) component of the precursor polyprotein (19), which is removed during maturation in other retroviral systems. Furthermore, in addition to altered subunit organization, evidence suggests that higher order structures between the tRNA replication primer and the viral RNA genome control retroviral replication (20-27), although the exact mechanism might be subtly different for each retrovirus. Data summarized in this review present our current understanding of the interaction of HIV-1 RT with the template-primer duplex during the RNA- and DNA-dependent DNA synthesis phases of replication, and the applicability of these studies to related retroviral enzymes.
I. Human Immunodeficiency Virus Reverse Transcriptase
A. The HIV Replication Cycle
RT-mediated events yielding a ribonucleotide-free, double-stranded, proviral DNA from the single-stranded viral RNA genome are summarized in Fig. 1. Minus (-) strand DNA synthesis initiates from a host tRNA whose 3' terminus shares 18 nucleotides (nt) of complementarity with a region located at the 5' end of the viral RNA genome and designated the primer-binding site (PBS) (28). Examples of tRNA replication primers include tRNALys,3 for the human, simian, and feline immunodeficiency and equine infectious anemia viruses (HIV-1, HIV-2, SIV, FIV, and EIAV, respectively); tRNATrp for Rous sarcoma virus (RSV); tRNAPro for MLV; and tRNALys1,2 for caprine arthritis encephalitis virus (CAEV) (28). Once the replication complex has traversed the tRNA-viral RNA duplex, RNase H activity will degrade RNA of the RNA-DNA replicative intermediate, while the primer terminus (now DNA) is extended at the polymerase catalytic center. RNA-dependent DNA synthesis proceeds as far as the 5' terminus of the viral genome, after which "strong-stop" (-) strand DNA is relocated to the 3' terminus by a
strand transfer event, mediated through homology between 5' and 3' repeat (R and r in Fig. 1) sequences of the retroviral genome (29). As (-) strand DNA synthesis continues from the 3' end of the genome, RNA of the RNA-DNA replicative intermediate is subject to continual degradation, the exception to which is a short purine-rich region near its 3' terminus, designated the polypurine tract (PPT) (30). Plus (+) strand DNA-dependent DNA synthesis initiates from the RNA primer of this RNA-DNA duplex and proceeds toward the 3' end of the genome, using newly synthesized (-) strand DNA as template. Although the (-) strand DNA template at this stage retains the tRNA replication primer, only 18 nucleotides at its 3' end are copied, after which the replication complex stalls, presumably owing to the difficulty of copying a methylated template base (Me-58A in the case of HIV-1; see Section IV). A second strand transfer event subsequently relocates nascent (+) strand strong-stop DNA (without the tRNA primer, which is released by RT-associated RNase H activity) to the 5' end of the genome, using homology between PBS regions of the (-) and (+) strands. Finally, DNA-dependent DNA synthesis continues to generate full-length proviral DNA flanked at both termini by long terminal repeat (LTR) sequences. An additional level of complexity has evolved in lentiviruses, which make use of a second copy of the PPT located within the IN gene at the center of the RNA genome (31, 32; L. Boone, personal communication). A later section deals with (+) strand initiation at this "central" PPT.
To accomplish the events depicted in Fig. 1 requires considerable flexibility of retroviral RT, since the enzyme is required to accommodate several structurally diverse nucleic acid substrates. These include (i) A-form duplex RNA, encountered during initiation of (-) strand synthesis from the tRNA primer; (ii) non-A, non-B RNA-DNA hybrid representing the replication intermediate; (iii) a DNA template-RNA primer heteroduplex during initiation of (+) strand synthesis from the PPT primer; and (iv) B-form duplex DNA during the ensuing (+) strand synthesis. A later section discusses the possibility that an unusual structure adopted by the double-stranded proviral DNA may serve to terminate DNA-dependent DNA synthesis at late stages in replication.
FIG. 1. Reverse transcriptase-mediated conversion of the single-stranded RNA genome into double-stranded proviral DNA. (A) Synthesis of (-) strand DNA (solid line), initiated from a cellular tRNA hybridized to the primer-binding site (PBS), continues to the 5' terminus of the viral RNA genome (open line), during which the resulting RNA-DNA hybrid is hydrolyzed via a synthesis-dependent RNase H activity. (B) Synthesis-independent RNase H activity hydrolyzes the replicative intermediate to within 8-10 nt of the template 5' terminus, thereby permitting transfer of nascent (-) strand DNA to the 3' end of the RNA genome via complementarity of repeat (R) sequences. Thereafter, (-) strand DNA synthesis continues along the RNA genome. (C) Concomitant with (-) strand DNA synthesis, RNase H activity hydrolyzes RNA of the replicative intermediate. The exception to this is the polypurine tract (PPT), which resists hydrolysis to serve as primer for (+) strand synthesis. (+) strand synthesis proceeds along the (-) strand DNA template and over the first 18 nucleotides of the tRNA primer, where a methylated base is encountered. Transient pausing of RT at this base leads to RNase H-mediated excision of the tRNA primer. (D) Homology between PBS regions of (-) and (+) DNA permits a second strand transfer event, relocating nascent (+) strand DNA to the 3' end of the fully elongated (-) strand. Although an intramolecular event is depicted here, intermolecular strand switching is also possible. (E) Following the second strand transfer, DNA-dependent DNA polymerase activity completes synthesis of (-) and (+) strands to yield a double-stranded preintegrative intermediate.
B. Biogenesis of Reverse Transcriptase

In HIV, as with most other retroviruses, RT is synthesized as a component of a larger (~165-kDa) gag-pol precursor polyprotein harboring both structural (gag) and enzymatic (pol) components (33). Although the temporal features of gag-pol maturation remain poorly defined, this precursor is cleaved into its individual components by the pol-coded protease (PR) shortly after the mature virus particle buds from an infected cell. Early amino acid sequencing data demonstrated that virion-derived HIV-1 RT was represented by 66- and 51-kDa subunits collinear at their N-termini (16, 17); it was subsequently demonstrated that these associate into a p66-p51 heterodimer (34). The smaller, or p51, subunit of HIV-1 RT is derived through PR-mediated cleavage of p66 between Phe440 and Tyr441 (8), and therefore lacks the entire C-terminal RNase H domain. The manner in which such partial proteolysis is accomplished has been somewhat controversial, although prevailing dogma suggests that an asymmetrical organization of subunits in the p66-p66 homodimer renders the RNase H domain of one subunit accessible to the retroviral protease (35, 36). 
Following release of a single RNase H domain, the heterodimer presumably adopts a configuration rendering the other domain inaccessible. Surprisingly, despite considerable similarity at the amino acid level, data suggest the smaller subunit of HIV-2 RT arises through alternative cleavage of p66 between Met484 and Ala485, so that p51 of the HIV-2 enzyme contains a 5- to 6-kDa extension at its C-terminus, the consequences of which are currently unclear (37, 38). A similar observation has been made for the closely related enzyme of EIAV, whose smaller subunit results from cleavage of p66 around residue 460 (39). Based on several reports that the smaller (i.e., RNase H-lacking) RT subunit displays low-level polymerase activity and is predominantly distributive in nature (6, 10, 40-42), an extended HIV-2 p51 subunit containing ~40 residues of the RNase H domain might not be expected to impair functions of the parental heterodimer. However, a later section suggests this may have consequences for the ability of the HIV-2 heterodimer to initiate (-) strand DNA synthesis from the tRNA replication primer.
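As a rough consistency check on the figures quoted above, the following Python sketch estimates the size of the HIV-2 p51 extension from the two cleavage positions. It assumes that residue numbering is directly comparable between the HIV-1 and HIV-2 enzymes and uses an average residue mass of about 110 Da, a common rule of thumb that is not taken from the text; the function and constant names are illustrative only.

```python
# Rough estimate of the C-terminal extension carried by HIV-2 p51.
# Assumptions (not from the source text): residue numbering is comparable
# between HIV-1 and HIV-2 RT, and an average residue weighs ~110 Da.

AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption

HIV1_P51_LAST_RESIDUE = 440  # cleavage between Phe440 and Tyr441
HIV2_P51_LAST_RESIDUE = 484  # cleavage between Met484 and Ala485


def extension_length(short_end: int, long_end: int) -> int:
    """Number of extra residues retained by the longer subunit."""
    return long_end - short_end


def approx_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate polypeptide mass in kDa from a residue count."""
    return n_residues * avg_mass_da / 1000.0


extra = extension_length(HIV1_P51_LAST_RESIDUE, HIV2_P51_LAST_RESIDUE)
print(f"extra residues: {extra}")
print(f"approximate mass: {approx_mass_kda(extra):.1f} kDa")
```

Under these assumptions the extension comes to 44 residues, or roughly 5 kDa, which is in line with the approximately 40 residues and the 5- to 6-kDa extension cited above.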
C. Structural Features of p66-p51 HIV-1 RT

The availability of several high-resolution structures for p66-p51 HIV-1 RT (43-48) has been pivotal in understanding the role its constituent subunits play in the parental heterodimer. Figure 2A schematically illustrates a structure of the holoenzyme complexed with the nonnucleoside inhibitor
RT Subdomain      Proposed Functions
p66 Fingers       Positions single-stranded template extension
p51 Fingers       Portions near active site; dimerization
p66 Palm          Polymerase active site; positions template-primer
p51 Palm          Positioning of tRNA replication primer
p66 Thumb         Template-primer binding and translocation
p51 Thumb         Floor of template-primer binding cleft; dimerization
p66 Connection    Floor of template-primer binding cleft; dimerization
p51 Connection    Floor of template-primer binding cleft; dimerization
p66 RNase H       Digests RNA of RNA/DNA hybrid; dimerization
FIG. 2. (A) Structural elements of the 66- and 51-kDa subunits comprising p66-p51 HIV-1 RT. The structure depicted here is derived from a co-crystal of RT and Nevirapine (43). Only the polymerase domain of p66 is shown. α-Helices are depicted by letters, while β-strands are depicted by numbers. Note the alternative arrangement of the connection and thumb subdomains of p51, despite sharing the identical secondary structure to the p66 subunit. (B) Potential roles for subdomains of heterodimer HIV-1 RT (44).
Nevirapine (43). Based on the anatomical resemblance to a right hand, subdomains of the p66 subunit have been designated (from the NH2 terminus) “fingers,” “palm,” and “thumb,” which are linked by the “connection” to the C-terminal RNase H domain (43). As outlined in the previous section, the 66- and 51-kDa subunits differ in that p51 lacks a copy of the RNase H domain. More noteworthy was the finding that, despite sharing virtually identical tertiary structures, the relative positions of the p66 and p51 subdomains were dramatically different. This is best exemplified by the p51 connection subdomain, which occupies an expanded cleft between its fingers and palm. As a consequence, while residues Asp110, Asp185, and Asp186 of p66 are juxtaposed to facilitate catalysis, their counterparts in the p51 subunit are not so grouped. The implication of these findings was that heterodimer-associated p51 would be unlikely to contribute to catalysis, a notion originally proposed from “subunit-selective” in vitro mutagenesis experiments involving reconstituted p66-p51 heterodimers within which amino acids of one subunit had been selectively altered (40, 49). Since p51 is also devoid of RNase H function, its role in heterodimer HIV-1 RT has been a point of conjecture. However, one clue to a potential role for this subunit has been provided by the co-crystal of RT and the nonnucleoside inhibitor Nevirapine. It has been suggested (43) that p51 of the heterodimer may function to sequester and position the tRNA replication primer (tRNALys,3) via its anticodon and D arms, allowing the 3' terminus (hybridized to the PBS of the viral genome) to be positioned at the polymerase catalytic center and aligned for initiation of (-) strand synthesis. 
As presented later, in vitro mutagenesis experiments have supported this hypothesis, suggesting additionally that accommodation of the replication primer precedes disruption of critical tRNA-viral RNA interactions by HIV-1 RT for productive (-) strand synthesis. Finally, despite the inactivity of p51 in the parental heterodimer, a series of subunit-selective mutagenesis experiments have clearly demonstrated that resistance of HIV-1 to the nonnucleoside inhibitor TSAO-m3T is imparted through a mutation in the p51 RT subunit (50). Alteration of a p51 residue (Glu138 → Lys138) results in resistance to TSAO-m3T, while enzyme carrying the equivalent substitution in its p66 subunit remains sensitive.
II. tRNALys,3-Mediated Initiation of (-) Strand DNA Synthesis
A. Packaging of tRNALys,3 into HIV-1 and Its Interaction with the Viral Genome

Prior to the discovery of an RNA-dependent DNA polymerase activity in 1970 (2, 3), RNA tumor viruses were shown to contain RNA species with sedimentation coefficients of 70S and 35S (51). It was later discovered that disruption of the 70S species by denaturation yielded not only three to four 35S RNA molecules, but also discrete low-molecular-weight RNA components of 4S to 9S (52, 53). By the turn of the decade, it became evident that these small RNA fragments were in fact host-derived tRNAs (54-56). Although it was proposed that tRNA was the primer for reverse transcription, it was several years before (i) specific tRNA isoacceptor species were identified as the cognate primers for reverse transcription in specific oncoviruses (57, 58; reviewed in 28) and (ii) the site of tRNA binding for initiation of (-) strand, RNA-dependent DNA synthesis (the PBS) was mapped toward the 5' end of the retroviral RNA genome (59). The cognate primer of HIV-1 and HIV-2, tRNALys,3, was identified through a complementarity between 18 nucleotides at its 3' end and a PBS sequence immediately upstream of the U5 region in the RNA genome (14 12; Fig. 3). As discussed later, selection of tRNALys,3 by HIV-1, and its role in controlling initiation of HIV-1 reverse transcription, is a complex process involving several viral components (60). Although there is no common tRNA isoacceptor species utilized by all retroviruses, most display preferential incorporation of their cognate primer (57, 61-65). In HIV-1, the tRNALys isoacceptor species tRNALys,1, tRNALys,2, and tRNALys,3 are preferentially incorporated during virus assembly, among which only tRNALys,3 has been detected tightly associated with the RNA genome (63, 64, 66). In light of these observations, what is the mechanism underlying preferential incorporation of tRNALys isoacceptors into HIV-1 particles? 
Previous studies indicated that an avian myeloblastosis virus (AMV) mutant lacking RT, but containing viral RNA, was deficient in its cognate primer (tRNATrp), suggesting that RT or larger precursors containing this enzyme were necessary for packaging the replication primer (62). Support for this notion was provided by the observation that AMV and MLV mutants incapable of packaging viral genomic RNA, but containing their RT component, retained the selective incorporation of their cognate tRNA primers demonstrated with wild-type virus (61, 62). Using a protease-defective HIV-1 mutant (i.e., virus that fails to process the p55gag and p160gag-pol precursor polyproteins), preferential incorporation of tRNALys species was found to be unaffected by maturation of these precursors (66). Furthermore, expression of both the p55gag and p160gag-pol precursors in COS-7 cells resulted in production of virus-like particles devoid of genomic RNA, but still enriched for tRNALys isoacceptor species (66). However, this preferential selection was lost in virus-like particles containing solely the p55gag precursor. In contrast to the packaging of the HIV-1 RNA genome, which requires a contribution from the nucleocapsid (NC) component of p55gag (67), incorporation of tRNALys into HIV-1 appears to be dependent on the packaging of p160gag-pol molecules into the budding virus.
FIG. 3. L-shaped (A) and clover-leaf (B) representations of the HIV-1 replication primer, tRNALys,3, indicating bases demonstrated by either chemical cross-linking (open squares) or enzymatic footprinting (open circles) to interact with RT. The 18 nt at the tRNA 3' terminus complementary to the PBS of the viral genome are in bold. Modified bases of tRNALys,3 are: m2G, 2-methylguanosine; m5C, 5-methylcytosine; m7G, 7-methylguanosine; D, dihydrouridine; Ψ, pseudouridine; Tm, methyl ribothymidine; R, N-[(9-β-D-ribofuranosyl-2-methylthiopurin-6-yl)carbamoyl]threonine; S, 2-thio-5-carboxymethyluridine methyl ester. Bases of the anticodon domain were shown by both chemical cross-linking (79) and enzymatic footprinting (114) to interact with RT, while those of the TΨC and D loops were determined by enzymatic footprinting (114).
Although the PBS is not directly involved in packaging tRNALys into HIV particles, this sequence is clearly necessary for the selection of tRNALys,3 over tRNALys,1 and tRNALys,2 for initiation of (-) strand DNA synthesis. Wild-type HIV-1 particles contain approximately 8 molecules of tRNALys,3 and 12 molecules of tRNALys,1 and tRNALys,2 per diploid RNA genome (66, 68), yet
only tRNALys,3 is found tightly associated with the genome. Overexpression of a tRNALys,3 mutant altered in its anticodon loop (34UUU36 substituted by 34CTA36) in cells producing HIV-1 increased the ratio of tRNALys,3 to tRNALys,1 and tRNALys,2 in the virus, although the total amount of tRNALys remained constant at 20 molecules per diploid genome (68). In such virus particles, both mutant and wild-type tRNALys,3 isoacceptors were found tightly associated with the viral RNA genome. Despite the original observation (61) that packaging of the tRNA primer does not require the retroviral genome in MLV, the PBS could still play the ultimate role in primer selection regardless of which tRNA species were packaged into the virus. To test this hypothesis, several researchers have introduced mutations into the PBS sequence of different retroviruses and assessed the consequences of these alterations on virus replication (69-77). Two groups have shown that HIV-1, produced from cells transfected with a proviral clone carrying a deletion of a limited number of nucleotides at the 5' end of the PBS, failed to replicate in CD4+ lymphoblasts. In contrast, deletions introduced into the middle or 3' end of the PBS resulted in only a delay of virus replication compared to wild-type HIV-1 (69, 74). When the PBS of HIV-1 was replaced with a sequence complementary to the 3' end of another tRNA isoacceptor, a similar delay in the appearance of virus was observed (70-72, 78). After 6 days of culturing both phenotypes of PBS-mutated HIV-1, there was an increase in the virus replication kinetics, coinciding with a reversion of the mutated PBS sequences to that of the wild type (i.e., complementary to the 3' end of tRNALys,3) (69-74, 78). 
Although it appears that other tRNA species found at low levels in HIV-1 particles can support initiation of reverse transcription, reversion of the mutated PBS to the wild-type sequence most likely reflects usage of tRNALys,3, covalently linked to (-) strand DNA, as the PBS template during (+) strand DNA synthesis (Fig. 1). Thus there must be factors distinct from the PBS involved in selection of the primer into the virion, its placement on at least the 5' UGG of the PBS, and its ultimate utilization during initiation of (-) strand DNA synthesis. In addition to preferential packaging of tRNALys,3 by the p160gag-pol precursor (63, 64, 66), such factors may include selective binding of tRNALys,3 by HIV-1 RT and NC (79-83) and additional interactions between tRNALys,3 and the HIV-1 RNA genome outside the PBS (25, 84, 85).
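The 18-nt complementarity between the 3' end of tRNALys,3 and the PBS, on which primer selection in this section rests, can be checked with a few lines of Python. This is a minimal sketch: the primer sequence is the 18-nt RNA primer printed in Fig. 6 (written there 3' to 5'), whereas the full PBS sequence is supplied here as an assumption, taken as the commonly cited HIV-1 PBS rather than from the text. The function names are hypothetical.

```python
# Watson-Crick pairing check between the tRNA 3' end and the PBS.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3' in, 5'->3' out)."""
    return "".join(PAIR[base] for base in reversed(rna))


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two 5'->3' strands can form a perfect antiparallel duplex."""
    return strand_a == reverse_complement(strand_b)


# 18-nt primer as printed in Fig. 6 (3'->5'); reversed here to read 5'->3'.
PRIMER_3_TO_5 = "ACCGCGGGCUUGUCCCUG"
primer_5_to_3 = PRIMER_3_TO_5[::-1]

# Assumed HIV-1 PBS, 5'->3' (not given in full in the text).
PBS_5_TO_3 = "UGGCGCCCGAACAGGGAC"

print(primer_5_to_3)
print(is_fully_complementary(primer_5_to_3, PBS_5_TO_3))
print(primer_5_to_3.endswith("CCA"), PBS_5_TO_3.startswith("UGG"))
```

With the assumed PBS, the primer reads GUCCCUGUUCGGGCGCCA and pairs over all 18 positions; its terminal CCA lies opposite the 5' UGG of the PBS mentioned above.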
B. tRNA-Viral RNA Interactions Outside the PBS

Several studies provide a convincing argument that sequences other than the PBS in the RNA genomes of retroviruses and retrotransposons interact with the tRNA replication primer (20-23, 83, 85-89). However, the specific role of these intermolecular tRNA-viral RNA structures remains to be established (68, 90; X. Li, personal communication). In addition to the 18-bp duplex
between consecutive bases at the 3' terminus of tRNATrp (extending from the acceptor stem into the TΨC loop) and the PBS of the RSV genome, nucleotides of its TΨC loop and stem have been demonstrated to interact with U5-inverted repeat (IR) sequences upstream of the PBS. Disruption of this tRNA-viral RNA interaction by introducing alterations to the U5-IR stem resulted in significant decreases in virus replication and reverse transcription (20-23, 28). Although similar higher order structures involving the TΨC arm of tRNA and the U5-IR stem of the viral genome have been proposed for several other retroviruses, including HIV (22, 23, 28, 86), this sequence (sequence B in Fig. 4) does not appear well conserved among different HIV-1 isolates. Sequence analysis of regions immediately adjacent to the PBS in the genome of several HIV-1 strains has revealed alternative, more highly conserved sites that have been proposed to interact with tRNALys,3. One such site, an A-rich sequence contained within a short loop 3' to the PBS (sequence C2 in Fig. 4), has been proposed to interact with the U-rich anticodon loop of the replication primer. Although this interaction was found necessary for efficient annealing of tRNALys,3 to the PBS and initiation of reverse transcription (84), a combination of nuclease and chemical footprinting of the HIV-1 tRNALys,3-viral RNA complex failed to detect an interaction involving the tRNA anticodon loop and this site (25). 
FIG. 4. Proposed models for loop-loop interactions between tRNALys,3 and the viral genome controlling initiation of HIV-1 replication. The upper portion of the figure illustrates the sequence of several HIV-1 genomes in the immediate vicinity of the PBS (Region A) with the potential for loop-loop interactions with the tRNA. Region B invokes an interaction of the tRNA TΨC loop and the U5-IR stem of the RNA genome, and was extrapolated from studies with avian RT (20-23). In contrast, Region C2 has been proposed to interact with bases of the tRNA anticodon loop (84). Finally, chemical and enzymatic probing studies (25, 85) have provided evidence that the A-rich U5-IR loop of the viral genome (Region C1) participates in intermolecular base pairing with the U-rich tRNA anticodon loop. Of these possibilities, sequence variations in the HIV-1 genomes suggest the U5-IR loop-tRNA anticodon loop interaction is the most likely. The lower portion of the figure provides a schematic illustration of the manner in which the tRNA primer is proposed to interact with regions A, B, C1, and C2 of the viral genome. Notations T + 1 and T + 5 refer to the 3' termini of tRNA-DNA chimeras used to probe these loop-loop interactions (see Fig. 5). See Fig. 3 for notations Ψ and S.

[Fig. 4 graphic not reproduced. Labels recoverable from the figure: HIV-1 isolates BRU, MAL, MN, NL43, ELI, HXB2, OYI, and SF2; regions A (primer binding site), B (TΨC loop), C1 and C2 (anticodon loop); sequence in the vicinity of the PBS, UCAGACCCUUUUAGUCAGUGUGGAAAAUCUCUAGCAGUGG.]

Since short DNA oligonucleotides were employed as templates in the former study (84), these may not have assumed the equivalent intermolecular structures observed with tRNALys,3 and a large PBS-containing RNA fragment of the HIV-1Mal isolate (25). However, since footprinting results represent a static complex between tRNALys,3 and HIV-1 RNA, an interaction between its anticodon loop and the A-rich sequence 3' to the PBS during primer placement on the PBS, or during the transition between the initiation and elongation phases of reverse transcription (91), could not be ruled out. Nevertheless, mutagenesis studies have suggested that the A-rich sequence 3' to the PBS is unnecessary for placement of tRNALys,3
onto the HIV-1 RNA by NC (X. Li, personal communication), as well as for initiation of (-) strand DNA synthesis from tRNALys,3 (26). A second A-rich sequence, highly conserved among several HIV-1 strains, is located immediately 5' to the PBS (sequence C1 in Fig. 4) and has been demonstrated by a combination of nuclease and chemical footprinting to interact with the anticodon loop of tRNALys,3 (85). Furthermore, this interaction was significantly stabilized by the presence of a thiolated uridine residue in the tRNA anticodon loop (2-thio-5-carboxymethyluridine methyl ester, designated S in Fig. 3). Unlike natural tRNALys,3, the anticodon loops of either tRNALys,3 dethiolated at this position or unmodified tRNALys,3 prepared by in vitro transcription failed to form an equally stable complex with the A-rich sequence of the U5-IR loop of HIV-1Mal RNA (85, 91). A slight shift in positioning of the A-rich sequence, due to a shorter stem separating the U5-IR loop and the PBS in HIV-1 strains other than the Mal isolate, may also influence the stability of this loop-loop interaction (E. J. Arts and S. F. J. Le Grice, unpublished data). The consequences of this anticodon loop-U5-IR loop interaction in HIV-1 reverse transcription and replication are not fully understood. As with sequence C2 (Fig. 4), proposed to interact with the anticodon loop (84), the presence of the U5-IR loop does not appear necessary for NC-mediated placement of tRNALys,3 onto HIV-1 RNA (X. Li, personal communication) and in vitro does not appear to be an absolute requirement for (-) strand DNA synthesis from PBS-bound tRNALys,3 (27). However, the stability of a mutation in which the HIV-1 PBS was replaced with a sequence complementary to the 3' end of tRNAHis is significantly enhanced by introducing a compensatory substitution into the viral A-rich U5-IR loop, namely a sequence complementary to the anticodon loop of tRNAHis. 
The resulting HIV-1 mutant replicates more efficiently and also exhibits a slower rate of reversion to the PBS specifying tRNALys,3 than a mutant in which the PBS alone is altered to complement the 3' end of tRNAHis (92). Interestingly, a deletion of the -A-A-A-A- sequence in the U5-IR loop does not significantly alter HIV-1 replication kinetics. However, the A-rich sequence in the U5-IR loop of this mutant virus appears to be reestablished after several passages in culture (X. Li and M. Wainberg, personal communication). It is not unreasonable to assume from current data that retroviruses have evolved to accommodate higher order structures between their tRNA replication primers and viral RNA genome. However, the exact nature of these complexes may vary drastically among different retroviruses, and also between different strains of the same retrovirus. For example, although all lentiviruses utilize tRNALys,3 as primer, neither EIAV nor FIV possesses an A-rich loop immediately 5' to the PBS in its RNA genome. Furthermore, HIV-2ROD has two A-rich sites in its U5-IR stem-loop, one of which is partially “masked” through intramolecular base pairing in the U5-IR stem and the other of which is found in a short loop, 26 nt 5' to the PBS (86). Therefore, it appears unlikely that the equivalent tRNALys,3 anticodon loop-U5-IR loop interaction occurs in lentiviruses other than HIV-1. Other retroviruses or retrotransposons may nevertheless employ alternative genome sequences to interact with and stabilize the tRNA-viral RNA initiation complex. For example, three sites in the immediate vicinity of the PBS of the Ty1 retrotransposon genome have been proposed to interact with the TΨC and D arms of the cognate replication primer tRNAiMet (87-89, 93). It is interesting to note that, in HIV-1 RNA, several sites with considerable homology to the PBS (90) can be identified, yet the PBS appears to be the major initiation site of reverse transcription in vivo. It is therefore possible that interactions outside, but immediately adjacent to, the PBS may participate either in specific placement of the primer or in directing the RT to initiate only from a replication primer annealed to the appropriate site.
C. Initiation and Synthesis of HIV-1 (-) Strand DNA from tRNALys,3

An earlier section dealt with the assembly and interactions involving the tRNALys,3-HIV-1 RNA complex during virion production. Formation of this complex, as well as the subsequent initiation of reverse transcription, may be dependent in vivo on viral and host factors such as NC (79, 84), RT (26, 80, 82, 83, 90), viral gag and gag-pol precursors (66), and possibly β-actin (94). However, only HIV-1 RT and a preformed tRNA-HIV-1 RNA complex appear to be required for initiation of (-) strand DNA synthesis in vitro (90). Several studies have demonstrated that DNA of heterogeneous length can be detected in retroviral particles (95-100). It has been suggested that initiation of (-) strand DNA synthesis may occur not following infection and virus entry into a host cell, but rather during virus assembly, maturation, and/or release of the virus particle (98, 99). However, the mere presence of viral DNA in virus particles does not prove its necessity in virus replication. In fact, this viral DNA was present in less than 0.1% of the virus population (97, 100). Furthermore, HIV-1 particles produced in the presence of azidothymidine (AZT) contained chain-terminated viral DNA, yet were as infectious for H9 cells as untreated, wild-type virus (97). From these results, it appears that HIV-1 reverse transcription may be initiated in the host cell from tRNALys,3 or possibly chimeras of tRNA and incomplete nascent (-) strand strong-stop DNA. It was shown that viral DNA in HIV-1 particles could contribute to virus replication in quiescent CD4+ cells (101). Again, the HIV-1 DNA in virus particles that acts as a primer for reverse transcription in these cells must be incomplete (-) strand strong-stop DNA products, since synthesis of HIV-1 DNA was arrested before the completion of
(-) strand strong-stop DNA in quiescent CD4+ lymphocytes or macrophages treated with chain-terminating nucleoside analogs (e.g., AZT) (60, 102). As discussed later, there is evidence to suggest that addition of a limited number of deoxynucleotides to the 3' end of tRNALys,3 in HIV-1 particles may be necessary for efficient initiation of (-) strand DNA synthesis upon entry into the host cell (60, 101). Studies with purified components have indicated that the mechanisms underlying initiation of (-) strand DNA synthesis from tRNALys,3 hybridized to HIV-1 genomic RNA are considerably more complex than originally envisioned. During initiation, there appears to be an intricate interplay between HIV-1 RT and the primer-template duplex requiring (i) specific binding of HIV-1 RT to PBS-bound tRNALys,3, (ii) disruption of an intermolecular loop-loop complex between the tRNA primer and viral RNA sequences 5' to the PBS, (iii) establishment of an initiation complex leading to the distributive addition of several deoxynucleotides, (iv) stabilization of the initiation complex to permit productive and processive elongation of (-) strand DNA, and, finally, (v) orientation of the first endonucleolytic cleavage by HIV-1 RT on the RNA template after clearance of the tRNA-PBS duplex. To study initiation of (-) strand DNA synthesis during HIV-1 reverse transcription effectively, it has been necessary to examine these events during productive HIV-1 infection in several cell types (97, 102-104) as well as in a reconstituted, in vitro reverse transcription assay (26, 73, 79-81, 90, 91, 103-105). In both cellular infections and in vitro reverse transcription systems, HIV-1 reverse transcription almost exclusively initiates from tRNALys,3 annealed to the PBS of the RNA genome. Most DNA-dependent RNA polymerases (e.g., RNA polymerases I, II, and III) recognize a promoter on the DNA template, together with a variety of DNA-binding proteins, then initiate de novo synthesis of RNA (106, 107). 
In contrast, RNA polymerases from RNA plant viruses initiate de novo synthesis on a tRNA-like structure at the 5' end of their RNA genome (108, 109). Finally, initiation of reverse transcription in hepadnaviruses occurs from an RNA stem-loop structure, but priming involves covalent linkage of the first nucleotide of the (-) strand DNA to a tyrosine residue of the hepadnaviral polymerase (110). These examples suggest that retroviruses, retrotransposons, and some retroelements may have evolved a unique mechanism for initiation of DNA synthesis. By some mechanism, HIV-1 RT discriminates its cognate primer annealed to the PBS from tRNA annealed to closely related viral RNA sequences, or from the 3' end of the RNA genome folding back upon itself (90, 103). However, these specific tRNALys,3-viral RNA interactions involving the PBS and adjacent regions do not occur spontaneously at ambient temperatures, suggesting they may be dependent upon additional factors sequestered into the budding virion (e.g., NC, RT, gag-pol and gag precursor proteins) (79-81). The most popular candidate is NC, which has been demonstrated to possess the capacity to unwind nucleic acid duplexes (111). More recently, it has been shown that NC promotes annealing of tRNALys,3 to the PBS of an HIV-1 RNA template. In addition, this accessory protein prevented stable interactions between short oligoribonucleotides and the 3' end of the RNA genome (X. Li, personal communication).

Synthesis of (-) strand DNA in HIV-1 occurs following the interaction of PBS-bound tRNALys,3 with RT. From earlier studies on HIV-1 RT, it appeared that initiation of (-) strand DNA synthesis was a nonspecific event; that is, the intact tRNA primer could be substituted by either an RNA or DNA oligonucleotide complementary to the PBS. When tRNALys,3 was substituted with these oligonucleotide primers, however, considerable differences were observed in (i) initiation of RNA-dependent DNA synthesis and RNase H activity, (ii) pausing by RT during (-) strand DNA synthesis, and (iii) the efficiency of the first template switch (26, 90, 91, 104). Utilization of a PBS-bound DNA primer by HIV-1 RT does not result in pausing 1 to 6 nt from the primer terminus, which contrasts sharply with significant levels of pausing and abortive synthesis in the immediate vicinity of the tRNALys,3 or oligoribonucleotide 3' terminus. Thus, HIV-1 RT may assume an altered conformation when binding to RNA and DNA primers hybridized to an RNA template, resulting in a different mechanism of initiation. Data from Isel et al. (91) suggest that HIV-1 RT may undergo a transition from an initiation to an elongation complex when (-) strand synthesis is initiated from an RNA primer on an RNA template. A difference in RT conformation when bound to an RNA template containing RNA and DNA primers may also explain decreased pausing during (-) strand DNA synthesis (26, 112) and an increase in template switching efficiency when the former is used.
D. Heterodimer-Associated p51 Mediates tRNA-Primed Events in HIV-1

A total of 31 nucleotides of tRNALys,3 have been proposed to be involved in duplex structures with HIV-1 RNA, leaving only the D loop, the TΨC loop, and part of the D stem single stranded (26). With such an extensive complex near the PBS and 3' terminus of tRNALys,3, it may be necessary for HIV-1 RT to disrupt certain tRNA-viral RNA structures for efficient initiation of (-) strand DNA synthesis. Original studies on reconstituted heterodimers of HIV-1 RT described a mutant that had reduced affinity for free tRNALys,3 but maintained unaltered levels of both RNA- and DNA-dependent DNA polymerase activity (113). This “selectively deleted” mutant (p66-p51Δ13) contained a 13-amino-acid deletion (Gln428-Phe440) at the C-terminus of its p51 subunit. To extend these initial findings, p66-p51Δ13 RT and additional mutants containing shorter p51 C-terminal deletions were employed in a series of HIV-1-specific, RNA-dependent DNA polymerase assays (2s). Since p66-p51Δ13 RT exhibited reduced affinity for free tRNALys,3, a logical experiment was to assess the effects of p51 C-terminal deletion on initiation of (-) strand DNA synthesis from both tRNALys,3 and oligonucleotide primers. Although p66-p51Δ13 RT supported (-) strand strong-stop synthesis from an 18-nt, PBS-bound oligoribonucleotide, it failed to do so when this was replaced with either synthetic or natural tRNALys,3, while mutants whose p51 subunit lacked 5 or 9 residues were active with both primers.

What features of the initiation complex allowed p66-p51Δ13 RT to initiate (-) strand synthesis from the 3'-OH of an 18-nt RNA primer, yet prevented this from the 3'-OH of tRNALys,3? As outlined previously, it is possible that HIV-1 RT is required to partially disrupt the tRNA-viral RNA complex in order to align its DNA polymerase domain over the primer terminus and initiate DNA synthesis. To test this theory, tRNALys,3 was extended with 1 or 5 deoxynucleotides at its 3' terminus to generate tRNA-DNA chimeras of 77 and 81 nt as primers for synthesis of (-) strand DNA (Figs. 4B and 5A). As with the 76-nt primer tRNALys,3, the p66-p51Δ13 RT failed to initiate (-) strand synthesis from the 77-nt tRNA-DNA chimeric primer (Fig. 5B). However, RNA-dependent DNA polymerase activity was completely restored to this mutant when the 81-nt tRNA-DNA chimera was employed as primer (Fig. 5C). Thus some constraint had been removed between the first and fifth nucleotide 5' to the PBS, permitting p66-p51Δ13 to initiate (-) strand DNA synthesis (Fig. 4B). Based on the tRNALys,3-viral RNA structural model derived by chemical and enzymatic footprinting (24), this short extension of tRNALys,3 at its 3' terminus would be predicted to unwind the U5-IR stem upstream of the PBS, thereby disrupting the tRNA anticodon loop-U5-IR loop interaction and enabling the p66-p51Δ13 RT to initiate (-) strand DNA synthesis.

FIG. 5. Use of tRNA-DNA chimeras to probe tRNA-viral RNA interactions controlling initiation of HIV-1 replication. In panel (A), the protocol for chimera preparation is presented. In step a, tRNALys,3 is hybridized to a PBS-containing DNA template and extended in the presence of a limited set of dNTPs. In the example shown, the dNTP combinations allow extension of the tRNA by one or five deoxynucleotides. tRNA-DNA chimeras are subsequently purified by high-voltage electrophoresis and rehybridized to a PBS-containing HIV-1 RNA genome. The ability of RT to support (-) strand strong-stop DNA synthesis in the presence of all four dNTPs is then evaluated in step b. The utility of this approach is illustrated in panels B and C. In both panels, the ability of wild-type RT and mutants containing deletions of the p51 subunit to support (-) strand synthesis from 77-nt and 81-nt tRNA-DNA chimeras was evaluated. RT mutants p66-p51Δ5 and p66-p51Δ9 are poorly active with the 77-nt chimera, while no (-) strand synthesis is supported by mutant p66-p51Δ13. In contrast, all enzymes support equivalent levels of (-) strand synthesis from an 81-nt tRNA-DNA chimera. The structural model of Fig. 4 strongly suggests that restoration of (-) strand synthesis results from disrupting an interaction between the tRNA anticodon loop and an A-rich U5-IR loop.

[Fig. 5 graphic not reproduced. Labels recoverable from the figure: 60-nt PBS-containing template; 497-nt PBS-containing viral RNA template; 77-nt and 81-nt tRNA-DNA chimeras.]

FIG. 6. Importance of tRNA-viral RNA interactions in initiation of HIV-1 replication and their inhibition of closely related retroviral enzymes. Panel A illustrates three variants of a template-primer duplex used to evaluate synthesis of (-) strand strong-stop DNA. In the first of these, a PBS-containing portion of the wild-type HIV-1HXB2 genome contains an 18-nt RNA primer hybridized to the PBS, defining a 192-nt product. Substitution of this oligonucleotide with the intact, 76-nt tRNALys,3 gives rise to a 250-nt product. Finally, an RNA template from which the -A-A-A-A- sequence from the U5-IR loop (see Fig. 4) is removed gives rise to a 246-nt

[Fig. 6A graphic not reproduced. Labels recoverable from the figure: HIV-1 PBS RNA template with R, U5, and PBS regions; 76-nt tRNA and 18-nt RNA primer ACCGCGGGCUUGUCCCUG (written 3' to 5'); (-) strand strong-stop DNA products of 192, 250, and 246 nt.]

If the tRNA anticodon loop-U5-IR loop interaction posed a barrier to RT mutant p66-p51Δ13, it is possible that even wild-type HIV-1 RT might be inhibited to a certain extent by such a complex. This does not seem likely, considering that similar levels of DNA synthesis were supported by HIV-1 RT from tRNALys,3 annealed to HIV-1 RNA as from tRNALys,3 annealed to a mutant HIV-1 RNA template lacking the A-rich U5-IR loop sequence (27; Fig. 6). Interestingly, both chemical and nuclease footprinting have suggested that HIV-1 RT interacts with the anticodon loop of tRNALys,3 (79-81, 114). In addition, a large segment of the p66 thumb and connection subdomains of HIV-1 RT could be cross-linked to the anticodon loop of tRNALys,3 (115). Therefore, it appears that HIV-1 RT, upon recognition and initial binding of PBS-bound tRNALys,3, must displace the tRNA anticodon loop from the U5-IR loop of the RNA genome as a prerequisite to initiating (-) strand DNA synthesis. However, additional interactions, involving other viral accessory proteins such as NC, may play a supporting role (81). Interestingly, a defect of p66-p51Δ13 RT in synthesis-independent RNase H activity can be corrected by the addition of HIV-1 NC (C. Cameron and S. F. J. Le Grice, unpublished observations). The structure of unliganded HIV-1 RT (47) suggests that the 13-residue deletion at the C-terminus of p51Δ13 (113), which influences RNase H activity and tRNALys,3 binding, might alter the geometry of its thumb subdomain, which in turn would affect the positioning of the p66 RNase H domain. Although confirmatory evidence is required, this raises the possibility that NC may interact with and stabilize the mutant heterodimer to prevent a deleterious conformational shift in the RNase H- and tRNA-binding domains. 
More precise analysis of the p51 C-terminus indicates that a heterodimer mutant whose p51 subunit lacks 12 residues showed no defect in initiation of (-) strand DNA synthesis from tRNALYSs3or its RNase H hydrolysis profile (K. J. Howard and S. F, J. Le Grice, unpublished data). Thus, although the extreme C-terminus of heterodimer associatep51 remains poorly defined in the available RT structures (43-48), it clearly plays an important role in subunit a d o r subdomuin geometry. (-) strand strong-stop product. (B-D) Efficiency of (-) strand strong-stopDNA synthesis on the template-primer variants of panel A by heterologous retroviral RTs. The retroviral enzymes in each panel are: Lane 1, p66-p51 HIV-1 RT; Lane 2, p66-p66 HN-1 RT; lane 3, p66-p51 HIV2 RT; lane 4, SIV RT; lane 5, FlV RT; lane 6, EIAV RT; lane 7, MLV RT; lane 8, AMV RT. While all enzymes support oligoribonucleotide-primedsynthesis (B), only the HN-1 heterodimer and AMV RT are active on a duplex of tRNALys.3-wild-typeviral RNA (C). However, disrupting a critical loop-loop interaction with the AA viral template allows each enzyme to productivelyinitiate from tRNALys.3(D).
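The product lengths quoted in the Fig. 6 legend can be checked with a little arithmetic. The sketch below is only an illustration and rests on one assumption that the legend does not state explicitly: that each quoted product length includes the primer itself, so that swapping primers changes the product by the difference in primer length. The function and constant names are hypothetical.

```python
# Lengths (nt) quoted in the Fig. 6 legend.
RNA_PRIMER_NT = 18           # PBS-bound oligoribonucleotide primer
TRNA_PRIMER_NT = 76          # intact tRNA(Lys,3)
RNA_PRIMED_PRODUCT_NT = 192  # product from the 18-nt primer
A_RICH_DELETION_NT = 4       # -A-A-A-A- removed from the U5-IR loop


def strong_stop_products():
    """Return (newly synthesized DNA, tRNA-primed product, deletion-template product).

    Assumption: quoted product lengths include the primer, so the DNA
    added by RT is the same whichever primer is used.
    """
    new_dna = RNA_PRIMED_PRODUCT_NT - RNA_PRIMER_NT
    trna_primed = TRNA_PRIMER_NT + new_dna
    trna_primed_deletion = trna_primed - A_RICH_DELETION_NT
    return new_dna, trna_primed, trna_primed_deletion


new_dna, wild_type, deletion = strong_stop_products()
print(new_dna, wild_type, deletion)
```

Under that assumption the calculation reproduces the 250-nt and 246-nt products given in the legend from the 192-nt oligoribonucleotide-primed product.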
E. Recognition of tRNALys,3-PBS Duplexes by Heterologous RTs
Although the tRNALys,3 isoacceptor is utilized as primer by all lentiviruses, including EIAV, FIV, HIV-1, HIV-2, and SIV, none of the other genomes possesses an A-rich loop comparable to that of HIV-1 in the immediate 5' vicinity of its PBS (27, 86). Considering these observations, could closely related lentiviral RTs employ tRNALys,3 as primer on HIV-1 RNA, and do these enzymes display a preference for the tRNALys,3 primer on their homologous RNA template? To investigate these possibilities, several retroviral RTs have been used in a series of RNA-dependent DNA polymerase assays (27). Although all enzymes efficiently extended an 18-nt RNA primer annealed to HIV-1 RNA (Fig. 6B), only HIV-1 p66-p51 RT and AMV RT were capable of efficiently initiating (-) strand DNA synthesis from tRNALys,3 annealed to HIV-1 RNA (Fig. 6C). However, deleting the A-rich sequence in the U5-IR loop of the HIV-1 RNA template restored the capacity for tRNALys,3-primed synthesis to all enzymes (Fig. 6D). Furthermore, all enzymes could efficiently use an 85-nt tRNA-DNA chimera (tRNALys,3 extended by 9 deoxynucleotides) as primer on the wild-type HIV-1 RNA template. In both situations, it appears that disrupting the tRNA anticodon loop-U5-IR loop complex removes a structural barrier inhibiting initiation of (-) strand DNA synthesis by heterologous RTs. The ability of AMV RT to disrupt this loop-loop complex and initiate (-) strand DNA synthesis from tRNALys,3 on HIV-1 RNA may be attributable to its proposed nucleic acid unwinding activity (116). Although the genomes of other lentiviruses such as FIV and EIAV lack an A-rich sequence in the immediate 5' vicinity of their PBS, it is possible that alternative intermolecular duplexes are adopted and require disruption for efficient initiation of (-) strand DNA synthesis.
To investigate this, (-) strand DNA synthesis on PBS-containing RNA fragments from FIV and EIAV has been evaluated (27). When tRNALys,3 was annealed to either of these genomes, heterologous RTs supported efficient synthesis of (-) strand DNA. Surprisingly, both FIV and EIAV RT showed little preference for tRNALys,3 on their homologous RNA templates. Although the HIV-1 enzyme was no more efficient than heterologous RTs when (-) strand DNA synthesis was initiated from an 18-nt RNA primer (Fig. 6B), this enzyme supported significantly more tRNALys,3-primed (-) strand DNA synthesis on all homologous and heterologous templates, which may be attributable to the increased affinity of HIV-1 RT for free tRNALys,3 (27, 79-81). Since no barrier to initiation of (-) strand DNA synthesis appears to exist with tRNALys,3 hybridized to the FIV or EIAV genomes, disruption of extensive tRNA-viral RNA interactions such as those determined for HIV-1 may be unnecessary.
However, this does not rule out the possibility that interactions between the FIV and EIAV RNA genomes and tRNALys,3 may have a lower energy barrier and be more easily disrupted (28, 84). Two significant results have arisen from these studies of heterologous RTs. First, the enzymatically active p66-p66 homodimer of HIV-1 supports oligoribonucleotide-primed (-) strand synthesis, yet fails to do so when this primer is substituted by tRNALys,3. This is in keeping with previous data indicating that the homodimer of HIV-1 RT has reduced affinity for its replication primer (82). The presence of a second RNase H domain thus appears to impose a steric clash that prevents tRNA from being properly positioned. A second surprising observation was the inability of the p66-p51 HIV-2 enzyme to initiate tRNA-primed (-) strand synthesis from the PBS of the HIV-1 RNA genome, while efficiently supporting the equivalent event from PBS-bound oligoribo- and oligodeoxyribonucleotide primers. However, an observation (37, 38) indicates that the smaller HIV-2 RT subunit arises through PR maturation of p66 at Met484 and not Phe440, as originally postulated (18, 117). Since previous studies (26, 113) have indicated that both the DNA polymerase and RNase H functions of HIV-1 RT can be altered by modest alterations to the p51 C-terminus, the 440-residue p51 subunit of the original recombinant HIV-2 enzyme (18) may induce a conformation on the heterodimer that is incapable of disrupting the tRNA-viral RNA interaction. Alternatively, since the -A-A-A-A- sequence upstream of the HIV-2 PBS is partially base paired (86), an extensive interaction with the tRNALys,3 anticodon loop may not be possible. Data such as these underscore the need for further comparative studies with related retroviral enzymes in order to provide a comprehensive picture of the initiation events.
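The chimeric primers used in these experiments differ only in the number of deoxynucleotides appended to the 76-nt tRNALys,3. As a minimal sketch (the helper names are hypothetical, and the outcome table simply restates what the text reports for p66-p51Δ13 RT), the primer lengths and the shortest primer that restored initiation can be tabulated as follows:

```python
TRNA_LYS3_NT = 76  # length of the natural primer, as stated in the text


def chimera_length(dna_extension_nt: int) -> int:
    """Length of a tRNA-DNA chimera carrying the given 3' DNA extension."""
    if dna_extension_nt < 0:
        raise ValueError("extension must be non-negative")
    return TRNA_LYS3_NT + dna_extension_nt


# Extensions of 1, 5, and 9 deoxynucleotides are the ones described in the text.
CHIMERAS = {ext: chimera_length(ext) for ext in (1, 5, 9)}

# Reported ability of p66-p51(delta)13 RT to initiate (-) strand synthesis,
# keyed by primer length in nt (restated from the text, not new data).
D13_INITIATES = {76: False, 77: False, 81: True}


def shortest_active_primer(outcomes):
    """Return the shortest primer length that supported initiation, if any."""
    active = [length for length, ok in outcomes.items() if ok]
    return min(active) if active else None


print(CHIMERAS)
print(shortest_active_primer(D13_INITIATES))
```

This only organizes the reported observations; it does not model why the 81-nt chimera relieves the block.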
III. Interaction of RT with the Template-Primer Duplex
A. Primer and Template-Grip Motifs
In order to consider implementing a program of rational design of anti-HIV drugs, a high-resolution structure of the DNA polymerase and RNase H catalytic centers of HIV-1 and HIV-2 RT complexed with the appropriate nucleic acid substrates should be available. Early attempts to define the subunits and subdomains of HIV-1 RT involved in template-primer binding exploited ultraviolet (UV) cross-linking of short homopolymers, such as polydeoxythymidine (p[dT]), and analysis of the cross-linked products. Preliminary findings (118) indicated that cross-linking occurred almost exclusively to the p66 subunit of the p66-p51 heterodimer, which would be consistent with the notion that heterodimer-associated p51 does not contribute significantly to substrate binding and catalysis (40, 49). Using a combination of UV cross-linking and amino acid sequencing of peptides containing the radiolabeled primer, it was proposed (119) that a region encompassing amino acids 195-300 was in close association with the primer. Subsequent to this, a similar approach was used to more accurately define a short polypeptide spanning p66 residues 288-307 as important in primer binding (120). Finally, Sheng and Dennis (121) made use of a photoaffinity probe that could be added to the 3' terminus of the extended primer (thereby locating the photoaffinity probe at the DNA polymerase catalytic center) and showed that this cross-linked to a peptide corresponding to residues 314-423 of the p66 subunit. When these observations are considered in light of the three-dimensional structure of HIV-1 RT (43, 44, 47; Fig. 2A), p66 residues 195-243, which have been shown to constitute a portion of the palm subdomain, and residues 244-300, which span almost its entire thumb subdomain, have been proposed to mediate template-primer binding and its translocation, respectively (Fig. 2B). The region within the p66 thumb defined by Basu et al. (120) would correspond to α-helix J of the p66 thumb. Although the peptide identified by Sheng and Dennis (121) corresponds largely to the p66 connection subdomain, the amino-terminus of this peptide is located at the base of the p66 thumb. Crystallographic studies of a complex containing p66-p51 HIV-1 RT and a short (19-mer-18-mer) duplex DNA (44) indicate that template-primer occupies a large cleft formed by the fingers, palm, and thumb of the p66 subunit. In addition, the p66 and p51 connection subdomains, together with the p51 thumb, form the floor of the nucleic acid-binding cleft.
The β12-β13 hairpin of the p66 subunit (corresponding to Phe227-His235 of the palm subdomain) has been designated the primer grip, which functions to maintain the primer terminus in an orientation appropriate for nucleophilic attack on an incoming deoxynucleoside triphosphate (dNTP) (Fig. 7). The template grip has been proposed to comprise portions of the fingers and palm subdomains of p66. While β-strand 3 (Tyr56-Ile63) and α-helix B (Arg78-Arg83) of the fingers are possibly involved in contacting the single-stranded template overhang, the β8-αE connecting loop (Val148-Lys154) and β-strand 5a (Gly86-Val90) of the palm are proposed to contact the primer-template duplex. Unfortunately, the nucleic acid used in preliminary crystallography studies (43) contained insufficient template nucleotides to accurately determine the full extent to which the single-stranded overhang would be accommodated by RT. From their model of enzyme containing an extended substrate, Wohrl and co-workers (122, 123) have suggested that this should be achieved through an interaction with β-strand 4 (Lys73-Phe77) and the β3-β4 connecting loop (Lys64-Arg72), which is in keeping with earlier proposals from the RT-Nevirapine co-crystal of Kohlstaedt et al. (43). More recently, Hermann et al. (124) have proposed from modeling studies and a comparison with other nucleic acid-polymerizing enzymes that the αH-αI hairpin (Gln258-Thr286) of the p66 and p51 subunits may act as "clamps" to stabilize duplex nucleic acid in the binding cleft. A later section deals with the consequences of introducing mutations into each of these structurally important motifs.

[Fig. 7 diagram labels: hydroxyl radical footprinting; DNase I footprinting.]
FIG. 7. Structural elements of heterodimer-associated p66 HIV-1 RT involved in accommodation of the template-primer duplex at the catalytic site. The primer grip comprises the β12-β13 hairpin (Phe227-His235), while the template grip involves the β8-αE connecting loop (Val148-Lys154) and β-strand 5a (Gly86-Val90) of the palm (in addition to elements of the p66 fingers subdomain). In addition to these, the αH-αI hairpin (Gln258-Thr286) has been designated the "helix clamp," serving to stabilize the nucleic acid duplex in the substrate-binding cleft (124). (Adapted from A. Jacobo-Molina, J. Ding, R. G. Nanni, et al., Proc. Natl. Acad. Sci. U.S.A. 90, 6230 (1993), with permission.)
B. Chemical and Enzymatic Footprinting Studies
The importance of applying complementary biochemical and structural approaches to a study of the interaction of HIV-1 RT with template-primer duplexes was well demonstrated through analysis of DNA-directed replication complexes by hydroxyl radical footprinting (i.e., resistance of nucleoprotein complexes to Fe-EDTA-generated hydroxyl radicals) (125). Independent of the position on the template at which DNA synthesis was arrested (via incorporation of a chain-terminating dideoxynucleoside triphosphate), primer nucleotides -1 to -15 and template nucleotides +3 to -15 were protected from hydroxyl radical-mediated cleavage by both wild-type and RNase H-deficient p66-p51 RT (Fig. 8A). In this numbering, the last primer nucleotide incorporated into the nascent DNA chain and its template equivalent are designated position -1, while the first nucleotide of the single-stranded template overhang is designated +1. However, within this footprint, 4 nucleotides of the template (nucleotides -8 to -11) and primer (nucleotides -7 to -10) remained sensitive to chemical attack, providing a "window of accessibility." When comparing a model of A- or B-form DNA in HIV-1 RT with the pattern of hydroxyl radical protection, it was suggested that a nucleic acid duplex would most likely adopt an A-like conformation in the binding cleft. At the same time as this proposal, the publication of a high-resolution crystal structure for HIV-1 RT containing a 19-mer-18-mer duplex DNA (43) demonstrated that nucleic acid assumed an A-like configuration within the DNA polymerase catalytic center, but a more B-like configuration within the RNase H catalytic center. In the RT-DNA co-crystal (44), it was also noted that the nucleic acid duplex was modestly bent (~45°) between the two catalytic centers, which correlates well with the windows of accessibility on the template and primer defined by chemical footprinting (125).
Finally, both approaches also indicated that the catalytic centers of HIV-1 RT were separated by a distance of ~18 bp. In the RT-DNA co-crystal (44), duplex DNA extended as far as the RNase H catalytic center, but not through the entire C-terminal RNase H domain, while the single-base template extension was insufficient to determine how the single-stranded template overhang would be accommodated by the p66 fingers subdomain. This raised the possibility that larger portions of the template overhang and template-primer duplex might be encompassed by the replicating enzyme. In support of this, enzymatic footprinting of DNA-directed replication complexes (123) revealed protection of template nucleotides between positions +7 and -23, and primer nucleotides between positions -1 and -24/-25 (Fig. 8B). Modeling studies suggested that extended interactions with the template overhang were conferred by the β3-β4 hairpin of the p66 fingers subdomain, while the αH-αI hairpin of the p51 thumb and αE' of the p66 RNase H domain were proposed to provide a "floor and wall," respectively, for duplex DNA immediately adjacent to the C-terminal RNase H domain. Thus, the "umbrella" of replicating HIV-1 RT was proposed to embrace ~30 template nucleotides (+7 to -23), within which its catalytic centers, or "heart," are tightly associated with template nucleotides +3 to -15 (125).

FIG. 8. (A) Chemical probing of DNA-directed replication complexes containing wild-type (lane 1) and RNase H-deficient p66-p51 (lane 2) HIV-1 RT. Footprinting was performed following addition of 4 nt to the DNA primer (i.e., +4 replication complexes). Both enzymes protect the DNA template between nucleotides +3 and -15 from hydroxyl radical cleavage, within which a 4-nt "window" remains accessible (filled bar). Lane 3 illustrates the cleavage pattern of template-primer alone. (B) Enzymatic footprinting of the same complexes indicates protection of the template-primer duplex as far as position -23, and the single-stranded template overhang as far as nucleotide +6 (the latter was defined by S1 footprinting (123)). For both wild-type and RNase H-deficient RT, lanes 1 represent the hydrolysis profile of nucleoprotein complex, while lanes 2 represent the profiles of the template-primer duplex alone. Note that the window of chemical accessibility in panel A lies between nucleotides -8 and -11, while template nucleotides -20 and -21 (indicated by asterisk) are accessible to the enzymatic probe.

Although a steric clash between the enzymatic probing agent and the enzyme being probed could have accounted for the extended DNase I footprint, parallel studies lend support to the enzymatic footprinting data. Using model template-primer combinations (126), it has been demonstrated that the sensitivity of HIV-1 RT to dideoxynucleoside triphosphate inhibition is not imparted until the template overhang extends 3-6 nucleotides beyond the DNA polymerase active site (i.e., enzyme is resistant to ddNTP-mediated inhibition when the template extension is 3 nt or less). Since several RT mutations conferring nucleoside resistance have been located within the β3-β4 hairpin of its p66 fingers subdomain, these findings indirectly illustrate the extent to which this subdomain interacts with the template overhang to control selection of an incoming dNTP. The ability to rapidly evaluate DNA-directed replication complexes by either chemical or enzymatic footprinting raised the possibility of determining the contribution of the C-terminal RNase H domain to template-primer binding through a comparative analysis of the RNase H-containing (p66-p51 and p66-p66) and RNase H-free (p51-p51) forms of HIV-1 RT. Unfortunately, while qualitatively similar footprints were derived from the p66-p51 heterodimer and p66-p66 homodimer, rapid dissociation of the p51 homodimer from the template-primer duplex has precluded this possibility with the HIV-1 enzyme. Although the closely related p51 subunit of EIAV RT was considerably more active than its HIV counterpart (42), this too failed to generate a stable footprint (B. M. Wohrl and S. F. J. Le Grice, unpublished observations). However, the contribution of the RNase H domain was revealed in a study with the intact (p75) and RNase H-free (p55) forms of recombinant murine leukemia virus RT (41). Section IV deals with the consequences of removing the MLV RNase H domain on template-primer occupancy (127) and the importance of these studies when addressing a controversial proposal on the orientation of retroviral RT on the template-primer duplex (128).
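Because the footprint coordinates above use a numbering scheme with no position 0 (the template runs directly from -1 to +1), the number of nucleotides in a protected span is not simply the difference of its endpoints. The short sketch below, with a hypothetical helper name, shows one way to count them and applies it to the spans quoted in the text.

```python
def span_length(start: int, end: int) -> int:
    """Count nucleotides from start to end inclusive when position 0 does not exist."""
    if start == 0 or end == 0:
        raise ValueError("position 0 is not defined in this numbering scheme")
    hi, lo = max(start, end), min(start, end)
    n = hi - lo + 1
    if hi > 0 and lo < 0:
        n -= 1  # the span crosses the -1/+1 boundary, where there is no position 0
    return n


# Spans quoted in the text (template strand unless noted).
hydroxyl_template = span_length(+3, -15)   # hydroxyl radical footprint
hydroxyl_primer = span_length(-1, -15)     # primer strand
window_template = span_length(-8, -11)     # "window of accessibility"
dnase_template = span_length(+7, -23)      # enzymatic footprint

print(hydroxyl_template, hydroxyl_primer, window_template, dnase_template)
```

Counted this way, the +3 to -15 span contains 18 nucleotides and the +7 to -23 span contains 30, in line with the ~18-bp separation of the catalytic centers and the ~30 template nucleotides said to be embraced by the replicating enzyme.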
C. Mutagenesis of Structural Elements Involved in Template-Primer Binding
Prior to elucidation of the three-dimensional structure of HIV-1 RT (43), in vitro mutagenesis studies dealt primarily with the consequence of altering critical residues of either the N-terminal DNA polymerase (Asp110, Tyr183, Met184, Asp185, and Asp186) (4, 6, 40, 129) or C-terminal RNase H (Asp443, Glu478, Asp498, His539, and Asn545) domains (130-135). The subsequent availability of a detailed model that outlined subdomain geometry within this two-subunit enzyme (Fig. 2A) has made it possible to evaluate the contribution of several structural motifs within its p66 fingers and palm subdomains to binding and positioning of the template-primer duplex. As outlined in an earlier section, the primer grip of HIV-1 RT is defined by the β12-β13 hairpin of the p66 palm subdomain (Phe227-His235; Fig. 7). Using alanine scanning mutagenesis (the advantage of which in preliminary screening experiments is minimization of unfavorable steric contacts and avoidance of imposing new charge interactions or hydrogen bonds), the consequences of altering the primer grip were evaluated in vitro with purified recombinant enzyme, as well as in vivo via replication of recombinant virus (136). Alterations to DNA polymerase and RNase H activity were most apparent following substitution of residues comprising the -Trp229-Met230-Gly231-Tyr232- quartet, which constitutes a tight turn (Met230 and Gly231) and immediately adjacent residues of β-strands 12 (Trp229) and 13 (Tyr232). Surprisingly, altering Gly231 in the β12-β13 connecting loop had only a modest influence on enzyme activity, despite a high degree of conservation at this position in the primer grip motif of several retroviral enzymes (137, 138). In support of the in vitro observations, recombinant virus containing the Gly231Ala mutation remained viable, while the infectivity of mutants Trp229Ala and Met230Ala was lost and that of mutant Tyr232Ala was reduced 100-fold (139). The observation that bulky aromatic residues with π-electron-rich side chains flanked the β12-β13 connecting loop raised the possibility that they might be involved in π-π stacking interactions with terminal bases of the template-primer duplex as an additional means of stabilizing the primer 3' hydroxyl. However, two observations suggest this scenario is unlikely. First, the resolution of the RT-DNA co-crystal (43) is sufficient to demonstrate that the side chains of W229 and Y232 point inward, serving to stabilize the primer grip via hydrophobic interactions. Furthermore, when the potential for π-π stacking is removed through substitution of Ala at positions 229 and 232 of the p66 subunit, the RNA- and DNA-dependent DNA polymerase activities of this double mutant (p66W229A,Y232A-p51 RT) are only marginally affected (M. Ghosh and S. F. J. Le Grice, unpublished observations).
In addition to the consequences of altering residues of the primer grip on DNA polymerase function, an unexpected observation was the inability of the RT mutant p66L234A to associate into heterodimer with either wild-type p51 (139) or a p51 derivative carrying the equivalent mutation (K. J. Howard and S. F. J. Le Grice, unpublished observations). The trivial possibility that impaired dimerization reflected an in vitro artifact of the recombinant enzyme could be ruled out by the observation that in vivo viral infectivity was also impaired (139). The rationale for impaired dimerization is not immediately clear, although enhanced susceptibility to dissociation raises the possibility of targeting peptides to this region and inactivating RT function through loss of dimer-associated activities (140, 141). One clue to this may lie in the observation that Leu234 constitutes one of a small number of residues in the inhibitor-binding pocket of p66 that has not undergone mutation in Nevirapine-resistant HIV-1 isolates (142), suggestive of an important architectural contribution. Other primer grip residues falling into this category are Trp229 and Phe227, the former of which is clearly sensitive to substitution (136). With respect to Phe227, we have demonstrated that its substitution with Ala (in addition to alanine substitution at Pro226) has severe consequences for the manner in which recombinant enzyme selects the PPT primer for initiation of (+) strand synthesis (M. Powell, M. Ghosh, S. F. J. Le Grice, and J. G. Levin, unpublished observations).
A second motif that has received attention is α-helix H of the p66 thumb subdomain (Asn255-Ser268) (143-145). This highly conserved motif of nucleic acid-polymerizing enzymes has been proposed to contact the sugar-phosphate backbone of the primer strand (Fig. 7), and together with α-helix I (Gln278-Thr286) may provide a tracking or "clamping" mechanism for the template-primer duplex as it approaches the DNA polymerase catalytic center. Within α-helix H the most notable consequences of alanine substitution occurred at residues Gly262 and Trp266. While enzymes carrying these mutations were not significantly altered with respect to the rate of catalysis or dNTP binding, the Km for the synthetic homopolymer poly(rA)/oligo(dT)20 increased 5-fold (Gly262Ala) and 15-fold (Trp266Ala). At the same time, the fidelity of these mutants was significantly decreased relative to the wild-type enzyme. Crystallographic data (43) suggest that Gly262 and Trp266 occupy a side of α-helix H facing the minor groove of the template-primer duplex, from which it has been proposed (143, 144) that the side chain of Trp266 contacts the sugar-phosphate backbone at the third position of the primer, while the α-carbon of Gly262 is in the immediate vicinity of the fourth sugar of this strand.
In the course of these studies, the inactivity of two mutants was also attributed to reduced dimer content, providing a second example of how relatively modest alterations (Val261Ala and a Leu-to-Ala substitution) can alter subunit geometry and perturb the dimer interface. In contrast to the prediction that α-helix I was an integral feature of the helix clamp (124), alteration of this motif appears to have minimal consequences for RT function (145). An equivalent systematic analysis remains to be conducted with the fingers subdomain of p66. However, localization of mutations conferring resistance to nucleoside-based inhibitors (AZT, ddC, ddI) (14) within this subdomain of HIV-1 and HIV-2 RT has conveniently provided an extensive study of the β3-β4 hairpin (Tyr56-Arg83) and surrounding regions. In the original RT-DNA co-crystal (43), the single-base template overhang precluded an analysis of the manner in which the extended template might be accommodated. Through a combination of (i) molecular modeling (122), (ii) the response of RT to ddNTP inhibition as a function of template length (126), and (iii) enzymatic footprinting (123), the p66 β3-β4 hairpin has been proposed to interact with the extended template as far as position +6. Interestingly, the site of dNTP binding is some 20 Å removed from the position around which most nucleoside resistance mutations are clustered, examples of which include resistance to AZT (Asp67Asn, Lys70Arg; 146), ddC (Lys65Arg; 147) (Thr69Asp; 148), and ddI (Lys65Arg; 147) (Leu74Val; 149). This finding has led to the suggestion that altered contacts to the extended template at the fingers subdomain have a long-range influence on the geometry at the active center, the consequence of which is altered substrate selection. While many of these drug-resistance mutations arose through culturing virus in vivo, an elegant approach of Kim and co-workers (150-152) achieved the same result via saturation mutagenesis of the β3-β4 hairpin and the ability of recombinant RT mutants to complement a DNA polymerase temperature-sensitive mutant of Escherichia coli, providing a valuable microbial screening system in which replication at the nonpermissive temperature is placed under control of the retroviral DNA polymerase.
Finally, in the context of template-primer interactions, a unique observation has been documented regarding a mutation in p66 HIV-1 RT that has the consequence of increasing fidelity of DNA synthesis. Administration of the nucleoside analog 2',3'-dideoxy-3'-thiacytidine (3TC) in vivo leads to rapid emergence of drug-resistant virus harboring the mutation Met184Val (153). Met184 resides within the highly conserved -Tyr-Met-Asp-Asp- motif of β-strand 10 of the p66 palm subdomain and constitutes a critical component of the DNA polymerase active site (43, 44). When 3TC-resistant virus is subsequently challenged with a second antiviral drug such as AZT, resistance to this second analog emerges only after a considerable delay. One explanation offered for these findings (153) was that introduction of Val at position 184 of HIV-1 RT increased the fidelity of DNA synthesis, with the consequence that replication errors resulting in AZT resistance were significantly suppressed.
Although the hypothesis remains controversial, support has been provided through a detailed analysis of recombinant HIV-1 RT carrying the mutation Met184Val, where a substantial increase in mismatch selectivity could be demonstrated (154). One hypothesis forwarded for increased fidelity of the Met184Val HIV-1 mutant is that the isopropyl side chain of Val184 could be more favorably positioned to interact with the base or deoxyribose moiety of the terminal primer nucleotide, thereby discriminating more effectively between correct and incorrect bases. While providing only indirect evidence, observations that (i) wild-type murine leukemia virus RT contains Val at the equivalent position and has been documented to be considerably less error prone than the HIV-1 enzyme (155, 156), and (ii) replacing Met184 of HIV-1 RT with Ala (whose side chain is considerably shorter) results in decreased fidelity of DNA synthesis, support a role of Met184 in selection of the appropriate incoming dNTP.
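The substitutions discussed in this section follow the usual three-letter convention (wild-type residue, position, replacement). As a small illustration, the sketch below parses that notation and collects the positions associated with each drug; the table simply restates the examples named in the text, and the function names and the regular expression are assumptions made for this example rather than part of any published tool.

```python
import re

# Three-letter residue, position, three-letter residue (e.g. "Met184Val").
MUTATION = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")

# Resistance-associated substitutions as named in the text.
RESISTANCE = {
    "AZT": ["Asp67Asn", "Lys70Arg"],
    "ddC": ["Lys65Arg", "Thr69Asp"],
    "ddI": ["Lys65Arg", "Leu74Val"],
    "3TC": ["Met184Val"],
}


def parse_mutation(name: str):
    """Split a substitution name into (wild type, position, replacement)."""
    match = MUTATION.match(name)
    if match is None:
        raise ValueError(f"unrecognized mutation name: {name!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def positions_by_drug(table):
    """Map each drug to the sorted residue positions of its listed substitutions."""
    return {
        drug: sorted(parse_mutation(name)[1] for name in names)
        for drug, names in table.items()
    }


print(positions_by_drug(RESISTANCE))
```

Laid out this way, the AZT, ddC, and ddI examples fall between residues 65 and 74 of the fingers subdomain, whereas the 3TC-associated change at position 184 lies in the palm.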
IV. The RNase H Domain and Hydrolysis of RNA-DNA Hybrids
A. Structure of the RNase H Domain

The RNase H domain of heterodimer HIV-1 RT comprises approximately the 135 C-terminal residues of its p66 subunit. Despite sharing only 24% amino acid homology with its E. coli counterpart, the three-dimensional structure of the isolated HIV-1 domain is remarkably similar. In both polypeptides (157-159), a five-stranded mixed β-sheet is surrounded by asymmetrically distributed α-helices. However, a significant difference between the two can be found in the connection between α-helices B and D. Whereas this is simply a five-residue loop in the RNase H domain of HIV-1 RT, the bacterial enzyme contains an additional α-helix (α-helix C) and charge cluster region, which have been proposed to mediate nucleic acid binding (160). Thus, since the DNA polymerase domain of HIV-1 RT provides the majority of the nucleic acid-binding site, it appears that the retroviral enzyme has evolved to accommodate the nucleic acid duplex without a requirement for the "basic protrusion" (159) present in E. coli RNase H. By extrapolation, the catalytically important residues of HIV-1 RNase H include the metal-binding carboxylates of Asp443, Glu478, and Asp498 (Asp10, Glu48, and Asp70 in E. coli RNase H); although Asp549 is conserved among bacterial and retroviral RNases H (159), a direct role in catalysis remains to be established. The same holds true for the conserved His539, whose substitution with Phe results in reduction rather than elimination of RNase H activity with both the bacterial and retroviral enzymes (130, 133, 161). However, a molecular clone of HIV-1 containing the equivalent mutation has been demonstrated to be noninfectious (133). A later section provides evidence that this most likely arises through altered modes of RNase H-mediated hydrolysis and disruption of the first strand transfer event.

Despite structural homology with E. coli RNase H, it has been difficult to demonstrate activity with purified polypeptides derived from the RNase H domain of the HIV-1 enzyme (162,163), although fluorescence studies indicated that the isolated domain was capable of binding a model RNA-DNA hybrid (164). Although reports have been documented (165) suggesting considerable levels of activity in purified HIV-1 RNase H, the amount of recombinant polypeptide required to detect activity meant that low-level contamination by the highly active bacterial counterpart could not be ruled out. Loss of hydrolytic function with the purified RNase H domain is most likely explained by the observation that the p51-p66 RT junction recognized by HIV-1 protease during heterodimer maturation (Phe440-Tyr441) is embedded within β-strand 1′, with the consequence that the 120-residue C-terminal polypeptide is not
RETROVIRAL RT AND TEMPLATE-PRIMER DUPLEXES
correctly folded (159). Alternatively, since a cluster of tryptophan residues has been implicated in binding the DNA-RNA hybrid in E. coli RNase H, the absence of this element from the purified HIV-1 domain could also account for its inactivity. Interestingly, the RNase H domain of HIV-1 RT is preceded by an analogous array of tryptophan residues, lying between amino acids 398 and 426, and significant levels of RNase H activity can be recovered in a recombinant 20-kDa version of HIV-1 RNase H containing this ~30-residue N-terminal extension (166). Similar observations have been made in our laboratory, and substantiated by the observation that "p20" HIV-1 RNase H harboring an inactivating mutation at the highly conserved Glu478 continued to bind a model RNA-DNA hybrid, but was catalytically inert (N. M. Cirino and S. F. J. Le Grice, unpublished observations).

What is the contribution of the RNase H domain of retroviral RT to the binding of nucleic acid substrates? Although a high-resolution structure of the RT-DNA co-crystal (43) accurately defined the spatial separation of the DNA polymerase and RNase H catalytic centers, it contained insufficient duplex DNA to determine the total amount of nucleic acid accommodated within the entire RNase H domain.
This discrepancy was partially addressed by enzymatic footprinting, which indicated that as much as 24 bp of the template-primer duplex was in contact with the replicating enzyme, a notion that was also verified by modeling studies with extended template-primer duplexes (123). Elements of the RNase H domain proposed to aid in accommodating an extended substrate include α-helix E′ and the β5′-αE′ connecting loop, which, in addition to α-helices H and I of the p51 thumb subdomain, constitute a floor and wall to correctly position the hybrid for hydrolysis (122). In support of this proposal, heterodimer HIV-1 RT whose p66 subunit contains a short deletion extending into α-helix E′ (p66Δ16-p51, lacking residues Gly543-Leu560) yields a shortened DNase I footprint of the template-primer duplex in the RNase H domain (167). The observation that a 4-bp "bend" in duplex DNA is evident immediately adjacent to the DNA polymerase catalytic center (43) predicts that ~12-14 bp of the template-primer duplex might be encompassed by the entire RNase H domain. Although a simple means of testing this would be to determine the nucleic acid-binding profile of RNase H-free enzyme (i.e., the p51 subunit), this has proven difficult with this version of HIV-1 RT and its counterpart from EIAV. The inability to generate a footprint with p51 HIV-1 RT most likely reflects the tendency of the purified polypeptide to dissociate into an inactive monomer (168). However, our analysis of MLV RT lacking its RNase H domain (127) has indicated that its elimination results in diminished protection of the template-primer duplex by ~11-12 bp from DNase I cleavage; that is, while the parental enzyme accommodated the nucleic acid duplex between positions -1 and -26/-27, a derivative devoid of the RNase H domain afforded protection as far as position -15. At the same time, RNase H-deleted MLV RT continued to protect the single-stranded template overhang as far as position +7 (Fig. 9).

While enzymatic footprinting studies provide only a low-resolution picture of complexes of HIV-1 RT and template-primer, the consequences of minor truncations (167) or complete removal of the RNase H domain (127) have provided important information with respect to the orientation of retroviral RT relative to the nucleic acid duplex. Based on the crystal structure of rat DNA polymerase β, an alternative proposal was forwarded that HIV-1 RT (and presumably closely related retroviral enzymes) binds in a manner opposite to that predicted (43) and later determined from X-ray crystallography (44). In the model extrapolated from the structure of rat DNA polymerase β (128), the RNase H domain of HIV-1 RT would contact the single-stranded template overhang ahead of the template-primer duplex, suggesting that removal of the RNase H domain would have little influence on how RT interacts with the template-primer duplex. The experimental observation that partial (167) or complete (127) removal of the RNase H domain reduces the DNase I footprint of the template-primer duplex downstream of the primer terminus, while contacts to the template overhang remain unaffected, contradicts this proposal and supports the original postulates (43,44).

RNase H activity is classically defined as hydrolysis of the RNA component of an RNA-DNA hybrid. However, both the HIV-1 and MLV enzymes have demonstrated a capacity to hydrolyze duplex RNA (169,170). This activity was originally designated RNase D (169), but was subsequently designated RNase H* (171) in order to avoid confusion with the enzyme involved in tRNA maturation (172) and to indicate that, as with several restriction endonucleases, it is invoked by altering the divalent cation requirement (with the HIV-1 enzyme, RNase H activity is strongly Mg2+ dependent, while RNase H* activity is favored in the presence of Mn2+). The ability to hydrolyze hybrid and duplex RNA arises from the same catalytic center of the HIV-1 enzyme, as evidenced by the observation that alteration of Glu478, a residue critical for metal ion coordination (130,131), simultaneously eliminates both activities in the presence of Mg2+ (169). In contrast, it has been possible to generate mutants of MLV RT whose RNase H and RNase H* activities are independent (170). However, a role for RNase H* activity during retroviral replication is not immediately clear. The observation that RNase H* activity is invoked under conditions when RT is artificially arrested in vitro during RNA-dependent DNA synthesis (e.g., limited synthesis from PBS-bound tRNALys,3 in the presence of chain-terminating ddNTPs), which localizes the polymerizing enzyme over duplex RNA (109), suggests this unusual property may not be of biological significance.

[Figure labels: WT MLV RT; ΔRH MLV RT; 71-nt template; primer; 12-bp reduction in DNase I footprint.]

FIG. 9. Consequences of eliminating the RNase H domain of MLV RT on the interaction with template-primer. Enzymatic footprinting (a DNase I footprint is illustrated here, while S1 footprinting was used to determine interactions with the single-stranded template) indicates that wild-type MLV RT protects the template-primer duplex to position -27 and the single-stranded template overhang as far as position +6 (an explanation of S1 footprinting is given in (123)). Elimination of the RNase H domain results in protection of the template-primer duplex only as far as position -15, while protection of the single-stranded template remains unchanged. The lower portion of the figure summarizes the interaction of wild-type and RNase H-deleted MLV RT with the template and primer strands.
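The MLV RT footprint boundaries quoted above (Fig. 9) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes that template positions are counted inclusively from position -1 at the primer terminus.

```python
def duplex_bp_protected(first: int, last: int) -> int:
    """Count duplex base pairs protected between two negative template
    positions, inclusive (assumption: -1 is the first duplex base pair)."""
    return abs(last) - abs(first) + 1

# Boundaries reported in the text for the DNase I footprints of MLV RT.
wild_type_long = duplex_bp_protected(-1, -27)   # parental enzyme, upper bound
wild_type_short = duplex_bp_protected(-1, -26)  # parental enzyme, lower bound
rnase_h_deleted = duplex_bp_protected(-1, -15)  # RNase H-deleted derivative

# Loss of protection attributable to the RNase H domain.
print(wild_type_short - rnase_h_deleted, wild_type_long - rnase_h_deleted)
```

Under these assumptions the difference falls in the 11-12 bp range given in the text, which is close to the ~12-14 bp predicted for the RNase H domain from the crystal structure.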
B. Polymerization-Dependent and -Independent RNase H Activities

A combination of biochemical and biophysical analyses has indicated that the N-terminal DNA polymerase and C-terminal RNase H catalytic centers of HIV-1 RT are separated by a distance of ~18 bp (43, 44). It would therefore not seem unreasonable that, during RNA-directed DNA synthesis, hydrolysis of the template occurs a fixed distance from the growing primer terminus. This notion has been verified by several groups (173-179); an example is illustrated in the hydrolysis profile of Fig. 10. However, the same figure clearly illustrates that hydrolysis is not restricted to endonucleolytic cleavage at template nucleotide -17, but is accompanied by a second series of processing events extending to within 8 nt of the primer terminus. Although the second hydrolytic function has been designated 3′→5′ exonuclease (180-182), the terminology "polymerization-" or "synthesis-independent hydrolysis" is perhaps more appropriate, while endonuclease activity can be likened to polymerization-dependent hydrolysis. Further cleavage of the RNA-DNA hybrid beyond template nucleotide -8 is most likely a consequence of instability and dissociation of the short RNA-DNA hybrid. The analysis of Fig. 10 holds true for the 3′-OH of a recessed, fully annealed DNA primer. However, it has been demonstrated that, when the 3′ terminus of the primer is not base paired, the nucleic acid duplex exerts a strong influence on RT positioning and hydrolysis; that is, the RNA template is hydrolyzed approximately 18 bp behind the first base pair of the duplex (183).

RNase H activity has been examined in most cases with an RNA template hybridized to a recessed DNA primer as a means of correlating the degradative and polymerizing functions of RT (164,167,173,174,184). Such a scenario can be regarded as providing an accurate picture of (-) or first-strand DNA synthesis, which accompanies virus entry and uncoating.
However, during second- or (+) strand synthesis, multiple fragments of the (+) strand RNA genome remain hybridized to nascent (-) strand DNA. These fragments must be removed either by strand displacement activity of RT (185) or through further hydrolysis under conditions where a free DNA 3′-OH is absent to position the polymerizing enzyme (i.e., the substrate is a hybrid whose RNA 5′ and 3′ termini are recessed). Under such conditions, HIV-1 RT is capable of positioning itself to generate equivalent cleavage events 18 and 8 nucleotides from the RNA 5′ terminus (183). However, when the recessed RNA 5′ strand contains a short unpaired 5′ terminus, this continues to "direct" cleavage specificity rather than the nucleic acid duplex (i.e., hydrolysis occurs 18 and 8 nucleotides from the unpaired terminus). As the length of this unpaired terminus is extended, the ability to direct RT toward hydrolysis of the nucleic acid duplex is lost, eliminating the possibility of a mechanism where the retroviral polymerase attaches to the single-stranded terminus and thereafter scans the nucleic acid until it encounters and positions itself over a duplex substrate. At present, it is not clear why unpaired DNA and RNA termini have a differential effect on hydrolysis of the adjacent RNA-DNA hybrid, which was the same in both cases. Although strand displacement activity during (+) strand synthesis (185) might eliminate a requirement for an RNase H activity directed by an RNA 5′ terminus, a later section deals with the possibility that such a specialized function could potentially aid in selecting the PPT primer for (+) strand synthesis.

[Figure labels: 5′-labeled RNA template; DNA primer; HIV-1 RT; endonucleolytic cleavage (-17); directional processing (-8); axes: template position, time (seconds).]

FIG. 10. (A) A model RNA-DNA hybrid for evaluation of RT-associated RNase H activity. Substrate is a 90-nt RNA radiolabeled at the RNA 5′ terminus and hybridized at its 3′ terminus to a 36-nt oligodeoxynucleotide. Template and primer nucleotides occupied by RT (filled ellipsoids) were determined by enzymatic (123, 127) and chemical (125) footprinting. (B) When p66-p51 HIV-1 RT is located over the primer 3′ terminus, two modes of hydrolysis can be distinguished. The first involves cleavage at position -17 (71-nt hydrolysis product), which defines the spatial separation of the DNA polymerase and RNase H catalytic centers (44). Subsequent to this, a directional processing or synthesis-independent RNase H activity hydrolyzes the template as far as position -8 (62-nt hydrolysis product).
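The product sizes quoted in the legend to Fig. 10 follow from the geometry of the model substrate. The sketch below reproduces them under stated assumptions that are not spelled out in the text: the primer anneals flush with the RNA 3′ terminus, position -1 is the template nucleotide paired with the primer 3′ terminus, and cleavage occurs immediately 3′ of the named template nucleotide. The function name is hypothetical.

```python
def labeled_product_length(template_len: int, primer_len: int, position: int) -> int:
    """Length of the 5'-labeled RNA fragment left after cleavage at a
    negative template position (counted back from the primer 3' terminus)."""
    if position >= 0:
        raise ValueError("cleavage position must lie within the duplex (negative)")
    # Single-stranded 5' overhang of the template (assumes flush 3' annealing).
    overhang = template_len - primer_len
    # The fragment retains the overhang plus |position| duplex nucleotides.
    return overhang + abs(position)

# Substrate of Fig. 10: 90-nt RNA hybridized to a 36-nt oligodeoxynucleotide.
endonucleolytic = labeled_product_length(90, 36, -17)
directional = labeled_product_length(90, 36, -8)
print(endonucleolytic, directional)
```

With these assumptions, cleavage at -17 and -8 gives fragments of 71 and 62 nt, matching the hydrolysis products named in the figure legend.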
C. RNase H-Dependent Steps in Retroviral Replication

As presented in the HIV-1 replication cycle of Fig. 1, the primary responsibility of RT-associated RNase H can be considered simply nonspecific hydrolysis of the RNA moiety of the RNA-DNA replication intermediate, thereby making newly synthesized (-) strand DNA available as template for (+) strand synthesis. However, several occasions arise during retroviral replication where a considerable degree of selectivity is required of the RNase H domain. One example is selection of the PPT from the (+) strand RNA genome as primer for second-strand synthesis via resistance of the corresponding RNA-DNA hybrid to hydrolysis (186-189). In addition, both the (-) and (+) strand RNA primers (tRNA and the PPT, respectively) must be removed from nascent DNA in order to generate a double-stranded proviral DNA uninterrupted by ribonucleotides. Since the junctions between the (+) and (-) strand RNA primers and nascent DNA define proviral sequences at the termini of the 5′ and 3′ LTRs, respectively, which are critical to efficient integration, it would not be unreasonable to assume that mechanisms have evolved that allow for precise primer removal.

Prior to the second-strand transfer event, the 3′ end of the tRNA replication primer is used as a template, dictating DNA-dependent DNA synthesis until a methylated base is encountered within the tRNA TΨC loop (Me-A58). In conjunction with secondary structure adopted by the tRNA replication primer, transient pausing of the replication complex at Me-A58 would have the consequence of positioning the C-terminal RNase H domain approximately 18 nt behind the primer terminus, which corresponds to the junction of (-) strand DNA and tRNA. The spatial separation of the DNA polymerase and RNase H active centers of RT may then provide the appropriate mechanism for precise tRNA primer release, as was originally noted for the avian enzyme (190). Model substrates constructed to mimic tRNA primer removal, comprising an RNA-DNA chimera hybridized to DNA, have provided a more complex picture with the HIV-1 enzyme. Rather than cleaving at the tRNA-DNA junction, a single ribonucleotide remains covalently attached to the 5′ terminus of the nascent DNA chain (173, 174, 191-194), the implications of which for retroviral integration are unclear. Invoking a similar scenario for removal of the (+) strand PPT primer is more difficult, although several groups have recorded efficient and precise elimination of the PPT in in vitro reconstituted systems with the RTs of MLV (187), HIV-1 (188,194), and EIAV (J. W. Rausch and S. F. J. Le Grice, unpublished observations).

Finally, RNase H activity plays a pivotal role in events whereby newly synthesized (-) and (+) strand DNA are translocated within or between the termini of the retroviral genome during the process of strand transfer (1, 29, 196). Using a model system mimicking the first-strand transfer event, Peliska and Benkovic (197,198) have demonstrated a direct relationship between the efficiency of strand transfer and the size of residual template RNA remaining hybridized to nascent (-) strand DNA. As presented later, at least the first-strand transfer in HIV invokes a polymerization-independent mode of hydrolysis, since enzymes lacking this function, but fully active as polymerization-dependent RNase H, fail to support this process (164,167).
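The positioning argument for tRNA primer removal described in this section can be illustrated numerically. This is a sketch only: the tRNA length of 76 nt is an assumption that does not appear in the text, the ~18-nt active-site spacing is treated as exact, and the function names are hypothetical.

```python
TRNA_LENGTH = 76          # ASSUMPTION: primer tRNA length, not stated in the text
PAUSE_POSITION = 58       # methylated A58, where synthesis pauses (from the text)
ACTIVE_SITE_SPACING = 18  # approximate polymerase-to-RNase H distance, nt (from the text)

def trna_nucleotides_copied(trna_length: int, pause_position: int) -> int:
    """tRNA nucleotides copied from the 3' end before pausing at pause_position
    (the paused-at nucleotide itself is not copied)."""
    return trna_length - pause_position

def rnase_h_offset_from_junction(copied: int, spacing: int) -> int:
    """Distance in nt between the RNase H site and the tRNA-DNA junction;
    0 means the RNase H site sits on the junction."""
    return copied - spacing

copied = trna_nucleotides_copied(TRNA_LENGTH, PAUSE_POSITION)
offset = rnase_h_offset_from_junction(copied, ACTIVE_SITE_SPACING)
print(copied, offset)
```

In this idealized picture the RNase H center lands on the junction. The text notes that HIV-1 RT in fact leaves a single ribonucleotide attached to nascent DNA, so the zero offset should be read as a simplification rather than a prediction.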
D. Polymerization-Independent RNase H Activity and Strand Transfer

While data in Section IV,B clearly define two modes of RNase H activity, the necessity for both during retroviral replication was not immediately clear, and was only addressed as methods for analyzing RT-associated RNase H increased in sophistication. Figure 1 highlights steps during retroviral replication where nascent DNA must be transferred between two genomes (interstrand) or within termini of a single genome (intrastrand) by a process designated DNA strand transfer (29). The first of these is predominantly an interstrand event involving the translocation of a chimera of tRNA and nascent (-) strand strong-stop DNA between repeat regions (designated R and r in Fig. 1) of the (+) strand RNA genome. An elegant series of in vivo experiments by Goff and co-workers with MLV RT (199,200) demonstrated that RNase H activity is required to hydrolyze the RNA-DNA replicative intermediate, thereby making (-) strand DNA available for hybridization to an acceptor template. Subsequent to this, an in vitro assay was developed, comprising short oligonucleotide substrates, that provided an accurate representation of in vivo events (179) and is outlined diagrammatically in Fig. 11. In an initial step, an oligodeoxynucleotide primer is extended to the 5′ terminus of a donor RNA template. In the presence of a second (acceptor) template with homology to the 5′ end of the donor, DNA strand transfer is accomplished, followed by further extension of the primer to the 5′ terminus of the acceptor. By end-labeling the donor RNA template, its fate during DNA strand transfer could be monitored. According to these authors, synthesis-dependent RNase H activity would, during polymerization, cleave the RNA template a fixed distance of 18 nt behind the primer terminus until the 5′ terminus of the template is reached. Subsequent to this, it was demonstrated that strand transfer was dependent on the ability of RT-RNase H to hydrolyze the donor template within 8-10 nt of its 5′ terminus via a process designated synthesis-independent RNase H activity. This model predicts that a role for DNA synthesis-independent RNase H activity could involve degradation of the donor RNA template to a size permitting its dissociation from nascent DNA and hybridization of the latter to the acceptor template.

Implicit in this model is that RT mutants specifically deficient in directional processing activity would not support DNA strand transfer. Reports from our laboratory have confirmed this notion, an example of which is provided in the experiment of Fig. 11. Removal of a portion of α-helix E′ from the RNase H domain generated a mutant of HIV-1 RT (p66Δ8-p51, lacking p66 residues Ser553-Leu560) that retained full DNA polymerase function, but only the synthesis-dependent subset of RNase H activities (167), which resulted in severely impaired DNA strand transfer.
In a related study, substituting Mn2+ for Mg2+ as divalent cation restored DNA synthesis-dependent RNase H activity to an HIV-1 RT mutant altered at the highly conserved Glu478 of the RNase H domain (130). Despite retaining wild-type levels of DNA synthesis and the accompanying RNase H activity in the presence of Mn2+, this mutant likewise failed to support DNA strand transfer (164). Similar observations have been made with an HIV-1 mutant containing a 13-residue deletion at the C-terminus of the p51 subunit (p66-p51Δ13).

FIG. 11. (A and B) Mutations in the p66 (A) or p51 (B) subunit of HIV-1 RT leading to loss of directional processing RNase H activity. In panel A, residues Ser553-Leu560 were removed from the C-terminus of the p66 subunit, while in panel B, residues Gln428-Phe440 were removed from p51. W, wild-type RT; M, mutant RT; C, uncleaved substrate. Using the RNA-DNA hybrid of Fig. 10, both mutants support efficient cleavage at template nucleotide -17, but fail to process from here to position -8. (C) In vitro assay for interstrand DNA transfer. The assay comprises a 40-nt RNA template to which a 20-nt DNA primer is hybridized, and a second 41-nt RNA template. The two templates share 20 nt of homology (r), allowing exchange of a 40-nt strand transfer intermediate and subsequent extension to yield a 61-nt strand transfer product. (D) Inability of RT mutants p66Δ8-p51 and p66-p51Δ13 to support DNA strand transfer, despite retaining synthesis-dependent or endoribonuclease activity. Note in panel D that the axes differ in scale for the wild-type and mutant enzymes.
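The fragment sizes in the strand transfer assay of Fig. 11C follow from the template lengths and their shared homology. The sketch below works through that arithmetic; it assumes (beyond what the legend states) that the primer is extended exactly to the donor 5′ terminus and that the homology region lies at the 3′ end of the acceptor. Function names are hypothetical.

```python
def transfer_intermediate_length(donor_len: int) -> int:
    """DNA length after the primer is extended to the donor 5' terminus."""
    return donor_len

def transfer_product_length(donor_len: int, acceptor_len: int, homology: int) -> int:
    """DNA length after transfer and extension to the acceptor 5' terminus.
    The shared homology is copied once, so it is subtracted from the sum."""
    if homology > min(donor_len, acceptor_len):
        raise ValueError("homology cannot exceed either template length")
    return donor_len + acceptor_len - homology

# Values from the legend to Fig. 11C.
DONOR, ACCEPTOR, HOMOLOGY = 40, 41, 20
intermediate = transfer_intermediate_length(DONOR)
product = transfer_product_length(DONOR, ACCEPTOR, HOMOLOGY)
print(intermediate, product)
```

Under these assumptions the calculation returns the 40-nt intermediate and 61-nt product named in the legend, so accumulation of the 40-nt species without the 61-nt species is the signature of a transfer defect.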
Although several reports in the literature deal with the requirement of RNase H activity for (-) strand DNA transfer (188, 196-198, 201) and its enhancement in the presence of NC (198,202), there have been relatively few studies on the second-strand transfer event, which is responsible for relocating (+) strand strong-stop DNA to the 5′ terminus of the (-) strand (Fig. 1). However, the availability of chemically synthesized oligoribonucleotides, as well as natural and synthetic tRNA, should make it possible to prepare RNA-(-) strand DNA chimeras as model templates for (+) strand synthesis and the second-strand jump. Early reports with avian retroviruses (190) have suggested that the cognate tRNA primer (tRNATrp) is released intact from (+) strand strong-stop DNA, which would predict that polymerization-independent RNase H activity is dispensable for efficient second-strand transfer. As our understanding of these and other events requiring precise RNase H-mediated hydrolysis, and of the different modes of hydrolysis, improves, RNase H activity should become an attractive target for future antiviral efforts.
V. The Polypurine Tract and Second-Strand Synthesis

A. Selection and Initiation from the 3′ Polypurine Tract Primer

Perhaps the least understood event in retroviral replication is the manner in which the appropriate primer is selected from the RNA-DNA replicative intermediate for initiation of (+) strand synthesis. Data from several laboratories have indicated that an RNase H-resistant, purine-rich segment located toward the 3′ end of the genome, designated the PPT, provides this cis-acting function (30,186,191,196,203; Fig. 1). However, analysis of most retroviral genomes reveals the presence of several purine-rich regions that might be equally capable of resisting hydrolysis, yet fail to support (+) strand synthesis, suggesting the PPT contains additional structural features critical to its use as a primer.

An example of PPT selection in vitro is illustrated in Fig. 12. The model system of Fig. 12A (203a) comprises a PPT-containing (+) strand RNA fragment to which a (-) strand DNA primer is hybridized at the 3′ terminus. Primer is extended with an RNase H-deficient RT (130) and the intact RNA-DNA hybrid is purified. In a second reaction, the full-length hybrid is incubated with wild-type RT in the presence of dNTPs, one of which is radiolabeled, and the products are fractionated by high-voltage electrophoresis. Several features of PPT selection and extension are evident from the analysis of Fig. 12B, which assesses the manner in which the EIAV 3′ PPT is utilized by homologous RT, as well as that of HIV-1. The predominant (+) strand DNA species in a reaction containing EIAV RT corresponds to initiation immediately adjacent to the 3′ G of the PPT, and is present in the absence and presence of alkali treatment (which liberates residual (+) strand RNA from nascent DNA). Such a result implies that EIAV RT (i) accurately processes the PPT at both its 5′ and 3′ termini for use as primer and (ii) efficiently removes the primer from nascent (+) strand DNA. In contrast, HIV-1 RT appears to initiate (+) strand synthesis at several positions within the EIAV PPT, and from an RNA primer whose 5′ terminus is somewhat more heterogeneous. Thus, although a preferential site for initiation of (+) strand synthesis clearly exists for both retroviral enzymes, the exact position differs for two closely related lentiviral enzymes. Similar observations with heterologous enzymes on the HIV-1 and MLV PPT primers have also been made (191,204), suggesting that each retroviral enzyme may have evolved to accommodate sequence and/or structural features of its particular (+) strand primer.

FIG. 12. Discontinuous (+) strand synthesis in lentiviruses and termination of 3′ PPT-initiated synthesis at the central termination site (CTS). (-) and (+) strands are represented by bold and gray lines, respectively. (Step a) (+) strand synthesis initiates from both the 3′ and central PPT (3′ and C, respectively). 3′ PPT-initiated synthesis terminates within the tRNA template, and the tRNA is released. (Step b) Second-strand transfer, via a circular intermediate, uses PBS homology of (-) and (+) DNA. (Step c) Bidirectional strand displacement synthesis, via elongation of the (-) strand and downstream (+) strand. (Step d) Completion of upstream (+) strand segment and its termination after a CTS-controlled strand displacement event at the center of the retroviral genome.

What are the features of the 3′ PPT that favor its usage over other (+) strand RNA fragments as the appropriate primer? According to the replication scheme of Fig. 1, initiation of (+) strand synthesis at a position other than the 3′ PPT would have the consequence of altering the sequence at the terminus of the 5′ LTR, which would be lethal for the subsequent integration event. Theoretically, it is sufficient that the 3′ PPT is used more efficiently than other purine-rich RNA fragments generated by RNase H degradation of the replicative intermediate, rather than exclusively (205). Once (+) strand DNA initiated from the 3′ PPT is copied from the first 18 nucleotides of the tRNA template, RNase H activity removes the tRNA replication primer, allowing the second-strand jump via homology of PBS sequences of (-) and (+) strand DNA (Fig. 1). The absence of the tRNA template for any imprecisely primed (+) strand DNA thereby assures that this cannot undergo strand transfer. Furthermore, the RNA-DNA replicative intermediate containing the 3′ PPT is generated shortly after the first strand transfer event, imparting a temporal advantage on this primer over others several hundred to several thousand bases from the 3′ end of the (-) strand DNA.
However, several purine-rich segments in the immediate vicinity of the HIV-1 3′ PBS that might have a similar advantage are bypassed by the replication machinery, indicating that additional features contribute to 3′ PPT selection and usage. The observation that HIV-1 and MLV RT efficiently initiate from homologous and heterologous PPTs, despite significant differences in the surrounding sequences (204), suggested that the appropriate information is confined to the 15-18 nt of the PPT. This contention is supported by recent data indicating that, in vitro, the HIV-1 PPT can be relocated within a new sequence context without altering the efficiency of its selection and extension into (+) strand DNA (195). In light of this, the sequence of the PPT itself appears the logical determinant. In most, if not all, retroviruses, the PPT has a defined arrangement of purine nucleotides; that is, the 5′ region is predominantly A-rich, while a series of up to six consecutive G residues lies immediately adjacent to the site of (+) strand initiation (an example is given for the EIAV 3′ PPT in Fig. 12). By introducing a series of single nucleotide substitutions into the PPTs of MLV and HIV-1, Pullen et al. (204) proposed that RT was capable of measuring a defined number of nucleotides 3′ to the contiguous stretch of adenine residues and initiating from this point. However, several observations suggest this postulate is unlikely. First, the same authors noted that mutations in this stretch of A residues had a significant effect on (+) strand priming by MLV RT while having little influence on the HIV-1 enzyme. Second, the PPT of the retrotransposon Ty3 is for the most part a series of alternating A and G residues (206), and a similar sequence can be selected and used as a PPT primer by the HIV-1 enzyme (195). Finally, Powell and Levin (195) have demonstrated that a sequence as short as the six contiguous G residues at the 3′ end of the HIV-1 PPT can be introduced into a new context in which they are preceded by the sequence -U-C-A-U-A-C-C-A-U-, yet continue to direct efficient (+) strand initiation (in this respect, it is interesting to note that the 3′ PPT of the yeast retrotransposon Ty1 contains a G-rich sequence at its 3′ end (207), but is preceded by a more pyrimidine-rich sequence). These latter observations suggest that the contiguous G residues impart an unusual configuration to the PPT-DNA hybrid, which spectropolarimetry has clearly demonstrated adopts a structure significantly different from that of a random RNA-DNA hybrid.
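The sequence features discussed above (a purine-only tract ending in a run of contiguous G residues) can be located in an RNA sequence with a simple scan. The sketch below is a hedged illustration: the test sequence is invented for the example and is not taken from any viral genome, the 10-nt minimum tract length is an arbitrary assumption, and the function name is hypothetical.

```python
import re

def purine_tracts(rna: str, min_len: int = 10):
    """Return (start, tract, trailing_g_run) for every purine-only run of at
    least min_len nucleotides. trailing_g_run counts the contiguous G residues
    at the 3' end of the tract."""
    results = []
    for match in re.finditer(r"[AG]{%d,}" % min_len, rna.upper()):
        tract = match.group()
        trailing_g_run = len(tract) - len(tract.rstrip("G"))
        results.append((match.start(), tract, trailing_g_run))
    return results

# Invented sequence: an A-rich/G-run tract and an alternating A/G tract.
toy_rna = "CUU" + "AAAAGAAAAGGGGGG" + "CU" + "AGAGAGAGAGAG" + "UC"
for start, tract, g_run in purine_tracts(toy_rna):
    print(start, tract, g_run)
```

A scan of this kind only flags candidate tracts. As the text argues, purine content alone does not explain primer selection, and the structure of the PPT-DNA hybrid is likely to matter as much as its sequence.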
B. Mutations in RT Influencing PPT Selection and Extension

A complement to determining how sequence and/or structural features influence selection of the PPTs of different retroviruses and retrotransposons might be to identify RT mutants that fail to support either selection or extension from this primer. Such defects have been revealed by studies of HIV-1 RT mutated in the β12-β13 hairpin or primer grip (44,136,139). Alanine substitution at Pro226 or Phe227 of β-strand 12 of the p66 subunit yields selectively mutated heterodimers whose DNA-dependent DNA polymerase and RNase H activities appear unaffected on model templates (136). The same two mutants correctly process the PPT from a model substrate comprising a DNA strand hybridized to a PPT RNA-DNA chimera (i.e., they cleave at the PPT-(+) strand DNA junction). However, despite correct PPT processing, these mutants apparently fail to support (+) strand synthesis while having little difficulty extending a DNA oligonucleotide hybridized to the PBS (M. Powell, M. Ghosh, S. F. J. Le Grice, and J. Levin, manuscript submitted). A second unusual phenotype is evident when Glu233 and His235 of β-strand 13 are substituted with Ala. Although active in hydrolysis of a heteropolymeric RNA-DNA hybrid (139), these mutants have considerable difficulty in processing the 3′ PPT (207a). These combined observations suggest that (i) PPT selection may be a specialized form of RNase H activity (PPT-processing RNase H activity may be a more appropriate term), and (ii) RNA and DNA primers may be recognized differently within the DNA polymerase active site of HIV-1 RT, which may offer further therapeutic possibilities directed at initiation of (+) strand synthesis.
C. Central PPT and Central Termination Sequences of Lentiviruses The replication scheme presezited in Fig. 1 depicts (+) strand synthesis initiating exclusively from a PPT located toward the 3’ end of the retroviral genome and therefore intact prior to integration of the double-stranded proviral DNA. However, early data from Visna lentivirus (208) and spumaviruses(209)indicated a discontinuityin the proviral (+) strand, which could be mapped to the central portion of the genome in the vicinity of a near-perfect copy of the 3’ PPT. Subsequent to this, similar discontinuities have been demonstrated for HIV-1(32,210), EIAV (J.W. Rausch and S. F. J. Le Grice, unpublished observations), and the yeast retrotransposon Tyl (207).These observations led to the modified scheme of Fig. 13, where a second site for (+) strand initiation is provided by the central PF’T (cPPT).The requirement for a central PPT is not immediately clear, since (i)this does not appear to be a feature of murine and avian retroviruses and (ii) in the yeast retrotransposon Ty 1, transposition frequency is unaffected by mutations eliminating the cPPT but sensitive to alterations to the 3’ PPT (207). However, both Hungnes et al. (210) and Chameau and Clavel(32) have presented convincing data that cPPT-inactivating mutations that maintain the integrity of the IN gene product (the cPPT is located within the I N coding region) have a si&icant influence on in vivo replication kinetics. Furthermore, the observation that subviral nucleoprotein complexes harboring such (+) strand discontinuities are competent for integration (211) suggests that “sealing” the discontinuous (+) strand can be accomplished by host enzymes following integration of full-length proviral DNA. 
Since strand displacement during DNA synthesis is a common feature of RT (184), the existence of a (+) strand discontinuity imposes another regulatory feature of the replication cycle, namely terminated synthesis of 3' PPT-initiated (+) DNA that has undergone strand transfer as it approaches the cPPT. In fact, data supporting this notion have been provided for HIV-1, where efficient termination of (+) strand synthesis immediately 3' to the cPPT has been demonstrated (212). In vitro studies indicated that the efficiency of termination increases when RT is required to perform limited displacement of the cPPT-initiated (+) strand, which has generated the model of Fig. 13. Structural features of this sequence, designated the “central termination sequence” (CTS), indicate it may adopt an unusual curvature as a consequence of dAn-dTn tracts, a feature common to other biological systems (213-215). As with alterations to the central PPT, altering the CTS to allow “readthrough” (+) strand synthesis has severe consequences for viral replication. Interestingly, in a model system, HIV-1 RT will also terminate (+) strand synthesis shortly after it has used the CTS as template. Such data imply that duplex DNA with an unusual curvature occupies the nucleic acid-binding cleft of the enzyme, which, together with strand displacement synthesis, may impose an additional control mechanism to halt (+) strand synthesis. Although this is presently documented for HIV-1, we have recently demonstrated in vitro that EIAV RT is similarly stalled at a CTS sequence in the immediate vicinity of its cPPT (S. Stetor and S. F. J. Le Grice, unpublished observations).
FIG. 13. An in vitro assay for selection, utilization, and removal of the (+) strand PPT primer. Panel A outlines the experimental strategy. In step a, (-) strand DNA synthesis is initiated by RNase H-deficient RT (130) from an oligodeoxynucleotide primer hybridized to the 3' end of a (+) PPT-containing RNA template. The resulting RNA-DNA hybrid is purified (step b) and incubated with wild-type RT in the presence of dNTPs, one of which is radiolabeled (step c). RNase H activity digests the RNA-DNA replicative intermediate with the exception of the PPT, which is extended into (+) DNA (step d). As the enzyme traverses the PPT RNA-(+) DNA junction, the PPT is removed from nascent (+) DNA (step d). Panel B depicts (+) strand synthesis on the EIAV genome catalyzed by HIV-1 and EIAV RT. (Left) A sequencing gel is used to locate the EIAV PPT. (Right) Selection and extension of the EIAV (+) strand PPT primer. Lane notations (-) and (+) indicate whether nascent (+) strand DNA was treated with NaOH to remove the PPT primer. Clearly, EIAV RT selects, extends, and removes the PPT primer in a single step, since the major (+) strand product is largely unaffected by alkali treatment. Synthesis is also initiated from the 3' -G- of the PPT. In contrast, HIV-1 RT (i) is less efficient in removing the EIAV PPT, and (ii) initiates (+) strand synthesis from a different position (i.e., within the EIAV PPT) (203a).
VI. Conclusions
As a consequence of HIV-1 infection and the unabating problem of AIDS, the RTs of HIV-1 and HIV-2 have become perhaps the most intensely studied retroviral enzymes of the last decade, reflected in as little as 6 years between documentation of the heterodimeric nature of the HIV-1 enzyme (16, 17) and the availability of a high-resolution structure of the p66-p51-Nevirapine co-crystal (43). The advent of modern molecular techniques has also made it possible to finely dissect the multiple steps involved in retroviral replication and the involvement of accessory proteins such as NC. This review has focused for the larger part on our current understanding of the interaction of HIV-1 RT with template-primer duplexes indicative of (-) and (+) strand initiation complexes, as well as the RNA-DNA replication intermediate, revealing unexpectedly complex control mechanisms surrounding certain events. Such findings will hopefully offer novel avenues for therapeutic intervention in our attempts to develop future generations of antiviral agents. However, a caveat in the data presented here should not be underestimated. As our understanding of HIV-1 RT improves, it is clear that certain mechanistic hypotheses are often not directly applicable to other retroviral enzymes. This is exemplified by the observation that many retroviral enzymes employing tRNALys,3 as their replication primer (including HIV-2) fail to extend tRNALys,3 hybridized to the PBS of the HIV-1 genome while doing so from their own PBS (27). Additionally, while HIV-1-derived data predicted that the smaller HIV-2 subunit should arise through PR-derived hydrolysis between Phe440 and Tyr441, it has now been demonstrated that the HIV-2 RT heterodimer results from cleavage between Met484 and Ala485 (37). Partial removal of the RNase H domain may also be a feature of the heterodimeric EIAV enzyme (39). These simple observations provide clear evidence that HIV-1 RT should be treated as one of a family of retroviral enzymes rather than the prototype. The availability of additional retroviral enzymes through recombinant DNA technology should promote further comparative studies to provide a comprehensive picture of events mediated by this highly versatile enzyme.
ACKNOWLEDGMENTS
This work was funded by Public Health Service Grants AI31147, GM52263, and GM46623 to S. F. J. Le Grice. E. J. Arts was supported by a postdoctoral fellowship from Health and Welfare Canada. The Center for AIDS Research at Case Western Reserve University is funded by Public Health Service Grant P30 AI36219. Finally, the assistance of J. W. Rausch, J. Miller, E. Arnold, J. Ding, and J. Levin in preparation of this manuscript is gratefully acknowledged.
REFERENCES
1. H. E. Varmus and R. Swanstrom, in “Molecular Biology of Tumor Viruses,” 2nd ed. RNA Tumor Viruses (R. Weiss et al., eds.), p. 369. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1984.
2. H. Temin and S. Mizutani, Nature (London) 226, 1211 (1970).
3. D. Baltimore, Nature (London) 226, 1209 (1970).
4. B. Larder, D. Purifoy, K. Powell, and G. Darby, EMBO J. 6, 3133 (1987).
5. D. A. Soltis and A. M. Skalka, Proc. Natl. Acad. Sci. U.S.A. 85, 3372 (1988).
6. A. Hizi, C. McGill, and S. H. Hughes, Proc. Natl. Acad. Sci. U.S.A. 85, 1218 (1988).
7. M. Roth, N. Tanese, and S. P. Goff, J. Biol. Chem. 260, 9326 (1985).
8. S. F. J. Le Grice and F. Grüninger-Leitch, Eur. J. Biol. 187, 307 (1990).
9. S. F. J. Le Grice, M. Panin, R. Kalayjan, N. Richter, G. Keith, J. L. Darlix, and S. L. Payne, J. Virol. 65, 7003 (1991).
10. M. Amacker, M. Hottiger, and U. Hubscher, J. Virol. 69, 6273 (1995).
11. L. Ratner, W. Haseltine, R. Patarca, K. J. Litvak, B. Starcich, S. F. Josephs, E. R. Doran, J. A. Rafalski, E. A. Whitehorn, K. Baumeister, L. Ivanoff, S. R. Petteway, Jr., M. L. Pearson, J. A. Lautenberger, T. S. Papas, J. Ghrayeb, N. T. Chang, R. C. Gallo, and F. Wong-Staal, Nature (London) 313, 277 (1985).
12. M. Guyader, M. Emerman, P. Sonigo, F. Clavel, L. Montagnier, and M. Alizon, Nature (London) 326, 662 (1987).
13. B. A. Larder, in “Reverse Transcriptase” (A. M. Skalka and S. P. Goff, eds.), p. 205. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
14. C. Tantillo, J. Ding, A. Jacobo-Molina, R. G. Nanni, P. L. Boyer, S. H. Hughes, R. Pauwels, K. Andries, P. A. J. Janssen, and E. Arnold, J. Mol. Biol. 243, 369 (1994).
15. J. W. Erickson and S. K. Burt, Annu. Rev. Pharmacol. Toxicol. 36, 545 (1996).
16. M. M. Lightfoote, J. E. Coligan, T. M. Folks, A. S. Fauci, M. A. Martin, and S. Venkatesan, J. Virol. 60, 771 (1986).
17. F. Di Marzo Veronese, T. D. Copeland, A. L. DeVico, R. Rahman, S. Oroszlan, R. C. Gallo, and M. G. Sarngadharan, Science 231, 1289 (1986).
18. B. Muller, T. Restle, H. Kuhnel, and R. S. Goody, J. Biol. Chem. 255, 14709 (1991).
19. A. M. Skalka, in “Reverse Transcriptase” (A. M. Skalka and S. P. Goff, eds.), p. 193. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
20. A. Aiyar, D. Cobrinik, Z. Ge, H.-J. Kung, and J. Leis, J. Virol. 66, 2464 (1992).
21. A. Aiyar, Z. Ge, and J. Leis, J. Virol. 68, 611 (1994).
22. D. Cobrinik, L. Soskey, and J. Leis, J. Virol. 62, 3622 (1988).
23. D. Cobrinik, A. Aiyar, Z. Ge, M. Katzman, H. Huang, and J. Leis, J. Virol. 65, 3864 (1991).
24. F. Baudin, R. Marquet, C. Isel, J.-L. Darlix, B. Ehresmann, and C. Ehresmann, J. Mol. Biol. 229, 383 (1993).
25. C. Isel, C. Ehresmann, G. Keith, B. Ehresmann, and R. Marquet, J. Mol. Biol. 247, 236 (1995).
26. E. J. Arts, M. Ghosh, P. S. Jacques, B. Ehresmann, and S. F. J. Le Grice, J. Biol. Chem. 271, 9054 (1996).
27. E. J. Arts, S. Stetor, X. Li, J. W. Rausch, K. J. Howard, B. Ehresmann, T. W. North, B. M. Wohrl, R. S. Goody, M. A. Wainberg, and S. F. J. Le Grice, Proc. Natl. Acad. Sci. U.S.A. (1996) (in press).
28. J. Leis, A. Aiyar, and D. Cobrinik, in “Reverse Transcriptase” (A. M. Skalka and S. P. Goff, eds.), p. 33. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
29. A. Telesnitsky and S. P. Goff, in “Reverse Transcriptase” (A. M. Skalka and S. P. Goff, eds.), p. 49. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
30. J. Sorge and S. H. Hughes, J. Virol. 43, 482 (1982).
31. P. Charneau, M. Alizon, and F. Clavel, J. Virol. 66, 2814 (1992).
32. P. Charneau and F. Clavel, J. Virol. 65, 2415 (1991).
33. J. G. Levin, D. Hatfield, S. Oroszlan, and A. Rein, in “Reverse Transcriptase” (A. M. Skalka and S. P. Goff, eds.), p. 5. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
34. D. M. Lowe, A. Aitken, C. Bradley, G. K. Darby, B. A. Larder, K. L. Powell, D. J. M. Purifoy, M. Tisdale, and D. K. Stammers, Biochemistry 27, 8884 (1988).
35. Z. Hostomska, D. A. Matthews, I. Davies, S. R. Jordan, B. R. Nodes, and Z. Hostomsky, J. Biol. Chem. 266, 14697 (1991).
36. J. Wang, S. J. Smerdon, J. Jager, L. A. Kohlstaedt, P. A. Rice, J. M. Friedman, and T. A. Steitz, Proc. Natl. Acad. Sci. U.S.A. 91, 7242 (1994).
37. N. S. Fan, K. B. Rank, J. W. Leone, R. L. Heinrickson, C. A. Bannow, C. W. Smith, D. B. Evans, S. M. Poppe, W. G. Tarpley, D. J. Rothrock, A. G. Tomasselli, and S. K. Sharma, J. Biol. Chem. 270, 13573 (1995).
38. N. S. Fan, K. B. Rank, S. M. Poppe, W. G. Tarpley, and S. K. Sharma, Biochemistry 35, 1911 (1996).
39. J. Toszer, D. Friedman, I. T. Weber, and S. Oroszlan, Biochemistry 32, 3347 (1993).
40. S. F. J. Le Grice, T. Naas, B. Wohlgensinger, and O. Schatz, EMBO J. 10, 3905 (1991).
41. A. Telesnitsky and S. P. Goff, Proc. Natl. Acad. Sci. U.S.A. 90, 1276 (1993).
42. B. M. Wohrl, K. J. Howard, P. S. Jacques, and S. F. J. Le Grice, J. Biol. Chem. 269, 8541 (1994).
43. L. A. Kohlstaedt, J. Wang, J. M. Friedman, P. A. Rice, and T. A. Steitz, Science 256, 1783 (1992).
44. A. Jacobo-Molina, J. Ding, R. G. Nanni, A. D. Clark, Jr., X. Lu, C. Tantillo, R. Williams, R. G. Kamer, A. L. Ferris, P. Clark, A. Hizi, S. H. Hughes, and E. Arnold, Proc. Natl. Acad. Sci. U.S.A. 90, 6320 (1993).
45. D. K. Stammers, D. O. Somers, C. K. Ross, I. Kirby, P. H. Ray, J. E. Wilson, M. Norman, J. S. Ren, R. M. Esnouf, E. F. Garman, E. Y. Jones, and D. I. Stuart, J. Mol. Biol. 242, 586 (1994).
46. T. Unge, S. Knight, R. Bhikhabbai, S. Lovgren, Z. Dauter, K. Wilson, and B. Strandberg, Structure 2, 953 (1994).
47. D. W. Rodgers, S. J. Gamblin, B. A. Harris, S. Ray, J. S. Culp, B. Hellmig, D. J. Woolf, C. Debouck, and S. C. Harrison, Proc. Natl. Acad. Sci. U.S.A. 92, 1222 (1995).
48. J. Ren, et al., Nature Struct. Biol. 2, 293 (1995).
49. Z. Hostomsky, Z. Hostomska, T. Fu, and J. Taylor, J. Virol. 66, 3179 (1992).
50. H. Jonckheere, J. M. Taymans, J. Balzarini, S. Velazquez, M. J. Camarasa, J. Desmyter, E. De Clerq, and J. Anne, J. Biol. Chem. 269, 26255 (1994).
51. W. S. Robinson, et al., Proc. Natl. Acad. Sci. U.S.A. 54, 137 (1965).
52. P. Duesberg, Proc. Natl. Acad. Sci. U.S.A. 60, 1511 (1968).
53. R. L. Erickson, J. Virol. 37, 124 (1969).
54. E. Erickson and R. L. Erickson, J. Virol. 254, (1971).
55. I. M. Verma, G. F. Meuth, E. Bromfeld, K. F. Manly, and D. Baltimore, Nature New Biol. 233, 131 (1971).
56. J. Leis and J. Hurwitz, J. Virol. 9, 130 (1972).
57. J. E. Dahlberg, R. C. Sawyer, J. M. Taylor, A. J. Faras, W. E. Levinson, H. M. Goodman, and J. M. Bishop, J. Virol. 13, 1126 (1974).
58. F. Harada, R. C. Sawyer, and J. E. Dahlberg, J. Biol. Chem. 250, 3487 (1975).
59. J. M. Taylor and R. Illmensee, J. Virol. 15, 553 (1975).
60. E. J. Arts and M. A. Wainberg, Adv. Virus Res. 46, 9 (1996).
61. J. G. Levin and J. G. Seidman, J. Virol. 29, 328 (1979).
62. G. G. Peters and J. Hu, J. Virol. 26, 692 (1980).
63. L. Kleiman, S. Caudry, F. Boulerice, M. A. Wainberg, and M. A. Parniak, Biochem. Biophys. Res. Commun. 174, 1272-1280 (1991).
64. M. Jiang, J. Mak, A. Ladha, E. Cohen, M. Klein, B. Rovinski, and L. Kleiman, J. Virol. 67, 3246 (1993).
65. L. C. Waters and B. C. Mullin, Transfer RNA in RNA tumor viruses. Prog. Nucleic Acid Res. Mol. Biol. 20, 131 (1977).
66. J. Mak, M. Jiang, M. A. Wainberg, M. L. Hammarskjold, D. Rekosh, and L. Kleiman, J. Virol. 68, 2069 (1994).
67. R. D. Berkowitz, J. Luban, and S. P. Goff, J. Virol. 67, 7190 (1993).
68. Y. Huang, J. Mak, Q. Gao, Z. Li, M. A. Wainberg, and L. Kleiman, J. Virol. 68, 7676 (1994).
69. T. Nagashunmugam, A. Velpandi, C. S. Goldsmith, S. R. Zaki, S. Kalyanaraman, and A. Srinivasan, Proc. Natl. Acad. Sci. U.S.A. 89, 4114 (1992).
70. J. K. Wakefield, H. Rhim, and C. D. Morrow, J. Virol. 68, 1605 (1994).
71. A. E. Das and B. Berkhout, Nucleic Acids Res. 23, 1319 (1995).
72. A. E. Das, B. Klaver, and B. Berkhout, J. Virol. 69, 3090 (1995).
73. X. Li, M. Johnson, E. J. Arts, Z. Gu, L. Kleiman, M. A. Wainberg, and M. A. Parniak, J. Virol. 68, 6198 (1994).
74. H. Rhim, J. Park, and C. D. Morrow, J. Virol. 65, 4555 (1991).
75. J. M. Whitcomb, B. A. Ortiz-Conde, and S. H. Hughes, J. Virol. 69, 6228 (1995).
76. J. Colicelli and S. P. Goff, J. Virol. 57, 37 (1987).
77. A. H. Lund, M. Duch, J. Lovmand, P. Jorgensen, and F. S. Pederson, J. Virol. 67, 7125 (1993).
78. J. K. Wakefield, A. G. Wolf, and C. D. Morrow, J. Virol. 69, 602 (1995).
79. C. Barat, V. Lullien, O. Schatz, G. Keith, M. T. Nugeyre, F. Gruninger-Leitch, F. Barre-Sinoussi, S. F. J. Le Grice, and J.-L. Darlix, EMBO J. 8, 3279 (1989).
80. C. Barat, S. F. J. Le Grice, and J. L. Darlix, Nucleic Acids Res. 19, 751 (1991).
81. C. Barat, O. Schatz, S. F. J. Le Grice, and J.-L. Darlix, J. Mol. Biol. 231, 185 (1993).
82. N. J. Richter-Cook, K. J. Howard, N. M. Cirino, B. M. Wohrl, and S. F. J. Le Grice, J. Biol. Chem. 267, 15952 (1992).
83. L. Sarih-Cottin, B. Bordier, D. Musier-Forsyth, A. L. Andreola, P. J. Barr, and S. Litvak, J. Mol. Biol. 226, 1 (1992).
84. L. A. Kohlstaedt and T. A. Steitz, Proc. Natl. Acad. Sci. U.S.A. 89, 9652 (1992).
85. C. Isel, R. Marquet, G. Keith, C. Ehresmann, and B. Ehresmann, J. Biol. Chem. 268, 25269 (1993).
86. B. Berkhout and I. Schonveld, Nucleic Acids Res. 21, 1171 (1993).
87. J. B. Keeney, K. B. Chapman, V. Lauerman, D. F. Voytas, S. U. Astrom, U. von Pawel-Rammingen, A. Bystrom, and J. Boeke, Mol. Cell. Biol. 15, 217 (1995).
88. M. Wilhelm, F. X. Wilhelm, G. Keith, B. Ajoutin, and T. Heyman, Nucleic Acids Res. 22, 4560 (1994).
89. S. Friant, T. Heyman, M. L. Wilhelm, and F. X. Wilhelm, Nucleic Acids Res. 24, 441 (1996).
90. E. J. Arts, X. Li, Z. Gu, L. Kleiman, M. A. Parniak, and M. A. Wainberg, J. Biol. Chem. 269, 14672 (1994).
91. C. Isel, J. M. Lanchy, S. F. J. Le Grice, C. Ehresmann, B. Ehresmann, and R. Marquet, EMBO J. 15, 917 (1996).
92. J. K. Wakefield, S. Kang, and C. D. Morrow, J. Virol. 70, 966 (1996).
93. R. Marquet, C. Isel, C. Ehresmann, and B. Ehresmann, Biochimie 77, 113 (1995).
94. M. Hottinger, K. Gramatikoff, O. Georgiev, C. Chaponnier, W. Schaffner, and U. Hubscher, Nucleic Acids Res. 23, 736 (1995).
95. W. Levinson, J. M. Bishop, N. Quintrell, and J. Jackson, Nature (London) 227, 1023 (1970).
96. N. Biswal, B. McCain, and M. Bensyesh-Melnick, Virology 45, 697 (1971).
97. E. J. Arts, J. Mak, L. Kleiman, and M. A. Wainberg, J. Gen. Virol. 75, 1605 (1994).
98. F. Lori, F. Di Marzo Veronese, A. L. DeVico, P. Lusso, M. S. Reitz, and R. C. Gallo, J. Virol. 66, 5067 (1992).
99. D. Trono, J. Virol. 66, 4893 (1992).
100. H. Zhang, Y. Zhang, T. P. Spicer, L. Z. Abbott, and B. J. Poiesz, AIDS Res. Hum. Retrovir. 9, 1287 (1993).
101. H. Zhang, G. Dornadula, Y. Wu, D. Havlir, D. D. Richman, and R. J. Pomerantz, J. Virol. 70, 628 (1996).
102. R. Geleziunas, E. J. Arts, F. Boulerice, and M. A. Wainberg, Antimicrob. Agents Chemother. 37, 1305 (1993).
103. E. J. Arts and M. A. Wainberg, Antimicrob. Agents Chemother. 38, 1008 (1994).
104. E. J. Arts, J. P. Marois, X. Gu, S. F. J. Le Grice, and M. A. Wainberg, J. Virol. 70, 712 (1996).
105. M. Gotte, S. Fackler, T. Hermann, E. Perola, L. Cellai, H. J. Gross, S. F. J. Le Grice, and H. Heumann, EMBO J. 14, 833 (1995).
106. L. Zawel and D. Reinberg, Annu. Rev. Biochem. 64, 533 (1995).
107. M. Salas, J. T. Miller, J. Leis, and M. L. DePamphilis, in “DNA Replication in Eukaryotic Cells” (M. L. DePamphilis, ed.), p. 131. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
108. T. W. Dreher, C. H. Tsai, C. Florentz, and R. Geige, Biochemistry 31, 9183 (1992).
109. R. Geige, C. Florentz, and T. W. Dreher, Biochimie 75, 569 (1993).
110. G. H. Wang and C. Seeger, Cell 71, 663 (1992).
111. R. Khan and D. P. Giedroc, J. Biol. Chem. 267, 6689 (1992).
112. J. W. Rausch, E. J. Arts, B. M. Wohrl, and S. F. J. Le Grice, J. Mol. Biol. 257, 500 (1996).
113. P. S. Jacques, B. M. Wohrl, K. J. Howard, and S. F. J. Le Grice, J. Biol. Chem. 269, 1388 (1994).
114. B. M. Wohrl, B. Ehresmann, G. Keith, and S. F. J. Le Grice, J. Biol. Chem. 268, 13617 (1993).
115. Y. Mishima and J. Steitz, EMBO J. 14, 2679 (1995).
116. M. S. Collett, J. Leis, M. S. Smith, and A. J. Faras, J. Virol. 26, 6961 (1978).
117. S. F. J. Le Grice, J. Mills, R. Ette, and J. Mous, J. Biol. Chem. 264, 14902 (1989).
118. N. Cheng, G. R. Painter, and P. A. Furman, Biochem. Biophys. Res. Commun. 174, 785 (1991).
119. R. W. Sobol, R. J. Suhadolnik, A. Kumar, B. J. Lee, D. L. Hatfield, and S. H. Wilson, Biochemistry 30, 10623 (1991).
120. A. Basu, K. K. Ahluwalia, S. Basu, and M. Modak, Biochemistry 31, 616 (1992).
121. N. Sheng and D. Dennis, Biochemistry 32, 4938 (1993).
122. R. G. Nanni, J. Ding, A. Jacobo-Molina, S. H. Hughes, and E. Arnold, Perspect. Drug Discovery Design 1, 129 (1993).
123. B. M. Wohrl, C. Tantillo, E. Arnold, and S. F. J. Le Grice, Biochemistry 34, 5343 (1995).
124. T. Hermann, T. Meier, M. Gotte, and H. Heumann, Nucleic Acids Res. 22, 4625 (1994).
125. W. Metzger, T. Hermann, O. Schatz, S. F. J. Le Grice, and H. Heumann, Proc. Natl. Acad. Sci. U.S.A. 90, 5909 (1993).
126. P. L. Boyer, C. Tantillo, A. Jacobo-Molina, R. G. Nanni, J. Ding, E. Arnold, and S. H. Hughes, Proc. Natl. Acad. Sci. U.S.A. 91, 4882 (1994).
127. B. M. Wohrl, M. Georgiadis, A. Telesnitsky, W. A. Hendrickson, and S. F. J. Le Grice, Science 267, 96 (1995).
128. H. Pelletier, M. R. Sawaya, A. Kumar, S. H. Wilson, and J. Kraut, Science 264, 1891 (1994).
129. J. K. Wakefield, S. A. Jablonski, and C. D. Morrow, J. Virol. 66, 6806 (1992).
130. O. Schatz, F. Cromme, F. Grüninger-Leitch, and S. F. J. Le Grice, FEBS Lett. 257, 311 (1989).
131. O. Schatz, F. Cromme, T. Naas, D. Lindemann, J. Mous, and S. F. J. Le Grice, in “Oncogenesis and AIDS” (T. Papas, eds.), p. 304. Portfolio Publishing, Houston, TX, 1990.
132. V. Mizrahi, M. T. Usdin, A. Harington, and L. R. Dudding, Nucleic Acids Res. 18, 5359 (1990).
133. M. Tisdale, T. Schulze, B. A. Larder, and K. Moelling, J. Gen. Virol. 72, 59 (1991).
134. B. M. Wohrl, S. Volkmann, and K. Moelling, J. Mol. Biol. 220, 801 (1991).
135. L. R. Dudding, N. C. Nkabinde, and V. Mizrahi, Biochemistry 30, 10498 (1991).
136. P. S. Jacques, B. M. Wohrl, M. Ottmann, J. L. Darlix, and S. F. J. Le Grice, J. Biol. Chem. 269, 26472 (1994).
137. O. Poch, I. Sauvaget, M. Delarue, and N. Tordo, EMBO J. 8, 3867 (1989).
138. Y. Xiong and T. H. Eickbush, EMBO J. 9, 3353 (1990).
139. M. Ghosh, P. S. Jacques, D. Rodgers, M. Ottmann, J. L. Darlix, and S. F. J. Le Grice, Biochemistry 35, 8553 (1996).
140. G. Divita, T. Restle, R. S. Goody, J.-C. Chermann, and J. G. Baillon, J. Biol. Chem. 269, 13080 (1994).
141. G. Divita, J. G. Baillon, K. Rittinger, J. C. Chermann, and R. S. Goody, J. Biol. Chem. 270, 28642 (1995).
142. S. J. Smerdon, J. Jager, J. Wang, L. A. Kohlstaedt, A. J. Chirino, J. M. Friedman, P. A. Rice, and T. A. Steitz, Proc. Natl. Acad. Sci. U.S.A. 91, 3911 (1994).
143. W. A. Beard, S. J. Stahl, H.-R. Kim, K. Bebenek, A. Kumar, M.-E Strub, S. P. Becerra, T. A. Kunkel, and S. H. Wilson, J. Biol. Chem. 269, 28091 (1994).
144. K. Bebenek, W. A. Beard, J. R. Casas-Finet, H.-R. Kim, T. Darden, S. H. Wilson, and T. A. Kunkel, J. Biol. Chem. 270, 19516 (1995).
145. W. A. Beard, D. T. Minnick, C. L. Wade, R. Prasad, R. L. Won, A. Kumar, T. A. Kunkel, and S. H. Wilson, J. Biol. Chem. 271, 12213 (1996).
146. B. A. Larder and S. D. Kemp, Science 246, 1155 (1989).
147. Z. Gu, Q. Gao, H. Fang, H. Solomon, M. A. Parniak, and M. A. Wainberg, Antimicrob. Agents Chemother. 38, 275 (1994).
148. J. E. Fitzgibbon, R. M. Howell, C. A. Habenettl, S. J. Sperber, D. J. Gocke, and D. T. Dubin, Antimicrob. Agents Chemother. 36, 153 (1992).
149. M. H. St. Clair, J. L. Martin, G. Tudor-Williams, M. C. Bach, C. L. Vavro, D. M. King, P. Pellam, S. D. Kemp, and B. A. Larder, Science 253, 1557 (1991).
150. B. Kim and L. A. Loeb, Proc. Natl. Acad. Sci. U.S.A. 92, 684 (1995).
151. B. Kim and L. A. Loeb, J. Virol. 69, 6563 (1995).
152. B. Kim, T. R. Hathaway, and L. A. Loeb, J. Biol. Chem. 271, 4872 (1996).
153. M. A. Wainberg, W. C. Drosopoulos, H. Salomon, M. Hsu, G. Borkow, M. A. Parniak, Z. Gu, Q. B. Song, J. Manne, S. Islam, G. Castriota, and V. R. Prasad, Science 271, 1282 (1996).
154. V. N. Pandey, N. Kaushik, N. Rege, S. G. Sarafianos, P. N. S. Yadav, and M. J. Modak, Biochemistry 35, 2168 (1996).
155. W. M. Kati, K. A. Johnson, L. F. Jerva, and K. S. Anderson, J. Biol. Chem. 267, 25988 (1992).
156. M. Bakhanashvili and A. Hizi, Biochemistry 31, 9393 (1992).
157. K. Katayanagi, M. Miyagawa, N. Matsushima, M. Ishikawa, S. Kanaya, M. Ikehara, T. Matsuzaki, and K. Morikawa, Nature (London) 347, 306 (1990).
158. W. Yang, W. A. Hendrickson, R. J. Crouch, and Y. Satow, Science 249, 1398 (1990).
159. J. F. Davies, Z. Hostomska, Z. Hostomsky, S. R. Jordan, and D. A. Matthews, Science 252, 88 (1991).
160. S. Kanaya, C. Katsuda-Nakai, and M. Ikehara, J. Biol. Chem. 266, 11621 (1991).
161. S. Kanaya, A. Kohara, Y. Miura, A. Sekiguchi, S. Iwai, H. Inoue, E. Ohtsuka, and M. Ikehara, J. Biol. Chem. 265, 4615 (1990).
162. S. P. Becerra, G. M. Clore, A. M. Gronenborn, A. R. Karlstrom, S. J. Stahl, S. H. Wilson, and P. T. Wingfield, FEBS Lett. 270, 76 (1990).
163. Z. Hostomsky, Z. Hostomska, G. O. Hudson, E. W. Moomaw, and B. R. Nodes, Proc. Natl. Acad. Sci. U.S.A. 88, 1148 (1991).
164. N. M. Cirino, R. C. Kalayjian, J. E. Jentoft, and S. F. J. Le Grice, J. Biol. Chem. 268, 14743 (1993).
165. D. B. Evans, K. Brawn, M. T. Diebel, Jr., W. G. Tarpley, and S. K. Sharma, J. Biol. Chem. 266, 20583 (1991).
166. J. S. Smith and M. J. Roth, J. Virol. 67, 403 (1993).
167. M. Ghosh, K. J. Howard, C. E. Cameron, S. J. Benkovic, S. H. Hughes, and S. F. J. Le Grice, J. Biol. Chem. 271, 7068 (1995).
168. S. P. Becerra, A. Kumar, M. S. Lewis, S. G. Widen, J. Abbotts, E. M. Karawya, S. H. Hughes, J. Shiloach, and S. H. Wilson, Biochemistry 30, 11707 (1991).
169. H. Ben-Artzi, E. Zeelon, S. F. J. Le Grice, M. Gorecki, and A. Panet, Nucleic Acids Res. 20, 5115 (1992).
170. S. W. Blain and S. P. Goff, J. Biol. Chem. 268, 23585 (1993).
171. Z. Hostomsky, S. H. Hughes, S. P. Goff, and S. F. J. Le Grice, J. Virol. 68, 1970 (1994).
172. M. Deutscher, J. Biol. Chem. 268, 13011 (1993).
173. E. S. Furfine and J. E. Reardon, J. Biol. Chem. 266, 406 (1991).
174. E. S. Furfine and J. E. Reardon, Biochemistry 30, 7041 (1991).
175. V. Gopalakrishnan, J. A. Peliska, and S. J. Benkovic, Proc. Natl. Acad. Sci. U.S.A. 89, 10763 (1992).
176. J. J. DeStefano, Nucleic Acids Res. 23, 3901 (1995).
177. J. J. DeStefano, R. G. Buiser, L. M. Mallaber, T. W. Myers, R. A. Bambara, and P. J. Fay, J. Biol. Chem. 266, 7423 (1991).
178. J. J. DeStefano, L. M. Mallaber, P. J. Fay, and R. A. Bambara, Nucleic Acids Res. 21, 4330 (1993).
179. J. J. DeStefano, L. M. Mallaber, P. J. Fay, and R. A. Bambara, Nucleic Acids Res. 22, 3793 (1994).
180. J. Hansen, T. Schulze, W. Mellert, and K. Moelling, EMBO J. 7, 239 (1988).
181. O. Schatz, J. Mous, and S. F. J. Le Grice, EMBO J. 9, 1171 (1990).
182. X. Zhan, C. K. Tan, W. A. Scott, A. M. Mian, K. M. Downey, and A. G. So, Biochemistry 33, 1366 (1994).
183. C. Palaniappan, G. M. Fuentes, L. Rodriguez-Rodriguez, P. J. Fay, and R. A. Bambara, J. Biol. Chem. 271, 2063 (1996).
184. K. Post, J. Guo, E. Kalman, T. Uchida, R. J. Crouch, and J. G. Levin, Biochemistry 32, 5508 (1993).
185. L. R. Boone and A. M. Skalka, in “Reverse Transcriptase” (A. M. Skalka and S. P. Goff, eds.), p. 119. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
186. W. I. Finston and J. J. Champoux, J. Virol. 51, 26 (1984).
187. A. J. Rattray and J. J. Champoux, J. Mol. Biol. 208, 445 (1989).
188. H. E. Huber and C. C. Richardson, J. Biol. Chem. 265, 10565 (1990).
189. B. M. Wohrl and K. Moelling, Biochemistry 29, 10141 (1990).
190. J. J. Champoux, E. Gilboa, and D. Baltimore, J. Virol. 49, 686 (1984).
191. K. A. Pullen and J. J. Champoux, J. Virol. 64, 6274 (1990).
192. K. A. Pullen, L. K. Ishimoto, and J. J. Champoux, J. Virol. 66, 367 (1992).
193. J. S. Smith and M. J. Roth, J. Biol. Chem. 267, 15071 (1992).
194. S. J. Schultz, S. H. Whiting, and J. J. Champoux, J. Biol. Chem. 270, 24135 (1995).
195. M. D. Powell and J. G. Levin, J. Virol. 70, 5288 (1996).
196. G. Luo and J. Taylor, J. Virol. 64, 592 (1990).
197. J. A. Peliska and S. J. Benkovic, Science 258, 1112 (1992).
198. J. A. Peliska and S. J. Benkovic, Biochemistry 33, 3890 (1994).
199. N. Tanese and S. P. Goff, J. Virol. 65, 4387 (1991).
200. A. Telesnitsky, S. W. Blain, and S. P. Goff, J. Virol. 66, 615 (1992).
201. B. Berkhout and B. Klaver, J. Gen. Virol. 76, 845 (1995).
202. L. Rodriguez-Rodriguez, Z. Tsuchihashi, G. M. Fuentes, R. A. Bambara, and P. J. Fay, J. Biol. Chem. 270, 15005 (1995).
203. J. J. Champoux, in “Reverse Transcriptase” (A. M. Skalka and S. P. Goff, eds.), p. 103. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1993.
203a. J. W. Rausch and S. F. J. Le Grice, J. Biol. Chem. 272, 8602 (1997).
204. K. A. Pullen, A. J. Rattray, and J. J. Champoux, J. Biol. Chem. 268, 6221 (1993).
205. E. H. Bowman, V. K. Pathak, and W. S. Hu, J. Virol. 70, 687 (1996).
206. L. J. Hansen and S. B. Sandemeyer, J. Virol. 64, 2599 (1990).
207. T. Heymann, B. Agoutin, S. Friant, F. X. Wilhelm, and M. L. Wilhelm, Nucleic Acids Res. 253, 291 (1995).
207a. M. D. Powell, M. Ghosh, P. S. Jacques, S. F. J. Le Grice, and J. G. Levin, J. Biol. Chem. 272, 13262 (1997).
208. H. E. Blum, J. D. Harris, P. Ventura, D. Walker, K. Staskus, E. Rebel, and A. T. Haase, Virology 142, 270 (1985).
209. J. J. Kupiec, J. Tobaly-Tapiero, M. Canivet, M. Santillana-Hayat, R. M. Flugel, J. Pris, and R. Emanoil-Ravier, Nucleic Acids Res. 16, 9557 (1988).
210. O. Hungnes, E. Tjotta, and B. Grinde, Virology 190, 440 (1992).
211. M. D. Miller, B. Wang, and F. D. Bushman, J. Virol. 69, 3938 (1995).
212. P. Charneau, G. Miranbeau, P. Roux, S. Paulos, H. Buc, and F. Clavel, J. Mol. Biol. 241, 651 (1994).
213. P. J. Hagerman, Nature (London) 321, 449 (1986).
214. D. S. Koo, H. M. Wu, and D. M. Crothers, Nature (London) 320, 501 (1986).
215. H. C. M. Nelson, J. T. Finch, B. F. Luisi, and A. Klug, Nature (London) 330, 221 (1987).
Index
A
18A2, identification as late response gene, 50
28H6, identification as delayed-early response gene, 50
A23187
  double-stranded RNA-activated protein kinase activation, 106-107
  effect on protein translation, 94-95
ABC transporter, see ATP-binding cassette transporter
Ace1
  A-T hook motif, 181-182
  copper binding stoichiometry, 178
  DNA binding, 174-175
  extended X-ray absorption fine structure analysis, 178-180
  mediation of copper activation of genes, 174, 186-188
  polycopper-thiolate cluster, 177-180
  structure, 174-176
  tetracopper domain, structure and function, 182-185
  transactivation domain, 182
  zinc binding, 180-181
Actin genes, see cl-15; pME
Aft1, mediation of iron repression of genes, 190
Amt1
  A-T hook motif, 181-182
  copper binding stoichiometry, 178
  extended X-ray absorption fine structure analysis, 178-180
  mediation of copper activation of genes, 175
  polycopper-thiolate cluster, 177-180
  structure, 175-176
  tetracopper domain, structure and function, 182-185
  transactivation domain, 182
  zinc binding, 180-181
Arabis mosaic virus hairpin ribozyme, see Hairpin ribozyme
Arsenite ion
  double-stranded RNA-activated protein kinase activation, 107
  inhibition of translation, 97-98, 101-102
ATH1, see Trehalase
ATP-binding cassette (ABC) transporter
  energetics, 241, 243
  physiological roles of peptide transport, 240
  structure, 241
  substrate specificity, 241, 243
B
BiP, see GRP78
  calcium content, 85
  protein processing and chaperones, 85-86
C
Cadmium, inhibition of translation, 98
Calcium
  depletion and double-stranded RNA-activated protein kinase activation, 106-107
  endoplasmic reticulum stress response signaling, 80-81, 84-86, 92-95, 112-113
  sensing in cells, 167
  translation initiation role, 92-95
Calmodulin-dependent cyclic nucleotide phosphodiesterase, calcium depletion effects on stability, 80-81
CDC1, mutation effects on double-strand break-induced recombination, 294
Cell cycle, control in mouse fibroblasts, 44, 61, 65-70
c-Fos
  cell cycle control in mice, 65
  identification as immediate-early response gene, 52
c-Ha-ras, cell cycle control in mice, 67-68
Chicory yellow mottle virus type 1 hairpin ribozyme, see Hairpin ribozyme
CHL1 gene
  cloning, 246
  hydrophobicity analysis, 257-258
  sequence homology with other transporters, 248, 250-256
  structure, 248-249
c-Jun, cell cycle control in mice, 65
cl-15, identification as immediate-early response gene, 52, 58
Clp proteins, protein folding role, 307, 331
c-Myc, cell cycle control in mice, 65-66
Copper ion sensing
  Chlamydomonas reinhardtii, 168-169
  Enterococcus hirae, 168
  Escherichia coli, 168
  importance to organism, 166-167
  mammalian cells, 169-170, 191
  Pseudomonas, 168
  yeast
    copper-induced genes, 170-174
    copper-requiring enzymes, 170
    human disease correlation, 169-170
    Mac1 sensing, 171, 188-189
    mediation of copper activation of genes
      Ace1, 174-176, 186-188, 191
      Amt1, 175-176, 191
      Crf1, 177
      functional domains of trans-acting factors, 180-185
      Lpz8, 177
      mechanism, 185-188
      polycopper-thiolate cluster in trans-acting factors, 177-180, 182-185
    transporters, 170
Crf1, mediation of copper activation of genes, 177
CRS5, induction by copper, 171-174, 186
CTR1, repression by copper, 188-189
CUP1 gene
  induction by copper, 171-174, 186-188
  polycopper-thiolate cluster, 180
Cyclic nucleotide phosphodiesterase, see Calmodulin-dependent cyclic nucleotide phosphodiesterase
Cyclin D1, cell cycle control in mice, 68
Cyr61, identification as immediate-early response gene, 51
D
Delayed-early response genes, see also specific genes
  differential hybridization screening in identification, 49-57
  transcriptional regulation, 47-48
Dithiothreitol, effect on protein translation, 95, 98
Double-strand break-induced recombination
  DNA damaging techniques, 264-265, 278
  experimental approaches in study, 265-266, 278
  meiotic recombination
    biological functions, 288
    hot spots, 289-290
    mechanisms, 288-290
  mitotic recombination
    biological functions, 277, 287-288
    events, in vitro studies
      Saccharomyces cerevisiae, 278-279
      Schizosaccharomyces pombe, 279-280
      Ustilago maydis, 280
      Xenopus, 280
    events, in vivo studies
      Saccharomyces cerevisiae, 281-284
      Schizosaccharomyces pombe, 284-285
      Xenopus, 285-287
    genetic control
      Saccharomyces cerevisiae, 291-294
      Schizosaccharomyces pombe, 294-295
    P-element transposition in Drosophila, 288
    V(D)J recombination, 277, 288
  models
    double-strand break-gap repair model, 269, 285, 291
    one-sided invasion model, 269, 271
    postmeiotic segregation, 266-267
    Radding's model, 267
    Resnick's model, 267, 271
    single-strand annealing model, 276-277, 282-283, 291
    stages, 277
synthesis-dependent strand-annealing model, 271, 276 recombination types, 264 Double-stranded RNA-activated protein kinase (PKR) activation, 104, 106-108 calcium depletion and activation, 106-107 domains, 104 eIF-2α phosphorylation, 103-108 growth control and tumor suppression, 105-106 induction by interferons, 103 viral evasion, 105 DtpT gene cloning, 246-247 hydrophobicity analysis, 257-258 sequence homology with other transporters, 248, 250-256 structure, 248, 250
E
eEF-2, phosphorylation, 109-110 EGF, see Epidermal growth factor Egr genes cell cycle control in mice, 66 identification as immediate-early response genes, 52 EGTA, effect on protein translation, 92-96 eIF-2α, phosphorylation kinases double-stranded RNA-activated protein kinase, 103-108 GCN2, 102 HRI, 102-103 stressor response, 91-92, 95-97, 101-102 Endoplasmic reticulum (ER) functions, 84 structure, 83-84 Endoplasmic reticulum stress response calcium signaling, 80-81, 84-86, 92-95, 112-113 glucose-regulated proteins, see also GRP78 gene structure, 86-87 types, 86 relationship with heat shock response, 114-116 translational accommodation depletion of calcium stores, 110-111 GRP78 role, 111-112 herpes simplex virus 1 infection and translational tolerance, 115 physiological relevance, 115-116 signaling, 112-113 unfolded protein response in yeast, 88-89 Epidermal growth factor (EGF) inducible genes in mouse fibroblasts, 58 receptor, 58 ER, see Endoplasmic reticulum EXAFS, see Extended X-ray absorption fine structure Extended X-ray absorption fine structure (EXAFS) Ace1, 178-180 Amt1, 178-180 Cup1, 180
F
Fatty acid, β-oxidation enzymes, see Medium-chain acyl-CoA dehydrogenase mitochondria pathway, 310 FGF-1, see Fibroblast growth factor-1 Fibroblast growth factor-1 (FGF-1) inducible genes in mouse fibroblasts, 59-60 receptor and signal transduction, 42, 59 Fisp-12/connective tissue growth factor, identification as immediate-early response gene, 51 Fit-1, identification as immediate-early response gene, 55 Fos-B, cell cycle control in mice, 65 FR-1, identification as delayed-early response gene, 60 FRE1, repression by copper, 188-189
G
GCN2, eIF-2α phosphorylation, 102 Gene therapy, see Hairpin ribozyme; Human immunodeficiency virus type 1
GroEL, protein folding role, 306-308, 314, 316, 331 GroES, protein folding role, 306-308, 314, 316 GRP78 ADP ribosylation, 90, 119 calcium effects on expression, 81, 87 gene structure, 86-87 induction signaling, 88-89, 112-113 messenger RNA structure, 87 oligomerization, 89, 119 partner proteins, 89 phosphorylation, 90 translational accommodation in stress response, 111-112 translation initiation role, 117-119
H
Hairpin ribozyme arabis mosaic virus, 2, 4, 33-35 chicory yellow mottle virus type 1, 2, 4, 33-37 gene therapy catalytic optimization, 17-18, 35-36 delivery systems autolytic hairpin cassette, 18-19 promoters, 19, 22-24, 31-33 helix 1 length optimization, 16-17, 36 human immunodeficiency virus type 1, see Human immunodeficiency virus type 1 targeting rules, 15-16 target site selection, 16-17 minimum catalytic center, 3, 5 structure secondary structure, 3, 10, 13, 36 three-dimensional modeling, 14 tobacco ringspot virus catalytic center, 5 criteria for catalytic reaction, 7 discovery, 2, 5 kinetic properties, 7 metal dependence, 7 mutation effects on activity, 10-11, 13 pH effects, 7, 10 substrate specificity, 3-7, 14, 34 temperature optimum, 7
Heat shock response heat shock proteins, see also HSC70; HSP60; HSP70 induction, 87 messenger RNA structure, 83 types, 83 inducers, 82-83 relationship with endoplasmic reticulum stress response, 114-116 translational accommodation overview, 113-114 physiological relevance, 115-116 trehalase role, 219, 222-223, 226-228 trehalose accumulation in yeast, 204-205 Herpes simplex virus 1 (HSV-1), infection and translational tolerance to stress, 115 HIV-1, see Human immunodeficiency virus type 1 HLH462, identification as immediate-early response gene, 51 HO endonuclease, double-strand break-induced recombination role, 281-284, 287 HRI eIF-2α phosphorylation, 102-103 heat shock protein association, 103 heme regulation, 102 HSC70 oligomerization, 89 partner proteins, 89 translational accommodation, 113-114 Hsp60, protein folding role, 305-306, 308, 318-319, 331 Hsp70 oligomerization, 89 partner proteins, 89 protein folding role, 305-308, 318, 327, 331 HSV-1, see Herpes simplex virus 1 Human immunodeficiency virus reverse transcriptase α-helix H mutagenesis studies, 368 biogenesis, 344 deoxynucleotide binding and inhibition, 368-369 DNA footprinting, 364-366, 371, 373 drug targeting, 340-341, 361 fidelity and mutation effects, 369 polypurine tract primer and second-strand synthesis
central polypurine tract and central termination sequences of lentiviruses, 384, 386 reverse transcriptase mutations affecting selection and extension, 383-384 selection and initiation, 383 primer grip motif and mutagenesis effects, 361-363, 366-368 replication cycle of virus, 341, 343-344 retrovirus replication role, 340-341, 343-344 ribonuclease H domain footprinting analysis, 371, 373 polymerization-dependent activity, 374, 376 polymerization-independent activity and strand transfer, 374, 376-377, 379-380 replication role, 376-377 ribonuclease H* activity, 373-374 structure, 370-371 structural homology with other reverse transcriptases, 341 template grip motif, 361-363 three-dimensional structure, 344, 346 tRNALys,3
genomic interactions, 347-349 initiation of (-) strand DNA synthesis, 353-355 p66-p51 mediation of primed events, 355-356, 359 packaging into viral particles, 346-349 PBS duplex recognition by heterologous reverse transcriptases, 360-361, 386 viral interactions outside the PBS, 349-350, 352-353 Human immunodeficiency virus type 1 (HIV-1), gene therapy with hairpin ribozymes criteria for antiviral activity, 20-21 delivery systems autolytic hairpin cassette, 18-19 promoters, 19, 22-23, 31-33 human clinical trials, 31, 37 5' leader target catalytic efficiency, 22 inhibition of virus expression by ribozymes diverse strain inhibition, 27
stable transfectants, 22-23, 26-27 T lymphocytes, 26-27 transient transfectants, 24-26 promoters Moloney murine leukemia virus, 25-26 mouse mammary tumor virus, 22-24 polymerase II β-actin, 24-26 sequence, 21 stability of target site, 22-24 nef region target, 35 protease region of pol gene as target catalytic efficiency, 28-29 inhibition of virus expression, in vivo, 29-31 sequence, 29 stability of target site, 29 simultaneous delivery of two ribozymes and virus inhibition, 29-31 specificity of cleavage, 4 target site selection, 16-17
I
I-Sce-I, double-strand break-induced recombination role, 281, 285-286 IGF-1, see Insulin-like growth factor-1 Immediate-early response genes, see also specific genes classification, 44 differential hybridization screening in identification, 49-57 messenger RNA half-lives, 45-46 superinduction with protein synthesis inhibitors, 46-47 transcriptional regulation, 44-45 Insulin-like growth factor-1 (IGF-1) inducible genes in mouse fibroblasts, 58-59 receptor and signal transduction, 58 β1-Integrin, identification as immediate-early response gene, 55 Ionomycin, effect on protein translation, 94-95, 98, 101 Ire1p, sensing of unfolded proteins, 88 Iron homeostasis, 189-190 iron-sulfur clusters, 190
J
Jun, see c-Jun; Jun-B Jun-B, identification as immediate-early response gene, 51
L
L32, identification as delayed-early response gene, 56 LacI, see Lactose repressor protein Lactate dehydrogenase, identification as delayed-early response gene, 56-57 Lactose repressor protein (LacI) amino acid modification studies, 147-148 applications in recombinant gene expression, 155-156 conformational change studies, 149, 154-155 DNA binding amino acid residues involved in binding, 139 kinetics, 138 loop formation, 137-138 nonspecific binding, 134 operator sequence identification, 134-135 searching for operator by repressor, 136-137 thermodynamics, 138-139 domain structure, 130-131 inducer binding amino acid residues involved in binding, 141-142 kinetics, 140-141 sugars, 140 thermodynamics, 140-141 lac enzyme expression, overview, 128-130, 166 mutagenesis studies dimer assembly, 143 DNA binding, 142-143, 146 inducer binding, 142-143 tetramer assembly, 133, 143, 146 tetrameric protein assembly shape of structure, 131, 133 subunit affinity, 133-134 three-dimensional structure conformational change, 154-155
core domain, 150-151, 153-154 N-terminal domain, 149-151 nuclear magnetic resonance, 149-151 subunit interface, 154 X-ray crystallography, 149-151, 153-154 tryptophan fluorescence, 148 Late response genes, see also specific genes differential hybridization screening in identification, 49-57 transcriptional regulation, 47-48 Lon protease, degradation of misfolded proteins, 330-331 Lpz8, mediation of copper activation of genes, 177
M
Mac1, copper ion sensing, 171, 188-189 MCAD, see Medium-chain acyl-CoA dehydrogenase MCP-1, see Monocytic chemotactic and activating factor Medium-chain acyl-CoA dehydrogenase (MCAD) deficiency disease clinical presentation, 311-312 discovery, 311 gene mutations, 312-313, 325-326 incidence, 312 expression system selection for mutant protein studies, 327-328 flavin adenine dinucleotide effects on folding, 318-319 G170R mutant, 326, 329 gene family, 310-311 K304E mutant aggregation, 313 cell-free translation, 317-318 chaperonins complexes, 316-319 overexpression effects on folding and aggregation, 314, 316-317 charge replacement role in subunit interactions, site-directed mutagenesis studies, 321-324 degradation, 313-314, 330 expression in Escherichia coli, 313-314, 316, 319, 321
temperature effects on folding, 319, 321, 324 R28C mutant, 325-326 S311R mutant, 326 T168A mutant, 326 three-dimensional structure, 311 Menkes disease, gene, 169-170 Mitogen-activated protein kinase phosphatase (MKP) cell cycle control in mice, 69 identification as immediate-early response gene, 51-52 MKP, see Mitogen-activated protein kinase phosphatase Monocytic chemotactic and activating factor (MCP-1), identification as immediate-early response gene, 58 Myc, see c-Myc
N
N51, identification as immediate-early response gene, 53, 57 N65, identification as immediate-early response gene, 53, 57 NMR, see Nuclear magnetic resonance NTH1, see Trehalase NTH2, see Trehalase NTR1 gene cloning, 247 sequence homology with other transporters, 248, 250-256 structure, 248, 250 Nuclear magnetic resonance (NMR), lactose repressor protein structural studies, 149-151 Nup475, identification as immediate-early response gene, 51 Nur77 identification as immediate-early response gene, 51 knockout mice studies, 66
O
One-sided invasion model, see Double-strand break-induced recombination
P
P16, identification as immediate-early response gene, 53-54 p53, cell cycle control in mice, 67 PCR, see Polymerase chain reaction PDGF, see Platelet-derived growth factor P-element, transposition in Drosophila, 288 PEPT1 gene cloning, 244-245 hydrophobicity analysis, 257-258 sequence homology with other transporters, 248, 250-256 structure, 248-249 PEPT2 gene cloning, 244-245 hydrophobicity analysis, 257-258 sequence homology with other transporters, 248, 250-256 structure, 248-249 Peptidyl-prolyl cis-trans isomerase, protein folding role, 309 PKR, see Double-stranded RNA-activated protein kinase Platelet-derived growth factor (PDGF) inducible genes in mouse fibroblasts, 57-58 receptor and signal transduction, 42 structure, 57 pME,, identification as immediate-early response gene, 52, 58 Polo-like kinase, cell cycle control in mice, 68-69 Polymerase chain reaction (PCR) inducible gene identification by reverse transcription, 59-60 proton-coupled oligopeptide transporter gene identification by reverse transcription, 245 Postmeiotic segregation, see Double-strand break-induced recombination POT, see Proton-coupled oligopeptide transporter Primary response genes, see Immediate-early response genes Proliferin gene, see 28H6 Protein disulfide isomerase, function, 85 Protein folding chaperones Clp proteins, 307, 331
complexes with proteins, 316-319 families, 305 GroEL, 306-308, 314, 316, 331 GroES, 306-308, 314, 316 Hsp60, 305-306, 308, 318-319, 331 Hsp70, 305-308, 318, 327, 331 mitochondria, 305-309 overexpression effects on folding and aggregation, 314, 316-317 cofactor effects, 318-319 missense mutations diseases, 302-303, 310, 312-313, 325-326, 329 expression system selection for protein studies, 327-328 mutations affecting folding kinetics, 309-310 prediction of effects, 328 molten globule intermediates, 304 pathways, 303-304, 307-308, 331-332 peptidyl-prolyl cis-trans isomerase role, 309 quality control systems and degradation, 330-332 temperature effects, 319, 321, 324, 329 Proton-coupled oligopeptide transporter (POT) cloning strategies expression cloning, 244 functional complementation CHL1 gene cloning, 246 DtpT gene cloning, 246-247 NTR1 gene cloning, 247 principle, 245-246 PTR2 gene cloning, 247-248 homologous hybridization, 244-245 reverse transcription-polymerase chain reaction, 245 energetics, 241, 243 hydrophobicity analysis, 257-258 physiological roles of peptide transport, 240, 259 sequence homology of transporters, 248, 250-256 structural properties, 248-250 subgroups, 253-254 substrate specificity, 241, 243, 257, 259
Proton-coupled peptide transport, see ATP-binding cassette transporter; Proton-coupled oligopeptide transporter PTR2 gene cloning, 247-248 hydrophobicity analysis, 257-258 sequence homology with other transporters, 248, 250-256 structure, 248, 250
R
RAD3 epistatic group, mutation effects on double-strand break-induced recombination, 292-295 RAD52 epistatic group, mutation effects on double-strand break-induced recombination, 291-293 Radding's model, see Double-strand break-induced recombination Ras, see c-Ha-ras Recombination, see Double-strand break-induced recombination Resnick's model, see Double-strand break-induced recombination Reverse transcriptase, see Human immunodeficiency virus reverse transcriptase RFA1, mutation effects on double-strand break-induced recombination, 293-294 Ribozyme, see also Hairpin ribozyme discovery, 2, 5 types, 2
S
Secondary response genes, see Delayed-early response genes Single-strand annealing model, see Double-strand break-induced recombination SOD1, induction by copper, 171, 174, 186 SRF, cell cycle control in mice, 66-67 ST2R1, identification as immediate-early response gene, 55, 57 Stress response, see Endoplasmic reticulum stress response; Heat shock response
Synthesis-dependent strand-annealing model, see Double-strand break-induced recombination
T
Thapsigargin, effect on protein translation, 94-95, 98-99, 101 Tobacco ringspot virus hairpin ribozyme, see Hairpin ribozyme Translation initiation calcium role, 92-95 eIF-2 role, 91-92, 95-97, 101-102 eIF-4 complex role, 91-92 GRP78 role, 117-119 herpes simplex virus 1 infection and translational tolerance to stress, 115 inhibition by stressors arsenite ion, 97-98, 100-102 cadmium, 98 ionomycin, 94-95, 98, 101 mechanism, 97 thapsigargin, 98-99, 101 messenger RNA selection frequency, 91 phosphorylation of initiation factors, 91-92, 95-97, 101-108 reticulocyte lysates, 108-110 ribosome loading, 90-92, 119 Trehalase discovery, 199 inhibitors as insecticides, 232 renal disease marker, 232 sequence homology between species, 213, 218-221 yeast acid trehalase ATH1 gene, 201, 209 cloning of gene, 212-213 glycosylation, 208-209, 214 inhibitors, 208 mutation studies, 214, 225 pH optimum, 200-201, 207-208 size, 207-208 substrate specificity, 208 transport, 214-215 assay, 211, 213 biological functions overview, 201
spore germination, 229 stress response, 219, 222-223, 226-228 trehalose hydrolysis, 226 trehalose transport, 228-229 elimination in baker's yeast, 231-232 localization, 207, 214 neutral trehalase gene cloning, 216 inhibitors, 210 mutation studies, 217-218, 225, 228 NTH1 gene, 201, 211, 216-217 NTH2 gene, 217 pH optimum, 200-201, 207, 210 phosphorylation, 210-211, 217-218, 222-223, 225-226 purification, 209 size, 210, 216 substrate specificity, 210 regulation of expression catabolite repression, 224-226 heat stress, 219, 222-223, 229-231 mutagen stress, 223 Trehalose accumulation in yeast heat stress, 204-205 levels during life cycle, 202-203 mutagen stress, 205 nutrient stress, 203-204, 224-226 assays, 201-202 biosynthesis, 199-200, 212 discovery, 198 species distribution, 199 stress protection baker's yeast, 231-232 mechanisms, 205, 207 tRNALys,3, see Human immunodeficiency virus reverse transcriptase
V
V(D)J recombination, see Double-strand break-induced recombination
W
Wilson disease, gene, 169-170
X
X-ray crystallography human immunodeficiency virus reverse transcriptase, 344, 346 lactose repressor protein structural studies, 149-151, 153-154 medium-chain acyl-CoA dehydrogenase, 311
Z
Zif/268, identification as immediate-early response gene, 51 Zinc Ace1 binding, 180-181 Amt1 binding, 180-181 homeostasis, 167