ADVANCES IN PROTEIN CHEMISTRY Volume 38
ADVANCES IN PROTEIN CHEMISTRY

EDITED BY

C. B. ANFINSEN
Department of Biology, The Johns Hopkins University, Baltimore, Maryland

JOHN T. EDSALL
Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts

FREDERIC M. RICHARDS
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut

VOLUME 38
1986

ACADEMIC PRESS, INC.
Harcourt Brace Jovanovich, Publishers
Orlando San Diego New York Austin Boston London Sydney Tokyo Toronto
COPYRIGHT © 1986 BY ACADEMIC PRESS, INC.
ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.
Orlando, Florida 32887

United Kingdom Edition published by
ACADEMIC PRESS INC. (LONDON) LTD.
24-28 Oval Road, London NW1 7DX

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 44-8853
ISBN 0-12-034238-3
PRINTED IN THE UNITED STATES OF AMERICA
86 87 88 89   9 8 7 6 5 4 3 2 1
CONTENTS

Regulatory and Cytoskeletal Proteins of Vertebrate Skeletal Muscle
IWAO OHTSUKI, KOSCAK MARUYAMA, AND SETSURO EBASHI
I. Introductory Remarks
II. Calcium Regulatory Proteins: Troponin and Tropomyosin
III. Connectin (Titin)
References

Mechanistic Aspects of DNA Topoisomerases
ANTHONY MAXWELL AND MARTIN GELLERT
I. Introduction
II. The Reactions of Topoisomerases
III. DNA Binding
IV. DNA Cleavage
V. DNA Reunion
VI. ATP Hydrolysis
VII. Processivity in Topoisomerase Reactions
VIII. Covalent Modification of Topoisomerases
IX. Mechanistic Models
X. Concluding Remarks
References

Molecular Mechanisms of Protein Secretion: The Role of the Signal Sequence
MARTHA S. BRIGGS AND LILA M. GIERASCH
I. Introduction
II. Historical Background
III. The Signal Sequence
IV. Components of the Secretory Apparatus
V. How Does Secretion Occur?
VI. What Are the Roles of the Signal Sequence?
VII. Recapitulation
VIII. A Model for the Initial Interactions of Signal Sequences with the Membrane
IX. Signal Sequences as Membrane-Interacting Sequences
References

Vibrational Spectroscopy and Conformation of Peptides, Polypeptides, and Proteins
SAMUEL KRIMM AND JAGDEESH BANDEKAR
I. Introduction
II. Theoretical Considerations
III. Extended Polypeptide Chain Structures
IV. Helical Polypeptide Chain Structures
V. Reverse Turns
VI. Characteristics of Polypeptide Chain Modes
VII. Vibrational Spectroscopy of Proteins
VIII. Prospects for the Future
References

AUTHOR INDEX
SUBJECT INDEX

REGULATORY AND CYTOSKELETAL PROTEINS OF VERTEBRATE SKELETAL MUSCLE

By IWAO OHTSUKI,* KOSCAK MARUYAMA,† and SETSURO EBASHI‡

* Department of Pharmacology, Faculty of Medicine, Kyushu University, Fukuoka 812, Japan
† Department of Biology, Faculty of Science, Chiba University, Chiba 260, Japan
‡ National Institute for Physiological Sciences, Okazaki 444, Japan

I. Introductory Remarks
   A. Regulatory Proteins
   B. Cytoskeletal Proteins
II. Calcium Regulatory Proteins: Troponin and Tropomyosin
   A. Introduction
   B. Troponin I
   C. Troponin C
   D. Troponin T
   E. Tropomyosin
   F. Some Aspects of Calcium-Regulatory Mechanisms
   G. Structural Aspects of Troponin and Tropomyosin
III. Connectin (Titin)
   A. Preparation
   B. Content in Myofibrils
   C. Molecular Size and Shape
   D. Other Physicochemical Properties
   E. Hydrolysis
   F. Interaction with Myosin
   G. Interaction with Actin
   H. Localization
   I. Function
References

I. INTRODUCTORY REMARKS

The effective contractile machinery of vertebrate striated muscle is an elaborate framework. The motion of the myosin and actin filaments is controlled by regulatory proteins, and their position is supported by cytoskeletal proteins. Myosin and actin, the contractile proteins of muscle, account for approximately 65% of the total myofibrillar protein. There are a number of both regulatory and cytoskeletal proteins, as listed in Table I (for review, see Obinata et al., 1981; Maruyama, 1985a). The aim of this article is to describe the structure and function of the major regulatory proteins, troponin and tropomyosin, and also of the main cytoskeletal protein, connectin (titin). The former are perhaps the

TABLE I
Myofibrillar Structural Proteins of Rabbit Skeletal Muscleᵃ

Proteinᵇ                  Molecular weight  Content   Localization                Function
                          (kDa)             (wt %)

Contractile proteins
  Myosin*                 520               43        A band                      Contracts with actin
  Actin*                  42                22        I band                      Contracts with myosin
Regulatory proteins
  Major
    Tropomyosin*          33 × 2            5         I band                      Binds to actin and locates troponin
    Troponin              70                5         I band                      Ca regulation
      Troponin C*         18                                                      Ca binding
      Troponin I*         21                                                      Inhibition of actin-myosin interaction
      Troponin T*         31                                                      Binding to tropomyosin
  Minor
    M protein             165                         M line                      Binds to myosin
    Myomesin              185                         M line                      Binds to myosin
    Creatine kinase*      42                          M line                      Binds to myosin
    C protein             135               2         A band                      Binds to myosin
    F protein             121                         A band                      Binds to myosin
    H protein             74                          Near M line                 Binds to myosin
    I protein             50                          A band                      Inhibits myosin-actin interaction
    α-Actinin             95 × 2            2         Z line                      Gelates actin filaments
    β-Actinin             37 + 34           <1        Free end of actin filament  Caps actin filaments
    γ-Actinin             35                <1        ?                           Inhibits actin polymerization
    eu-Actinin            42                          Z line                      Binds to actin
    ABP (filamin)         240 × 2                     Z line                      Gelates actin filaments
    Paratropomyosin       34 × 2                      A-I junction                Inhibits actin-myosin interaction
Cytoskeletal proteins
  Connectin (titin)       2800 (2100)       10        A-I                         Links myosin filament to Z line
  Nebulin                 800               5         N2 line
  Vinculin                130                         Under sarcolemma
  Desmin* (skeletin)      53                <1        Periphery of Z line         Intermediate filament
  Vimentin*               55                <1        Periphery of Z line         Intermediate filament
  Synemin                 220               <1        Z line
  Z protein               50                <1        Z line                      Forms lattice structure
  Z-nin                   400               <1        Z line

ᵃ Compiled mainly from articles by Obinata et al. (1981), Yates and Greaser (1983), and Maruyama (1985a).
ᵇ Complete amino acid sequences have been determined for those proteins marked by an asterisk.

best characterized proteins, together with actin and myosin, in the field of muscle biochemistry. They are involved in the Ca2+ regulation of muscle contraction. On the other hand, connectin, an elastic protein, is a relative newcomer, and because of its huge molecular weight (more than 2 million), its physicochemical characterization has remained incomplete. Nevertheless, it may be appropriate to call attention to this protein, since new aspects of protein chemistry might be revealed from work on such a giant peptide as connectin. Before going further, however, the regulatory and cytoskeletal muscle proteins will be concisely described with reference to Table I.

A. Regulatory Proteins

The major regulatory proteins located on the actin filament are troponin and tropomyosin, each accounting for 5% of the total myofibrillar protein. Both proteins confer calcium sensitivity on the ATP-actin-myosin interactions (see Section II). There are also minor regulatory proteins that modify the fine structures of myosin and actin filaments and also of Z lines.

1. Myosin-Associated Proteins

M lines maintain the structural integrity of approximately 300 myosin filaments at their center positions. At present, three regulatory proteins constituting the M line are known: M protein (165 kDa) (Masaki and Takaiti, 1974), myomesin (185 kDa) (Grove et al., 1984), and creatine kinase (42 kDa × 2) (Wallimann et al., 1977). The latter enzyme plays a role in rapidly regenerating ATP consumed during muscle contraction (Wallimann et al., 1984). Although the presence of these proteins in the M lines has been shown by immunofluorescence, their specific structures that are thought to link myosin filaments are not known (cf. review by Mani et al., 1980). Furthermore, it has been reported that the interactions of M protein and creatine kinase with myosin are very weak in vitro, suggesting the possible presence of additional components (Woodhead and Lowey, 1983). More systematic investigations, both biochemical and structural, are needed to settle the M-line problem.

Starr and Offer (1971) examined the SDS-gel electrophoresis patterns of conventional myosin preparations and pointed out the presence of several unknown protein bands other than myosin heavy and light chains. As a result, C (135 kDa), F (121 kDa), and H (74 kDa) proteins have been isolated and shown to be myosin-associated proteins. The C protein, discovered by Offer et al. (1973), is localized to seven regularly spaced stripes in each half of the A bands of rabbit psoas muscle (Craig and Offer, 1976), and studies using monoclonal antibodies have shown
that there are several C-protein isoforms located at up to nine stripes in each half of the A bands of chicken skeletal muscle (Dennis et al., 1984). The C protein binds to aggregates of the tail portions of myosin, but not to the subfragment-1 heads of myosin (Starr and Offer, 1978). The C protein inhibits the actomyosin ATPase at very low ionic strengths, but accelerates it at physiological ionic strengths (Moos and Feng, 1980). Offer et al. (1973) suggested that C protein has a structural role in the formation of thick filaments, but direct evidence has not yet been presented.

During the preparation of C protein, Offer et al. (1973) obtained F protein (121 kDa) as a by-product. Miyahara et al. (1980) characterized F protein and showed its binding to myosin. The H protein (74 kDa) has been purified, and its specific location, closer to the M line than to the C-protein zone, has been shown by Yamamoto (1984). The I protein (50 kDa), which inhibits the ATPase activity of myosin, is localized in the edge regions of the A band, but is easily translocated to near the Z lines in aged myofibrils (Ohashi and Maruyama, 1985). The structural roles of F, H, and I proteins have not yet been clarified.

2. Actin-Associated Proteins

β-Actinin (35 + 32 kDa) was the first actin capping protein to be discovered; it specifically masks the pointed end of an actin filament (Maruyama et al., 1977a). It seems that β-actinin caps the free end of the thin filament, inhibiting further elongation and interfilamental interactions (Funatsu and Ishiwata, 1985; cf. Maruyama, 1985b). Another muscle capping protein, which masks the barbed end of an actin filament, has been detected by Lin et al. (1982). It would be desirable to characterize this protein and to show its location in myofibrils. γ-Actinin (35 kDa) was found to inhibit actin polymerization (Kuroda and Maruyama, 1976), but its location has not yet been shown. Paratropomyosin (34 kDa × 2), discovered by Takahashi et al. (1985), is a protein very similar to tropomyosin, but it is located at the A-I junction region and, in postmortem myofibrils, is translocated to the I-band region to release the binding between myosin and actin.

A number of the proteins constituting Z lines have been reported, but unfortunately their structural roles have not been worked out in detail. α-Actinin (95 kDa × 2) was the first Z-line protein to be discovered and is the most abundant in Z lines (2% of the total myofibrillar protein) (Ebashi and Ebashi, 1965). α-Actinin was discovered as a protein factor promoting contraction of actomyosin (Ebashi et al., 1964). α-Actinin is also the first example of an actin-gelating protein (Maruyama and Ebashi, 1965). Granger and Lazarides (1978) showed that α-actinin is uniformly
present in the interior of isolated Z disks. α-Actinin is easily released into the medium at low ionic strengths or by a protease treatment, suggesting that a protein is involved in the deposition of α-actinin in Z lines. It is believed that α-actinin has a supporting role in the attachment of actin filaments to Z lines. However, its mode of attachment is unknown.

Kuroda et al. (1981) have discovered an interesting protein very similar to actin in molecular size and amino acid composition. They called it eu-actinin (42 kDa). eu-Actinin is located in Z lines and it interacts with F-actin to form bundles (Kuroda and Masaki, 1984). Unlike α-actinin, eu-actinin inhibits the onset of superprecipitation of actomyosin. Actin-binding protein (ABP, also called filamin; 240 kDa × 2; Stossel and Hartwig, 1975), which causes gelation of actin filaments, has also been shown to exist in Z lines (Gomer and Lazarides, 1981; Bechtel, 1979).

B. Cytoskeletal Proteins

In recent years the biological importance of the cytoskeletal structures (microtubules, intermediate filaments, and actin microfilaments) has attracted the attention of many cell biologists. In striated muscle, myosin and actin filaments together with Z lines are the main cytoskeletal structures that form the sarcomeres of the myofibril. However, myosin and actin are contractile proteins, and some of the proteins constituting the Z line are classified as actin-associated proteins. Therefore, cell membrane attachment proteins, intermediate filaments, and some other structural proteins are described in this section. There has not been any report on muscle microtubules, although their presence is shown in some electron micrographs of sectioned samples. In striated muscle, the main cytoskeletal proteins are connectin and nebulin, which together constitute 15% of the total myofibrillar proteins (Table I). The two proteins will be discussed in Section III.

1. Membrane-Attachment Proteins

Vinculin (130 kDa), mainly present in adhesion plaques of nonmuscle cells, was discovered by Geiger (1979) as a contaminating protein during the purification of α-actinin from chicken gizzard. By immunofluorescence, vinculin has been shown to be located in register at the cytoplasmic surface of muscle cells of chicken skeletal muscle (Pardo et al., 1983a,b). Vinculin lattice structures called costameres are divided into two rows flanking Z lines and overlying the I bands of the underlying sarcomeres. It appears that transverse vinculin costameres firmly attach to the myofibrils at the sarcolemma. In nonmuscle cells, vinculin is regarded as the actin filament binding protein at the cell membrane. Isenberg et al. (1982) reported that vinculin forms bundles of actin filaments, and
Wilkins and Lin (1982) claimed that vinculin is an actin capping protein. However, highly purified preparations of vinculin do not have such a capping action at all (Evans et al., 1984). It has also been reported that such a purified vinculin does not bundle actin filaments either (Ohtaki et al., 1986). Therefore, it is likely that some proteins other than vinculin are needed to make actin filaments attach to the cell membrane.

2. Intermediate Filaments

Intermediate filaments (10 nm) are intermediate in diameter between microtubules (25 nm) and actin filaments (7 nm). Intermediate filaments are classified into five groups: cytokeratin, desmin, vimentin, neural, and glial filaments (for review, see Lazarides, 1980). In skeletal muscle, there are both desmin and vimentin filaments.

Desmin (skeletin; 53 kDa) was first isolated from chicken gizzard by Small and Sobieszek (1977). Its complete amino acid sequence (463 residues) was determined by Geisler and Weber (1982). Desmin protofilaments are regarded as dimeric double-stranded structures. Lazarides and Hubbard (1976) showed that desmin is located in Z lines and also in the connection between the Z lines of adjacent myofibrils of chicken skeletal muscle. Furthermore, it was revealed that desmin filaments are located at the periphery of Z lines (Granger and Lazarides, 1978). Thus, desmin filaments connect adjacent Z lines of neighboring myofibrils, resulting in striation of muscle fiber (Lazarides, 1980). Tokuyasu (1983), using an elegant immunoelectron micrograph, has shown the presence of longitudinally oriented desmin filaments, running from one Z line to the next in cardiac myofibrils. However, in skeletal muscle, it appears that these longitudinal desmin filaments exist only in the early stage of myofibrillogenesis (Tokuyasu et al., 1984).

Vimentin (55 kDa) intermediate filaments are characteristic of cells of mesenchymal origin (Franke et al., 1978). Coexistence of desmin and vimentin in the periphery of Z lines was demonstrated by Granger and Lazarides (1979). Chicken skeletal muscle contains much less vimentin than desmin.

3. Z-Line Structure

The Z lines arrange actin filaments in register and segment the myofibrils into the structural unit, the sarcomere. Electron microscopy shows a highly organized tetragonal lattice structure to which actin filaments attach.
As described in Section I,A,2, α-actinin is most abundantly localized in Z lines. Also, eu-actinin and actin-binding protein (filamin) are present in the interior of Z lines. The structural basis of these three proteins has
not yet been clarified. On the other hand, the intermediate filaments, desmin and vimentin, encircle the Z lines. Together with the intermediate filaments, a high-molecular-weight protein called synemin (220 kDa) is located at the periphery of Z lines (Granger and Lazarides, 1980). The amount of synemin present is reported to be only 1% that of desmin. It appears that the 220-kDa protein located in Z lines (Muguruma et al., 1981) is the same as synemin.

Ohashi and Maruyama (1979) isolated a 55-kDa protein from chicken breast muscle that formed a lattice structure with a periodicity of 8 nm. This protein, termed Z protein, is the only known candidate for the lattice structure of Z lines. Although the molecular weight of Z protein is similar to that of desmin, the amino acid compositions are entirely different. The Z protein is located in the interior of Z lines, as revealed by an immunofluorescent technique (Ohashi et al., 1982).

Suzuki and associates (Suzuki et al., 1978; Suzuki and Nonami, 1982) carried out an interesting reconstitution experiment: Z lines almost disappeared on prolonged extraction with a low ionic strength solution, but could be reconstituted by treatment with the proteins released from Z lines by calcium-activated protease. The active principle contained three kinds of proteins: 400-, 100- (α-actinin), and 34-kDa proteins. The 400-kDa protein, named Z-nin, has been shown to be present in the interior of Z lines. It would be desirable to characterize Z-nin in detail.

II. CALCIUM REGULATORY PROTEINS: TROPONIN AND TROPOMYOSIN

A. Introduction

Study of the molecular biology of calcium regulation of muscle contraction was initiated by the discovery of a new protein factor sensitizing actomyosin to calcium ions (Ebashi, 1963; Ebashi and Ebashi, 1964). This protein factor was called native tropomyosin, because of its similarity in amino acid composition to tropomyosin, which had been discovered earlier (Bailey, 1946, 1948). It was soon found that this factor is a complex of tropomyosin and a new globular protein, termed troponin (Ebashi and Kodama, 1965; Ebashi et al., 1968). Thus four proteins, i.e., myosin, actin, troponin, and tropomyosin, are involved in calcium-regulated physiological muscle contraction (Ebashi et al., 1968, 1969; Ebashi and Endo, 1968). The contractile interaction between myosin and actin is depressed by troponin and tropomyosin in the absence of calcium ions. When calcium ion acts on troponin, this depression is removed and the contractile interaction is then activated (Figs. 1 and 2).

[Figure: time courses of superprecipitation at different Ca concentrations, including a control; abscissa, time (min).]

FIG. 1. Ca2+ regulation of the superprecipitation of myosin B from rabbit skeletal muscle. Effects of varied Ca2+ concentrations, given as net Ca2+ without using Ca-EGTA buffer, are shown on the time course of superprecipitation of myosin B (natural actomyosin, containing troponin, tropomyosin, and other proteins in addition to actin and myosin). Contaminating Ca2+ had been removed as far as possible by washing with EDTA. Note that the superprecipitation of Ca2+-free myosin B is intensely inhibited and that even 2 × 10⁻⁷ M Ca2+ definitely accelerates it. The superprecipitation of myosin-actin without troponin-tropomyosin (not shown in this figure, but schematically illustrated in Fig. 2) proceeds quickly irrespective of Ca2+ concentration in the same way as that of myosin B at high Ca2+ concentration (from Ebashi, 1960).

Structurally, troponin-tropomyosin is distributed over the entire length of the thin filament with a 38-nm period (Ohtsuki et al., 1967; Ohtsuki, 1974). This finding led us to propose a structure for the thin filament in which two end-to-end filaments of fibrous tropomyosin molecules, each binding to globular troponin at its specific region, lie almost in register along the grooves of actin double strands (Ebashi et al., 1969; Ohtsuki, 1974). This structure has clearly revealed the regulatory pathway by which the action of calcium ion on troponin is transmitted to actin molecules through tropomyosin.

Troponin is not a single peptide but consists of three different components: troponins C, I, and T (Hartshorne and Mueller, 1968; Schaub and Perry, 1969; Greaser and Gergely, 1971; Ebashi, 1972; Ebashi et al., 1973). Troponin C is a calcium-binding component. Troponin I is an inhibitory component of contractile interaction. Troponin T is a tropomyosin-binding component. The regulatory mechanism of the three troponin components is as follows. Troponin I alone inhibits the contractile interaction between myosin and actin in the presence of tropomyosin. Troponin C removes the inhibition by troponin I, regardless of calcium ion concentration. For the calcium ion regulation of contraction,

[Figure: panels (a) and (b); abscissa, [Ca2+].]

FIG. 2. Ca-regulatory mechanism of troponin and tropomyosin. Abscissa indicates the free calcium ion concentration in arbitrary units. Ordinates indicate (a) the extent of contractile interaction between myosin and actin in the presence of troponin and/or tropomyosin (Ebashi et al., 1969) and (b) the extent of contractile interaction of myosin-actin-tropomyosin in the presence of troponin components (Ohtsuki, 1980). TM, Tropomyosin; TN, troponin; TN·C, troponin C; TN·I, troponin I; TN·T, troponin T.

the concomitant presence of all three components, C, I, and T, is required. Troponin T does not have any particular action related to calcium ion or contraction, but it is necessary for calcium regulation (Fig. 2). This tropomyosin-binding component has a role in integrating the troponin function in the thin filament (Ohtsuki, 1980). The fine localization of these components in the thin filament has been studied (Ohtsuki, 1975, 1979b).
This section represents an extension of our previous reviews (Ebashi and Endo, 1968; Ebashi et al., 1969; Ebashi, 1974a, 1980) and concerns mainly the interacting properties and arrangement of the three troponin components and tropomyosin in reference to calcium regulation. The protein source is rabbit skeletal muscle unless otherwise mentioned.

B. Troponin I

The characteristic action of troponin I is to inhibit the contractile interaction between myosin and actin in the presence of tropomyosin (Hartshorne and Mueller, 1968; Schaub and Perry, 1969). This protein binds to actin and tropomyosin and also interacts with troponin C and troponin T (Ebashi et al., 1973; Potter and Gergely, 1974; Hitchcock, 1975a; Horwitz et al., 1979). All these interactions are necessary for calcium regulation.

Amino acid analysis showed that troponin I is a basic peptide with 178 amino acid residues (Wilkinson and Grand, 1975). This protein has not yet been crystallized and has scarcely been investigated physicochemically, largely because of its strong tendency to precipitate at low ionic strengths, especially in aged preparations. Properties of this protein are discussed in the following sections with particular reference to the interacting regions along the amino acid sequence.

1. Inhibitory Action on the Myosin-Actin Interaction

Troponin I inhibits the ATPase and superprecipitation of actomyosin in the presence of tropomyosin. The extent of inhibition of actomyosin ATPase was reported to be maximally 80% in the presence of tropomyosin, but 20% in the absence of tropomyosin (Perry et al., 1973; Syska et al., 1976). Inhibition of superprecipitation of actomyosin by troponin I was found only in the presence of tropomyosin (Ebashi et al., 1974). The inhibition of contractile interaction by troponin-tropomyosin at low Ca2+ concentration is due to this inhibitory action of troponin I. The inhibition of contractile interaction by troponin I is neutralized by the addition of stoichiometric amounts of troponin C, regardless of Ca2+ concentration (Perry et al., 1973).

Syska et al. (1976) showed that the cyanogen bromide fragment CN4 (residues 96-116) had 75% of the inhibitory action of troponin I. A larger fragment, CF1 (residues 64-135), which contained the CN4 sequence, also showed the inhibitory action but to a lesser extent than the CN4 fragment. The inhibition was also neutralized by troponin C. Comparison of the amino acid sequences of troponin I from four different striated muscles revealed that the region of inhibitory peptide
CN4 is one of the most strongly conserved (Wilkinson and Grand, 1975). This constancy of sequence also supports the view that the CN4 region is important in binding to actin, which is also a highly conserved protein. These findings strongly indicate that the inhibitory action is exerted mostly by this relatively localized region of the troponin I molecule. However, it should also be noticed that the inhibitory action of troponin I is not solely dependent on the CN4 region, for troponin I from fast and slow muscles of the rabbit, in which the sequence of the CN4 region is exactly the same, showed different inhibitory action on the ATPase of actomyosin (Syska et al., 1974).

The discussion above stresses the importance of the CN4 region (residues 96-116) of troponin I in the inhibition of contractile interaction of myosin-actin in the presence of tropomyosin. The amino acid sequence of the CN4 region is shown in Fig. 3. Talbot and Hodges (1981) synthesized 12 peptide analogs of the CN4 sequence and examined their inhibitory action on actomyosin ATPase activity. The absence of residues 115 and 116 did not affect the activity, whereas the absence of residue 114 significantly decreased the inhibitory action. As to the N-terminal portion of the peptide, residues 96-103 were not essential, but the absence of Lys-105 decreased the inhibitory activity. A peptide containing the region of residues Lys-105-Val-114 showed about half of the inhibitory action of troponin I. The authors concluded that Lys-105 and the bulky side chain at Val-114 are essential for the inhibitory action of troponin I (Fig. 3).

Grand et al. (1982) reported the observation that Arg residues of the CN4 fragment were perturbed by actin, as shown by their proton magnetic resonance spectra (Fig. 3).
They suggested that the C-terminal half of the CN4 region, which is rich in Arg residues, is mainly involved in the inhibitory action, whereas the N-terminal half is related to the interaction with troponin C. It is interesting to notice that both Lys-105 and Val-114, essential for the inhibitory action based on the synthetic peptide study, are not perturbed by actin. The side chains of these residues might interact with tropomyosin.

2. Interaction with Tropomyosin

There have been few reports on the interaction of troponin I with tropomyosin, in spite of the clear requirement of tropomyosin for the inhibitory action of troponin I. Ebashi et al. (1974) presented evidence from gel filtration showing that troponin I comigrates with tropomyosin. It was also reported that tropomyosin was retained in the column of troponin I-Sepharose 4B (Katayama, 1980).

Asn(96)-Gln-Lys-Leu-Phe(100)-Asp-Leu-Arg-Gly-[Lys(105)]-Phe-Lys-Arg-Pro-Pro(110)-Leu-Arg-Arg-[Val(114)]-Arg-Met(116)

FIG. 3. Amino acid sequence of the CN4 (residues 96-116) region of troponin I. Filled circles (●) denote the residues that may be perturbed by actin and open circles (○) by troponin C in the presence of Ca2+ (Grand et al., 1982). Lys-105 and Val-114, surrounded by squares, are essential for the inhibitory action (Talbot and Hodges, 1981).

3. Interaction with Troponin C

Interaction between troponin I and troponin C is strengthened by Ca2+ (Head and Perry, 1974; Potter and Gergely, 1974; Ohnishi et al., 1975). Two regions along the troponin I sequence have been shown to be involved in the interaction with troponin C (Syska et al., 1976) (Fig. 4). The fragments CN5 (residues 1-21), CF2 (residues 1-48), and CN4 (residues 96-116) have been shown to interact with troponin C by the use of affinity chromatography and gel electrophoresis. The CF2 fragment binds to troponin C, independently of Ca2+, more strongly than does the CN5 fragment. On the basis of sequence homology of four species of troponin I, Wilkinson and Grand (1978) estimated that the most likely binding site of the N-terminal region is between residues 10 and 26. Indeed, the synthetic peptide of residues 10-25 of troponin I was shown to interact with troponin C, depending on Ca2+ concentration, in nondenaturing gels containing glycerol (Katayama and Nozaki, 1982). Troponin C was also found to decrease the phosphorylation of troponin I at Thr-11 by phosphorylase kinase and at Ser-117 by cAMP-dependent protein kinase (Moir et al., 1978). These findings indicate the presence of a Ca2+-dependent interaction between the N-terminal region of troponin I and the region of troponin C that probably contains the low-affinity Ca2+-binding site (Grabarek et al., 1981).

[Figure: bar diagram of troponin I; regions labeled troponin C binding, troponin T binding, and actin and troponin C binding; marked residues Thr-11, Lys-18, Cys-48, Cys-64, Lys-105, Val-114, Ser-117, and Cys-133; fragments CN5, CF2, and CN4 drawn below the bar.]

FIG. 4. Schematic representation of the amino acid sequence of rabbit skeletal troponin I. The primary sequence of troponin I (residues 1-178) is shown as an open bar. The positions of main residues as well as approximate interacting regions are indicated at the upper part of the sequence. Tick marks below indicate the position of lysine residues reactive to acetylation; tick marks with open circles indicate residues whose reactivity is affected by troponin T (residues 40, 65, 70, 78, 90) (Hitchcock-de Gregori, 1982). The position of fragments CN5, CF2, and CN4 is shown under the sequence (Syska et al., 1976).

The inhibitory peptide CN4 (residues 96-116) was reported to bind to troponin C, consistent with the finding that the inhibition of actomyosin ATPase by the CN4 peptide was neutralized by troponin C irrespective of Ca2+ concentration (Syska et al., 1976). Residues of the CN4 fragment were perturbed by troponin C in the presence of Ca2+ (Grand et al., 1982). This perturbation was decreased by the removal of Ca2+ from the solution. At the same time, the cyanogen bromide fragment of troponin C (CB9, residues 87-134) was shown to neutralize the inhibition by troponin I, to inhibit phosphorylation of Ser-117, and to form a complex with troponin I only in the presence of Ca2+ (Weeks and Perry, 1978; Grabarek et al., 1981). This strongly suggests that the CB9 region of troponin C neutralizes the inhibition by troponin I by directly competing with actin for the CN4 region of troponin I. This is further supported by the proton magnetic resonance study showing that actin and troponin C perturbed different residues of the CN4 fragment (Grand et al., 1982).

Thus, at present, troponin C is considered to interact with two separate regions of troponin I, i.e., the N-terminal region including residues 10-25 and the CN4 region. Both interactions are influenced by Ca2+ (Fig. 4).

Calmodulin has been reported to neutralize the inhibition of actomyosin ATPase by troponin I in the presence of Ca2+ (Amphlett et al., 1976; Dedman et al., 1977; Yamamoto, 1983). However, the fact that neither the phosphorylation of Thr-11 nor the reactivity of Lys-18 to acetimidates is affected by calmodulin (Perry, 1980; Moir et al., 1983) suggests that the N-terminal region of troponin I, which has an affinity for troponin C potentiated by Ca2+, interacts only weakly with calmodulin.

4. Interaction with Troponin T

Troponin I has been shown to interact with troponin T (Horwitz et al., 1979; Katayama, 1979; Tanokura and Ohtsuki, 1982; Pearlstone and Smillie, 1983; Tanokura et al., 1983), in accord with the early demonstration of the close apposition of troponin I and troponin T by cross-linking with dimethylimido esters (Hitchcock, 1975b). The interaction was observed only with reduced troponin I, not with the oxidized form (Horwitz et al., 1979). Since troponin containing oxidized troponin I did not sensitize actomyosin to Ca2+, the interaction of troponin I with troponin T is necessary for the Ca2+ regulation of contractile interaction.
There are three oxidizable Cys residues along the troponin I sequence: Cys-48, -64, and -133 (Fig. 4). The accessibility of Cys residues of troponin I to iodoacetamide was investigated by Chong and Hodges (1982a). Cys-48 and Cys-64 are accessible to labeling in free troponin I and troponin I-C, but not in native troponin and troponin I-T. Cys-133 is reactive in all conditions. The change in Ca2+ concentration did not affect the extent of labeling of native troponin. Hence, two Cys residues, Cys-48 and Cys-64, are in contact with troponin T in the native troponin complex. The study using a heterobifunctional photoaffinity probe attached to Cys-48 and Cys-64 of troponin I showed that these residues are within 1.4 nm of troponin T and troponin C in native troponin (Chong and Hodges, 1982b). The results are interpreted as meaning that the residues around Cys-48 and Cys-64 of troponin I form the complex with troponin T. The reactivity of lysyl groups of troponin I to acetylation was investigated (Hitchcock-De Gregori, 1982). The results indicated that the region from residues 40 to 98 of troponin I, which contains 5 Lys residues, was influenced by interacting with other components. All the changes in reactivity except that of Lys-70 were observed when troponin I was complexed with troponin T. Thus the region around residues 40-98 would be related to the interaction with troponin T, consistent with the results on the reactivity of Cys residues (Chong and Hodges, 1982b). The reactivity of Lys-70 was reduced on binding to troponin C. The reactivity of Lys residues related to the interaction with troponin T was found to be affected by Ca2+ binding to troponin C in the native troponin complex, whereas the reactivity of Lys-70 was not influenced by Ca2+.

In summary, the present knowledge on the distribution of interacting sites along the troponin I sequence is as follows (Fig. 4).
The N-terminal region (residues 10-26) shows an interaction with troponin C, which is strengthened by Ca2+. The region around residues 40-98 mainly interacts with troponin T. The CN4 region (residues 96-116) binds to actin and inhibits the contractile interaction of actomyosin in the presence of tropomyosin. The CN4 fragment was also found to interact with troponin C. The role of the rest of the C-terminal region is obscure and might be related to the interaction with actin-tropomyosin.

5. Cardiac Troponin I
The sequences of the main portions of rabbit cardiac and skeletal troponin I are nearly identical, but cardiac troponin I has about 20 additional residues at the N-terminal end (Wilkinson and Grand, 1978). It was found that Ser-20 of rabbit cardiac troponin I is very susceptible
to phosphorylation by cAMP-dependent protein kinase (Moir et al., 1980). The extent of phosphorylation of troponin I in the heart of rat (England, 1975) and rabbit (Moir et al., 1980) perfused with adrenaline is related to the degree of inotropic effect. Studies on the effect of this phosphorylation on the actomyosin ATPase, however, revealed that phosphorylation of troponin I causes the depression of actomyosin ATPase at all Ca2+ concentrations (Ray and England, 1976; Reddy and Wyborny, 1976; Bailin, 1979; Holroyde et al., 1980; Yamamoto and Ohtsuki, 1982). The Ca2+ concentration required for the 50% activation of actomyosin ATPase is shifted to a higher concentration by at most 1.0 pCa unit. This action would contribute to shortening of the course of relaxation that is characteristic of the inotropic effect of catecholamines. The neutralization of the inhibitory action of troponin I by troponin C is depressed by the phosphorylation of troponin I, but the inhibitory action of troponin I on actomyosin ATPase in the presence of tropomyosin is not depressed by phosphorylation (Yamamoto and Ohtsuki, 1982). These findings suggest that phosphorylation occurs in the region of troponin I facing troponin C but not at the region binding to actin. One interesting finding is that phosphorylation by cAMP-dependent protein kinase does not affect the Sr2+ sensitivity of actomyosin ATPase (Yamamoto, 1983). Strontium ions would cause a conformational change of troponin C different from that induced by Ca2+ at the region facing troponin I.

C. Troponin C

Troponin C, the calcium-binding component, is a highly soluble acidic protein. It is the most extensively investigated protein among the three troponin components. Troponin C binds four calcium ions (Potter and Gergely, 1975) and forms a complex with both troponin I and troponin T (Ebashi et al., 1973; Head and Perry, 1974; Potter and Gergely, 1974; Hitchcock, 1975a).
The interaction with troponin I is potentiated by Ca2+ and has essential significance in the calcium regulation of contraction. The interaction with troponin T is also influenced by Ca2+, but at concentrations higher than required for the regulation of contraction (Ebashi et al., 1973; Ohnishi et al., 1975).
1. Ca2+ Binding

Ebashi et al. (1968) first suggested that the native troponin molecule has four Ca2+-binding sites: two with high affinity and two with low affinity. Potter and Gergely (1975) demonstrated, by equilibrium dialysis, that the four Ca2+-binding sites of troponin C can indeed be classified into two sites with high affinity (Kapp = 2 × 10^7 M^-1) and two sites
with low affinity (Kapp = 5 × 10^5 M^-1). The high-affinity sites also bind Mg2+ (Kapp = 5 × 10^3 M^-1), but no significant change in Ca2+ binding of the low-affinity sites was detected on addition of 2 mM Mg2+. Thus the high- and low-affinity sites were termed the Ca2+-Mg2+ sites and Ca2+-specific sites, respectively. In the complexes of troponin I-C and troponin C-I-T, all four sites have the same affinity for Ca2+ in the presence of 2 mM Mg2+. Since the change in free Mg2+ concentration by about 2 mM did not affect the Ca2+ sensitivity of myofibrillar ATPase activity, it was concluded that the low-affinity sites are related to the regulation of contraction (Potter and Gergely, 1975).

At the same time, Ebashi and Endo (1968) showed in their review that higher Mg2+ concentration (~8 mM) at constant ionic strength shifted the Ca2+ sensitivity of superprecipitation of myosin B and tension development of skinned fibers to higher Ca2+ concentrations. This effect of Mg2+ could not be detected on desensitized myosin B, which did not contain troponin-tropomyosin. Thus it is plausible that Mg2+ competes with Ca2+ for troponin C in this condition. Examination of the Ca2+-binding property of troponin conjugated to Sepharose 4B demonstrated that Mg2+ affected the Ca2+ binding not only to the high-affinity sites but also, though to a lesser degree, to the low-affinity sites (Kohama, 1980). It was also shown that the Ca2+ binding to high-affinity sites can regulate the superprecipitation of myosin B to some extent under low Mg2+ and low ATP conditions. A study of the Ca2+-binding properties of troponin C and troponin using a metallochromatic indicator, tetramethylmurexide, revealed that Mg2+ binds to both high- and low-affinity sites; apparent binding constants are 1000 and 520 M^-1, respectively (Ogawa, 1985). Ogawa also reported that the apparent binding constants of high- and low-affinity sites for Ca2+ are 4.5 × 10^6 and 6.4 × 10^5 M^-1, respectively.
These considerations indicate that the effect of Mg2+ not only on the high-affinity sites (Ca2+-Mg2+ sites) but also on the low-affinity sites (Ca2+-specific sites) should be studied in detail. However, it should also be mentioned that discussions based on the results of Potter and Gergely (1975) are still valid in the range of relatively low Mg2+ concentrations; Ca2+ binding to the low-affinity sites apparently is not affected by Mg2+ under such conditions. Analysis of the transient kinetics by the stopped-flow method demonstrated that the release of Ca2+ from the low-affinity sites of troponin C is much faster than that from the high-affinity sites (Johnson et al., 1979; Iio and Kondo, 1980, 1981; Levine et al., 1978; Anderson et al., 1981; Ogawa, 1985). Potter and Gergely (1975) also showed that Ca2+ binding constants of
both high- and low-affinity sites of troponin C-I and troponin are about 10 times higher than those of isolated troponin C in the absence of Mg2+. This means that the affinity of troponin C for Ca2+ is modified by the interaction with other components. Wnuk et al. (1984) reported that the Ca2+ binding constant of troponin-tropomyosin from crayfish skeletal muscle was decreased by actin; troponin-tropomyosin-actin has the same affinity for Ca2+ as does isolated troponin C. This finding indicates that the Ca2+-binding property of troponin C is modified by all the constituent proteins in the thin filament.
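The binding constants quoted in this subsection can be turned into a rough occupancy estimate. The following is only a minimal sketch, not a model taken from the cited studies: it assumes four independent sites in two classes, simple 1:1 competition between Ca2+ and Mg2+ at a site, and free ion concentrations as inputs. The constants are those attributed above to Potter and Gergely (1975) for isolated troponin C; setting the Mg2+ constant of the low-affinity sites to zero is an assumption that mirrors their "no detectable effect of 2 mM Mg2+" result. All function and variable names are illustrative.

```python
# Apparent binding constants for isolated troponin C as quoted in the text
# (Potter and Gergely, 1975), in M^-1.
K_CA_HIGH = 2e7   # Ca2+ at the two high-affinity (Ca2+-Mg2+) sites
K_CA_LOW = 5e5    # Ca2+ at the two low-affinity (Ca2+-specific) sites
K_MG_HIGH = 5e3   # Mg2+ at the high-affinity sites
K_MG_LOW = 0.0    # assumption: no Mg2+ binding at the low-affinity sites


def site_occupancy(ca, k_ca, mg=0.0, k_mg=0.0):
    """Fraction of one site occupied by Ca2+ when Mg2+ competes 1:1."""
    return k_ca * ca / (1.0 + k_ca * ca + k_mg * mg)


def apparent_k_ca(k_ca, mg, k_mg):
    """Ca2+ binding constant as lowered by competing Mg2+."""
    return k_ca / (1.0 + k_mg * mg)


def bound_ca_per_troponin_c(ca, mg=0.0):
    """Mol Ca2+ bound per mol troponin C (two sites in each class)."""
    high = 2.0 * site_occupancy(ca, K_CA_HIGH, mg, K_MG_HIGH)
    low = 2.0 * site_occupancy(ca, K_CA_LOW, mg, K_MG_LOW)
    return high + low


if __name__ == "__main__":
    # Free concentrations in mol/L; the grid of values is arbitrary.
    for mg in (0.0, 2e-3):
        for ca in (1e-8, 1e-7, 1e-6, 1e-5):
            bound = bound_ca_per_troponin_c(ca, mg)
            print(f"Mg2+ = {mg:.0e} M, Ca2+ = {ca:.0e} M: {bound:.2f} Ca2+ bound")
```

Under this competitive scheme, 2 mM free Mg2+ lowers the apparent Ca2+ constant of the high-affinity sites about 11-fold (from 2 × 10^7 to roughly 1.8 × 10^6 M^-1), whereas the low-affinity sites are unchanged by construction. Replacing K_MG_LOW with a nonzero value is one way to explore the weaker Mg2+ effect on the low-affinity sites discussed above.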
2. Ca2+-Binding Structure

Troponin C is a single peptide with 159 residues. It contains 44 acidic and 15 basic residues (Collins et al., 1973, 1977) and shows extensive homology with the Ca2+-binding sequences in parvalbumin, which had been analyzed by X-ray diffraction at the amino acid level (Kretsinger and Nockolds, 1973). Four homologous regions are termed sites I, II, III, and IV, beginning from the N terminus (Fig. 5). Each site is composed of a Ca2+-coordinating loop rich in aspartic and glutamic acid residues and two α-helical segments on either side of the loop. This helix-loop-helix structure is called the E-F hand, a name primarily derived from the Ca2+-binding region in parvalbumin. Kretsinger and Barry (1975) have predicted a tentative troponin C structure based on the Ca2+-binding structures of parvalbumin.

Troponin C has been crystallized at acidic pH (~5) in the presence of divalent cations such as Mn2+ (5 mM) or Ca2+ (10 mM) (Mercola et al., 1975; Strasburg et al., 1980; Herzberg and James, 1985; Sundaralingam et al., 1985). Analysis at high resolution of the turkey skeletal troponin C crystal containing Ca2+ (Herzberg and James, 1985) and of the chicken skeletal troponin C crystal containing Mn2+ (Sundaralingam et al., 1985) showed that the molecular conformations of the two crystals were almost identical; both consisted of two domains connected by a nine-turn helix without direct interdomain interactions, and were about 7 nm in length. Among the four helix-loop-helix structures in the crystal, the two in the C-terminal domain coordinated divalent cations but the two in the N-terminal domain did not, though an excess of divalent cations was present in the crystallizing solution. The tertiary structure of the C-terminal domain was very similar to the Ca2+-coordinating structure in the parvalbumin crystal.
The higher order structure of troponin C at physiologically neutral pH, however, would be somewhat different from that of the troponin C crystal prepared under acidic conditions, in which the affinity for Ca2+ of both high- and low-affinity sites of troponin C is known to be greatly lowered (Hincke et al., 1978).
FIG. 5. Schematic representation of the predicted secondary structure of troponin C from rabbit skeletal muscle. Four calcium-binding domains are termed I, II, III, and IV, and are numbered from the N terminus. Each is composed of a helix-loop-helix structure. Helical regions are indicated by zig-zag lines and a loop is indicated as a half circle, according to the prediction by Nagano et al. (1982). The position of residues at both edges of each helical region is numbered below the sequence. The approximate interacting sites for troponin I are shown above the figure (Grabarek et al., 1981). Troponin T is considered to interact with the N-terminal half of the troponin C sequence (not shown in the figure). Positions of representative fragments (TH1, TR1E, TR2C, CB9) of troponin C are indicated in the lower part of the figure.
Two classes of Ca2+-binding sites along the troponin C sequence have been identified. Studies using optical and spin probes at Cys-98 (site III) have suggested that site III has high affinity for Ca2+ (Potter et al., 1976; Nagy and Gergely, 1979). Investigation of the Ca2+-binding properties of various proteolytic fragments of troponin C revealed that a tryptic fragment TR2 (residues 89-159) preserves the property of the high-affinity sites (Leavis et al., 1978). Findings from other fragments also indicate that sites III and IV are the Ca2+-Mg2+ sites of troponin C and that sites I and II are the low-affinity sites (Fig. 5). Reid et al. (1980) synthesized the peptide loop of 12 amino acids of the ideal high-affinity site and its C-terminal side helix. This peptide, when the N-terminal residues were acetylated, showed Ca2+-induced changes in circular dichroism (CD) and ultraviolet (UV) difference spectra. These changes were amplified in hydrophobic media favorable for helix formation. The sensitive Ca2+ concentration range was between lo-’ and M. The CB9 fragment (residues 84-135) did not have the Ca2+-binding ability (Nagy et al., 1978). Since the tryptic fragment TR2 showed the property of the Ca2+-Mg2+ site in troponin C, the minimal unit for Ca2+ binding seems to be a set of two Ca2+-binding sites: the N-terminal and C-terminal halves. The cooperative binding of Ca2+ to troponin C was shown in reconstituted thin filament and actomyosin (Grabarek et al., 1983). Troponin C and its fragments TH1 (containing sites I-III) and TR2C (containing sites III-IV) were labeled with the fluorescent probes dansylaziridine (DANZ) at Met-25 or 5-(iodoacetamidoethyl)aminonaphthalene-1-sulfonic acid (AEDANS) at Cys-98. An apparent Hill coefficient of 1.0-1.1 was obtained for the Ca2+-induced fluorescent change in troponin C and fragments (TH1, TR2C) and their ternary complex with troponin I-T.
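What a Hill coefficient near 1 means in practice can be illustrated with the Hill equation, in which the fractional response is x^n / (1 + x^n) with x = [Ca2+]/K_half. The sketch below is a generic illustration, not the fitting procedure of the cited work; the half-maximal concentration used in the example (1 µM) and the function names are hypothetical. It also computes how many fold the Ca2+ concentration must rise to take the response from 10% to 90%, which for the Hill equation equals 81^(1/n).

```python
def hill_response(ca, k_half, n):
    """Fractional response predicted by the Hill equation.

    ca and k_half are free Ca2+ concentrations in the same units;
    n is the Hill coefficient.
    """
    x = (ca / k_half) ** n
    return x / (1.0 + x)


def ca_fold_10_to_90(n):
    """Fold increase in [Ca2+] needed to go from 10% to 90% response.

    At 10% response x^n = 1/9 and at 90% response x^n = 9, so the
    concentration ratio is (9 * 9) ** (1 / n) = 81 ** (1 / n).
    """
    return 81.0 ** (1.0 / n)


if __name__ == "__main__":
    k_half = 1e-6  # hypothetical half-maximal Ca2+ concentration (M)
    for n in (1.0, 3.0, 4.0):  # illustrative Hill coefficients
        half = hill_response(k_half, k_half, n)
        print(f"n = {n:.0f}: response at K_half = {half:.2f}, "
              f"10%-90% span = {ca_fold_10_to_90(n):.1f}-fold")
```

A noncooperative response (n = 1) therefore spreads over an 81-fold range of Ca2+ concentration, whereas a coefficient of 4 compresses the same transition into a 3-fold range.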
But the fluorescent change of DANZ, which reflects Ca2+ binding to sites I and II, became clearly cooperative when troponin C or the TH1 fragment was bound to the thin filament, and the cooperativity increased in the presence of myosin up to a Hill coefficient of 3.0. No cooperativity was observed for the fluorescent change of AEDANS, which reflects Ca2+ binding to sites III and IV. Analysis of Ca2+ sensitivity of actomyosin ATPase in the presence of tropomyosin and various combinations of the above-mentioned ternary complexes of troponin indicated a Hill coefficient of about 4, a value that was always higher than that determined by fluorescent changes. Grabarek et al. (1983) proposed that cooperative Ca2+ binding to the low-affinity sites of troponin C regulates the myosin-actin interaction. But an apparent Hill coefficient of the ATPase was 4 even in the presence of fragment TR2C, which did not give the cooperative fluorescent change
in the thin filament. Some other cooperative processes might also be involved.

3. Ca2+-Induced Structural Change

Ca2+-induced structural change in troponin C has been investigated extensively (see Gergely and Leavis, 1980; Leavis and Gergely, 1984). The large spectral changes were found to depend upon Ca2+ binding to the high-affinity sites of troponin C (Head and Perry, 1974; van Eerd and Kawasaki, 1972; Kawasaki and van Eerd, 1972; Carew et al., 1980; Nagy and Gergely, 1979). The results indicated that Ca2+ binding to the high-affinity sites increased the helix content from approximately 30 to 50%. Magnesium ions produced similar structural changes. These changes were explained as the formation of two α-helical segments of 8-10 residues (Nagy and Gergely, 1979). Circular dichroism studies showed that helical change occurred in the region containing a Phe residue (Head and Perry, 1974; Nagy and Gergely, 1979), indicating that the N-terminal region helix of site III and C-terminal region helix of site IV are newly formed by Ca2+. Binding of Ca2+ to the low-affinity sites caused a small change in the structure corresponding to the elongation of the α helix by only 1 or 2 residues (Nagy and Gergely, 1979). Magnesium ions did not cause any significant change in the structure of low-affinity sites.

Binding of Ca2+ to troponin C produces heat (Potter et al., 1977; Yamada, 1978; Yamada and Kometani, 1982). A calorimetric study (Yamada and Kometani, 1982) on metal-free troponin C showed that troponin C, in the absence of Mg2+, has at least three different classes of Ca2+-binding sites: one site with highest affinity (Kapp = 10^10-10^8 M^-1), one site with high affinity (Kapp = 10^6-10^7 M^-1), and two sites with low affinity (Kapp = 10^5-10^6 M^-1). Results obtained at different temperatures were explained in terms of the overall structural change of troponin C as follows.
In the absence of Mg2+, Ca2+ binding to one site with the highest affinity causes the internalization of hydrophobic residues and tightening of the troponin C molecule. Calcium ion binding to the second high-affinity site, on the contrary, exposes the hydrophobic residues on the molecule, with a substantial loosening of overall molecular structure. Calcium ion binding to the low-affinity sites showed a moderate decrease in hydrophobicity and a moderate tightening of molecular structure; this would mean the stabilization of hydrophobic residues into the molecule without significant change in the overall molecular structure. These results were in good agreement with proton magnetic resonance spectroscopy on troponin C in the absence of Mg2+ (Levine et al., 1977). In the presence of Mg2+, the changes in high-affinity sites caused
by Ca2+ were greatly altered, whereas those in low-affinity sites were not much affected. When calorimetry was carried out on troponin instead of troponin C, heat produced by Ca2+ binding was greater and the difference in the binding affinities of four Ca2+-binding sites on the titration curve, which was significant in troponin C in the absence of Mg2+, disappeared.

4. Interaction with Troponin I

Troponin C interacts with troponin I irrespective of Ca2+ concentration but the interaction is strengthened by Ca2+ (Ebashi, 1974a,b; Potter and Gergely, 1974; Ohnishi et al., 1975). In the presence of 8 M urea, the interaction was observed only in the presence of Ca2+ (Head and Perry, 1974). Grabarek et al. (1981) investigated the binding properties of troponin C fragments with other components and the effect of these fragments on the Ca2+-sensitizing activity of actomyosin ATPase, using 10 fragments of troponin C obtained by cleavage with trypsin, thrombin, and cyanogen bromide. Two fragments, TR2C (residues 89-159) and TR2E (residues 101-159), bound to troponin I irrespective of Ca2+ concentrations in the absence of urea. Thus the Ca2+-independent interacting site was considered to be localized in the C-terminal half of troponin C. In the presence of urea and Ca2+, the TR2C fragment, but not the TR2E fragment, bound troponin I. This indicates that the region of residues 89-100, which is absent in TR2E, is involved in the interaction in the presence of both Ca2+ and urea. On the other hand, fragments CB9 (residues 89-135), TR2'E (residues 101-153), TH1 (residues 1-120), TR1E (residues 1-100), TR1C (residues 9-84), and CB8 (residues 46-77) interacted with troponin I only in the presence of Ca2+ but without urea. Fragment TH2 (residues 121-159) did not interact with troponin I.
Studies on the relative reactivity of 9 Lys residues (20, 37, 52, 84, 88, 90, 136, 140, 153) of troponin C to acetylation (Hitchcock, 1981) demonstrated that, in troponin C complexed with troponin I or present in native troponin complex, the reactivity of Lys-52, (-84, -88, -90), and (-136, -140) was reduced compared with that of the isolated molecule. The presence of Ca2+ made Lys-37 more reactive and Lys (-136, -140) less reactive. This indicated that the regions containing these Lys residues are related to the interaction with troponin I. These Lys residues belong to the N-terminal helix of sites II, III, and IV (Fig. 5). Based on the above results, Grabarek et al. (1981) suggested that at least three regions along the troponin C sequence, i.e., the N-terminal side helices of sites II, III, and IV, are involved in the interaction with
troponin I (Fig. 5). The N-terminal helix of site III has been shown to be involved in the Ca2+-dependent interaction with troponin I, which is present even in the presence of 8 M urea (Weeks and Perry, 1978; Grabarek et al., 1981). Site IV seems to show Ca2+-independent interaction with troponin I. The N-terminal helix of site II, which contains Lys-52, would be involved in Ca2+-dependent interaction, for the N-terminal half fragments, i.e., TR1C (residues 9-84) and CB8 (residues 46-77), also bound to troponin I in the presence of Ca2+ (Evans et al., 1980). In view of the findings that the two regions CF1 (residues 11-49) and CN4 (residues 96-116) in the troponin I sequence interact with troponin C, three interacting regions in troponin C would form two groups on troponin I. It was already shown that the CB9 fragment, which contains residues 89-100, binds to the CN4 region of troponin I in a Ca2+-dependent fashion (Weeks and Perry, 1978). At the same time, a synthetic peptide (residues 10-25 of troponin I) was found to interact with troponin C only in the presence of Ca2+ (Katayama and Nozaki, 1982). Thus it is plausible that this region interacts with the N-terminal half of troponin C. Since CF1 (residues 1-47) or CN5 (residues 1-25) of troponin I, which includes residues 10-25, binds to troponin C Ca2+-independently, it is probable that the N-terminal region of troponin I also interacts Ca2+-independently with the C-terminal half of troponin C. Further detailed investigations, however, will be required for a definite conclusion.

Another interesting finding concerning troponin C fragments, in relation to the interacting properties, is the effect of these fragments on the Ca2+ sensitivity of actomyosin ATPase. Troponin C activates (neutralizes) the ATPase of actomyosin with tropomyosin, troponin I, and troponin T in the presence of Ca2+ (Grabarek et al., 1981).
The three proteolytic fragments TH1, TR2C, and TR1E showed essentially the same Ca2+-activating action as troponin C, though to a lesser extent: 50% activation was obtained at 2 X lO-'M Ca2+ with troponin C and all fragments. This neutralizing action was about 60% for TH1, 30% for TR2C, and 20% for TR1E. A slight effect was also shown by CB9 and TR1C. When the mixture of the fragments (TH1, TR2C, TR1E) with troponin I and T was made in the presence of urea, almost the same neutralizing action as that of troponin C was observed. All three fragments contain the region of residues 89-100, which shows Ca2+-dependent interaction with troponin I and should be critical for the Ca2+-regulating action. This finding, in turn, indicates that both fragments containing only high-affinity or low-affinity sites can sensitize actomyosin to Ca2+ to some extent. This might be related to the finding that both high- and low-affinity sites can regulate the contractile response (Kohama, 1979, 1980).
Troponin C binds to troponin I irrespective of Ca2+ concentration under ordinary conditions. However, it was demonstrated that, at low KCl near 0 mM, the removal of divalent cations from the high-affinity sites of troponin C by ethylenediaminetetraacetic acid (EDTA) resulted in the partial dissociation of troponin C from the myofibril (Zot and Potter, 1982). The removal of Ca2+ by ethylenebis(oxyethylenenitrilo)tetraacetic acid (EGTA) from the high- and low-affinity sites did not dissociate troponin C from the myofibril. Increasing the KCl concentration reduced the extent of extraction of troponin C; troponin C was not extracted at all near physiological ionic strength (100 mM KCl). Since the troponin C molecules in muscle always contain Ca2+ or Mg2+, even at resting low Ca2+ concentrations at 150 mM KCl (Kitazawa et al., 1982), this interaction should be present irrespective of the change in Ca2+ concentration in muscle. It was suggested that this interaction is involved in the maintenance of the integrity of troponin components.
5. Interaction with Troponin T

The interaction between troponin C and troponin T was first observed as the Ca2+-dependent turbidity change of the suspension of troponin C and troponin T (Ebashi et al., 1973). But the range of Ca2+ concentration which caused the turbidity change was higher than that required for contractile regulation. A calcium-sensitive interaction with troponin T was also reported in a study of nuclear magnetic resonance (Ohnishi et al., 1975). Troponin C binds to troponin T irrespective of Ca2+ concentration. Affinity chromatography of troponin C-Sepharose showed that the interaction is stable even at high ionic strength (1.0 M KCl) in the absence and presence of Ca2+ (Tanokura et al., 1983). Two fragments of troponin C, i.e., TH1 (residues 1-120) and TR1E (residues 1-100), were reported to bind to troponin T; binding was Ca2+-dependent. Weak interaction was also observed for TR1C (residues 9-84) and CB9 (residues 84-135) fragments (Grabarek et al., 1981). These findings suggest that troponin T binds mainly to the N-terminal half of troponin C. It is still possible that another minor binding site on the C-terminal half is involved (Ohara et al., 1980). Hitchcock (1981) demonstrated that the reactivity of Lys-52 and (-84, -88, -90) of troponin C was reduced by the formation of a complex with troponin T in the presence of Ca2+. Since these lysine residues are believed to be located at the interacting regions for troponin I, the significance of this finding remains to be explored.
D. Troponin T
Troponin T is a tropomyosin-binding component necessary for the Ca2+ regulation of contractile interaction (Fig. 2). This protein interacts with tropomyosin, troponin I, and troponin C. The interaction with tropomyosin and troponin I is involved in the Ca2+-regulatory mechanisms, whereas the significance of the interaction with troponin C is still obscure (Ebashi et al., 1973; Horwitz et al., 1979; Ohtsuki et al., 1981). The amino acid sequence determined by Pearlstone et al. (1976, 1977a,b,c) showed that this protein is a single peptide of 259 residues with a high percentage of charged residues: 61 acidic and 70 basic. Although these charged residues are distributed over the entire sequence of troponin T, the N-terminal region is rich in acidic residues and the C-terminal region rich in basic residues. There are three serine residues in troponin T susceptible to phosphorylation (Fig. 6). When the troponin complex is isolated from muscle, N-terminal acetyl-Ser of troponin T is usually phosphorylated by a specific troponin kinase (Kumon and Villar-Palasi, 1979; Gusev et al., 1980). Isolated troponin T is phosphorylated by phosphorylase kinase at Ser-149 and Ser-156 and, to a lesser extent, at Ac-Ser-1 (Moir et al., 1977). The action of troponin T in the regulatory process is not affected by phosphorylation.

1. Chymotryptic Subfragments

Troponin T is soluble in the presence of KCl only at concentrations higher than 0.4 M and precipitates at physiological ionic strength. Hence the interacting properties of troponin T have been investigated by using relatively small soluble fragments prepared by treatment with cyanogen bromide, pepsin, and BNPS-skatole (Jackson et al., 1975; Pearlstone and Smillie, 1977, 1978, 1980). In 1979 it was found that troponin T is split into two soluble subfragments, troponin T1 and troponin T2 (or T2α), by mild treatment with chymotrypsin.
The polar axial orientation of these subfragments along the thin filament as well as some interacting properties were investigated (Ohtsuki, 1979b). Troponin T1 is a fragment of the N-terminal 158 residues (MW 18,700) and troponin T2 is a fragment of the C-terminal 101 residues (MW 11,900) (Tanokura et al., 1981). An affinity chromatographic study revealed that troponin T1 binds solely but strongly to tropomyosin, and that troponin T2 interacts with tropomyosin, troponin I, and troponin C (Ohtsuki, 1979b; Katayama, 1979; Tanokura and Ohtsuki, 1982; Tanokura et al., 1982, 1983; Pearlstone and Smillie, 1981, 1982, 1983). Three troponin T2β subfragments are produced from troponin T2 by the removal of C-terminal residues by chymotrypsin. The helical content of troponin T is about 40% as determined by circular dichroism. Among the chymotryptic fragments, only troponin T1 showed a high percentage of α helix (68%), whereas all troponin T2 subfragments showed a helical content of about 20%. This is consistent with the view that the higher order structure of all troponin T2 subfragments would be about the same (Tanokura et al., 1982).

FIG. 6. Schematic representation of the amino acid sequence of troponin T. Approximate binding sites for tropomyosin, troponin I, and troponin C are shown in the upper part of the figure. The helical region (residues 68-150) of troponin T1 and the C-terminal region of troponin T2 are the binding sites for tropomyosin. The troponin I binding site is the localized region including residues 223-227 of troponin T2. A wide region of troponin T2 interacts with troponin C; the C-terminal half region binds to troponin C Ca2+-independently and the N-terminal half Ca2+-dependently. Positions of serine and acetylserine residues susceptible to phosphorylation are also shown. Positions of chymotryptic subfragments are shown in the lower part of the figure. Tick marks below the open bar indicate the positions of Lys residues reactive to acetylation; long tick marks indicate residues whose reactivity is affected significantly by troponin I or troponin C. For details see text.

TABLE II
Calcium-Sensitizing Action and Interaction of Chymotryptic Troponin T Subfragments(a)

Troponin T subfragment    Immobilized protein
(residues)                Tropomyosin    Troponin I    Troponin C (+Ca2+)    Troponin C (-Ca2+)
T (1-259)                 1.0            Urea          Urea                  Urea
T1 (1-158)                1.0            0.0           0.05                  0.0
T2αs (156-259)            0.2            Urea          Urea                  1.0
T2α (159-259)             0.2            Urea          Urea                  0.5
T2βI (159-242)            0.1            Urea          Urea                  0.5
T2βII (159-227)           0.1            Urea          Urea                  0.2
T2βIII (159-222)          0.0            0.1           Urea                  0.2

Ca2+-sensitizing action (entries in the order printed, top to bottom; rows left blank in the original cannot be placed): +, -, +, -, -

(a) Numbers in the table are molar concentrations of KCl at which subfragments were eluted from affinity chromatography columns upon increasing the concentration stepwise. "Urea" means that the subfragments were not eluted by a solution containing 1.0 M KCl, but were eluted with 6 M urea (Nakamura et al., 1981; Ohtsuki et al., 1981; Tanokura et al., 1983; Onoyama and Ohtsuki, 1986).

The most characteristic feature of the properties of chymotryptic troponin T subfragments is that the Ca2+-sensitizing action of troponin T is retained only in the smaller subfragment, troponin T2 (Table II) (Nakamura et al., 1981). Study under improved experimental conditions has shown that the action of troponin T2 is almost the same as, or a little less than, that of troponin T (Onoyama and Ohtsuki, 1986). It was also found that troponin T2βI, produced by removal of the C-terminal residues from troponin T2 with further chymotryptic treatment, showed neither Ca2+-sensitizing action nor tropomyosin-binding ability (Ohtsuki, 1979b; Ohtsuki et al., 1981; Onoyama and Ohtsuki, 1986). This strongly indicates that the Ca2+-sensitizing action of troponin T is closely correlated with the binding to tropomyosin through the troponin T2 region. This regulatory role of troponin T is also discussed in Section II,F,2.

Seven chymotryptic subfragments of troponin T have been isolated: troponin T1 (residues 1-158), troponin T2α (residues 159-259), troponin T2αs (residues 156-259), troponin T2βI (residues 159-242), troponin T2βII (residues 159-227), troponin T2βIII (residues 159-222), and troponin T2γ (residues 243-259) (Tanokura et al., 1981, 1982) (Table II, Fig. 6). A fragment consisting of residues 156-227 was also reported (Morris and Lehrer, 1981; Pearlstone and Smillie, 1981).

The pathway of chymotryptic fragmentation of troponin T is as follows (Tanokura et al., 1982). Chymotrypsin splits the peptide bond between Tyr-158 and Leu-159 of most troponin T molecules, producing troponin T1 and troponin T2α. Some troponin T is cleaved between Tyr-155 and Ser-156, producing a small amount of troponin T2αs. The region between Tyr-155 and Leu-159 is situated at the end of a long helical region predicted to lie between residues 122 and 146 (Pearlstone et al., 1976) or residues 90 and 148 (Nagano et al., 1980, 1982), and hence may be exposed to the solvent. The fact that the peptide bond between Tyr-158 and Leu-159 is more susceptible to chymotrypsin than that between Tyr-155 and Ser-156 would be due to the higher order structure. The above interpretation is in accord with the finding that Ser-149 and Ser-156 are easily phosphorylated by phosphorylase kinase (Moir et al., 1977). Troponin T1 is relatively resistant to chymotrypsin, whereas troponin T2 (T2α) is digested by chymotrypsin into three kinds of smaller fragments, troponin T2βI, -II, and -III, through different pathways depending on the presence or absence of urea. In the presence of 6 M urea, troponin T2 is digested into troponin T2βI and then troponin T2βIII. On the other hand, in the absence of urea, most troponin T2 is digested into troponin T2βII through troponin T2βI, without any production of the troponin T2βIII fragment, and some troponin T2 is cleaved directly into troponin T2βII. These findings indicate that, in the absence of urea, the regions of residues Phe-242-Ser-243 and Tyr-227-Asp-228 are exposed to the solvent, at least for troponin T2.
The presence of urea in the solvent would change the three-dimensional structure of troponin T2 and make the peptide bond of Leu-222-Lys-223 susceptible to chymotrypsin.
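The fragment boundaries and digestion routes described above lend themselves to a small bookkeeping check. The following is a minimal sketch, not part of the original study: the dictionary layout, the ASCII fragment names, and the helper functions are illustrative assumptions, while the residue ranges and the urea-dependent routes are those quoted in the text.

```python
# Residue ranges (1-based, inclusive) of troponin T and its chymotryptic
# subfragments as quoted in the text. Names are ASCII stand-ins (assumed).
FRAGMENTS = {
    "T": (1, 259),
    "T1": (1, 158),
    "T2alpha_s": (156, 259),
    "T2alpha": (159, 259),
    "T2beta_I": (159, 242),
    "T2beta_II": (159, 227),
    "T2beta_III": (159, 222),
    "T2gamma": (243, 259),
}

# Digestion routes of troponin T2 described in the text, by solvent condition.
PATHWAYS = {
    "6 M urea": [["T2alpha", "T2beta_I", "T2beta_III"]],
    "no urea": [["T2alpha", "T2beta_I", "T2beta_II"], ["T2alpha", "T2beta_II"]],
}


def length(name):
    """Number of residues in the named fragment."""
    start, end = FRAGMENTS[name]
    return end - start + 1


def residues_removed(parent, child):
    """C-terminal residues lost when `parent` is trimmed to `child`."""
    return FRAGMENTS[parent][1] - FRAGMENTS[child][1]


if __name__ == "__main__":
    for name, (start, end) in FRAGMENTS.items():
        print(f"{name:11s} {start:3d}-{end:<3d} {length(name):3d} residues")
    for condition, routes in PATHWAYS.items():
        for route in routes:
            print(f"{condition}: " + " -> ".join(route))
```

Under these ranges, T1 and T2α together account for all 259 residues, T2βI lacks the 17 C-terminal residues of T2α, and T2βII and T2βIII differ by the five residues 223-227.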
2. Interaction with Tropomyosin

The interaction of troponin T with tropomyosin was first shown by viscosity measurements of mixtures of the two components (Yamamoto and Maruyama, 1973) as well as by measurements of the electron microscopic density increase of specific regions in tropomyosin paracrystals or crystal nets (Greaser et al., 1973; Margossian and Cohen, 1973). Both chymotryptic subfragments, troponin T1 and T2α, bind to tropomyosin (Ohtsuki, 1979b; Tanokura et al., 1982; Pearlstone and Smillie, 1981, 1982). Thus two regions of troponin T are involved in the
binding to tropomyosin. Detailed examinations revealed that the stability of the binding in relation to the ionic strength is in the order troponin T (residues 1-259) = T1 (residues 1-158) > T2αs (residues 156-259) > T2α (residues 159-259) > T2βI (residues 159-242) > T2βII (residues 159-227) > T2βIII (residues 159-222) (Tanokura et al., 1982, 1983) (Table II). Troponin T1 and troponin T showed the same binding stability for tropomyosin. A 26-kDa subfragment of troponin T devoid of the 45 N-terminal residues has the same affinity for tropomyosin as does troponin T. This finding has limited the possible region of tropomyosin binding to residues 46-158, which lie within the troponin T1 sequence (Ohtsuki et al., 1984). This finding is consistent with the previous results by Jackson et al. (1975) that two CNBr fragments of troponin T, CNB1 (residues 1-151) and CNB2 (residues 71-151), bind to tropomyosin. These researchers thus concluded that the region of residues 71-151, which is rich in α helix (~80%; Pearlstone and Smillie, 1977), is the main binding site for tropomyosin. Study of the secondary structure also demonstrated that the continuous region of residues 68-150 has a strong tendency to form α-helical structure (Nagano et al., 1980, 1982). Nagano et al. (1980) suggested that the helical region of residues around 90-150 forms a triple-stranded binding complex with the coiled coil of tropomyosin.

Troponin T2 binds to tropomyosin. The binding stability is weakened by the removal of the 17 or more C-terminal residues from troponin T2 (Ohtsuki, 1979b; Tanokura et al., 1982). Since all troponin T2 fragments (T2α, T2βI, T2βII, and T2βIII) contain the same range of α helix and β structure, the higher order structure of these troponin T2 subfragments should be similar and the C-terminal region of troponin T2 should be involved in the interaction with tropomyosin (Fig. 6).
Troponin C competes with tropomyosin for binding to troponin T2, and the competition is Ca2+-dependent (Pearlstone and Smillie, 1983; Tanokura and Ohtsuki, 1984). The binding of troponin T to tropomyosin is not at all affected by troponin C. Chong and Hodges (1982b) demonstrated that a heterobifunctional reagent attached to Cys-190 of α-tropomyosin is photolyzed to a troponin T2βII region. The T2βII region must therefore be in close steric contact with tropomyosin, though the mutual affinity is relatively low (Tanokura et al., 1982, 1983). Morris and Lehrer (1981) also demonstrated that a fluorescent probe attached to Cys-190 of α-tropomyosin is affected by troponin T2 but not by troponin T1. These findings support the view that the troponin T2 region is actually involved in the binding to tropomyosin. In summary, troponin T binds to tropomyosin through two regions,
i.e., the α-helical region of residues 70-150 in troponin T1 and, under physiological conditions, the C-terminal region of troponin T2 (Fig. 6).

3. Interaction with Troponin I

The binding of troponin T to troponin I is considered to be one of the necessary interactions for the Ca2+-regulatory function (Horwitz et al., 1979). Close apposition of troponin T with troponin I in the troponin complex was demonstrated by using a cross-linking reagent (Hitchcock, 1975b), and their interaction was then observed by Horwitz et al. (1979). Troponin I was shown to bind to troponin T2 but not to troponin T1 (Katayama, 1979; Tanokura and Ohtsuki, 1982; Tanokura et al., 1983). All troponin T2 subfragments, except troponin T2βIII, form stable complexes with troponin I (Tanokura and Ohtsuki, 1982; Tanokura et al., 1983) as deduced by affinity chromatography and coprecipitation with actin-tropomyosin-troponin I. Analysis of the amino acid sequence showed that the difference between the troponin T2βII and T2βIII fragments is only five residues (Lys-Arg-Ala-Lys-Tyr) and no significant difference in secondary structure could be detected. Thus it is very probable that the region of residues 223-227 of troponin T is directly and critically involved in the interaction with troponin I (Fig. 6). Hitchcock et al. (1981) investigated the relative reactivity of 35 Lys residues (total 39 residues) of troponin T to acetylation in native troponin, troponin T-C, and troponin T-I, and found that the Lys residues showing the greatest change in reactivity are concentrated between residues 114 and 223 (Fig. 6). The relative reactivities of most of these residues were reduced in both troponin T-C and troponin T-I binary complexes as well as in native troponin. Lys-185, (-200, -202), and -223 were exceptions and showed lower reactivity in troponin T-I than in troponin T-C, and would presumably be involved in the interaction with troponin I.
Lys-226 and -241, on the other hand, were less reactive in troponin T-C than in troponin T-I. Chong and Hodges (1982c) showed that a heterobifunctional reagent attached to Cys-48 and -64 residues of troponin I is in close apposition with the troponin T2 region within a distance of 1.4 nm. At present, it seems that the main binding region of troponin T for troponin I is the localized region containing Lys-223 (Fig. 6); the removal of the five residues 223-227 greatly diminished the affinity of troponin T2 for troponin I and the relative reactivity of Lys-223 was most depressed among all Lys residues when it was complexed with troponin I. This binding to troponin I would be stabilized by the interaction through the broad region of residues 185-223.
This interaction within the troponin complex has been indicated to be influenced by Ca2+ (Chong and Hodges, 1982c; Hitchcock-de Gregori, 1982). This suggests that the three components are packed closely together. It was also suggested that the formation of the complex of troponin I with troponin T potentiated the interaction of troponin T with tropomyosin (Tanokura and Ohtsuki, 1984).

4. Interaction with Troponin C

Troponin T binds to troponin C irrespective of Ca2+ concentration (Grabarek et al., 1981; Tanokura et al., 1983). The mixture of troponin T and C is turbid in the absence of Ca2+ and becomes transparent in the presence of Ca2+ (Ebashi et al., 1973; Ebashi, 1974a,b). This increase in turbidity in the absence of Ca2+ was interpreted as a change of conformation of the troponin T-C complex and a subsequent formation of aggregates. The Ca2+ concentration necessary for clearing the turbidity is definitely higher than that for the regulation of contractile interaction. Troponin C was shown to bind to troponin T2 but not to troponin T1 (Ohtsuki, 1979b; Tanokura et al., 1983; Pearlstone and Smillie, 1983). This is consistent with the previous affinity chromatographic studies using cyanogen bromide and other fragments of troponin T, in which the region of residues 176-258 was implicated in the binding to troponin C (Pearlstone and Smillie, 1978). Ohara et al. (1980) showed that the region of residues 206-258 was cross-linked to troponin C. Studies using chymotryptic troponin T2 fragments have given more detailed information about the interaction. In the presence of Ca2+, all troponin T2 fragments (αs, α, βI, βII, βIII) bind stably to troponin C. On the other hand, the binding stability is weakened in the absence of Ca2+; the decrease in stability is most remarkable in troponin T2βIII.
This would mean that the region of troponin T2βIII is mostly involved in the Ca2+-dependent interaction with troponin C and that the C-terminal portion of troponin T2 contributes to the Ca2+-independent interaction (Tanokura et al., 1983). This view is in accord with the previous finding that a fragment of residues 1-205 bound to troponin C in a Ca2+-dependent manner (Pearlstone and Smillie, 1978). Thus the broad region of the troponin T2 sequence should be related to the interaction with troponin C in the binary complex. One interesting aspect of the interaction of troponin C with troponin T2 is its Ca2+-dependent competition with tropomyosin (Pearlstone and Smillie, 1983; Tanokura and Ohtsuki, 1984). When the mixture of troponin C-troponin T2 was applied to tropomyosin-Sepharose 4B, both troponin C and troponin T2 were eluted in the presence of Ca2+ but only
troponin C was eluted in the absence of Ca2+. This indicates that troponin C competes with tropomyosin for troponin T2 and that the interaction is Ca2+-dependent. These findings are interpreted by assuming that troponin C and tropomyosin compete for a similar region on the troponin T2 sequence. In the absence of Ca2+, troponin C binds to troponin T2 more weakly than tropomyosin does, and thus troponin C is not able to bind to troponin T2. This Ca2+-dependent competition is observed for all troponin T2 subfragments but not for original troponin T. Troponin T was retained in a tropomyosin-Sepharose column even in the presence of troponin C at all Ca2+ concentrations. Thus further study is required to establish whether or not troponin T shows Ca2+-dependent interaction with troponin C in native troponin.
E. Tropomyosin

Tropomyosin is a very stable protein, which was isolated and extensively investigated by Bailey (1946, 1948). This protein binds to troponin T and F-actin. Early physicochemical studies revealed that tropomyosin assumes a filamentous form of about 40 nm with a high percentage (~90%) of α helix (Tsao et al., 1951; Cohen and Szent-Gyorgyi, 1957; Ooi et al., 1962). It was also shown that this protein consists of two almost identical subunits of 33,500 Da (Woods, 1967). It is now believed that tropomyosin is a filamentous molecule composed of a coiled coil of two parallel α helices, each approximately 40 nm in length (McLachlan and Stewart, 1976a) (Fig. 7). In the thin filament, end-to-end filaments of tropomyosin, each molecule of which binds one globular troponin, are distributed along the grooves of the actin double strands.

FIG. 7. Predicted coiled-coil structure of α-tropomyosin. The coiled-coil arrangement of two α-tropomyosin subunits of the region of residues 150-240 is shown (McLachlan and Stewart, 1976a). Amino acid residues are indicated by the one-letter code: A (Ala), C (Cys), E (Glu), F (Phe), G (Gly), H (His), I (Ile), K (Lys), M (Met), N (Asn), Q (Gln), R (Arg), S (Ser), T (Thr), V (Val), and Y (Tyr).

1. Subunits and Primary Structure

The SDS-gel electrophoretic pattern of tropomyosin showed two bands, designated α and β, around 35,000 Da (Cummins and Perry, 1973). The ratio of the two subunits varied from tissue to tissue: mostly α subunit in fast skeletal muscle, equimolar α and β subunits in slow skeletal muscle, and exclusively α subunit in cardiac muscle (Bronson and Schachat, 1982). The tropomyosin molecule consists of αα or αβ dimers. A ββ molecule produced in vitro showed the same Ca2+-sensitizing action as that of αβ- or αα-tropomyosin (Cummins and Perry, 1973). Hence the α and β subunits are physiologically equivalent. The amino acid sequence of α and β subunits of tropomyosin from rabbit skeletal muscle has been determined (Stone and Smillie, 1978; Sodek et al., 1978; Mak et al., 1980). Both subunits consist of a single peptide of 284 amino acid residues. The N-terminal residue is acetylated methionine, and the C-terminal residue is Ile. Assuming that all residues form an α-helical structure, the subunit length is estimated as 42 nm. The α-tropomyosin subunit contains one Cys residue at position 190 (Fig. 7), and the two Cys-190 residues in αα-tropomyosin were found to form a disulfide bridge when tropomyosin was oxidized (Stewart, 1975b; Lehrer, 1975). Thus two α subunits in tropomyosin are aligned in parallel and almost in register, and the subunit length roughly equals the length of the tropomyosin molecule.
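The length estimate above is simple arithmetic and can be reproduced as a rough check. This is a sketch under a stated assumption: the axial rise of 0.15 nm per residue is the usual textbook value for an α helix and is not given in the text; the residue count and the Cys position are those quoted above.

```python
RESIDUES_PER_SUBUNIT = 284   # from the sequence data quoted in the text
RISE_PER_RESIDUE_NM = 0.15   # assumed textbook axial rise of an alpha helix
CYS_POSITION = 190           # single Cys residue of the alpha subunit


def helix_length_nm(n_residues, rise_nm=RISE_PER_RESIDUE_NM):
    """Length of a fully alpha-helical chain of n_residues."""
    return n_residues * rise_nm


subunit_length = helix_length_nm(RESIDUES_PER_SUBUNIT)
cys_fraction = CYS_POSITION / RESIDUES_PER_SUBUNIT

if __name__ == "__main__":
    print(f"estimated subunit length: {subunit_length:.1f} nm")
    print(f"Cys-190 lies at {cys_fraction:.2f} of the chain from the N terminus")
```

With this assumed rise the estimate comes out slightly above the quoted 42 nm; a marginally smaller rise per residue, as is often used for coiled coils, would bring the two into closer agreement. The same numbers place Cys-190 about two-thirds of the way along the chain.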
Fourier analysis of the α subunit demonstrated strong fourteenth-order peaks in the profiles of both negatively charged and nonpolar amino acids, with a period of 19 2/3 residues and with an overall repeat length of 275 amino acid residues. This repeat length is shorter than the sequence of 284 residues (McLachlan and Stewart, 1975, 1976a). This periodicity could be subdivided into two sets of alternating bands, designated α and β, each of which would be related to the binding site for actin. The nine C-terminal residues are not included in the periodicity and would be related to the formation of end-to-end bonding of tropomyosin molecules. Thus the repeating period of the end-to-end filament of tropomyosin is approximately 40 nm.

The amino acid sequence of the β-tropomyosin subunit is mostly the same as that of α-tropomyosin, with the difference being 39 residues. Most residues were replaced by the same family of amino acid residues, with the exception that Ser-229 and His-276 in the α subunit are replaced by Glu-229 and Asn-276 in the β subunit. Hence, β-tropomyosin is a little more acidic than α-tropomyosin (Mak et al., 1980).
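The relation between the fourteenth-order periodicity, the repeat length of about 275 residues, and the nine residues left over can be verified with exact fractions. The sketch below assumes a period of 19 2/3 residues, which is the value for which fourteen periods give the quoted repeat length; the variable names are illustrative.

```python
from fractions import Fraction

PERIOD = Fraction(59, 3)   # 19 2/3 residues per period (assumed, see lead-in)
N_PERIODS = 14             # fourteenth-order periodicity
SEQUENCE_LENGTH = 284      # residues per subunit, from the text

repeat_length = PERIOD * N_PERIODS                     # exact value: 826/3
residues_outside = SEQUENCE_LENGTH - int(repeat_length)

if __name__ == "__main__":
    print(f"overall repeat: {float(repeat_length):.2f} residues")
    print(f"residues outside the repeat: {residues_outside}")
```

Fourteen periods span roughly 275 residues, leaving nine residues of the 284-residue chain outside the periodicity, in line with the nine C-terminal residues mentioned above.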
2. Crystals and Paracrystals

Bailey (1948) crystallized tropomyosin in ammonium sulfate solution at pH 5.4. A typical orthorhombic dodecahedral crystal contained 5% protein. The structure of the crystal was investigated by X-ray diffraction and electron microscopy (Caspar et al., 1969; Higashi-Fujime and Ooi, 1969; Cohen et al., 1971). Tropomyosin molecules in the crystal are associated head-to-tail in polar filaments with a 40-nm period. The filaments are periodically bent as a consequence of the cross-connections at two sites alternately separated by about 23 and 17 nm in the 40-nm axial period. The projected image of the crystal in the electron microscope shows a kite-shaped meshwork consisting of long (23 nm) and short (17 nm) arms. Troponin was found to be located at the middle portion of the long arms (Higashi-Fujime and Ooi, 1969; Cohen et al., 1971). Several types of two-dimensional crystal nets were also formed (Caspar et al., 1969). All of these crystals are composed of a polar tropomyosin filament with a 40-nm unit of periodicity. When tropomyosin solutions are dialyzed against divalent cations such as Mg2+ or Ca2+, fibrous paracrystals are formed. Paracrystalline tactoids under these conditions have a characteristic axial repeat period of 40 nm (Cohen and Longley, 1966). The axial periodicity is formed by the alternating arrangement of wide bright (27 nm) and narrow dark (13 nm) bands, each of which is surrounded by clear boundary lines. Two sets of oppositely directed tropomyosin filaments are arranged in the paracrystalline structure and each boundary line corresponds to the position of
the head-to-tail bonding portion of the two sets of tropomyosin filaments (Ohtsuki, 1974). Troponin increases the density of the 12-nm width region at the middle of the wide bright band (Higashi-Fujime and Ooi, 1969; Ohtsuki, 1974). The width of density increase by troponin would correspond to the axial size of the main portion of the molecule. It was then demonstrated that the position of Cys-190 of the α-tropomyosin paracrystal is situated at the center of wide bright bands (Stewart, 1975a; Stewart and Diakiw, 1978). These findings suggest that the troponin molecule is located at the region in tropomyosin that is about two-thirds of the molecular length (27 nm) from the N terminus.

3. Interactions

Filamentous tropomyosin molecules form long end-to-end filaments by ionic interaction. Thus aqueous solutions of tropomyosin are highly viscous. The viscosity of the solution is decreased by the addition of salt, and increased by the addition of troponin (Ebashi and Kodama, 1965). The C-terminal nine residues of α-tropomyosin are considered to be involved in the bond formation; consequently the nine N-terminal residues would also be involved (McLachlan and Stewart, 1976a). Tropomyosin binds stoichiometrically to F-actin in the presence of 0.1 M KCl (Laki et al., 1962), but does not bind in the presence of 20 mM KCl or 0.8 mM MgCl2 (Martonosi, 1962). High concentration of KCl (0.6 M) dissociates tropomyosin from actin (Spudich and Watt, 1971). The binding of tropomyosin to F-actin showed strong positive cooperativity in the presence of 2.4 mM Mg2+ (Yang et al., 1979). Increase in the concentration of Mg2+ and/or KCl increases the binding affinity of tropomyosin for actin, but the cooperativity is not affected. On the other hand, the presence of heavy meromyosin or S1 also increased the binding affinity of tropomyosin for F-actin but, in this case, the cooperativity decreases slightly.
Troponin decreases the exchangeability of tropomyosin attached to an actin filament (Drabikowski et al., 1968). Under certain experimental conditions, tropomyosin inhibits actomyosin ATPase activity by binding to actin. This inhibition is potentiated by troponin I with some increase in the binding of tropomyosin to actin (Eaton et al., 1975). Troponin I may connect tropomyosin and actin, or it may induce tropomyosin to bind to actin. Tropomyosin shows strong affinity for troponin. Ebashi and Kodama (1965) first demonstrated that troponin greatly increased the viscosity of tropomyosin solution. Studies have indicated that troponin increases the polymerization of tropomyosin and stabilizes the polymers. Analysis of troponin-tropomyosin paracrystals clearly indicated that troponin, and consequently troponin T, is located around Cys-190, a
position about two-thirds (27 nm) of the molecular length from the N terminus of tropomyosin of 40 nm length (as mentioned in Section II,E,2). This formed the basis of the structural relation of troponin to tropomyosin. The fluorescence of α-tropomyosin labeled with N-(1-aminonaphthyl-4)maleimide at Cys-190 was enhanced by troponin T and troponin T2 (residues 159-259 of troponin T) (Morris and Lehrer, 1981). Chong and Hodges (1982b) observed that a heterobifunctional photoaffinity probe, attached to Cys-190 of α-tropomyosin, was photolyzed only to troponin T, but not to other troponin components. Absence of Ca2+ increased the extent of labeling by a factor of 1.7. These findings, indicating that the T2 region of troponin T is in close apposition to Cys-190 of tropomyosin, are consistent with the results obtained by the above paracrystal analysis. At the same time, it was shown that nonpolymerizable tropomyosin, which is devoid of the 11 C-terminal residues, did not bind to the CB1 fragment (residues 1-151) of troponin T, and that the reactivity of Tyr-261 and Tyr-267 of α-tropomyosin to iodination was decreased in the presence of the CB1 fragment (Mak and Smillie, 1981; Pato et al., 1981). This suggests that the binding region for troponin T extends to the C-terminal region of tropomyosin from the region around Cys-190. But results obtained by use of the proteolytic fragments instead of native troponin to deduce the physiological binding site on tropomyosin should be treated with caution, for fragments very often show different properties from the native molecule. Actually we could repeat these findings by using troponin T1 but not as yet with native troponin. Further study, especially on the native thin filament, should provide conclusions of physiological significance.

4. Effects on Actomyosin
Tropomyosin influences the actomyosin ATPase activity. At high ATP concentration, where actomyosin shows the clearing response to ATP, tropomyosin depresses actomyosin superprecipitation and actomyosin ATPase activity. At low ATP concentration, where actomyosin superprecipitates, tropomyosin elevates the actomyosin ATPase activity. Tropomyosin elevates the acto-S1 ATPase at low ATP concentration (Bremel and Weber, 1972). These phenomena are explained by assuming that the formation of a rigor complex between actin and S1 not only activates one actin but also activates other actin molecules allosterically through tropomyosin. Tawada et al. (1975) reported that tropomyosin, from which C-terminal residues were removed by carboxypeptidase treatment, does not polymerize but still binds to actin. The mixture of troponin and unpolymerizable tropomyosin gave the Ca2+ sensitivity to actomyosin ATPase activity, but with lowered cooperativity.

FIG. 8. Calcium regulation of muscle contraction. TN·C, Troponin C; TN·I, troponin I; TM, tropomyosin; TN·T, troponin T.

F. Some Aspects of Calcium-Regulatory Mechanism

As mentioned in Section II,A, Ca2+ regulation by troponin-tropomyosin consists of the depression of contractile interaction between myosin and actin by troponin-tropomyosin in the absence of Ca2+ and the release of the inhibition by Ca2+ (Ebashi et al., 1968, 1969) (Fig. 2). In the regulatory processes of the troponin complex, the essential mechanism is considered to consist of the Ca2+-dependent interaction between troponin C and troponin I and of the inhibitory interaction of troponin I with actin-tropomyosin; the two interactions are complementary to each other (Potter and Gergely, 1974; Ebashi, 1974a; Perry, 1979). The regulatory interactions and their change by Ca2+ are shown schematically in Fig. 8. In the absence of Ca2+, troponin I interacts with troponin C weakly, but it binds strongly to actin-tropomyosin and inhibits the actin from interacting with myosin. When Ca2+ acts on troponin C, the interaction of troponin C with troponin I is strengthened and the interaction of troponin I with actin weakens and consequently the contractile interaction is activated. Some aspects of the Ca2+-regulatory processes are discussed in the following sections.

1. Inhibitory Action of Troponin I

The inhibitory action of troponin I on actomyosin-tropomyosin represents the depression of contractile interaction of myosin-actin by troponin-tropomyosin in the absence of Ca2+. In the absence of tropomyosin, the inhibitory action of troponin I is weak, whereas about 80% of the ATPase of actomyosin is inhibited by troponin I in the presence of
tropomyosin (Syska et al., 1976). The requirement for tropomyosin has also been shown for the inhibitory action of the cyanogen bromide fragment CN4 (residues 96-116). Troponin I binds to actin (Potter and Gergely, 1974; Hitchcock, 1975a). Evidence indicates an interaction of troponin I with tropomyosin (Ebashi et al., 1974; Eaton et al., 1975; Katayama, 1980), though this interaction cannot be detected under usual experimental conditions. However, the requirement of tropomyosin for the inhibitory action strongly indicates that troponin I forms a complex with both actin and tropomyosin with high affinity. Since the small 21-residue CN4 fragment of troponin I has essentially the same inhibitory action as troponin I, both actin and tropomyosin would interact with this limited CN4 region of troponin I and consequently the interacting region should include this relatively localized three-dimensional space, compared with the overall size of these proteins. The inhibitory action of isolated troponin I is neutralized by the addition of stoichiometric amounts of troponin C, regardless of Ca2+ concentration (Perry et al., 1973). It has also been shown that the complex of troponin C-troponin I binds to actin-tropomyosin in the absence of Ca2+ but not in the presence of Ca2+ (Potter and Gergely, 1974). Since the complex of troponin I-troponin C does not bind to an actin filament without tropomyosin even in the absence of Ca2+, this binary complex may interact with tropomyosin. In this respect, the action of the binary complex of troponin C-I is definitely different from that of the ternary complex of troponin C-I-T, in which troponin I binds to actin and tropomyosin in the absence of Ca2+. In the binary complex, the conformation of troponin I should be changed by the interaction with troponin C in such a way that the inhibitory CN4 region cannot interact with actin. The CN4 region of troponin I would be, at least partly, masked by troponin C regardless of Ca2+ concentration.
This state of the troponin I-C complex corresponds to that of the thin filament at relatively high Ca2+ concentration. This view is supported by the finding that the binding of Ca2+ to the binary troponin I-C complex is stronger than that to the ternary complex in the thin filament of crayfish skeletal muscle (Wnuk et al., 1984). Troponin I intensifies the affinity of isolated troponin C for Ca2+, and, in turn, the CN4 region of troponin I would be bound by troponin C and not available for inhibitory sites on actin-tropomyosin even at low Ca2+ concentration. Thus the troponin C-troponin I complex does not cause the Ca2+-sensitive change in actin-tropomyosin. Troponin T is required, in addition to troponin C and troponin I, for the appearance of the Ca2+ sensitivity of the contractile interaction.
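The reconstitution results summarized in this section can be condensed into a small truth table. The function below is only a qualitative sketch: it reduces graded effects (for example, the weak inhibition by troponin I without tropomyosin, or the roughly 80% inhibition with it) to yes or no, ignores the inhibition that tropomyosin alone can produce under certain conditions, and uses a function name and argument names that are illustrative assumptions.

```python
def contraction_inhibited(tni, tnc, tnt, tropomyosin, calcium):
    """Return True if the actin-myosin interaction is (largely) inhibited.

    Each argument is a bool: the first four state whether the component is
    present, the last whether Ca2+ is present at an activating level.
    Qualitative simplification of the results described in the text.
    """
    if not tni:
        return False        # no inhibitory component present
    if not tropomyosin:
        return False        # troponin I inhibition is weak without tropomyosin
    if not tnc:
        return True         # troponin I + tropomyosin inhibits at any Ca2+ level
    if not tnt:
        return False        # troponin C neutralizes troponin I regardless of Ca2+
    return not calcium      # full complex: inhibited only in the absence of Ca2+


if __name__ == "__main__":
    for ca in (False, True):
        state = contraction_inhibited(True, True, True, True, calcium=ca)
        print(f"full complex, Ca2+ present={ca}: inhibited={state}")
```

Only the last branch depends on Ca2+, which is the point made above: Ca2+ sensitivity of the contractile interaction appears only when troponin T is present together with troponin I, troponin C, and tropomyosin.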
2. Regulatory Role of Troponin T

When troponin I forms a complex with troponin C, it interacts with tropomyosin-actin in the absence of Ca2+ and dissociates in the presence of Ca2+, but the contractile interaction itself stays activated regardless of Ca2+ concentration. Hence troponin I, by binding to troponin C, acquires sensitivity to Ca2+ with regard to the interaction with tropomyosin-actin, but without giving the contractile interaction Ca2+ sensitivity. This may be due to the Ca2+-independent binding of troponin C to the CN4 region of troponin I, which makes the CN4 region unable to show the inhibitory interaction with actin-tropomyosin. Troponin T, in addition to troponin I and troponin C, is required for the Ca2+ regulation of the contractile interaction in the presence of tropomyosin. Thus, troponin T serves to integrate the physiological function of the troponin complex. In other words, the control nature of the troponin complex is localized in this tropomyosin-binding component. Investigations of the chymotryptic troponin T subfragments have demonstrated that the integrating or Ca2+-sensitizing action of troponin T is retained mostly in the C-terminal chymotryptic subfragment, troponin T2 (T2α) (Onoyama and Ohtsuki, 1986). Thus troponin T2 represents the physiologically essential portion of the troponin T molecule. Troponin T2βI, a fragment of troponin T2 from which the 17 C-terminal residues were removed, was shown not to give the Ca2+-sensitizing action at all (Ohtsuki et al., 1981; Onoyama and Ohtsuki, 1986). The affinity of troponin T2βI for tropomyosin is much weaker than that of troponin T2, while this fragment interacts with troponin C and troponin I in the same way as troponin T2. Thus it is very plausible that the binding of the C-terminal region of troponin T2 to tropomyosin has a critical role in the function of troponin T.
One essential mechanism of the action of troponin T is to adjust the position of troponin I-C in relation to tropomyosin and actin in such a way that troponin I can bind to actin-tropomyosin in the absence of Ca2+ (Ohtsuki, 1980). Although the troponin T1 region also binds strongly to tropomyosin and helps to connect the troponin complex to tropomyosin, this interaction is not essential for the integrating action. The binding of the troponin T2 region to tropomyosin gives the most Ca2+ sensitivity. At the same time, the fact that troponin C can neutralize the inhibitory action of troponin I but cannot neutralize the inhibitory action of the troponin I-troponin T complex in the absence of Ca2+ suggests that the higher level structure of troponin I is also influenced by troponin T. Troponin T, probably in cooperation with actin-tropomyosin, would induce some conformational change in troponin I in a direction opposite to that
caused by troponin C in the absence of Ca2+ (Ohtsuki and Nagano, 1982). The strong binding of the troponin T1 region to tropomyosin should undoubtedly contribute to the steric stabilization of the position of the whole troponin complex on tropomyosin-actin. The Ca2+ sensitivity of actomyosin ATPase by troponin T2 is a little less cooperative than that by troponin T, and thus the depressive effect of free Mg2+ on the Ca2+ sensitivity of the actomyosin ATPase with troponin T2 is less remarkable. Troponin T1 may be involved in these aspects. The maximum activation of actomyosin ATPase by troponin T2 in the presence of tropomyosin-troponin I-C was a little less than that by troponin T. Troponin T1 itself depressed the ATPase and superprecipitation of actomyosin-tropomyosin-troponin I-C at all Ca2+ concentrations. The mixture of troponin T1 and T2 also depressed the ATPase at all Ca2+ concentrations to the same extent as troponin T1. This suggests that the native position of troponin T1 in the thin filament is different from that of isolated troponin T1 in the reconstituted filament. The interaction of troponin T with troponin I is considered necessary for the Ca2+ regulation of contractile interaction (Horwitz et al., 1979; Ebashi, 1980). Troponin T also interacts with troponin C. However, the physiological significance of this interaction is still obscure. Ebashi et al. (1973) observed that the turbid suspension of the mixture of troponin T-troponin C became clear on increasing the Ca2+ concentration. But the effective concentration of Ca2+ was definitely higher than that for the Ca2+ regulation of contractile interaction. It was demonstrated, in the investigation of various combinations of hybrid troponins from skeletal and cardiac sources, that the Ca2+ sensitivity of the solubility of the binary complex of troponin T-C was inversely related to the extent of Ca2+-sensitive activity of superprecipitation containing corresponding troponin components (Ebashi, 1974b).
It was also suggested that this Ca2+-dependent change of the troponin T-C complex contributes to the maintenance of the integration of the troponin complex at extremely high Ca2+ concentrations beyond the physiological range (Ebashi, 1974a). It was reported that troponin C shows a Ca2+-dependent competition with tropomyosin for troponin T2 (Pearlstone and Smillie, 1983; Tanokura and Ohtsuki, 1984). This Ca2+-dependent competition was, however, observed not only for troponin T2 but also for smaller fragments of troponin T2 (i.e., troponin T2β), which do not retain the Ca2+-sensitizing action. Thus the involvement of this competitive interaction in the regulatory mechanism seems rather unlikely at present. This is in accord with the finding that troponin C does not bind to troponin T bound to tropomyosin at any Ca2+ concentration (Tanokura and Ohtsuki, 1984). This also suggests that the interaction between troponin T and troponin C is weak or absent in native troponin and that troponin T binds to tropomyosin through both troponin T1 and T2 regions regardless of Ca2+ concentration, though definitive experimental evidence has not been reported.

TABLE III
Ca-Sensitizing Activities of Various Kinds of Hybrid Troponins
Columns: Combination(a) (TN·T, TN·C, TN·I); Ca2+ sensitivity(b)
Ca2+ sensitivity(b), in the order listed: 15.4, 7.4, 8.4, 32.4, 3.4, 4.6, 2.6, 20.8
(a) Origin of the subunit: c, cardiac; s, skeletal; TN·T, troponin T; TN·C, troponin C; TN·I, troponin I.
(b) Ca2+ sensitivity is taken as the ratio of the time required to reach half-maximal superprecipitation in the absence of Ca2+ to that in the presence of Ca2+. Values for original cardiac troponins ranged from 20 to 27. For other details see Ebashi (1974a,b).

3. Comparative Aspects and Hybrid Troponins
Since the comparative aspects of troponin components and tropomyosins from vertebrate and invertebrate sources have been comprehensively reviewed previously (Obinata et al., 1981), this section refers briefly to the several aspects having some relevance to the Ca2+-regulatory mechanisms. Troponin components of bovine cardiac muscle were separated by essentially the same procedure as for skeletal troponin components (Tsukui and Ebashi, 1973). Studies on the Ca2+ sensitivity of superprecipitation with various combinations of troponin components from rabbit skeletal and bovine cardiac muscles showed that most hybrid troponins gave lower Ca2+ sensitivity of superprecipitation (Ebashi, 1974a,b) (Table III). The lowest value was obtained with the combination of skeletal troponin C-T and cardiac troponin I. However, when cardiac troponin T was combined with skeletal troponin I-C, the calcium sensitivity was exceptionally high and exceeded those obtained by the native
skeletal and cardiac troponin complexes. This particular effect of cardiac troponin T was also confirmed on actomyosin ATPase activity with troponin T from porcine cardiac muscle (Yamamoto, 1983). This combination of hybrid troponin gave the maximum activation of the ATPase at about 1.5 times that of native troponin. This indicates that, in the presence of the troponin combination of skeletal C-I-cardiac T, Ca2+ not only removes the inhibition by troponin I but also further activates the contractile interaction. The regulation by troponin-tropomyosin is activated not only by Ca2+ but also by other divalent cations such as Sr2+, though a much higher concentration is required than for Ca2+ (Ebashi and Endo, 1968; Ebashi et al., 1968). The observation in studies of superprecipitation that Sr2+ sensitivity of cardiac contraction is higher than that of skeletal muscle (Ebashi et al., 1968) was confirmed in the experiments of tension development of glycerinated muscles (Kitazawa, 1976). It was also shown that the affinity of cardiac troponin for Sr2+ is higher than that of skeletal troponin (Ebashi et al., 1968; Kohama, 1979). This difference in binding to Sr2+ could explain the difference between the Sr2+ sensitivity of skeletal and cardiac muscles. In a study of hybrid troponins from rabbit skeletal and porcine cardiac muscles, all hybrid troponins containing cardiac troponin C gave high Sr2+ sensitivity of actomyosin ATPase, whereas those containing skeletal troponin C gave low Sr2+ sensitivity (Yamamoto, 1983). This result has established firmly that the Sr2+ sensitivity of cardiac or skeletal types is determined solely by the species of troponin C. The Sr2+ sensitivity of actomyosin ATPase with cardiac troponin complex was not affected by the phosphorylation of troponin I, whereas the Ca2+ sensitivity was depressed to some extent (Yamamoto, 1983). The change in higher level structure of cardiac troponin C induced by Sr2+ should be different from that induced by Ca2+.
Calmodulin, a calcium-binding protein homologous to troponin C (Kakiuchi et al., 1970; Cheung, 1970; Kretsinger and Barry, 1975), has been shown to neutralize the inhibitory action of troponin I (without troponin T) on the actomyosin ATPase only in the presence of Ca2+ (Amphlett et al., 1976; Dedman et al., 1977; Yamamoto, 1983). The presence of troponin T weakens the neutralizing action of calmodulin (Yamamoto, 1983); Ca2+ or Sr2+ sensitivity of actomyosin ATPase is depressed and shifted to higher concentration. Thus calmodulin cannot replace troponin C under physiological conditions in the presence of troponin I-T. All the observations described above strongly indicate that the Ca2+-regulating action of the troponin complex is not determined solely by the Ca2+-binding ability of troponin C but is modulated by other components, namely troponin I and troponin T. The regulatory processes
within the troponin complex are presumably mediated through a subtle change in the interaction among the three components by Ca2+. The contraction of ascidian smooth muscle was found to be regulated through the troponin-tropomyosin system. But the action of troponin components was different from that of troponin of vertebrate striated muscles (Endo and Obinata, 1981). In this system, the inhibitory action of troponin I (MW 24,000) is less remarkable compared with vertebrate skeletal troponin I, and troponin C (MW 18,000) does not neutralize the inhibition by troponin I. But upon further addition of troponin T (MW 33,000) in the concomitant presence of all three components and tropomyosin, the contractile interaction of myosin and actin is activated. In this case, the action of troponin T has some similarity with that of the above-mentioned cardiac troponin T hybridized with skeletal troponin C-I. Since actomyosin, without these regulatory proteins, is inhibited regardless of Ca2+ concentration, Ca2+ and troponin-tropomyosin are activators for contraction of actomyosin in ascidian smooth muscle. In this respect, the type of Ca2+ regulation of ascidian smooth muscle is the same as that for vertebrate smooth muscles which do not contain troponin (Ebashi, 1980).

G. Structural Aspects of Troponin and Tropomyosin
1. Arrangement of Troponin and Tropomyosin in the Thin Filament
The first indication concerning the structural relationship of troponin-tropomyosin was obtained by immunoelectron microscopy of troponin localization in myofibrils (Ohtsuki et al., 1967; Ohtsuki, 1974). The antibody to troponin formed 24 narrow transverse striations, with regular intervals of 38 nm, along the whole length of the thin filament region of chicken skeletal myofibril. This indicates that troponin distributes in register along the thin filaments extending from the Z-line structure. In view of the finding that tropomyosin was a filamentous molecule of about 40 nm length (Tsao et al., 1951; Ooi et al., 1962), it was then deduced that the head-to-tail filaments of fibrous tropomyosin molecules underlie the intermittent distribution of globular troponins. This was supported by the demonstration that ferritin-labeled troponin is located at specific sites on the fibrous tropomyosin paracrystal with a 40-nm period (Nonomura et al., 1968). These considerations led us to propose the first model of the thin filament as a complex of troponin-tropomyosin-actin (Ebashi et al., 1969) (Fig. 9A). In this model, two head-to-tail filaments of tropomyosin run almost in register along the grooves of actin double strands. Troponin attaches to a specific region of each tropomyosin and thus is
FIG. 9. Structural models of thin filaments. (A) The first model; (B) the refined model. Labeled components: troponin, actin, and tropomyosin; scale bar, 10 nm. For detailed explanations, see text.
distributed at 38-nm intervals. Each troponin period covers the length of a chain of seven actin monomers along a single strand. Hence the stoichiometry between actin, tropomyosin, and troponin is 7:1:1 in the thin filament, a ratio consistent with biochemical findings (Ebashi et al., 1968). This model clearly accounted for the calcium regulatory pathway over which information on calcium binding to troponin is transmitted to actin molecules through tropomyosin. Essential features of the thin filament as proposed above have been confirmed by studies including X-ray diffraction of muscle and optical diffraction as well as three-dimensional reconstruction of electron micrographs (Ohtsuki and Wakabayashi, 1972; Spudich et al., 1972; Huxley, 1973; Haselgrove, 1973; Wakabayashi et al., 1975). The model has been refined by two lines of study (Ohtsuki, 1974). The first refinement was made by analysis of the troponin-tropomyosin relationship in the paracrystalline structure (discussed in Section II,E,2). The analysis has confirmed that troponin lies approximately two-thirds of the molecular length (i.e., 27 nm) from one end of a filamentous tropomyosin molecule of 40-nm length. Another refinement was based on consideration of the arrangement of actin molecules in the thin filament. Corresponding molecules in two long-pitched strands of actin in the filament are shifted relative to each other by a distance of half the
size of an actin molecule (Hanson and Lowy, 1963). Hence two troponin-tropomyosin filaments in the thin filament should also be shifted relative to each other, by a distance of one-half the actin size. Based on the above considerations, the refined model of the thin filament is presented in Fig. 9B (Ohtsuki, 1974; Ebashi, 1980). In this second model, the position of end-to-end bonding of tropomyosins is indicated, and the two troponin-tropomyosin filaments in the grooves of actin double strands are shifted by half the actin size relative to each other. The shift between two troponin-tropomyosin filaments has been verified by X-ray diffraction studies on invertebrate striated muscles (Wray et al., 1978; Maeda et al., 1979; Namba et al., 1980). The fact that the distance between the top of the thin filament and the position of the nearest troponin is 27 nm (i.e., two-thirds of the troponin period length) (Ohtsuki, 1974) indicates that the top of the filament is situated at the left in the model of the thin filament of Fig. 9B. There are three possible structural factors responsible for the alignment of the two troponin-tropomyosin filaments in register in the facing grooves of actin double strands. The presence of tropomyosin at the extreme edge of the filament, indicated by the immunoelectron microscopy of troponin localization, would be one of the determinant factors for the distribution of tropomyosin filaments in register (Ohtsuki, 1974). At the same time, it has also been shown that troponin molecules align almost in register even in reconstituted troponin-tropomyosin-actin filaments, which are much longer than the native thin filament of 1 μm (Ohtsuki and Wakabayashi, 1972; Ohtsuki, 1974). In such a case, it is rather unlikely that the position of binding of the tropomyosin molecule is determined solely by the edge of the filament.
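The geometric relations quoted for the thin-filament model can be checked with a few lines of arithmetic. The following is only a minimal sketch, not part of the original analysis: the constant and function names are hypothetical, the inputs (38-nm troponin period, seven actin monomers per period along one strand, 40-nm tropomyosin length, troponin at two-thirds of that length) are taken from the text, and the size of one actin monomer is approximated here as the period divided by seven.

```python
# Illustrative arithmetic only; all names below are hypothetical.
TROPONIN_PERIOD_NM = 38.0     # spacing of troponin along the thin filament
ACTINS_PER_PERIOD = 7         # actin monomers per period along a single strand
TROPOMYOSIN_LENGTH_NM = 40.0  # approximate length of one tropomyosin molecule


def actin_spacing_nm():
    """Axial length covered by one actin monomer along a single strand.

    Assumption: the 38-nm period is shared equally by seven monomers.
    """
    return TROPONIN_PERIOD_NM / ACTINS_PER_PERIOD


def strand_stagger_nm():
    """Relative shift of the two troponin-tropomyosin filaments (half an actin)."""
    return actin_spacing_nm() / 2.0


def troponin_offset_nm():
    """Position of troponin, two-thirds of the way along tropomyosin."""
    return TROPOMYOSIN_LENGTH_NM * 2.0 / 3.0


def stoichiometry():
    """Actin : tropomyosin : troponin within one period of one strand."""
    return (ACTINS_PER_PERIOD, 1, 1)


if __name__ == "__main__":
    print(f"actin spacing along a strand: {actin_spacing_nm():.1f} nm")
    print(f"stagger between the strands:  {strand_stagger_nm():.1f} nm")
    print(f"troponin offset:              {troponin_offset_nm():.0f} nm")
    print("actin:tropomyosin:troponin =", ":".join(map(str, stoichiometry())))
```

Under these assumptions the two troponin-tropomyosin filaments are displaced by less than 3 nm, which is small compared with the 38-nm period and is consistent with the striations appearing almost in register.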
It is then attractive to assume that the binding of troponin-tropomyosin along one groove of the actin filament favors the binding of tropomyosin to the other facing groove in register. Thus conformational changes of actin, if any, would be related to this cooperative binding of troponin-tropomyosin. This view is in accord with the study of three-dimensional reconstruction of electron microscopic images showing that the shape of actin molecules in a pure actin filament is different from that in a tropomyosin-actin filament (Toyoshima and Wakabayashi, 1985). Seven single-stranded actin molecules in the 38-nm period, which covers the length of a single tropomyosin molecule, might each be affected differently from the others, depending on the distance from troponin. The possible cooperative involvement of actin molecules for the alignment of tropomyosin filaments would be correlated with the finding indicating that the Ca2+-regulatory unit is composed of 11-16 actin units, depending on the Ca2+ concentration (Nagashima and Asakura, 1982). Involvement of troponin as a
third possible factor cannot be ruled out at the present time. Troponin I among three troponin components would be the likely candidate for the determinant factor. Troponin I would change the conformation of one to three among seven actin molecules attached to one tropomyosin. This change in the conformation of actin may have some role in adjusting the position of troponin-tropomyosin in the opposite groove of actin double strands. It is well known that the length of thin filaments is constant in myofibrils. This is also reflected by the finding that anti-troponin always formed 24 striations along each thin filament region extending from the Z line in chicken skeletal myofibril (Ohtsuki, 1974). The thin filament of rabbit skeletal muscle shows 26 anti-troponin striations. In relation to the morphogenesis of thin filaments, application of the immunocytological procedure called the saponin method revealed that, in myofibrils of the embryonic chicken breast muscle in situ, more than 24 anti-troponin striations were formed along the thin filament region (Ohtsuki, 1979a). Longer thin filaments were present even after hatching and disappeared within 3 weeks. One explanation for this finding is that, through some unknown mechanism, the thin filaments once formed shorten after hatching to a constant length of 1 μm. The other is that thin filaments formed only after hatching have a constant length of 1 μm. In crab muscle, where the A-band width is variable in the same muscle fiber, the length of the thin filament is proportional to that of the thick filament in the corresponding sarcomere (Franzini-Armstrong, 1970). This suggests that thin filament length is determined by the interaction with thick filaments. Actually actin filaments in an actomyosin thread became more homogeneous in length after repeated contractile interactions (Maruyama and Kimura, 1972). These considerations make the former explanation rather plausible.

2. Arrangement of Troponin Components
The localization of troponin components along the thin filament of chicken skeletal muscle was investigated by the use of immunoelectron microscopy (Ohtsuki, 1975). Antibodies against troponin C and troponin I formed narrow transverse striations along the thin filament at the same position as the antibody against native troponin. On the other hand, anti-troponin T formed a wide band in each period along the thin filament; the width of the band reached a maximum at 20 nm. Detailed examination revealed that each band is composed of a pair of transverse striations. The position of one striation at the top side of the filament coincided with the binding sites of the antibodies against other components, and the other striations were about 13 nm apart in the direction of the Z line. Thus troponin T was indicated to have a definite axial length on the order of 10 nm. This asymmetric distribution of troponin components also indicated that troponin molecules are oriented along the axis of the filament. The fine axial orientation of troponin T was then demonstrated using the antibodies against chymotryptic troponin T subfragments, i.e., troponin T1 (MW 18,700) and troponin T2 (MW 11,900) of chicken skeletal muscle (Ohtsuki, 1979b). Anti-troponin T2 stained the same position as that stained by the antibodies against troponin I and C, whereas anti-troponin T1 stained a position 13 nm toward the Z line from that stained by anti-troponin T2. Thus each member of the striation pair formed by anti-troponin T was proved to be derived from different regions in the molecule. If the positions of the antibodies are taken to represent the overall position of troponin components in this case, the troponin T molecule occupies the region of tropomyosin over the width of 13 nm including the edge of the Z-band side. Troponins C, I, and T2 would form a triangle complex in the plane either perpendicular or parallel to the filament axis. The other extreme edge of the troponin T molecule would be available for the antibody against troponin T1. For this explanation to apply, the shape of the troponin molecule would be far from globular. It should also be considered, however, that the distance of 13 nm between the positions of anti-troponin T1 and T2 is within the size range of an antibody molecule. This made another explanation possible: namely, that troponin T is situated at the same location as troponin T2-I-C, in accord with the analysis of the troponin-tropomyosin paracrystal, but that the major portion of the troponin T1 region is embedded within the troponin complex and exposed only in the direction of the Z line.
Thus the anti-troponin T1 antibody binds to the antigenic region only from the Z-band side, and the overall position of the antibody is to be shifted from the antigenic region by 10 nm in the direction of the Z line. On separated thin filaments of rabbit skeletal muscle, anti-troponin T1 formed a relatively wide striation at the same position as in the case of chicken skeletal muscle. The Fab fragment of the antibody formed a narrow striation at a position corresponding to the top side of the filament of the wide striation formed by γ-globulin (I. Ohtsuki, unpublished results). This is consistent with the latter explanation at least for the orientation of the antigenic regions in troponin T1. A possible arrangement of troponin T1 and T2 regions in the thin filament is shown in Fig. 10.
FIG. 10. Possible arrangement of troponin components in the thin filament. The thin filament is viewed obliquely from the Z line side toward the top of the filament of the far side (Ohtsuki, 1980).
Flicker et al. (1982) reported that the rotary shadowed image of troponin is rod-shaped with a small globular head of 26.5 nm in length and that troponin T is a filamentous molecule of 18 nm. However, it has been revealed that the situation is not so simple, and that troponin molecules show polymorphic shapes (Ebashi and Ohtsuki, 1983). Filamentous particles of about 30 nm length as reported by Flicker et al. (1982) were observed. But, in addition, particles of other shapes, i.e., completely globular particles and globular particles with tails of various length, were also found; the size of the globular portion seemed inversely related to the tail length. In fresh preparations of troponin, globular particles predominated. On the other hand, filamentous particles predominated in aged preparations, though some filamentous particles could be observed even in the fresh preparation. The polymorphic shapes of troponin are shown in Fig. 11. These findings indicate that the higher order structure of native troponin is maintained by subtle interactions among three components, and is easily unfolded or folded under various experimental conditions. Further investigations with several different techniques on troponin in native thin filament are required for a more detailed discussion.

FIG. 11. Polymorphic shapes of troponin molecules observed by rotary shadowing (×660,000).
3. Structure Prediction
In order to clarify the Ca2+-regulatory processes in troponin-tropomyosin, it is important to know the molecular structure of these protein systems under physiological conditions. Results on the higher order structures of troponin components and tropomyosin have been documented by several reviews (Perry, 1979; Gergely and Leavis, 1980; Leavis and Gergely, 1984). In this article, we refer only to a series of attempts to construct and visualize the higher order structure of troponin components and tropomyosin from the known amino acid sequences by predictive methods. This will be useful for integrating the experimental evidence at the structural level. Kretsinger and Barry (1975) presented a predictive model of troponin C. This model was based on the presence of sequences in troponin C homologous with the two Ca2+-binding helix-loop-helix structures of the parvalbumin crystal, which they had analyzed at high resolution (Kretsinger and Nockolds, 1973). Skeletal troponin C has four Ca2+-binding regions along the sequence (Collins et al., 1973). It was also shown that troponin C of bovine cardiac and rabbit slow skeletal muscle has three homologous regions along the sequence (van Eerd and Takahashi, 1976; Wilkinson, 1980). In the above model of skeletal troponin C, two parvalbumin structures are packed so that their hydrophobic surfaces are in contact with each other. The distance between the Ca2+ sites is 1.0 nm within the parvalbumin structural unit and 3.0 nm between two structural units. The two C-terminal Ca2+-binding structures (high-affinity sites) of this model were shown to be essentially the same as those of troponin C crystals from chicken and turkey skeletal muscle prepared at acidic pH (~5) (Herzberg and James, 1985; Sundaralingam et al., 1985; see also discussion in Section II,C,2). Extensive analysis of the structure of tropomyosin has also been done. It was already shown that tropomyosin is a coiled coil of two parallel α helices of about 40-nm length. Thus the decisive predictive analysis has been made by McLachlan and Stewart (1975, 1976a). According to their analysis, two α-helical subunits of 284 residues form a coiled-coil structure by the interaction between hydrophobic residues or ionic residues of two helices.
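As a rough plausibility check on the two figures quoted above for tropomyosin (284 residues per chain, about 40 nm in length), the axial rise per residue can be estimated. The sketch below is illustrative only; the names are hypothetical, and the value of 0.15 nm per residue for an ideal straight α helix is a standard textbook figure assumed here, not a number taken from the text.

```python
# Illustrative check only; names are hypothetical.
RESIDUES_PER_CHAIN = 284     # from the text (McLachlan and Stewart)
MOLECULE_LENGTH_NM = 40.0    # approximate tropomyosin length, from the text
ALPHA_HELIX_RISE_NM = 0.15   # assumption: textbook rise per residue of an alpha helix


def rise_per_residue_nm():
    """Axial rise per residue implied by the quoted length and chain size."""
    return MOLECULE_LENGTH_NM / RESIDUES_PER_CHAIN


def straight_helix_length_nm():
    """Length expected if all 284 residues formed one straight alpha helix."""
    return RESIDUES_PER_CHAIN * ALPHA_HELIX_RISE_NM


if __name__ == "__main__":
    print(f"implied rise per residue: {rise_per_residue_nm():.3f} nm")
    print(f"straight-helix length:    {straight_helix_length_nm():.1f} nm")
```

The implied rise is close to, and slightly below, the straight-helix value, which is what one would expect for an almost fully helical chain wound into a coiled coil; the comparison is an order-of-magnitude check and not a structural determination.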
They pointed out the 14-fold periodicity of the distribution of charged and nonpolar residues along the α-tropomyosin sequence, which would reflect the binding of each tropomyosin to 14 actin molecules in the groove of the thin filament. It was also demonstrated that these 14-fold periods could be divided into alternating α and β bands. McLachlan and Stewart (1976b) pointed out that the region of residues 190-214 has somewhat different features from other portions of the tropomyosin sequence and they have deduced that this region is involved in the binding to troponin T. Jackson et al. (1975) first noticed that a fragment of troponin T, probably between residues 70 and 151, interacts with tropomyosin. Pearlstone et al. (1976) sequenced troponin T and suggested that two long helical sections, residues 80-102 and 122-146, are the binding sites for tropomyosin. In view of the coiled-coil nature of tropomyosin and the high α-helical content of the CB1 fragment of troponin T, a close packing of the two should be achieved most easily by a triple-stranded coiled-coil structure. Based on these considerations, Nagano et al. (1980, 1982) presented four plausible structures for the possible interacting structure of the troponin T-tropomyosin complex. The approximate quaternary structure of the troponin complex has also been predicted, based on the preceding interacting structure of troponin T-tropomyosin (Nagano and Ohtsuki, 1982) (Fig. 12). Overall arrangements of troponin components are based on the discussion of immunoelectron microscopy in Section II,G,2. In this model, the α-helical segment of residues 90-148 of the troponin T1 region binds tightly to tropomyosin and the rest, residues 1-59, forms the globular portion at the end of a triple-stranded stem of troponin T-tropomyosin. Most of the troponin T2 region (residues 159-259) covers the other half of the triple-stranded shaft of troponin T-tropomyosin and binds to troponin C and troponin I, both of which are in close steric apposition to the T1 region without significant interaction. In the model, a globular troponin C-I-T2 is located on the troponin T1 that binds tropomyosin. The structure mentioned above is by no means a unique solution for the structure of tropomyosin and troponin. But it is certain that these studies are useful in integrating the enormous amount of experimental evidence, and in visualizing how Ca2+ changes the troponin complex at the level of the amino acid sequence. Information on the interacting regions of the actin molecule for tropomyosin and troponin I will surely lead to another breakthrough for this line of approaches.

FIG. 12. Predicted approximate quaternary structure of troponin-tropomyosin. T1 and T2, T1 and T2 regions of troponin T; C, troponin C; TM, tropomyosin. Circles mark the positions of the N terminus and the C terminus. Numbered symbols indicate the four calcium ions bound in troponin C, numbered along the sequence from the N-terminal end. Area surrounded by dotted line indicates the projected position of troponin I, which is situated at the far side of troponin T and C in this figure. Shaded area indicates the approximate projected position of the inhibitory CN4 segment of residues 96-116 in troponin I.
4. Calcium Regulation in the Thin Filament
In the absence of Ca2+, troponin I inhibits the thin filament from interacting with myosin. Troponin T is also involved in this inhibition. Troponin T, on one hand, binds to tropomyosin and, on the other hand, binds to troponin I. The action of troponin T is mostly to adjust the relation of troponin I to tropomyosin-actin for the inhibitory state in the absence of Ca2+. Action of Ca2+ on troponin C results in the potentiation of the interaction between troponin C and troponin I, and, in turn, the inhibitory interaction of troponin I to actin-tropomyosin is depressed and consequently the contractile interaction is activated. It has been demonstrated by three-dimensional reconstruction of electron micrographs that, in the troponin I-troponin T-tropomyosin-actin filament, the position of tropomyosin deviates from the center of the groove of actin double strands to one of two strands (Wakabayashi et al., 1975). Since this state corresponds to the inhibited condition of the thin filament, the deviation of tropomyosin would be related to the inhibitory action of troponin I, for tropomyosin stays near the center of the groove in the tropomyosin-actin filament (without troponin complex), which mostly corresponds to the activated state of the thin filament. H. E. Huxley (1973) considered that the essential mechanism of Ca2+ regulation is the change in the steric relationship of tropomyosin and actin double strands. He proposed the steric block hypothesis, in which movement of the tropomyosin filament in the actin groove sterically enables the active site on actin to interact with myosin. Since the exact relation of actin and myosin during contractile interaction is not yet clear (Toyoshima and Wakabayashi, 1985), this hypothesis remains to be examined in the future.
At the same time, evidence indicates that troponin I forms a stable complex with tropomyosin-actin in the absence of Ca2+. Thus it is also conceivable that a certain change in the preexisting interaction between actin and tropomyosin, which is unfavorable for the interaction with myosin, is induced in actin molecules cooperatively through tropomyosin. The cooperative role of tropomyosin in the regulatory processes of the thin filament has been suggested (Weber and Murray, 1973). Change in the interaction between actin molecules, probably through tropomyosin, was also suggested by the finding that Ca2+ makes the reconstituted thin filament more flexible (Yanagida et al., 1984). A kinetic process of ATP hydrolysis by myosin bound with actin has also been shown to be affected by Ca2+-troponin-tropomyosin, though at extremely low ionic strength (Chalovich and Eisenberg, 1982). All these possibilities should be taken into account at the present time. Intensive studies from several
viewpoints will finally concentrate on the mechanism by which Ca2+ regulates the quaternary structure of the thin filament, an ordered aggregate of five kinds of proteins. In this section, properties of calcium regulatory proteins from vertebrate skeletal muscle were reviewed with particular reference to the physiological structure and function that have been the major interest since the discovery of native tropomyosin.
III. CONNECTIN (TITIN)
In a discussion that lasted for a century it was proposed that there must be an elastic component in skeletal muscle. Natori (1954) first demonstrated that an internal elastic structure exists in skinned muscle fibers because of tension generation upon stretch beyond the overlap of thick and thin filaments. Although the existence of a “third filament” in addition to myosin and actin was claimed from time to time (Sjostrand, 1962; dos Remedios and Gilmour, 1978; Locker, 1984), it was not widely accepted because of inadequate evidence. Starting from Natori’s classical work, Maruyama et al. (1976) observed that a salt-insoluble protein was responsible for the elasticity of myofibrils and named it connectin (Maruyama, 1976). The main component was of very high molecular weight, and was obtainable only by cutting the polyacrylamide gels after electrophoresis in the presence of SDS (Maruyama et al., 1977b). Meanwhile, during a survey of high-molecular-weight actin-binding protein (ABP, filamin) discovered by Stossel and Hartwig (1975), Wang et al. (1979) observed the presence of doublet bands of 10^6 MW in the SDS-gel electrophoresis pattern of a total SDS extract of whole myofibrils from rabbit and chicken skeletal muscles. The protein in question was isolated in a denatured state by gel filtration in the presence of SDS and termed titin (Wang et al., 1979). Subsequently, titin was shown to be identical with connectin (Maruyama et al., 1981b).¹ From 1983 to 1984, three separate laboratories in Japan, the United States, and England succeeded in isolating native connectin, and investigated it by various physicochemical methods (Kimura and Maruyama, 1983a,b; Wang et al., 1984; Trinick et al., 1984; Kimura et al., 1984b). This has opened a new approach to the characterization of this unusually high-molecular-weight protein.
¹ In this article, we use the term connectin instead of titin.
REGULATORY PROTEINS OF VERTEBRATE MUSCLE
A. Preparation

Connectin consists of doublet bands, α- and β-connectins (also called titin 1 and titin 2), and it appears that β-connectin is a proteolytic product of α-connectin (see Section III,E). β-Connectin is extracted, together with myosin, in a salt solution (Maruyama et al., 1981a), leaving α-connectin in the residues. When actin is solubilized with 0.6 M KI, the remaining α- and β-connectins are translocated toward the Z lines and become insoluble together with residual myosin and actin (Granger and Lazarides, 1978). It is β-connectin that has been isolated as the native form.

Wang et al. (1984) and Trinick et al. (1984) extracted connectin together with myosin; the myosin was then removed by column chromatography or by lowering the ionic strength followed by centrifugation. Connectin was finally isolated by gel filtration. The yield was 40 mg (Wang et al., 1984) or 200 mg (Trinick et al., 1984) from 100 g of rabbit skeletal muscle. On the other hand, Kimura and Maruyama (1983b) adopted a selective extraction of connectin with 0.1 M sodium phosphate buffer, pH 6.5. Chicken breast muscle was stored overnight at 0°C to convert α-connectin to β-connectin. Myofibrils were then prepared, washed with 5 mM NaHCO₃, and extracted with 0.1 M phosphate buffer, pH 6.5, leaving myosin and actin in the residue. The yield of connectin was as much as 400 mg per 100 g of chicken breast muscle.
B. Content in Myofibrils

Connectin content in myofibrils of vertebrate skeletal muscle is as high as 10% of the total myofibrillar structural proteins. Since the connectin content is next in amount to that of actin, Wang et al. (1979) called connectin the third most abundant protein of muscle. Seki and Watanabe (1984) confirmed that the connectin content of carp skeletal muscle was 13% by measuring the amount of the gel-filtered connectin. Direct estimation of the connectin content in cow semimembranosus muscle showed a value of 12% (King, 1984). Trinick et al. (1984) also reported the connectin content in rabbit psoas muscle to be approximately 10%.

C. Molecular Size and Shape

Estimation of the molecular weight of such a giant molecule as connectin is a very difficult problem. Using cross-linked myosin heavy chains (MHC) as standard, Wang (1982) estimated the molecular weight as 1.4 and 1.2 million, respectively, for α- and β-connectins. In this study it was assumed that MHC are cross-linked to form dimer, trimer, tetramer, etc. However, it turned out that MHC first dimerized and then the dimers
IWAO OHTSUKI E T AL.
form tetramers, hexamers, etc. (Maruyama et al., 1984a). Thus the MW of α- and β-connectins is estimated to be 2.8 and 2.1 million, respectively.

The molecular shape of isolated β-connectin was revealed by low-angle shadowing techniques in an electron microscope (Maruyama et al., 1984a; Wang et al., 1984; Trinick et al., 1984). It is a long, flexible thin filament, whose length varies widely from 0.2 to 1 μm. Flow birefringence data suggested particle lengths of 0.3-0.4 μm in solution (Maruyama et al., 1984a). A hypersharp sedimentation boundary of connectin in solution with 13-16 S also supports its asymmetric shape. The width of the connectin filament is estimated as 4-5 nm in the negatively stained sample under the electron microscope (Trinick et al., 1984), a value supported by hydrodynamic data (Maruyama et al., 1984a).

D. Other Physicochemical Properties

β-Connectin is soluble at more than 0.1 M KCl at neutral pH. At lower salt concentrations it forms aggregates. Connectin is precipitated by (NH₄)₂SO₄ at 35% saturation, but it is necessary to dialyze against 0.6 M KCl, pH 7.9, to redissolve the precipitate (Trinick et al., 1984).

The viscosity of a connectin solution depends greatly on the shearing force of the measurements: When the usual Ostwald type of viscometer was used (mean velocity gradient, 500 sec⁻¹), the reduced viscosity was estimated to be about 2 dl/g. In a slowly rotating viscometer, however, the viscosity value of a connectin solution (0.3 mg/ml) was as high as 17,000 cP at a velocity gradient of 0.0007 sec⁻¹ and was reduced to 230 cP at 0.08 sec⁻¹. This was due to entanglements of long flexible filaments (Maruyama et al., 1984a).

Circular dichroism measurements suggested that the α-helical content of connectin is almost zero, and connectin filaments are thought to consist of random coils (Trinick et al., 1984). This was supported by connectin’s high proline content (Table IV). The amino acid composition of connectin is listed in Table IV.
It is an acidic protein rather similar to actin. There are hardly any recognizable differences in amino acid composition between β-connectin and α-connectin-rich samples. It is of some interest to note that C protein (135 kDa), located on myosin filaments, has an amino acid composition similar to that of connectin (Offer et al., 1973). Immunological tests showed, however, that the two proteins are distinguishable. Because of the extraordinarily high molecular weight of connectin, the question arises whether a connectin molecule contains covalently bonded subunits. The presence of a lysine derivative cross-link, hydroxylysinonorleucine, was suggested as in collagen (Fujii and Maruyama,
TABLE IV
Amino Acid Composition of Connectin and Its Proteolytic Fragment

                    Denatured (α- + β-connectin)           Native β-connectin
Amino acid     Bovineᵃ   Rabbitᵇ   Chicken   Chicken           (chicken)
Asx               85        96        95        93                96
Thr               68        79        75        66                76
Ser               67        74        69        67                68
Glx              116       137       116       118               111
Pro               83        74        74        67                74
Gly               70        65        71        76                74
Ala               65        66        62        75                65
Cys/2             13         8        11         6                 2
Val               93        74        85        78                87
Met               12         8        10        16                12
Ile               55        44        59        56                60
Leu               61        65        67        76                66
Tyr               29        28        30        30                31
Phe               26        24        27        29                26
Lys               87       102        82        79                86
His               15        17        15        18                15
Arg               49        44        55        50                59
Trp               13         -         -         -                 -
Referenceᶜ         a         a         b         b                 b

ᵃ Prepared from whole myofibrils.
ᵇ Prepared from KI-extracted myofibrils.
ᶜ a, Lusby et al. (1983); b, Kimura and Maruyama (1983b).
1982). This was not confirmed later, and the presence of glutamyllysine was not detected (Gruen et al., 1982). An attempt to discover any peptides smaller than β-connectin at the time of biosynthesis during embryonic and postnatal development of chickens has shown that a peptide even larger than α-connectin is first synthesized (Yoshidomi et al., 1985). The locations of monoclonal antibodies against connectin in myofibrils suggest that the connectin filament is made up of a single peptide chain (see Section III,H). Because of its size and susceptibility to proteolysis, the determination of the N and C termini of connectin is very difficult.

E. Hydrolysis

As first observed by Wang et al. (1979), a direct SDS extract of a piece of freshly excised muscle contains both α- and β-connectin, the ratio being approximately 4 : 1. Seki and Watanabe (1984) showed that storage of muscles at 0°C results in a complete change of α-connectin to β-connectin within 10 hours for carp muscle and 72 hours for rabbit muscle. Lusby et al. (1983) reported that the breakdown of connectin was faster when bovine skeletal muscle was aged at higher temperatures. Rapid breakdown of connectin from sheep muscle also occurs when it is heated for 40 minutes at 60-80°C, or when stored for 3 weeks at 2°C (King, 1984).

The conversion of α-connectin to β-connectin was confirmed using chicken breast muscle. The change was complete for whole muscle stored for 1 day at 0°C, and for isolated myofibrils stored for 3 days at 0°C (Hu et al., 1984). In myofibrils, addition of 10 mM EGTA retarded the conversion but did not prevent it completely. It is very likely that β-connectin is a proteolytic product of α-connectin (mother molecule of connectin). However, it is not clear whether β-connectin is functional in situ or not.

It should be mentioned that there is a minor band just below β-connectin in the total SDS extract of adult chicken muscle (Yoshidomi et al., 1985). This is not likely to be a proteolytic product of α- or β-connectin, because it was still present unchanged in isolated β-connectin.

Connectin is very easily hydrolyzed by proteases such as trypsin or chymotrypsin both in situ (Maruyama et al., 1981a) and in vitro (Kimura et al., 1984b). Native β-connectin is fragmented into 400-kDa fragments by a mild chymotrypsin treatment (Kimura et al., 1984b).

F. Interaction with Myosin

On addition of connectin to a myosin suspension at low ionic strengths (<0.2), a flocculent precipitate appeared (Kimura and Maruyama, 1983a). An increase in turbidity was observed in the presence of 50-140 mM KCl at pH 7.0. Flow birefringence measurements clearly showed aggregation of myosin filaments. Electron microscopic observations revealed entanglements of myosin filaments (Kimura et al., 1984a). Trinick et al.
(1984) observed connectin filaments, identified as end filaments, extruding from both ends of myosin filaments (native thick filaments) (Trinick, 1981). A tryptic fragment of connectin, the 400-kDa peptide, has been demonstrated to cause aggregation of myosin filaments. Further digested fragments of chain weights less than 40 kDa lost this ability (Kimura et al., 1984b). It is of interest that the S1 subfragment, the head of the myosin molecule, does not induce any aggregation in the presence of connectin (Maruyama et al., 1985a). However, heavy meromyosin interacted with connectin to form aggregates. The neck portion S2 of myosin did not act on connectin. L-Meromyosin and the rod portion of myosin were markedly
aggregated by connectin. It should be noted that their interactions with connectin also depended on salt concentration: At KCl concentrations higher than 0.15 M there were no interactions. It appears that the connectin-myosin interaction is of an electrostatic nature.

G. Interaction with Actin
Connectin also causes aggregation of actin filaments at low ionic strengths, in a manner similar to its effect on myosin (Kimura and Maruyama, 1983a; Kimura et al., 1984b). Bundles of actin filaments were formed in the presence of connectin. The bundles appeared to resemble those formed by α-actinin. However, unlike the case with α-actinin, bundle formation by connectin was not affected by tropomyosin bound to actin filaments (H. Yoshidomi, unpublished results). Furthermore, the 400-kDa fragment of connectin did not act on actin filaments at all. It is likely that connectin filaments enhance bundle formation of actin filaments by topological restrictions.

H. Localization

Using polyclonal antibodies against denatured connectin, immunofluorescent studies have shown that connectin is located mainly at the A-I junction area of a sarcomere (Wang et al., 1979; Maruyama et al., 1981b). Sometimes Z lines and M lines were also stained with fluorescent-labeled antibodies. Investigation of stretched skinned fibers of frog skeletal muscle has revealed an interesting fact: The A band becomes considerably elongated with the antibody treatment (Maruyama et al., 1984b). Electron microscopic observations showed very thin filaments extruding from myosin filaments in the A-I junction area. These were what Sjostrand (1962) called gap filaments. Furthermore, immunoelectron microscopy clearly indicated that the thin filaments in question extended from the myosin filaments all the way through the I band to the Z lines (Maruyama et al., 1985b). Hence it is now evident that connectin filaments link the myosin filament to Z lines.

As shown in Fig. 13, there are several symmetrical periodicities in the immunoelectron micrographs of frog myofibrils starting from the edges of the central bare zone to the A-I junction region (Maruyama et al., 1985b).
The simplest explanation is that the connectin filaments originate from the periphery of the M band and extend along each side of the myosin filament all the way out to the Z lines. Assuming that the contents of connectin and myosin in the myofibril are 10 and 43% and the molecular weights are 2.8 × 10⁶ and 5.2 × 10⁵, respectively, it is estimated that there are about 24 myosin molecules for every connectin filament in a sarcomere. If every myosin filament consists of 300 myosin molecules, the number of connectin molecules per myosin filament will be 12. This means that 6 connectin filaments are present in each half of a myosin filament. Trinick et al. (1984) reported that about four end filaments extrude from the tip of a thick filament.

Wang (1984) reported in an elegant study the use of monoclonal antibodies against denatured connectin which reacted with both α- and β-connectins. There are four groups of antibodies, each of which specifically stains two symmetrical sites in the sarcomere: the periphery of the M band, the middle of the myosin filament, the A-I junction, and midway along the I band. Greaser’s group also observed that their monoclonal antibodies stained only the A-I junction area (Wang and Greaser, 1985). These surprising observations suggest that only a single connectin filament extends from the middle portion of a sarcomere to the Z line: Even in a relaxed state, it is 1.1 μm long! When stretched, the distance will be over 3 μm. Although isolated β-connectin is estimated to be approximately 0.4 μm in length, a double-stranded random coil of MW 2.4 × 10⁶ could be stretched to as much as 3.7 μm.

We should mention Locker’s view that connectin filaments exist in the hollow core of the myosin filaments (Locker, 1984). Magid et al. (1984) have presented a similar idea. This possibility appears to be small, in view of the immunological reactions along the surface of a myosin filament, although it cannot be excluded.

Wang (1984) claimed that connectin filaments are linked to a nebulin meshwork near the Z lines. The content of nebulin is as much as 5% of the myofibrillar proteins. Nebulin has not yet been isolated in native form (Wang and Williamson, 1980), and there is no information about its molecular shape in the native state or its functions. Therefore, we have to wait for such information before we consider further the possible link between the two proteins.
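The stoichiometry and stretched-length figures quoted in this section can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only: the weight percentages, molecular weights, and the figure of 300 myosin molecules per filament are taken from the text, whereas the mean residue mass (110 Da) and the rise per residue of an extended chain (0.34 nm) are assumed typical values, not numbers given by the authors. The function names are hypothetical.

```python
# Illustrative arithmetic only. Inputs are the figures quoted in the text,
# except the two defaults marked below as assumptions.


def myosin_per_connectin(connectin_pct, myosin_pct, connectin_mw, myosin_mw):
    """Molar ratio of myosin to connectin from weight percentages and molecular weights."""
    return (myosin_pct / myosin_mw) / (connectin_pct / connectin_mw)


def extended_length_um(mw, strands, residue_mass=110.0, rise_nm=0.34):
    """Length in micrometers of a fully extended chain of the given molecular weight.

    residue_mass (Da per residue) and rise_nm (nm per residue) are ASSUMED
    typical values; they are not taken from the text.
    """
    residues_per_strand = mw / residue_mass / strands
    return residues_per_strand * rise_nm / 1000.0


# Connectin 10% at MW 2.8e6, myosin 43% at MW 5.2e5 (values from the text)
ratio = myosin_per_connectin(10, 43, 2.8e6, 5.2e5)

# 300 myosin molecules per thick filament (value from the text)
connectin_per_filament = 300 / ratio

# Double-stranded chain of MW 2.4e6 (value from the text)
length_um = extended_length_um(2.4e6, strands=2)

print(f"myosin molecules per connectin: {ratio:.1f}")
print(f"connectin molecules per thick filament: {connectin_per_filament:.1f}")
print(f"connectin molecules per half filament: {connectin_per_filament / 2:.1f}")
print(f"extended length of double-stranded chain: {length_um:.1f} um")
```

With these inputs the ratio comes out at roughly 23 myosin molecules per connectin and about 13 connectin molecules per thick filament, close to the rounded values of about 24 and 12 given above. The extended length comes out near the quoted 3.7 μm, although that result depends directly on the assumed residue mass and rise per residue.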
I. Function

In chicken tissues, fluorescent antiserum against connectin stained only skeletal and cardiac muscles. Chicken gizzard and aorta were not stained (Ikeya et al., 1983). Examinations by SDS-gel electrophoresis confirmed that connectin-like proteins are not present in nonmuscle cells (D. H. Hu, unpublished observations, 1984).
FIG. 13. Electron micrograph of myofibrils of frog skeletal muscle treated with polyclonal antibodies against chicken breast muscle β-connectin. Note that there are several symmetrical stripes in each half of the A bands. For further details, see Maruyama et al. (1985b).
FIG. 14. Diagram of the parallel elastic component of vertebrate skeletal muscle sarcomere. Connectin filaments originating from the edges of the central bare zone of a thick (myosin) filament run through the thin (actin) filament to the Z lines.
As discussed at length above, connectin is a very long, flexible elastic filament linking the myosin filament to the Z lines in vertebrate skeletal muscle. Therefore, it is very plausible that connectin serves as the “parallel elastic component” in a myofibril (Fig. 14). The elastic nature of connectin filaments appears to make possible the passive tension generation when a myofibril is stretched beyond the overlap of the thick and thin filaments (Natori, 1954). It also explains why such overstretched myofibrils slowly return to the original state upon release (Natori, 1954).

Indirect evidence for these phenomena has been obtained with trypsin-treated myofibrils (Yoshioka et al., 1986). Trypsin treatments resulted in a decrease in passive tension generation as the connectin filaments were cleaved and finally torn off. Higuchi and Umazume (1985) have shown that the passive tension generation is proportionally decreased as myosin is dissolved away by increasing KCl concentration. This is explained by the fact that connectin is freed from myosin at high ionic strengths (Kimura et al., 1984a). Evidently, more physiological work is needed to elucidate the elastic role of connectin in skinned muscle fibers.

REFERENCES

Amphlett, G. W., Vanaman, T. C., and Perry, S. V. (1976). FEBS Lett. 72, 163-168.
Anderson, T., Drakenberg, T., Forsen, S., and Thulin, E. (1981). FEBS Lett. 125, 39-43.
Bailey, K. (1946). Nature (London) 157, 368-369.
Bailey, K. (1948). Biochem. J. 43, 271-279.
Bailin, G. (1979). Am. J. Physiol. 236, C41-C46.
Bechtel, P. J. (1979). J. Biol. Chem. 254, 1755-1758.
Bremel, R. D., and Weber, A. (1972). Nature (London) 238, 97-101.
Bronson, D. D., and Schachat, F. H. (1982). J. Biol. Chem. 257, 3937-3944.
Carew, E. B., Leavis, P. C., Stanley, H. E., and Gergely, J. (1980). Biophys. J. 30, 351-358.
Caspar, D. L. D., Cohen, C., and Longley, W. (1969). J. Mol. Biol. 41, 87-107.
Chalovich, J. M., and Eisenberg, E. (1982). J. Biol. Chem. 257, 2432-2437.
Cheung, W. Y. (1970). Biochem. Biophys. Res. Commun. 38, 533-538.
Chong, P. C. S., and Hodges, R. S. (1982a). J. Biol. Chem. 257, 2549-2555.
Chong, P. C. S., and Hodges, R. S. (1982b). J. Biol. Chem. 257, 9152-9160.
Chong, P. C. S., and Hodges, R. S. (1982c). J. Biol. Chem. 257, 11667-11672.
Cohen, C., and Longley, W. (1966). Science 152, 794-796.
Cohen, C., and Szent-Gyorgyi, A. G. (1957). J. Am. Chem. Soc. 79, 248.
Cohen, C., Caspar, D. L. D., Parry, D. A. D., and Lucas, R. M. (1971). Cold Spring Harbor Symp. Quant. Biol. 36, 205-216.
Collins, J. H., Potter, J. D., Horn, M. J., Wilshire, G., and Jackman, N. (1973). FEBS Lett. 36, 268-272.
Collins, J. H., Greaser, M. L., Potter, J. D., and Horn, M. J. (1977). J. Biol. Chem. 252, 6356-6362.
Craig, R., and Offer, G. (1976). Proc. R. Soc. London, Ser. B 192, 451-461.
Cummins, P., and Perry, S. V. (1973). Biochem. J. 133, 765-777.
Dedman, J. R., Potter, J. D., and Means, A. R. (1977). J. Biol. Chem. 252, 2437-2440.
Dennis, J. E., Shimizu, T., Reinach, F. C., and Fischman, D. A. (1984). J. Cell Biol. 98, 1514-1522.
dos Remedios, C. G., and Gilmour, D. (1978). J. Biochem. 84, 235-238.
Drabikowski, W., Kominz, D. R., and Maruyama, K. (1968). J. Biochem. 63, 802-804.
Eaton, B. L., Kominz, D. R., and Eisenberg, E. (1975). Biochemistry 14, 2718-2724.
Ebashi, S. (1960). J. Biochem. 48, 150-151.
Ebashi, S. (1963). Nature (London) 200, 1010.
Ebashi, S. (1972). J. Biochem. 72, 787-790.
Ebashi, S. (1974a). Essays Biochem. 10, 1-36.
Ebashi, S. (1974b). In “Lipmann Symposium: Energy, Biosynthesis and Regulation in Molecular Biology” (D. Richter, ed.), pp. 165-178. de Gruyter, Berlin.
Ebashi, S. (1980). Proc. R. Soc. London, Ser. B 207, 259-286.
Ebashi, S., and Ebashi, F. (1964). J. Biochem. 55, 604-613.
Ebashi, S., and Ebashi, F. (1965). J. Biochem. 58, 7-12.
Ebashi, S., and Endo, M. (1968). Prog. Biophys. Mol. Biol. 18, 123-183.
Ebashi, S., and Kodama, A. (1965). J. Biochem. 58, 107-108.
Ebashi, S., and Ohtsuki, I. (1983). Calcium Binding Proteins 1983, pp. 251-262.
Ebashi, S., Ebashi, F., and Maruyama, K. (1964). Nature (London) 203, 405-406.
Ebashi, S., Kodama, A., and Ebashi, F. (1968). J. Biochem. 64, 465-477.
Ebashi, S., Endo, M., and Ohtsuki, I. (1969). Q. Rev. Biophys. 2, 351-384.
Ebashi, S., Ohtsuki, I., and Mihashi, K. (1973). Cold Spring Harbor Symp. Quant. Biol. 37, 215-224.
Ebashi, S., Ohnishi, S., Maruyama, K., and Fuji, T. (1974). Proc. FEBS Meet. 9th, pp. 71-83.
Endo, T., and Obinata, T. (1981). J. Biochem. 89, 1599-1608.
England, P. J. (1975). FEBS Lett. 50, 57-60.
Evans, J. S., Levine, B. A., Leavis, P. C., Gergely, J., Grabarek, Z., and Drabikowski, W. (1980). Biochim. Biophys. Acta 623, 10-20.
Evans, R. R., Robson, R. M., and Stromer, M. H. (1984). J. Biol. Chem. 259, 3916-3924.
Flicker, P. F., Phillips, G. N., Jr., and Cohen, C. (1982). J. Mol. Biol. 162, 495-501.
Franke, W. W., Schmid, E., Osborn, M., and Weber, K. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 5034-5038.
Franzini-Armstrong, C. (1970). J. Cell Sci. 6, 559-592.
Fujii, K., and Maruyama, K. (1982). Biochem. Biophys. Res. Commun. 104, 633-640.
Funatsu, T., and Ishiwata, S. (1985). J. Biochem. 98, 535-544.
Geiger, B. (1979). Cell 18, 193-205.
Geisler, N., and Weber, K. (1982). EMBO J. 1, 1649-1656.
Gergely, J., and Leavis, P. C. (1980). In “Muscle Contraction: Its Regulatory Mechanisms”
(S. Ebashi et al., eds.), pp. 191-206. Japan Sci. Soc. Press, Tokyo/Springer-Verlag, Berlin and New York.
Goldstein, M. A., Schroeter, J. P., and Sass, R. L. (1982). J. Muscle Res. Cell Motil. 3, 333-348.
Gomer, R. H., and Lazarides, E. (1981). Cell 23, 524-532.
Grabarek, Z., Drabikowski, W., Leavis, P. C., Rosenfeld, S. S., and Gergely, J. (1981). J. Biol. Chem. 256, 13121-13127.
Grabarek, Z., Grabarek, J., Leavis, P. C., and Gergely, J. (1983). J. Biol. Chem. 258, 14098-14102.
Grand, R. J. A., Levine, B. A., and Perry, S. V. (1982). Biochem. J. 203, 61-68.
Granger, B. L., and Lazarides, E. (1978). Cell 22, 1253-1268.
Granger, B. L., and Lazarides, E. (1979). Cell 18, 1053-1063.
Granger, B. L., and Lazarides, E. (1980). Cell 22, 727-738.
Greaser, M. L., and Gergely, J. (1971). J. Biol. Chem. 246, 4226-4233.
Greaser, M. L., Yamaguchi, M., Brekke, C., Potter, J. D., and Gergely, J. (1973). Cold Spring Harbor Symp. Quant. Biol. 37, 235-244.
Grove, B. K., Kurer, V., Lehrer, C., Doetschman, T. C., Perriard, J. C., and Eppenberger, H. M. (1984). J. Cell Biol. 98, 518-524.
Gruen, L. C., King, N. L., and McKenzie, I. J. (1982). Int. J. Pept. Protein Res. 20, 401-407.
Gusev, N. B., Dobrovolskii, A. B., and Severin, S. E. (1980). Biochem. J. 189, 219-226.
Hanson, J., and Lowy, J. (1963). J. Mol. Biol. 6, 46-60.
Hartshorne, D. J., and Mueller, H. (1968). Biochem. Biophys. Res. Commun. 31, 647-653.
Haselgrove, J. C. (1973). Cold Spring Harbor Symp. Quant. Biol. 37, 341-352.
Head, J. F., and Perry, S. V. (1974). Biochem. J. 137, 145-154.
Herzberg, O., and James, M. N. G. (1985). Nature (London) 313, 653-659.
Higashi-Fujime, S., and Ooi, T. (1969). J. Microscop. (Paris) 8, 535-548.
Higuchi, H., and Umazume, Y. (1985). Biophys. J. 48, 137-147.
Higuchi, H., Kimura, S., Umazume, Y., and Maruyama, K. (1986). In preparation.
Hincke, N. T., McCubbin, W. D., and Kay, C. M. (1978). Can. J. Biochem. 56, 384-395.
Hitchcock, S. E. (1975a). Eur. J. Biochem. 52, 255-263.
Hitchcock, S. E. (1975b). Biochemistry 14, 5162-5167.
Hitchcock, S. E. (1981). J. Mol. Biol. 147, 153-173.
Hitchcock, S. E., Zimmerman, C. J., and Smalley, C. (1981). J. Mol. Biol. 147, 125-151.
Hitchcock-De Gregori, S. E. (1982). J. Biol. Chem. 257, 7372-7380.
Holroyde, M. J., Robertson, S. P., Johnson, J. D., Solaro, R. J., and Potter, J. D. (1980). J. Biol. Chem. 255, 11688-11693.
Horwitz, J., Bullard, B., and Mercola, D. (1979). J. Biol. Chem. 254, 350-355.
Hu, D. H., Kimura, S., and Maruyama, K. (1984). Zool. Sci. 1, 907.
Huxley, H. E. (1973). Cold Spring Harbor Symp. Quant. Biol. 37, 361-376.
Iio, T., and Kondo, H. (1980). J. Biochem. 88, 547-556.
Iio, T., and Kondo, H. (1981). J. Biochem. 90, 163-175.
Ikeya, H., Ohashi, K., and Maruyama, K. (1983). Biomed. Res. 4, 111-116.
Isenberg, G., Leonard, K., and Jockusch, B. (1982). J. Mol. Biol. 158, 231-249.
Jackson, P., Amphlett, G. W., and Perry, S. V. (1975). Biochem. J. 151, 85-97.
Johnson, J. D., Charlton, S. C., and Potter, J. D. (1979). J. Biol. Chem. 254, 3497-3502.
Kakiuchi, S., Yamazaki, R., and Nakajima, M. (1970). Proc. Jpn. Acad. 46, 587-592.
Katayama, E. (1979). J. Biochem. 85, 1379-1381.
Katayama, E. (1980). In “Muscle Contraction: Its Regulatory Mechanisms” (S. Ebashi et al., eds.), pp. 251-258. Japan Sci. Soc. Press, Tokyo/Springer-Verlag, Berlin and New York.
Katayama, E., and Nozaki, S. (1982). J. Biochem. 91, 1449-1452.
Kawasaki, Y., and van Eerd, J.-P. (1972). Biochem. Biophys. Res. Commun. 49, 898-905.
Kimura, S., and Maruyama, K. (1983a). Biomed. Res. 4, 607-610.
Kimura, S., and Maruyama, K. (1983b). J. Biochem. 94, 2083-2085.
Kimura, S., Maruyama, K., and Huang, Y. P. (1984a). J. Biochem. 96, 499-506.
Kimura, S., Yoshidomi, H., and Maruyama, K. (1984b). J. Biochem. 96, 1947-1950.
King, N. L. (1984). Meat Sci. 11, 27-43.
Kitazawa, T. (1976). J. Biochem. 80, 1129-1147.
Kitazawa, T., Shuman, H., and Somlyo, A. P. (1982). J. Muscle Res. Cell Motil. 3, 437-454.
Kohama, K. (1979). J. Biochem. 86, 811-820.
Kohama, K. (1980). J. Biochem. 88, 591-599.
Kretsinger, R. H., and Barry, C. D. (1975). Biochim. Biophys. Acta 405, 40-52.
Kretsinger, R. H., and Nockolds, C. E. (1973). J. Biol. Chem. 248, 3313-3326.
Kumon, A., and Villar-Palasi, C. (1979). Biochim. Biophys. Acta 566, 305-320.
Kuroda, M., and Maruyama, K. (1976). J. Biochem. 80, 315-332.
Kuroda, M., and Masaki, T. (1984). Exp. Biol. Med. 9, 148-154.
Kuroda, M., Tanaka, T., and Masaki, T. (1981). J. Biochem. 89, 297-310.
Laki, K., Maruyama, K., and Kominz, D. R. (1962). Arch. Biochem. Biophys. 98, 323-330.
Lazarides, E. (1980). Nature (London) 283, 249-256.
Lazarides, E., and Hubbard, B. D. (1976). Proc. Natl. Acad. Sci. U.S.A. 73, 4344-4348.
Leavis, P. C., and Gergely, J. (1984). CRC Crit. Rev. Biochem. 16, 235-305.
Leavis, P. C., Rosenfeld, S. S., Gergely, J., Grabarek, Z., and Drabikowski, W. (1978). J. Biol. Chem. 253, 5452-5459.
Lehrer, S. S. (1975). Proc. Natl. Acad. Sci. U.S.A. 72, 3377-3381.
Levine, B. A., Coffman, D. M. D., and Thornton, J. M. (1977). J. Mol. Biol. 115, 743-760.
Levine, B. A., Thornton, J. M., Fernandes, R., Kelly, C. M., and Mercola, D. (1978). Biochim. Biophys. Acta 535, 11-24.
Lin, S., Cribbs, D. H., Wilkins, J. A., Casella, J. F., Magargal, W. W., and Lin, D. C. (1982). Philos. Trans. R. Soc. Lond. 299, 263-273.
Locker, R. H. (1984). Food Microstruct. 3, 14-32.
Lusby, M. L., Ridpath, J. F., Parrish, F. C., Jr., and Robson, R. M. (1983). J. Food Sci. 48, 1787-1790.
McLachlan, A. D., and Stewart, M. (1975). J. Mol. Biol. 98, 293-304.
McLachlan, A. D., and Stewart, M. (1976a). J. Mol. Biol. 103, 271-298.
McLachlan, A. D., and Stewart, M. (1976b). J. Mol. Biol. 106, 1017-1022.
Maeda, Y., Matsubara, I., and Yagi, N. (1979). J. Mol. Biol. 127, 191-201.
Magid, A., Ting-Beall, H. P., Carvell, M., Kontis, T., and Lucaveche, C. (1984). In “Contractile Mechanisms in Muscle” (G. H. Pollack and H. Sugi, eds.), pp. 307-327. Plenum, New York.
Mak, A. S., and Smillie, L. B. (1981). J. Mol. Biol. 149, 541-550.
Mak, A. S., Smillie, L. B., and Stewart, G. R. (1980). J. Biol. Chem. 255, 3647-3655.
Mani, R. S., Herasymowych, O. S., and Kay, C. M. (1980). Int. J. Biochem. 12, 333-338.
Margossian, S. S., and Cohen, C. (1973). J. Mol. Biol. 81, 409-413.
Martonosi, A. (1962). J. Biol. Chem. 237, 2795-2803.
Maruyama, K. (1976). J. Biochem. 80, 405-407.
Maruyama, K. (1985a). In “Developments in Meat Science” (R. Lawrie, ed.), Vol. 3, pp. 22-50. Applied Science, London.
Maruyama, K. (1985b). Zool. Sci. 2, 155-162.
Maruyama, K., and Ebashi, S. (1965). J. Biochem. 58, 13-19.
Maruyama, K., and Kimura, S. (1972). J. Biochem. 72, 483-486.
Maruyama, K., Natori, R., and Nonomura, Y. (1976). Nature (London) 262, 58-60.
Maruyama, K., Kimura, S., Ishii, T., Kuroda, M., Ohashi, K., and Muramatsu, S. (1977a). J. Biochem. 81, 215-232.
Maruyama, K., Matsubara, S., Natori, R., Nonomura, Y., Kimura, S., Ohashi, K., Murakami, F., Handa, S., and Eguchi, G. (1977b). J. Biochem. 82, 317-337.
Maruyama, K., Kimura, M., Kimura, S., Ohashi, K., Suzuki, K., and Katunuma, N. (1981a). J. Biochem. 89, 711-715.
Maruyama, K., Kimura, S., Ohashi, K., and Kuwano, Y. (1981b). J. Biochem. 89, 701-709.
Maruyama, K., Kimura, S., Yoshidomi, H., Sawada, H., and Kikuchi, M. (1984a). J. Biochem. 95, 1423-1433.
Maruyama, K., Sawada, H., Kimura, S., Ohashi, K., Higuchi, H., and Umazume, Y. (1984b). J. Cell Biol. 99, 1391-1397.
Maruyama, K., Kimura, S., Yamamoto, K., and Wakabayashi, K. (1985a). Biomed. Res. 6, 343-346.
Maruyama, K., Yoshioka, T., Higuchi, H., Ohashi, K., Kimura, S., and Natori, K. (1985b). J. Cell Biol. 101, 2167-2172.
Masaki, T., and Takaiti, O. (1974). J. Biochem. 75, 367-380.
Mercola, D., Bullard, B., and Priest, J. (1975). Nature (London) 254, 634-635.
Miyahara, M., Kishi, K., and Noda, H. (1980). J. Biochem. 87, 1341-1345.
Moir, A. J. G., Cole, H. A., and Perry, S. V. (1977). Biochem. J. 161, 371-382.
Moir, A. J. G., Ordidge, M., Grand, R. J. A., Trayer, I. P., and Perry, S. V. (1978). Biochem. J. 173, 449-457.
Moir, A. J. G., Solaro, R. J., and Perry, S. V. (1980). Biochem. J. 185, 505-513.
Moir, A. J. G., Ordidge, M., Grand, R. J. A., Trayer, I. P., and Perry, S. V. (1983). Biochem. J. 209, 417-426.
Moos, C., and Feng, I. M. (1980). Biochim. Biophys. Acta 632, 141-149.
Morris, E. P., and Lehrer, S. S. (1981). Biophys. J. 33, 239a.
Muguruma, M., Kobayashi, K., Fukazawa, T., Ohashi, K., and Maruyama, K. (1981). J. Biochem. 89, 1981-1984.
Nagano, K., and Ohtsuki, I. (1982). Proc. Jpn. Acad. Ser. B 58, 73-77.
Nagano, K., Miyamoto, S., Matsumura, M., and Ohtsuki, I. (1980). J. Mol. Biol. 141, 217-222.
Nagano, K., Miyamoto, S., Matsumura, M., and Ohtsuki, I. (1982). J. Theor. Biol. 94, 743-782.
Nagashima, H., and Asakura, S. (1982). J. Mol. Biol. 155, 409-428.
Nagy, B., and Gergely, J. (1979). J. Biol. Chem. 254, 12732-12737.
Nagy, B., Potter, J. D., and Gergely, J. (1978). J. Biol. Chem. 253, 5971-5974.
Nakamura, S., Yamamoto, K., Hashimoto, K., and Ohtsuki, I. (1981). J. Biochem. 89, 1639-1641.
Namba, K., Wakabayashi, K., and Mitsui, T. (1980). J. Mol. Biol. 138, 1-26.
Natori, R. (1954). Jikeikai Med. J. 1, 119-126.
Nonomura, Y., Drabikowski, W., and Ebashi, S. (1968). J. Biochem. 64, 419-422.
Obinata, T., Maruyama, K., Sugita, H., Kohama, K., and Ebashi, S. (1981). Muscle Nerve 4, 456-488.
O'Brien, E. J., Gillis, J. M., and Couch, J. (1975). J. Mol. Biol. 99, 461-475.
Offer, G., Moos, C., and Starr, R. (1973). J. Mol. Biol. 74, 653-676.
Ogawa, Y. (1985). J. Biochem. 97, 1011-1023.
Ohara, O., Takahashi, S., and Ooi, T. (1980). In “Muscle Contraction: Its Regulatory Mechanisms” (S. Ebashi et al., eds.), pp. 259-265. Japan Sci. Soc. Press, Tokyo/Springer-Verlag, Berlin and New York.
Ohashi, K., and Maruyama, K. (1979). J. Biochem. 85, 1103-1105.
Ohashi, K., and Maruyama, K. (1985). J. Biochem. 97, 1323-1328.
Ohashi, K., Mikawa, T., and Maruyama, K. (1982). J. Cell Biol. 95, 85-90.
Ohnishi, S., Maruyama, K., and Ebashi, S. (1975). J. Biochem. 78, 73-81.
Ohtaki, T., Mimura, N., and Asano, A. (1986). J. Muscle Res. Cell Motil., in press.
Ohtsuki, I. (1974). J. Biochem. 75, 753-765.
Ohtsuki, I. (1975). J. Biochem. 77, 633-639.
Ohtsuki, I. (1979a). J. Biochem. 85, 1377-1378.
Ohtsuki, I. (1979b). J. Biochem. 86, 491-497.
Ohtsuki, I. (1980). In “Muscle Contraction: Its Regulatory Mechanisms” (S. Ebashi et al., eds.), pp. 237-249. Japan Sci. Soc. Press, Tokyo/Springer-Verlag, Berlin and New York.
Ohtsuki, I., and Nagano, K. (1982). Adv. Biophys. 15, 93-130.
Ohtsuki, I., and Wakabayashi, T. (1972). J. Biochem. 72, 369-377.
Ohtsuki, I., Masaki, T., Nonomura, Y., and Ebashi, S. (1967). J. Biochem. 61, 817-819.
Ohtsuki, I., Yamamoto, K., and Hashimoto, K. (1981). J. Biochem. 90, 259-261.
Ohtsuki, I., Shiraishi, F., Suenaga, N., Miyata, T., and Tanokura, M. (1984). J. Biochem. 95, 1337-1342.
Onoyama, Y., and Ohtsuki, I. (1986). J. Biochem., in press.
Ooi, T., Mihashi, K., and Kobayashi, H. (1962). Arch. Biochem. Biophys. 98, 1-11.
Pardo, J. V., Siliciano, J., and Craig, S. W. (1983a). Proc. Natl. Acad. Sci. U.S.A. 80, 1008-1012.
Pardo, J. V., Siliciano, J., and Craig, S. W. (1983b). J. Cell Biol. 97, 1283-1287.
Pato, M. D., Mak, A. S., and Smillie, L. B. (1981). J. Biol. Chem. 256, 602-607.
Pearlstone, J. R., and Smillie, L. B. (1977). Can. J. Biochem. 55, 1032-1038.
Pearlstone, J. R., and Smillie, L. B. (1978). Can. J. Biochem. 56, 521-527.
Pearlstone, J. R., and Smillie, L. B. (1980). Can. J. Biochem. 58, 649-654.
Pearlstone, J. R., and Smillie, L. B. (1981). FEBS Lett. 128, 119-122.
Pearlstone, J. R., and Smillie, L. B. (1982). J. Biol. Chem. 257, 10587-10592.
Pearlstone, J. R., and Smillie, L. B. (1983). J. Biol. Chem. 258, 2534-2542.
Pearlstone, J. R., Carpenter, M. R., Johnson, P., and Smillie, L. B. (1976). Proc. Natl. Acad. Sci. U.S.A. 73, 1902-1906.
Pearlstone, J. R., Carpenter, M. R., and Smillie, L. B. (1977a). J. Biol. Chem. 252, 971-977.
Pearlstone, J. R., Carpenter, M. R., and Smillie, L. B. (1977b). J. Biol. Chem. 252, 978-982.
Pearlstone, J. R., Johnson, P., Carpenter, M. R., and Smillie, L. B. (1977c). J. Biol. Chem. 252, 983-989.
Perry, S. V. (1979). Biochem. Soc. Trans. 7, 593-617.
Perry, S. V. (1980). In “Muscle Contraction: Its Regulatory Mechanisms” (S. Ebashi et al., eds.), pp. 207-220. Japan Sci. Soc. Press, Tokyo/Springer-Verlag, Berlin and New York.
Perry, S. V., Cole, H. A., Head, J. F., and Wilson, F. J. (1973). Cold Spring Harbor Symp. Quant. Biol. 37, 251-262.
Potter, J. D., and Gergely, J. (1974). Biochemistry 13, 2697-2703.
Potter, J. D., and Gergely, J. (1975). J. Biol. Chem. 250, 4628-4633.
Potter, J. D., Seidel, J. C., Leavis, P. C., Lehrer, S. S., and Gergely, J. (1976). J. Biol. Chem. 251, 7551-7556.
Potter, J. D., Hsu, F.-J., and Pownall, H. J. (1977). J. Biol. Chem. 252, 2452-2454.
Ray, K. P., and England, P. J. (1976). FEBS Lett. 70, 11-16.
Reddy, Y. S., and Wyborny, L. E. (1976). Biochem. Biophys. Res. Commun. 73, 703-709.
Reid, R. E., Clare, D. M., and Hodges, R. S. (1980). J. Biol. Chem. 255, 3642-3646.
Schaub, M. C., and Perry, S. V. (1969). Biochem. J. 115, 993-1004.
Seki, N., and Watanabe, T. (1984). J. Biochem. 95, 1161-1167.
66
IWAO OHTSUKI ET AL.
Sjostrand, F. S. (1962).J . Ultrmtruct. Res. 7, 225-246. Small, J. V., and Sobieszek, A. (1977). J . Cell Sci. 23, 243-268. Sodek, J., Hodges, R. S., and Smillie, L. B. (1978).J. Biol. Chem. 253, 1129-1136. Spudich, J. A,, and Watt, S. (1971). J . Biol. Chem. 246,4866-4871. Spudich, J. A,, Huxley, H. E., and Finch, J. T. (1972).J. Mol. Biol. 72,619-632. Starr, R., and Offer, G. (1971). FEES Lett. 15, 40-44. Starr, R., and Offer, G. (1978). Bi0chem.J. 171, 813-816. Stewart, M. (1975a). Proc. R. Sac. London, Ser.B 190, 257-266. Stewart, M. (1975b). FEES Lett. 53, 5-7. Stewart, M., and Diakiw, V. (1978). Nature (London) 274, 184-186. Stone, D., and Smillie, L. B. (1978).J. Biol. Chem. 253, 1137-1148. Stossel, T. P., and Hartwig, J. H. (1975).J. Biol. Chem. 250, 5706-5712. Strasburg, G. M., Greaser, M. L., and Sundaralingam, M. (1980).J. Biol. Chem. 255,38063808. Sundaralingam, M., Bergstrorn, R., Strasburg, G., Rao, S. T., Roychowdhury, P., Greaser, M. L., and Wang, B. C. (1985). Science 227, 945-948. Suzuki, A., and Nonami, Y. (1982). Agric. Biol. Chem. 46, 1103-1104. Suzuki, A., Saito, M., lwai, H., and Nonami, Y. (1978). A p ’ c . Biol. Chem. 42, 2117-2122. Syska, H., Perry, S. V., and Trayer, I. P. (1974). FEES Lett. 40, 253-257. Syska, H., Wilkinson, J. M., Grand, R. J. A., and Perry, S. V. (1976). Biochem. J . 153,375387. Takahashi, K., Nakamura, F., Hattori, A., and Yamanoue, M. (1985).J. Biochem. 97,10431051. Talbot, J. A., and Hodges, R. S. (1981).J. Biol. Chem. 256, 2798-2802. Tanokura, M., and Ohtsuki, I. (1982). FEES Lett. 145, 147-149. Tanokura, M., and Ohtsuki, I. (1984).J. Biochem. 95, 1417-1421. Tanokura, M., Tawada, Y., Onoyama, Y., Nakamura, S., and Ohtsuki, I. (1981).J. Biochem. 90,263-265. Tanokura, M., Tawada, Y., and Ohtsuki, I. (1982).J. Biochem. 91, 1257-1265. Tanokura, M., Tawada, Y., Ono, A,, and Ohtsuki, I. (1983).J. Biochem. 93, 331-337. Tawada, Y., Ohara, H., Ooi, T., and Tawada, K. (1975).J. Biochem. 78, 65-72. Tokuyasu, K. T. (1983).J. 
Cell Biol. 97, 562-565. Tokuyasu, K. T., Maher, P. A., and Singer, S. 1. (1984).J. Cell Biol. 98, 1961-1972. Toyoshima, C., and Wakabayashi, T. (1985).J. Biochem. 97, 245-263. Trinick, J. (1981).J. Mol. Biol. 151, 309-314. Trinick, J., Knight, P., and Whiting, A. (1984).J. Mol. Biol. 180, 331-356. Tsao, T.-C., Bailey, K., and Adair, G. S. (1951). Biochem. J. 49, 27-36. Tsukui, R., and Ebashi, S. (1973).J. Biochem. 73, 1119-1121. van Eerd, J.-P., and Kawasaki, Y. (1972). Biochem. Biophys. Res. Commun. 47, 859-865. van Eerd, J.-P., and Takahashi, K. (1976). Biochemistry 15, 1171-1180. Wakabayashi, K., and Namba, K. (1981). Biophys. Chem. 14, 111-122. Wakabayashi, T., Huxley, H. E., Amos, L. A:, and Klug, A. (1975).J. Mol. Biol. 93,477497. Wallimann, T., Turner, D. C., and Eppenberger, H. M. (1977).J. Cell B i d . 75,297-3 17. Wallimann, T., Schlosser, T., and Eppenberger, H. M. (1984). J . Biol. Chem. 259, 52385246. Wang, K. (1982). I n “Methods in Enzymology” (D. W. Frederiksen and L. W. Cunningham, eds.), Vol. 85, pp. 264-274. Academic Press, New York. Wang, K. (1984). I n “Contractile Mechanisms in Muscle” (G. H. Pollack and H. Sugi, eds.), pp. 285-305. Plenum, New York.
REGULATORY PROTEINS OF VERTEBRATE MUSCLE
67
Wang, K. (1985). Int. Congr. Cell Biol. Tokyo. Wang, K., and Williamson, C. L. (1980). Proc. Natl. Acad. Sci. U.S.A. 77, 3254-3258. Wang, K., McClure, J., and Tu, A. (1970). Proc. Natl. Acad. Sci. U.S.A. 76, 3698-3702. Wang, K., Ramirez-Mitchell, R., and Palter, D. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 3685-3689. Wang, M. S., and Greaser, M. L. (1985).J. Muscle Res. Cell Motil. 6, 293-312. Weber, A., and Murray, J. M. (1973). Physiol. Rev. 53, 612-673. Weeks, R. A., and Perry, S. V. (1978). Biochem. J. 173,449-457. Wilkins, J. A., and Lin, S. (1982). Cell 28, 83-90. Wilkinson, J. M. (1980). Eur.J. Biochem. 103, 179-188. Wilkinson, J. M., and Grand, R. J. A. (1975). Biochem. J. 149, 493-496. Wilkinson, J. M., and Grand, R. J. A. (1978). Nature (London) 271, 31-35. Wnuk, W., Schoechlin, M., and Stein, E. A. (1984).J. Biol. Chem. 259, 9017-9023. Woodhead, J. L., and Lowey, S. (1983).J. Mol. Biol. 168, 831-846. Woods, E. F. (1967).J. Biol. Chem. 242, 2859-2871. Wray, J. S., Vibert, P. J., and Cohen, C. (1978).J. Mol. Biol. 124, 501-521. Yamada, K. (1978). Biochim. Biophys. Acta 535, 342-347. Yamada, K., and Kometani, K. (1982).J. Biochem. 92, 1505-1517. Yamamoto, K. (1983).J. Biochem. 93, 1061-1069. Yamamoto, K. (1984).J. Biol. Chem. 259, 7163-7168. Yamamoto, K., and Maruyama, K. (1973).J . Biochem. 73, 1111-1 114. Yamamoto, K., and Ohtsuki, I. (1982).J. Biochem. 91, 1669-1677. Yanagida, T., Nakase, M., Nishiyama, K., and Oosawa, F. (1984). Nature (London) 307, 58-60. Yang, Y.-Z., Korn, E. D., and Eisenberg, E. (1979).J. Biol. Chem. 254, 7137-7140. Yates, L. D., and Greaser, M. L. (1983).J. Mol. Biol. 168, 123-141. Yoshidomi. H., Ohashi, K., and Maruyama, K. (1985). Bioined. Re.s. 4, 207-212. Yoshioka, T., Higuchi, H., Kimura, S., Ohashi, K., Umazume, Y., and Maruyama, K. (1986). Biomed. Res. 7 , in press. Zot, H. G., and Potter, J. D. (1982).J. Biol. Chem. 257, 7678-7683.
MECHANISTIC ASPECTS OF DNA TOPOISOMERASES

By ANTHONY MAXWELL¹ and MARTIN GELLERT

Laboratory of Molecular Biology, National Institute of Arthritis, Diabetes, and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892
I. Introduction
II. The Reactions of Topoisomerases
III. DNA Binding
IV. DNA Cleavage
    A. The Cleavage Reaction
    B. Cleavage-Site Specificity
    C. The DNA-Protein Bond
V. DNA Reunion
VI. ATP Hydrolysis
VII. Processivity in Topoisomerase Reactions
VIII. Covalent Modification of Topoisomerases
IX. Mechanistic Models
X. Concluding Remarks
References

I. INTRODUCTION
DNA topoisomerases are enzymes that catalyze changes in the topology of circular DNA. With a closed-circular double-stranded DNA, one type of reaction alters the number of times the two strands are wound around each other and thus changes the degree of supercoiling. Supercoiled DNA molecules are prevalent in cells, and enzymes that can modify this property are important in DNA metabolism. Reactions involving other topological isomers of DNA are also known; various topoisomerases can form or resolve knotted or catenated structures in circular duplex DNA, or form knots in single-stranded circular DNA. Some of these reactions also have biological importance; for instance, replication of a circular duplex DNA often produces two catenated circles, which then have to be separated.

Cells of all organisms examined to date have been found to contain DNA topoisomerases; commonly there are several distinct types in a cell. Where a genetic test has been feasible, the presence of at least one topoisomerase has been found to be essential for cell growth. It is the purpose of this article to discuss various lines of information bearing on the enzymatic mechanisms of topoisomerases. Other reviews, with a more general viewpoint, deal with their biological as well as enzymatic properties (Gellert, 1981; Drlica, 1984; Wang, 1985; Vosberg, 1985).

¹ Present address: Department of Biochemistry, University of Leicester, Leicester LE1 7RH, England.

The topological state of a circular molecule is, by definition, invariant under deformation, and can be changed only as a result of breaking and resealing the chain. Topoisomerases are enzymes capable of performing just such breakage-resealing cycles in a DNA chain.

To simplify later discussion, a few definitions relevant to DNA supercoiling are useful. (For more detailed discussions of DNA supercoiling, see Fuller, 1978; Bauer, 1978; Bauer et al., 1980; Wang, 1986.) The topological state of a closed-circular duplex DNA that is not catenated or knotted can be characterized by its linking number, α, which expresses the number of times the two strands are interwound. (For discussions of catenanes and knots, see Schill, 1971; White and Cozzarelli, 1984; Cozzarelli et al., 1984.) A simple way of determining α is to consider a projection of the molecule onto a plane; the excess of right-handed over left-handed strand crossings is the linking number. The linking number can be partitioned into twist and writhe (α = Tw + Wr), where twist is determined by the local pitch of the helix, and writhe is a measure of the contortion of the helix axis in space (Fig. 1). For a relaxed DNA molecule, we set α = α₀. The average value of writhe of a relaxed DNA is zero, and its twist is the number of base pairs (N) divided by the helical repeat (10.5 base pairs per turn in solution; Shore and Baldwin, 1983; Horowitz and Wang, 1984), so that Tw₀ = N/10.5. For supercoiled DNA, the linking difference α − α₀ is equal to ΔTw + Wr, where ΔTw = Tw − N/10.5. Although α can be changed only upon breaking and rejoining strands, the proportions of twist and writhe vary in response to solution conditions, for example, temperature and salt concentration (see Bauer, 1978, for review). The linking number (α) of supercoiled DNA can be greater or less than α₀.
If α < α₀, the DNA is said to be negatively supercoiled; if α > α₀, the DNA is positively supercoiled. Covalently circular DNA isolated from both procaryotic and eucaryotic sources is generally found to be negatively supercoiled. A consequence of negative supercoiling is that the DNA helix is more easily unwound, i.e., the strands are more readily separated, whereas positive supercoiling, by tightening the pitch of the helix, would make unwinding more difficult. In either case, supercoiled DNA has excess free energy over that of the relaxed state. This excess free energy is found to be closely proportional to the square of the linking difference (α − α₀) for moderate values of α (Depew and Wang, 1975; Pulleyblank et al., 1975). The free energy involved can be quite large. For a negatively supercoiled DNA with a specific linking difference [defined as (α − α₀)/α₀] of −0.06 (a typical value for bacterial
DNA), an increase (relaxation) of 1 unit in linking number decreases its free energy by 9 kcal/mol. Similarly, the binding of a protein that unwinds DNA would be associated with a favorable free energy, as would transitions to alternate structures in DNA, such as cruciforms or Z-DNA, which relieve superhelical stress. The topological state of DNA is thus important to its biological functions, as also indicated by the profound influence of DNA supercoiling on such processes as replication, transcription, and recombination.

[Figure 1 panel labels: upper, α = 40, Tw = 40, Wr = 0; center, α = 32, Tw = 32, Wr = 0; lower left, α = 32, Tw = 32, Wr = 0; lower right, α = 32, Tw = 40, Wr = −8]
FIG. 1. The relationship among writhe, twist, and linking number. The upper panel shows a relaxed closed-circular DNA molecule with 40 turns of double helix. (In all the illustrations, the numbers along the molecule count units of twist.) The molecule lies flat in the plane of the paper and thus has no writhe. In the molecule in the central illustration, the linking number and twist have been reduced by 8; this change would require breaking and resealing of the DNA backbone. This deficiency is represented by the open region in the molecule; the helical repeat of the DNA (number of base pairs per turn) is unchanged throughout the remainder of the molecule, and the writhe is zero. This molecule can be considered as being in equilibrium with other conformers having altered helical repeat and/or writhe. Two extreme possibilities are illustrated at the bottom of the figure. The molecule at the lower left has accommodated the decreased linking number by a uniform increase in the helical repeat such that the reduced twist is now spread evenly through the DNA; the writhe is still zero. In the illustration at the lower right, the molecule retains the original helical repeat and accommodates the deficiency in linking number by having 8 (negative) units of writhe. The conformation of DNA molecules of moderate size (several kilobases) in solution will fall between these two extremes, having changes in linking number apportioned to both twist and writhe (Shore and Baldwin, 1983; Horowitz and Wang, 1984).

Since the description of the first topoisomerase, ω protein (now called Escherichia coli topoisomerase I), by Wang (1971), many enzymes of this type from both procaryotic and eucaryotic sources have been identified. A summary of the properties of selected topoisomerases is presented in Table I. Despite their diverse origins, these enzymes show common features in their reactions and appear to conform to the same general mechanism of action. In addition to the many enzymes that have been identified on the basis of their topoisomerase activities, some enzymes, originally known for other catalytic activities, have been found subsequently to be topoisomerases. These include the Int protein of bacteriophage λ and the resolvase proteins of the transposons Tn3 and γδ (for reviews, see Nash, 1981; Weisberg and Landy, 1983; Grindley, 1983; Grindley and Reed, 1985). These proteins can carry out recombination reactions in which breakage of DNA strands is followed by their joining to a different site in the same or another DNA molecule; these reactions have a clear analogy to topoisomerization, in which rejoining is to the same site at which breakage occurred. It is likely that the topoisomerization reactions of the Int and resolvase proteins represent aborted recombination reactions where the transfer of DNA strands to new locations has not been achieved.
It has been suggested (Mizuuchi, 1984) that the whole class of enzymes could be considered as DNA strand transferases without regard for the intra- or intermolecular nature of the joining.
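The definitions given above (α₀ = N/10.5, α = Tw + Wr, and the specific linking difference (α − α₀)/α₀) can be illustrated with a short Python sketch. The function names, the 420-bp example molecule (chosen so that α₀ = 40, as in Fig. 1), and the constant k are assumptions introduced here for illustration; only the relations themselves are taken from the text, and the quadratic free energy is shown in arbitrary units rather than kcal/mol.

```python
HELICAL_REPEAT_BP = 10.5  # base pairs per helical turn in solution (value quoted in the text)


def relaxed_linking_number(n_bp, helical_repeat=HELICAL_REPEAT_BP):
    """alpha_0 = N / helical repeat (the average writhe of relaxed DNA is zero)."""
    return n_bp / helical_repeat


def writhe(alpha, twist):
    """Wr obtained from the partition alpha = Tw + Wr."""
    return alpha - twist


def specific_linking_difference(alpha, alpha0):
    """(alpha - alpha_0) / alpha_0."""
    return (alpha - alpha0) / alpha0


def supercoiling_sign(alpha, alpha0):
    """Classify a topoisomer by the sign of its linking difference."""
    if alpha < alpha0:
        return "negatively supercoiled"
    if alpha > alpha0:
        return "positively supercoiled"
    return "relaxed"


def relative_free_energy(alpha, alpha0, k=1.0):
    """Quadratic form k * (alpha - alpha_0)**2.

    k is a hypothetical proportionality constant in arbitrary units;
    it is not a value given in the text.
    """
    return k * (alpha - alpha0) ** 2


# Hypothetical 420-bp circle, so that alpha_0 = 40 as in Fig. 1.
alpha0 = relaxed_linking_number(420)
alpha = 32
print(supercoiling_sign(alpha, alpha0))            # negatively supercoiled
print(specific_linking_difference(alpha, alpha0))  # -0.2
print(writhe(alpha, twist=40))                     # -8 (lower-right conformer of Fig. 1)
print(writhe(alpha, twist=32))                     # 0 (lower-left conformer of Fig. 1)
```

Under the quadratic form, relaxing this molecule by one unit (α = 32 to α = 33) changes the relative free energy by k[2(α − α₀) + 1] = −15k, so each successive relaxation step releases less energy as the molecule approaches the relaxed state.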
II. THE REACTIONS OF TOPOISOMERASES

The first activity of DNA topoisomerases to be described was the relaxation of supercoiled closed-circular DNA, i.e., conversion to a less supercoiled form (Wang, 1971). This activity was clearly distinct from that of nucleases since the products were covalently closed and relaxation could occur in a stepwise fashion. A dependence on DNA ligase was ruled out, since no energy source was required for this reaction. All topoisomerases discovered subsequently can relax negatively supercoiled DNA; the ability to relax positively supercoiled DNA is less general. The only topoisomerase so far shown to be able to introduce negative supercoils into DNA is DNA gyrase (Gellert et al., 1976a). A "reverse gyrase" that introduces positive superhelical turns into DNA in the presence of ATP has been isolated from the thermophilic bacterium Sulfolobus (Kikuchi and Asai, 1984; Mirambeau et al., 1984); this organism also contains a conventional DNA gyrase. It has been suggested that positive supercoiling of DNA could be useful to an organism growing at high temperature because it will tend to prevent denaturation (Kikuchi and Asai, 1984).

TABLE I
Properties of Selected Topoisomerases

| Enzyme | Type | Source | Gene | Subunit MW (kDa) | Structure | Protein-bound DNA end | Relaxation of (+) supercoils? | Relaxation of (−) supercoils? | Supercoiling | ATP requirement? |
|---|---|---|---|---|---|---|---|---|---|---|
| Procaryotic topoisomerase I (ω protein) | I | Bacteria (e.g., E. coli) | topA | ~100 | Monomer | 5' | No | Yes | No | No |
| Int | I | Phage λ | int | 40 | ? | 3' | Yes | Yes | No | No |
| Resolvase | I | Transposons γδ and Tn3 | tnpR | 21 | ? | 5' | ? | Yes | No | No |
| Eucaryotic topoisomerase I | I | Yeast | MAK1 | 90 | Monomer | 3' | Yes | Yes | No | Noᵇ |
| | | Rat liver, HeLa cells, etc. | ? | ~100ᵃ | | | | | | |
| DNA gyrase (procaryotic topoisomerase II) | II | Bacteria (e.g., E. coli) | gyrA, gyrB | 100, 90 | A₂B₂ | 5' | Noᶜ | Yes | Yes | Yesᵈ |
| E. coli topoisomerase II′ | II | E. coli | gyrA, gyrB | 100, 50ᵉ | A₂B′₂ | 5' | Yes | Yes | No | No |
| T4 topoisomerase | II | Phage T4 | Gene 39, Gene 52, Gene 60 | 57, 48, 18 | ? | 5' | Yes | Yes | No | Yes |
| Eucaryotic topoisomerase II | II | Yeast | TOP2 | 150 | Dimer | 5' | Yes | Yes | No | Yes |
| | | Drosophila, HeLa cells, etc. | ? | ~170ᵃ | | | | | | |

ᵃ Also active as proteolytic fragments.
ᵇ ATP may stimulate or inhibit DNA relaxation by some eucaryotic type I enzymes.
ᶜ Will relax positively supercoiled DNA only in the presence of ADPNP.
ᵈ Required for supercoiling reaction only.
ᵉ Proteolytic fragment (B′) of gyrase B protein.

It is clear that some DNA transformations can be accomplished by the cleavage of a single DNA strand while others (such as catenation of two closed-circular duplexes) must involve cleavage of both strands. This has led to the classification of topoisomerases whose reactions proceed via a transient single-strand break as type I, while enzymes whose reactions proceed via double-strand breaks are described as type II (Liu et al., 1980). Figures 2 and 3 display these possible reactions. In addition to this nomenclature, enzymes have also been named in order of their discovery in a particular cell type. Thus the enzyme originally designated ω protein is now called E. coli topoisomerase I and DNA gyrase is named E. coli topoisomerase II. This system has been adopted for some topoisomerases (such as E. coli topoisomerase I and HeLa topoisomerase II) whereas trivial names are more commonly used with others (such as DNA gyrase). Throughout the remainder of this article we have elected to use the names for the topoisomerases which are most frequently used; reference to Table I will give alternative names for these enzymes.

A variety of methods has been used to study DNA supercoiling and relaxation. Electrophoresis in agarose gels is convenient and, for DNA with molecular weight below 10⁷, is capable of resolving topological isomers differing by 1 unit in linking number.
Other methods, including ethidium fluorescence, sedimentation analysis, and electron microscopy, have also been used. DNA catenanes and knots can be identified by gel electrophoresis or electron microscopy. (For discussions of methods used in topoisomerase assays, see Wang and Kirkegaard, 1981; several papers in Wu et al., 1983; Wang, 1985; Vosberg, 1985.)

[Figure 2 panel labels: DUPLEX FORMATION; CATENATION/DECATENATION]
FIG. 2. Reactions of type I topoisomerases. The figure shows the major DNA transformations that can be carried out by type I topoisomerases. Note that catenation or decatenation of circular duplexes will occur only if one DNA molecule bears a single-strand break.

A consequence of topoisomerase reactions that occur via either transient single- or double-strand breaks in DNA is that the former will occur in linking number increments of 1, while the latter must proceed with linking number increments in multiples of 2 (Fuller, 1978). This has proved to be an effective diagnostic test for distinguishing type I and type II topoisomerases (Fig. 4). Incubation of a purified single DNA topoisomer with either E. coli topoisomerase I or DNA gyrase, under appropriate conditions, shows that the former relaxes the DNA in linking number increments of 1, whereas the latter both supercoils and relaxes DNA in steps of 2 (Brown and Cozzarelli, 1979; Mizuuchi et al., 1980a). Other topoisomerases have also been subjected to this diagnostic test (see for example, Liu et al., 1980; Miller et al., 1981; Brown and Cozzarelli, 1981).
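The logic of this diagnostic test can be sketched in Python. This is a minimal illustration, assuming an idealized gel in which every product band is assigned an integer linking difference; the function names and the example starting topoisomer (linking difference −10) are hypothetical and are not taken from any of the experiments cited.

```python
from functools import reduce
from math import gcd


def reachable_topoisomers(start, step, lowest, highest):
    """Linking differences in [lowest, highest] reachable from `start`
    by repeated changes of +/- `step` (1 for type I, 2 for type II)."""
    return [d for d in range(lowest, highest + 1) if (d - start) % step == 0]


def infer_enzyme_type(observed):
    """Guess the enzyme type from the linking differences of the product bands.

    The common spacing of the bands is the greatest common divisor of their
    offsets from the first band; an even spacing is what a type II enzyme
    acting on a single topoisomer would give.
    """
    spacing = reduce(gcd, (abs(d - observed[0]) for d in observed[1:]), 0)
    if spacing == 0:
        raise ValueError("need at least two distinct topoisomers")
    return "type II" if spacing % 2 == 0 else "type I"


# Hypothetical single topoisomer with linking difference -10.
print(reachable_topoisomers(-10, 1, -12, -6))  # [-12, -11, -10, -9, -8, -7, -6]
print(reachable_topoisomers(-10, 2, -12, -6))  # [-12, -10, -8, -6]
print(infer_enzyme_type([-10, -9, -7]))        # type I
print(infer_enzyme_type([-10, -8, -6]))        # type II
```

The sketch also shows why the test requires a single purified topoisomer: starting from a mixture of adjacent topoisomers, products of both parities appear whichever type of enzyme is used. With only a few visible bands, an even spacing could arise by chance with a type I enzyme, so the inference is suggestive rather than conclusive.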
[Figure 3 panel labels: KNOTTING/UNKNOTTING; CATENATION/DECATENATION]
FIG. 3. The reactions of type II topoisomerases. The figure shows the major DNA transformations that can be carried out by type II topoisomerases.
Aside from the ability to relax (or supercoil) DNA in steps of either 1 or 2, other criteria have been used to distinguish type I and type II enzymes. One useful criterion is the catenation and decatenation of duplex circles (Liu et al., 1980). Type I enzymes, because of their inability to make double-strand breaks, can only catalyze these reactions when at least one circle bears a single-strand nick, whereas type II enzymes can perform these reactions with intact circles (Tse and Wang, 1980). Among the two types of topoisomerases, some enzymes are able to carry out only a subset of the reactions shown in Figs. 2 and 3. For example, the type I topoisomerase from mouse embryo is able to relax positively supercoiled DNA (Champoux and Dulbecco, 1972), whereas E. coli topoisomerase I does not display this activity (Wang, 1971). All of the DNA forms in Figs. 2 and 3 have been found in vivo or may be generated in vitro. For example, catenated DNA molecules have been found in mitochondria (Hudson and Vinograd, 1967) and as products
of plasmid replication (Novick et al., 1973; Sakakibara et al., 1976), SV40 replication (Jaenisch and Levine, 1973), and in vitro recombination mediated by both bacteriophage λ Int protein and the resolvase of transposon γδ (Mizuuchi et al., 1980b; Reed, 1981b). The kinetoplasts of trypanosomes contain many double-stranded DNA circles in a vast catenated network which can be resolved to simple circles by the action of topoisomerases (Marini et al., 1980). Knotted DNA can be formed by the action of T4 topoisomerase on duplex circles (Liu et al., 1980) and by recombination by Int protein in plasmids containing two attachment sites in inverted orientation (Mizuuchi et al., 1980a).

FIG. 4. Topoisomerases change the linking number of DNA in steps of either 1 or 2. The figure shows a representation of DNA topoisomers after their resolution by gel electrophoresis. Each band represents a topoisomer with a unique linking difference (α − α₀) given by the numbers at the extreme left and right of the figure. In a hypothetical experiment, a group of negatively supercoiled topoisomers (a) with the given distribution of linking numbers, is relaxed by either a type I or type II topoisomerase. The result is a group of topoisomers of increased linking numbers (b), irrespective of whether relaxation was by a type I or type II enzyme. However, if a single topoisomer from (a) is selected (c), then relaxation by type I or type II topoisomerases gives different results. Relaxation by type I enzymes (d) gives a result similar to that found in (b), whereas relaxation by type II enzymes (e) generates topoisomers differing in linking number from the substrate, and from each other, by multiples of 2.

If we assume a common mechanism for the topoisomerase reactions illustrated in Figs. 2 and 3, then this mechanism must possess certain features. DNA binding and cleavage are clearly required and the positions of broken ends relative to another single- or double-stranded segment of DNA must somehow be altered before rejoining. As the product DNA is covalently closed, this break can only be transient and the enzyme must be capable of reforming the broken bond(s). Phosphodiester
bond formation is an energy-requiring process, yet many topoisomerases act with no external energy source. Therefore, the energy of the broken phosphodiester bond is conserved within the enzyme-DNA complex. Mechanistic studies of topoisomerases have revealed several of these aspects and putative reaction intermediates have been isolated. We now review the experimental data which demonstrate these features of topoisomerase action, with special emphasis on the most recent work.

III. DNA BINDING

There are interesting differences among the DNA-binding properties of topoisomerases. These differences are reflected in the reactions they can perform.

Procaryotic topoisomerase I has a preference for binding to single-stranded DNA. With single-stranded or highly supercoiled DNA molecules, the enzyme forms complexes that are stable even in concentrated salt solutions such as 3 M CsCl; these complexes can be detected by ultracentrifugation (Depew et al., 1978; Liu and Wang, 1979). Addition of Mg²⁺ to these complexes leads to their dissociation. However, addition of alkali leads to the appearance of cleaved DNA molecules having protein covalently attached to the DNA (see below). This enzyme also forms complexes with relaxed duplex DNA but these complexes are not dissociated by the addition of Mg²⁺ and cleavage of the DNA is not found when alkali is added (Liu and Wang, 1979). Complexes between procaryotic topoisomerase I and DNA can also be detected by their retention on nitrocellulose filters. Using this method the preferential binding of the enzyme to an internal single-strand break in a linear duplex DNA has been shown (Dean et al., 1982; Dean and Cozzarelli, 1985). Positively supercoiled DNA is not normally a substrate for relaxation by E. coli topoisomerase I (Wang, 1971), presumably due to its lack of single-stranded character. However, if a positively supercoiled DNA molecule containing a single-stranded heterologous loop is constructed, it can then be relaxed (Kirkegaard and Wang, 1985), indicating that the single-stranded region allows the enzyme to bind. These data suggest that the preference of procaryotic topoisomerase I for binding to single-stranded DNA governs its choice of substrate. Tse-Dinh et al. (1983) have shown that E. coli topoisomerase I can bind to and cleave single-stranded oligonucleotides as short as seven bases in length; thus the normal reaction of the enzyme may involve binding to only a short stretch of DNA.

The Int protein of bacteriophage λ was originally identified through its role in site-specific recombination, but has since been found also to
have a topoisomerase activity (Nash et al., 1977; Kikuchi and Nash, 1979). Int protein mediates site-specific recombination between attachment sites on phage λ and E. coli DNA. These sites are called attP and attB, respectively (for a review, see Nash, 1981). Although the binding of Int protein to DNA is rather specific for att sites (Kotewicz et al., 1977; Kikuchi and Nash, 1978), the topoisomerase activity may be less sensitive to DNA sequence as evidenced by the ability of the Int protein to relax DNA molecules lacking att sites (Kikuchi and Nash, 1979). However, it is likely that the sites of topoisomerase action by Int protein have close homologies to sequences within the attachment sites (Craig and Nash, 1983). Examination by electron microscopy of complexes formed between the Int protein and attP DNA suggests the occurrence of Int in an oligomeric form in these complexes (Hamilton et al., 1981). A more detailed electron microscopic study (Better et al., 1982) showed that Int protein condensed about 230 bp of DNA (in the attP region) into a complex containing 4-8 Int monomers. It has been suggested that an oligomer of the Int protein forms a complex with attP in which DNA is wrapped around a protein core (Hamilton et al., 1981; Better et al., 1982; Pollock and Nash, 1983). It is unlikely that the topoisomerase reaction of Int requires the formation of such a structure because DNA molecules that are relaxed by Int do not appear to have clusters of Int binding sites.

Like Int, the resolvase proteins of transposons Tn3 and γδ also exhibit topoisomerase activity; these proteins carry out site-specific recombination between appropriately oriented res sites in supercoiled DNA as part of the transposition process (for a review, see Grindley, 1983). Using nuclease-protection studies, Grindley et al. (1982) and Kitts et al. (1983) have shown that resolvase protects three regions 30-40 bp in size in this segment of DNA.
Each binding site shows a degree of dyad symmetry, suggesting the existence of six half-sites, and some sequence homology between binding sites is apparent. Half-sites have been shown to permit specific binding of resolvase, though not as efficiently as an intact site (Grindley et al., 1982). The recombinational crossover event occurs within site I (Kostriken et al., 1981; Reed, 1981a), but all three sites are required for the resolution reaction to occur (Grindley et al., 1982). A distinctive feature of the recombination reaction mediated by Tn3 and γδ resolvases is the requirement for two res sites present as direct repeats (Reed and Grindley, 1981; Kitts et al., 1983). DNA relaxation by Tn3 resolvase also occurs efficiently only on DNA molecules containing two res sites in a directly repeated orientation (Krasnow and Cozzarelli, 1983). DNA molecules with two res sites in inverted orientation or with one or no res sites are not substrates for either the relaxation or resolution reactions. This similarity in the requirements for the two processes implies that the same breakage-reunion reaction is intrinsic to both resolution and topoisomerization by resolvase. The apparently strict requirements regarding this orientation of res sites in DNA have led to the proposal of "tracking" models of resolvase action, involving the one-dimensional diffusion of the resolvase protein along the DNA helix (Krasnow and Cozzarelli, 1983; Kitts et al., 1983; Grindley and Reed, 1985).

The resolvase protein from γδ has been crystallized (Weber et al., 1982), but high-resolution structure determination has been hampered by the poor diffraction of X rays by these crystals. However, chymotryptic cleavage of resolvase yields two defined fragments, one of which (MW 15,500) forms crystals suitable for X-ray diffraction analysis (Abdel-Meguid et al., 1984), but shows no capacity for DNA binding. The other fragment (MW 5000) was found to bind to res sites, but, unlike the intact protein, displayed differential affinity for the six half-sites (Abdel-Meguid et al., 1984).

Of the type II topoisomerases, the best studied with respect to the formation of a protein-DNA complex is DNA gyrase. The properties of the gyrase-DNA complex have been examined by many methods, including nuclease protection, filter binding, sedimentation, and electron microscopy. The first indication of the nature of the gyrase-DNA complex came from the studies of Liu and Wang (1978a). They incubated nicked-circular DNA with Micrococcus luteus DNA gyrase in the absence of ATP and found that, after sealing the nicks with DNA ligase, the DNA was positively supercoiled. This suggested that the binding of gyrase to DNA involves wrapping of the DNA around the enzyme with a unique handedness. The possibility that DNA wraps around the enzyme was further supported by digestion of gyrase-DNA complexes with staphylococcal nuclease or pancreatic DNase I (Liu and Wang, 1978b).
Digestion of nicked DNA by staphylococcal nuclease in the presence of either E. coli or M. luteus gyrase led to the accumulation of DNA fragments about 140 bp in length. Digestion with DNase I gave a group of single-stranded DNA fragments differing in length by about 10 bp, identified by electrophoresis under denaturing conditions. Both the length of the nuclease-resistant segment and the periodicity of the DNase I cleavage pattern are consistent with wrapping of DNA around the enzyme; these properties are similar to those of the nucleosome (reviewed by McGhee and Felsenfeld, 1980). Sedimentation studies of the complex formed between M. luteus gyrase and about 140 bp of T7 DNA were consistent with a particle of
molecular weight 470,000 (Klevan and Wang, 1980). SDS (sodium dodecyl sulfate)-polyacrylamide gel electrophoresis indicated that this particle contained about equal amounts of the A and B proteins, and cross-linking with dimethyl suberimidate yielded a protein complex of MW 420,000. Taken together, these results suggest a complex between DNA gyrase and 140 bp of DNA containing two molecules of each of the gyrase proteins (A2B2). In addition to the 420,000 MW species, cross-linking studies of the gyrase-DNA complex also identified protein species of MW 230,000 and 330,000, which were tentatively assigned as the A2 and A2B species, respectively (Klevan and Wang, 1980). Cross-linking of the purified A protein also yielded the 230,000 MW species, identifying this species as the A dimer, but the purified B protein gave no cross-linked products.
The more detailed topography of gyrase complexed with short DNA molecules has been revealed by several analyses of the products of pancreatic DNase I digestion (Fisher et al., 1981; Kirkegaard and Wang, 1981; Morrison and Cozzarelli, 1981); the results from these studies are essentially in agreement. Gyrase protects a region of 120-155 bp of DNA from nuclease digestion, with a stretch of 40-50 bp, roughly centrally located, being the most strongly protected. Flanking this stretch are regions less strongly protected, showing sites of enhanced sensitivity to DNase I spaced 10-11 bp apart. These sensitive sites are staggered by two bases on complementary strands. Oxolinic acid-directed DNA cleavage by gyrase will be discussed below, but it is relevant to note here that the site of cleavage is found within the 40-50 bp of DNA most strongly protected from the nuclease.
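The subunit bookkeeping behind the A2B2 assignment can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the cross-linked molecular weights quoted above at face value, assumes the tentative A2 and A2B assignments are correct, and uses hypothetical helper names (`infer_subunit_masses`, `predicted_tetramer_mass`) that do not come from the cited work.

```python
# Molecular weights of the cross-linked species as quoted in the text
# (Klevan and Wang, 1980). The species assignments are the tentative ones
# given there; the subtraction scheme below is an illustrative assumption.
OBSERVED_MW = {"A2": 230_000, "A2B": 330_000, "A2B2": 420_000}


def infer_subunit_masses(observed):
    """Estimate A and B protomer masses from the A2 and A2B species."""
    mass_a = observed["A2"] / 2
    mass_b = observed["A2B"] - observed["A2"]
    return mass_a, mass_b


def predicted_tetramer_mass(mass_a, mass_b):
    """Mass expected for an A2B2 assembly built from the inferred protomers."""
    return 2 * mass_a + 2 * mass_b


mass_a, mass_b = infer_subunit_masses(OBSERVED_MW)
predicted = predicted_tetramer_mass(mass_a, mass_b)
relative_gap = (predicted - OBSERVED_MW["A2B2"]) / OBSERVED_MW["A2B2"]

print(f"A ~ {mass_a:,.0f}, B ~ {mass_b:,.0f}")
print(f"predicted A2B2 ~ {predicted:,.0f} ({relative_gap:+.1%} vs observed)")
```

Under these assumptions the inferred protomers add up to within a few percent of the observed 420,000 species, which is the sense in which the three cross-linked bands are mutually consistent with an A2B2 complex.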
Although DNA gyrase protects DNA from nucleases, little or no protection from methylation by dimethyl sulfate was observed (Kirkegaard and Wang, 1981). This indicates that the entire length of the DNA in the gyrase-DNA complex is accessible to solvent, which is again similar to observations made on the nucleosome (McGhee and Felsenfeld, 1980).
Complexes between DNA and DNA gyrase may also be detected by their retention on nitrocellulose filters (Peebles et al., 1978; Morrison et al., 1980; Kirkegaard and Wang, 1981; Higgins and Cozzarelli, 1982; Maxwell and Gellert, 1984). These complexes appear to be very stable, with half-lives as long as 60 hours (Higgins and Cozzarelli, 1982). Complex formation requires approximately equivalent amounts of the A and B proteins, and the enzyme binds more readily to relaxed or linear DNA than to supercoiled DNA (Higgins and Cozzarelli, 1982); single-stranded DNA is not bound by gyrase (M. Gellert and M. H. O'Dea, unpublished observation). Addition of oxolinic acid leads to a complex that is more stable to the addition of high salt (Higgins and Cozzarelli, 1982).
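To put the 60-hour half-life quoted above on a more familiar scale, it can be converted into a dissociation rate constant. The following is a minimal sketch that assumes simple first-order (single-exponential) decay of the complex, which the source does not state; the function names are hypothetical.

```python
import math


def first_order_rate_constant(half_life_s):
    """Rate constant k (per second) for first-order decay: k = ln 2 / t_half."""
    return math.log(2) / half_life_s


def fraction_remaining(elapsed_s, half_life_s):
    """Fraction of complexes still intact after elapsed_s seconds."""
    return 0.5 ** (elapsed_s / half_life_s)


# Longest half-life quoted in the text (Higgins and Cozzarelli, 1982).
HALF_LIFE_S = 60 * 3600

k_off = first_order_rate_constant(HALF_LIFE_S)
print(f"k_off ~ {k_off:.2e} per second")
print(f"intact after 1 h: {fraction_remaining(3600, HALF_LIFE_S):.3f}")
```

On this assumption the complex dissociates with a rate constant of a few times 10^-6 per second, so essentially all complexes survive the time scale of a typical filter-binding assay.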
Using linear DNA fragments of different lengths, Morrison et al. (1980) found gyrase to form a more stable complex with a 509-bp DNA fragment than with a 176-bp fragment and no detectable complex with a fragment of 77 bp. In similar experiments, Maxwell and Gellert (1984) found that gyrase formed a filter-stable complex with DNA fragments 117 bp or greater in size but not with those of 55 bp or less. However, at high DNA concentrations, the 55-bp fragment did form filter-stable complexes with DNA gyrase; the binding isotherms were consistent with these complexes containing two DNA molecules per gyrase molecule. Hence, although about 140 bp of DNA is protected from nucleases by gyrase, the enzyme can form filter-stable complexes with shorter DNA molecules under certain conditions.
Examination of the complexes formed between DNA and DNA gyrase by electron microscopy (Moore et al., 1983; Lother et al., 1984; Kirchhausen et al., 1985) reveals particles 20-25 nm in diameter, likely to represent the A2B2 complex. Although all catalytic activities of DNA gyrase require both subunits (apart from the ATPase activity of the B protein; see below), these authors also observed particles associated with DNA when only the A protein was present. In the experiments of Lother et al. (1984), DNA gyrase complexed with linear, relaxed, or supercoiled DNA appeared to be associated with a single region of DNA. However, the gyrase complexes observed by Moore et al. (1983) were frequently located at the intersection of two DNA duplexes, forming looped structures with closed-circular DNA. The differences between the complexes observed by these groups may simply reflect differences in the conditions used to form gyrase-DNA complexes or to prepare them for electron microscopy. Examination of electron micrographs of gyrase complexes at high magnification led Kirchhausen et al.
(1985) to suggest that the A2B2 complex is “heart-shaped” with the A proteins forming the upper and larger lobes of the structure. Estimates of the amount of DNA associated with gyrase in these complexes gave values of 115 or 161 bp, depending upon how the measurements were made. These sizes are not inconsistent with the extent of DNA protected by gyrase from nucleases.
DNA binding by T4 topoisomerase is evidenced by the retention of fragments of T4 DNA on nitrocellulose filters (Kreuzer and Alberts, 1984). When visualized by electron microscopy, complexes between T4 topoisomerase and DNA appear as particles of heterogeneous size (21-30 nm in diameter), somewhat larger than the enzyme in the absence of DNA (15 nm; Kreuzer and Huang, 1983). In complexes with closed-circular DNA, enzyme molecules are found exclusively at the intersection of DNA strands, forming looped structures with the enzyme located
at the base of the loops. It has been suggested that these loops may indicate structures important in the topoisomerase reaction.
Although the DNA-binding properties of several procaryotic topoisomerases have been well characterized, little information is currently available concerning eucaryotic enzymes. Some eucaryotic topoisomerases may be intimately associated with other nuclear proteins; HeLa topoisomerases I and II have been found to be associated with chromatin (Javaherian and Liu, 1983; Liu et al., 1983b). HeLa topoisomerase I has been shown to bind to the nonhistone protein HMG17, which also stimulates DNA catenation by the enzyme (Javaherian and Liu, 1983; Tse et al., 1984). It has been suggested that type II topoisomerase is an important component of the chromosomal scaffold in interphase nuclei and mitotic chromosomes from chicken cell lines (Earnshaw et al., 1985). In addition, an ATP-dependent topoisomerase has been found associated with several other enzymes of DNA metabolism in a complex (termed the replitase complex) isolated from the nuclei of Chinese hamster embryo fibroblast cells (Noguchi et al., 1983).
IV. DNA CLEAVAGE
A. The Cleavage Reaction
DNA cleavage by a topoisomerase was first shown with E. coli topoisomerase I (Depew et al., 1976, 1978). Addition of alkali to the salt-stable complex formed between the enzyme and circular single-stranded bacteriophage fd DNA led to breakage of the DNA and covalent binding of the protein to the 5' end; the other end was shown to have a free 3'-hydroxyl group (Depew et al., 1978). If the DNA is labeled throughout with 32P, label can be found transferred to the protein subsequent to DNA cleavage and removal of the DNA by hydrolysis (Tse et al., 1980). These results are consistent with the cleavage of a phosphodiester bond by E. coli topoisomerase I and the covalent attachment of the enzyme to the newly formed 5'-phosphoryl terminus. Single-stranded and negatively supercoiled DNA are substrates for this reaction but relaxed closed-circular DNA is not (Liu and Wang, 1979); nicked circular DNA has been found to be cleaved by the enzyme at sites on the intact strand near to the nick (Kirkegaard et al., 1984; Dean and Cozzarelli, 1985). Synthetic single-stranded DNA homopolymers and copolymers are also cleaved by this enzyme (Depew et al., 1978) and single-stranded oligonucleotides as short as seven bases in length will serve as substrates; in this case, cleavage can occur spontaneously without the addition of alkali (Tse-Dinh et al., 1983).
The interaction of eucaryotic type I topoisomerases with DNA can also be interrupted by alkali, leading to DNA cleavage (Champoux, 1976), but in this case free 5'-hydroxyl termini are generated and the protein is attached to a 3'-phosphoryl group (Champoux, 1977). Eucaryotic type I topoisomerases will cleave both single- and double-stranded DNA (Champoux, 1976; Prell and Vosberg, 1980); single-stranded DNA can be cleaved without the addition of alkali (Been and Champoux, 1980). Cleavage of single-stranded DNA has been shown to occur only at regions with the potential for intramolecular base pairing (Been and Champoux, 1985), implying that a duplex region is an important recognition feature in the interaction of this enzyme with DNA. This contrasts with E. coli topoisomerase I, which appears to require single-stranded regions for both DNA binding and DNA cleavage.
The topoisomerases involved in site-specific recombination, Int and resolvase, have both been shown to cleave DNA (Reed and Grindley, 1981; Craig and Nash, 1983). Breakage of DNA by γδ resolvase results in attachment of the protein at a 5'-phosphoryl group, leaving free 3'-hydroxyl termini (Reed and Grindley, 1981); however, Int protein attaches to 3'-phosphoryl groups (Craig and Nash, 1983). The Int protein is the only procaryotic topoisomerase so far shown to bind covalently to the 3'-phosphoryl termini of DNA.
DNA cleavage by E. coli DNA gyrase can be induced by incubation in the presence of the drug oxolinic acid followed by the addition of a detergent such as SDS (Gellert et al., 1977; Sugino et al., 1977); both strands are cleaved. Oxolinic acid is a member of the quinolone class of antibacterial compounds, which inhibit the topoisomerase activity of DNA gyrase.
Although the quinolone drugs were once thought to bind to the A protein, experiments have failed to demonstrate binding of these drugs to DNA gyrase, but have instead reported binding to DNA (Shen and Pernet, 1985). Thus, the effect of oxolinic acid on gyrase may be mediated by interaction with DNA. Little DNA cleavage by M. luteus DNA gyrase is induced by the oxolinic acid/SDS treatment, but the addition of alkali promotes cleavage by this enzyme (Tse et al., 1980). Linear, relaxed, and supercoiled DNA are all substrates for cleavage by E. coli gyrase (Sugino et al., 1977), but the cleavage of single-stranded DNA has not been reported, consistent with the observation that this DNA is not bound by gyrase. Cleavage does not require ATP, but ATP or the analog ADPNP (5'-adenylyl β,γ-imidodiphosphate) increases the efficiency of cleavage and alters the pattern of preference among cleavage sites (Sugino et al., 1978; Peebles et al., 1978). Although DNA is usually broken in both strands in this reaction, some single-strand breakage can also be detected (Gellert et al., 1977). DNA cleavage by gyrase
leads to the generation of free 3'-hydroxyl termini and the A protein is found covalently bound to the newly formed 5'-phosphoryl termini (Morrison and Cozzarelli, 1979; Morrison et al., 1981). The related enzyme topoisomerase II' (composed of the gyrase A protein and a fragment of the B protein) also demonstrates oxolinic acid-dependent DNA cleavage in a similar fashion (Gellert et al., 1979; Brown et al., 1979). DNA cleavage by DNA gyrase in the absence of oxolinic acid may be induced when Ca2+ is substituted for Mg2+ in reaction mixtures (L. M. Fisher, M. H. O'Dea, and M. Gellert, unpublished observations). Cleavage induced by Ca2+ occurs at the same loci as that induced by oxolinic acid but with different relative efficiencies.
Cleavage of DNA by T4 topoisomerase can be induced by the addition of detergent (Kreuzer and Huang, 1983), and the presence of oxolinic acid greatly enhances cleavage (Kreuzer and Alberts, 1984) to the point that almost every topoisomerase molecule breaks DNA. Unlike DNA gyrase, the T4 enzyme also cleaves single-stranded DNA (Kreuzer, 1984); this reaction does not require the addition of detergent and is inhibited (not stimulated) by oxolinic acid. Addition of ATP or the analog ATPγS [adenosine 5'-O-(3-thiotriphosphate)] has little effect upon the double-stranded cleavage of cytosine-containing DNA by the T4 enzyme, but with native T4 DNA (containing glucosylated hydroxymethylcytosine residues) as the substrate, ATP or ATPγS dramatically increases the frequency of cleavage (Kreuzer and Alberts, 1984). As with DNA gyrase, DNA cleavage by the T4 enzyme leads to covalent attachment of a subunit of the enzyme (in this case the gene 52 protein) to the 5'-phosphoryl end and the generation of free 3'-hydroxyl termini (Rowe et al., 1984).
DNA cleavage by the eucaryotic type II topoisomerases from calf thymus and Drosophila can also be induced by the addition of detergent (Liu et al., 1983a; Sander and Hsieh, 1983); both strands are cleaved and the protein is covalently attached to the 5'-phosphoryl termini of the DNA. ATP is not required for DNA cleavage by either enzyme, but in the case of the Drosophila enzyme, ATP has a stimulatory effect without altering the cleavage specificity (Sander and Hsieh, 1983). The calf thymus enzyme also cleaves single-stranded DNA and it sometimes cleaves double-stranded DNA in only one strand (Liu et al., 1983a).
In the presence of several drugs known for their antitumor activities, calf thymus topoisomerase II has been shown to cleave DNA more efficiently (Nelson et al., 1984; Chen et al., 1984). These drugs include both DNA intercalators like ellipticine and 4'-(9-acridinylamino)methanesulfon-m-anisidide (m-AMSA), and nonintercalative compounds such as the epipodophyllotoxins VP-16 and VM-26. The drug m-AMSA had
previously been shown to produce protein-associated DNA breaks in vivo (Zwelling et al., 1981); interestingly, the ortho isomer, o-AMSA, which has no antitumor activity, does not produce DNA breaks in vivo and shows a reduced stimulation of the cleavage reaction by calf thymus topoisomerase II (Nelson et al., 1984). The results suggest that topoisomerase II may be the primary in vivo target for these drugs and that stimulation of DNA cleavage may be the cause of their cytotoxicity. It has also been reported that the production of double-strand breaks induced by m-AMSA in mouse mastocytoma cells is inhibited by novobiocin, a known inhibitor of type II topoisomerases (Marshall et al., 1983). The nonintercalative drugs VP-16 and VM-26 and related compounds have also been shown to induce DNA breaks by the type II topoisomerase from Novikoff hepatoma cells (Minocha and Long, 1984). In this case, the role of the enzyme in cytotoxicity is suggested by the observation of a similar hierarchy in the potency of different drugs for cytotoxicity, DNA breakage, and inhibition of the DNA catenation reaction of the enzyme.
The mechanism of action of these drugs has not been established, but their effects are reminiscent of those of oxolinic acid on DNA gyrase and T4 topoisomerase. Although the calf thymus topoisomerase cleavage reaction is not stimulated by oxolinic acid, the DNA gyrase cleavage reaction is weakly stimulated by m-AMSA (M. Gellert and M. H. O'Dea, unpublished results) and the cleavage of DNA by the T4 enzyme is stimulated by both oxolinic acid and m-AMSA (Kreuzer and Alberts, 1984; L. F. Liu, personal communication), raising the possibility that these drugs may have common mechanisms of action.
In a study of the interaction of calf thymus topoisomerase II with phosphorothioate-substituted DNA, Darby and Vosberg (1985) found that closed-circular DNA fully substituted with phosphorothioate linkages in one strand was a very poor substrate for relaxation, although it was cleaved just as well as unmodified DNA. This indicates that inhibition of the interaction of topoisomerase II with phosphorothioate-substituted DNA occurs at a stage subsequent to DNA cleavage.
B. Cleavage-Site Specificity
DNA cleavage by many topoisomerases shows some sequence preference, generating DNA fragments of discrete sizes. However, a well-defined sequence specificity has been observed for only a few enzymes and analysis at the nucleotide level has in most cases revealed only a rough consensus sequence. It is possible that there are sequence determinants lying some distance from the cleavage site which have yet to be elucidated.
The type I topoisomerases from both E. coli and M. luteus have been found to cleave several synthetic DNA homopolymers, e.g., poly(dT) and poly(dC), and also short oligomers of dG, dA, dT, and dC (Depew et al., 1978; Tse-Dinh et al., 1983), indicating a lack of absolute cleavage specificity. Despite this finding, the enzymes clearly show some sequence preference on natural DNA substrates. Using a single-stranded DNA fragment of plasmid pBR322, Tse et al. (1980) found several preferred cleavage sites for both the M. luteus and E. coli enzymes. Although there was considerable overlap between the site preference of the two enzymes, there were also some differences. The only rule that could be derived from these experiments was a preference for a C residue at the fourth position 5' to the cleavage site (Tse et al., 1980). Analysis of 66 sites of E. coli topoisomerase I cleavage in a 385-bp stretch of bacteriophage φX174 DNA showed that 97% of the sites had a C residue four bases 5' to the cut and 82% had a C residue four bases 3' to the cut (Dean and Cozzarelli, 1985), extending the observations of Tse et al. (1980). However, as mentioned above, when a nicked circular duplex substrate is employed, the preferred sites of DNA cleavage are found in the intact strand at loci close to the nick. Using duplex DNA molecules bearing single-stranded gaps of various lengths, Kirkegaard et al. (1984) have shown that E. coli topoisomerase I cleaves within the gap close to the junction with double-stranded DNA. It seems that the sequence specificity of the enzyme can be overcome by its strong preference for nicks or single-stranded gaps in double-stranded DNA.
With SV40 DNA as substrate, Edwards et al. (1982) mapped 68 sites of cleavage by eucaryotic type I topoisomerase in an 827-bp region, using enzyme from both calf thymus and HeLa cells.
These two topoisomerases were found to share the same cleavage sites. Analysis at the nucleotide level revealed no stringent sequence specificity, but a preference for certain nucleotides surrounding the cleavage site was found, giving rise to the following consensus sequence:
Cleavage occurs at the site indicated by the asterisk. There are two or three alternative bases at each position; the order of preference reads from top to bottom. One cleavage site that occurred only on single-stranded DNA did not conform to the consensus sequence but was found to be located in the loop of a potential hairpin. In a similar study,
Been et al. (1984a) mapped 223 rat liver topoisomerase I and 245 wheat germ topoisomerase I cleavage sites in six regions covering 1781 bp of SV40 viral DNA. These authors derived a consensus sequence for cleavage that was similar to that of Edwards et al. (1982), but which contained fewer degeneracies, and can be represented as 5'-(A/T)(G/C)(A/T)T*-3'. However, where the sequence data of Edwards et al. and Been et al. overlap, differences in the cleavage sites are apparent, as a consequence either of the different sources of enzyme or the different reaction conditions used by the two groups.
The consensus sequences for eucaryotic type I topoisomerase cleavage sites derived by Edwards et al. (1982) and Been et al. (1984a) are relatively nonspecific and suggest sequence preference rather than absolute specificity. However, in studies of topoisomerase I cleavage sites in the ribosomal genes of Tetrahymena, a higher degree of sequence specificity has been found (Gocke et al., 1983; Bonven et al., 1985). Identical cleavage sites for Tetrahymena type I topoisomerase were found in vitro on naked DNA or reconstituted chromatin, or in vivo, occurring in regions previously identified as being hypersensitive to DNase I in native chromatin and proposed to be devoid of nucleosomes. Examination of the sequences around these cleavage sites shows that they conform to the consensus sequence determined by Edwards et al. (1982) and Been et al. (1984a). However, several other sites within the ribosomal genes which also match these consensus sequences are not cleaved, implying a more stringent sequence discrimination in this case. The consensus sequence of Bonven et al. (1985) is 5'-A(A/G)ACTT*AGA(A/G)AAA(A/T)(A/T)(A/T)-3'. Using type I enzymes from other eucaryote sources (e.g., calf thymus and Drosophila), they found the same cleavage sites on Tetrahymena ribosomal DNA as had been found using the endogenous topoisomerase.
These authors propose that topoisomerase I may be involved in the transcriptional control of the ribosomal genes. The differences between the cleavage specificities observed by Bonven et al. (1985) and those of Edwards et al. (1982) and Been et al. (1984a) may be a consequence of the effect of Ca2+ on the cleavage reaction; Bonven et al. (1985) report that omission of Ca2+ largely relieves the observed cleavage specificity.
While breakage of double-stranded DNA by wheat germ topoisomerase I required the addition of SDS, breakage of single-stranded DNA has been observed in the absence of denaturant, and although some sites overlapped with those on double-stranded DNA, others did not (Been et al., 1984b). Been and Champoux (1985) have found that cleavage sites for the rat liver type I topoisomerase can occur in regions of single-stranded DNA with the potential for intramolecular base pairing. They found that the deletion of sequences remote from the cleavage
sites which disrupted potential hairpin structures also abolished cleavage at these sites. These results indicate that eucaryotic type I topoisomerases require at least a short region of duplex in order to execute the cleavage reaction (Been and Champoux, 1985). This contrasts with procaryotic topoisomerase I, which appears to require a single-stranded region in order to interact with DNA.
Studies of DNA cleavage by the Int protein of bacteriophage λ have shown that Int can cleave att-containing DNA at specific loci (Craig and Nash, 1983). The sites of breakage lie within the 15-bp core region common to both attB and attP; a single cut is made independently in each strand at positions seven bases apart. Thus, although Int binds to multiple sites, it cleaves only in the core, at the sites involved in strand exchange (Craig and Nash, 1983). This property of site-specific cleavage by Int protein reflects its role in recombination and contrasts with the relatively nonspecific cleavage of DNA by the archetypal procaryotic and eucaryotic type I topoisomerases. Similarly, the resolvase protein from γδ has been shown to cleave DNA containing two res sites at a specific sequence within the res sites (Reed and Grindley, 1981). The sequence at the cleavage site is 5'-TTAT*AA-3', with the break occurring at the site indicated by the asterisk; a two-base 3' extension is thus generated.
Studies of cleavage by DNA gyrase have shown that the enzyme cleaves the DNA in both strands, yielding a four-base stagger between the cuts (Morrison and Cozzarelli, 1979; Gellert et al., 1980; Kirkegaard and Wang, 1981). Analysis of oxolinic acid-induced gyrase-directed DNA cleavage in vivo (Lockshon and Morris, 1985) has suggested the following consensus sequence: 5'-RNNNRNRT[G]*G[T]RYC-3',
where N is any nucleotide, R and Y are purine and pyrimidine, respectively, and the asterisk indicates the site of cleavage. The G and T in square brackets were preferred secondarily to T and G, respectively. This sequence shows elements of dyad symmetry about an axis midway between the cleavage sites in the two strands and is consistent with many of the sites found in vitro (Morrison and Cozzarelli, 1979; Fisher et al., 1981; Kirkegaard and Wang, 1981) and with a more limited consensus proposed by Morrison and Cozzarelli (1979). Although sites of oxolinic acid-induced DNA cleavage by gyrase are often found in those regions of DNA identified as gyrase-binding sites (see above), at least one binding site has been found not to contain a cleavage site (Kirkegaard and
Wang, 1981). It should therefore not be assumed that gyrase will necessarily cleave sites to which it binds.
Studies using an in vitro system have suggested that E. coli DNA gyrase may mediate nonhomologous recombination (Ikeda et al., 1980). The recombination reaction is stimulated by oxolinic acid and this stimulation is abolished if an extract from a bacterial strain carrying an oxolinic acid-resistant gyrA gene is used. Sequence analysis has shown that the crossover event often occurs at or near DNA gyrase cleavage sites (Naito et al., 1984; Ikeda et al., 1984). These data have been interpreted as suggesting that gyrase-directed DNA cleavage is involved in the recombination reaction.
Topoisomerase II' cleaves DNA at the same sites as those cleaved by DNA gyrase, although with different efficiencies (Gellert et al., 1979; Brown et al., 1979). DNA gyrase from M. luteus or Bacillus subtilis also appears to cleave DNA at locations similar to those cleaved by the E. coli enzyme (Brown et al., 1979; Sugino and Bott, 1980).
Cleavage of cytosine-containing T4 DNA by the T4 topoisomerase was found to be relatively nonspecific, whereas native T4 DNA (containing glucosylated hydroxymethylcytosine) was cleaved at specific sites (Kreuzer and Alberts, 1984). The T4 topoisomerase also cleaves single-stranded DNA; mapping of these sites in φX174 has shown that some lie in regions of potential secondary structure. It is suggested that the enzyme may cleave DNA at or near the base of hairpins (Kreuzer, 1984). Comparison of the sequences around the sites cleaved in φX174 DNA in either single- or double-stranded form indicates that these two classes of sites are nonoverlapping and that different rules of recognition apply (Kreuzer and Alberts, 1984; Kreuzer, 1984).
Double-stranded DNA cleavage by the Drosophila type II enzyme, like that by DNA gyrase, produces breaks four bases apart, thus generating protruding 5' ends (Sander and Hsieh, 1983).
A consensus sequence for cleavage by the Drosophila enzyme has been suggested following analysis of in vitro cleavage of Drosophila DNA: 5'-GTN(A/T)AY*ATTNATNNG-3' (Sander and Hsieh, 1985). A study of Drosophila type II topoisomerase cleavage sites in DNA from the heat-shock locus and the histone repeat unit of the Drosophila genome showed that many of these sites occurred in intergenic regions and corresponded to sites hypersensitive to micrococcal nuclease (Udvardy et al., 1985). These results suggest that topoisomerase II and micrococcal nuclease may recognize similar elements in DNA. It was proposed that the occurrence of topoisomerase II cleavage sites chiefly in intergenic regions could implicate this enzyme in transcriptional control. DNA cleavage by the calf thymus type II topoisomerase also generates
four-base 5'-protruding ends; this reaction is relatively nonspecific, with more than 12 cleavage sites being detected in a 74-bp fragment of SV40 DNA (Liu et al., 1983a). The pattern of cleavage sites induced by 2-methyl-9-hydroxyellipticine is somewhat different from that induced by m-AMSA (Tewey et al., 1984). It seems, therefore, that in this case the cleavage specificity may be determined to some extent by the drug. This does not appear to be the case with DNA gyrase, where cleavages induced by oxolinic acid and Ca2+ occur at the same loci (see above), implying that in this case site specificity is determined principally by the enzyme.
C. The DNA-Protein Bond
Using 32P-labeled DNA, Tse et al. (1980) showed that, after incubation with procaryotic topoisomerase I and addition of alkali, some label was transferred to the protein and could be found chiefly associated with tyrosine. The properties of the DNA-protein bond identified it as a phosphate ester between a 5'-phosphoryl group in DNA and the O-4 position of tyrosine. An identical linkage was also found between the A protein of E. coli and M. luteus DNA gyrase and DNA (Tse et al., 1980; Sugino et al., 1980). The protein-DNA bond formed between rat liver DNA topoisomerase I and DNA was also found to involve tyrosine, but in this case the linkage was to a 3'-phosphoryl group on DNA (Champoux, 1981). The protein-DNA bond of the T4 enzyme is also a tyrosyl phosphate involving the gene 52 protein of the enzyme (Rowe et al., 1984), identifying this subunit as being involved in the nicking-closing reaction of the enzyme. The only known case with different chemistry is the γδ resolvase (Reed and Moser, 1984); in this case, the bond has been identified as a phosphate ester between serine and a 5'-phosphoryl group on DNA.
Klevan and Tse (1983) have examined the effect of chemical modification of E. coli topoisomerase I and DNA gyrase by tetranitromethane, which reacts preferentially with tyrosine residues.
With each enzyme, treatment with tetranitromethane led to abolition of the topoisomerase activity. Moreover, the enzymes were protected from this inactivation when bound to DNA, implying that some of the modified residues are involved in DNA binding. However, this study does not identify the tyrosine residues involved in the protein-DNA bond as being the amino acid residues whose modification inactivates the enzyme. In the case of DNA gyrase, which has two subunits, it was not determined which subunit was inactivated.
Analysis of mutant resolvase proteins of the transposon γδ has identified amino acid residues probably involved in the recombination reaction (Newman and Grindley, 1984). Mutations at two of these residues, Ser-10 and Gln-14, resulted in reduced binding of the protein to res site I (the crossover site) compared with binding to sites II and III, implying a direct involvement of these residues in binding to the crossover region. Based on the identification of a phosphoryl-serine bond between resolvase and DNA by Reed and Moser (1984), it has been suggested that Ser-10 could be the residue involved in the protein-DNA bond (Newman and Grindley, 1984).
The existence of a phosphodiester bond between proteins and nucleic acids is by no means unique to topoisomerases. For example, the covalent bond formed between the poliovirus RNA-linked protein (VPg) and the 5' terminus of the poliovirus RNA has been determined to be a phosphoryl-tyrosine linkage (Ambros and Baltimore, 1978). Proteins are found covalently attached to the 5' termini of the genomes of phage φ29 and adenovirus via a phosphoryl-serine bond (Hermoso and Salas, 1980; Desiderio and Kelly, 1981). It seems, therefore, that a phosphodiester bond between DNA and either serine or tyrosine residues in proteins may be a common feature of many proteins involved in DNA metabolism.
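The polarities and residues of the covalent linkages described in this section can be collected into a small lookup structure. The sketch below is only a reader's summary of the statements made in Sections IV,A and IV,C; the names `Linkage`, `LINKAGES`, and `enzymes_linked_at` are hypothetical, and the residue for Int is left as `None` because it is not given in the text above.

```python
from typing import NamedTuple, Optional


class Linkage(NamedTuple):
    dna_end: str            # DNA terminus carrying the protein: "5'" or "3'"
    residue: Optional[str]  # amino acid in the bond; None if not stated above


# Entries transcribed from the text; enzyme labels are shortened for illustration.
LINKAGES = {
    "E. coli topoisomerase I": Linkage("5'", "tyrosine"),
    "DNA gyrase (A protein)": Linkage("5'", "tyrosine"),
    "T4 topoisomerase (gene 52 protein)": Linkage("5'", "tyrosine"),
    "rat liver topoisomerase I": Linkage("3'", "tyrosine"),
    "gamma-delta resolvase": Linkage("5'", "serine"),
    "lambda Int": Linkage("3'", None),
}


def enzymes_linked_at(dna_end):
    """Return the enzymes (sorted by name) attached at the given DNA end."""
    return sorted(name for name, link in LINKAGES.items() if link.dna_end == dna_end)


for end in ("5'", "3'"):
    print(end, "->", ", ".join(enzymes_linked_at(end)))
```

Grouping the entries by DNA end makes the pattern noted in the text easy to see: among the enzymes listed, only the eucaryotic type I enzyme and Int attach to a 3'-phosphoryl terminus, and only resolvase uses serine.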
V. DNA REUNION
Although phosphodiester bond cleavage has been shown for many topoisomerases, the reaction that re-creates the phosphodiester bond has been harder to isolate, because it is difficult to trap a complex of broken DNA with the enzyme in an active form. However, some eucaryotic type I topoisomerases can form a stable covalent complex with DNA, the protein-bound end of which can join to the same or another DNA molecule. These reactions were first demonstrated by Been and Champoux (1981) using rat liver topoisomerase I. Incubation of circular single-stranded φX174 DNA with this enzyme at low salt (50 mM KCl) generates linear molecules with the enzyme covalently bound at the 3'-phosphoryl termini. In the presence of high salt (250 mM KCl) or MgCl2 (10 mM), these linear molecules can be recircularized. The recircularization reaction is blocked if the 5' terminus of the linearized DNA is phosphorylated by polynucleotide kinase. The linearized φX174 DNA could also be joined to single-stranded DNA fragments bearing a free 5'-hydroxyl group but not to those with 5'-phosphoryl termini (Been and Champoux, 1981).
Intermolecular strand transfer has also been demonstrated for the type I enzyme from HeLa cells (Halligan et al., 1982). This enzyme can transfer a single-stranded ("donor") DNA to a range of different "acceptor" DNAs, including double-stranded nicked circles and linear double-stranded DNA bearing either flush, 5'-protruding, or 5'-recessed termini. The only requirement of the acceptor species is that it must have a 5'-hydroxyl terminus. Although the cleavage of DNA by this enzyme is somewhat site-specific (see above), the strand-transfer reaction showed no sequence specificity with respect to the acceptor DNA (Halligan et al., 1982). The type I enzyme from avian erythrocytes has also been shown to perform intermolecular strand transfer (Trask and Muller, 1983).
DNA cleavage by E. coli topoisomerase I usually requires denaturant treatment (Depew et al., 1978), but with short oligonucleotides as substrates, cleavage can be achieved under native conditions (Tse-Dinh et al., 1983). Under these conditions, it has also been possible to show transfer of the enzyme-linked oligonucleotide to an acceptor DNA bearing a 3'-hydroxyl group (Y.-C. Tse-Dinh, personal communication).
So far, the strand transfer reaction has only been demonstrated for type I enzymes, where the enzyme-DNA complex can in some cases be isolated without denaturant treatment. For the type II enzymes, only indirect evidence exists for the bond-making reaction. In the oxolinic acid-directed cleavage of supercoiled DNA by DNA gyrase, brief heating of reaction mixtures to 80°C prior to the addition of SDS blocks the cleavage reaction and the supercoiled substrate is recovered intact (Gellert et al., 1977). Similarly, the extent of cleavage of supercoiled DNA by the calf thymus type II enzyme can be reduced either by shifting the reaction mixtures from 37 to 0°C or by adding high salt prior to denaturant treatment (Liu et al., 1983a), again resulting in the isolation of the supercoiled substrate.
That supercoiled and not relaxed DNA is recovered in these reactions implies either that the broken ends of the DNA are held tightly by the enzyme or that the covalent linking of the enzyme to DNA occurs only after the addition of protein denaturant (Liu et al., 1983a).
VI. ATP HYDROLYSIS
Of the known DNA topoisomerases, only DNA gyrase and reverse gyrase have been shown to introduce supercoils into DNA (Gellert et al., 1976a; Kikuchi and Asai, 1984). DNA supercoiling is an energy-requiring process, and both these enzymes require ATP; DNA gyrase has been shown to hydrolyze ATP to ADP and inorganic phosphate (Mizuuchi et al., 1978; Sugino et al., 1978). Surprisingly, many other type II topoisomerases are also ATPases, even though they are only able to relax DNA, an energetically favorable reaction; these include T4 DNA topoisomerase (Liu et al., 1979), Drosophila type II topoisomerase (Hsieh and Brutlag, 1980), and Xenopus type II topoisomerase (Baldi et al., 1980). How these enzymes utilize ATP is not clear. An energy requirement for the reunion of DNA would appear to be unnecessary, because formation of the covalent DNA-protein bond presumably conserves the energy of the cleaved phosphodiester bond.
Not all the reactions of type II topoisomerases require ATP. For example, the DNA-relaxing activities of DNA gyrase and of the related enzyme topoisomerase II' are ATP-independent (Gellert et al., 1977, 1979; Sugino et al., 1977; Brown et al., 1979), some DNA relaxation and the formation of knots by the T4 topoisomerase will occur in the absence of ATP (Liu et al., 1980), and a low level of DNA-relaxing activity by the calf thymus type II topoisomerase is apparent in the absence of ATP (Halligan et al., 1985).
Some type I enzymes have also been shown to interact with ATP. The DNA-relaxing activity of vaccinia virus type I topoisomerase is stimulated by ATP (Foglesong and Bauer, 1984), although no ATPase activity has been detected. The type I enzymes from Ustilago and chicken erythrocytes are both inhibited by ATP (Rowe et al., 1981; Trask et al., 1984). It is possible that, in these cases, ATP acts as an effector which modulates the relaxing activity. Although an ATP requirement is generally considered to be a property only of the type II enzymes, these exceptions show that, by itself, this criterion cannot be used to distinguish type I and type II activities.
The ATPase activities of DNA gyrase and T4 DNA topoisomerase have been the most intensively studied; the remainder of this section will focus on these two enzymes. Both enzymes have a low level of ATPase activity in the absence of DNA, which is greatly increased when DNA is added (Mizuuchi et al., 1978; Sugino et al., 1980; Liu et al., 1979; Kreuzer and Jongeneel, 1983). In the case of DNA gyrase, the DNA-independent activity can be demonstrated with the B protein alone (Staudenbauer and Orr, 1981; Maxwell and Gellert, 1984).
That this activity is intrinsic to DNA gyrase is shown by its sensitivity to novobiocin (a specific inhibitor of DNA gyrase; Gellert et al., 1976b).
The ATPase reaction of DNA gyrase is relatively nonspecific with regard to the structure and sequence of the DNA cofactor. Double-stranded linear, nicked circular, relaxed, and supercoiled DNA molecules stimulate the ATPase activity (Mizuuchi et al., 1978; Sugino and Cozzarelli, 1980), although supercoiled DNA is less effective than other forms. Single-stranded DNA, however, is a poor cofactor (Mizuuchi et al., 1978; Sugino and Cozzarelli, 1980). Similarly, single-stranded DNA is severalfold less effective than double-stranded DNA in stimulating the ATPase reaction of the T4 topoisomerase (Liu et al., 1979; Kreuzer and Jongeneel, 1983).
DNA below a minimum length is a poor cofactor for the ATPase reaction of DNA gyrase. Klevan and Tse (1983) have shown that the ATPase activity of DNA gyrase from M. luteus is five times higher with a 240-bp DNA fragment than with a 100-bp fragment. Similarly, using a range of DNA fragments 46-171 bp in length, Maxwell and Gellert (1984) showed that DNA molecules of 100 bp or greater were effective cofactors for the ATPase reaction of E. coli DNA gyrase but that those of 70 bp or less were not. However, when short DNA molecules (<70 bp) were employed at very high concentrations, the gyrase ATPase was stimulated and the dependence on DNA concentration was sigmoidal (Maxwell and Gellert, 1984), suggesting that two or more short DNA molecules can interact with each DNA gyrase molecule and functionally replace one longer DNA.
Several nucleotides and ATP analogs have been found to competitively inhibit the ATPase reaction of DNA gyrase (Sugino et al., 1978; Sugino and Cozzarelli, 1980). Some of these inhibitors (e.g., ADP and ADPNP) are reported to have Ki values lower than the Km for ATP. In addition to being an ATPase inhibitor, ADPNP will support limited supercoiling by DNA gyrase (Sugino et al., 1978). Incubation of high levels of DNA gyrase with ColE1 DNA and ADPNP led to the introduction of about -0.3 supercoils per A protomer. It is not clear why this value is not -1, but it may represent a lower bound determined by the proportion of inactive enzyme in the preparation. This experiment has been interpreted to suggest that the binding of the nucleotide is sufficient to promote one cycle of the supercoiling reaction but that turnover of the enzyme requires hydrolysis, which is not possible with ADPNP. Therefore a suggested role of ATP hydrolysis is to regenerate the conformational state of the enzyme required to resume the supercoiling cycle (Peebles et al., 1978).
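The competitive inhibition of the gyrase ATPase described above can be illustrated with the standard Michaelis-Menten rate law for a competitive inhibitor, v = Vmax[S] / (Km(1 + [I]/Ki) + [S]). The Python sketch below is an illustration only: the function name `atpase_rate` and the default values of `vmax`, `km`, and `ki` are hypothetical placeholders chosen so that Ki is lower than Km, and are not measured constants for DNA gyrase.

```python
def atpase_rate(atp, inhibitor, vmax=1.0, km=0.3, ki=0.1):
    """Initial rate for a Michaelis-Menten enzyme with a competitive inhibitor.

    atp and inhibitor are concentrations (same units as km and ki).
    vmax, km, and ki are illustrative placeholders, not gyrase data.
    """
    # A competitive inhibitor raises the apparent Km but leaves Vmax unchanged.
    apparent_km = km * (1.0 + inhibitor / ki)
    return vmax * atp / (apparent_km + atp)


if __name__ == "__main__":
    for inhibitor in (0.0, 0.1, 0.3):
        rates = [round(atpase_rate(atp, inhibitor), 3) for atp in (0.3, 3.0, 30.0)]
        print(f"[I] = {inhibitor}: rates at [ATP] = 0.3, 3, 30 ->", rates)
```

In this model, an inhibitor whose Ki is lower than the Km for ATP more than doubles the apparent Km when present at a concentration equal to that Km, yet saturating ATP still restores the uninhibited rate; this is the behavior that distinguishes competitive from noncompetitive inhibition.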
The ATPase reaction of the T4 topoisomerase has been shown to be inhibited by ATPγS; this ATP analog will also support limited relaxation when the enzyme is present at substrate levels (Liu et al., 1979). By analogy with DNA gyrase, one cycle of the relaxation reaction of the T4 enzyme could be promoted by nucleotide binding but enzyme turnover would require hydrolysis.
The ATPase reaction of DNA gyrase is also inhibited by the coumarin drugs coumermycin A1 and novobiocin (Mizuuchi et al., 1978; Sugino et al., 1978). Despite having little structural resemblance to ATP, these drugs are reported to be competitive inhibitors of the ATPase reaction with Ki values at least four orders of magnitude lower than the Km for ATP (Sugino et al., 1978; Sugino and Cozzarelli, 1980). Eucaryotic type II topoisomerases are also inhibited by novobiocin and coumermycin A1 but the binding of these drugs to the enzyme is not as tight as to DNA gyrase (Hsieh and Brutlag, 1980; Miller et al., 1981).
Unlike the coumarin drugs, the quinolones (such as oxolinic acid) only partly inhibit the DNA-dependent ATPase activity of DNA gyrase (Mizuuchi et al., 1978). The effects of these drugs are consistent with their reported interaction with the A protein of DNA gyrase; they do not inhibit the DNA-independent ATPase activity of the B protein even in the presence of the A protein (A. Maxwell and M. Gellert, unpublished observations). The inhibition of the DNA-dependent ATPase activity presumably reflects interaction between the A and B proteins which modulates the activity of the B protein. Since the quinolone drugs can completely inhibit the supercoiling reaction (Gellert et al., 1977) but allow some ATPase activity to proceed, it would appear that, at least under these conditions, ATP hydrolysis and the DNA translocation reaction are not tightly coupled.
Attempts to determine the coupling between the ATPase and supercoiling reactions of gyrase have been hampered by the difficulty of measuring low levels of ATP hydrolysis and by the fact that supercoiled DNA still activates the ATPase reaction (Mizuuchi et al., 1978; Sugino and Cozzarelli, 1980). An estimate of the stoichiometry of the ATPase and supercoiling reactions during the initial part of the reactions yielded a value of 2.5 supertwists introduced per ATP hydrolyzed (Sugino and Cozzarelli, 1980). In this analysis no account was taken of the ATP hydrolysis that would be seen in the presence of fully supercoiled DNA; this correction would tend to increase the estimated number of supertwists per ATP hydrolyzed. Similarly, a preliminary estimate of the stoichiometry between the ATPase and relaxation reactions of T4 DNA topoisomerase has been made (Liu et al., 1979). The value derived was 1 to 2 ATPs hydrolyzed per supercoil relaxed.
Here again, no account was taken of the continued hydrolysis in the presence of the DNA product (relaxed DNA).
Another estimate can be derived from the free energy relations at the supercoiling limit of the gyrase reaction. With DNA gyrase, plasmid DNA can be driven to a superhelix density of -0.1, and apparently no further (Gellert et al., 1976a, and unpublished experiments). At this superhelix density, each additional unit change in linking number would require about 13 kcal/mol, provided that the free energy relations derived at lower superhelix densities (Pulleyblank et al., 1975; Depew and Wang, 1975) are still valid under these conditions. The free energy available from ATP hydrolysis under gyrase reaction conditions is -12 to -13 kcal/mol (Hill, 1977), implying that the limit of gyrase-induced supercoiling may be set by the available free energy. At this limit, each reaction cycle, with its linking number change of two units (about 26 kcal/mol), would require hydrolysis of two molecules of ATP. Obviously, this is only a lower limit on the amount of ATP hydrolyzed under these conditions.
The extent of coupling between ATP hydrolysis and the topoisomerase reaction for both these enzymes must be regarded as rather uncertain. At least for DNA gyrase, and possibly for other enzymes, there is some ATP hydrolysis in the absence of net changes in linking number, indicating that there must be slippage in the coupling of the two processes. The nature of the energy coupling, and of events during futile cycles, clearly demands more attention.
VII. PROCESSIVITY IN TOPOISOMERASE REACTIONS
Topoisomerase reactions have been shown to proceed via both distributive and processive modes (Wang and Liu, 1979). In distributive action the enzyme dissociates from the DNA after each catalytic cycle, whereas in processive action several catalytic cycles occur before the enzyme dissociates. In the relaxation of supercoiled DNA, the appearance of fully relaxed products while some supercoiled substrate still remains indicates a processive mode. Conversely, conversion of all supercoiled substrate to partially relaxed intermediates before the appearance of fully relaxed products indicates a distributive mode.
Several topoisomerases act in either a processive or distributive manner depending upon the reaction conditions. For example, M. luteus topoisomerase I relaxes DNA distributively at 100 mM potassium phosphate but is more processive as the salt concentration is lowered (Kung and Wang, 1977). Similarly, the eucaryotic type I enzyme from rat liver acts in a processive mode below 100 mM KCl but in a distributive mode above 150 mM KCl (McConaughy et al., 1981).
The dependence of the mode of action on salt concentration points to the importance of electrostatic interactions in the binding of the enzymes to DNA: high salt concentrations destabilize protein-DNA interactions, promote dissociation of the enzyme from the DNA, and thus favor a distributive mode of action.
Several other topoisomerases have been shown to act processively on DNA (e.g., DNA gyrase: Morrison et al., 1980; Drosophila type II topoisomerase: Osheroff et al., 1983). This observation imposes certain restrictions on the possible mechanisms of these enzymes. If the enzyme does not dissociate from the DNA between catalytic cycles, then the interactions of the enzyme with DNA must be able to return to their original configuration at the end of each cycle. However, one question which must be considered in this regard is whether, during processive action, the enzyme has to execute multiple rounds of strand breakage and reunion or whether several cycles of topoisomerization can be achieved for each breakage-reunion event.
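The gel-based criterion for distinguishing the two modes, given at the start of this section, can be illustrated with a toy simulation. The Python sketch below is not taken from any of the cited studies; it assumes, purely for illustration, that each binding event picks one DNA molecule at random, that each catalytic cycle removes exactly one supercoil, and that the enzyme dissociates after a fixed number of cycles. The names `relax` and `classify` and all parameter values are hypothetical.

```python
import random


def relax(n_molecules, supercoils, cycles_per_binding, binding_events, seed=0):
    """Return the supercoils left on each molecule after random enzyme binding.

    Toy model (assumption): each binding event picks one molecule at random
    and removes up to `cycles_per_binding` supercoils before dissociating.
    """
    rng = random.Random(seed)
    population = [supercoils] * n_molecules
    for _ in range(binding_events):
        i = rng.randrange(n_molecules)
        population[i] = max(0, population[i] - cycles_per_binding)
    return population


def classify(population, supercoils):
    """Count untouched substrate, partially relaxed, and fully relaxed molecules."""
    untouched = sum(1 for s in population if s == supercoils)
    relaxed = sum(1 for s in population if s == 0)
    return {
        "substrate": untouched,
        "intermediate": len(population) - untouched - relaxed,
        "relaxed": relaxed,
    }


if __name__ == "__main__":
    n, sc, events = 100, 10, 9
    # Distributive: one catalytic cycle per binding event.
    print("distributive:", classify(relax(n, sc, 1, events), sc))
    # Processive: enough cycles per binding event to relax a molecule fully.
    print("processive:  ", classify(relax(n, sc, sc, events), sc))
```

Early in the simulated reaction, the processive setting yields fully relaxed molecules alongside untouched substrate and no intermediates, whereas the distributive setting yields only partially relaxed intermediates. Real enzymes fall between these extremes, and the salt dependence discussed above would correspond to varying the number of cycles per binding event.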
VIII. COVALENT MODIFICATION OF TOPOISOMERASES
DNA topoisomerases have been found to be subject to covalent modifications that modulate their DNA-relaxing activities. An example of this is the ADP-ribosylation of calf thymus topoisomerase I by poly(ADP-ribose) synthetase (Ferro et al., 1983; Jongstra-Bilen et al., 1983; Ferro and Olivera, 1984). The modified enzyme is less active in DNA relaxation. The possibility that topoisomerase I and poly(ADP-ribose) synthetase are associated in vivo was suggested by the observation that they copurify (Ferro et al., 1983; Jongstra-Bilen et al., 1983).
Several topoisomerases have been shown to be substrates for protein kinases. Nuclear extracts from a human cell line contain a protein kinase which phosphorylates DNA topoisomerase I from the same cell line (Mills et al., 1982). The type I topoisomerase purified from Novikoff hepatoma cells was found to be a phosphoprotein (Durban et al., 1983). Treatment with alkaline phosphatase dephosphorylates the enzyme and reduces its DNA-relaxing activity. Subsequent treatment with protein kinase restores the activity of the topoisomerase to its original level (Durban et al., 1983).
The type II topoisomerase from Drosophila has been found to copurify with a protein kinase activity (Sander et al., 1984). That this kinase activity resides in the same polypeptide as the topoisomerase activity was demonstrated by the appearance of a single band on a denaturing polyacrylamide gel after extensive purification, and by the parallel inactivation of the kinase and topoisomerase activities by heat and N-ethylmaleimide treatment. The protein kinase activity will phosphorylate several proteins, including histones and the topoisomerase itself. The phosphorylated residues were found to be threonine and serine; serine was the predominant residue phosphorylated in the topoisomerase (Sander et al., 1984).
Incubation of some topoisomerases with the Rous sarcoma virus transforming gene product pp60src (a tyrosine-specific protein kinase) or TPK75 (a tyrosine-specific protein kinase from normal rat liver) results in the phosphorylation of the topoisomerases and, in the case of calf thymus topoisomerase I and E. coli topoisomerase I, a greater than 90% loss in DNA-relaxing activity (Tse-Dinh et al., 1984). The phosphorylated residue in all cases was reported to be tyrosine. It is not known whether these enzymes are ever similarly phosphorylated in vivo.
IX. MECHANISTIC MODELS
From the foregoing sections, it is clear that all topoisomerase reaction mechanisms share several common features. After DNA binding, which in some cases organizes the DNA in a specific manner, DNA cleavage and the formation of a transient DNA-protein linkage ensue, followed by re-formation of the broken phosphodiester bond(s). Reactions that must exist, but for which there is as yet little experimental evidence, are (1) constraint of the hydroxyl side of the bond by the enzyme, (2) movement of DNA strands between breakage and rejoining, and (3) in the case of DNA gyrase and possibly other type II enzymes, the step that couples ATP hydrolysis to DNA translocation. To integrate these features into an overall reaction scheme, several mechanistic models of topoisomerase action have been proposed. As the topoisomerases vary in the details of their reactions, different models have been framed for particular enzymes.
Early studies of DNA relaxation by type I topoisomerases (Wang, 1971; Champoux and Dulbecco, 1972) suggested that the enzymes might form a covalent bond with one side of the break in the DNA chain. This suggestion, later shown to be correct, still left unspecified the DNA strand motions between breakage and resealing that would account for DNA relaxation. One possibility was that the noncovalently bound DNA end would be partly free to rotate, and could make one or several revolutions before being recaptured by the enzyme, thus dissipating superhelical turns.
With the discovery that several type I enzymes can catenate duplex circles (Fig. 1; Tse and Wang, 1980; Brown and Cozzarelli, 1981) and cleave circular duplex DNA bearing nicks (Kirkegaard et al., 1984; Dean and Cozzarelli, 1985), an alternative mechanism has been suggested.
In this model, the enzyme bridges the single-strand break, attaching covalently to one side and noncovalently to the other. Strand passage through the break is then supposed to occur without disrupting the bridge. Two kinds of evidence in support of this model have been cited. First, it is found that in nicked DNA E. coli topoisomerase I binds close to the site of the nick and, when allowed to cleave the DNA, cuts the intact strand nearly opposite the nick. Second, an important feature of the catenation reaction is that at least one DNA partner must contain a nick or a single-stranded gap. If the enzyme binds at the nick and allows passage of DNA through the intact strand, one can conclude that the enzyme must bind to both sides of the break that it creates, if linearization of the nicked circle is to be avoided.
This reaction scheme has been extended to other reactions of the type I topoisomerases such as DNA relaxation, knotting, and the annealing of complementary single-strand circles (Brown and Cozzarelli, 1981; Dean et al., 1982; Dean and Cozzarelli, 1985). For this mechanism to operate for all type I reactions, the enzyme must act similarly on nicked and unbroken DNA, and be able to pass either single- or double-stranded DNA through the break (though in the latter case it is not excluded that the strands could be translocated one at a time).
It has not been easy to generate tests of the model as it applies to DNA relaxation. One experiment devised to discriminate between the rotation and enzyme-bridging mechanisms for DNA relaxation showed that at early times the only product of relaxation of a negatively supercoiled topoisomer had a linking number increased by exactly one (Brown and Cozzarelli, 1981). This result was interpreted to favor an enzyme-bridging mechanism by the argument that a rotation mechanism could allow larger changes in linking number in a single cycle; however, if the enzyme very efficiently recaptures the partly free end (e.g., if the mean number of rotations per breakage-reunion cycle is less than one), then a rotation mechanism would be compatible with this result.
DNA gyrase has been the subject of intensive studies and most of the mechanistic information about type II topoisomerases concerns this enzyme. Many mechanistic models of DNA gyrase have been proposed (for reviews, see Gellert, 1981; and Wang, 1982). Acceptable models of gyrase action must take into account the following features of the gyrase reactions:
1. The wrapping of a DNA segment (about 120 bp) in a positively supercoiled sense about the enzyme molecule.
2. The cleavage of DNA in both strands and the covalent attachment of the 5'-phosphoryl termini to the A protein.
3. Changing of the linking number of DNA in increments of two.
4. The ability to supercoil, relax, knot, and catenate closed-circular duplex DNA.
Any model incorporating these aspects of the reaction requires that a double-stranded DNA segment be passed through a transient double-strand break which is then resealed. Such models pose two difficulties. First, because both strands of the DNA are broken, the forces stabilizing the structure must come from the protein. If the DNA ends escaped from the complex, the reaction would be aborted. While the enzyme holds the complex together, it must also allow the translocated DNA chain to pass through at least a part of the protein structure. Second, the question whether the DNA segment to be translocated is contained within the region wrapped around the enzyme, or comes from a distant part of the DNA chain, needs to be answered.
In four different models, various assumptions are made to deal with these aspects. Two models (Brown and Cozzarelli, 1979; Mizuuchi et al., 1980a) tacitly assume that the translocated DNA passes through the entire complex. The first also assumes that the translocated DNA could come from either a nearby or distant part of the DNA. The second explicitly postulates that the translocated segment and the double-strand break both reside within the 120-bp wrapped length of DNA, so that, in principle, the fundamental reaction could be seen with a DNA fragment that small. The former model makes catenation easier to understand, because remote segments of a single DNA would not be distinguished from pieces of two different molecules; the latter model more naturally ensures that the translocated segment will be presented in the right orientation to produce negative supercoiling. Neither model gives an entirely obvious way of assuring that the broken DNA ends stay in place during DNA translocation.
Two other models address this question by splitting the translocation into two processes, so that DNA is first translocated through the double-strand break with its attached A subunits, into the interior of the complex, and then later released through a transient aperture in the protein structure. One of these (Wang et al., 1981) proposes that the wrapped DNA segment remains in place, with its superhelical sense unchanged, throughout the supercoiling cycle. The DNA segment to be translocated can come from a nearby or remote part of the same molecule or, in the case of catenation, from another DNA molecule.
Once again, if the translocated segment comes from a very distant part of the same molecule, there is no guarantee that it will present itself in the correct orientation to be supercoiled rather than relaxed. The other model (Morrison et al., 1981) postulates a "doughnut-shaped" gyrase molecule, with the translocated DNA moving from the outside, through a double-strand break in the wrapped segment, into the interior of the protein ring. The translocated segment is considered to be near the wrapped segment in the DNA chain, thus ensuring proper directionality of supercoiling. In order for the translocated segment to escape later, the model postulates partial unwrapping of the exterior DNA segment and passage of the DNA through the same part of the protein by which it entered. Thus in both these models there is the unattractive requirement for two cycles of protein conformational change for each supercoiling step.
None of these models therefore provides a wholly satisfactory account of the supercoiling process and further experimental work will be required to elucidate the details of the gyrase mechanism.
X. CONCLUDING REMARKS
It is now clear that topoisomerases are a diverse and important group of enzymes. Although attention has until recently been focused on their ability to interconvert DNA topoisomers, this does not necessarily constitute the primary biological function for all of them. For instance, the recombinases Int and resolvase can relax DNA but this probably represents a side reaction of their recombination activity. It is likely that other enzymes, already known for different activities, will also be shown to be topoisomerases. Conversely, enzymes currently established as topoisomerases may be found to have other catalytic activities. Thus topoisomerases represent a rather heterogeneous class of enzymes in terms of biological function, but nevertheless share the same basic chemistry of DNA breakage and reunion.
The cellular functions of topoisomerases demand much more attention, particularly in higher eucaryotes where little is known about their biological role. Tantalizing clues have come from their proposed involvement in chromosome structure and transcription complexes, and from demonstrations that their activities can be modulated by covalent modification. These aspects remain to be fully explored.
The mechanism of action of topoisomerases is also of major interest, especially with the ATP-dependent enzymes, which are among the simplest biological examples of coupling between chemical and mechanical processes. Although certain aspects of the reaction mechanisms have been clarified, much is still unknown. From what is currently established, future approaches to this problem can be suggested. The isolation of further intermediates of the topoisomerization reaction (e.g., species involved in strand passage) would assist in the dissection of the steps involved in this process. For type II enzymes, a more detailed analysis of the ATPase reaction and its relationship to strand passage would be very informative.
Physical techniques for monitoring changes in protein structure (such as fluorescence methods) could be usefully employed to determine the role of protein conformational change in the topoisomerization reaction. The availability of some of these enzymes in large amounts through cloning procedures (e.g., E. coli topoisomerase I: Wang and Becherer, 1983; DNA gyrase: Mizuuchi et al., 1984; yeast type II topoisomerase: Goto and Wang, 1984) will allow more extensive structural studies of these enzymes and their complexes with DNA. Such structural information, by analogy with studies of other enzymes, should provide powerful insights into the reaction mechanism.
ACKNOWLEDGMENTS
We would like to thank P. Boon Chock, Howard Nash, Mary O'Dea, and Gerald Selzer for critically reading this review, K. Mizuuchi for advice and discussion, the many colleagues who sent preprints of unpublished work, and Mary Lou Miller for typing the manuscript.
REFERENCES
Abdel-Meguid, S. S., Grindley, N. D. F., Templeton, N. S., and Steitz, T. A. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 2001-2005.
Ambros, V., and Baltimore, D. (1978). J. Biol. Chem. 253, 5263-5266.
Baldi, M. I., Benedetti, P., Mattoccia, E., and Tocchini-Valentini, G. P. (1980). Cell 20, 461-467.
Bauer, W. R. (1978). Annu. Rev. Biophys. Bioeng. 7, 287-313.
Bauer, W. R., Crick, F. H. C., and White, J. H. (1980). Sci. Am. 243, 118-133.
Been, M. D., and Champoux, J. J. (1980). Nucleic Acids Res. 8, 6129-6142.
Been, M. D., and Champoux, J. J. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 2883-2887.
Been, M. D., and Champoux, J. J. (1985). J. Mol. Biol. 180, 515-531.
Been, M. D., Burgess, R. R., and Champoux, J. J. (1984a). Nucleic Acids Res. 12, 3097-3114.
Been, M. D., Burgess, R. R., and Champoux, J. J. (1984b). Biochim. Biophys. Acta 782, 304-312.
Better, M., Lu, C., Williams, R. C., and Echols, H. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 5837-5841.
Bonven, B. J., Gocke, E., and Westergaard, O. (1985). Cell 41, 541-551.
Brown, P. O., and Cozzarelli, N. R. (1979). Science 206, 1081-1083.
Brown, P. O., and Cozzarelli, N. R. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 843-847.
Brown, P. O., Peebles, C. L., and Cozzarelli, N. R. (1979). Proc. Natl. Acad. Sci. U.S.A. 76, 6110-6114.
Champoux, J. J. (1976). Proc. Natl. Acad. Sci. U.S.A. 73, 3488-3491.
Champoux, J. J. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 3800-3804.
Champoux, J. J. (1981). J. Biol. Chem. 256, 4805-4809.
Champoux, J. J., and Dulbecco, R. (1972). Proc. Natl. Acad. Sci. U.S.A. 69, 143-146.
Chen, G. L., Yang, L., Rowe, T. C., Halligan, B. D., Tewey, K. M., and Liu, L. F. (1984). J. Biol. Chem. 259, 13560-13566.
Cozzarelli, N. R., Krasnow, M. A., Gerrard, S. P., and White, J. H. (1984). Cold Spring Harbor Symp. Quant. Biol. 49, 383-400.
Craig, N. L., and Nash, H. A. (1983). Cell 35, 795-803.
Darby, M. K., and Vosberg, H.-P. (1985). J. Biol. Chem. 260, 4501-4507.
Dean, F. B., and Cozzarelli, N. R. (1985). J. Biol. Chem. 260, 4984-4994.
Dean, F. B., Krasnow, M. A., Otter, R., Matzuk, M. M., Spengler, S. J., and Cozzarelli, N. R. (1982). Cold Spring Harbor Symp. Quant. Biol. 47, 769-777.
Depew, R. E., and Wang, J. C. (1975). Proc. Natl. Acad. Sci. U.S.A. 72, 4275-4279.
Depew, R. E., Liu, L. F., and Wang, J. C. (1976). Fed. Proc. Fed. Am. Soc. Exp. Biol. 35, 1493.
Depew, R. E., Liu, L. F., and Wang, J. C. (1978). J. Biol. Chem. 253, 511-518.
Desiderio, S. V., and Kelly, T. J. (1981). J. Mol. Biol. 145, 319-337.
Drlica, K. (1984). Microbiol. Rev. 48, 273-289.
Durban, E., Mills, J. S., Roll, D., and Busch, H. (1983). Biochem. Biophys. Res. Commun. 111, 897-905.
Earnshaw, W. C., Halligan, B., Cooke, C. A., Heck, M. M. S., and Liu, L. F. (1985). J. Cell Biol. 100, 1706-1715.
Edwards, K. A., Halligan, B. D., Davis, J. L., Nivera, N. L., and Liu, L. F. (1982). Nucleic Acids Res. 10, 2565-2576.
Ferro, A. M., and Olivera, B. M. (1984). J. Biol. Chem. 259, 547-554.
Ferro, A. M., Higgins, N. P., and Olivera, B. M. (1983). J. Biol. Chem. 258, 6000-6003.
Fisher, L. M., Mizuuchi, K., O'Dea, M. H., Ohmori, H., and Gellert, M. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 4165-4169.
Foglesong, P. D., and Bauer, W. R. (1984). J. Virol. 49, 1-8.
Fuller, F. B. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 3557-3561.
Gellert, M. (1981). Annu. Rev. Biochem. 50, 879-910.
Gellert, M., Mizuuchi, K., O'Dea, M. H., and Nash, H. A. (1976a). Proc. Natl. Acad. Sci. U.S.A. 73, 3872-3876.
Gellert, M., O'Dea, M. H., Itoh, T., and Tomizawa, J. (1976b). Proc. Natl. Acad. Sci. U.S.A. 73, 4474-4478.
Gellert, M., Mizuuchi, K., O'Dea, M. H., Itoh, T., and Tomizawa, J. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 4772-4776.
Gellert, M., Fisher, L. M., and O'Dea, M. H. (1979). Proc. Natl. Acad. Sci. U.S.A. 76, 6289-6293.
Gellert, M., Fisher, L. M., Ohmori, H., O'Dea, M. H., and Mizuuchi, K. (1980). Cold Spring Harbor Symp. Quant. Biol. 45, 391-398.
Gocke, E., Bonven, B. J., and Westergaard, O. (1983). Nucleic Acids Res. 11, 7661-7678.
Goto, T., and Wang, J. C. (1984). Cell 36, 1073-1080.
Grindley, N. D. F. (1983). Cell 32, 3-5.
Grindley, N. D. F., and Reed, R. R. (1985). Annu. Rev. Biochem. 54, 863-896.
Grindley, N. D. F., Lauth, M. R., Wells, R. D., Wityk, R. J., Salvo, J. J., and Reed, R. R. (1982). Cell 30, 19-27.
Halligan, B. D., Davis, J. L., Edwards, K. A., and Liu, L. F. (1982). J. Biol. Chem. 257, 3995-4000.
Halligan, B. D., Edwards, K. A., and Liu, L. F. (1985). J. Biol. Chem. 260, 2475-2482.
Hamilton, D., Yuan, R., and Kikuchi, Y. (1981). J. Mol. Biol. 152, 163-169.
Hermoso, J. M., and Salas, M. (1980). Proc. Natl. Acad. Sci. U.S.A. 77, 6425-6428.
Higgins, N. P., and Cozzarelli, N. R. (1982). Nucleic Acids Res. 10, 6833-6847.
Hill, T. L. (1977). "Free Energy Transduction in Biology," p. 80. Academic Press, New York.
Horowitz, D. S., and Wang, J. C. (1984). J. Mol. Biol. 173, 75-91.
Hsieh, T., and Brutlag, D. (1980). Cell 21, 115-125.
Huang, W. M., and Buchanan, J. M. (1974). Proc. Natl. Acad. Sci. U.S.A. 71, 2226-2230.
Hudson, B., and Vinograd, J. (1967). Nature (London) 216, 647-652.
Ikeda, H., Moriya, K., and Matsumoto, T. (1980). Cold Spring Harbor Symp. Quant. Biol. 45, 399-408.
Ikeda, H., Kawasaki, I., and Gellert, M. (1984). Mol. Gen. Genet. 196, 546-549.
Jaenisch, R., and Levine, A. J. (1973). J. Mol. Biol. 73, 199-212.
Javaherian, K., and Liu, L. F. (1983). Nucleic Acids Res. 11, 461-472.
Jongstra-Bilen, J., Ittel, M.-E., Niedergang, C., Vosberg, H.-P., and Mandel, P. (1983). Eur. J. Biochem. 136, 391-396.
Kikuchi, A., and Asai, K. (1984). Nature (London) 309, 677-681.
MECHANISTIC ASPECTS OF DNA TOPOISOMERASES
105
Kikuchi, Y., and Nash, H. A. (1978).J. Biol. C h . 253, 7149-7157. Kikuchi, Y., and Nash, H. A. (1979). Proc. Natl. Acad. Sci. U.S.A. 76, 3760-3764. Kirchhausen, T., Wang, J. C., and Harrison, S. C. (1985). Cell 41, 933-943. Kirkegaard, K., and Wang, J. C. (1981). Cell 23, 721-729. Kirkegaard, K., and Wang, J. C. (1985).J . Mol. Biol. 185, 625-637. Kirkegaard, K., Pflugfelder, G., and Wang, J. C. (1984). Cold Spring Harbor Symp. Quant. Biol. 49,411-419. Kitts, P. A., Symington, L. S., Dyson, P., and Sherratt, D. J. (1983).EMBO J. 2, 1055-1060. Klevan, L., and Tse, Y.-C. (1983). Biochim. Biophys. Acta 745, 175-180. Klevan, L., and Wang, J. C. (1980). Biochemistry 19, 5229-5234. Kostriken, R., Morita, C., and Heffron, F. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 40414045. Kotewicz, M., Chung, S., Takeda, Y., and Echols, H. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 1511-1515. Krasnow, M. A., and Cozzarelli, N. R. (1983). Cell 32, 1313-1324. Krasnow, M. A., Matzuk, M. M., Dungan, J. M., Benjamin, H. W., and Cozzarelli, N. R. (1983). In “Mechanisms of DNA Replication and Recombination” (N. R. Cozzarelli, ed.), pp. 637-659. Liss, New York. Kreuzer, K. N. (1984).J. Biol. Chem. 259, 5347-5354. Kreuzer, K. N., and Alberts, B. M. (1984).J. Biol. Chem. 259, 5339-5346. Kreuzer, K. N., and Huang, W. M. (1983).In “Bacteriophage T 4 (C. K. Mathews, E. M. Kutter, G. Mosig, and P. B. Berget, eds.), pp. 90-96. American Society for Microbiology, Washington, D.C. Kreuzer, K. N., and Jongeneel, C. V. (1983). In “Methods in Enzymology” (S. Colowick and N. Kaplan, eds.), Vol. 100, pp. 144-160. Academic Press, New York. Kung, V. T., and Wang, J. C. (1977).J. Biol. Chem. 252, 5398-5402. Liu, L. F., and Wang, J. C. (1978a). Proc. Natl. Acad. Sci. U.S.A. 75, 2098-2102. Liu, L.F., and Wang, J. C. (1978b). Cell 15, 979-984. Liu, L. F., and Wang, J. C. (1979).J. Biol. Chem. 254, 11082-11088. Liu, L. F., Liu, C.-C., and Alberts, B. M. (1979). Nature (London) 281,456-461. Liu, L. 
F., Liu, C.-C., and Alberts, B. M. (1980). Cell 19,697-707. Liu, L. F., Rowe, T. C., Yang, L., Tewey, K., and Chen, G. L. (1983a).J. Biol. Chem. 258, 15365- 15370. Liu, L. F., Halligan, B. D., Nelson, E. M., Rowe, T. C., Chen, G. L., and Tewey, K. M. (1983b).In “Mechanisms of DNA Replication and Recombination” (N. R. Cozzarelli, ed.), pp. 43-53. Liss, New York. Lockshon, D., and Morris, D. R. (1985).J. Mol. Biol. 181,63-74. Lother, H., Lurz, R., and Orr, E. (1984). Nucleic Acids Res. 12, 901-914. McConaughy, B. L.,Young, L. S., and Champoux, J. J. (1981). Biochim. Biophys. Acta 655, 1-8. McGhee, J. D., and Felsenfeld, G. (1980). Annu. Rev. Biochem. 49, 1 1 15-1 156. Marini, J. C., Miller, K. G., and Englund, P. T. (1980).J . Biol. Chem. 255, 4976-4979. Marshall, B., Darkin, S., and Ralph, R. K. (1983). FEBS Lett. 161, 75-78. Maxwell, A., and Gellert, M. (1984).J.Biol. Chem. 259, 14472-14480. Miller, K. G., Liu, L. F., and Englund, P. T. (1981).J. Biol. Chem. 256, 9334-9339. Mills, J. S., Busch, H., and Durban, E. (1982). Biochem. Biophys. Res. Commun. 109, 12221227. Minocha, A., and Long, B. H. (1984). Biochem. Biophys. Res. Commun. 122, 165-170. Mirambeau, G., Duguet, M., and Forterre, P. (1984).J . Mol. Biol. 179, 559-563. Mizuuchi, K. (1984). Cell 39, 395-404.
106
ANTHONY MAXWELL AND MARTIN GELLERT
Mizuuchi, K., ODea, M. H., and Gellert, M. (1978).Proc. Natl. Acud. Sci. U.S.A. 75, 59605963. Mizuuchi, K., Fisher, L. M., ODea, M. H., and Gellert, M. (1980a). Proc. Natl. Acad. Sci. U.S.A. 77, 1847-1851. Mizuuchi, K., Gellert, M., Weisberg, R., and Nash, H. A. (1980b).J . Mol. Biol. 141, 485494. Mizuuchi, K., Mizuuchi, M., O’Dea, M. H., and Gellert, M. (1984).J. Biol. Chem. 259,91999201. Moore, C. L., Klevan, L., Wang, J. C., and Griffith, J. D. (1983).J.Eiol. Chem. 258,46124617. Morrison, A., and Cozzarelli, N. R. (1979). Cell 17, 175-184. Morrison, A., and Cozzarelli, N. R. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 1416-1420. Morrison, A., Higgins, N. P., and Cozzarelli, N. R. (1980).J.Biol. Chem. 255, 2211-2219. Morrison, A., Brown, P. O., Kreuzer, K. N., Otter, R., Gerrard, S. P., and Cozzarelli, N. R. (1981).I n “Mechanistic Studies of DNA Replication and Genetic Recombination” (B. M. Alberts and C. F. Fox, eds.), pp. 785-806. Academic Press, New York. Naito, A., Naito, S., and Ikeda, H. (1984). Mol Ga. Genet. 193, 238-243. Nash, H. A. (1981). Annu. Rev. Genet. 15, 143-167. Nash, H. A., Enquist, L. W., and Weisberg, R. A. (1977).J. Mol. Biol. 116, 627-631. Nelson, E. M., Tewey, K. M., and Liu, L. F. (1984).Proc. Natl. Acud. Sci. U.S.A. 81, 13611365. Newman, B. J., and Grindley, N. D. F. (1984). Cell 38, 463-469. Noguchi, H., veer Reddy, G. P., and Pardee, A. B. (1983). Cell 32,443-451. Novick, R. P., Smith, K., Sheehy, R. J.. and Murphy, E. (1973). Biochem. Biophys. Res. Commun. 54, 1460-1469. Osheroff, N., Shelton, E. R., and Brutlag, D. L. (1983).J. Eiol. Chem. 258, 9536-9543. Peebles, C. L., Higgins, N. P., Kreuzer, K. N., Morrison, A., Brown, P. O., Sugino, A., and Cozzarelli, N. R. (1978). Cold Spring Harbor Symp. Quunt. Eiol. 43, 41-52. Pollock, T. J., and Nash, H. A. (1983).J. Mol. Eiol. 170, 1-18. Prell, B., and Vosberg, H.-P. (1980). Eur. J . Biochem. 108, 389-398. Pulleyblank, D. E., Shive, M., Tang, D., Vinograd, J., and Vosberg, H.-P. 
(1975).Proc. Nutl. Acad. Sci. U.S.A. 72, 4280-4284. Reed, R. R. (1981a). Proc. Nutl. Acad. Sci. U.S.A. 78, 3428-3432. Reed, R. R. (1981b). Cell 25, 713-719. Reed, R. R., and Grindley, N. D. F. (1981). Cell 25, 721-728. Reed, R. R., and Moser, C. D. (1984). Cold Spring Harbor Symp. Quant. Eiol. 49, 245-249. Rowe, T. C., Rusche, J. R., Brougham, M. J., and Holloman, W. K. (1981).J . Biol. Chem. 256, 10354-10361. Rowe, T. C., Tewey, K. M., and Liu, L. F. (1984).J . Eiol. C h m . 259, 9177-9181. Sakakibara, Y., Suzuki, K., and Tomizawa, J. (1976).J . Mol. Biol. 108, 569-582. Sander, M., and Hsieh, T. (1983).J. Biol. Chem. 258, 8421-8428. Sander, M., and Hsieh, T. (1985). Nucleic Acids Res. 13, 1057-1072. Sander, M., Nolan, J. M., and Hsieh, T. (1984).Proc. Nutl. Acad. Sci. U.S.A. 81,6938-6942. Schill, G . (197 1). “Catenanes, Rotaxanes and Knots,” pp. 1-2 1. Academic Press, New York. Shen, L. L., and Pernet, A. G. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 307-311. Shore, D., and Baldwin, R. L. (1983).J. Mol. Eiol. 170, 983-1007. Staudenbauer, W. L., and Orr, E. (1981). Nucleic Acids Res. 9, 3589-3603. Sugino, A., and Bott, K. F. (1980).J. B a c t e d . 141, 1331-1339. Sugino, A., and Cozzarelli, N. R. (1980).J . Eiol. Chem. 255, 6299-6306.
MECHANISTIC ASPECTS OF DNA TOPOISOMERASES
107
Sugino, A., Peebles, C. L., Kreuzer, K. N., and Cozzarelli, N. R. (1977).Proc. Natl. Acad. Sci. U.S.A. 74, 4767-477 1. Sugino, A,, Higgins, N. P., Brown, P. O., Peebles, C. L., and Cozzarelli, N. R. (1978).Proc. Natl. Acad. Sci. U.S.A. 75,4838-4842. Sugino, A., Higgins, N. P., and Cozzarelli, N. R. (1980). Nucleic Acids Res. 8, 3865-3874. Tewey, K. M., Chen, G. L., Nelson, E. M., and Liu, L. F. (1984).]. Biol. Chem. 259,91829187. Trask, D. K., and Muller, M. T. (1983). Nucleic Acids Res. 11, 2779-2800. Trask, D. K., DiDonato, J. A., and Muller, M. T. (1984). EMBOJ. 3, 671-676. Tse, Y.-C., and Wang, J. C. (1980). Cell 22, 269-276. Tse, Y.-C., Kirkegaard, K.,and Wang, J. C. (1980).J. Biol. Chem. 255, 5560-5565. Tse, Y.-C., Javaherian, K., and Wang, J. C. (1984). Arch. Biochem. Bdophys. 231, 169-174. Tse-Dinh, Y.-C., McCarron, B. G. H., Arentzen, R., and Chowdry, V. (1983). Nucleic Acids Res. 11,8691-8701, Tse-Dinh, Y.-C., Wong, T. W., and Goldberg, A. R. (1984).Nature (London) 312,785-786. Udvardy, A., Schedl, P., Sander, M., and Hsieh, T. (1985). Cell 40, 933-941. Vosberg, H.-P. (1985). Cum. Top. Microbiol. Immunol. 114, 19-102. Wang, J. C. (1971).J. Mol. Biol. 55, 523-533. Wang, J. C. (1982). Sci. Am. 247,94-109. Wang, J. C. (1985). Annu. Rev. Biochem. 54, 665-697. Wang, J . C. (1986). In “Cyclic Polymers” (J. A. Semlyen, ed.), pp. 225-260. Elsevier, Amsterdam. Wang, J. C., and Becherer, K. (1983).Nuclek Acids Res. 11, 1773-1790. Wang, J. C., and Kirkegaard, K. (1981). In “Gene Amplification and Analysis” (J. G. Chirikjian and T. S. Papas, eds.), Vol. 2, pp. 455-473. Elsevier, Amsterdam. Wang, J. C., and Liu, L. F. (1979). In “Molecular Genetics” (J. H. Taylor, ed.), Part 3, pp. 65-88. Academic Press, New York. Wang, J. C., Gumport, R. I., Javaherian, K., Kirkegaard, K., Klevan, L., Kotewitz, M. L., and Tse, Y.-C. (1981). In “Mechanistic Studies of DNA Replication and Genetic Recombination” (B. M. Alberts and C. F. Fox, eds.), pp. 769-784. Academic Press, New York. 
Weber, P. C., Ollis, D. L., Bebrin, W. R., Abdel-Meguid, S. S., and Steitz, T. A. (1982).J. Mol. Biol. 157, 689-690. Weisberg, R. A., and Landy, A. (1983).In “Lambda 11” (R. W. Hendrix, J. W. Roberts, F. S. Stahl, and R. A. Weisberg, eds.), pp. 21 1-250. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York. White, J. H., and Cozzarelli, N. R. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 3322-3326. Wu, R., Grossman, L., and Moldave, K. (Eds.) (1983). “Methods in Enzymology,” Vol. 100, pp. 133-180. Academic Press, New York. Zwelling, L. A., Michaels, S., Erickson, L. C., Ungerleider, R. S., Nichols, M., and Kohn, K. W. (1981). Biochemistry 20,6553-6563.
MOLECULAR MECHANISMS OF PROTEIN SECRETION: THE ROLE OF THE SIGNAL SEQUENCE
By MARTHA S. BRIGGS¹ and LILA M. GIERASCH
Department of Chemistry, University of Delaware, Newark, Delaware 19716
I. Introduction
II. Historical Background
    A. Work Done before the Signal Hypothesis
    B. Discovery of Signal Sequences
    C. Proposal of the Signal Hypothesis
III. The Signal Sequence
    A. Necessity of Signal Sequences for Secretion
    B. Internal Signal Sequences
    C. Interchangeability of Signal Sequences
    D. Length of Signal Sequences
    E. The Charged Region
    F. The Hydrophobic Region
    G. The Signal Peptidase Cleavage Site
    H. Predictions of Signal Sequence Conformation
IV. Components of the Secretory Apparatus
    A. The Membrane
    B. Signal Peptidase and Signal-Peptide Peptidase
    C. Proteins Implicated in Eukaryotic Protein Secretion
    D. Proteins Implicated in Prokaryotic Protein Secretion
V. How Does Secretion Occur?
    A. Models of Protein Secretion
    B. What Is the Nature of the Translocation Site?
    C. Is Transfer Vectorial or by Domains?
    D. How Much Energy Is Required for Secretion, and Where Does It Come From?
VI. What Are the Roles of the Signal Sequence?
    A. Studies of Precursor Proteins and Isolated Signal Sequences
    B. Conformational Studies of Signal Sequences
    C. Interactions with Lipids
    D. Conformations of Isolated Signal Sequences in Membranes
    E. Interactions with Proteins
VII. Recapitulation
VIII. A Model for the Initial Interactions of Signal Sequences with the Membrane
IX. Signal Sequences as Membrane-Interacting Sequences
References
¹ Present address: Chemistry Department, University of Pennsylvania, Philadelphia, Pennsylvania 19104.
ADVANCES IN PROTEIN CHEMISTRY, Vol. 38
Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. INTRODUCTION

All cells make proteins that are destined for noncytoplasmic locations, such as the extracellular fluid, the lumina of organelles, or the cell membranes. These proteins are almost invariably synthesized in the cytoplasm. Thus, such a protein must cross or enter one or more membranes in order to reach its destination. Furthermore, this process is specific in that a particular protein travels only to its proper location, and only proteins destined for a particular location are transported there. How do large, polar, often water-soluble proteins cross the nonpolar membrane, which generally serves as a barrier to protein movement? How does the cell control which membrane(s) each protein will cross? The molecular mechanism of these processes has been the subject of intense investigation in the last decade.

Most knowledge concerning the secretory process has come from genetic and biochemical studies. More recently, however, biophysical techniques have been used to investigate properties of the secretory apparatus and the signal sequence. The aims of this review are (1) to describe what is required for protein secretion and (2) to evaluate hypotheses of how secretion occurs. Particular emphasis will be given to the role of the signal sequence.

II. HISTORICAL BACKGROUND

A. Work Done before the Signal Hypothesis

Early work on protein secretion, largely performed by Palade and coworkers in the late 1960s, focused on the organellar route of a secreted protein from its synthesis to its release into the extracellular fluid (Palade, 1975). Their work showed that exported proteins are synthesized on polysomes bound to the membrane of the rough endoplasmic reticulum (RER). These proteins are not found in the cytoplasm, but are immediately sequestered in the lumen of the endoplasmic reticulum (ER); they are subsequently transported to the Golgi apparatus and then to secretory vesicles, from which they are secreted. This export route is called the Palade pathway.
The immediate segregation of proteins in the ER suggested that translocation across the membrane may be coupled to translation. The experiments of Redman and Sabatini (1966) supported this concept on a molecular level. After addition of puromycin, an inhibitor of protein elongation, to growing cells, N-terminal fragments of secretory proteins were found in the lumen of the ER, and not in the cytoplasm. In later work, Sabatini and Blobel (1970) showed that inclusion of microsomes of RER in a cell-free system translating mRNA coding for secretory proteins protects nascent polypeptide chains from added protease. This experiment is now central to the accepted assay for translocation of a secreted protein (e.g., Scheele, 1983).

B. Discovery of Signal Sequences

The specificity of protein transport necessitates a method for the cell to distinguish cytoplasmic from secretory proteins. In two studies of in vitro translation of mRNA from myeloma cells, Milstein et al. (1972) and Swan et al. (1972) found that the immunoglobulin light chain is synthesized as a precursor 3000 Da larger than the mature protein. They also found that the extra material is a polypeptide extension at the amino terminus of the mature protein product, and is cleaved by a protease activity found in ER membrane. Schechter et al. (1975) later determined the amino acid sequence of this amino-terminal addition. These results implied that the information necessary to specify secretion is contained in the extra material.

C. Proposal of the Signal Hypothesis

As early as 1971, Blobel and Sabatini (1971) proposed that the N-terminal extension, which became known as the signal sequence, serves to bind the ribosome to the ER membrane. In 1975, Blobel and Dobberstein (1975a,b) expanded this idea into the signal hypothesis, based on results of experiments using their newly developed in vitro assay for protein translocation. The signal hypothesis has greatly influenced thinking in the area of protein secretion. It has been amply supported by studies in eukaryotes and is likely to apply in its essential elements in prokaryotes as well. The version of the signal hypothesis described below (and diagrammed in Fig. 1) represents a synthesis of current, rather than historical, knowledge (see, for example, Walter et al., 1984). The following sections describe the various components and steps in greater detail.
Step 1. The ribosome binds mRNA coding for a secretory protein. This step occurs in the cytoplasm, and is identical to the process that occurs in the case of soluble, cytoplasmic proteins.

Step 2. The ribosome begins to synthesize the protein. This step also takes place in the cytoplasm.

Step 3. After synthesis of approximately 80 amino acids, the signal sequence should protrude from the ribosome. At this point, the signal-recognition particle (SRP), an 11 S ribonucleoprotein, binds the signal sequence and halts translation.
FIG. 1. The process of protein secretion according to the signal hypothesis.
Step 4. The mRNA/ribosome/nascent polypeptide/SRP complex moves (by diffusion?) to the ER membrane where the SRP binds to the SRP receptor, or “docking protein,” which is membrane bound. This binding relieves the translation block imposed by the SRP.

Step 5. Other membrane proteins, called ribophorins, bind the ribosome to the membrane.

Step 6. One or more proteins form a pore through the membrane. As translation proceeds the nascent polypeptide is extruded through this pore into the ER lumen.

Step 7. During or after translation, signal peptidase removes the signal sequence. Signal peptide peptidase digests the signal sequence after cleavage.

Step 8. Release of the completed protein from the ribosome causes detachment of the ribosome from the ER membrane and mRNA, and closing of the membrane pore.

The signal hypothesis can also describe the process of insertion of integral membrane proteins, simply by postulating that translocation of the protein is halted by a “stop transfer sequence” (Blobel, 1980), leaving the protein anchored in the bilayer. Similarly, multiple transmembrane segments may be arranged by alternating signal-like and stop transfer segments (Blobel, 1980; Friedlander and Blobel, 1985; Coleman et al., 1985).
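The alternation of signal-like and stop transfer segments can be read as a simple bookkeeping rule, sketched below in Python. This is only an illustration under stated assumptions, not a model taken from the cited work: the function name `membrane_spans` and the labels "signal" and "stop" are hypothetical, cleavage of an amino-terminal signal sequence by signal peptidase is ignored, and every segment that starts or halts translocation is assumed to remain in the bilayer.

```python
def membrane_spans(segments):
    """Apply a simplified signal / stop-transfer rule to an ordered segment list.

    segments: labels read from N- to C-terminus, each "signal" or "stop".
    Returns (number of segments left spanning the bilayer, side of the C-terminus).
    Assumptions of this sketch: signal peptidase cleavage is ignored, and a
    label that cannot act (e.g. "stop" with no translocation in progress)
    is skipped without effect.
    """
    translocating = False  # is the chain currently being threaded into the lumen?
    spans = 0
    for label in segments:
        if label == "signal" and not translocating:
            # A signal-like segment starts translocation and stays in the bilayer.
            translocating = True
            spans += 1
        elif label == "stop" and translocating:
            # A stop transfer segment halts translocation and anchors the chain.
            translocating = False
            spans += 1
        elif label not in ("signal", "stop"):
            raise ValueError(f"unknown segment label: {label!r}")
    return spans, ("lumen" if translocating else "cytoplasm")


if __name__ == "__main__":
    print(membrane_spans(["signal"]))                            # secreted protein
    print(membrane_spans(["signal", "stop"]))                    # singly anchored protein
    print(membrane_spans(["signal", "stop", "signal", "stop"]))  # multispanning protein
```

Under these assumptions a lone signal leaves the carboxyl terminus in the lumen, whereas each added stop transfer segment returns the remainder of the chain to the cytoplasmic side.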
Evidence for the signal hypothesis was derived from eukaryotic systems. Information concerning protein secretion in prokaryotes has been more elusive, due in part to the difficulty of separating cytoplasmic from membrane-bound ribosomes and inner from outer membrane, and the lack (until recently) of an in vitro translocation assay. Prokaryotic systems are better characterized genetically than eukaryotic ones, however, and several genetic loci that affect secretion have been mapped (see Section IV,D). Genetic data strongly suggest that the prokaryotic secretion process is very similar to that in eukaryotes.

III. THE SIGNAL SEQUENCE

Nearly all proteins that cross a membrane, and many membrane-bound proteins, require a signal sequence for proper localization. The signal sequence was the first component of the secretory apparatus to be discovered and, so far, it appears to be the most universal requirement of the secretion process. The amino acid sequences of over 300 signal peptides are known. Watson (1984) has compiled sequences published before mid-1984. Signal sequences have been found on secretory proteins from all kinds of organisms, including animals, plants, bacteria, and viruses (see Table I). This wealth of sequence information invites attempts to develop structure-function correlations. However, as discussed in the next sections, such attempts are confounded by the extreme variability of signal sequences despite their parallel functions.

A. Necessity of Signal Sequences for Secretion

Secretion does not occur in the absence of a functional signal sequence. Yeast invertase provides a clear-cut demonstration of the necessity for a signal sequence in protein secretion. This enzyme is made in both cytoplasmic and secreted forms. The same gene codes for both forms; the mRNAs differ only in the presence (in the secreted form) or absence (in the cytoplasmic form) of the segment coding for the signal sequence (Carlson and Botstein, 1982; Perlman et al., 1982).
Other evidence that the signal sequence is essential for secretion comes from experiments in which signal sequences have been changed or significantly shortened. Secretory proteins with altered signal sequences often are not secreted, but remain in the cytoplasm as the unprocessed precursor form. Specific examples of signal sequence mutations and alterations are discussed in the following sections.

Whether the signal sequence alone is sufficient to specify localization of a protein is unclear. The signal sequences of some mitochondrial and chloroplast proteins, when linked to a soluble, cytoplasmic protein, can cause proper processing and localization of the fusion product to the organelle (Hurt et al., 1984; van den Broek et al., 1985). The signal sequence plus the first five residues of the Escherichia coli periplasmic protein β-lactamase can direct the transport of globin across the microsomal membrane or the E. coli inner membrane (Lingappa et al., 1984). The signal peptide is cleaved at the proper processing site. In contrast, hybrid proteins composed of the signal sequence from the E. coli outer membrane proteins PhoE or the λ-receptor protein (LamB), or periplasmic proteins alkaline phosphatase (PhoA) or maltose-binding protein (MBP or MalE), and the mature cytoplasmic protein β-galactosidase (β-gal) remain in the cytoplasm and are not processed (Moreno et al., 1980; Bassford et al., 1979; Michaelis et al., 1983; Tommassen et al., 1983). A fusion of the signal sequence of the periplasmic protein β-lactamase with chicken triose-phosphate isomerase (a cytoplasmic enzyme) also remains uncleaved and in the cytoplasm (Kadonaga et al., 1984). Furthermore, there is evidence that correct localization of the E. coli LamB and PhoE proteins to the outer membrane depends on regions within the mature protein, as well as the signal sequence (Tommassen et al., 1983; Benson and Silhavy, 1983). While the number of examples studied is too small for generalizations regarding sufficiency of the signal sequence for membrane transport, localization, and processing, available data suggest that the nature of the mature protein influences its ability to be exported.

TABLE I
Representative Signal Sequencesᵃ

Proteinᵇ                                      Signal sequenceᶜ
Bovine
  Proparathyroid                              MMSAKDMVKVMIVMLAICFLARSDG/
  AChR α-subunit                              MEPRPLLLLLGLCSAGLVLG/
Hamster proglucagon                           MKNIYIVAGFFCGAGQGSWQ/
Human
  Growth hormone                              MATGSRTSLLLAFGLLCLPWLQEGSA/
  Relaxin                                     MPRLFLFHLLEFCLLLNQFSRAVAA/
  γ-Interferon                                MKYTSYILAFQLCIVLGSLG/
Murine
  Ig heavy chain                              MKVLSLLYLLTAIPHIMS/
  Complement comp. 3                          MGPASGSQLLVLLLLLASSPLALG/
Ovine
  αS1-Casein                                  MKLLILTCLVAVALA/
  α-Lactalbumin                               MMSFVSLLLVGILFWATQA/
Porcine relaxin                               MPRLFSYLLGVWLLLSQLPREIPG/
Chicken lysozyme                              MRSLLILVLCFLPLAALG/
Caiman crocodylus Ig(v) heavy chain           MGLGLHLLVLAAALQGAWS/
Torpedo AChR α-subunit                        MILCSYWHVGLVLLLFSCCGLVLG/
Bee promelittin                               MKFLVNVALVFMVVYISYIA/
Drosophila melanogaster 68C glue protein-8    MKLLVVAVIACIMLIGFADPASG/
Brassica napus napin                          MANKLFLVSATLAFFFLLTNA/
Pisum sativum
  Vicilin                                     MLLAIAFLASVCVSS/
  Prolectin                                   MASLETEMISFYAIFLSILLTTILFFKVNS/
Zea maize zein protein 22.1                   MATKILALLALLALLVSATNA/
Saccharomyces pro-α-factor 1                  MRFPSIFTAVLFAASSALA/
Bacillus licheniformis α-amylase              MKQQKRLYARLLTLLFALIFLLPHSAAAA/
Corynebacterium diphtheriae phage
  proDiphtheria toxin                         MSRKLFASILIGALLGIGAPPSAGA/
Escherichia coli
  α-Amylase                                   MFAKRFKTSLLPLFAGFLLLFHLVLAGPAAAS/
  Lipoprotein                                 MKATKLSLGAVILGSTLLAG/
  OmpF                                        MMKRNILAVIVPALLVAGTANA/
  β-Lactamase TEM                             MRIQHFRVALIPFFAAFCLPVFG/
  Alkaline phosphatase                        MKQSTIALALLPLLFTPVTKA/
  Phosphate-binding protein                   MKVMRTTVATVVAATLSMSAFSVFA/
  Phage M13 major coat protein                MKKSLVLKASVAVATLVPMLSFA/
Staphylococcus aureus β-lactamase             MKKLIFLIVIALVLSACNSNSSHA/
Adenovirus 2 29K glycoprotein                 MRYMILGLLALAAVCSAA/
Herpes simplex glycoprotein D-1               MGGTAARLGAVILFVVIVGLHGVRG/
Human influenza
  A/Victoria HA                               MKTIIALSYIFCLVFA/
  A/Japan HA                                  MAIIYLILLFTAVRG/
  B/Lee HA                                    MKAIIVLLMVVTSNA/
Rabies virus CVS glycoprotein                 MVPQVLLFVLLLGFSLCFG/
Vesicular stomatitis virus glycoprotein G     MKCLLYLAFLFIHVNC/

ᵃ Excerpted from Watson (1984).
ᵇ Abbreviations: AChR, acetylcholine receptor; HA, hemagglutinin; Ig, immunoglobulin.
ᶜ The cleavage site is indicated by a slash (/).
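Simple composition statistics give a feel for what the sequences in Table I share and where they differ. The sketch below is illustrative only. It uses three sequences transcribed from Table I (with the cleavage-site slash omitted), and the residue groupings (`HYDROPHOBIC`, `BASIC`, `ACIDIC`) and the helper names are assumptions made here, not definitions from the text or from Watson (1984).

```python
# Three signal sequences transcribed from Table I (cleavage-site slash omitted).
SIGNALS = {
    "E. coli OmpF": "MMKRNILAVIVPALLVAGTANA",
    "E. coli alkaline phosphatase": "MKQSTIALALLPLLFTPVTKA",
    "Saccharomyces pro-alpha-factor 1": "MRFPSIFTAVLFAASSALA",
}

# Assumed residue classes for this sketch; other groupings are equally defensible.
HYDROPHOBIC = set("AVLIMFWC")
BASIC = set("KR")
ACIDIC = set("DE")


def longest_hydrophobic_run(seq):
    """Length of the longest uninterrupted stretch of HYDROPHOBIC residues."""
    best = run = 0
    for residue in seq:
        run = run + 1 if residue in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def summarize(seq):
    """Return length, net charge (basic minus acidic), and hydrophobicity measures."""
    n_basic = sum(residue in BASIC for residue in seq)
    n_acidic = sum(residue in ACIDIC for residue in seq)
    n_hydrophobic = sum(residue in HYDROPHOBIC for residue in seq)
    return {
        "length": len(seq),
        "net_charge": n_basic - n_acidic,
        "hydrophobic_fraction": round(n_hydrophobic / len(seq), 2),
        "longest_hydrophobic_run": longest_hydrophobic_run(seq),
    }


if __name__ == "__main__":
    for name, seq in SIGNALS.items():
        print(name, summarize(seq))
```

For these three entries the sketch reports a net positive charge and an uninterrupted hydrophobic stretch of several residues, even though the sequences themselves show little residue-by-residue identity.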
B. Internal Signal Sequences

Not all signal sequences are removed from the secreted protein after translocation. The amino-terminal regions of some proteins resemble signal sequences, but are not cleaved during secretion. Others have internal signal sequences; that is, the protein contains a stretch of amino acids that resembles a signal sequence and is required for export, but is not located at the amino terminus. The first secreted protein that was found to lack a cleaved signal sequence was ovalbumin, and the existence and location of its internal signal sequence were debated (Palmiter et al., 1978; Lingappa et al., 1979; Braell and Lodish, 1982). The amino-terminal region of ovalbumin was suggested to contain a signal sequence based on the following data:

1. A 50-60 residue ovalbumin nascent chain is the smallest protein segment able to bind to the ER membrane (Meek et al., 1982).
2. In cell-free translation systems, glycosylation and segregation of the ovalbumin chains to the interior of added microsomes do not occur if the microsomes are added after 150 or more residues of the nascent chain have been synthesized (Braell and Lodish, 1982).
3. Ovalbumin expressed in E. coli is synthesized on membrane-bound ribosomes and is localized to the periplasm, but in a strain producing a shortened ovalbumin lacking the first 126 amino acids, the protein is synthesized on free polysomes and is not exported (Baty et al., 1981).

Further support for the presence of a signal sequence in ovalbumin is the fact that its secretion is SRP-dependent in vitro (P. Walter, personal communication). Nascent chains of ovalbumin also compete with the nascent chains of other secretory proteins for the cell's transport apparatus (Palmiter et al., 1978), which implies that secretory proteins and ovalbumin have a common structure that is necessary for secretion.
Using Xenopus laevis oocytes as a secretion system, and several mRNA fusions of the ovalbumin amino-terminal region with chimpanzee α-globin, Tabe et al. (1984) localized the signal region to between residues 22 and 41 of ovalbumin. This finding is consistent with the earlier suggestion of Meek et al. (1982) that the first 60 residues of ovalbumin insert into the membrane as a loop consisting of two transmembrane helices, as proposed in the helical hairpin hypothesis (Section V,A,4) (Engelman and Steitz, 1981). In secretory proteins with a cleaved signal sequence, the signal sequence is the first (amino-terminal) segment of the loop, while in ovalbumin, the signal sequence would be the second segment of the loop (residues 22-41), and is not cleaved.

Other secreted or membrane proteins that lack a cleaved signal sequence include the E. coli signal peptidase, lactose permease, and NADH dehydrogenase, cytochromes P-450 from various organisms, and some bacterial pilins (Watson, 1984).

Hybrid proteins consisting of 45 or 23 amino acids from the product of the E. coli erythromycin resistance gene fused to the amino terminus of E. coli prelipoprotein have been constructed (Hayashi et al., 1985). The hybrids thus contain a “signal sequence” 65 or 43 amino acids long, of which the last 20 are the wild-type lipoprotein signal sequence. The extra material appears to have no effect on secretion, as these proteins are translocated and processed normally. A similar fusion of the last 18 residues of E. coli β-galactosidase with rat preproinsulin is also secreted and cleaved properly in E. coli (Talmadge et al., 1981). Interestingly, a construction in which the cytoplasmic protein globin was fused to preprolactin led to the appearance in microsomes in an in vitro translocation system of correctly processed prolactin and globin with the preprolactin signal sequence attached to its C-terminus (Perara et al., 1986). This process appears to be SRP independent. By contrast, the signal sequence of carp preproinsulin retains its function, including SRP interaction, when transposed to an internal location (Wiedmann et al., 1986b). A fusion of the first half of the E. coli penicillinase signal sequence to proinsulin lacks a complete signal sequence and is not secreted (Talmadge et al., 1981). The ability of these native or constructed internal signal sequences to function in secretion suggests that the disposition of the signal near the N terminus is not essential to its recognition by the export apparatus.

C. Interchangeability of Signal Sequences
The known signal peptides usually lack sequence homology, even among closely related proteins. For instance, the E. coli proteins OmpF and PhoE are highly homologous, but their signal sequences exhibit little homology beyond the usual similarities of charge and hydrophobicity (Tommassen et al., 1983). Similarly, the signal sequences for the same proteins from different species vary significantly more than do the mature proteins (Hahn et ul., 1983). On the other hand, there is an example of two proteins that have identical signal sequences: chicken transferrin and chicken conalbumin (Magner, 1982). Despite the lack of homology, signal sequences share some similarities in overall structure, which are described below. Furthermore, signal sequences are almost entirely interchangeable from protein to protein, even among widely different organisms, indicating that certain aspects of the secretion pathway have been highly conserved during evolution. Secretory proteins from organisms as diverse as E. coli, rats, frogs, chickens, locusts, barley, and field beans are secreted and processed
MARTHA S. BRIGGS AND LILA M. GIERASCH
normally when their genes are injected into X. laevis oocytes (Lane et al., 1980; Bassuner et al., 1983; Wiedmann et al., 1984). The in vitro translocation and processing system derived from dog pancreas is also capable of recognizing, secreting, and cleaving secretory proteins from a number of species, including E. coli (Muller et al., 1982), Drosophila melanogaster (Brennan et al., 1980), and cows (Paul and Goodenough, 1983). Drosophila can secrete mouse proteins (Brennan et al., 1980), while E. coli can translocate and cleave secretory proteins from rat (Talmadge et al., 1980a) and Pseudomonas aeruginosa (Ding et al., 1985). Several types of gene fusions consisting of the DNA coding for the signal sequence (plus more or less of the structural gene) from one protein, and the structural sequence of another, have been constructed. These hybrid genes may then be expressed in any of several organisms. Among E. coli proteins, the signal sequence of the outer-membrane protein OmpF can be replaced by that of PhoE, another outer-membrane protein (Tommassen et al., 1983). Transport of the OmpF protein to the outer membrane and processing are normal. E. coli can also transport hybrids consisting of an E. coli protein signal sequence (e.g., β-lactamase or pre-OmpA) and an exogenous structural sequence [rat proinsulin (Talmadge et al., 1980b) and Staphylococcus aureus nuclease A (Takahara et al., 1985), respectively]. The Bacillus subtilis α-amylase signal sequence directs secretion of E. coli β-lactamase in B. subtilis (Palva et al., 1982). Thus, signal sequences are not gene-specific; they can direct translocation of other secretory proteins. Furthermore, the secretion apparatus in one type of cell can recognize signal sequences from other types of cell, and secrete the attached protein. That prokaryotic proteins can be secreted in eukaryotic cells and vice versa indicates that the secretion apparatus has changed little (at least in some respects) in the process of evolution.
The interchangeability among signal peptides despite their lack of sequence homology argues that they have other properties in common. Indeed, study of the amino acid sequences of signal peptides reveals certain shared structural features. In general, signal sequences consist of three regions: The central part of the signal sequence is a core of strongly hydrophobic residues. A more polar, but usually uncharged, segment lies between the carboxyl end of the hydrophobic core and the signal peptidase cleavage site at the C terminus of the signal sequence. The region from the amino terminus of the signal sequence to the beginning of the hydrophobic core contains one or more charged, usually basic, residues. These generalizations are discussed in the following sections. [They are valid only for signal sequences of secreted proteins and
MOLECULAR MECHANISMS OF PROTEIN SECRETION
some membrane-bound proteins; signal sequences for proteins imported into mitochondria and chloroplasts are quite different, and have been reviewed elsewhere (Hay et al., 1984; Kreil, 1981).] The representative signal sequences given in Table I illustrate the points discussed.
D. Length of Signal Sequences
Signal sequences² range in length from 13 to 36 amino acids (excluding mitochondrial and chloroplast proteins) (von Heijne, 1985). There appears to be no maximum length for the signal sequence, judging from the results of gene fusions that create an internal, cleavable signal sequence, as discussed in Section II,B. However, there does seem to be a minimum length required for an effective signal sequence, as none shorter than 13 residues has been found (von Heijne, 1985). Bedouelle and Hofnung (1981a,b) have defined the hydrophobic axis length, described in Section III,F, and postulated that the minimum length for the hydrophobic region of the signal sequence is 10 amino acid residues. Allowing for the initiating methionine and one or more charged residues near the amino terminus, a total minimum length of 12 or 13 residues is reasonable. The effects of shortening the signal peptide by truncation or deletion appear to depend on which portions of the sequence are omitted, and, in some cases, reflect differences in hydrophobicity, secondary structure, or other factors resulting from the change in length. These effects are discussed in the corresponding sections. With the exception of the experiments described above in which the signal sequence is “lengthened” (actually internalized) by fusion to part of an unrelated protein, there are no data on the effects of making a signal peptide longer.
E. The Charged Region
The amino-terminal part of the signal sequence usually contains one or more charged amino acid residues. In prokaryotes the net charge is invariably positive, and often there are two or more basic residues (Inouye and Halegoua, 1980).
The most common total charge is +2 (von Heijne, 1984a). As prokaryotes use formylated methionine, which is uncharged, to initiate protein synthesis, the amino terminus does not add another positive charge to the total. Acidic residues are seldom found in prokaryotic signal sequences. Eukaryotic signal sequences show more variation in this region. The charged residues are often basic, but acidic amino acids occur occasionally. There are usually fewer charged residues than in prokaryotic signal sequences. The charge due to side
² Unless specified otherwise, we will be discussing amino-terminal signal sequences.
chains is often +1, with the amino terminus contributing another positive charge (von Heijne, 1984a). The length of the charged region is extremely variable, ranging from two amino acids to more than eight. Most are two to five residues long. Inouye and co-workers (Inouye et al., 1982; Vlasuk et al., 1983) investigated the effects of changing the charge of this portion of the E. coli lipoprotein signal peptide by deleting positively charged residues and/or substituting negative or neutral amino acids for neutral or positive ones. The wild-type signal peptide of lipoprotein is 20 residues long, with two lysine residues, at positions -19 and -16.³ These were deleted or replaced by an asparagine, aspartic acid, or glutamic acid residue to yield mutants with net charges of +1, 0, -1, or -2. The mutants with a charge of +1 or 0 showed reduced synthesis of lipoprotein, but the protein was transported and processed at the same rate as wild type. The mutants with the negatively charged signal sequences synthesized less precursor, most of which accumulated in the cytoplasm. Thus, the presence of net positive charge (or at least the absence of net negative charge) in the amino-terminal region of the signal sequence is important for proper secretion. The function of the charges has not been proven, but it has been proposed that the positively charged signal sequence binds electrostatically to the negatively charged head groups of the membrane lipids as one step of the secretion process (Austen, 1979; Inouye and Halegoua, 1980; Austen and Ridd, 1981). Another function for the charged residues may be coupling of synthesis with protein export. Hall et al. (1983) found that a mutation in which Arg -20 of the LamB signal sequence is changed to Ser causes a 75% reduction in synthesis of the LamB protein. The molecular mechanism of this effect is unknown.
F. The Hydrophobic Region
The most distinctive feature of the signal sequence is its stretch of uncharged, mostly hydrophobic, amino acid residues, often called the hydrophobic core. This region follows the charged segment just described, and usually precedes a more polar section, which contains the signal peptidase cleavage site. A plot of average hydrophobicity clearly demarcates the core from these adjacent regions, as shown in Fig. 2 (von Heijne, 1985). The hydrophobic regions of various signal sequences exhibit almost no sequence homology, although certain residues occur with high frequency (see below). The function of this domain has been
³ In this article signal sequences are numbered from the C terminus (-1) to the N terminus. The first residue of the mature protein is numbered +1.
FIG. 2. Mean hydrophobicity of the signal sequence as a function of position relative to the cleavage site (horizontal axis: position, -15 to +5). (×) Average of a sample of eukaryotic signal sequences; (○) average of a sample of prokaryotic signal sequences. Negative values are more hydrophobic. Reprinted, with permission, from von Heijne (1985).
postulated to depend only on its length and hydrophobicity (von Heijne, 1980a, 1981). The length of the hydrophobic region ranges from 7 to 20 amino acids, with most having between 10 and 15 residues (von Heijne, 1985). It has been proposed that the function of this segment is to partition into the membrane lipids and span the bilayer (Bedouelle and Hofnung, 1981a,b). The apolar part of the membrane is about 30 Å wide (Tanford, 1980), which would require a hydrophobic peptide to be from 8 or 9 residues long, if in extended conformation, to about 20 residues long, if in an α helix. These numbers agree well with the observed lengths of this part of the signal sequence. Bedouelle and Hofnung (1981b) have predicted a minimum length of the hydrophobic core region based on an inspection of several native and mutant signal sequences. They defined a hydrophobic axis length (HAL) as the “physical length of the longest stretch of uncharged amino acids measured along the axis of the periodic structure.” The threshold HAL (tHAL) is the value of the HAL below which no export occurs. The tHAL in the bacterial signal peptides and mutants that they examined appears to be 18 Å, corresponding to 12 amino acids in an α-helical conformation, or 5 amino acids in extended (β-sheet) conformation. Note that, according to this theory, the minimum length of the hydrophobic region depends on its conformation. The conformational preferences of signal sequences are discussed in Section III,H.
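The length arithmetic in this passage can be reproduced with a short Python sketch. The 1.5 Å rise per residue for an α helix, the 30 Å apolar thickness, and the 18 Å threshold are the values quoted in this article; the 3.5 Å rise per residue for an extended chain is an assumed round figure, and the function names are illustrative only.

```python
# Axial rise per residue, in angstroms.
HELIX_RISE = 1.5     # alpha helix; value used in this article
EXTENDED_RISE = 3.5  # extended (beta) chain; assumed approximate value

THRESHOLD_HAL = 18.0     # tHAL proposed by Bedouelle and Hofnung (1981b)
APOLAR_THICKNESS = 30.0  # apolar part of the membrane (Tanford, 1980)


def residues_needed(length, rise):
    """Number of residues (possibly fractional) needed to span `length` angstroms."""
    return length / rise


def hydrophobic_axis_length(n_residues, rise):
    """Physical length, in angstroms, of an uncharged stretch of n_residues."""
    return n_residues * rise


def meets_threshold(hal, threshold=THRESHOLD_HAL):
    """True if the hydrophobic axis length reaches the proposed threshold."""
    return hal >= threshold


if __name__ == "__main__":
    for name, rise in (("alpha helix", HELIX_RISE), ("extended", EXTENDED_RISE)):
        print("%-12s membrane span: %.1f residues; tHAL: %.1f residues" % (
            name,
            residues_needed(APOLAR_THICKNESS, rise),
            residues_needed(THRESHOLD_HAL, rise),
        ))
```

Under these assumptions a helical core needs 20 residues to cross the apolar layer and 12 to reach the threshold, whereas an extended chain needs between 8 and 9 and a little over 5, in line with the figures quoted above.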
FIG. 3. Distribution of the lengths of the most hydrophobic segments in samples of cytosolic proteins (N = 134), signal peptides (N = 170), and C-terminal transmembrane segments (N = 35); horizontal axis: length, 0 to 30. Courtesy of Gunnar von Heijne.
The hydrophobic region may also have an upper limit on its length. von Heijne (unpublished results) pointed out that the membrane-spanning “anchor” segments of transmembrane proteins are usually about 25 residues long, while the hydrophobic cores of signal sequences average 12 residues (Fig. 3). The length of the hydrophobic region appears to function, in part, as a “label” that distinguishes between signal sequences and transmembrane segments of proteins. The core region is composed primarily of hydrophobic amino acid residues: leucine, alanine, valine, phenylalanine, and isoleucine are very common; methionine and cysteine are found frequently; some less hydrophobic residues, such as serine, proline, and threonine, are usually present (Perlman and Halvorson, 1983). The mean hydrophobicity of the region, calculated by any of several methods, is high (von Heijne, 1980a, 1981; Engelman and Steitz, 1981); i.e., the calculated free energy of transfer of the segment from water to a nonpolar environment is large and negative. Genetic studies in bacteria have established that changes in the length of the hydrophobic core can result in an export-negative phenotype for the attached protein. Deletion of seven amino acids from the core region of the E. coli MBP signal sequence shortens the hydrophobic segment from 18 residues to 11 residues and causes precursor to be retained in the cytoplasm: less than 1% is exported and processed (Bankaitis et al., 1984). The export deficiency can be almost totally reversed by a second mutation (Fig. 4), which causes insertion of three hydrophobic residues into the core region, lengthening it to 14 residues. Two other second-site
Lambda Receptor Protein
Met Met Ile Thr Leu Arg Lys Leu Pro Leu Ala Val Ala Val Ala Ala Gly Val Met Ser Ala Gln Ala Met Ala/Val

Maltose Binding Protein
Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr Thr Met Met Phe Ser Ala Ser Ala Leu Ala/Lys

Alkaline Phosphatase
Met Lys Gln Ser Thr Ile Ala Leu Ala Leu Leu Pro Leu Leu Phe Thr Pro Val Thr Lys Ala/Arg

FIG. 4. Mutations in E. coli signal sequences that result in export-defective proteins. The wild-type sequences are shown, with “/” marking the cleavage site between positions -1 and +1. Substituting residues drawn beneath the sequences (positions not shown here): lambda receptor protein, Asp, Glu, Arg; maltose binding protein, Pro, Lys, Arg, Arg; alkaline phosphatase, Gln, Arg. From Silhavy et al. (1983).
mutations restore export competence by replacing a charged residue, arginine, with an uncharged residue, leucine or cysteine, which lengthens the uncharged region to 15 amino acids. Another pseudorevertant in which only one amino acid is inserted, resulting in a hydrophobic core of 12 residues, partially regains export competence; it exports MBP at 25% of the wild-type efficiency. The authors interpreted these results in terms of the HAL, and concluded that there is, indeed, a threshold HAL below which no export is possible. Changes in the length of the hydrophobic core region can affect other properties, such as overall hydrophobicity or conformational preferences. Many mutations are known in which a charged residue replaces a hydrophobic residue in the core region. Whether these substitutions change the length or the overall hydrophobicity of the core is a matter of viewpoint. These mutations are described below. Another kind of change, found in the E. coli λ-receptor protein signal sequence, decreases the length of the hydrophobic region by deletion of four amino acids in the core. Its export defect is thought to be the result of a change in conformational preference, rather than a change in length (Emr and Silhavy, 1983). This mutation and its pseudorevertants have been the subjects of biophysical studies in our laboratory, and are discussed in more detail in the sections concerning properties of isolated signal sequences (VI,B and D). The introduction of a charged residue into the hydrophobic core results in an export defect. Numerous mutations of this type are known
in various E. coli signal sequences. Some of these are shown in Fig. 4. These export-defective mutants accumulate precursor in the cytoplasm. Ryan et al. (1986b) isolated export-competent pseudorevertants of a mutant of this kind. Substitution of Arg for Met at either position -9 or -8 of the E. coli MBP signal sequence causes a >99% export block for this protein, as measured by the relative amounts of precursor and mature forms of MBP. Not surprisingly, the second mutations that are most effective at relieving this export block are those that change the arginine residue to an uncharged residue, either the wild-type methionine, or glycine or serine. Changing Arg -19 to cysteine restored export to a lesser extent (~50-70% of wild type). This mutation lengthens the hydrophobic region by four amino acid residues (to Lys -23). Duplications in the signal-sequence coding region that result in addition of two to four uncharged residues are only slightly effective at relieving the export block; these double mutants export about 10-40% of the wild-type levels of MBP. Substitution of a more hydrophobic residue (Met or Phe) for a polar residue (Thr -11 or Ser -14, respectively) does not change the length of the core region, but increases its mean hydrophobicity. These mutations suppressed the export block to varying extents. Thus, even a charged residue can sometimes be tolerated in the hydrophobic core, if the remaining residues are sufficiently hydrophobic. From these and other data, the authors conclude that the length of the hydrophobic core may not be as important as its overall hydrophobicity. A mutant of PhoA in which Gln is introduced in place of Leu -14 in the hydrophobic core causes an export defect, which argues that overall hydrophobicity, and not just the presence of a charge, may alter signal sequence function (Michaelis et al., 1983).
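The distinction drawn here between the length of the uncharged stretch and its mean hydrophobicity can be made concrete with a small Python sketch. It is illustrative only: the MBP signal sequence is transcribed from Fig. 4 and should be checked against the original figure, Lys, Arg, Asp, and Glu are assumed to be the only charged residues, and the Kyte-Doolittle hydropathy scale (higher values are more hydrophobic) is substituted for whichever scales the cited authors used. The helper names are hypothetical.

```python
# Residues treated as charged (assumption: His counts as uncharged).
CHARGED = set("KRDE")

# Kyte-Doolittle hydropathy values; higher means more hydrophobic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# MBP signal sequence as read from Fig. 4 (one-letter code, positions -26 to -1).
MBP_WT = "MKIKTGARILALSALTTMMFSASALA"


def substitute(seq, position, new):
    """Replace the residue at a negative signal-sequence position (-1 is the last residue)."""
    i = len(seq) + position
    return seq[:i] + new + seq[i + 1:]


def longest_uncharged_run(seq):
    """Length of the longest stretch of consecutive uncharged residues."""
    best = run = 0
    for aa in seq:
        run = 0 if aa in CHARGED else run + 1
        best = max(best, run)
    return best


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy of a segment."""
    return sum(KD[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    blocked = substitute(MBP_WT, -9, "R")    # Met -9 -> Arg (export block)
    variants = {
        "wild type": MBP_WT,
        "Met-9->Arg": blocked,
        "Met-9->Arg + Arg-19->Cys": substitute(blocked, -19, "C"),
        "Met-9->Arg + Thr-11->Met": substitute(blocked, -11, "M"),
    }
    for name, seq in variants.items():
        core = seq[-18:]  # the 18 residues following Arg -19 in the wild type
        print("%-26s longest uncharged run = %2d, mean core hydropathy = %.2f"
              % (name, longest_uncharged_run(seq), mean_hydropathy(core)))
```

With this transcription the wild-type uncharged stretch is 18 residues, the Met -9 to Arg change cuts it to 9, and the Arg -19 to Cys suppressor extends it by four, whereas the Thr -11 to Met suppressor leaves the length at 9 and only raises the mean hydropathy of the core.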
Hortin and Boime (1980) studied the effect of hydrophobicity changes on secretion by incorporating a polar analog of leucine, β-DL-hydroxyleucine, into nascent chains of several eukaryotic secretory proteins. Addition of the analog to a cell-free translation, secretion, and processing system caused inhibition of translocation and processing of the proteins. Proteins with several leucine residues in the signal sequence (bovine preprolactin, rat preprolactin, and human placental prelactogen) were affected more strongly than one with few leucines in the signal sequence (the α subunit of human chorionic gonadotropin). In a similar experiment, Walter et al. (1981) found that incorporation of β-hydroxyleucine into preprolactin partially alleviated the translation block that occurs in the presence of SRP (Section IV,C,1). Incorporation of β-hydroxyleucine affects only the overall hydrophobicity of the core region, and not its length. These results indicate that decreasing the hydrophobicity of this domain is sufficient to abolish signal-sequence function.
It has been suggested that the hydrophobic region may be involved in specific interactions, for example, with SRP (Emr et al., 1980; Silhavy et al., 1983). Clearly the mutations described above, as well as substitutions of β-hydroxyleucine for leucine, could disrupt such interactions and cause an export defect. However, the wide variation of signal sequences, and their interchangeability, argue against specific recognition of this region on the basis of sequence.
G. The Signal Peptidase Cleavage Site
The last five (in eukaryotes) or six (in prokaryotes) residues of the signal sequence are more polar than those in the hydrophobic region, and define the cleavage site for signal peptidase (von Heijne, 1984b). von Heijne (1985) called this the “c region.” There is markedly less variability in the length and sequence of this domain than in the rest of the signal sequence. Although the c region ranges in length from 10 residues in S. aureus protein A, to zero in ovine α-S2 casein (von Heijne, 1985) (in this case, part of the hydrophobic region forms the signal peptidase cleavage site), almost all of the known signal sequences have four to seven residues in this region. The effect of varying the length of this domain is not known. The cleavage site is defined primarily by the last three residues of the c region. von Heijne (1983) and Perlman and Halvorson (1983) have postulated the “-3, -1” rule, which states that the residues that occupy positions -1 and -3 of the signal sequence must have small neutral side chains. Alanine is most common by far at these sites, but cysteine, serine, threonine, and glycine are found occasionally. Position -2 is more variable, but frequently has a large aromatic or hydrophobic side chain. The remaining residues of the c region are also variable, but are usually polar, and have been predicted to favor formation of a β turn (Perlman and Halvorson, 1983) (see Section III,H). A change in the amino acid at the cleavage site (the -1 residue) can result in a lack of processing or a change in the processing site. For example, substitution of valine for alanine at the cleavage site of yeast invertase inhibits and delays processing (Schauer et al., 1985). While none of the protein is cleaved at the proper site, a small amount of cleavage occurs at an alternate site, between Ser +1 and Met +2 of the mature protein. Similarly, incorporation of β-hydroxynorvaline in place of threonine at the processing site of rat preprolactin causes a change in cleavage site and slowed processing (Hortin and Boime, 1981a,b).
H. Predictions of Signal Sequence Conformation
The lack of sequence homology among signal peptides, combined with their interchangeability in vivo, has prompted searches for conformational similarities. The method of Chou and Fasman (1974a,b) for predicting conformation from primary structure has been applied (Austen, 1979). The Chou-Fasman method is based on the frequencies of occurrence of the amino acids in various types of secondary structure in water-soluble, globular proteins. Application of these data to signal sequences may be questioned, as signal sequences are probably found in a hydrophobic environment, such as the membrane or an apolar pocket in a protein. In general, the results indicate that signal sequences have a high probability of adopting α helix and β sheet in the hydrophobic region. This result is not surprising, since the residues that favor the interior of globular proteins are generally hydrophobic, and also tend to occur in α-helical and β-sheet conformations. Often both α and β structures are predicted, with one or the other being only slightly more probable. Most signal sequences are predicted to adopt a β turn in the c region, near the signal peptidase cleavage site, while the charged amino-terminal domain shows no consistent conformational preference. A conformational energy calculation carried out on the murine κ light chain signal sequence revealed a favored α-helical structure throughout the hydrophobic region (Pincus and Klausner, 1982). The importance of signal-sequence conformation for proper function has been tested by determining the effect on activity of sequence changes that are predicted to change conformational tendencies; these are discussed below. Only in the case of the LamB mutants (see below and Section III,F) have correlations been made between the actual and predicted conformational preferences of the altered and native sequences. Brown et al. (1984) introduced insertions of three or four amino acids into the yeast invertase signal sequence near its amino terminus. The insertions were predicted to stabilize an α helix, favor a β turn, or to destabilize both α-helix and β-sheet formation. None of these alterations prevented proper secretion of the protein. Thus, the amino-terminal domain of the signal sequence is relatively unconstrained as to conformation. The c region is more sensitive to conformational alterations. The E. coli wild-type lipoprotein signal sequence is predicted to form a β turn at positions -7 to -4. Alanine was substituted for serine (position -6), which occurs frequently in β turns, or threonine (position -5), which is also found in β turns, but less often than serine, or both (Vlasuk et al.,
1984). Substitution of Ala for Thr -5 alone did not cause a predicted loss of β-turn conformation, while replacement of Ser -6 or both Ser -6 and Thr -5 yielded a structure predicted to lack a β turn. The phenotypes of the mutants correlate with the predicted presence or absence of the turn. Those predicted to lack a β turn accumulate membrane-bound precursor lipoprotein, which is slowly processed to mature lipoprotein. Thus, the absence of a region favoring β turn in the c domain can inhibit removal of the signal sequence. Alterations predicted to change the conformation of the hydrophobic core can have profound effects on signal sequence function. The wild-type E. coli MBP is required for use of maltose as a nutrient. Its signal sequence is predicted (Chou and Fasman, 1974a,b) to be an α helix or β structure over much of its length. A mutant that has a Pro, which disfavors both α helix and β sheet, substituted for a Leu, which favors both conformations, at position -17 is defective for MBP export (Bedouelle et al., 1980). This mutation decreases the number of hydrophobic residues capable of supporting an α helix from 18 to 10. The length of such a sequence (10 residues × 1.5 Å/residue in an α helix) is 15 Å, which is below the proposed minimum hydrophobic axis length required for signal sequence function (18 Å, Bedouelle and Hofnung, 1981b). In this strain, precursor MBP accumulates in the cytoplasm, and growth on maltose minimal medium is very slow. This mutation is as effective in preventing MBP export as other mutations that place a charged residue in the hydrophobic core of the signal sequence. A similar mutation has been found in the signal sequence of E. coli ribose-binding protein (Iida et al., 1985). Substitution of Pro for Leu -17 results in total inhibition of export. A pseudorevertant at position -15 has a Phe in place of a Ser. Restored export function could result from altered conformation of the signal sequence or from increased hydrophobicity.
The signal sequence of E. coli LamB is shown in Fig. 5. Its hydrophobic region is predicted to adopt a largely α-helical conformation, despite the presence of helix-breaking proline and glycine residues (Bedouelle and Hofnung, 1981a; Emr and Silhavy, 1983). Because the proline and glycine are separated by seven residues, the predicted helical potential (Chou and Fasman, 1974a,b) of the peptide is not severely reduced at any one point. A mutant strain defective for LamB export was found to lack four amino acid residues in the hydrophobic core of the signal sequence (Emr et al., 1980). This deletion brings the proline and glycine to within four residues of each other. The conformation of the hydrophobic region is now expected to be random (Bedouelle and Hofnung, 1981b; Emr and Silhavy, 1983). Bedouelle and Hofnung (1981b) predicted that a second mutation changing the proline to leucine,
Wild Type
Met Met Ile Thr Leu Arg Lys Leu Pro Leu Ala Val Ala Val Ala Ala Gly Val Met Ser Ala Gln Ala Met Ala/Val

Deletion Mutant
Met Met Ile Thr Leu Arg Lys Leu Pro Val Ala Ala Gly Val Met Ser Ala Gln Ala Met Ala/Val

Gly→Cys Pseudorevertant
Met Met Ile Thr Leu Arg Lys Leu Pro Val Ala Ala Cys Val Met Ser Ala Gln Ala Met Ala/Val

Pro→Leu Pseudorevertant
Met Met Ile Thr Leu Arg Lys Leu Leu Val Ala Ala Gly Val Met Ser Ala Gln Ala Met Ala/Val

FIG. 5. Wild-type and mutant E. coli λ-receptor protein signal sequences; “/” marks the cleavage site between positions -1 and +1. From Emr and Silhavy (1983).
threonine, alanine, or serine, or changing the glycine to cysteine or serine, would restore export competence, as the signal sequence, though shortened, would be able to adopt an α-helical conformation. Emr and Silhavy (1983) subsequently isolated two pseudorevertants from the export-defective deletion mutant. In one, the proline residue had been replaced by leucine, and in the other, the glycine residue had been replaced by cysteine. Both strains were able to export the λ-receptor protein at 75-80% of the wild-type efficiency. The implication is clear that adoption of a regular secondary structure, in this case α helix, is necessary to signal sequence function. Hence, bacterial genetic results provide many suggestions of structure-function correlations in signal sequences. These results serve as a point of departure for studies that determine the physical properties of isolated signal sequences as a means of elucidating their mode of action (see Section VI).
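The spacing argument for the LamB mutants can be followed with a brief Python sketch that locates the helix-breaking residues in each sequence. It is an illustration under stated assumptions: the sequences are transcribed from Fig. 5 and should be checked against the original, only proline and glycine are treated as helix breakers, and the function names are hypothetical. It does not attempt a Chou-Fasman prediction.

```python
# Residues treated as helix breakers in this sketch (assumption: Pro and Gly only).
HELIX_BREAKERS = set("PG")

# LamB signal sequences as read from Fig. 5 (one-letter code).
LAMB = {
    "wild type": "MMITLRKLPLAVAVAAGVMSAQAMA",
    "deletion mutant": "MMITLRKLPVAAGVMSAQAMA",
    "Pro->Leu pseudorevertant": "MMITLRKLLVAAGVMSAQAMA",
    "Gly->Cys pseudorevertant": "MMITLRKLPVAACVMSAQAMA",
}


def breaker_positions(seq):
    """1-based positions (from the amino terminus) of Pro and Gly residues."""
    return [i for i, aa in enumerate(seq, start=1) if aa in HELIX_BREAKERS]


def residues_between_breakers(seq):
    """Smallest number of residues separating two helix breakers, or None if fewer than two."""
    positions = breaker_positions(seq)
    if len(positions) < 2:
        return None
    return min(b - a - 1 for a, b in zip(positions, positions[1:]))


if __name__ == "__main__":
    for name, seq in LAMB.items():
        print("%-26s breakers at %s, residues between: %s"
              % (name, breaker_positions(seq), residues_between_breakers(seq)))
```

On this transcription the wild type has seven residues between Pro and Gly, the deletion mutant has three (the two residues lie four positions apart), and each pseudorevertant retains only one of the two helix breakers.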
IV. COMPONENTS OF THE SECRETORY APPARATUS
Understanding of signal sequence action requires knowledge of the species with which it might interact in vivo. Secretion of a protein requires, at the very least, a membrane, a signal sequence, and a signal peptidase. Various other components of the secretory apparatus have been isolated or implicated, such as the SRP, the SRP receptor or docking protein, ribophorins, signal-peptide receptor, and signal-peptide peptidase. This section summarizes biochemical and genetic knowledge of the components of the secretory system.
A. The Membrane
The role of the membrane in protein secretion is disputed. In some models it is regarded merely as an obstacle that plays no direct part in translocation (Blobel and Dobberstein, 1975a,b). This view implies that the secreted protein crosses the membrane through a proteinaceous pore or export apparatus, and has no contact with the membrane lipids (Gilmore and Blobel, 1985). Others see the membrane as an important participant in some parts of the secretion process. Various authors have suggested that: (1) The signal sequence initiates translocation by partitioning into the hydrophobic region of the membrane bilayer (Engelman and Steitz, 1981). (2) The membrane potential provides energy required for protein translocation (Rhoads et al., 1984; Wickner, 1980; Oxender et al., 1984; Bakker and Randall, 1984). (3) The membrane phospholipids “pull” the signal sequence and the secreted protein through the membrane by formation of small nonbilayer regions within the membrane (Rapoport, 1985; Nesmayanova, 1982). The evidence for and against these ideas is considered in this section. Note that all of the research described below has been performed on systems derived from E. coli. Although it is tempting to speculate that secretion processes in prokaryotes and eukaryotes are very similar, this is not proven, and extension of the conclusions of these experiments to eukaryotic secretion is risky at best.
Lipid Fluidity
A major part of the evidence for the participation of the membrane in protein translocation comes from studies of the effects of altered lipid fluidity on secretion. Treatment of E. coli cells with phenethyl alcohol (PEA), which greatly increases membrane fluidity, caused a decrease in secretion of the outer membrane and periplasmic proteins (Pages and Lazdunski, 1981). The amount of protein associated with the inner membrane increased; this increase, and the decrease in secretion, were reversible on removal of PEA, suggesting that secretory proteins accumulate at the inner membrane in the presence of the lipid perturbant, but can be secreted and processed when the proper lipid fluidity is restored. [PEA also dissipates proton motive force, which is thought to be required for protein secretion (Daniels et al., 1981); see below.] Decreasing the membrane fluidity also affects secretion. Pages et al. (1978) varied fluidity by changing the temperature at which E. coli spheroplasts were grown. They found that the amount of secreted alkaline phosphatase (as a fraction of total synthesized protein) dropped dramatically as the temperature was lowered from 29 to 13°C, while the amounts found in the cytoplasm
and associated with the membrane increased. The “transition temperature,” as determined from Arrhenius plots, occurred at about 22°C.This result was obtained with both normal cells and a fatty acid auxotroph grown on oleic acid. Using data on alkaline phosphatase secretion from a fatty acid auxotroph grown on elaidic acid, which has a higher phase transition temperature than oleic acid, a “transition temperature” of 33°C was obtained. Using similar methods, DiRienzo and Inouye (1979) also found that reduced membrane fluidity inhibited localization of periplasmic and membrane proteins in E. coli. Membrane perturbants and temperature changes also affect translocation in an in vitro system (Rhoads el al., 1984; Chen et al., 1985). Thus, protein secretion in E. coli is dramatically affected by the physical state of the membrane lipids. Both increased and decreased fluidity inhibit secretion. It is possible that this effect arises because the signal sequence and/or the secreted protein interact with the membrane lipids, and these interactions are perturbed when the lipid fluidity is changed, However, changes in lipid fluidity also affect other membrane functions such as active transport (DiRienzo and Inouye, 1979).Thus the effect of membrane fluidity on protein secretion may be due to altered activity of a membrane-bound part of the secretory apparatus, and may not be an indication of signal sequence-membrane interaction. B . Signal Peptidase and Signal-Peptide Peptadase 1 . Signal Peptidase
Signal peptidase is the enzyme that proteolytically removes the signal peptide from the nascent secretory protein. Signal peptidases have been found in a variety of organisms, including E. coli (Zwizinski and Wickner, 1980), hen (Lively and Walsh, 1983), pig (Fujimoto et al., 1984), dog (Zimmerman et al., 1980), and D. melanogaster (Brennan et al., 1980). E. coli has at least two different signal peptidases, one of which is specific for the lipoprotein signal peptide [lipoprotein signal peptidase (LSP or peptidase II)] (Tokunaga et al., 1982, 1984), and one of which appears to cleave all of the other signal peptides (peptidase I). Only the E. coli signal peptidase I has been purified to homogeneity (Zwizinski and Wickner, 1980; Wolfe et al., 1983b). The signal peptidases found so far all seem to be integral membrane proteins (Brennan et al., 1980; Lively and Walsh, 1983; Fujimoto et al., 1984; Wolfe et al., 1983b). The amino terminus of the E. coli enzyme appears to be an uncleaved signal peptide (Wolfe and Wickner, 1984). In E. coli the signal peptidase has been reported to reside in approximately equal amounts in both the inner and outer membranes (Mandel
and Wickner, 1979). This result is surprising, as there is no other example of a protein that is localized to both of the E. coli membranes (Silhavy et al., 1983). In light of the recognized difficulty of clean separation of the two membranes (Tommassen et al., 1985), these conclusions should be regarded as questionable. Signal peptidase activity in dog pancreas appears to be limited to the RER membrane (Jackson and Blobel, 1977). Signal peptidase can be extracted from membranes using various detergents (Jackson, 1983). The detergent-solubilized bacterial enzyme is active (Wolfe et al., 1983b), while the canine enzyme requires phospholipid for activity (Jackson and White, 1981). The molecular weight of signal peptidase I from E. coli is about 37,000 (Wolfe et al., 1983b). The canine signal peptidase has a Stokes radius of 55 Å; if the protein is spherical this size corresponds to a molecular weight of about 300,000 (Jackson and White, 1981). On the basis of this large molecular weight it has been suggested that the enzyme is part of a complex of proteins involved in translocation (Jackson and White, 1981). Neither the bacterial nor any of the eukaryotic signal peptidases is inhibited by the usual protease inhibitors such as tosyl-L-phenylalanine chloromethyl ketone, tosyl-L-lysine chloromethyl ketone, o-phenanthroline, and phenylmethylsulfonyl fluoride (Wolfe et al., 1983b; Stern and Jackson, 1985). Mammalian signal peptidases are also unaffected by leupeptin, pepstatin, antipain, diisopropyl fluorophosphate, and bestatin (Fujimoto et al., 1984; Stern and Jackson, 1985). Signal peptidases appear to be endoproteases. Treatment of bovine preproteins with an extract containing the enzyme in vitro yielded fragments with molecular weights corresponding to those of the mature protein and of the signal sequence (Stern and Jackson, 1985). Signal peptides from E.
coli proteins accumulate in the cell envelope under conditions that inhibit the action of signal-peptide peptidase (see below) (Hussain et al., 1982). The fragments are also seen in vitro in the absence of signal-peptide peptidase (Silver et al., 1981). Thus, the signal peptidase cleaves only at the cleavage site between the mature protein and the signal sequence. The signal peptidase cleavage site has been described above. The interactions of the signal sequence and/or mature secretory proteins with the signal peptidase have not been studied extensively. The variability of the sequences of the signal peptidase cleavage sites and the predicted tendency of the c region of the signal sequence to fold into a β turn make it likely that the signal peptidase recognizes a secondary structure rather than a particular amino acid sequence. von Heijne (1983) has proposed a signal sequence-signal peptidase complex in which the small neutral side chains of residues -1 and -3 fit into a pocket of the protease, while Perlman and Halvorson (1983) have suggested that the predicted β turn is important for proper access of signal peptidase to the cleavage site. Cleavage appears to take place on the lumenal side of the ER membrane, or on the outside of the E. coli inner membrane. Wolfe et al. (1983a) presented evidence that the E. coli signal peptidase is oriented such that the bulk of the protein is at the outer surface of the cytoplasmic membrane and possibly also the outer membrane. This orientation is consistent with the peptidase’s site of action. Recent data suggest that blocking cleavage of precursors causes their accumulation in the E. coli inner membrane (Dalbey and Wickner, 1985). This result implies a catalytic role for the peptidase in release of exported proteins from the membrane.
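As an illustration only, the proposal that small neutral side chains occupy positions -1 and -3 can be written as a simple sequence test. The Python sketch below is not taken from the cited work: the residue set (Ala, Gly, Ser, Cys, Thr), the function names, and the toy precursor sequence are all assumptions made for this example.

```python
# Illustrative test of the "-3, -1" pattern at a candidate cleavage site.
# ASSUMPTION: this residue set is a simplification chosen for the sketch,
# not a definition taken from the cited papers.
SMALL_NEUTRAL = frozenset("AGSCT")


def fits_minus3_minus1(precursor: str, signal_length: int) -> bool:
    """True if residues -1 and -3 relative to the cleavage site are small and neutral.

    The cleavage site lies between precursor[signal_length - 1] (residue -1)
    and precursor[signal_length] (residue +1 of the mature protein).
    """
    if signal_length < 3 or signal_length >= len(precursor):
        raise ValueError("need at least 3 signal residues and 1 mature residue")
    minus1 = precursor[signal_length - 1]
    minus3 = precursor[signal_length - 3]
    return minus1 in SMALL_NEUTRAL and minus3 in SMALL_NEUTRAL


def candidate_sites(precursor: str, min_len: int = 15, max_len: int = 30) -> list:
    """Signal lengths within a window whose -3 and -1 residues fit the pattern."""
    upper = min(max_len, len(precursor) - 1)
    return [n for n in range(min_len, upper + 1) if fits_minus3_minus1(precursor, n)]


# Hypothetical toy precursor (not a real protein): a 21-residue signal
# (charged start, hydrophobic core, small residues near the end) followed
# by the first residues of a "mature" chain.
TOY_PRECURSOR = "MKK" + "L" * 12 + "GASASA" + "DEKPTQW"

print(fits_minus3_minus1(TOY_PRECURSOR, 21))
print(candidate_sites(TOY_PRECURSOR))
```

In this toy case several adjacent positions pass the test, which fits the view expressed above that a sequence pattern by itself is unlikely to be the only determinant of cleavage.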
2. Signal-Peptide Peptidase
Signal-peptide peptidase activity has been found in E. coli (Zwizinski and Wickner, 1980; Hussain et al., 1982; Silver et al., 1981). The protein resides in the inner membrane, but is not identical to the signal peptidase or the lipoprotein signal peptidase. This enzyme has been purified, and has been identified as protease IV of the E. coli cell envelope (Ichihara et al., 1984). This signal-peptide peptidase is inhibited by a wide range of protease inhibitors, including antipain, leupeptin, chymostatin, and elastatinal (Hussain et al., 1982). It degrades signal peptide only after it is cleaved from the mature protein, and appears not to attack membrane proteins (Ichihara et al., 1984). Ichihara et al. (1984) have speculated that the signal-peptide peptidase is a carboxypeptidase that initiates digestion after cleavage of the signal peptide. Novak et al. (1986) have reported two cytoplasmic proteases with signal peptide hydrolyzing activity. They suggest that the membrane-resident and cytoplasmic proteases may function in combination.
C. Proteins Implicated in Eukaryotic Protein Secretion
The signal hypothesis postulates the existence of several proteins necessary for secretion. These include the components of the SRP, which is proposed to bind to the signal sequence and block further translation of the mRNA coding for the mature protein; SRP receptor, or docking protein, which relieves the translation block imposed by the SRP; ribophorins, which bind the ribosome to the ER membrane; signal peptidase and signal-peptide peptidase, discussed above; and other proteins, which form a pore or transport apparatus in the membrane. Some of these proteins, the SRP, SRP receptor, signal peptidase, and ribophorins, have been isolated from eukaryotic cell extracts and characterized.
1. The Signal-Recognition Particle
The signal-recognition particle (SRP) is an 11 S ribonucleoprotein composed of a 7 S RNA molecule and six nonidentical polypeptide chains (Walter and Blobel, 1983). It is viewed as functioning to couple the translation of secretory proteins to their translocation through the membrane (Walter and Blobel, 1981b). SRP is isolated from the microsomal membrane by a high-salt wash (Walter and Blobel, 1981a). It can be disassembled into its individual components, which can then be reconstituted into an active particle (Walter and Blobel, 1983). The components of the SRP have been characterized. The sequence of the 7 S RNA is known, and a predicted secondary structure has been computed (Gundelfinger et al., 1984). The regions of the RNA to which the proteins bind have been defined. Only five of the six proteins bind the RNA; these are extremely basic, as judged by isoelectric focusing and their tight binding to acidic chromatographic materials. The sixth protein binds to two of the other proteins, and not to the RNA (Walter and Blobel, 1983). Electron microscopy indicates that the SRP is an elongated rod about 24 nm long and 5-6 nm in diameter (Andrews et al., 1985). Since the distance between the nascent chain exit site from the ribosome and the peptidyl-transfer center is about 16 nm, it is possible that the SRP exerts its effect on protein translation by binding to both the signal sequence (near the exit site) and a component of the peptidyl-transfer site (Andrews et al., 1985). The components of the SRP appear to have been conserved among eukaryotes. Chimeric SRPs reconstituted from mammalian SRP proteins and 7 S RNA from either X. laevis or D. melanogaster functioned both to block synthesis of secretory proteins temporarily and to cause their translocation through microsomal membranes (Walter and Blobel, 1983). A chimeric SRP composed of 6 S RNA (described in Section IV,D) from E. coli and mammalian proteins was inactive (Walter and Blobel, 1983).
However, mammalian SRP can recognize both prokaryotic and eukaryotic signal sequences. Secretion of β-lactamase from E. coli by a canine system requires SRP (Müller et al., 1982). SRP recognizes the prokaryotic signal sequence and binds to it, resulting in a translation arrest, and assists in the protein’s cotranslational translocation into dog pancreas microsomes. The SRP recognizes polysomes synthesizing secretory proteins (Walter and Blobel, 1981a). Recent evidence from synthesis of preprolactin with 6-azidobenzoyllysyl-tRNA as a photoaffinity group supports a direct interaction between the 54-kDa polypeptide of the SRP and the signal sequence (Kurzchalia et al., 1986). Furthermore, construction of a secreted protein
lacking its signal sequence led to loss of the SRP-imposed translation block (Weidmann et al., 1986a). In a cell-free translation system from wheat germ, SRP binding imposes a translation block, which is relieved when SRP, the ribosome, and/or the signal sequence interact with the SRP receptor at the ER membrane (Walter and Blobel, 1981b). The ability of the SRP to cause an elongation arrest has been localized to two of its six proteins (Siegel and Walter, 1985). In the presence of reconstituted SRP lacking these proteins [(-9/14) SRP], no translation arrest occurs. Secretory proteins are still translocated, although at a somewhat lower efficiency than usual. In synchronized translation experiments⁴ in the presence of (-9/14) SRP, the amount of protein translocated depends on the time of addition of microsomes. No translocation occurs if microsomes are added more than 5 minutes after translation has begun. This amount of time corresponds to synthesis of a polypeptide of maximum length 150 residues. In contrast, in translation systems containing whole SRP, 100% translocation occurs regardless of the time of microsome addition. These results demonstrate that the translation arrest imposed by SRP does in fact couple protein synthesis to export. Meyer et al. (1982) have shown that reticulocyte lysates contain an SRP-like activity that causes a synthesis arrest on addition to a wheat germ translation system. This activity does not arrest translation in the reticulocyte lysate, however. In addition, no translation arrest occurs when canine SRP is added to cell-free translation systems derived from HeLa cells or rabbit reticulocytes (Meyer, 1985). Meyer concludes that the translation arrest is peculiar to the wheat germ translation system. In these experiments, however, the translation system, SRP, and export apparatus have been derived from two or more organisms.
Although some parts of the secretion apparatus, such as signal sequences and parts of the SRP, appear to have been conserved through evolution, others, such as the SRP receptor or some ribosomal proteins, may not have been. Thus, the lack of translation arrest in a given experiment may be due to a lack of interaction between molecules that come from different species. Conversely, the translation arrest observed in the wheat germ translation system may reflect a more tolerant requirement for interaction of some component critical for secretion. Clearly, the definitive experiment would involve using a translation system, SRP, and export assay from the same organism. Another possible explanation for the lack of observed export block is the difference in rate of translation in vivo and in vitro. The rate of chain elongation in vivo in eukaryotes is about 180 residues/minute, while in vitro translation proceeds at about 30 residues/minute. If an SRP-nascent chain-ribosome complex has a half-life of, e.g., 1 second, it would cause a significant pause in synthesis in vitro, but would probably not be noticed in vivo. Such a short-lived complex may be sufficient to couple translation to translocation in vivo, but not in vitro, as the time required for the ribosome to diffuse to the membrane will depend on how far it has to go. Inside the cell, the ribosome will have a much smaller distance to travel than in an in vitro translocation mixture.
⁴ A synchronized translation system is one in which all of the ribosomes are translating the same part of a given mRNA at the same time. Synchronization is achieved by allowing protein synthesis to proceed for a short period of time (e.g., 30 seconds) and then adding an inhibitor of further initiation. Thus, chains that are already being synthesized are completed, but no new chains are started.
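The arithmetic linking the 5-minute window in the synchronized translation experiments to a chain of at most 150 residues can be checked with a short Python sketch. It assumes, for illustration, that those experiments proceed at the in vitro elongation rate of about 30 residues/minute quoted in this section; the function and variable names are hypothetical.

```python
def chain_length(rate_residues_per_min: float, minutes: float) -> float:
    """Residues added to a nascent chain elongating at a constant rate."""
    return rate_residues_per_min * minutes


# Approximate elongation rates quoted in the text (residues per minute).
IN_VITRO_RATE = 30.0
IN_VIVO_RATE = 180.0

# ASSUMPTION: the 5-minute window is measured at the in vitro rate.
WINDOW_MIN = 5.0

max_length = chain_length(IN_VITRO_RATE, WINDOW_MIN)

# Time needed to make a chain of the same length at the in vivo rate.
seconds_in_vivo = max_length / IN_VIVO_RATE * 60.0

print(f"chain length after {WINDOW_MIN:g} min in vitro: {max_length:g} residues")
print(f"time to reach that length in vivo: {seconds_in_vivo:.0f} s")
```

Under these assumptions the two figures in the text agree with each other, and the same chain length would be reached in under a minute at the in vivo rate.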
2. SRP Receptor
The SRP receptor or docking protein (Walter et al., 1981; Walter and Blobel, 1981b; Meyer et al., 1982) is an integral membrane protein of the RER (Hortsch and Meyer, 1985) with a mass of 72,000 Da (Gilmore et al., 1982; Hortsch et al., 1985). One of its functions appears to be to relieve the SRP-imposed translation arrest by binding tightly to the SRP (Walter and Blobel, 1981b). It is required for secretion even in the absence of a translation block, however, so it must have at least one additional function (Siegel and Walter, 1985). Siegel and Walter (1985) have suggested that the SRP receptor’s primary function is to direct the SRP-bound nascent polypeptides to the ER membrane where the ribosome can be bound to the membrane and translocation can occur. The SRP receptor can be cleaved into two or three domains by proteolysis (Hortsch et al., 1985). Mild treatment with elastase generates a soluble fragment of 59,000 Da and a membrane-bound domain of 14,000 Da. The 59,000-Da fragment may be capable of restoring translocation activity to microsomes depleted of SRP receptor and can release an SRP-mediated translation arrest. This result conflicts with those of Lauffer et al. (1985) on what is allegedly the same fragment. Treatment with both trypsin and elastase results in three fragments: soluble pieces of 46,000 and 13,000 Da, and a membrane anchor of 14,000 Da. The 46,000-Da fragment was not able to reconstitute the activity of microsomes lacking the SRP receptor, nor did it bind SRP. The authors concluded that the soluble 13,000-Da fragment is required for both membrane association and SRP binding by the SRP receptor. The complete primary structure of the SRP receptor has been determined (Lauffer et al., 1985). It appears that the protein has an uncleaved signal sequence near its amino terminus. Its membrane-bound region is
in the amino-terminal part of the protein, which has two hydrophobic stretches of amino acids that may insert into the membrane as a double “helical hairpin” (Lauffer et al., 1985). The remainder of the protein is quite hydrophilic. Three regions in particular have a large proportion of charged amino acids; basic residues are about twice as numerous as acidic residues. These regions resemble nucleic acid binding proteins, and may be responsible for binding to the RNA molecule of the SRP. It is unlikely that the SRP receptor binds the signal sequence. Association between polysomes and the SRP receptor appears to require the SRP as an intermediary. The extreme basicity of the SRP receptor makes it likely that it binds a highly acidic entity (Lauffer et al., 1985); no part of the signal sequence is acidic. Furthermore, Gilmore and Blobel (1983) have presented evidence that the affinity of the SRP receptor for the ribosome and the nascent polypeptide is low, both in the presence and absence of SRP.
3. Ribophorins
The low affinity of the SRP receptor for the ribosome, and the substoichiometric ratio of SRP and SRP receptor to membrane-bound polysomes, suggest that interaction of the SRP and SRP receptor with the ribosome is transient (Gilmore and Blobel, 1983). Thus, the observed binding of the ribosome to the membrane must occur by some other means. Two integral membrane glycoproteins, ribophorins I and II, have been identified (Kreibich et al., 1978, 1983). They appear to be required for translocation (Amar-Costesec et al., 1984), and may be responsible for binding of the ribosome to the membrane. Ribophorins I and II are immunologically distinct integral membrane proteins found in the RER membrane (Amar-Costesec et al., 1984; Marcantonio et al., 1982). The proteins were first isolated from rat liver, but very similar (in location, immunological behavior, and molecular weight) proteins have been found in RER membranes from other rat tissues and from other species, such as rabbit and dog (Marcantonio et al., 1982). Rat liver ribophorins I and II have molecular masses of 65,000 and 63,000 Da, respectively (Kreibich et al., 1978). Binding of ribosomes by microsomal subfractions correlates with the presence of these proteins, and they can be chemically cross-linked to ribosomes (Kreibich et al., 1983; Amar-Costesec et al., 1984). The molar ratio of the ribophorins to each other and to bound ribosomes is about one, supporting a model in which one of each of the ribophorins is required to bind one ribosome to the membrane (Marcantonio et al., 1984). Microsomal subfractions containing ribophorins can be stripped of their native ribosomes. These stripped microsomes can then bind ribosomes that are not carrying a
nascent secretory polypeptide (Amar-Costesec et al., 1984). This result suggests that the signal sequence does not play a part in ribophorin-ribosome binding.
4. Other Proteins
The signal hypothesis postulates that there are, in addition to the SRP, SRP receptor, ribophorins, and signal peptidase, other components of the secretory apparatus that are responsible for the transport of the nascent polypeptide across the membrane. To date, a protein that performs this function has not been isolated, although there is evidence for a proteinaceous site in the RER membrane that specifically binds signal peptides in the absence of ribosomes and at high ionic strength, which implies the absence of SRP (Bendzko et al., 1982; Prehn et al., 1980, 1981; Robinson et al., 1985). The activity resides at the cytoplasmic side of the ER, as signal-peptide binding is abolished by treatment with added protease (Prehn et al., 1980). Cross-linking experiments tentatively identify the binding activity with a membrane-bound protein of approximately 45,000 Da (Robinson et al., 1985). This “signal-peptide receptor” is saturable, with a Kd of 1 × 10-’ M. Its function is unknown (Prehn et al., 1980; Robinson et al., 1985). A number of proteins necessary for invertase secretion have been identified in genetic studies of yeast. Most of these affect secretion events that occur after translocation of the protein into the ER lumen (Novick et al., 1980). However, two mutant strains that are temperature-sensitive for the products of genes sec53 and sec59 accumulate secretory-protein precursors bound to the ER membrane at the nonpermissive temperature (Ferro-Novick et al., 1984a). The gene products of sec53 and sec59 have not been fully characterized, but the phenotypes of the mutant strains indicate that neither codes for components of the SRP or SRP receptor (Ferro-Novick et al., 1984b). Thus sec53 and sec59 may specify new proteins in the secretion machinery.
D. Proteins Implicated in Prokaryotic Protein Secretion
Information on components of the prokaryotic protein-secretion apparatus has come primarily from genetic studies in E. coli. The chief technique has been to obtain a strain that is defective in some aspect of protein secretion and then to screen for strains in which the defect is suppressed by a compensating second-site mutation (Emr et al., 1981; Bankaitis and Bassford, 1985). Another approach has been to select for mutations that exhibit a pleiotropic export-defective phenotype (Ito et al., 1983; Oliver and Beckwith, 1981). These procedures have identified a host of genetic loci that specify proteins that affect secretion. Two of
these proteins have been partially purified and characterized. Possible protein-protein, protein-membrane, and protein-ribosome interactions have also been determined. Recent results show that the effects of suppressor mutations may be indirect and should be interpreted with caution (Strauch et al., 1986). Another potential complexity is the finding that signal sequence mutations can directly influence levels of expression, either increasing or decreasing them (Kadonaga et al., 1986; Matteucci and Lipetsky, 1986).
1. secA
The secA gene product has a pleiotropic effect on protein secretion (Oliver and Beckwith, 1981). Lack of the SecA protein abolishes synthesis of most exported proteins (Liss and Oliver, 1986), and this defect is lethal to the cells. The product of the secA gene has been identified as a 92,000-Da protein that is located at the cytoplasmic face of the inner membrane (Oliver and Beckwith, 1982a,b). This protein appears to interact with the product of the prlA (secY) gene (described below), as defects in the prlA gene can suppress a secA defect (Brickman et al., 1984; Oliver and Liss, 1985). The protein may also interact with the signal sequence of exported proteins (Kumamoto et al., 1984): in the absence of SecA, synthesis of MBP is abolished. In three mutants with nonfunctional MBP signal sequences, the elimination of SecA does not prevent MBP synthesis. Instead, precursor of MBP accumulates in the cytoplasm. Kumamoto et al. (1984) had suggested that SecA might correspond to the part of the eukaryotic SRP that is responsible for binding to the SRP receptor and relieving the translation block. However, it has recently been found (Strauch et al., 1986) that the signal sequence mutations restore synthesis not only of the same protein; the effect appears to be a nonspecific one. Hence, either signal sequence mutations alter synthesis and secretion of other proteins or the secA product is induced by the presence of the mutation.
Furthermore, cyclic AMP was found to restore the level of synthesis of exported proteins in SecA amber mutants (Strauch et al., 1986). Synthesis of SecA is regulated in response to protein export (Oliver and Beckwith, 1982b). When export is inhibited, the production of SecA increases at least 10-fold to compensate for the secretion defect. Oliver and Beckwith (1982b) have speculated that the presence of precursors of secreted proteins in the cytoplasm could serve as a regulator of the expression of proteins needed for synthesis.
2. secB and secC
Strains containing mutations in the secB gene exhibit mild defects in secretion of a subset of exported proteins, including MBP, OmpF, and
LamB (Kumamoto and Beckwith, 1983, 1985). The precursors of these proteins accumulate in the cytoplasm. The gene product, a protein with an apparent mass of 12,000 Da, appears not to be essential for growth, possibly because not all exported proteins are affected. Double mutants defective in both secA and secB grow more poorly than either of the parent strains (Kumamoto and Beckwith, 1983). The effects of the two mutations are synergistic, suggesting that the gene products are part of the same export pathway, and may interact with each other. Little is known of the mechanism of the secB export defect. A mutation at the secC gene, which codes for a ribosomal protein (S15, S. Ferro-Novick and J. Beckwith, personal communication), can suppress the secretion defect of a secA mutant (Ferro-Novick et al., 1984c). The phenotype of a temperature-sensitive secC mutant is similar to that of a secA- strain. At the nonpermissive temperature, synthesis of exported proteins is blocked. The synthesis of MBP can be restored in secC mutants by mutations in the hydrophobic core of its signal sequence. The new insights into SecA mutants (Strauch et al., 1986) cloud the interpretation of these results.
3. prlA (secY)
The gene prlA was first identified as part of the E. coli secretion apparatus because mutations in the gene can restore export of signal-sequence mutants of the λ-receptor protein (Emr et al., 1981). Subsequently, Ito et al. (1983) isolated a temperature-sensitive mutant, secY, that was pleiotropically defective in protein secretion. The prlA and secY genes are almost certainly identical (Ito et al., 1984). The prlA gene is located near the promoter-distal end of the spc operon (Schultz et al., 1982). This operon consists of genes for several ribosomal proteins, as well as the prlA gene and the X gene, which codes for a protein of unknown function (Cerretti et al., 1983).
Since the prlA gene is part of the spc operon, it is possible that its gene product is also a ribosomal protein. However, the chromosomal locations of most ribosomal genes are known, so it is unlikely that prlA codes for a known ribosomal protein (Schultz et al., 1982). The prlA gene has been sequenced, and its gene product was identified (Ito, 1984). The protein’s molecular mass, predicted from its DNA sequence, is 49,000 Da. It has unusual properties characteristic of integral membrane proteins. Akiyama and Ito (1985) have shown that this protein resides in the cytoplasmic membrane. They have suggested, based on its highly hydrophobic nature, that PrlA may correspond to the ribophorins of eukaryotic cells. Alternatively, the protein could form a pore through the membrane for passage of the nascent polypeptide.
The prlA gene product may interact with the signal sequences of exported proteins, as it can suppress mutations in this region (Emr et al., 1981; Shultz et al., 1982). Interestingly, mutations in this gene have been shown to suppress an export defect that resulted from a change in the first amino acid in the mature exported protein (Liss et al., 1985). The allele specificity of prlA also supports the idea of direct contact between the signal sequence and PrlA. Certain mutations in prlA can suppress export defects due to certain mutations in the signal sequence, but not to others (Emr et al., 1981). Such allele specificity implies a direct interaction of PrlA with the signal sequence. In addition, as noted above, mutations in prlA can restore export function to secA mutants (Brickman et al., 1984). Thus, the PrlA protein may also interact with the secA gene product.
4. prlB, prlC, prlD
The prlB and prlC genes were also identified as sites of suppressor mutations that restore export to mutant λ-receptor proteins (Emr et al., 1981). The prlB suppressor is a mutation in a gene coding for a periplasmic ribose-binding protein (Silhavy et al., 1983). It has been suggested that prlB does not code for a part of the export machinery, but suppresses λ-receptor protein signal sequence mutations by bypassing the normal secretion pathway (Silhavy et al., 1983). The prlC mutants are similar in phenotype to the prlA mutant. The gene maps between 69 and 71 minutes on the E. coli chromosome (Emr et al., 1981). Little else is known of this suppressor. Another gene, prlD, was identified as affecting protein secretion in E. coli (Bankaitis and Bassford, 1985). Mutations in prlD suppress the export defect of a signal-sequence mutant of MBP. They can also suppress a LamB export defect. Mutations in prlD are allele-specific, which implies that the prlD gene product interacts directly with the signal sequence. There is evidence that prlD also interacts with prlA. Certain strains with mutations in both the prlA and prlD genes exhibit a general lack of protein export, and accumulate precursors of exported proteins in the cytoplasm.
5. Other Components of the Secretion Apparatus in E. coli
A number of other genes appear to affect protein secretion in E. coli. Most are poorly characterized, and could affect the export process, synthesis of exported proteins, and/or regulation of export or synthesis. A mutation in the envZ gene, which is involved in the transcriptional regulation of genes coding for OmpF and OmpC, has pleiotropic effects on
secretion (Silhavy et al., 1983). A mutation in the expA gene causes decreased secretion of periplasmic and outer-membrane proteins without affecting cytoplasmic and inner-membrane proteins (Dassa and Boquet, 1981). Oliver (1985) has identified five new genes, ssaD, ssaE, ssaF, ssaG, and ssaH, that are extragenic suppressors of mutations in the secA gene. Mutations in these genes decrease the synthesis of MBP (and presumably other exported proteins), which strengthens the idea that E. coli has a mechanism for coupling protein synthesis and secretion, although this interpretation is complicated by the recent findings of Strauch et al. (1986). Shiba et al. (1984) have isolated a mutation that suppresses the protein-export defect of a secY mutation. The gene in which the mutation occurs was designated ssyA. The strain carrying the mutant ssyA gene is altered in protein synthesis as well as in export. Thus, ssyA may code for a protein that is part of both the synthetic machinery and the export apparatus. Müller and Blobel (1984a,b) have partially purified a soluble factor required for protein secretion from an in vitro translocation system derived from E. coli. As it sediments at about 12 S, the authors have suggested that it is a complex of smaller molecules. It does not contain 6 S RNA (see below), but may contain some other small RNA. Its function is unknown. There is also evidence of at least one protein at the cytoplasmic surface of the E. coli membrane that is involved in translocation: treatment of inverted membrane vesicles with protease renders the membrane inactive for subsequent translocation (Rhoads et al., 1984; Chen et al., 1985). Escherichia coli possesses a 6 S RNA of unknown function. This molecule complexes with protein to form an 11 S particle, whose function is also unclear (Lee et al., 1978).
It was suggested that the 11 S particle is analogous to the eukaryotic SRP, and that the 6 S RNA corresponds to the eukaryotic 7 S RNA that is part of the SRP (Walter and Blobel, 1983). Lee et al. (1985) demonstrated that the 6 S RNA is not necessary for growth of E. coli or for protein export. The authors conclude that the E. coli 6 S RNA is not a component of a bacterial SRP. There is indirect evidence for the existence of an SRP-like entity in E. coli. Pages et al. (1985) have presented preliminary findings that indicate that a translation block may occur during the synthesis of pre-PhoS, a periplasmic phosphate-binding protein. It is known that the translation rate in E. coli is nonuniform. “Pause sites” (Pages et al., 1985) occur at codons complementary to uncommon tRNAs. However, a pause site in PhoS elongation corresponding to a peptide of 8 kDa is not accounted for by the presence of such codons. This peptide was subsequently converted into pre-PhoS. This intermediate could be the result of a translation block similar to that imposed by the eukaryotic SRP. There is also preliminary evidence for a protein secretion apparatus in Bacillus subtilis. Caulfield et al. (1984, 1985) have studied the “S complex,” a particle consisting of four proteins that appears to be involved in protein secretion. The complex is present on ribosomes as a small particle, essentially a third ribosomal subunit; its proteins can be cross-linked to the 50 S ribosomal subunit. In addition, a 64-kDa protein present in the S complex is protected from added protease in the presence of both ribosomes and membrane, but not by either alone. The S complex does not appear to cause an arrest of translation (Caulfield et al., 1984). The S complex aggregates to form a clathrin-like structure when it is removed from ribosomes (Caulfield et al., 1985). The authors have proposed that such a structure might serve to form a “cage” around a nascent secretory polypeptide, isolating it from the cytoplasm until it reaches the membrane. Subsequent to membrane binding, three of the proteins of the S complex dissociate and the 64-kDa protein remains associated. This latter protein may then play a role in secretion.
6. Summary

The genetic evidence presented above makes it clear that E. coli, and possibly other bacteria, possess a complex set of proteins that act in the protein-secretion process. Although it appears that at least one protein, the M13 phage coat protein, can be localized and processed in the absence of proteins other than signal peptidase (Section V,B) (Silver et al., 1981; Ohno-Iwashita and Wickner, 1983; Watts et al., 1981), most proteins of the bacterial cell envelope require the participation of a secretion apparatus for proper localization. Whether the bacterial secretion process is analogous to the eukaryotic process remains to be seen. The recent development of in vitro translocation systems derived from E. coli should facilitate research in this area (Rhoads et al., 1984; Müller and Blobel, 1984b).

V. HOW DOES SECRETION OCCUR?
Despite large amounts of evidence concerning the requirements of protein secretion, the molecular mechanism of the process is still unclear. Various models of the export mechanism have been proposed. No one hypothesis accounts for all of the data collected to date; it may be that one mechanism is insufficient to explain the many types of export that occur. In this section, some current models of protein secretion are summarized. The models are then compared to each other, and to existing data, with reference to some of the major questions concerning protein-export mechanisms.

A. Models of Protein Secretion
1. The Signal Hypothesis

The signal hypothesis of Blobel and Dobberstein (1975a,b) is diagrammed and discussed in Section II,C; the details will not be repeated here. Its distinguishing points are (1) recognition of the signal sequence by the SRP, and subsequent translation arrest; (2) release of the translation arrest by the SRP receptor at the membrane, and association of the ribosome with the membrane; and (3) vectorial transport of the nascent protein through a proteinaceous pore. Energy for translocation is derived from elongation of the peptide chain.
2. The Membrane Trigger Hypothesis

Wickner (1980) proposed an alternative mechanism of protein secretion, called the membrane trigger hypothesis. This model proposes that the signal sequence influences the precursor protein or a domain of the precursor to fold into a conformation that can spontaneously partition into the hydrophobic part of the bilayer. In prokaryotes, the membrane potential causes the protein to traverse the bilayer. The protein then regains a water-soluble conformation, and is expelled into the medium. Signal peptidase removes the signal sequence during or after this process. Thus, secretory proteins or domains are transported across the membrane posttranslationally without the aid of a proteinaceous secretory apparatus. An energy source, such as the membrane potential, is required for secretion.

3. The Loop Model

Based on analysis of the amino acid sequences of signal peptides, Inouye and Halegoua (1980) proposed the loop model of protein secretion. Although the model does not specify the energy source, the temporal relationship of translation and translocation, or the export site, its ideas have shaped subsequent thinking about the topology of export. In this mechanism, positively charged residues at the amino terminus of the signal sequence bind to the negative charges of the phosphatidylglycerol head groups at the membrane surface. The proline and glycine residues that occur in most signal sequences induce formation of a reverse turn, so that the signal peptide enters the membrane as a loop. As the peptide is elongated, the loop protrudes further into the membrane. The cleavage site is eventually located at the outer face of the cytoplasmic membrane, while the charged amino terminus remains anchored at the inner face.

This idea is supported by experiments in which the gene coding for β-galactosidase is fused to the gene for prelipoprotein. Thus, the gene product is a hybrid protein consisting of β-galactosidase followed by the signal sequence and the structural sequence for lipoprotein. The lipoprotein portion of the hybrid is localized to the outer membrane, as usual. The signal sequence remains in the membrane, and β-galactosidase is found on the cytoplasmic side of the membrane (M. Inouye, personal communication). A fusion of part of the E. coli erythromycin resistance gene to the amino terminus of prelipoprotein was described above (Section III,B; Hayashi et al., 1985). Lipoprotein was secreted in this case also. These experiments substantiate the central contention of the loop model, as they confirm that the amino terminus of the signal sequence remains on the cytoplasmic side of the membrane, while the secreted protein is translocated. The recent work of Perara et al. (1986, described in Section III,B) suggests that the presignal sequence component in these hybrid systems may affect the eventual disposition of the protein product, as globin with the prolactin signal sequence on its C terminus was translocated.
4. The Helical Hairpin Hypothesis

The helical hairpin hypothesis of Engelman and Steitz (1981) and the direct transfer model of von Heijne and Blomberg (1979) share an emphasis on the thermodynamic basis of secretion. These models note the unusual hydrophobicity of signal sequences and calculate that in a helical conformation (either an α helix or a 3₁₀ helix) a signal sequence can partition into the hydrophobic part of the membrane. The helical hairpin mechanism involves two helical regions long enough to span the membrane. One is composed of the signal sequence, and the other is the first 15-25 residues of the mature protein. These two regions form a side-by-side “hairpin” structure that inserts into the membrane as a loop. The rest of the mature protein is translocated cotranslationally, with each residue passing through the membrane as part of a helical structure. This model predicts that a proteinaceous export site is not required for export and that export is initially driven by the favorable free energy of transfer of the hydrophobic signal sequence from the cytoplasm to the membrane.
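The requirement that each helical region be "long enough to span the membrane" can be illustrated with a short calculation. The sketch below uses the standard axial rise per residue of an α helix (1.5 Å) and a 3₁₀ helix (2.0 Å). The 30 Å thickness of the hydrophobic core is an assumed, illustrative value and is not taken from the models discussed here; the function name is likewise hypothetical.

```python
import math

# Axial rise per residue in angstroms (standard textbook values).
RISE_PER_RESIDUE = {"alpha": 1.5, "3_10": 2.0}

# ASSUMPTION: illustrative thickness of the hydrophobic core of a bilayer,
# in angstroms. This number is not given in the text.
ASSUMED_CORE_THICKNESS = 30.0


def residues_to_span(thickness, helix="alpha"):
    """Smallest whole number of residues whose helix length covers `thickness`."""
    return math.ceil(thickness / RISE_PER_RESIDUE[helix])


if __name__ == "__main__":
    for helix in RISE_PER_RESIDUE:
        n = residues_to_span(ASSUMED_CORE_THICKNESS, helix)
        print(f"{helix} helix: {n} residues to span {ASSUMED_CORE_THICKNESS} A")
```

Under this assumption, both helix types need a number of residues that falls within the 15-25 residue range quoted for the second arm of the hairpin; a thicker or thinner assumed core shifts the result proportionally.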
5. The Amphiphilic Tunnel Hypothesis

Rapoport (1985) has also considered the thermodynamic aspects of protein insertion and secretion in his amphiphilic tunnel hypothesis. This model is essentially identical to the signal hypothesis in describing the initial steps of protein secretion. The translocation process and the properties of the region through which the protein traverses the membrane are described in detail. It is assumed that the export apparatus in the membrane consists of an amphipathic tunnel that can bind both hydrophobic and hydrophilic parts of the nascent protein. The tunnel might be formed of a protein with several types of binding sites, or of lipids arranged to provide both polar and apolar regions, or both. The translocation process begins with the signal sequence binding to the hydrophobic region of the tunnel. The nascent chain enters the tunnel, and begins to fold into a low-energy conformation. Hydrophilic regions of the protein are generally not retained in the tunnel, and are expelled into the aqueous phase. Hydrophobic and amphiphilic parts remain in the membrane, either until they assemble into a polar domain and are released into the medium, or until translation is complete. When translation ends, the amphipathic tunnel disassembles, and the portions of the nascent chain remaining in the membrane are either transferred to the outer face of the membrane or retained in the bilayer, depending on the compatibility of the peptide segments with a hydrophobic environment. Thus, this hypothesis models both protein secretion and the insertion of membrane proteins.
6. A More Active Role for Membrane Lipids?

Nesmayanova (1982) has proposed a model that includes a more active role for the membrane lipids in translocating the secreted protein. Briefly, the model postulates an initial association of the signal sequence with the acid phospholipids of the inner membrane. This association neutralizes the negative charge of the lipid head groups and stimulates transbilayer movement of both the phospholipids and the signal sequence as a unit. A hydrophilic channel is formed by the lipids in a hexagonal arrangement. The secreted protein is then forced through the channel by the elongation process and by the motion of the lipids.

Evidence for this model rests primarily on a correlation of lipid biosynthesis and translocation with protein synthesis and secretion. Increased synthesis of one causes increased synthesis of the other (Nesmayanova, 1982; Pagès, 1982). Both are inhibited by dissipation of the membrane potential (Bogdanov et al., 1984). Phosphatidylglycerol is present at the site of protein translocation, and may be involved in binding to the nascent chains and/or the ribosome (Bogdanov et al., 1985b). Bogdanov et al. (1985a) claim that secretion of alkaline phosphatase is accompanied by the appearance in freeze-fracture electron micrographs of “intramembrane particles” that represent areas of hexagonal lipid arrangement.
7. The Domain Model

Randall and Hardy (1984a,b) proposed a secretion mechanism that could be called the domain model. This model incorporates and modifies portions of the signal hypothesis and the membrane trigger hypothesis. As in the amphipathic tunnel model, the initial steps of secretion (those involving targeting of the translation complex to the proper membrane) are as set forth in the signal hypothesis. After recognition and membrane association, the nascent chain is elongated at the membrane surface. It does not enter the membrane, however, until most or all of the protein has been synthesized. The protein is then transported across the membrane in “domains,” and not in a vectorial manner. It is important that synthesis occur at the membrane in order to prevent folding of the domains into a translocation-incompetent conformation. The mechanism of translocation is not specified, and may or may not involve participation of a proteinaceous secretory apparatus. The energy required for secretion is derived either from the membrane potential, or from a conformational change of the exported protein. A multistep export mechanism with some features in common with the domain model has been proposed for β-lactamase (Koshland et al., 1982; Kadonaga et al., 1986).

Some of the outstanding questions concerning protein secretion are: (1) What is the nature of the translocation site? (2) Is translocation vectorial or by domains? (3) How much energy is required for secretion and where does it come from? The models described above predict the answers to these questions in various ways. In the following sections, the available experimental data on each of these questions are described and compared to the predictions of the models.

B. What Is the Nature of the Translocation Site?

Exported proteins must cross a biological membrane, which is composed largely of lipids and proteins. Is the translocation site made of lipids or protein?
The membrane trigger hypothesis, the helical hairpin hypothesis, the domain model, and the model of Nesmayanova postulate that protein translocation occurs directly through the lipid bilayer and that no proteinaceous export site is necessary. Other proteins may be needed for recognition, and signal peptidase is required for removal of the signal sequence after export. Evidence for a lipid translocation site comes primarily from experiments in reconstituted export systems in Wickner’s laboratory. It has been shown that the precursors of M13 phage coat protein and the E. coli proteins OmpA and MBP can partially insert into, and be processed by, liposomes containing no proteins other than signal peptidase (Silver et al., 1981; Ohno-Iwashita and Wickner, 1983; Ohno-Iwashita et al., 1984; Watts et al., 1981). The small size of the M13 coat protein (only 50 amino acids) (Wickner et al., 1980) may render it incapable of binding to an SRP-like particle while it is still attached to the ribosome, as 30-40 amino acid residues are required before the polypeptide begins to protrude from the ribosome (Smith et al., 1978; Randall, 1983). Consequently, synthesis of the protein would be finished, and the ribosome would have disassembled, before the signal sequence became long enough to completely emerge from the ribosome. Thus, it is not surprising that secretion of the M13 coat protein is independent of species like SRP. However, secretion of MBP and OmpA is affected by mutations in proteins of the export apparatus (Liss and Oliver, 1986), and therefore should require proteins other than signal peptidase for proper export. None of the proteins was completely translocated into the interior of the liposomes, however, indicating that some additional protein or proteins are needed. These may or may not be part of a proteinaceous translocation site.

The signal hypothesis describes a mechanism in which the exported protein traverses the membrane through a proteinaceous pore. Gilmore and Blobel (1985) have described experiments that test the accessibility of the nascent chain to aqueous solutes. They conclude that integral membrane proteins are involved in both the initial attachment of the signal sequence to the membrane, and translocation of the exported protein through the membrane. They suggest that the nascent chain does not interact with the membrane lipids.
However, there is indirect evidence in prokaryotes (from the membrane-fluidity experiments described above) that the exported proteins may contact the lipids during translocation. Bogdanov et al. (1985b) have presented results indicating that phosphatidylglycerol is present at the site of protein translocation. Two laboratories have reported that synthesis of acid phospholipids correlates with protein secretion (Nesmayanova, 1982; Pagès, 1982). Furthermore, we have found that the interaction of synthetic signal sequences with phospholipid monolayers correlates with their activity in vivo (Section VI,C). Thus there is evidence for at least some contact of the nascent protein with the membrane lipids.

Although the nature of the translocation site is far from understood, it is probable that the nascent polypeptide contacts both proteins and lipids within the membrane. It may be that the translocation process is initiated by a transitory protein-protein interaction, but proceeds in a lipid environment, or vice versa. Alternatively, the translocation site may be partially proteinaceous and partially lipid, as proposed by Rapoport (1985).
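The positions that the models take on the translocation site, as described in this section and in Section V,A, can be collected in a small lookup table. The sketch below is only a bookkeeping aid: the category labels and the function name are illustrative choices, not terms used by the original authors, and the domain model is grouped with the lipid-site models following the first paragraph of this section, although that model leaves the mechanism open.

```python
from collections import defaultdict

# Translocation site proposed by each model, as summarized in the text.
# Category labels are illustrative, not the original authors' terminology.
TRANSLOCATION_SITE = {
    "signal hypothesis": "protein",
    "membrane trigger hypothesis": "lipid",
    "loop model": "unspecified",          # export site not detailed
    "helical hairpin hypothesis": "lipid",
    "amphipathic tunnel hypothesis": "protein and/or lipid",
    "Nesmayanova model": "lipid",
    "domain model": "lipid",              # mechanism left open by its authors
}


def group_by_site(models):
    """Return a dict mapping each site category to the models proposing it."""
    groups = defaultdict(list)
    for model, site in models.items():
        groups[site].append(model)
    return dict(groups)


if __name__ == "__main__":
    for site, names in group_by_site(TRANSLOCATION_SITE).items():
        print(f"{site}: {', '.join(names)}")
```

Grouping the models this way makes it easy to see that only the signal hypothesis commits to a purely proteinaceous site, while the amphipathic tunnel hypothesis is the only one that explicitly allows a mixed site.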
C. Is Transfer Vectorial or by Domains?

The question of vectorial or domain transfer is closely related to that of the temporal relationship between synthesis and export, which has long been a matter of controversy. The signal hypothesis and the membrane trigger hypothesis directly contradict each other on this point. The signal hypothesis requires that secretion be vectorial and consequently cotranslational, while the membrane trigger hypothesis specifies that secretion is by domains, and often posttranslational. The helical hairpin hypothesis specifies vectorial translocation, and the amphipathic tunnel model of Rapoport allows for either vectorial or domain modes of translocation. There is evidence that both co- and posttranslational translocation can occur, although for some cells or proteins secretion may proceed exclusively by one route or the other. Cleavage of the signal sequence is generally assumed to take place during or soon after translocation. A related matter is whether processing is necessary for release of the exported protein from the membrane. These points are discussed below.

In higher eukaryotes, the timing of translation and translocation is organelle-dependent. Mitochondrial, chloroplast, and peroxisomal proteins are generally translocated posttranslationally (Hay et al., 1984; Kreil, 1981; Grossman et al., 1980), while secretion of proteins into the ER lumen occurs cotranslationally in nearly all cases. Secretion in vitro occurs nearly exclusively when the transport apparatus (from microsomal membranes) is present during synthesis. Secretion is generally not observed when the transport machinery is added after synthesis has stopped (Blobel and Dobberstein, 1975b; Mostov et al., 1981). There is evidence that secretion in vitro is prevented when membranes are added after the nascent chains are about 80 residues long (Rothman and Lodish, 1977).
Thus, in nearly all eukaryotic systems studied to date, translocation into the ER lumen is tightly coupled to protein synthesis. However, recent in vitro studies have found secretion of α-mating factor in yeast to proceed posttranslationally (Hansen et al., 1986; Rothblatt and Meyer, 1986; Waters and Blobel, 1986). Also, Mueckler and Lodish (1986) have reported posttranslational translocation of the human glucose transporter in vitro. They also observed cotranslational translocation regulated by SRP. These results all indicate that alternative mechanisms may operate, depending on the species of organism, the protein, and the experimental procedure.
In E. coli, the timing of translation and translocation appears to be quite variable. Although E. coli is certainly capable of cotranslational export (Smith et al., 1977, 1978; Josefsson and Randall, 1981), there is a large body of evidence that some bacterial proteins can be secreted posttranslationally, both in vivo (Josefsson and Randall, 1981; Goodman et al., 1981; Chen et al., 1985; Ryan and Bassford, 1985) and in vitro (Müller and Blobel, 1984b; Chen et al., 1985; Chen and Tai, 1985). A number of E. coli proteins appear to be secreted both co- and posttranslationally (Josefsson and Randall, 1981; Kadonaga et al., 1985). In the case of cotranslational secretion, Josefsson and Randall (1981) found that translocation of some E. coli proteins, including MBP, arabinose-binding protein, alkaline phosphatase, and OmpA, begins only after the nascent chains reach 80% of their full length. Randall (1983) has presented evidence that entire domains of nascent chains of MBP and ribose-binding protein are translocated after their synthesis.

In direct conflict with reports of posttranslational secretion are the findings of Pagès et al. (1984), who found that precursors of the periplasmic phosphate-binding protein, PhoS, that have accumulated in the cytoplasm are not exported posttranslationally, but are slowly degraded. Precursors bound to the cytoplasmic membrane could be exported, however. Ryan and Bassford (1985) reported that an E. coli strain export-defective for MBP is incapable of rapid, apparently cotranslational, export of MBP, but can secrete the protein posttranslationally. The observed posttranslational secretion is much slower than wild-type secretion, however, and in many cases a fraction of the precursor pool remains in the cytoplasm.

Thus, the temporal relationship between synthesis and secretion in E. coli remains somewhat unclear. Silhavy et al. (1983) and Rhoads et al. (1984) proposed that the coupling between the two processes is not as tight for prokaryotic secretion as it appears to be for eukaryotic secretion into the ER. The mode of secretion may be protein-specific. Some proteins [such as TEM β-lactamase (Josefsson and Randall, 1981)] may be exported primarily posttranslationally, whereas some [such as PhoS (Pagès et al., 1984) and ampC β-lactamase (Josefsson and Randall, 1981)] may be exported primarily cotranslationally in vivo. At least one protein that is secreted cotranslationally in vivo (E. coli alkaline phosphatase) (Smith et al., 1977) can be translocated posttranslationally into E. coli membrane vesicles in vitro (Chen et al., 1985). The results of Ryan and Bassford (1985), described above, suggest that cotranslational export may be the major mode of secretion of wild-type proteins in “normal” (i.e., whole and not mutated) cells, while posttranslational secretion may be a backup system for use in case of damage to the secretory apparatus.
In eukaryotes, cleavage of the signal sequence is thought to occur cotranslationally (Hortin and Boime, 1981a,b), but can occur posttranslationally in cell-free systems (Jackson and Blobel, 1977), or in cases where cleavage is inhibited by changes in the cleavage site (Schauer et al., 1985; Hortin and Boime, 1981a,b). Inhibition of cleavage may slow translocation and affect subsequent processing and transport steps (Schauer et al., 1985). Signal-sequence cleavage does not appear to be necessary for translocation to the ER lumen (Hortin and Boime, 1981a,b), although in one case the precursors were associated with the membrane (Schauer et al., 1985).

Cleavage of the signal sequence in prokaryotes can occur either cotranslationally (Josefsson and Randall, 1981) or posttranslationally (Wu et al., 1983). Cleavage does not appear to be necessary for protein export, as mutants deficient in the cleavage of lipoprotein (Lin et al., 1978), MBP (Ryan et al., 1986a), or the M13 coat protein (Russell and Model, 1981) signal sequences are localized properly.

D. How Much Energy Is Required for Secretion, and Where Does It Come From?

Secreted proteins are usually hydrophilic, but they must cross the hydrophobic membrane to leave the cell. The energy barrier for this process must be lowered or compensated by some mechanism. Engelman and Steitz (1981, 1984), von Heijne (1980b), and von Heijne and Blomberg (1979) have calculated that transfer of the signal sequence in an α helix or 3₁₀ helix from the aqueous medium of the cytoplasm to the apolar region of the membrane is thermodynamically favorable. In the helical hairpin hypothesis, Engelman and Steitz (1981) show that the energy gained by insertion of the signal sequence into the bilayer is sufficient to “pull” an adjacent, more polar, helical segment into the membrane with it. Subsequently, the amino acid residues are translocated vectorially, each passing through the helical structure. Thus, for every amino acid that enters the membrane another leaves, and little energy is required other than that of protein elongation. This scheme requires a tight coupling between protein translation and translocation. Pagès et al. (1978) have also calculated that no energy beyond that required for protein synthesis is necessary for protein secretion in E. coli, if secretion is cotranslational.

The energy requirements of secretion into the ER support the idea that vectorial transport does not require an energy source. Cotranslational secretion is usually observed in the ER system, both in vivo and in vitro. The ER does not have the enzymes needed to generate an electrochemical gradient, and uncouplers and ionophores do not affect the ability of ER microsomes to translocate proteins (Rhoads et al., 1984).
An electrochemical gradient resulting in a membrane potential has been shown to be essential for protein secretion in E. coli (Daniels et al., 1981; Date et al., 1980a,b). Collapse of the transmembrane potential by addition to cells or spheroplasts of the ionophore carbonyl cyanide m-chlorophenylhydrazone, or materials that form ion-permeable pores in the membrane, such as valinomycin or colicins E1 and A, inhibited translocation and processing of several proteins of the outer membrane (LamB, OmpF, OmpA) (Enequist et al., 1981; Pagès and Lazdunski, 1982a,b), periplasm (leucine-isoleucine-valine-binding protein, leucine-specific binding protein, alkaline phosphatase, β-lactamase) (Daniels et al., 1981; Pagès and Lazdunski, 1982a,b), and inner membrane (M13 coat protein) (Date et al., 1980a,b).

The role (and necessity) of the membrane potential has been debated. Wickner (1980) and Daniels et al. (1981) proposed that the membrane potential orients the signal peptide by “electrophoresing” a loop of protein into the membrane. The signal peptide in an α-helical conformation has a net dipole due both to the charged amino-terminal region, and to the alignment of the peptide bonds in the helix. The helix is positive near its amino terminus, and negative near its carboxyl terminus. The E. coli inner membrane has a net positive charge at its outer face, and a net negative charge at its inner face. Thus a helical signal peptide should orient itself within the membrane so that its amino terminus faces the cytoplasm and its carboxyl terminus faces the periplasm. In so doing, it could begin to pull the amino terminus of the mature secretory protein across the membrane.

On the other hand, the membrane potential might be necessary for proper function of one or more proteins of the secretion apparatus (Rhoads et al., 1984). Such a protein might derive from the membrane potential the energy needed for “active transport” of the secreted protein.
Alternatively, an energized membrane might be required to maintain part of the secretion apparatus in an effective conformation.

Another possibility is that the membrane potential is necessary to generate high-energy phosphate compounds, which are then used as energy sources for secretion. Various in vivo studies and in vitro translation/translocation experiments have refuted this idea (Bakker and Randall, 1984; Rhoads et al., 1984; Pagès and Lazdunski, 1982b); it has been noted, however, that high-energy phosphate compounds are required for protein synthesis, and their contribution to the secretion process cannot be ruled out. Chen and Tai (1985) addressed this point by designing experiments to test the energy requirements of posttranslational secretion of the E. coli proteins OmpA and alkaline phosphatase. They found that if adenosine triphosphate (ATP) or a number of other nucleotides were supplied in the absence of a transmembrane potential, the secretory proteins were translocated posttranslationally into E. coli inner-membrane vesicles. In the absence of both the transmembrane potential and added nucleotides, no translocation occurred. Thus, they concluded that ATP is essential for protein translocation and that a possible function of the membrane potential is the generation of ATP for use by the protein-secretion apparatus.

Rhoads et al. (1984) noted that an energy source such as ATP or the membrane potential may be necessary only in posttranslational export processes. Supporting this idea is the finding that ATP is also required in posttranslational translocation in yeast in vitro systems (Hansen et al., 1986; Rothblatt and Meyer, 1986). Also, import of proteins into mitochondria and chloroplasts can occur posttranslationally and requires a transmembrane potential (Hay et al., 1984; Kreil, 1981). As noted above, the ER does not have a transmembrane potential and has usually displayed cotranslational secretion. Cotranslational secretion in E. coli may not require energy in addition to that supplied by protein synthesis (Pagès et al., 1978). It is clear, however, that E. coli requires an energy source for (at least) posttranslational secretion. Thus, there appears to be a correlation between the energy requirements of protein secretion and the coupling of translation and translocation.
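The correlation drawn in the preceding paragraph can be laid out as a small table and checked mechanically. The sketch below encodes only the systems and requirements named in that paragraph; the class, field, and function names are hypothetical, and the entry for cotranslational secretion in E. coli reflects a tentative statement ("may not require energy") rather than an established result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportSystem:
    name: str
    timing: str         # "cotranslational" or "posttranslational"
    extra_energy: bool  # energy needed beyond that of protein synthesis?


# Entries follow the summary paragraph above; comments note the energy source.
SYSTEMS = [
    ExportSystem("ER of higher eukaryotes", "cotranslational", False),
    ExportSystem("yeast in vitro system", "posttranslational", True),          # ATP
    ExportSystem("mitochondria and chloroplasts", "posttranslational", True),  # membrane potential
    ExportSystem("E. coli, posttranslational route", "posttranslational", True),
    # Tentative: the text says only that extra energy "may not" be required.
    ExportSystem("E. coli, cotranslational route", "cotranslational", False),
]


def timing_tracks_energy(systems):
    """True if exactly the posttranslational systems need extra energy."""
    return all((s.timing == "posttranslational") == s.extra_energy for s in systems)


if __name__ == "__main__":
    for s in SYSTEMS:
        print(f"{s.name}: {s.timing}, extra energy required: {s.extra_energy}")
    print("Timing correlates with energy requirement:", timing_tracks_energy(SYSTEMS))
```

A single counterexample, such as a posttranslational system shown to need no added energy, would make the check fail, which is the sense in which the text describes this relationship as an apparent correlation rather than a rule.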
VI. WHAT ARE THE ROLES OF THE SIGNAL SEQUENCE?

As discussed above, the roles of the signal sequence are not yet clearly defined, but its importance is well established. Among the questions raised by existing genetic and biochemical information are: With what components of the secretion apparatus does the signal sequence interact? Does the signal sequence come into direct contact with the membrane lipids? What are the principal factors (e.g., electrostatics, net hydrophobicity, amphiphilicity, conformation) influencing interactions of the signal sequence with the membrane lipids or proteins? These questions have been addressed by various studies of isolated signal sequences and precursor proteins.

A. Studies of Precursor Proteins and Isolated Signal Sequences

Genetic and biochemical data support the idea that the intrinsic nature and properties of signal sequences are related to function, and that it is not necessary to invoke the influence of the rest of the protein to explain at least part of the secretion process. For example, the ability of signal sequences to be transferred from one secreted protein to another while retaining export function argues that interactions of signal sequences with their associated proteins are less important to function than the properties of the signal sequences alone (see Section III,C). In addition, precursor proteins are often detected by their reaction with antibody to the mature protein (Lingappa et al., 1984; Takahara et al., 1985; Palva et al., 1982; Bankaitis et al., 1984; Randall, 1983). As antibodies have been used as conformational probes (Lewis et al., 1983; Berzofsky, 1985), the ability of the precursors to cross-react with antibody to mature protein indicates conformational similarities, and implies that the non-signal sequence portion of the precursor protein folds into the same conformation as the mature protein. This is also supported by the reported activity of the maltose-binding protein precursor (Ferrenci and Randall, 1979) and by its detergent-binding characteristics (Dierstein and Wickner, 1986).

Biophysical and biochemical studies of precursor proteins and isolated signal sequences have provided data on the conformational properties and membrane and protein interactions of signal sequences. The results of these studies, discussed in the following sections, are yielding information on the detailed molecular mechanism of signal-sequence action.
B. Conformational Studies of Signal Sequences

Baty and Lazdunski (1979) used antibodies to demonstrate conformational homology among bacterial signal sequences. Some antibodies raised against the precursor form of E. coli alkaline phosphatase also react with mature alkaline phosphatase. These were removed by affinity chromatography; the remaining antibodies reacted with the alkaline-phosphatase signal sequence. A disproportionate amount of the antibody population raised against the alkaline-phosphatase precursor is directed against the signal sequence, suggesting that this peptide is exposed at the surface of the precursor protein. This conclusion is strengthened by the finding that antibody to alkaline-phosphatase signal sequence also binds the signal sequence of leucine-isoleucine-valine-binding protein. However, the same antibody also bound aminopeptidase, which has no signal peptide, but does have a membrane-spanning region. The authors argue that this finding implies that the signal sequences and the membrane-bound part of aminopeptidase have similar conformations.

More direct determinations of signal-sequence conformations have been obtained by circular dichroism (CD) and infrared (IR) spectroscopy of synthetic signal sequences, signal-sequence fragments, and peptides resembling signal sequences. In polar solvents, especially aqueous systems, CD data indicate that the 23-residue signal sequence of phage M13 coat protein (Shinnar and Kaiser, 1984) (Fig. 6), a 19-residue peptide
[Figure 6 plot not reproduced: CD spectra plotted against wavelength (nm), 180-250 nm.]
FIG. 6. Circular dichroism spectra of synthetic phage M13 signal peptide in pH 2.8 phosphate buffer (-), showing predominantly random coil structure, and upon addition of 33% hexafluoroisopropyl alcohol (---), showing predominantly α-helical structure. Reprinted, with permission, from Shinnar and Kaiser (1984).
similar to that of pretrypsinogen (Austen and Ridd, 1981), and the 25-residue signal sequence of the E. coli λ-receptor protein (Briggs and Gierasch, 1984; Briggs, 1986) adopt largely random conformations. A 29-residue peptide consisting of the “prepro” region of the parathyroid hormone (the 23-residue signal sequence plus the polar 6-residue “pro” segment) adopted a predominantly β conformation (Rosenblatt et al., 1980). All of these synthetic signal peptides, as well as peptides resembling the signal sequences of lysozyme and lipoprotein (Reddy and Nagaraj, 1985), became partially α-helical in polyfluorinated alcohols [trifluoroethanol (TFE) and hexafluoroisopropanol (HFIP)], which are more hydrophobic than water and have been used as models for the membrane interior. These solvents promote the formation of intramolecular hydrogen bonds, and consequently induce α-helix formation. In
contrast to the above results, Katakai and Iizuka (1984) have determined the conformations of synthetic signal-sequence fragments by CD and IR. They reported that peptides containing 13- or 14-residue fragments of three eukaryotic signal peptides (preimmunoglobulin light chain, pretrypsinogen, and pre-β-lactoglobulin) have random conformations in HFIP, become α-helical in mixtures of HFIP and nonfluorinated alcohols, and adopt β structure in aqueous HFIP.

These studies indicate that signal sequences adopt different conformations in polar and apolar environments. In general, the peptides have little structure in aqueous environments. An α-helical conformation is induced in the presence of apolar solvents. Thus, it has been proposed that the signal peptide undergoes a conformational change on passage from the cytoplasm to the membrane (Austen and Ridd, 1981; Rosenblatt et al., 1980; Katakai and Iizuka, 1984). It was also concluded that the active form of the signal sequence adopts an α-helical conformation in the membrane (Shinnar and Kaiser, 1984; Rosenblatt et al., 1980; Katakai and Iizuka, 1984; Reddy and Nagaraj, 1985).

These experiments have not demonstrated that the conformational preferences of signal sequences are important to their ability to export proteins. To address this problem, we synthesized the family of E. coli λ-receptor protein wild-type and mutant signal sequences (shown in Fig. 5 and described in Section III,H) and determined their conformations in various polar and apolar environments by CD (Briggs and Gierasch, 1984; Briggs, 1986). The solvents for these experiments included aqueous buffer and TFE, as described above. In addition, sodium dodecyl sulfate (SDS) micelles and phospholipid vesicles were used as membrane model systems. The conformations of the LamB signal sequence family are summarized in Table II. All of the LamB signal peptides are predominantly random in aqueous solution.
In the other solvent systems, including those designed to mimic the membrane, the functional signal peptides take on more regular structures, while the nonfunctional deletion-mutant peptide remains mostly random. The functional peptides have a strong tendency to form an α helix, but can also adopt a significant amount of β structure in some solvents. Thus, there is a clear correlation between the conformational tendencies of the signal sequences and their abilities to function in vivo. These findings support the idea that the nature and properties of the isolated signal sequence are related to function, and that it is not necessary to invoke the influence of the rest of the protein to explain at least part of the secretion process. Our CD studies in water and SDS show that the nonfunctional deletion-mutant signal peptide has a significantly lower tendency to form an α
TABLE II
LamB Signal-Peptide Conformations (a, b)

Peptide and solvent              % α Helix   % β Structure   % Random

Wild type
  Buffer                              5            15             80
  40 mM SDS                          60            10             30
  50% TFE                            40            15             45
  POPE/POPG vesicles                 60            25             15
Deletion mutant
  Buffer                              5            15             80
  40 mM SDS                          20            15             65
  50% TFE                            30             0             70
  POPE/POPG vesicles                 30             0             70
Gly → Cys pseudorevertant
  Buffer                             10            10             80
  40 mM SDS                          35            10             55
  50% TFE                            35             5             60
  POPE/POPG vesicles                 40            15             45
Pro → Leu pseudorevertant
  Buffer                             10            10             80
  40 mM SDS                          60            25             15
  50% TFE                            45            15             40
  POPE/POPG vesicles                 95             0              5

(a) Based on curve-fitting of circular dichroism spectra [from Briggs (1986)] using reference spectra from Greenfield and Fasman (1969).
(b) Buffer, 5 mM Tris, pH 7.3; SDS, sodium dodecyl sulfate; TFE, trifluoroethanol; POPE/POPG, 1-palmitoyl-2-oleoylphosphatidylethanolamine/1-palmitoyl-2-oleoylphosphatidylglycerol, 65 : 35.
helix than do the functional wild-type or revertant peptides. These results support the proposal of Emr and Silhavy (1983) that functional signal sequences must adopt an α-helical conformation at some point during secretion. Conformational analysis suggests that the presence of proline and glycine separated by only three residues in the deletion mutant disrupts the helix-forming potential of the signal's hydrophobic core. The pseudorevertants, in which one of these residues is replaced by a helix-promoting residue while maintaining the same length as the deletion-mutant signal peptide, adopt a helical conformation in micellar SDS. It should be noted that these pseudorevertants are those predicted by Bedouelle and Hofnung (1981b) based on their HAL (see Section III,F). CD spectra in lysolecithin micelles indicate that the functional
signal-peptide fragments also adopt a significant amount of β structure, but that the nonfunctional deletion-mutant peptide does not (Briggs and Gierasch, 1984). This correlation, like the conformational predictions discussed above, raised the possibility that the signal sequence may have to take on β structure or some other non-α-helical secondary structure during the secretion process.
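The correlation between helix content and function drawn from Table II can be checked with a short tabulation. The sketch below transcribes the % α helix column of Table II and compares each functional peptide with the deletion mutant in the three membrane-mimetic media. The dictionary layout, the function names, and the use of a mean as a summary are assumptions made for illustration; they are not part of the original analysis.

```python
# Percent alpha helix transcribed from Table II (peptide -> solvent -> %).
HELIX = {
    "wild type":       {"buffer": 5,  "SDS": 60, "TFE": 40, "vesicles": 60},
    "deletion mutant": {"buffer": 5,  "SDS": 20, "TFE": 30, "vesicles": 30},
    "Gly->Cys":        {"buffer": 10, "SDS": 35, "TFE": 35, "vesicles": 40},
    "Pro->Leu":        {"buffer": 10, "SDS": 60, "TFE": 45, "vesicles": 95},
}

# Export function in vivo as stated in the text: only the deletion mutant fails.
FUNCTIONAL = {
    "wild type": True,
    "deletion mutant": False,
    "Gly->Cys": True,
    "Pro->Leu": True,
}

MIMETIC_MEDIA = ("SDS", "TFE", "vesicles")  # non-aqueous environments


def mean_helix_nonaqueous(peptide):
    """Mean % helix over the membrane-mimetic media (illustrative summary only)."""
    row = HELIX[peptide]
    return sum(row[m] for m in MIMETIC_MEDIA) / len(MIMETIC_MEDIA)


def functional_exceed_mutant_everywhere():
    """True if every functional peptide is more helical than the mutant in each medium."""
    mutant = HELIX["deletion mutant"]
    return all(
        HELIX[p][m] > mutant[m]
        for p in HELIX if FUNCTIONAL[p]
        for m in MIMETIC_MEDIA
    )


if __name__ == "__main__":
    for name in HELIX:
        print(f"{name:16s} functional={FUNCTIONAL[name]!s:5s} "
              f"mean helix={mean_helix_nonaqueous(name):.1f}%")
    print("functional > mutant in every medium:", functional_exceed_mutant_everywhere())
```

With the values as tabulated, each functional peptide is more helical than the deletion mutant in SDS, TFE, and vesicles, while all four peptides are nearly indistinguishable in buffer.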
C. Interactions with Lipids

There is biological evidence for the interaction of the signal sequence with lipids and with proteins of the secretion apparatus [for example, see the experiments of DiRienzo and Inouye (1979), Rhoads et al. (1984), and the Pagès group (Pagès et al., 1978; Pagès and Lazdunski, 1981), in which alteration of the membrane lipid fluidity disrupted protein secretion, as discussed in Section IV,A]. Biochemical and biophysical studies using isolated signal peptides also suggest that the signal sequences interact with both lipids and proteins (e.g., SRP and/or a signal-peptide receptor; see Section IV,D) during the export process.

Interaction of synthetic signal sequences with lipid vesicles has been observed in three laboratories. Nagaraj (1984) synthesized fragments of the chicken lysozyme signal sequence labeled at the amino terminus with the fluorescent dansyl (5-dimethylamino-1-naphthalenesulfonyl) group. Short (3-8 residues) fragments of the signal sequence show increased fluorescence intensity and a blue shift in the emission maximum when small unilamellar vesicles are added to an aqueous solution of peptide, indicating that the peptides are bound to the vesicles. Longer peptides (9 and 12 residues) had fluorescence spectra characteristic of an aggregated sample. The fluorescence intensity did not increase on addition of vesicles to these peptides, nor did the emission maximum shift. However, based on the effect of temperature on the fluorescence polarization, the author concluded that all of the peptides bind to the bilayers and suggested that the binding of the longer peptides does not affect the emission spectrum since the environment of the dansyl group may not differ significantly in aggregates and in the vesicles. Shinnar and Kaiser (1984) found that addition of the M13 coat protein signal sequence to small unilamellar vesicles caused aggregation, as judged by increased light scattering of the sample.
The functional LamB signal peptides also induce vesicle aggregation at peptide to lipid ratios of about 1 : 50, while the nonfunctional deletion-mutant signal peptide does not cause aggregation, even at ratios of 1 : 10 (Briggs et al., 1985). Although the reasons for vesicle aggregation and fusion are unclear, it is apparent that there is some interaction of the signal peptide with the lipids. These observations, together with the results of Nagaraj and our
studies of signal-peptide insertion into phospholipid monolayers described below, indicate that signal sequences partition into lipid environments spontaneously. IR spectroscopy of the λ-receptor protein signal peptide in phospholipid monolayers shows that the peptide affects the packing of the lipid hydrocarbon tails (M. S. Briggs, R. A. Dluhy, D. G. Cornell, and L. M. Gierasch, unpublished results). In samples formed at the same surface pressure, the lipid tails are oriented differently in the presence and absence of signal peptide. A phospholipase assay for structural defects in phospholipid bilayers (Jain et al., 1984) indicates that the λ-receptor protein signal peptide interacts with vesicles to induce such defects. The peptides perturb the lipid structure at lower mole fractions than do various lysophospholipids. These data provide yet another indication that signal peptides interact with and perturb lipid complexes.

However, Bendzko et al. (1982) found that preproinsulin does not bind to small vesicles of dimyristoylphosphatidylcholine (DMPC) or to smooth microsomal membranes, but does bind to rough microsomal membranes. In contrast, cytochrome b5, which has an N-terminal insertion sequence rather than a signal sequence, binds to rough and smooth microsomes and to DMPC vesicles. Binding of preproinsulin to rough microsomes is abolished by prior treatment of the microsomes with protease; protease treatment does not affect cytochrome b5 binding. These observations suggest that binding of preproinsulin to membranes is mediated by a specific proteinaceous receptor and not by spontaneous dissolution of the signal sequence in the bilayer. This result is in conflict with those described above. A possible explanation for this discrepancy is the following: In the studies that found interactions of signal sequences with lipid, isolated signal sequences were used. In the experiments that show no interaction with the membrane, an entire precursor protein was studied.
Synthetic signal sequences are very hydrophobic and can be quite insoluble, either precipitating (Katakai and Iizuka, 1984) or aggregating (Nagaraj, 1984) in aqueous solution. Thus the results of Nagaraj (1984) and Shinnar and Kaiser (1984) may be explained as a favorable partitioning of the signal peptide into a hydrophobic medium. In contrast, precursor proteins are often water soluble, due to the presence of the hydrophilic mature protein, and may partition more strongly into water than do isolated signal peptides. Thus, partitioning of a precursor protein into a bilayer may be less favorable. It may bind loosely via an exposed signal peptide, but a tight association with the membrane may require an interaction of the signal peptide with a specific receptor. This then explains the results of Bendzko et al. (1982) (no binding to phospholipid vesicles, but binding to ER microsomes). In cases of cotranslational secretion, the entire precursor is not found in the cytoplasm; thus studies of the isolated signal sequence may be more relevant to the situation in vivo (viz. signal sequences emerging from a ribosome/SRP complex).

The question remains from these experiments whether the association of isolated signal sequences with lipids has functional significance. We have studied the lipid interactions of three synthetic signal peptides from the previously described family of functional and nonfunctional E. coli λ-receptor protein signal sequences by surface tensiometry. Phospholipid monolayers have been used to simulate the membrane bilayer (Colacicco, 1970; Verger and Pattus, 1982; Rothfield and Fried, 1975; Fendler, 1982). A surface-active species such as a detergent or an amphiphilic peptide or protein dissolved in the aqueous phase beneath the monolayer can enter the monolayer and interact with the lipids at both the head groups and the hydrocarbon chains. Its insertion into the monolayer is indicated by an increase in the surface pressure or the surface area of the monolayer (Pethica, 1955). The result is a mixed monolayer. Adsorption of a solute to the lipid head groups without penetration into the hydrophobic region can also cause a change in surface pressure or surface area by electrostatic effects, but the change is generally smaller (1-2 versus 8 or more dyn/cm) than is observed in cases of insertion (Mayer et al., 1983).

Insertion of the signal peptides into the monolayer was studied by monitoring the increase in surface pressure at constant area (Briggs et al., 1985; Briggs, 1986). The increase in surface pressure depends on the concentration of the signal peptide in the subphase. At low concentrations, the surface pressure changes very little; as concentration increases, the surface pressure rises to a plateau, indicating saturation. The dependence of the surface pressure change on peptide concentration is shown in Fig. 7.
The relative affinities of the signal peptides for the phospholipid monolayer were determined by measuring their critical insertion pressures. The tendency of a surface-active molecule to penetrate a monolayer of another is inversely proportional to the initial surface pressure of the monolayer (Verger and Pattus, 1982; Phillips and Sparks, 1980). The lipid monolayer pressure above which the penetrating molecule no longer inserts is called the critical pressure of insertion. It is obtained by measuring the dependence of the surface pressure increase on the initial monolayer surface pressure and extrapolating to a pressure increase of zero (Fig. 8). In both experiments, the wild-type and Pro → Leu pseudorevertant signal peptides behave very similarly. Particularly striking is the finding that the critical insertion pressure is 38 dyn/cm for both functional signal peptides. In contrast, the critical insertion pressure of the deletion-mutant signal peptide is 26 dyn/cm. For purposes
[Figure 7: surface pressure increase plotted against -log [peptide], 4.8-8.8.]
FIG. 7. The increase in surface pressure of phospholipid monolayers as a function of signal-peptide concentration for the various E. coli LamB synthetic signal sequences (from Briggs, 1986). A monolayer of egg phosphatidylethanolamine and egg phosphatidylglycerol (65 : 35) was spread from a benzene solution onto 5 mM Tris buffer, pH 7.3, yielding a final surface pressure of 20 dyn/cm after evaporation of the benzene. The peptide was added by injecting a concentrated solution below the lipid-water interface. The surface pressure was measured by the du Noüy ring method with a Fisher Autotensiomat equipped with a platinum-iridium ring. The plateau values are plotted as a function of the peptide concentration for the wild-type (0), Pro → Leu pseudorevertant (A), and deletion-mutant (0) peptides.
[Figure 8: surface pressure increase plotted against initial surface pressure πi (dyn/cm), 0-40.]
FIG. 8. Determination of critical pressures of insertion of synthetic E. coli LamB signal peptides (from Briggs, 1986). A monolayer of 1-palmitoyl-2-oleoylphosphatidylglycerol and 1-palmitoyl-2-oleoylphosphatidylethanolamine was spread as described in Fig. 7 to yield the desired initial surface pressure. Peptide was injected below the lipid surface to a final concentration of 1 μM for the wild-type and Pro → Leu pseudorevertant peptides, and 2 μM for the deletion-mutant peptide. Surface pressure plateau values (Δπ) are plotted versus the initial surface pressure (πi) for wild-type (0), Pro → Leu pseudorevertant (A), and deletion-mutant (0) peptides.
of comparison, it is notable that the equivalent surface pressures of cell membranes are estimated to fall between these values: viz., near 34 dyn/cm (van Zoelen et al., 1977; Ter-Minassian-Saraga, 1979; Quinn and Dawson, 1969; Jackson et al., 1979). Thus, it appears that the capability of inserting into a lipid phase at physiological surface pressure is related to the intrinsic properties of the native functional signal sequences. These results show that the ability of these signal peptides to interact with phospholipid monolayers indeed correlates with their in vivo activity.

The pressure increases due to the functional signal peptides (8-11 dyn/cm) are in the same range as those caused by proteins known to insert into monolayers (Bougis et al., 1981). In contrast, prothrombin, which binds only to the membrane surface, causes a pressure increase in a phospholipid monolayer of 1.9-2.3 dyn/cm (Mayer et al., 1983). These values are almost identical to those obtained for perturbation of the monolayer by the deletion-mutant signal peptide. These findings suggest that the functional and nonfunctional signal peptides interact with the monolayer in different ways. The functional signal peptides insert into the hydrocarbon region of the monolayer, while the nonfunctional peptide binds only to the head groups. In a high-salt buffer (5 mM Tris, 0.15 M NaCl, pH 7.3), the interaction of the deletion-mutant signal peptide fragment with the monolayer is abolished (Briggs et al., 1985), implying that the binding forces are electrostatic in nature. High salt also affects the adsorption of the functional signal peptide fragments. The pressure change decreases by about 5 dyn/cm. Thus, the interactions of the functional signal peptides with phospholipid monolayers are both hydrophobic and electrostatic.
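The extrapolation used to obtain a critical insertion pressure (Fig. 8) amounts to fitting a straight line to the surface pressure increase as a function of initial surface pressure and finding where that line reaches zero. A minimal sketch follows. The data points are invented for illustration (they were chosen to lie on a line that reaches zero at 38 dyn/cm) and are not the measured values; the function name is likewise hypothetical.

```python
def critical_insertion_pressure(initial_pressures, pressure_increases):
    """Fit delta_pi = slope * pi_i + intercept by least squares and
    return the initial pressure at which delta_pi extrapolates to zero."""
    n = len(initial_pressures)
    if n < 2 or n != len(pressure_increases):
        raise ValueError("need two or more paired measurements")
    mean_x = sum(initial_pressures) / n
    mean_y = sum(pressure_increases) / n
    sxx = sum((x - mean_x) ** 2 for x in initial_pressures)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(initial_pressures, pressure_increases))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        # Insertion should become harder as the monolayer is compressed.
        raise ValueError("pressure increase should fall as initial pressure rises")
    return -intercept / slope


if __name__ == "__main__":
    # Illustrative (not measured) values, in dyn/cm.
    pi_initial = [10.0, 20.0, 30.0]
    delta_pi = [14.0, 9.0, 4.0]
    print(critical_insertion_pressure(pi_initial, delta_pi))
```

A peptide whose fitted line crosses zero above the equivalent surface pressure of a cell membrane would be expected to insert at physiological pressure; one whose line crosses below it would not.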
The critical insertion pressures yield a rough measurement of the point at which the forces favoring transfer of the peptide from the subphase to the monolayer are balanced by the compressional forces opposing the addition of material to the surface. The critical pressures can be multiplied by the cross-sectional area per peptide molecule, if known, to provide estimates of the energies of insertion of the peptides into the monolayer. The areas of the inserted peptides can be assumed to be between 120 and 450 Å², which represent the extremes of vertical and horizontal orientations of a helical signal peptide in the monolayer, as measured from models. The insertion energies of the functional signal peptides are thus nearly double that of the nonfunctional signal peptide, assuming the same orientation. Assuming that the cell membrane’s equivalent surface pressure is about 30 dyn/cm, one can estimate the excess energy (i.e., energy in addition to that required for peptide insertion) available from this favorable signal peptide-membrane interaction to be 1.5-5 kcal/mol, depending on orientation.* This suggests that signal peptide-lipid interactions contribute significantly to lowering energy barriers to protein translocation.

The correlation of surface activity with function among these peptides may arise because of the different tendencies of these peptides to adopt secondary structures, such as the α helix, that minimize free (i.e., non-hydrogen-bonded) amide groups, and thus enhance the effective hydrophobicity of the uncharged core region and the amphiphilicity of the signal peptide overall. The distinctions among the peptides are most clearly illustrated by comparing the pseudorevertant and deletion-mutant peptides. Although the hydrophobicities of their side chains are nearly equal, their lengths are the same, and their charged residues are identical, the pseudorevertant has a greater propensity to form secondary structure and interacts more strongly with phospholipid monolayers than does the deletion mutant. It is clearly desirable to correlate the conformations of signal peptides with their behavior in phospholipid environments.

D. Conformations of Isolated Signal Sequences in Membranes

The experiments described in Sections VI,A,B show that two physical properties of the synthetic LamB signal peptides correlate with their in vivo export function: tendency to adopt an α-helical conformation in hydrophobic environments, and tendency to insert into lipid monolayers. These properties may be involved in the same step in the secretion process, or in different steps. An α-helical conformation may be required to generate a structure sufficiently hydrophobic to allow monolayer insertion. Alternatively, these properties may reflect separate roles of the signal sequence in protein secretion. For instance, an α-helical conformation may be necessary for binding to a proteinaceous site, while the ability to interact with lipids may be important for another step in the secretion process.
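The excess-energy estimate quoted earlier for the functional signal peptides (a surface pressure difference multiplied by a molecular area) can be reproduced with a short unit conversion. The sketch below assumes the figures given in the text: a critical insertion pressure of 38 dyn/cm, a membrane equivalent pressure of 30 dyn/cm, and molecular areas of 120 and 450 Å². The function and constant names are illustrative, and the conversion factors are standard (1 dyn/cm = 10⁻³ N/m, 1 Å² = 10⁻²⁰ m², 1 kcal = 4184 J).

```python
AVOGADRO = 6.02214076e23        # molecules per mole
JOULES_PER_KCAL = 4184.0        # thermochemical kilocalorie
DYN_PER_CM_TO_N_PER_M = 1e-3    # 1 dyn/cm = 1 mN/m
SQ_ANGSTROM_TO_SQ_M = 1e-20     # 1 A^2 = 1e-20 m^2


def insertion_energy_kcal_per_mol(pressure_dyn_per_cm, area_sq_angstrom):
    """Work = surface pressure x molecular area, expressed per mole in kcal."""
    pressure_si = pressure_dyn_per_cm * DYN_PER_CM_TO_N_PER_M   # N/m
    area_si = area_sq_angstrom * SQ_ANGSTROM_TO_SQ_M            # m^2 per molecule
    joules_per_molecule = pressure_si * area_si
    return joules_per_molecule * AVOGADRO / JOULES_PER_KCAL


if __name__ == "__main__":
    excess_pressure = 38.0 - 30.0  # dyn/cm, critical minus membrane-equivalent pressure
    for area in (120.0, 450.0):    # vertical and horizontal helix orientations
        energy = insertion_energy_kcal_per_mol(excess_pressure, area)
        print(f"{area:5.0f} A^2 -> {energy:.2f} kcal/mol")
```

Under these assumptions the two orientations give roughly 1.4 and 5.2 kcal/mol, which brackets the 1.5-5 kcal/mol range stated in the text.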
* The critical insertion pressure of the functional signal peptides is 38 dyn/cm, while the estimated equivalent pressure of biological membranes is only 30 dyn/cm. The difference between these values, 8 dyn/cm, can be multiplied by the peptide’s assumed molecular area to yield an estimate of the work that can be done by the peptide on insertion.

We have studied the conformations of the synthetic LamB signal peptides in phospholipid vesicles and monolayers by CD and IR spectroscopy. The CD spectra of the synthetic signal peptides in vesicles (described in Section VI,A) confirm that the functional signal peptides indeed adopt an α-helical conformation in a bilayer, while the nonfunctional peptide does not. Thus, the tendency of the signal peptides to take on an
α-helical conformation appears to be directly related to their tendency to interact with lipid structures.

The surface tensiometry experiments described in Section VI,C indicate that the synthetic LamB signal peptides can interact with phospholipid monolayers in two ways: by binding electrostatically to the head groups of the lipid and by inserting into the hydrocarbon region of the monolayer. By selection of the surface pressures at which the monolayer CD and IR samples are prepared, it is possible to determine the signal peptide’s conformation in each of these two binding modes. Two sets of samples were prepared. One set, in which the monolayer is spread at a surface pressure higher than the peptide’s critical insertion pressure, allows electrostatic binding, but prevents monolayer insertion. The other set of samples is prepared with the monolayer initially at a surface pressure lower than the critical insertion pressure; these conditions allow both insertion and electrostatic interactions. The procedure for depositing peptide/lipid monolayers on quartz plates or attenuated total reflectance (ATR) crystals for spectroscopy has been described by Cornell (1979, 1982).

The CD and IR spectra of the wild-type signal peptide in the presence of phospholipid monolayers are shown in Figs. 9 and 10. The CD spectrum of the peptide interacting with a monolayer below its critical insertion pressure fits to a conformation of 30% α helix and 70% β structure. The IR spectrum confirms the presence of both α helix and β structure. This sample contains both peptide that is inserted into the monolayer and peptide that is electrostatically bound to the head groups, in unknown proportions. The electrostatically bound peptide alone is best fit as predominantly β structure, as judged from the CD spectrum of the peptide interacting with the monolayer at high pressure.
The IR spectrum of this sample also shows a largely β structure, and furthermore indicates that the peptide is highly oriented (Briggs et al., 1986). Since the CD spectrum of the low-pressure peptide/lipid monolayer contains contributions from both inserted and electrostatically bound peptide, and the electrostatically bound peptide is nearly entirely β structure, the inserted peptide is at least 30% α-helical, and probably more. For example, if half of the peptide in the low-pressure monolayer is inserted, and half is electrostatically bound, the inserted peptide is 60% α-helical and 40% β structure. There is a clear difference in the peptide’s conformation between the electrostatically bound and inserted modes of interaction with the monolayer. If these conformations also correlate with activity in vivo (experiments in progress), the difference may indicate that the signal peptides undergo a conformational change from β structure to α helix during the initial stages of protein secretion (see Section VI,A).
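The reasoning behind the "at least 30%, and probably more" estimate is a two-component mixture: the observed helix content is the average of the inserted and adsorbed populations, weighted by the share of each. The sketch below solves that balance for the inserted peptide. It assumes, as the text does, that the adsorbed peptide contains no helix; the function name and the parameter `inserted_share` are hypothetical, since the actual proportions are stated to be unknown.

```python
def inserted_helix_fraction(observed_helix, inserted_share, adsorbed_helix=0.0):
    """Solve observed = f * inserted + (1 - f) * adsorbed for the helix
    fraction of the inserted peptide, where f is the inserted share."""
    if not 0.0 < inserted_share <= 1.0:
        raise ValueError("inserted_share must be in (0, 1]")
    value = (observed_helix - (1.0 - inserted_share) * adsorbed_helix) / inserted_share
    if not 0.0 <= value <= 1.0:
        raise ValueError("no physical solution for this inserted share")
    return value


if __name__ == "__main__":
    observed = 0.30  # helix fraction fitted to the low-pressure monolayer CD spectrum
    for share in (1.0, 0.75, 0.5):
        helix = inserted_helix_fraction(observed, share)
        print(f"inserted share {share:.2f} -> inserted peptide {helix:.0%} helix")
```

If all of the peptide were inserted, the helix content of the inserted form would equal the observed 30%, which is the lower bound; an inserted share of one half gives the 60% quoted in the text, and shares below 30% are excluded because they would require more than 100% helix.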
[Figure 9: CD spectra; horizontal axis, wavelength λ (nm), 190-250.]
FIG. 9. CD spectra of the wild-type E. coli LamB synthetic signal peptide in phospholipid monolayers. The experiment was carried out as described in Figs. 7 and 8. The solid line spectrum was obtained for films spread at pressures below the peptide’s critical pressure of insertion, and the broken line spectrum for films spread above the peptide’s critical pressure of insertion. Hence, the former represents inserted plus adsorbed peptide, and the latter is from adsorbed peptide only. Experimental details are reported in Briggs et al. (1986). Copyright 1986 by the American Association for the Advancement of Science.
An even more striking comparison can be made between the wild-type signal peptide’s conformation when adsorbed to the monolayer and its conformation in aqueous solution. In both of these environments, the peptide should be solvated by water, but its conformations are very different. The peptide is 100% β structure when adsorbed to the monolayer, while it is 80% random in aqueous buffer. Thus, it appears that contact with the lipid surface induces substantial amounts of secondary structure in a molecule that takes on little structure in an aqueous environment. This finding implies that the initial binding of a signal sequence to a membrane may induce a particular structure, which may be important to the mechanism of signal-sequence function.
[Figure 10: IR spectra, panels A and B; horizontal axis, wavenumber (cm⁻¹).]
FIG. 10. IR spectra of the wild-type E. coli LamB synthetic signal peptide in phospholipid monolayers. (A) Peptide adsorbed to the monolayer (film formed above the critical insertion pressure of the peptide). (B) Peptide adsorbed and inserted (film formed below the critical insertion pressure). Characteristic amide I bands for α-helix (or random coil): 1660 cm⁻¹; for β-structure: 1630 cm⁻¹. The amide III band (at lower frequencies) was used to confirm that the 1660 cm⁻¹ band was due to helix and not coil. Experimental details are reported in Briggs et al. (1986). Copyright 1986 by the American Association for the Advancement of Science.
The CD spectra of the peptides in vesicles and monolayers confirm that the functional signal peptides adopt an α-helical conformation in the lipid phase. The monolayer CD work has allowed dissection of the modes of interaction of the wild-type signal peptide with lipids. When not inserted, the wild-type signal peptide adopts a β structure; as the signal peptides have predominantly random conformations in water, it is likely that interaction with the lipid surface induces this secondary structure. When inserted into the monolayer the wild-type peptide adopts an α-helical conformation. In fact, the affinity of the signal sequences for lipids may be due to their ability to become α-helical. Adoption of an α-helical structure minimizes the surface exposure of the amide groups and thus enhances the hydrophobicity of the core region, and the amphiphilicity of the signal region overall.

E. Interactions with Proteins

The results described above in no way diminish the probability that various proteins are necessary for protein secretion, and that some of these proteins interact with signal sequences. Isolated signal sequences and precursor proteins have been shown to bind to proteins in the RER and to inhibit the translocation and processing of secretory proteins. For example, experiments of Rapoport and colleagues (Bendzko et al., 1982; Prehn et al., 1980, 1981) demonstrating the existence of a signal-peptide receptor in RER membranes were mentioned above (Section III,C,4). Precursors of secretory proteins were shown to bind to specific and saturable signal-peptide receptors in the RER membrane (Prehn et al., 1980). Binding is independent of ribosomes and is eliminated by prior treatment of the membrane with protease. Precursors of carp proinsulin and human placental lactogen competed for the receptors, while nonsecretory proteins did not, showing that binding is saturable and specific for exported proteins (Prehn et al., 1981).
It was shown that the receptor sites determined in these studies of previously synthesized proteins are identical to those used during cotranslational export by experiments in which precursor proteins were bound to ER membranes before addition of a cell-free translation system. The newly synthesized nascent chains were not translocated or processed, presumably because the signal receptors were blocked by the bound precursors (Prehn et al., 1981). Synthetic signal sequences have also been found to bind to RER membranes. Habener et al. (1978) reported that the chemically synthesized signal sequence of bovine proparathyroid hormone associated with RER microsomes to a greater extent than does a peptide fragment of the mature proparathyroid hormone. Austen and colleagues (Austen and Ridd, 1983; Austen et al., 1984; Robinson et al., 1985) have designed and
synthesized a “consensus” signal peptide that incorporates the common structural features of known signal sequences. The signal peptide associated with RER membranes from which bound ribosomes had been removed. It did not bind to smooth ER membranes. A control peptide did not bind to microsomal membranes. Binding of the signal peptide was tight (Kd = 1 X lo-’ M) and saturable. Robinson et al. (1985) have identified the receptor as a 45,000-Da microsomal protein by cross-linking experiments. The protein is released from the membrane only by high concentrations of detergents, indicating that this is an integral membrane protein. Austen et al. (1984) found that the consensus signal peptide inhibits protein translocation into RER microsomes. Synthetic preproparathyroid hormone signal sequence also decreased the translocation and processing of four prehormones in a cell-free translation system (Majzoub et al., 1980). Synthesis of the prehormones appears to be unaffected. Addition of a control peptide failed to inhibit processing. A synthetic signal peptide also affected secretion in vivo. Koren et al. (1983) injected synthetic mouse light-chain immunoglobulin signal sequence into Xenopus oocytes. The peptide competitively inhibited localization of exported and membrane proteins. Cytoplasmic proteins were unaffected. The inhibition is time-dependent, rising to a maximum of 40% at 1 hour after injection, and returning to zero after 3 hours. No inhibition was observed when signal peptide was allowed to react with anti-signal peptide antibodies prior to injection into the cell. These observations are consistent with blockage of export sites or SRP binding by the signal peptide, followed by its degradation either in the cytoplasm or in the membrane. The signal peptide also affected the rate of secretion of proteins that have been translocated into the ER. 
The transfer of the proteins from the ER to the Golgi apparatus, and via the Palade pathway to the medium, is accelerated. The acceleration is a specific effect of the injected signal peptide, as injection of detergents or other peptides failed to accelerate secretion. Prior treatment of the signal peptide with anti-signal peptide antibody also abolished the acceleration of secretion. The authors suggest that the signal peptide has a regulatory role in the posttranslocational steps of protein secretion.

The ability of the LamB synthetic signal sequences to inhibit protein translocation in vitro correlates with their activity in vivo (L. Chen and P. C. Tai, personal communication). The wild-type and mutant E. coli LamB signal sequences described above (Section III,H) were added to the cell-free translation/translocation system of Chen et al. (1985). The wild-type peptide blocks translocation of OmpA and alkaline phosphatase; 50% inhibition is reached at a peptide concentration of 1-2 μM.
The Pro → Leu pseudorevertant peptide, which is functional in vivo, is at least as effective as the wild type at inhibiting translocation. The other functional signal peptide, the Gly → Cys pseudorevertant, has somewhat less inhibitory effect. In contrast, the nonfunctional deletion-mutant signal peptide has little effect on translocation, even at ~4 µM. Since addition of the signal peptides after the precursor proteins have been translocated into membrane vesicles does not expose the mature protein product to added protease, these inhibitory effects are not due to membrane disruption.
VII. RECAPITULATION
From the work reviewed in the above sections, we can summarize what is known regarding roles of the signal sequence. In eukaryotic systems, secretion of a protein across the ER requires the presence of a functional signal sequence and participation of SRP, the membrane and associated as yet undefined export apparatus, the SRP receptor, and a signal peptidase. Biochemical evidence derived from experiments in in vitro translocation systems generally supports the signal hypothesis (see Section II,C). Although there is less biochemical information on secretion in prokaryotes, genetic data indicate that the process is likely to be similar to that in eukaryotes. The development of in vitro translocation assays for bacterial systems (Müller and Blobel, 1984a,b; Rhoads et al., 1984; Chen et al., 1985) should allow a more detailed analysis of the biochemistry of prokaryotic secretion in the near future.
For purposes of discussing the roles of the signal sequence, the mechanism of secretion can be viewed as separable into three parts. [Others have suggested similar divisions of the secretion process. See, for example, Randall and Hardy (1984a,b).] First, the complex consisting of the ribosome, mRNA, and the nascent secretory protein must be directed to the appropriate membrane.
The primary actor in this step in eukaryotes is the SRP, which serves as a delivery system (viz., interaction with SRP enhances the probability and rate of formation of a productive association with the membrane). The SRP dissociates from the synthetic machinery once this step is complete, and is not needed for subsequent parts of the secretion process. One SRP molecule may, in fact, be sufficient to target several adjacent polysomal ribosomes to a membrane as a group. Thus, the ratio of SRP to ribosomes synthesizing secretory proteins is less than one, as noted by Gilmore et al. (1982). The role of the signal sequence in this step is probably confined to
labeling the nascent protein as a secretory one, and allowing recognition and binding by SRP. However, it may also perform a regulatory function, as the differences in signal sequences could result in varying affinities for SRP, which would in turn translate into nonuniform rates of protein secretion.
Once the synthetic machinery/nascent protein complex has been delivered to the membrane by SRP, it interacts with the membrane to form a translocation-competent species. It is in this process that we envision the signal sequence to play its most active and crucial part. In Section VIII we propose a model for the initial interactions of a signal sequence with the membrane. These interactions bring the signal sequence and nascent protein into the proper orientation for membrane binding and effect the initial entry of the protein into the membrane interior, readying it for the next step, which is translocation.
The translocation step is probably the point at which prokaryotic and eukaryotic secretion differ most. The energy for this process may derive from different sources: from the energy of protein synthesis in eukaryotes, and from protein synthesis and/or ATP hydrolysis and/or the membrane potential in prokaryotes. [In fact there is evidence for more than one secretion pathway in E. coli. The degree of coupling between translation and translocation may also be different in prokaryotes and eukaryotes (Section V,C).] We predict that the signal sequence plays only a minor part in the translocation process. It may serve as an anchor to keep the ribosome in contact with the membrane. In some cases, however, it has been shown that removal of the signal sequence by signal peptidase occurs before translocation is complete (Sections IV,B and V,C), implying that the presence of a signal sequence is not required in this step. Posttranslational or domain translocation (Section V,C) requires a more active role for the signal sequence.
As first postulated by Wickner (1980), such a mechanism may require that the signal sequence interact with the structural sequence of the protein in order to influence its conformation and membrane binding. Indirect evidence for such an interaction comes from genetic studies of mutations in the signal sequences of the E. coli proteins LamB and MBP that can be suppressed by second-site mutations in the structural gene (Silhavy et al., 1983; Bankaitis et al., 1984). In sum, the most active role played by the signal sequence in the export process appears to be in the second step, where the initial encounter of the export-competent assembly with the membrane occurs. In Section VIII we present a model for this encounter based largely on the biophysical studies presented in Section VI.
FIG. 11. Model for initial interaction of a signal sequence with a membrane. The steps are described in the text. Note that the signal sequence is viewed as emerging from the ribosome into an aqueous environment (step 1), then interacting with the charged surface of the membrane (step 2), and subsequently inserting into the hydrophobic region of the membrane (steps 3 and 4). Conformations adopted by the signal sequence in each of these steps are proposed to be random (aqueous), β-like (membrane surface), and α helical (inserted). A transient state (step 3) is suggested wherein the extended, β-like signal peptide inserts then adopts a helical conformation. Associations with various components of the export apparatus may alter these simplified steps in vivo. Modified from Briggs et al. (1986); copyright 1986 by the American Association for the Advancement of Science.
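The conformations named in this model differ in how much length each residue contributes along the chain axis, which is why a change from a β-like structure to a helix alters how far a segment reaches across the bilayer. The arithmetic can be sketched as follows. The 3.4 Å rise per residue for the extended β-like structure is the value quoted in the description of the model; the 1.5 Å (α helix) and 2.0 Å (3₁₀ helix) rises are standard textbook values, and the 30 Å hydrophobic thickness and the 20-residue segment are illustrative assumptions rather than figures from this review. The function names are hypothetical.

```python
import math

# Rise per residue along the chain axis, in angstroms.
# 3.4 is the value the text uses for the extended beta-like structure;
# 1.5 (alpha helix) and 2.0 (3-10 helix) are textbook values assumed here.
RISE_ANGSTROM = {
    "beta_extended": 3.4,
    "alpha_helix": 1.5,
    "helix_3_10": 2.0,
}

# Assumed thickness of the hydrophobic part of a bilayer (illustrative only).
HYDROPHOBIC_THICKNESS = 30.0


def segment_length(n_residues: int, conformation: str) -> float:
    """Return the end-to-end length (angstroms) of a regular segment."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * RISE_ANGSTROM[conformation]


def residues_to_span(thickness: float, conformation: str) -> int:
    """Smallest whole number of residues whose length reaches `thickness`."""
    return math.ceil(thickness / RISE_ANGSTROM[conformation])


if __name__ == "__main__":
    # One 10-residue arm of the folded beta structure described in the model.
    print("beta arm, 10 residues:", segment_length(10, "beta_extended"), "A")
    # A hypothetical 20-residue signal sequence after conversion to a helix.
    print("alpha helix, 20 residues:", segment_length(20, "alpha_helix"), "A")
    for name in RISE_ANGSTROM:
        n = residues_to_span(HYDROPHOBIC_THICKNESS, name)
        print(f"{name}: {n} residues to span {HYDROPHOBIC_THICKNESS} A")
```

Under these assumptions, a folded β structure with 10-residue arms and the same 20 residues wound into an α helix have similar lengths, both close to the assumed hydrophobic thickness, which is consistent with the helical form spanning the hydrophobic part of the bilayer.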
VIII. A MODEL FOR THE INITIAL INTERACTIONS OF SIGNAL SEQUENCES WITH THE MEMBRANE
Presented below and illustrated in Fig. 11 is a model for the events that occur in vivo when a signal sequence first encounters the membrane.
Step 1. At the membrane, the signal sequence binds electrostatically (via the basic residues) to the inner-membrane surface and adopts a folded β structure with a turn near residues -7 to -10 (Inouye and Halegoua, 1980). This structure has hydrophobic character overall, and an approximate length of 34 Å (10 residues at a rise per residue of 3.4 Å).
Step 2. The signal peptide inserts into the membrane. We propose that the polar c region of the signal sequence resides transiently in the aqueous region of the periplasmic or lumenal side of the membrane.
This arrangement seems physically reasonable, and is supported by genetic data, since charged residues can be tolerated in this region, but not in the adjacent hydrophobic core (Silhavy et al., 1983).
Step 3. The hydrophobic environment of the membrane causes a conformational change to an α- (or 3₁₀) helical conformation, in which intramolecular hydrogen bonding is maximized. As a result of this conformational change, a segment of the mature protein enters the membrane, provided that the basic residues continue to anchor the amino terminus on the inner surface of the membrane. In a helical conformation, the signal sequence spans the hydrophobic part of the bilayer, and the cleavage site falls on the opposite face of the membrane.
This mechanism does not exclude the participation of proteins like the SRP and SRP receptor, as specified in the signal hypothesis. It does, however, require an exposed signal sequence, and thus is probably not consistent with the membrane trigger hypothesis, in which the signal sequence has a central role in the folding of the precursor protein. It can be accommodated in the domain model or the amphiphilic tunnel hypothesis as a first step in a folding process that occurs at or in the membrane. Like the helical hairpin hypothesis and the direct transfer model, it proposes that secretion is driven by a favorable free energy of transfer of the signal sequence from the cytoplasm to the membrane, followed by another low-energy step, the conformational change from a β sheet to an α helix. A β to α transition has been proposed in other models (Austen, 1979; Steiner et al., 1980; Rosenblatt et al., 1980). Thus, several features of this model are not new. For the first time, however, there are experimental data that support these features.
IX. SIGNAL SEQUENCES AS MEMBRANE-INTERACTING SEQUENCES
Signal sequences are characterized by exceptionally diverse and numerous roles; they are essential participants in the multistep process of protein secretion. While many details of this process are still poorly understood, it is clear that the involvement of the signal sequence includes recognition by proteins, interactions with membranes and/or membrane-resident components, facilitation of translocation, and specific cleavability. In these steps, the signal sequence probably undergoes conformational changes which themselves are required features of the mechanism. Yet despite these multifaceted functions of signal sequences, they are highly variable (between species, between related proteins, etc.). Their sequence variability suggests remarkably few constraints on their evolutionary rate of change, whereas the apparent plethora of their functions
would argue for highly constrained evolution. Figure 12 shows a comparison of the divergence rate of the mature chains of insulins from various species and their signal sequences, and illustrates that signal sequences do diverge rapidly with respect to other sequences. The explanation for this apparent paradox is that specific sequences are not required for signal peptide functions; it is instead the character of the signal sequence that must be preserved. Indeed, as discussed in detail in Section III of this review, the hydrophobic, ionic, polar, and conformational nature of the residues occurring in different regions of various signal sequences shows strong homology, and the lengths of the characteristic regions of signal sequences are quite uniform. Table III, showing several insulin signal sequences, demonstrates that the substitutions observed between species are conservative ones.

TABLE III
Signal Sequences of Insulins(a)

Species        Sequence
Human          MALWMRLLPLLALLALWGPDPAAA/
Monkey         MALWMRLLPLLALLALWGPDPAPA/
Dog            MALWMRLLPLLALLALWAPAPTRA/
Rat I          MALWMRFLPLLALLVLWEPKPAQA/
Rat II         MALWIRFLPLLALLILWEPRPAQA/
Hamster        MTLWMRLLPLLTLLVLWEPNPANA/
Chicken        MALWIRSLPLLALLVFSGPGTSYA/
Angler fish    MAALWLQSFSLLVLLVVSWPGSQA/
Salmon         MALWLQAASLLVLLALSPGVDA/
Carp           MAVWIQAGALLFLLAVSSVNA/
Hagfish        MALSPFLAAVIPLVLLLSRAPPSADT/

(a) All sequences from Watson (1984). The cleavage site is indicated by a slash (/).

Signal sequences represent a larger class of polypeptide sequences which share the characteristic of performing their functions by virtue of gross structural features and physical behavior in distinct environments. Other examples most likely include transmembrane sequences, viral fusion sequences, membrane entry sequences in toxins, and signal peptides of mitochondria and chloroplasts. These sequences are unlike segments of globular proteins or hormones or receptors, which are all involved in specific recognition processes (even if only required by packing within the protein structure). What this class of sequence has in common is the capacity to interact with membranes, usually by virtue of combined hydrophobic and electrostatic interactions, and to function by adopting conformations that are environmentally determined.
The characteristics of these membrane-interacting sequences lead to two expectations, both already supported by preliminary evidence. First, since their functions do not depend strongly on specific interactions, one can fruitfully study their behavior using biophysical approaches in model systems that provide gross features of the in vivo environment. Our initial efforts to correlate biophysical studies of synthetic signal sequences with their ability to facilitate export in vivo support this expectation and encourage further work both on signal peptides and on other examples of membrane-interacting sequences. Second, signal sequences and other membrane-interacting sequences should be ready targets for genetic manipulation and de novo design. Many initial forays into these areas have been successfully accomplished (Adams and Rose, 1985; Davis and Model, 1985). It is also possible to alter a given native sequence to localize a normally cytoplasmic protein in a different cellular compartment (Hurt et al., 1984; van den Broek et al., 1985). The idea that multiple signals may exist in one protein, such as opsin (Friedlander and Blobel, 1985), which direct the topology of its incorporation in a membrane, leads to obvious applications in design of desired alterations in topology.
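The claim that the character of a signal sequence is preserved while its exact sequence drifts can be made concrete with a small calculation on rows of Table III. The sketch below compares three of the insulin signal sequences (written without the cleavage-site slash) by positional identity and by the longest stretch free of charged residues. Using the Kyte-Doolittle hydropathy scale, treating D, E, K, and R as the charged residues, and the function names themselves are all assumptions of this illustration, not the authors' analysis.

```python
# Kyte-Doolittle hydropathy values (a standard published scale; choosing it
# here is an assumption of this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residues treated as charged at neutral pH (assumption: His is not counted).
CHARGED = set("DEKR")

# Three rows of Table III, without the trailing cleavage-site slash.
SIGNALS = {
    "human": "MALWMRLLPLLALLALWGPDPAAA",
    "rat_I": "MALWMRFLPLLALLVLWEPKPAQA",
    "hamster": "MTLWMRLLPLLTLLVLWEPNPANA",
}


def percent_identity(a: str, b: str) -> float:
    """Position-by-position identity of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (no alignment is done)")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def longest_uncharged_run(seq: str) -> str:
    """Longest contiguous stretch containing no D, E, K, or R."""
    best = current = ""
    for residue in seq:
        current = "" if residue in CHARGED else current + residue
        if len(current) > len(best):
            best = current
    return best


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy over a sequence."""
    return sum(KD[r] for r in seq) / len(seq)


if __name__ == "__main__":
    human = SIGNALS["human"]
    for name, seq in SIGNALS.items():
        core = longest_uncharged_run(seq)
        print(
            f"{name}: identity to human {percent_identity(human, seq):.1f}%, "
            f"uncharged core {core} (mean hydropathy {mean_hydropathy(core):.2f})"
        )
```

In this comparison the sequences differ at several positions, yet each keeps a single basic residue near the amino terminus followed by an uncharged, strongly hydrophobic stretch of similar length, which is the kind of conserved gross feature the text describes.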
Perhaps the most puzzling aspect of these emerging ideas about signal sequences is that signal sequences must also serve as the labels that inform the cellular machinery that the protein being synthesized must be delivered to the appropriate location. This step is postulated to occur via the SRP for export across the ER in eukaryotes. The SRP and other recognition assemblies must be able to bind specifically to sequences that share only the gross features described above: hydrophobicity, charge, and conformation. This is a new concept in protein-based binding, and suggests that the proteins that interact with such sequences will be found to make use of novel modes of binding.
ACKNOWLEDGMENTS
We thank P. C. Tai and Tom Rapoport for critical reading of the manuscript and Gunnar von Heijne for sending us unpublished data. We are grateful to Peter Walter and Tom Silhavy for many helpful discussions. The willingness of many to send us reprints and preprints is greatly appreciated. The writing of this review was supported, in part, by grants from the National Institutes of Health (GM 27616 and GM 34962). LMG is a Fellow of the A.P. Sloan Foundation (1984-86).
REFERENCES
Adams, G. A., and Rose, J. K. (1985). Cell 41, 1007-1015.
Akiyama, Y., and Ito, K. (1985). EMBO J. 4, 3351-3356.
Amar-Costesec, A., Todd, J. A., and Kreibich, G. (1984). J. Cell Biol. 99, 2247-2253.
Andrews, D. W., Walter, P., and Ottensmeyer, F. P. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 785-789.
Austen, B. M. (1979). FEBS Lett. 103, 308-313.
Austen, B. M., and Ridd, D. H. (1981). Biochem. Soc. Symp. 46, 235-258.
Austen, B. M., and Ridd, D. H. (1983). Biochem. Soc. Trans. 11, 160-161.
Austen, B. M., Hermon-Taylor, J., Kaderbhai, M. A., and Ridd, D. H. (1984). Biochem. J. 224, 317-325.
Bakker, E. P., and Randall, L. L. (1984). EMBO J. 3, 895-900.
Bankaitis, V. A., and Bassford, P. J., Jr. (1985). J. Bacteriol. 161, 169-178.
Bankaitis, V. A., Rasmussen, B. A., and Bassford, P. J., Jr. (1984). Cell 37, 243-252.
Bassford, P. J., Silhavy, T. J., and Beckwith, J. R. (1979). J. Bacteriol. 139, 19-31.
Bassuner, R., Huth, A., Manteuffel, R., and Rapoport, T. A. (1983). Eur. J. Biochem. 133, 321-326.
Baty, D., and Lazdunski, C. (1979). Eur. J. Biochem. 102, 503-507.
Baty, D., Mercereau-Puijalon, O., Perrin, D., Kourilsky, P., and Lazdunski, C. (1981). Gene 16, 79-87.
Bedouelle, H., and Hofnung, M. (1981a). In “Membrane Transport and Neuroreceptors,” pp. 399-403. Liss, New York.
Bedouelle, H., and Hofnung, M. (1981b). In “Intermolecular Forces” (B. Pullman, ed.), pp. 361-372. Reidel, Dordrecht, The Netherlands.
Bedouelle, H., Bassford, P. J., Fowler, A. V., Zabin, I., Beckwith, J., and Hofnung, M. (1980). Nature (London) 285, 78-81.
Bendzko, P., Prehn, S., Pfeil, W., and Rapoport, T. A. (1982). Eur. J. Biochem. 123, 121-126.
Benson, S. A., and Silhavy, T. J. (1983). Cell 32, 1325-1335.
Berzofsky, J. A. (1985). Science 229, 932-940.
Blobel, G. (1980). Proc. Natl. Acad. Sci. U.S.A. 77, 1496-1500.
Blobel, G., and Dobberstein, B. (1975a). J. Cell Biol. 67, 835-851.
Blobel, G., and Dobberstein, B. (1975b). J. Cell Biol. 67, 852-862.
Blobel, G., and Sabatini, D. D. (1971). Biomembranes 2, 193-195.
Bogdanov, M. V., Kulaev, I. S., and Nesmayanova, M. A. (1984). Bioorg. Membr. (Moscow) 1, 495-502.
Bogdanov, M. V., Suzina, N. E., and Nesmayanova, M. A. (1985a). Bioorg. Membr. (Moscow) 2, 367-376.
Bogdanov, M. V., Tsfasman, I. M., and Nesmayanova, M. A. (1985b). Bioorg. Membr. (Moscow) 2, 623-629.
Bougis, P., Rochat, H., Piéroni, G., and Verger, R. (1981). Biochemistry 20, 4915-4920.
Braell, W. A., and Lodish, H. F. (1982). J. Biol. Chem. 257, 4578-4582.
Brennan, M. D., Warren, T. G., and Mahowald, A. P. (1980). J. Cell Biol. 87, 516-520.
Brickman, E. R., Oliver, D. B., Garwin, J. L., Kumamoto, C., and Beckwith, J. (1984). Mol. Gen. Genet. 196, 24-27.
Briggs, M. S. (1986). Ph.D. Thesis, Yale University, New Haven, Connecticut.
Briggs, M. S., and Gierasch, L. M. (1984). Biochemistry 23, 3111-3114.
Briggs, M. S., Gierasch, L. M., Zlotnick, A., Lear, J. D., and DeGrado, W. F. (1985). Science 228, 1096-1099.
Briggs, M. S., Cornell, D. G., Dluhy, R. A., and Gierasch, L. M. (1986). Science, in press.
Brown, P. A., Halvorson, H. O., Raney, P., and Perlman, D. (1984). Mol. Gen. Genet. 197, 351-357.
Carlson, M., and Botstein, D. (1982). Cell 28, 145-154.
Caulfield, M. P., Horiuchi, S., Tai, P. C., and Davis, B. D. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 7772-7776.
Caulfield, M. P., Furlong, D., Tai, P. C., and Davis, B. D. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 4031-4035.
Cerretti, D. P., Dean, D., Davis, G. R., Bedwell, D. M., and Nomura, M. (1983). Nucleic Acids Res. 11, 2599-2616.
Chen, L., and Tai, P. C. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 4384-4388.
Chen, L., Rhoads, D., and Tai, P. C. (1985). J. Bacteriol. 161, 973-980.
Chou, P. Y., and Fasman, G. D. (1974a). Biochemistry 13, 211-222.
Chou, P. Y., and Fasman, G. D. (1974b). Biochemistry 13, 222-245.
Colacicco, G. (1970). Lipids 5, 636-649.
Coleman, J., Inukai, M., and Inouye, M. (1985). Cell 43, 351-360.
Cornell, D. G. (1979). J. Colloid Interface Sci. 70, 167-180.
Cornell, D. G. (1982). J. Colloid Interface Sci. 88, 536-545.
Dalbey, R., and Wickner, W. (1985). J. Biol. Chem. 260, 15925-15931.
Daniels, C. J., Bole, D. G., Quay, S. C., and Oxender, D. L. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 5396-5400.
Dassa, E., and Boquet, P.-L. (1981). Mol. Gen. Genet. 181, 192-200.
Date, T., Zwizinski, C., Ludmerer, S., and Wickner, W. (1980a). Proc. Natl. Acad. Sci. U.S.A. 77, 827-831.
Date, T., Goodman, J. M., and Wickner, W. T. (1980b). Proc. Natl. Acad. Sci. U.S.A. 77, 4669-4673.
Davis, N. G., and Model, P. (1985). Cell 41, 607-614.
Dierstein, R., and Wickner, W. (1985). J. Biol. Chem. 260, 15919-15924.
Ding, J., Lory, S., and Tai, P. C. (1985). Gene 33, 313-321.
DiRienzo, J. M., and Inouye, M. (1979). Cell 17, 155-161.
Emr, S. D., and Silhavy, T. J. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 4599-4603.
Emr, S. D., Hedgpeth, J., Clément, J.-M., Silhavy, T. J., and Hofnung, M. (1980). Nature (London) 285, 82-85.
Emr, S. D., Hanley-Way, S., and Silhavy, T. J. (1981). Cell 23, 79-88.
Enequist, H. G., Hirst, T. R., Harayama, S., Hardy, S. J. S., and Randall, L. L. (1981). Eur. J. Biochem. 116, 227-233.
Engelman, D. M., and Steitz, T. A. (1981). Cell 23, 411-422.
Engelman, D. M., and Steitz, T. A. (1984). In “The Protein Folding Problem” (D. B. Wetlaufer, ed.), Vol. 89, pp. 87-114. Am. Assn. Adv. Sci., Washington, D.C.
Fendler, J. H. (1982). “Membrane Mimetic Chemistry.” Wiley (Interscience), New York.
Ferenci, T., and Randall, L. L. (1979). J. Biol. Chem. 254, 9979-9981.
Ferro-Novick, S., Novick, P., Field, C., and Schekman, R. (1984a). J. Cell Biol. 98, 35-43.
Ferro-Novick, S., Hansen, W., Schauer, I., and Schekman, R. (1984b). J. Cell Biol. 98, 44-53.
Ferro-Novick, S., Honma, M., and Beckwith, J. (1984c). Cell 38, 211-217.
Friedlander, M., and Blobel, G. (1985). Nature (London) 318, 338-343.
Fujimoto, Y., Watanabe, Y., Uchida, M., and Ozaki, M. (1984). J. Biochem. 96, 1125-1131.
Gilmore, R., and Blobel, G. (1983). Cell 35, 677-685.
Gilmore, R., and Blobel, G. (1985). Cell 42, 497-505.
Gilmore, R., Walter, P., and Blobel, G. (1982). J. Cell Biol. 95, 470-477.
Goodman, J. M., Watts, C., and Wickner, W. (1981). Cell 24, 437-441.
Greenfield and Fasman (1969). Biochemistry 8, 4108-4115.
Grossman, A. R., Bartlett, S. G., Schmidt, G. W., and Chua, N.-H. (1980). Ann. N.Y. Acad. Sci. 343, 266-274.
Gundelfinger, E. D., Di Carlo, M., Zopf, D., and Melli, M. (1984). EMBO J. 3, 2325-2332.
Habener, J. F., Rosenblatt, M., Kemper, B., Kronenberg, H. M., Rich, A., and Potts, J. T., Jr. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 2616-2620.
Hahn, V., Winkler, J., Rapoport, T. A., Liebscher, D.-H., Coutelle, Ch., and Rosenthal, S. (1983). Nucleic Acids Res. 11, 4541-4552.
Hall, M. N., Gabay, J., and Schwartz, M. (1983). EMBO J. 2, 15-19.
Hansen, W., Garcia, P. D., and Walter, P. (1986). Cell 45, 397-406.
Hay, R., Bohni, P., and Gasser, S. (1984). Biochim. Biophys. Acta 779, 65-87.
Hayashi, S., Chang, S.-Y., Chang, S., Giam, C.-Z., and Wu, H. C. (1985). J. Biol. Chem. 260, 5753-5759.
Hortin, G., and Boime, I. (1980). Proc. Natl. Acad. Sci. U.S.A. 77, 1356-1360.
Hortin, G., and Boime, I. (1981a). J. Biol. Chem. 256, 1491-1494.
Hortin, G., and Boime, I. (1981b). Cell 24, 453-461.
Hortsch, M., and Meyer, D. I. (1985). Eur. J. Biochem. 150, 559-564.
Hortsch, M., Avossa, D., and Meyer, D. I. (1985). J. Biol. Chem. 260, 9137-9145.
Hurt, E. C., Pesold-Hurt, B., and Schatz, G. (1984). FEBS Lett. 178, 306-310.
Hussain, M., Ozawa, Y., Ichihara, S., and Mizushima, S. (1982). Eur. J. Biochem. 129, 233-239.
Ichihara, S., Beppu, N., and Mizushima, S. (1984). J. Biol. Chem. 259, 9853-9857.
Iida, A., Groarke, J. M., Park, S., Thom, J., Zabicky, J. H., Hazelbauer, G. L., and Randall, L. L. (1985). EMBO J. 4, 1875-1880.
Inouye, M., and Halegoua, S. (1980). CRC Crit. Rev. Biochem. 7, 339-371.
Inouye, S., Soberon, X., Franchesini, T., Nakamura, K., Itakura, K., and Inouye, M. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 3438-3441.
Ito, K. (1984). Mol. Gen. Genet. 197, 204-208.
Ito, K., Wittekind, M., Nomura, M., Shiba, K., Yura, T., Miura, A., and Nashimoto, H. (1983). Cell 32, 789-797.
Ito, K., Cerretti, D. P., Nashimoto, H., and Nomura, M. (1984). EMBO J. 3, 2319-2324.
Jackson, R. C. (1983). In “Methods in Enzymology” (S. Colowick and N. Kaplan, eds.), Vol. 96, pp. 784-794. Academic Press, New York.
Jackson, R. C., and Blobel, G. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 5598-5602.
Jackson, R. C., and White, W. R. (1981). J. Biol. Chem. 256, 2545-2550.
Jackson, R. L., Pattus, F., and Demel, R. A. (1979). Biochim. Biophys. Acta 556, 369-387.
Jain, M. K., Streb, M., Rogers, J., and DeHaas, G. H. (1984). Biochem. Pharmacol. 33, 2541-2551.
Josefsson, L.-G., and Randall, L. L. (1981). Cell 25, 151-157.
Kadonaga, J. T., Gautier, A. E., Straw, D. R., Charles, A. D., Edge, M. D., and Knowles, J. R. (1984). J. Biol. Chem. 259, 2149-2154.
Kadonaga, J. T., Pluckthun, A., and Knowles, J. R. (1986). J. Biol. Chem. 260, 16192-16199.
Katakai, R., and Iizuka, Y. (1984). J. Am. Chem. Soc. 106, 5715-5718.
Koren, R., Burstein, Y., and Soreq, H. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 7205-7209.
Koshland, D., Sauer, R. T., and Botstein, D. (1982). Cell 30, 903-914.
Kreibich, G., Ulrich, B. L., and Sabatini, D. D. (1978). J. Cell Biol. 77, 464-487.
Kreibich, G., Marcantonio, E. E., and Sabatini, D. D. (1983). In “Methods in Enzymology” (S. Colowick and N. Kaplan, eds.), Vol. 96, pp. 520-531. Academic Press, New York.
Kreil, G. (1981). Annu. Rev. Biochem. 50, 317-348.
Kumamoto, C. A., and Beckwith, J. (1983). J. Bacteriol. 154, 253-260.
Kumamoto, C. A., and Beckwith, J. (1985). J. Bacteriol. 163, 267-274.
Kumamoto, C. A., Oliver, D. B., and Beckwith, J. (1984). Nature (London) 308, 863-864.
Kurzchalia, T. V., Wiedmann, M., Girshovich, A. S., Bochkareva, E. S., Bielka, H., and Rapoport, T. A. (1986). Nature (London) 320, 634-636.
Lane, C. D., Colman, A., Mohun, T., Morser, J., Champion, J., Kourides, I., Craig, R., Higgins, S., James, T. C., Applebaum, S. W., Ohlsson, R. I., Paucha, E., Houghton, M., Matthews, J., and Mifflin, B. J. (1980). Eur. J. Biochem. 111, 225-235.
Lauffer, L., Garcia, P. B., Harkins, R. N., Coussens, L., Ullrich, A., and Walter, P. (1985). Nature (London) 318, 334-338.
Lee, C. A., Fournier, M. J., and Beckwith, J. (1985). J. Bacteriol. 161, 1156-1161.
Lee, S. Y., Bailey, S. C., and Apirion, D. (1978). J. Bacteriol. 133, 1015-1023.
Lewis, R. M., Furie, B. C., and Furie, B. (1983). Biochemistry 22, 948-954.
Lin, J. J. C., Kanazawa, H., Ozols, J., and Wu, H. C. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 4891-4895.
Lingappa, V. R., Lingappa, J. R., and Blobel, G. (1979). Nature (London) 281, 117-121.
Lingappa, V. R., Chaidez, J., Yost, C. S., and Hedgpeth, J. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 456-460.
Liss, L. R., and Oliver, D. B. (1986). J. Biol. Chem. 261, 2299-2304.
Liss, L. R., Johnson, B. L., and Oliver, D. B. (1985). J. Bacteriol. 164, 925-928.
Lively, M. O., and Walsh, K. A. (1983). J. Biol. Chem. 258, 9488-9495.
Magner, J. A. (1982). J. Theor. Biol. 99, 831-833.
Majzoub, J. A., Rosenblatt, M., Fennick, B., Maunus, R., Kronenberg, H. M., Potts, J. T., Jr., and Habener, J. F. (1980). J. Biol. Chem. 255, 11478-11483.
Mandel, G., and Wickner, W. (1979). Proc. Natl. Acad. Sci. U.S.A. 76, 236-240.
Marcantonio, E. E., Grebenau, R. C., Sabatini, D. D., and Kreibich, G. (1982). Eur. J. Biochem. 124, 217-222.
Marcantonio, E. E., Amar-Costesec, A., and Kreibich, G. (1984). J. Cell Biol. 99, 2254-2259.
Matteucci, M., and Lipetsky, H. (1986). Biotechnology 4, 51-55.
Mayer, L. D., Nelsestuen, G. L., and Brockman, H. L. (1983). Biochemistry 22, 316-321.
Meek, R. L., Walsh, K. A., and Palmiter, R. D. (1982). J. Biol. Chem. 257, 12245-12251.
Meyer, D. I. (1985). EMBO J. 4, 2031-2033.
Meyer, D. I., Krause, E., and Dobberstein, B. (1982). Nature (London) 297, 647-650.
Michaelis, S., Guarente, L., and Beckwith, J. (1983). J. Bacteriol. 154, 356-365.
Milstein, C., Brownlee, G. G., Harrison, T. M., and Mathews, M. B. (1972). Nature (London) New Biol. 239, 117-120.
Moreno, F., Fowler, A. V., Hall, M., Silhavy, T. J., Zabin, I., and Schwartz, M. (1980). Nature (London) 286, 356-359.
Mostov, K. E., DeFoor, P., Fleischer, S., and Blobel, G. (1981). Nature (London) 292, 87-88.
Mueckler, M., and Lodish, H. F. (1986). Cell 44, 629-637.
Müller, M., and Blobel, G. (1984a). Proc. Natl. Acad. Sci. U.S.A. 81, 7421-7425.
Müller, M., and Blobel, G. (1984b). Proc. Natl. Acad. Sci. U.S.A. 81, 7737-7741.
Müller, M., Ibrahimi, I., Chang, C. N., Walter, P., and Blobel, G. (1982). J. Biol. Chem. 257, 11860-11863.
Nagaraj, R. (1984). FEBS Lett. 165, 79-82.
Nesmayanova, M. A. (1982). FEBS Lett. 142, 189-193.
Novak, P., Ray, P. H., and Der, I. K. (1986). J. Biol. Chem. 261, 420-427.
Novick, P., Field, C., and Schekman, R. (1980). Cell 21, 205-215.
Novick, P., Ferro, S., and Schekman, R. (1981). Cell 25, 461-469.
Ohno-Iwashita, Y., and Wickner, W. (1983). J. Biol. Chem. 258, 1895-1900.
Ohno-Iwashita, Y., Wolfe, P., Ito, K., and Wickner, W. (1984). Biochemistry 23, 6178-6184.
Oliver, D. B. (1985). J. Bacteriol. 161, 285-291.
Oliver, D. B., and Beckwith, J. (1981). Cell 25, 765-772.
Oliver, D. B., and Beckwith, J. (1982a). J. Bacteriol. 150, 686-691.
Oliver, D. B., and Beckwith, J. (1982b). Cell 30, 311-319.
Oliver, D. B., and Liss, L. R. (1985). J. Bacteriol. 161, 817-819.
Oxender, D. L., Landick, R., Nazos, P., and Copeland, B. R. (1984). In “Microbiology 1984” (L. Leive and D. Schlessinger, eds.), pp. 4-7. Am. Soc. Microbiol., Washington, D.C.
Pagès, J.-M. (1982). Eur. J. Biochem. 122, 381-386.
Pagès, J.-M., and Lazdunski, C. (1981). FEMS Microbiol. Lett. 12, 65-69.
Pagès, J.-M., and Lazdunski, C. (1982a). FEBS Lett. 149, 51-54.
Pagès, J.-M., and Lazdunski, C. (1982b). Eur. J. Biochem. 124, 561-566.
Pagès, J.-M., Piovant, M., Varenne, S., and Lazdunski, C. (1978). Eur. J. Biochem. 86, 589-602.
Pagès, J.-M., Anba, J., Bernadac, A., Shinagawa, H., Nakata, A., and Lazdunski, C. (1984). Eur. J. Biochem. 143, 499-505.
Pagès, J.-M., Anba, J., and Lazdunski, C. (1985). Ann. Inst. Pasteur/Microbiol. (Paris) 136A, 105-110.
Palade, G. (1975). Science 189, 347-358.
Palmiter, R. D., Gagnon, J., and Walsh, K. A. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 94-98.
Palva, I., Sarvas, M., Lehtovaara, P., Sibakov, M., and Kaariainen, L. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 5582-5586.
Paul, D. L., and Goodenough, D. A. (1983). J. Cell Biol. 96, 636-638.
Perara, E., and Lingappa, V. (1986). J. Cell Biol. 101, 2292-2301.
Perlman, D., and Halvorson, H. O. (1983). J. Mol. Biol. 167, 391-409.
Perlman, D., Halvorson, H. O., and Cannon, L. E. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 781-785.
Pethica, B. A. (1955). Trans. Faraday Soc. 51, 1402-1411.
Phillips, M. C., and Sparks, C. E. (1980). Ann. N.Y. Acad. Sci. 328, 122-137.
Pincus, M. R., and Klausner, R. D. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 3413-3417.
Prehn, S., Tsamaloukas, A., and Rapoport, T. A. (1980). Eur. J. Biochem. 107, 185-195.
Prehn, S., Nurnberg, P., and Rapoport, T. A. (1981). FEBS Lett. 123, 79-84.
Quinn, P. J., and Dawson, R. M. C. (1969). Biochem. J. 113, 791-804.
Randall, L. L. (1983). Cell 33, 231-240.
Randall, L. L., and Hardy, S. J. S. (1984a). Mod. Cell Biol. 3, 1-20.
Randall, L. L., and Hardy, S. J. S. (1984b). Microbiol. Rev. 48, 290-298.
Rapoport, T. A. (1985). FEBS Lett. 187, 1-10.
Reddy, G. L., and Nagaraj, R. (1985). Biochim. Biophys. Acta 831, 340-346.
Redman, C. M., and Sabatini, D. D. (1966). Proc. Natl. Acad. Sci. U.S.A. 56, 608-615.
Rhoads, D. B., Tai, P. C., and Davis, B. D. (1984). J. Bacteriol. 159, 63-70.
Robinson, A., Kaderbhai, M. A., and Austen, B. M. (1985). Biochem. Soc. Trans. 13, 724-726.
Rosenblatt, M., Beaudette, N. V., and Fasman, G. D. (1980). Proc. Natl. Acad. Sci. U.S.A. 77, 3983-3987.
Rothblatt, J. A., and Meyer, D. I. (1986). Cell 44, 619-628.
Rothfield, L. I., and Fried, V. A. (1975). In “Methods in Membrane Biology” (E. D. Korn, ed.), Vol. 4, pp. 277-292. Plenum, New York.
Rothman, J. E., and Lodish, H. F. (1977). Nature (London) 269, 775-780.
Russell, M., and Model, P. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 1717-1721.
Ryan, J. P., and Bassford, P. J., Jr. (1985). J. Biol. Chem. 260, 14832-14837.
Ryan, J. P., Fikes, J. D., Bankaitis, V. A., Duncan, M. C., and Bassford, P. J., Jr. (1986a). Microbiology, in press.
Ryan, J. P., Duncan, M. C., Bankaitis, V. A., and Bassford, P. J., Jr. (1986b). J. Biol. Chem. 261, 3389-3395.
Sabatini, D. D., and Blobel, G. (1970). J. Cell Biol. 45, 146-157.
Schauer, I., Emr, S. D., Gross, C., and Schekman, R. (1985). J. Cell Biol. 100, 1664-1675.
Schecter, I., McKean, D. J., Guyer, R., and Terry, W. (1975). Science 188, 160-162.
Scheele, G. (1983). In “Methods in Enzymology” (S. Colowick and N. Kaplan, eds.), Vol. 96, pp. 94-111. Academic Press, New York.
Schultz, J., Silhavy, T. J., Berman, M. L., Fiil, N., and Emr, S. D. (1982). Cell 31, 227-235.
Shiba, K., Ito, K., and Yura, T. (1984). J. Bacteriol. 160, 696-701.
Shinnar, A. E., and Kaiser, E. T. (1984). J. Am. Chem. Soc. 106, 5006-5007.
Siegel, V., and Walter, P. (1985). J. Cell Biol. 100, 1913-1921.
Silhavy, T. J., Benson, S. A., and Emr, S. D. (1983). Microbiol. Rev. 47, 313-344.
Silver, P., Watts, C., and Wickner, W. (1981). Cell 25, 341-345.
Smith, W. P., Tai, P. C., Thompson, R. C., and Davis, B. D. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 2830-2834.
Smith, W. P., Tai, P. C., and Davis, B. D. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 814-817.
Steiner, D. F., Quinn, P. S., Chan, S. J., Marsh, J., and Tager, H. S. (1980). Ann. N.Y. Acad. Sci. 343, 1-16.
Stern, J. B., and Jackson, R. C. (1985). Arch. Biochem. Biophys. 237, 244-252.
Strauch, K. L., Kumamoto, C. A., and Beckwith, J. (1986). J. Bacteriol. 166, 505-512.
Swan, D., Aviv, H., and Leder, P. (1972). Proc. Natl. Acad. Sci. U.S.A. 69, 1967-1971.
Tabe, L., Krieg, P., Strachan, R., Jackson, D., Wallis, E., and Colman, A. (1984). J. Mol. Biol. 180, 645-666.
Takahara, M., Hibler, D. W., Barr, P. J., Gerlt, J. A., and Inouye, M. (1985). J. Biol. Chem. 260, 2670-2674.
Talmadge, K., Stahl, S., and Gilbert, W. (1980a). Proc. Natl. Acad. Sci. U.S.A. 77, 3369-3373.
180
MARTHA S. BRIGGS AND LlLA M. GlERASCH
Talmadge, K., Kaufman, J., and Gilbert, W. (1980b). Proc. Natl. Acad. Sci. U.S.A. 77,39883992. Talmadge, K., Brosius, J., and Gilbert, W. (1981). Nature (London) 294, 176-178. Tanford, C. (1980). “The Hydrophobic Effect.” Wiley (Interscience), New York. Ter-Minassian-Saraga, L. (1979).J. Colloid Interjace Sci. 70, 245-264. Tokunaga, M., Loranger, J. M., Wolfe, P. B., and Wu, H. C. (1982).J. Biol. Chem. 257, 9922-9925. Tokunaga, M., Loranger, J. M., and Wu, H. C. (1984).J. Cell. Biochem. 24, 113-120. Tommassen, J., van Tol, H., and Lugtenberg, B. (1983). E M B O J . 2, 1275-1279. Tommassen, J., Leunissen, J., van Damme-Jongsten, M., and Overduin, P. (1985). EMBO J . 4,1041-1047. van den Broek, G., Timko, M. P., Kausch, A. P., Cashmore, A. R., van Montagu, M., and Herrera-Estrella, L. (1985). Nature (London) 313, 358-363. van Zoelen, E. J. F., Zwaal, R. F. A., Reuvers, F. A. M., Demel, R. A., and van Deenen, L. L. M. (1977). Biocliim. Biophys. Acta 464, 482-492. Verger, R., and Pattus, F. (1982). Chem. Phys. Lipids 30, 189-227. Vlasuk, G. P., Inouye, S., Ito, H., Itakura, K., and Inouye, M. (1983).J. Biol. Chem. 258, 7141-7148. Vlasuk, G. P., Inouye, S., and Inouye, M. (1984).J. Biol. Chem. 259, 6195-6200. von Heijne, G. (1980a). Biochem. Sac. Symp. 46, 259-273. von Heijne, G. (1980b). Eur.1. Biochem. 103,431-438. von Heijne, G. (1981). Eur.J. Biochem. 116, 419-422. von Heijne, G. (1983). Eur. J. Bzochem. 133, 17-21. von Heijne, G. (1984a). E M B O J . 3,2315-2318. von Heijne, G. (1984b).J. Mol. Biol. 173, 243-251. von Heijne, G. (1985).J. Mol. Biol. 184, 99-105. von Heijne, G., and Blomberg, C. (1979). Eur.J. Biochem. 97, 175-181. Walter, P., and Blobel, G. (1981a). J. Cell Biol. 91, 551-556. Walter, P., and Blobel, G. (1981b).J. Cell Biol. 91, 557-561. Walter, P., and Blobel, G. (1983). Cell 34, 525-533. Walter, P., Ibrahimi, I., and Blobel, G. (1981).J. Cell Biol. 91, 545-550. Walter, P., Gilmore, R., and Blobel, G. (1984). Cell 38, 5-8. Waters, G. M.. and Blobel. 
G. (1986).I . Cell Biol., in press. Watson, M. E. E. (1984). Nucleic Acidr Res. 12, 5145-5164. Watts, C., Silver, P., and Wickner, W. (1981). Cell 25, 347-353. Wickner, W. (1980). Science 210, 861-868. Wickner, W., Ito, K., Mandel, G., Bates, M., Nokelainen, M., and Zwizinski, C. (1980). Ann. N . Y . Acad. Sn’. 343, 384-389. Wiedmann, M., Huth, A., and Rapoport, T. A. (1984). Nature (London) 309,637-639. Wiedmann, M., Huth, A., and Rapoport, T. A. (1986a). Biochem. Biophys. Res. Commun. 134, 790-796. Wiedmann, M., Huth, A., and Rapoport, T. A. (1986b). FEBS Lett. 194, 139-145. Wolfe, P. B., and Wickner, W. (1984). Cell 36, 1067-1072. Wolfe, P. B., Wickner, W., and Goodman, J. M. (1983a).J. Biol. Chem. 258, 12073-12080. Wolfe, P. B., Zwizinski, C., and Wickner, W. (1983b). I n “Methods in Enzymology” (S. Colowick and N. Kaplan, eds.), Vol. 97, pp. 40-46. Academic Press, New York. Wu, H. C., Tokunaga, M., Tokunaga, H., Hayashi, S., and Giam, C.-Z. (1983). J . Cell. B ~ o c 22, ~ .161-171. Zimmerman, M., Ashe, B. M., Alberts, A. W., Pierzchala, P. A., Powers, J. C., Nishino, N., Strauss, A. W., and Mumford, R. A. (1980). Ann. N.Y. Acad. Sci. 343,405-414. Zwizinski, C., and Wickner, W. (1980).J. Biol. Chem. 255,7973-7977.
VIBRATIONAL SPECTROSCOPY AND CONFORMATION OF PEPTIDES, POLYPEPTIDES, AND PROTEINS
By SAMUEL KRIMM* and JAGDEESH BANDEKAR†
*Biophysics Research Division and Department of Physics and †Biophysics Research Division, University of Michigan, Ann Arbor, Michigan 48109
List of Symbols
I. Introduction
II. Theoretical Considerations
    A. Introduction
    B. Normal Modes of Vibration
    C. Polypeptide Chain Geometry and Coordinates
    D. Polypeptide Force Field
    E. Band Assignments
III. Extended Polypeptide Chain Structures
    A. Introduction
    B. Antiparallel-Chain Rippled Sheet
    C. Antiparallel-Chain Pleated Sheet
IV. Helical Polypeptide Chain Structures
    A. Introduction
    B. α Helix
    C. 3₁₀ Helix
    D. 3₁ Helix
    E. L,D β Helices
V. Reverse Turns
    A. Introduction
    B. β Turns
    C. γ Turns
VI. Characteristics of Polypeptide Chain Modes
    A. Introduction
    B. Amide and Skeletal Modes of the Polypeptide Chain
VII. Vibrational Spectroscopy of Proteins
    A. Introduction
    B. Side-Chain and S-S Modes
    C. Normal Modes of Proteins
VIII. Prospects for the Future
References
ADVANCES IN PROTEIN CHEMISTRY, Vol. 38. Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.

LIST OF SYMBOLS

A              integrated intensity; or constant in attractive term of nonbonded potential
ab             antisymmetric bend
APPS           antiparallel-chain pleated sheet
APRS           antiparallel-chain rippled sheet
as             antisymmetric stretch
b              bend
B              constant in repulsive term of nonbonded potential
$B_{ji}$       matrix elements relating Cartesian coordinates $x_i$ to the internal coordinate $r_j$
c              velocity of light
C              constant in exponent of repulsive term of Buckingham nonbonded potential
[illegible]    character of symmetry operation for the αth species
d              deformation
$D_{ij}$       interaction constant between peptide groups i and j in perturbation treatment
e              unit vector
E              unit matrix
E              electric field vector
$f_{ij}$       force constant associated with $q_i q_j$
f              matrix of force constants
F              force constant matrix in internal coordinates
$G^{-1}$       inverse kinetic energy matrix in internal coordinates
h              Planck's constant
H              transformation matrix between two coordinate bases
ib             in-plane bend
IR             infrared
$J_{ij}$       Jacobian element, $= \partial\nu_i^{\mathrm{calc}}/\partial f_j$
J              Jacobian matrix
L              transformation matrix from normal to internal or to local symmetry coordinates
$\mathscr{L}$  transformation matrix from normal to mass-weighted Cartesian coordinates
$m_i$          mass of ith atom
M              matrix (diagonal) of atomic masses
N              number of atoms in a molecule; or number of repeat units in crystallographic repeat of a helix
$N_A$          Avogadro's number
ob             out-of-plane bend
$p_i$          momentum conjugate to $r_i$
P              number of atoms in chemical repeat unit of a helix
PED            potential energy distribution
q              column vector of $q_1$ to $q_{3N}$
$q_i$          mass-weighted Cartesian coordinate of ith atom, $= m_i^{1/2} x_i$
$Q_\alpha$     normal coordinate α
r              rock
$r_j$          internal displacement coordinate j
$\Delta r$     bond stretch from equilibrium
r              column vector with internal coordinate components
R              interatomic separation between nonbonded atoms
s              stretch
$S_i$          local group symmetry coordinate i
sb             symmetric bend
sh             shoulder
ss             symmetric stretch
t              torsion
T              kinetic energy
TDC            transition dipole coupling
tw             twist
u              atomic mass unit
V              potential energy
w              wag
$w_i$          weighting factor for ith frequency
$x_i$          Cartesian coordinate of ith atom
x              column vector with Cartesian coordinate components
[illegible]    rotation angle about helix axis transforming one unit into an adjacent one
$\delta$       phase angle between displacements in adjacent peptide groups of a periodic polypeptide chain structure
$\varepsilon$  dielectric constant
$\Delta\theta$ in-plane angle bend from equilibrium
$\lambda$      $4\pi^2 c^2 \nu^2$
$\mu$          dipole moment
$\partial\mu/\partial Q$   dipole moment derivative with respect to normal coordinate
$\nu$          frequency in cm⁻¹
[illegible]    unperturbed frequency
$\nu_A$        observed amide A frequency
$\nu_B$        observed amide B frequency
$\sigma$       dispersion in force constant $f_i$
$\Delta\tau$   dihedral angle change from equilibrium
[illegible]    phase shift between displacements in adjacent units of a helix
$\varphi$      N-Cα dihedral angle
$\chi^2$       sum of weighted squared errors
$\psi$         Cα-C dihedral angle
$\Delta\omega$ out-of-plane angle bend from equilibrium
∥              IR dichroism parallel to sample orientation axis
⊥              IR dichroism perpendicular to sample orientation axis
I. INTRODUCTION
The vibrational spectrum of a molecule is determined by its three-dimensional structure and its vibrational force field. An analysis of this [usually infrared (IR) and Raman] spectrum can therefore provide information on the structure and on intramolecular and intermolecular interactions. The more probing the analysis, the more detailed is the information that can be obtained. While the structure and force field uniquely determine the vibrational frequencies of the molecule, the structure cannot in general be obtained directly from the spectrum. However, to a useful approximation, the atomic displacements in many of the vibrational modes of a large molecule are concentrated in the motions of atoms in small chemical groups, and these localized modes are to a good approximation transferable between molecules. Therefore, in the early studies of peptides and proteins (Sutherland, 1952), efforts were directed mainly to the identification of such characteristic frequencies and the determination of their relation to the structure of the molecule. This kind of analysis depended on empirical correlations of the spectra of chemically similar molecules,
and occasionally yielded significant insights into the dependence of the spectrum on the conformation of the polypeptide chain (Bamford et al., 1956). Detailed analyses of the vibrational spectra of macromolecules, however, have provided a deeper understanding of structure and interactions in these systems (Krimm, 1960). An important advance in this direction for proteins came with the determination of the normal modes of vibration of the peptide group in N-methylacetamide (Miyazawa et al., 1958), and the characterization of several specific amide vibrations in polypeptide systems (Miyazawa, 1962, 1967). Extensive use has been made of spectra-structure correlations based on some of these amide modes, including attempts to determine secondary structure composition in proteins (see, for example, Pezolet et al., 1976; Lippert et al., 1976; Williams and Dunker, 1981; Williams, 1983). Polypeptide molecules of course exhibit many more vibrational frequencies than the half-dozen or so amide modes. For example, a molecule as simple as the extended form of polyglycine has about 50 bands in its IR and Raman spectra. It is clear that the information contained in the entire spectrum must therefore be a more sensitive indicator of three-dimensional structure. The only way to utilize this information fully is through a normal-mode analysis, that is, by comparing observed frequencies with those calculated for specific secondary structures. This can provide a powerful method for testing structural hypotheses in great detail. Over the years, some normal-mode calculations have provided greater insight into the spectra of particular molecules. However, these have often been based on approximate structures (i.e., a group of atoms, such as CH2 or CH3, replaced by a point mass) or have employed limited force fields. The work in our laboratory has developed on the basis of a more systematic approach: We have endeavored to refine a vibrational force field for the polypeptide chain that is essentially transferable from one molecule to another. By starting with N-methylacetamide to obtain a force field for the peptide group (Jakeš and Krimm, 1971a,b), and building up through known polypeptide structures such as polyglycine I (Abe and Krimm, 1972a; Moore and Krimm, 1976a; Dwivedi and Krimm, 1982a), β-poly(L-alanine) (Moore and Krimm, 1976b; Dwivedi and Krimm, 1982b, 1983), and α-poly(L-alanine) (Rabolt et al., 1977; Dwivedi and Krimm, 1984a), it has been possible to develop vibrational force fields that can account for the observed spectra of these molecules with an average band-frequency error of ~5 cm⁻¹ (Krimm, 1983). These force fields can now serve as a basis for detailed analyses of spectral and structural questions in other polypeptide molecules.
The aim of this review is to present these recent developments in the vibrational spectroscopy of peptides, polypeptides, and proteins. We will first discuss the necessary basic aspects of normal-mode calculations. We will then give results for those polypeptide secondary structures that have been studied to date, with an evaluation of the insights obtained from these analyses. Finally, we will comment on the preliminary studies being done on proteins and the prospects for the future.
II. THEORETICAL CONSIDERATIONS
A. Introduction
Bands are observed in the IR and Raman spectra of a molecule that correspond to normal modes of vibration of that particular structure. These normal modes can be calculated from knowledge of the three-dimensional structure of the molecule and of its vibrational force field. If a reliable force field is available, it is therefore possible to predict the vibrational frequencies of a known or hypothesized structure, and to draw significant structural conclusions on the basis of comparisons with observed IR and Raman bands. In this section we review the nature of the normal-mode calculation, discuss the various factors that enter into the specification and refinement of the force field, and consider aspects of band assignments in IR and Raman spectra that are important in making comparisons between observed and calculated frequencies. These considerations underlie the results discussed in subsequent sections.
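The comparison between observed and calculated frequencies can be made concrete with a small numerical sketch. The Python fragment below pairs the observed and calculated N-methylacetamide frequencies listed in Table II of this chapter and reports how far apart they are. The choice of mean absolute deviation as the figure of merit, and the function names `mean_abs_deviation` and `worst_band`, are illustrative assumptions of ours and are not the refinement criterion used in the work reviewed here.

```python
# Observed and calculated N-methylacetamide frequencies (cm^-1), as listed in
# Table II of this chapter; the pairing follows the row order of that table.
observed = [3236, 1653, 1567, 1299, 1096, 881, 725, 627, 600, 436, 289, 206]
calculated = [3254, 1646, 1515, 1269, 1070, 908, 721, 637, 655, 498, 274, 226]


def mean_abs_deviation(obs, calc):
    """Average of |obs - calc| over paired bands (an illustrative metric)."""
    if len(obs) != len(calc):
        raise ValueError("observed and calculated lists must pair one-to-one")
    return sum(abs(o - c) for o, c in zip(obs, calc)) / len(obs)


def worst_band(obs, calc):
    """Return (observed, calculated) for the band with the largest deviation."""
    return max(zip(obs, calc), key=lambda pair: abs(pair[0] - pair[1]))


mad = mean_abs_deviation(observed, calculated)
print(f"mean |obs - calc| = {mad:.1f} cm^-1")
print("largest deviation at (obs, calc) =", worst_band(observed, calculated))
```

A single summary number of this kind hides which modes are poorly reproduced, so in practice it would be read together with the band-by-band assignments.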
B. Normal Modes of Vibration
1. Isolated Small Molecule
The interactions of electromagnetic radiation with the vibrations of a molecule, either by absorption in the infrared region or by the inelastic scattering of visible light (Raman effect), occur with the classical normal vibrations of the system (Pauling and Wilson, 1935). The goal of our spectroscopic analysis is to show how the frequencies of these normal modes depend upon the three-dimensional structure of the molecule. We will therefore review briefly in this section the nature of the normal-mode calculation; more detailed treatments can be found in a number of references (Herzberg, 1945; Wilson et al., 1955; Woodward, 1972; Califano, 1976). We will then discuss the component parts that go into such calculations.
The normal-mode frequencies are obtained by solving an equation, the secular equation, that is the condition that must be satisfied if the molecule is to have harmonic modes of vibration. Since the constant terms in this equation are determined in part by the molecular structure, the frequency-structure correlation appears explicitly in the solution of the secular equation. These terms also depend on the potential-energy changes during the vibrations, and therefore the force field associated with such displacements from equilibrium must be known. The secular equation arises from the solution of the equations of motion for the atoms in the molecule. The kinetic energy, T, of a molecule of N atoms is given by

$$2T = \sum_{i=1}^{3N} m_i \dot{x}_i^2 = \sum_{i=1}^{3N} \dot{q}_i^2 \tag{1}$$

where $m_i$ are the masses, $x_i$ are Cartesian displacement coordinates (i.e., changes in Cartesian coordinates from their equilibrium values), $\dot{x}_i = dx_i/dt$, and $q_i = m_i^{1/2} x_i$ are convenient mass-weighted Cartesian displacement coordinates. Equation (1) can be expressed in matrix notation as

$$2T = \tilde{\dot{\mathbf{q}}}\,\dot{\mathbf{q}} \qquad (\mathbf{q} = \mathbf{M}^{1/2}\mathbf{x}) \tag{2}$$

where $\mathbf{q}$ is a column vector of $q_1$ to $q_{3N}$, with the tilde indicating the transpose, and $\mathbf{M}$ is a diagonal matrix of the atomic masses. The kinetic energy matrix is, of course, diagonal in Cartesian coordinates. The potential energy for displacement from equilibrium, in the harmonic approximation, is given by

$$2V = \sum_{i,j=1}^{3N} f_{ij}\, q_i q_j \tag{3}$$

where

$$f_{ij} = \left(\frac{\partial^2 V}{\partial q_i\,\partial q_j}\right)_0 \tag{4}$$

are the force constants for infinitesimal displacements. In matrix notation this becomes

$$2V = \tilde{\mathbf{q}}\,\mathbf{F}_q\,\mathbf{q} \tag{5}$$

where $\mathbf{F}_q$ is a 3N × 3N symmetric matrix of the $f_{ij}$. Application of Lagrange's equation,

$$\frac{d}{dt}\frac{\partial T}{\partial \dot{q}_i} + \frac{\partial V}{\partial q_i} = 0 \qquad (i = 1, 2, 3, \ldots, 3N) \tag{6}$$
to Eqs. (2) and (5), together with the assumption that the $q_i$ vary harmonically with the same frequency, produces a set of linear homogeneous equations that has a nontrivial solution only if the determinant of its coefficients is set equal to zero. This determinantal equation, with the frequencies as the variables, is known as the secular equation. The above formulation, however, is not always convenient. We generally tend to think, and to a major extent justifiably, in terms of energy changes that result from changes in bond lengths, bond angles, and dihedral angles in a molecule, that is, from displacements in internal coordinates. (Although not convenient, changes in relevant nonbonded interactions can be included in this category.) It is therefore desirable at times to recast the normal vibration problem into other than a Cartesian basis, and the internal coordinate basis is a generally useful one. A nonlinear molecule of N atoms has 3N - 6 internal vibrational degrees of freedom, and therefore 3N - 6 normal modes of vibration (the three translational and three rotational degrees of freedom are not of vibrational spectroscopic relevance). Thus, there are 3N - 6 independent internal coordinates, each of which can be expressed in terms of Cartesian coordinates. To first order, we can write any internal displacement coordinate $r_j$ in the form

$$r_j = \sum_{i=1}^{3N} B_{ji}\, x_i \qquad (j = 1, 2, 3, \ldots, 3N - 6) \tag{7}$$

where the coefficients $B_{ji}$ are determined by the three-dimensional geometry of the molecule. If $\mathbf{r}$ and $\mathbf{x}$ denote the column vectors whose components are the internal and Cartesian coordinates, respectively, then Eq. (7) can be written as

$$\mathbf{r} = \mathbf{B}\mathbf{x} \tag{8}$$

where $\mathbf{B}$ is, in general, a (3N - 6) × 3N matrix. Since we want to invert Eq. (8), which cannot formally be done if $\mathbf{B}$ is not square, we can include the three translations and three rotations in the $\mathbf{r}$ vector, enabling us to transform from internal to Cartesian coordinates through

$$\mathbf{x} = \mathbf{B}^{-1}\mathbf{r} \tag{9}$$

where $\mathbf{B}^{-1}$ is the inverse matrix of $\mathbf{B}$. By substituting Eq. (9) into Eqs. (2) and (5) we obtain
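$$2T = \tilde{\dot{\mathbf{r}}}\,\mathbf{G}^{-1}\,\dot{\mathbf{r}} \tag{10}$$

and

$$2V = \tilde{\mathbf{r}}\,\mathbf{F}\,\mathbf{r} \tag{11}$$

where

$$\mathbf{F}_x = \mathbf{M}^{1/2}\,\mathbf{F}_q\,\mathbf{M}^{1/2} \tag{12}$$

is the force constant matrix in Cartesian coordinates, and

$$\mathbf{F} = \tilde{\mathbf{B}}^{-1}\,\mathbf{F}_x\,\mathbf{B}^{-1} \tag{13}$$

$$\mathbf{G}^{-1} = \tilde{\mathbf{B}}^{-1}\,\mathbf{M}\,\mathbf{B}^{-1} \tag{14}$$

are the force constant and inverse kinetic energy matrices, respectively, in internal coordinates. The $\mathbf{G}^{-1}$ matrix has the property that the kinetic energy can be written as

$$2T = \tilde{\mathbf{p}}\,\mathbf{G}\,\mathbf{p}$$

(Wilson et al., 1955), where $p_i$ is the momentum conjugate to $r_i$. We proceed to the secular equation by recognizing that the normal-mode frequencies correspond to normal coordinates, $\mathbf{Q}$, in which the kinetic and potential energies are both diagonal; that is,

$$2T = \tilde{\dot{\mathbf{Q}}}\,\dot{\mathbf{Q}} \tag{15}$$

and

$$2V = \tilde{\mathbf{Q}}\,\boldsymbol{\Lambda}\,\mathbf{Q} \tag{16}$$

where $\boldsymbol{\Lambda}$ is a diagonal matrix whose elements are the normal-mode frequency parameters $\lambda_i = 4\pi^2 c^2 \nu_i^2$, $\nu_i$ being in cm⁻¹. In such coordinates each mode is a simple harmonic oscillator whose energy levels are determined by the Schrödinger equation and whose interaction with radiation is thus well defined. Normal coordinates are usually defined in terms of Cartesian coordinates by the relation (Wilson et al., 1955)

$$\mathbf{q} = \mathscr{L}\,\mathbf{Q} \tag{17}$$

so that the internal coordinates are given by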
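$$\mathbf{r} = \mathbf{B}\mathbf{x} = \mathbf{B}\mathbf{M}^{-1/2}\mathbf{q} = \mathbf{B}\mathbf{M}^{-1/2}\mathscr{L}\,\mathbf{Q} = \mathbf{L}\mathbf{Q} \tag{18}$$

where

$$\mathbf{L} = \mathbf{B}\mathbf{M}^{-1/2}\mathscr{L} \tag{19}$$

From Eqs. (10) and (11) we see that

$$2T = \tilde{\dot{\mathbf{Q}}}\,\tilde{\mathbf{L}}\,\mathbf{G}^{-1}\,\mathbf{L}\,\dot{\mathbf{Q}} \tag{20}$$

and

$$2V = \tilde{\mathbf{Q}}\,\tilde{\mathbf{L}}\,\mathbf{F}\,\mathbf{L}\,\mathbf{Q} \tag{21}$$

which, by comparison with Eqs. (15) and (16), shows that

$$\tilde{\mathbf{L}}\,\mathbf{G}^{-1}\,\mathbf{L} = \mathbf{E} \tag{22}$$

and

$$\tilde{\mathbf{L}}\,\mathbf{F}\,\mathbf{L} = \boldsymbol{\Lambda} \tag{23}$$

where $\mathbf{E}$ is the unit matrix. Solving Eq. (22) for $\tilde{\mathbf{L}} = \mathbf{L}^{-1}\mathbf{G}$ and substituting into Eq. (23) give

$$\mathbf{L}^{-1}\,\mathbf{G}\mathbf{F}\,\mathbf{L} = \boldsymbol{\Lambda} \tag{24}$$

and

$$\mathbf{G}\mathbf{F}\mathbf{L} = \mathbf{L}\boldsymbol{\Lambda} \tag{25}$$

Equation (25) is the most general form of the secular equation, since if we denote the αth column of $\mathbf{L}$ by $\mathbf{L}_\alpha$, we have

$$\mathbf{G}\mathbf{F}\mathbf{L}_\alpha = \mathbf{L}_\alpha\lambda_\alpha = \lambda_\alpha\mathbf{L}_\alpha \tag{26}$$

and thus the αth column of $\mathbf{L}$ is an eigenvector of $\mathbf{GF}$ with eigenvalue $\lambda_\alpha$. The eigenvectors describe the forms of the normal modes, with the internal coordinate contributions being obtained for any $Q_\alpha$ from Eq. (18), that is,

$$r_i = L_{i\alpha}\, Q_\alpha \tag{27}$$

The relative amplitudes of the $r_i$ in the normal mode $Q_\alpha$ are therefore given by the relative $L_{i\alpha}$. It is usually easier to visualize an eigenvector by expressing it in terms of Cartesian displacements of the atoms. This can be done, since, by combining Eqs. (9) and (18), we have

$$\mathbf{x} = \mathbf{B}^{-1}\mathbf{r} = \mathbf{B}^{-1}\mathbf{L}\mathbf{Q} \tag{28}$$

Alternatively, the secular equation in Cartesian coordinates can be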
solved, in which case Cartesian eigenvectors are obtained. By methods similar to those used above, the secular equation in Cartesian coordinates, analogous to Eq. (25), is

$$\mathbf{M}^{-1}\,\mathbf{F}_x\,(\mathbf{M}^{-1/2}\mathscr{L}) = (\mathbf{M}^{-1/2}\mathscr{L})\,\boldsymbol{\Lambda} \tag{29}$$

where $\mathbf{F}_x$ can be obtained from $\mathbf{F}$ using Eq. (13), viz.,

$$\mathbf{F}_x = \tilde{\mathbf{B}}\,\mathbf{F}\,\mathbf{B} \tag{30}$$

The Cartesian eigenvectors are then given by

$$\mathbf{x} = \mathbf{M}^{-1/2}\mathscr{L}\,\mathbf{Q} \tag{31}$$

using Eq. (17). Another convenient method of characterizing a normal mode is by the potential energy distribution (PED), which describes the relative contributions of various displacement coordinates to the total change in potential energy during the vibration. From Eq. (16) we see that when only one normal mode $Q_\alpha$ is excited, the potential energy $V_\alpha$ of the molecule is given by

$$2V_\alpha = \lambda_\alpha\, Q_\alpha^2 \tag{32}$$

so that $\lambda_\alpha$ measures the potential energy change for unit displacement of $Q_\alpha$. From Eq. (23) we have

$$\lambda_\alpha = \sum_{i,j} F_{ij}\, L_{i\alpha} L_{j\alpha} \tag{33}$$

and therefore

$$\sum_{i} \frac{F_{ii}\, L_{i\alpha}^2}{\lambda_\alpha} + \sum_{i<j} \frac{2F_{ij}\, L_{i\alpha} L_{j\alpha}}{\lambda_\alpha} = 1 \tag{34}$$

Thus, terms such as

$$2F_{ij}\, L_{i\alpha} L_{j\alpha}/\lambda_\alpha \tag{35}$$

and

$$F_{ii}\, L_{i\alpha}^2/\lambda_\alpha \tag{36}$$

give the fractional contributions to the potential energy change during the normal vibration $Q_\alpha$ of the internal coordinate displacements $r_i r_j$ and $r_i^2$, respectively. It should be noted that the diagonal elements of the PED, which according to Eq. (36) are never negative, can add up to more than unity since the off-diagonal elements [Eq. (35)] can be negative. Although the PEDs can be given entirely in terms of the internal
coordinates, the fact that certain combinations of displacements in a normal mode can be consistently associated with particular chemical groups makes it convenient to identify such local group coordinates. Thus, the stretching vibrations of a CH2 group in general consist of a symmetric and an antisymmetric stretching of the two CH bonds; the bending of the HCH angle in a CH2 group is a characteristic localized mode of vibration involving other adjacent angles; etc. We can therefore define specific linear combinations of internal coordinates that correspond to such local group (usually symmetry) coordinates, $S$, and formulate the secular equation in terms of these coordinates. The PED in a normal mode will then be given in terms of the $S_i$ rather than the $r_i$. The normal-mode problem therefore consists of solving Eq. (25) to determine the eigenvalues $\boldsymbol{\Lambda}$ and the eigenvectors $\mathbf{L}$. This requires knowledge of the force constant matrix $\mathbf{F}$, i.e., the set of force constants $F_{ij}$ in internal coordinates, and of the kinetic energy matrix $\mathbf{G}$. The latter is obtained by inversion of Eq. (14), that is,

$$\mathbf{G} = \mathbf{B}\,\mathbf{M}^{-1}\,\tilde{\mathbf{B}} \tag{37}$$
and is seen to be determined by B, namely the geometry-dependent matrix that transforms from Cartesian to internal coordinates. The construction of the elements of the B matrix is a straightforward, if tedious, process that has been systematized (Wilson et al., 1955; Califano, 1976) and can be accomplished by computer programs. The solution of Eq. (25), namely the diagonalization of GF, can also be readily done by computer, usually after symmetrizing this product, since, although G and F are symmetric by their definition, their product in general is not (Miyazawa, 1958). [Of course, an alternative approach is to solve Eq. (29) in Cartesian coordinates. Then we only need to compute the B matrix and transform F into F_x using Eq. (30).] Obtaining the normal modes does not automatically guarantee that these vibrations will be observable in IR absorption or Raman scattering. Other conditions must also be satisfied (Wilson et al., 1955): For IR absorption, there must be a nonzero dipole moment change during the vibration; for Raman scattering, there must be a nonzero polarizability change during the vibration. Some quantitative aspects of IR absorption will be considered below (Sections II,C,2,c and II,E,2).

The computational procedure used in our laboratory is illustrated schematically in Fig. 1. The procedure starts with the input of the number of atoms in the molecule, their Cartesian coordinates and masses, and the total number of internal coordinates. The latter are given specifically in terms of the number of stretch, angle bend, linear angle bend, out-of-plane bend, and torsional coordinates. From the index that characterizes the type of internal coordinate and the atoms involved, the program computes the B matrix and G matrix elements. The (symmetrized) secular equation is then solved for the eigenvalues and eigenvectors, using programs for matrix diagonalization.

FIG. 1. Schematic diagram of computer program to calculate normal-mode frequencies and eigenvectors. [Flowchart steps: read the index that determines the type of internal coordinate; read the atom numbers involved in this deformation; form unit vectors between the bonded atoms concerned; compile B-matrix row i, looping until all rows are compiled; solve the secular equation; output eigenvalues and eigenfunctions.]

FIG. 2. N-Methylacetamide structure.

In the case of a repeating helical structure or a molecular crystal (see below), the procedure is essentially the same: Although the G (or B) and F matrices are for the entire polymer or crystal, symmetry allows them to be "folded" into smaller matrices with dimensions appropriate to the repeat or asymmetric unit. In the case of a force-field refinement (see below), the scheme in Fig. 1 changes somewhat. The G matrix elements remain unchanged, but the cycle involving the F matrix changes until the refinement converges to an acceptable agreement between observed and calculated frequencies.
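The sequence described above (B matrix, then G, then the eigenvalues of GF) can be traced on a deliberately small case. The following Python sketch is not the program of Fig. 1; it is a minimal illustration that assumes a linear symmetric triatomic restricted to motion along its axis, with two bond-stretch internal coordinates. The masses, the force constants `f_stretch` and `f_interaction`, and all function names are hypothetical choices made for the example. Because GF is only 2 × 2 here, its eigenvalues are taken directly from the trace and determinant rather than by symmetrizing the product.

```python
import math

# Conversion from sqrt(lambda), with lambda in (mdyn/A)/u, to wavenumbers:
# nu = sqrt(lambda) / (2*pi*c), evaluated in SI units.
MDYN_PER_ANGSTROM = 100.0          # 1 mdyn/A expressed in N/m
AMU = 1.66053907e-27               # atomic mass unit in kg
C_CM_PER_S = 2.99792458e10         # velocity of light in cm/s
TO_WAVENUMBER = math.sqrt(MDYN_PER_ANGSTROM / AMU) / (2.0 * math.pi * C_CM_PER_S)


def g_matrix(B, masses):
    """G = B M^-1 B~ [Eq. (37)] for a B matrix given as a list of rows."""
    n = len(B)
    return [[sum(B[i][k] * B[j][k] / masses[k] for k in range(len(masses)))
             for j in range(n)] for i in range(n)]


def matmul(A, Bm):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * Bm[k][j] for k in range(len(Bm)))
             for j in range(len(Bm[0]))] for i in range(len(A))]


def eigenvalues_2x2(A):
    """Eigenvalues of a real 2x2 matrix whose spectrum is real (as for GF)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return [(tr + disc) / 2.0, (tr - disc) / 2.0]


# Hypothetical inputs: atoms O1, C, O2 on a line (one Cartesian coordinate
# each), masses in u, and illustrative force constants in mdyn/A.
masses = [15.995, 12.000, 15.995]
B = [[-1.0, 1.0, 0.0],    # r1 = x(C)  - x(O1)
     [0.0, -1.0, 1.0]]    # r2 = x(O2) - x(C)
f_stretch, f_interaction = 16.0, 1.3
F = [[f_stretch, f_interaction],
     [f_interaction, f_stretch]]

G = g_matrix(B, masses)
lambdas = eigenvalues_2x2(matmul(G, F))
frequencies = sorted(TO_WAVENUMBER * math.sqrt(lam) for lam in lambdas)
print([round(nu, 1) for nu in frequencies])
```

For this symmetric case the two roots can be checked by hand, since the in-phase and out-of-phase stretch combinations diagonalize G and F simultaneously; a general molecule requires a full numerical diagonalization.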
2. N-Methylacetamide
We illustrate the results of a normal-mode calculation on a small molecule by discussing the normal vibrations of N-methylacetamide (NMA), the simplest molecule containing a trans peptide group analogous to that in a polypeptide chain. A study of this molecule provides insights into the general nature of the so-called amide modes of the peptide group. Although a number of normal-mode analyses have been done on NMA (Miyazawa et al., 1958; Miyazawa, 1962; Jakeš and Schneider, 1968; Jakeš and Krimm, 1971a; Rey-Lafon et al., 1973), we discuss here a calculation (T. C. Cheam, personal communication, 1985) on a molecule of "standard geometry" (see below) using a polyglycine I [(Gly)ₙ I] force field (Dwivedi and Krimm, 1982a) [with f(NH ob, CN t) = -0.05]. The model used is shown in Fig. 2, where we have replaced the CH3 group by an equivalent point mass, an assumption that should not seriously affect the main conclusions concerning the nature of the amide modes. The local symmetry coordinates of the peptide group in terms of internal-displacement coordinates (Δr = bond stretch, Δθ = in-plane angle bend, Δω = out-of-plane angle bend, Δτ = dihedral angle change)
TABLE I
Local Symmetry Coordinates of the Peptide Group (a)

S1  = r(CC)                                                CC stretch (CC s)
S2  = r[(C2)N]                                             CN stretch (CN s)
S3  = r[N(C4)]                                             NC stretch (NC s)
S4  = r(CO)                                                CO stretch (CO s)
S5  = r(NH)                                                NH stretch (NH s)
S6  = [2θ(CCN) - θ(CCO) - θ(NCO)]/√6                       CCN deformation (CCN d)
S7  = [θ(CCO) - θ(NCO)]/√2                                 CO in-plane bend (CO ib)
S8  = [2θ(CNC) - θ[(C2)NH] - θ[(C4)NH]]/√6                 CNC deformation (CNC d)
S9  = [θ[(C2)NH] - θ[(C4)NH]]/√2                           NH in-plane bend (NH ib)
S10 = ω(CO) sin(CCN)                                       CO out-of-plane bend (b) (CO ob)
S11 = ω(NH) sin(CNC)                                       NH out-of-plane bend (c) (NH ob)
S12 = [τ(CCNC) + τ(CCNH) + τ(OCNC) + τ(OCNH)]/4            CN torsion (CN t)

(a) Atoms numbered as in Fig. 2.
(b) Positive: C moves in +Z.
(c) Positive: N moves in -Z.
are given in Table I. The calculated frequencies and PEDs are given in Table II, together with observed frequencies, and the Cartesian eigenvectors are presented in Fig. 3 (T. C. Cheam, personal communication, 1985). As will be seen from Table II, the agreement between observed and calculated frequencies is not too good. The main reason for this, other than the CH3 point mass approximation, is that we used a force field refined for another system which, though similar, is not completely analogous. In particular, the (Gly)ₙ I force field was refined for hydrogen-bonded groups, which are in fact reflected in the observed NMA frequencies, whereas in the present example we have calculated the modes of an isolated NMA molecule. Nevertheless, the qualitative features of the normal modes, as given in Table II and Fig. 3, should be preserved.

TABLE II
Observed and Calculated Frequencies of N-Methylacetamide

Mode                      ν_obs (a) (cm⁻¹)   ν_calc (cm⁻¹)   Potential energy distribution (b)
NH stretch                3236 S (c)         3254            NH s (100)
Amide I                   1653 S             1646            CO s (83), CN s (15), CCN d (11)
Amide II                  1567 S             1515            NH ib (49), CN s (33), CO ib (12), CC s (10), NC s (9)
Amide III                 1299 M             1269            NH ib (52), CC s (18), CN s (14), CO ib (11)
NC stretch                1096 W             1070            NC s (77), CC s (17)
CN stretch, CC stretch    881 W              908             CN s (31), CC s (17), CO s (16), CNC d (14), CCN d (10)
Amide V                   725 S              721             CN t (75), NH ob (38)
Amide IV                  627 W              637             CO ib (44), CC s (34), CNC d (11)
Amide VI                  600 M              655             CO ob (85), CN t (13)
CCN deformation           436 W              498             CCN d (63), CO ib (11), CN s (8), NC s (8)
CNC deformation           289                274             CNC d (71), CO ib (19), CCN d (13)
Amide VII                 206                226             NH ob (64), CN t (15), CO ob (12)

(a) Absorbances are described as: S, strong; M, medium; W, weak. Observed by Miyazawa et al. (1958).
(b) s, Stretch; d, deformation; t, torsion; ib, in-plane bend; ob, out-of-plane bend. All contributions ≥5 included.
(c) Unperturbed by Fermi resonance.

a. NH Stretch. The NH stretch mode (NH s), designated amide A, is completely localized in the NH group. It is usually found as part of a Fermi resonance doublet (see Section II,D,5,d), the other component of which is designated amide B and is observed in the 3100- to 3050-cm⁻¹ region. Although in NMA the resonance is with the overtone of amide II, in certain conformations of the polypeptide chain NH s is in resonance with a combination of amide II modes.

b. Amide I. The amide I mode is primarily a stretching of the CO bond, together with an out-of-phase CN s component and a small contribution from CCN deformation (CCN d). [Note that, even though the eigenvector reveals a small NH in-plane bend (NH ib) contribution, this does not show up in the PED because its value of 2 is below the arbitrary cutoff of 5 used in Table II.] As a result, the dipole moment derivative (∂μ/∂Q) has a direction that is found experimentally to be +15° to +25° from the CO bond direction in the plane of the peptide group (Bradbury and Elliott, 1963). [The positive direction is such as to rotate the moment from the CO bond direction to one closer to parallelism with the CN bond direction.] Experimental measurements on N,N'-diacetylhexamethylenediamine (Sandeman, 1955) and on silk fibroin (Suzuki, 1967) have given +17° and +19°, respectively, for this angle. The direction of the dipole moment derivative can be calculated by ab initio quantum mechanical methods (see Section VII,B,3), and it depends quite sensitively on the details of the eigenvector and therefore of the force field. In a comparison of various force fields (Cheam and Krimm, 1985), it was shown that the best prediction was given by that of Dwivedi and Krimm (1982a), which gives a value of +17°; that of Miyazawa et al. (1958) gives +9°, and that of Rey-Lafon et al. (1973) gives
SAMUEL KRIMM AND JAGDEESH BANDEKAR
FIG. 3. N-Methylacetamide normal vibrations. (a) NH stretch; (b) amide I; (c) amide II; (d) amide III; (e) amide IV; (f) amide V; (g) amide VI; (h) amide VII; (i) 1070; (j) 908; (k) 498; (l) 274.
VIBRATIONAL SPECTROSCOPY OF PEPTIDES AND PROTEINS
−19° (the latter resulting from the negligible contribution of CN s and the presence of a relatively large CO ib contribution). It should be noted that the small NH ib contribution results in a small downshift in the amide I frequency on N-deuteration (Miyazawa et al., 1958; Rey-Lafon et al., 1973).

c. Amide II. The amide II mode is an out-of-phase combination of largely NH ib and CN s with smaller PED contributions from CO ib, CC s, and NC s. The transition moment direction determined experimentally is +73° or −37° from CO (Bradbury and Elliott, 1963), the two values resulting from a sign indeterminacy. The result of about +73° from studies on N,N′-diacetylhexamethylenediamine (Sandeman, 1955) led to a favoring of the former value. The calculated transition moment direction is +69° (Cheam and Krimm, 1985), in excellent agreement with experiment and better than values of +58° and +52° predicted by other force fields. Because of the large NH ib contribution, N-deuteration has a major effect on this mode, converting it to a largely CN s mode and shifting it to ~1480 cm⁻¹ (Miyazawa et al., 1958; Rey-Lafon et al., 1973).

d. Amide III. The amide III mode is the in-phase combination of NH ib and CN s, with contributions from CC s and CO ib. Because of sign indeterminacy in the experimental transition moment direction (Bradbury and Elliott, 1963), values of +96° or +120° to the CO bond are possible. These are consistent with values of +137°, +118°, and +97° predicted by different force fields (Cheam and Krimm, 1985). All force fields predict the approximate cancellation of the NH ib and CN s contributions; they differ in assigning the main residual contribution to the dipole derivative, and thus its direction should be a very sensitive function of the force field.
Since there is a large contribution of NH ib to this mode, it is affected significantly by N-deuteration: The ND ib coordinate separates out (as it does in the case of amide II) into a relatively pure mode near 960 cm⁻¹, and the other coordinates become redistributed into other modes.

e. Skeletal Stretch. Stretching of the three skeletal bonds contributes to two fairly well-defined modes: NC s mixed with a small amount of out-of-phase CC s, observed at 1096 cm⁻¹, and mixed CN s and CC s with CO s, CNC d, and CCN d, observed at 881 cm⁻¹. In the polypeptide chain, we may expect to see these stretch contributions mixed with side-chain coordinates, perhaps giving complex modes.

f. Amide V. The amide V mode, observed at 725 cm⁻¹, is largely an NH out-of-plane bend (NH ob) motion with some CN torsion (CN t). The large CN t contribution to the PED is a result of the large value of the CN t force constant in the (Gly)n I force field. Calculations of the
various components (Cheam and Krimm, 1985) show that NH ob contributes about 5.5 times as much as CN t to the dipole moment derivative. Of course, since NH ob is such a large component of this mode, the frequency is very sensitive to N-deuteration: The band disappears and is replaced by a (mainly) ND ob mode at 510 cm⁻¹.

g. Other Amide Modes and Skeletal Deformation. The amide IV mode is mainly CO ib and CC s, with a small contribution from CNC d. The phase relationships of these components are such that they subtract, leading to a low intensity for this mode. The amide VI mode is mainly CO ob in terms of PED, but in fact most of its intensity derives from the NH ob component of the motion. Modes calculated at 498 and 274 cm⁻¹ involve deformation of the backbone skeleton, the former being primarily CCN d and the latter primarily CNC d. The amide VII mode is another strong mixture of NH ob and CN t, but like amide V its intensity derives primarily from NH ob.

We see from this example that a normal-mode analysis provides a detailed understanding of the vibrational spectrum of a molecule of known structure and force field. We will discuss later how this technique can serve to determine structure.
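The sensitivity of NH-dominated modes to N-deuteration noted above can be made plausible with a crude harmonic estimate. The sketch below is not the normal-mode calculation described in the text: it assumes, purely for illustration, that the mode behaves like an isolated N–H diatomic oscillator whose force constant is unchanged by isotopic substitution, so that the frequency scales with the inverse square root of the reduced mass. The function names are hypothetical.

```python
import math

# Atomic masses in unified atomic mass units (standard isotopic values).
MASS_H = 1.007825
MASS_D = 2.014102
MASS_N = 14.003074


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def diatomic_isotope_ratio():
    """nu(ND)/nu(NH) for an idealized N-H oscillator with a fixed force constant.

    Assumption: a pure diatomic model; real amide modes mix several coordinates.
    """
    return math.sqrt(reduced_mass(MASS_N, MASS_H) / reduced_mass(MASS_N, MASS_D))


def estimate_deuterated_frequency(nu_nh):
    """Scale an NH-dominated frequency (cm^-1) to its ND analog in the diatomic model."""
    return nu_nh * diatomic_isotope_ratio()


if __name__ == "__main__":
    nu_amide_v = 725.0  # observed amide V frequency of NMA quoted in the text, cm^-1
    print(f"nu(ND)/nu(NH) = {diatomic_isotope_ratio():.4f}")
    print(f"estimated ND mode = {estimate_deuterated_frequency(nu_amide_v):.0f} cm^-1")
```

In this idealized model the ratio is close to 0.73, which places the deuterated analog of the 725 cm⁻¹ band near 530 cm⁻¹. The observed value of 510 cm⁻¹ differs because the real mode also contains CN t, which is exactly the kind of mixing that only the full normal-mode analysis captures.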
3. Isolated Helical Molecule

Many polypeptide chain structures are helical, and it is therefore desirable to have specific methods for determining the normal modes of such molecules. While actual structures are finite in length, and could be treated as "small" molecules, the theory is most simply formulated for infinite helices (Higgs, 1953a,b). It is generally assumed that, except in obvious cases, end effects for real structures are small, and that selection rules for IR and Raman activity are dominated by those for the infinite helix, viz., that only those vibrational modes can exhibit activity in which equivalent motions in each crystallographic repeat unit of the helix are in phase (Higgs, 1953a). Many authors have developed treatments of this problem (Tadokoro, 1960; Miyazawa, 1961b; Miyazawa et al., 1963; Piseri and Zerbi, 1968; Small et al., 1970; Fanconi, 1972a).

Consider an ideally infinite helical chain whose crystallographic repeat contains N chemical repeat units, each with P atoms. A screw symmetry operation transforms one chemical unit into the next, with $\alpha$ being the rotation about the helix axis and d the translation along the axis. Let $r_i^n$ denote the ith internal displacement coordinate associated with the nth chemical repeat unit. The potential energy, by analogy with Eq. (3), is given by

$$2V = \sum_{n,n';\,i,k} F_{ik}^{nn'}\, r_i^n\, r_k^{n'} \qquad (38)$$

Since the chain is periodic, it follows that the force constants $F_{ik}^{nn'}$ ($= F_{ki}^{n'n}$) depend only on the difference $|n - n'| = m$, i.e.,

$$F_{ik}^{nn'} = F_{ik}^{m} \qquad (39)$$
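Substituting Eq. (39) into Eq. (38) gives

$$2V = \sum_{n;\,i,k} F_{ik}^{0}\, r_i^n\, r_k^n + \sum_{n;\,m \neq 0;\,i,k} \left( F_{ik}^{m}\, r_i^n\, r_k^{n+m} + F_{ik}^{-m}\, r_i^n\, r_k^{n-m} \right) \qquad (40)$$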
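The kinetic energy can similarly be written as

$$2T = \sum_{n;\,i,k} G_{ik}^{0}\, p_i^n\, p_k^n + \sum_{n;\,m \neq 0;\,i,k} \left( G_{ik}^{m}\, p_i^n\, p_k^{n+m} + G_{ik}^{-m}\, p_i^n\, p_k^{n-m} \right) \qquad (41)$$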
Applying Lagrange's equation to Eqs. (40) and (41) leads to an infinite number of second-order differential equations in $r_i^n$, with an infinite number of normal-mode frequencies. However, the assumption of normal modes for the helix is equivalent to assuming that all $r_i^{n+m}$ vary with the same frequency and with a phase factor that depends only on m, viz., that

$$r_i^{n+m} = A_i \exp[i(\lambda^{1/2} t + m\phi)] \qquad (42)$$

where $A_i$ is the amplitude of the motion, which is independent of n, and $\phi$ is the phase shift between two equivalent, adjacent, internal displacement coordinates. Such a form is required by the helical symmetry (Higgs, 1953a). Substitution of Eq. (42) into the above system of differential equations reduces these to a set of 3P simultaneous homogeneous linear equations in the unknowns $A_i$. This set has a nontrivial solution only if

$$|G(\phi)F(\phi) - \lambda(\phi)E| = 0 \qquad (43)$$

where

$$G(\phi) = \sum_{m=-\infty}^{\infty} G^{m} e^{im\phi} \qquad (44)$$

and

$$F(\phi) = \sum_{m=-\infty}^{\infty} F^{m} e^{im\phi} \qquad (45)$$

Equation (43) has 3P characteristic roots $\lambda$ ($= 4\pi^2 c^2 \nu^2$) for each value of the phase $\phi$. We have thus effectively reduced the calculation of the infinite helix to
that of one chemical repeat. The 3P functions $\nu_i(\phi)$ are known as the dispersion relation, and the shapes of the branches, i.e., the variations of the frequencies with the phase, depend on the coupling between neighboring chemical units through the G or F matrices. Since (Kittel, 1969) $\nu(\phi) = \nu(-\phi)$ and $\nu(\phi + 2\pi) = \nu(\phi)$, only the range $0 \le \phi \le \pi$, which corresponds to half of the first Brillouin zone (Brillouin, 1953), is needed. The true Brillouin zone of a helix is 1/N times the size of the above. The dispersion curves in this zone are obtained by folding at the zone boundary points, i.e., $\phi = m\pi/N$ ($0 \le m \le N - 1$). This results in four acoustic branches, i.e., branches in which $\nu \to 0$ as $\phi \to 0$, corresponding to three translations plus a rotation about the helix axis. Infrared-active normal modes occur for $\phi = 0$ and $\alpha$, and Raman-active normal modes for $\phi = 0$, $\alpha$, and $2\alpha$ (Higgs, 1953a). The density of vibrational states in the infinite set can be obtained by summing $d\phi/d\nu$ as a function of $\nu$ over the dispersion curves, i.e., determining how densely the phase angles are distributed in $\nu$ space.

The computations for a helical structure start by calculating the B matrix, which, through Eq. (37), is used to calculate the $G^m$ matrices of Eq. (44) (Small et al., 1970). The $G^0$ matrix in this equation is that for a single chemical repeat unit, except that some atoms outside this unit must be used to define the internal coordinates within the unit. In hydrogen-bonded helices, internal coordinates in one unit involve atoms in nonadjacent units, and it is therefore necessary to have coordinate transformations between different units of the helix. To calculate the B matrix we proceed as follows. The internal coordinates belonging to the nth chemical repeat unit, $r^n$, are related to the Cartesian coordinates of unit n + m, $x^{n+m}$, by

$$r^n = \sum_m B^{n,n+m}\, x^{n+m} \qquad (46)$$
Taking the helix axis as the z axis, the $x^{n+m}$ are related to the $(x')^{n+m}$ in the rotated basis as follows:

$$(x')^{n+m} = H^{n+m}(\alpha)\, x^{n+m} \qquad (47)$$

where (Goldstein, 1950)

$$H^{n+m}(\alpha) = \begin{bmatrix} \cos[(n+m)\alpha] & -\sin[(n+m)\alpha] & 0 \\ \sin[(n+m)\alpha] & \cos[(n+m)\alpha] & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (48)$$
is the transformation matrix for rotating the basis vectors. We have neglected pure translations since they do not affect internal coordinates. Substituting from Eq. (47) into Eq. (46) we get
Since $B^{n,n+m}H^{n+m}(\alpha)$ must be independent of n, we can write $B^{n,n+m}H^{n+m}(\alpha) = B^{0,m}H^{m}(\alpha)$. Then, to obtain B, we sum over all values of m:

$$B = \sum_m B^{0,m}\, H^{m}(\alpha) \qquad (50)$$
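As a hedged illustration of how a sum over the neighbor index m collapses an infinite chain to a problem the size of one repeat unit, the sketch below treats a toy chain with a single Cartesian coordinate per unit and nearest-neighbor springs. It is not a polypeptide calculation: the force-constant blocks, the spring constant `k`, the mass, and all function names are assumptions chosen so that the phase-dependent sum can be checked against the textbook result $\lambda(\phi) = (2k/m)(1 - \cos\phi)$.

```python
import cmath
import math


def phase_sum(blocks, phi):
    """Return sum_m X^m * exp(i*m*phi) for scalar blocks keyed by neighbor index m."""
    return sum(x * cmath.exp(1j * m * phi) for m, x in blocks.items())


def chain_eigenvalue(phi, k=1.0, mass=1.0):
    """Eigenvalue lambda(phi) of a toy chain with one Cartesian coordinate per unit.

    Assumption: nearest-neighbor springs of strength k, so the hypothetical
    force-constant blocks are F^0 = 2k and F^(+1) = F^(-1) = -k.
    """
    f_blocks = {0: 2.0 * k, 1: -k, -1: -k}
    f_phi = phase_sum(f_blocks, phi)
    # Because F^m equals F^-m here, the imaginary parts cancel up to rounding.
    return f_phi.real / mass


def dispersion(n_points=5, k=1.0, mass=1.0):
    """Sample lambda(phi) on an even grid over 0 <= phi <= pi."""
    phis = [math.pi * j / (n_points - 1) for j in range(n_points)]
    return [(phi, chain_eigenvalue(phi, k, mass)) for phi in phis]


if __name__ == "__main__":
    for phi, lam in dispersion():
        print(f"phi = {phi:.3f}  lambda = {lam:.3f}")
```

The single branch of this toy chain is acoustic: its eigenvalue goes to zero as the phase goes to zero, and it is even in the phase, which is why only the range from 0 to π needs to be sampled.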
It is through this relation that the dimensionality of the infinite helix is reduced to that of a chemical repeat unit. If the calculation is being done in the internal coordinate basis, the B matrices are used, through Eq. (37), to calculate the $G^m$ matrices of Eq. (44). If the elements of $F(\phi)$ are known, viz., the force constants in internal coordinates within one chemical unit, $F^0$, and any force constants that span units m apart, $F^m$, then Eq. (43) can be solved for any value of $\phi$: 0, $\alpha$, and $2\alpha$ for the IR- and Raman-active modes, and for a range of values between 0 and $\pi$ in order to obtain the dispersion relation. The fact that $G(\phi)$ and $F(\phi)$ are complex is no problem: Transformations exist that will convert the complex matrices into equivalent real ones (Miyazawa et al., 1963). If Cartesian coordinates are used, then $B(\phi)$ matrices, corresponding to Eq. (50) multiplied by $e^{im\phi}$, are used to calculate the $F_x(\phi)$ [Eq. (30)] and then secular equation (29) is solved.

It should be noted that there are advantages in working in Cartesian coordinates, even though the force constants are not as meaningful as in the internal-coordinate basis. First, it is much easier to include nonbonded interactions, which otherwise must be included as separate internal coordinates for each interaction. Second, whereas the $G(\phi)F(\phi)$ matrix in Eq. (43) is in general not symmetric, the $M^{-1}F_x(\phi)$ matrix of Eq. (29) is, and it is usually easier to diagonalize a symmetric matrix.

4. Molecular Crystal

The treatment thus far has been for isolated molecules, either small or one-dimensional infinite helices. In some instances crystalline intermolecular interactions are important, and it is therefore necessary to be able to compute the normal modes of a molecular crystal.
The general theory of crystal dynamics has been presented by Born and Huang (1954), and the theory of molecular vibrations in solids has been discussed by a number of authors (Fanconi, 1972a,b; Zak, 1975; Neto et al., 1976; Decius and Hexter, 1977; Califano, 1977; Schrader, 1978). The formalism developed in the preceding section can be modified to apply to a molecular crystal. The starting point is the asymmetric unit in a unit cell. Let $x^n$ denote the set of Cartesian coordinates of all atoms in the reference asymmetric unit. Then $x^n$ includes all atoms required to define all the internal coordinates in the asymmetric unit. Using Eq. (7), we have

$$r^n = \sum_m B^{n,n+m}\, x^{n+m} \qquad (51)$$
where $r^n$ refers to the internal coordinates of the asymmetric unit n. The $x^{n+m}$ are related to the $(x')^{n+m}$ in the rotated basis as follows:

$$(x')^{n+m} = H^{n+m}(\alpha)\, x^{n+m} = C^{\alpha}_{n+m}\, x^n \qquad (52)$$

where

$$H^{n+m}(\alpha) = \begin{bmatrix} \cos[(n+m)\alpha] & \sin[(n+m)\alpha] & 0 \\ -\sin[(n+m)\alpha] & \cos[(n+m)\alpha] & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (53)$$
and $C^{\alpha}_{n+m}$ denotes the character of the character table corresponding to the symmetry operation for the $\alpha$ species (see Section II,E,2). From Eq. (52), since the H matrices are orthogonal,

$$x^{n+m} = C^{\alpha}_{n+m}\, \tilde{H}^{n+m}(\alpha)\, x^n \qquad (54)$$

Substituting in Eq. (51) for $x^{n+m}$ gives

$$r^n = \sum_m C^{\alpha}_{n+m}\, B^{n,n+m}\, \tilde{H}^{n+m}(\alpha)\, x^n \qquad (55)$$
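[Note that the B matrix elements in Eqs. (55) and (51) refer to the same basis; in Eqs. (46) and (49) they refer to different bases.] This implies that the B matrix elements for a given species, $\Gamma^{\alpha}$, are

$$B(\Gamma^{\alpha}) = \sum_m C^{\alpha}_{n+m}\, B^{n,n+m}\, \tilde{H}^{n+m}(\alpha) \qquad (56)$$

or, since n is a dummy variable,

$$B(\Gamma^{\alpha}) = \sum_m C^{\alpha}_{m}\, B^{0,m}\, \tilde{H}^{m}(\alpha) \qquad (57)$$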
Once the B matrix elements are computed for all values of m, those belonging to a given species are computed using Eq. (57). As mentioned before, when working in internal coordinates, it is necessary to compute the corresponding $G(\Gamma^{\alpha})$ matrix elements. From Eq. (11), the potential energy may be written as

where the $F^{nm}$ are the force constant matrix elements in internal coordinates. For the $\Gamma^{\alpha}$ species,

$$r^m = C^{\alpha}_m\, r^n \qquad (59)$$

so that

which implies that

$$F(\Gamma^{\alpha}) = \sum_m C^{\alpha}_m\, F^{m} \qquad (61)$$
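The character-weighted sum of Eq. (61) can be illustrated with the smallest possible case. The sketch below assumes a hypothetical cell containing two equivalent one-coordinate units related by a single exchange operation, so there are two species with characters (+1, +1) and (+1, −1). The numerical force constants and all names are illustrative assumptions, not values from the force field discussed here. The check is that the species-reduced constants reproduce the eigenvalues of the full 2 × 2 force-constant matrix.

```python
import math

# Characters of (identity, exchange) for the two species of a two-unit cell.
# The labels "A" and "B" are a hypothetical choice for this toy example.
CHARACTERS = {"A": (1, 1), "B": (1, -1)}


def species_force_constant(species, f_blocks):
    """Character-weighted sum over m of F^m, in the spirit of Eq. (61).

    f_blocks = (F^0, F^1): the intra-unit constant and the inter-unit coupling.
    """
    return sum(c * f for c, f in zip(CHARACTERS[species], f_blocks))


def full_matrix_eigenvalues(f0, f1):
    """Eigenvalues of [[f0, f1], [f1, f0]] from its characteristic polynomial."""
    trace = 2.0 * f0
    det = f0 * f0 - f1 * f1
    disc = math.sqrt(trace * trace - 4.0 * det)
    return ((trace - disc) / 2.0, (trace + disc) / 2.0)


if __name__ == "__main__":
    f0, f1 = 10.0, -0.5  # hypothetical diagonal and coupling force constants
    reduced = sorted(species_force_constant(s, (f0, f1)) for s in CHARACTERS)
    print("symmetry-reduced:", reduced)
    print("full matrix:     ", full_matrix_eigenvalues(f0, f1))
```

Each species yields a problem of the dimension of one unit, and together they account for every eigenvalue of the full cell, which is the sense in which the dimensionality is reduced to that of the asymmetric unit.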
Once the B (or G) and F matrix elements are known for a given symmetry species, the solution of the secular equation leads to the desired eigenvalues and eigenvectors. It should again be noted that the dimensionality of the B or F matrices is reduced to that of an asymmetric unit in the unit cell.

The preceding discussions have established the theoretical basis for computing the normal modes of vibration of a polypeptide chain molecule. Throughout the discussion we have assumed knowledge of the structure of the molecule and of its vibrational potential energy function. It is now necessary to examine these two kinds of inputs, and in particular to understand how we can obtain a polypeptide force field that might serve to predict the vibrational frequencies of an arbitrary chain conformation.

C. Polypeptide Chain Geometry and Coordinates
1. Standard Geometry

The secondary structure of a polypeptide chain is determined by its set of dihedral angles $\phi_i$ (rotation about the $N_i$–$C_i^{\alpha}$ bond) and $\psi_i$ (rotation about the $C_i^{\alpha}$–$C_i$ bond), once it is assumed that the peptide group is planar and has a given geometry (Pauling et al., 1951). Although bond lengths and angles in the peptide groups of various molecules may vary slightly, it is reasonable to assume standard bond lengths and angles for this group, and we have done so in our calculations. This geometry (Corey and Pauling, 1953) is given in Table III. It is less obvious, however, that the peptide group will always be planar, since relatively little energy is required for a small out-of-plane twist (Winkler and Dunitz, 1971). In fact, X-ray crystal structure analyses of peptides and proteins have shown that departures from planarity of 5–10° are not uncommon. In our refinement of a force field for the polypeptide chain we have
TABLE III
Standard Geometry of the Peptide Group

Bond lengths (Å):
  l(Cα–C) = 1.53     l(C=O) = 1.24
  l(C–N) = 1.32      l(Cα–H) = 1.07
  l(N–Cα) = 1.47     l(N–H) = 1.00

Bond angles (degrees):
  CαCN = 114.0
  CαCO = 121.0
  CNH = 123.0
  CNCα = 123.0
  NCαC = NCαH = CCαH = tetrahedral
taken the peptide group as planar, since this is applicable to NMA and to the (Gly)n I, β-poly(L-alanine) [β-(Ala)n], and α-poly(L-alanine) [α-(Ala)n] polypeptide structures used in the refinement. Despite the slight variability in other structures, we have also generally maintained the assumption of planarity in our calculations. This is not expected to lead to serious problems in the computed frequencies. In terms of the discussion in Section II,B, we therefore seek a set of $F_{ij}$ with maximum transferability between different structures, recognizing that some changes will be necessary because of differences in hydrogen-bonding geometry and that small changes due to conformational differences can occur. We expect that the largest differences in the $\lambda_i$ will arise from the significant changes in G (or in $F_x$) brought about by the dependence of B on the three-dimensional geometry.
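To show how the standard values of Table III fix the shape of a planar peptide unit, the following sketch places the four backbone atoms Cα, C, N, and Cα of a trans peptide group in a plane and evaluates the distance between the two α-carbons. The construction (walking along the bonds and turning by the supplement of each bond angle) and the function name are illustrative assumptions; only the bond lengths and angles come from the table.

```python
import math

# Standard peptide geometry from Table III (lengths in angstroms, angles in degrees).
BOND_CA_C = 1.53
BOND_C_N = 1.32
BOND_N_CA = 1.47
ANGLE_CA_C_N = 114.0
ANGLE_C_N_CA = 123.0


def trans_ca_ca_distance():
    """Distance between successive alpha-carbons across a planar trans peptide group.

    The atoms are placed in the xy plane by walking along the bonds; the chain
    turns one way at C and the opposite way at N, which gives the trans zigzag.
    """
    ca1 = (0.0, 0.0)
    c = (BOND_CA_C, 0.0)

    # Turn at C: the bond direction changes by the supplement of the bond angle.
    heading = math.radians(180.0 - ANGLE_CA_C_N)
    n = (c[0] + BOND_C_N * math.cos(heading), c[1] + BOND_C_N * math.sin(heading))

    # Turn at N in the opposite sense (trans arrangement).
    heading -= math.radians(180.0 - ANGLE_C_N_CA)
    ca2 = (n[0] + BOND_N_CA * math.cos(heading), n[1] + BOND_N_CA * math.sin(heading))

    return math.hypot(ca2[0] - ca1[0], ca2[1] - ca1[1])


if __name__ == "__main__":
    print(f"Calpha...Calpha distance = {trans_ca_ca_distance():.2f} angstroms")
```

With these inputs the distance comes out close to 3.8 Å. Because the planar unit is rigid under the stated assumptions, all conformational freedom of the backbone resides in the dihedral angles at the α-carbons.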
2. Internal and Symmetry Coordinates

The eigenvectors of polypeptide chain modes, as in the case of NMA, can be described by PEDs in terms of symmetry coordinates, which in turn are related to internal coordinates. A list of the internal coordinates for (Ala)n is given in Table IV, and the local symmetry coordinates are given in Table V (Moore and Krimm, 1976b). These serve as the general local symmetry coordinates for most polypeptide chain structures [for the particular set for (Gly)n I, see Dwivedi and Krimm (1982a)].

D. Polypeptide Force Field

The development of a force field suitable for the polypeptide chain involves several steps. First, as we will see below, it is necessary to select a physically appropriate form for the potential, V, both for the intramolecular, $V_{intra}$, and the intermolecular, $V_{inter}$, parts. Second, we need a formalism for producing an acceptable set of force constants from the observed data; this is usually done by a least-squares procedure. Third, the molecules used, and their sequence, in the refinement process
TABLE IV
Internal Coordinates for Poly(L-alanine)

R1 = Δr(Cα–C)
R2 = Δr(C–N)
R3 = Δr(N–Cα)
R4 = Δr(C=O)
R5 = Δr(N–H)
R6 = Δr(Cα–Hα)
R7 = Δr(Cβ–H)
R8 = Δr(Cβ–H)
R9 = Δr(Cβ–H)
R10 = Δr(Cα–Cβ)
R11 = Δr(H···O)
R12 = Δr(Hα···Hα)
R13 = Δθ(Cα–C–N)
R14 = Δθ(C–N–Cα)
R15 = Δθ(N–Cα–C)
R16 = Δθ(Cα–C=O)
R17 = Δθ(N–C=O)
R18 = Δθ(C–N–H)
R19 = Δθ(Cα–N–H)
R20 = Δθ(N–Cα–Hα)
should be such as to lead to a properly convergent (if not unique) solution. We discuss these matters below, and present our current force field for a polypeptide chain with trans peptide groups. [Preliminary studies relevant to the cis peptide group have also been completed (Cheam and Krimm, 1984a,b).]

1. Intramolecular Potential Functions

No analytical form is known by which to express the potential energy of a molecule. It is therefore convenient to expand the potential in a Taylor series:

$$V = V_0 + \sum_i \left(\frac{\partial V}{\partial r_i}\right)_0 r_i + \frac{1}{2}\sum_{i,j} \left(\frac{\partial^2 V}{\partial r_i\,\partial r_j}\right)_0 r_i\, r_j + \cdots \qquad (62)$$

and to neglect, in the harmonic approximation, terms higher than quadratic. For convenience we set $V_0 = 0$, and, if the internal coordinates are independent and the molecule is at equilibrium, the second term vanishes since all $F_i = (\partial V/\partial r_i)_0 = 0$. Thus,

$$2V = \sum_{i,j} F_{ij}\, r_i\, r_j \qquad (63)$$
TABLE V
Symmetry Coordinates for Poly(L-alanine)

N–Cα stretch
Cα–C stretch
C–N stretch
C=O stretch
N–H stretch
Cα–Cβ stretch
Cα–Hα stretch
CH3 symmetric stretch
CH3 asymmetric stretch 1
CH3 asymmetric stretch 2
Cα–C–N deformation
C=O in-plane bend
C–N–Cα deformation
N–H in-plane bend
N–Cα–C deformation
Cβ bend 1
Cβ bend 2
Hα bend 1
Hα bend 2
CH3 asymmetric bend 1
CH3 asymmetric bend 2
CH3 rock 1
CH3 rock 2
CH3 symmetric bend
C=O out-of-plane bend
N–H out-of-plane bend
N–Cα torsion
Cα–C torsion
C–N torsion
Cα–Cβ torsion
C=O···H in-plane bend
N–H···O in-plane bend
H···O stretch
Hα···Hα stretch
C=O torsion
N–H torsion
where the

$$F_{ij} = \left(\frac{\partial^2 V}{\partial r_i\,\partial r_j}\right)_0 \qquad (64)$$

are the harmonic force constants. Rarely can a complete harmonic force field be determined from the data, since there are in general only 3N − 6 vibrational frequencies but (3N − 6) + ½[(3N − 6)² − (3N − 6)] = ½(3N − 6)(3N − 5) values of the $F_{ij}$. It is therefore necessary to restrict the force field to a manageable number of $F_{ij}$, and this is usually done by assuming what is considered to be a physically reasonable model. Many such model force fields have been discussed in the literature. [For general discussions, see Herzberg (1945), Wilson et al. (1955), Woodward (1972), and Califano (1976). For discussions of the Urey-Bradley force field, see the review by Duncan (1975). For discussions of the entirely different consistent force field approach, see Lifson and Warshel (1968), Warshel et al. (1970), and Burkert and Allinger (1982).] We have chosen to use a simplified general valence force field (SGVFF), which has been defined as one "which contains the minimum possible number of interaction constants compatible with a good fit of the spectra" (Califano, 1976). Such a force field has been demonstrated to be very effective for hydrocarbons (Schachtschneider and Snyder, 1963). For this form the potential energy of Eq. (63) is written explicitly as
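$$\cdots + 2\sum_{r,\theta} F_{r\theta}\,\Delta r\,\Delta\theta + 2\sum_{\theta,\omega} F_{\theta\omega}\,\Delta\theta\,\Delta\omega + 2\sum_{\omega,r} F_{\omega r}\,\Delta\omega\,\Delta r \qquad (65)$$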
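The counting argument that motivates a restricted force field (3N − 6 frequencies against ½(3N − 6)(3N − 5) general harmonic force constants) is easy to tabulate. The sketch below is only an arithmetic illustration; the function names are hypothetical, and the choice of 12 atoms corresponds to N-methylacetamide with every hydrogen counted explicitly, which is an assumption about how the molecule is modeled.

```python
def count_frequencies(n_atoms):
    """Number of vibrational frequencies of a nonlinear molecule with n_atoms atoms."""
    return 3 * n_atoms - 6


def count_general_force_constants(n_atoms):
    """Independent constants in a complete harmonic field: diagonal plus off-diagonal."""
    n = count_frequencies(n_atoms)
    diagonal = n
    off_diagonal = (n * n - n) // 2
    return diagonal + off_diagonal


if __name__ == "__main__":
    for n_atoms in (3, 12):  # 12 atoms: NMA with all hydrogens explicit (assumption)
        print(n_atoms, count_frequencies(n_atoms), count_general_force_constants(n_atoms))
```

For 12 atoms this gives 30 frequencies against 465 possible force constants, which makes plain why model assumptions, transferability, and isotopic data are all needed to reach an overdetermined problem.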
In Eq. (65) it is to be understood that the terms in the first line include cross-terms between similar coordinates, i.e., $\Delta r_i\,\Delta r_j$, $\Delta\theta_i\,\Delta\theta_j$, etc., although these are assumed to be physically significant only for i close to j. A similar assumption is made for the cross-terms in the second line of Eq. (65). In both cases, experience with such force fields provides a guide as to which off-diagonal terms to include: terms in $\Delta r\,\Delta\omega$ have been found to have a minor effect on the frequencies, and terms for which the Jacobian elements (see Section II,D,3) for all modes are small are usually negligible. The units for the force constants are $F_r$, mdyn/Å; $F_{r\theta}$, mdyn; all others, mdyn Å.

Even with the reduction in number represented by Eq. (65), there still are not enough observable vibrational data for a large molecule to permit a determination of the force constants. Additional assumptions are necessary, and two in particular are very useful. First, we assume the essential transferability of comparable force constants in different molecules containing the same groups. Thus, peptide group force constants for NMA should serve as a satisfactory starting point to describe the force field for this group in the polypeptide chain (although of course differences in hydrogen bonding have to be taken into account). Second,
we assume that when isotopic changes are made, as when NH is replaced by ND, the force field is unchanged. While this is true in the harmonic approximation, molecular vibrations are in fact anharmonic. When such anharmonicities are large, as in the case of the NH s mode, we find that calculated frequencies for isotopic derivatives deviate significantly from those observed. However, nonstretch modes are affected much less, and thus both of the above assumptions result in the addition of more independent data without changing the number of $F_{ij}$. In favorable cases, such as the hydrocarbons, there can be up to 10 times as many frequencies as force constants, thus significantly overdetermining the latter; in the polypeptide case this number is closer to three.

The consistent force field (CFF) represents a different approach to increasing the ratio of observables to parameters. In this method, the total potential V is parameterized to a range of properties of a set of molecules, including known equilibrium structures and energies. This approach leads to a potential function that can be used for energy minimization and molecular dynamics calculations (Lifson and Warshel, 1968; Lifson et al., 1979; Brooks et al., 1983; Levitt, 1983). However, such functions have not led to good reproduction of frequencies, perhaps because frequencies have not been given great weight in the parameterization. Although improvements have been made (Lifson and Stern, 1982), these still do not provide satisfactory frequency agreement, and we have therefore used a refined vibrational force field for the determination of structure from vibrational spectra.

2. Intermolecular Potential Functions

In analyzing the vibrational spectra of polypeptides, it is important to include certain intermolecular contributions to $V_{inter}$. We discuss here three of these contributions, two of which, hydrogen bonding and transition dipole coupling, have played a very important role in the development of our force field.

a. Nonbonded Interactions. Nonbonded atom-atom interactions may be included explicitly (as in the Urey-Bradley force field) or implicitly (as in the SGVFF) in the intramolecular force field. However, in intermolecular potentials such interactions must always be explicitly included. The functional form of such a nonbonded potential, $V_{nb}$, has been discussed by a number of authors (Kitaigorodsky, 1961, 1973, 1978; Williams, 1981). It is composed of attractive, $V_{att}$, and repulsive, $V_{rep}$, parts:

$$V_{nb} = V_{att} + V_{rep} \qquad (66)$$
where $V_{att}$ is due to the long-range dispersion energy of interaction between two neutral atoms a distance R apart, and has the form

$$V_{att} = -A/R^6 \qquad (67)$$
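As a small numerical illustration of how the dispersion attraction of Eq. (67) combines with a short-range repulsion, the sketch below adds an $R^{-12}$ repulsive term (the Lennard-Jones choice discussed in this section) and locates the resulting minimum analytically. The parameter values A = B = 1 are arbitrary placeholders in arbitrary units, not fitted atom-atom constants, and the function names are hypothetical.

```python
def lennard_jones(r, a, b):
    """Nonbonded energy with an R^-6 attraction and an R^-12 repulsion."""
    return -a / r**6 + b / r**12


def minimum_distance(a, b):
    """Separation at which dV/dR = 0, i.e. R^6 = 2B/A."""
    return (2.0 * b / a) ** (1.0 / 6.0)


def well_depth(a, b):
    """Depth of the minimum, A^2 / (4B), obtained by substituting R^6 = 2B/A."""
    return a * a / (4.0 * b)


if __name__ == "__main__":
    a, b = 1.0, 1.0  # placeholder constants in arbitrary units (assumption)
    r_min = minimum_distance(a, b)
    print(f"R_min = {r_min:.4f}, V(R_min) = {lennard_jones(r_min, a, b):.4f}")
```

The single extra parameter B therefore fixes both the equilibrium contact distance and the well depth once A is known, which is why fitting to known crystal structures is a natural way to determine it.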
The constant A can be calculated (Slater and Kirkwood, 1931), and is given by

$$A = \frac{3eh}{4\pi m^{1/2}}\; \frac{\alpha_1 \alpha_2}{(\alpha_1/N_1)^{1/2} + (\alpha_2/N_2)^{1/2}} \qquad (68)$$

where $\alpha_1$ and $\alpha_2$ are the polarizabilities, $N_1$ and $N_2$ are the effective number of polarizable electrons on atoms 1 and 2, respectively, and e, m, and h are the familiar physical constants. In distinction to $V_{att}$, no single analytical form has been used for the short-range exchange repulsive forces. In the more commonly used Lennard-Jones potential,

$$V_{rep} = B/R^{12} \qquad (69)$$

and only one additional parameter is introduced. The Buckingham potential,

$$V_{rep} = B \exp(-CR) \qquad (70)$$

is known to be more correct from quantum-mechanical calculations, but it introduces two additional parameters.

A typical procedure for determining the parameters is to assume a form for the potential function (the Lennard-Jones function is most commonly used) and to calculate the constants by requiring that the functions reproduce a set of known crystal structures. Various authors have used this procedure to obtain nonbonded potentials suitable for atom-atom interactions within and between polypeptide chains (Brant and Flory, 1965a,b; Ramachandran and Sasisekharan, 1968; Scheraga, 1968; Lifson et al., 1979; Hagler et al., 1979a,b). These potentials have been quite satisfactory in predicting other structures, but they have not been uniformly successful in the calculation of intermolecular vibrations (Cheam and Krimm, 1984b).

Nonbonded interactions have generally not been included in our force field, except where an unusually close contact is present, such as Hα···Hα in (Gly)n I (Moore and Krimm, 1976a). The main reason is that they ordinarily have a small influence on the medium-range frequencies, and in particular no effect on amide I or amide II splittings (V. Naik and S. Krimm, unpublished results, 1985). When obviously needed, as in (Gly)n I, we have introduced a force constant derived from an intermolecular potential for the specific interaction involved. Intermolecular
nonbonded terms probably should be incorporated to give a better description of low-frequency vibrations.

b. Hydrogen Bonding. The pervasive role of N–H···O=C hydrogen bonds in determining the secondary and tertiary structures of proteins needs no emphasis, certainly not since Pauling et al. (1951) noted the energy cost of ~3 kcal/mol for not maximizing hydrogen-bond formation. The geometrical properties of such bonds are very well characterized (Schuster et al., 1976; Taylor et al., 1983, 1984a,b), as are their properties in proteins (Baker and Hubbard, 1984). It is clear that these interactions, which can be intramolecular as well as intermolecular, must be explicitly incorporated in a vibrational force field.

There are two main ways of handling the hydrogen-bond contribution: through an analytical potential function or with parameterized force constants. We have chosen to use the latter method, but we will comment on the former as well.

Analytical functions for the hydrogen-bond potential, $V_{hb}$, seek to describe the energy of the X–H···Y–Z interaction as a function of the geometrical parameters of this group of atoms. One of the first successful functions, designed to account for structural and spectroscopic properties of a linear X–H···Y bond, was the Lippincott-Schroeder function (Lippincott and Schroeder, 1955). This potential represents the X–H and H···Y interactions by functions of a covalent bond type, and adds terms representing the X···Y attractions and repulsions. Subsequent modifications (Schroeder and Lippincott, 1957; Chidambaram et al., 1970; Balasubramanian et al., 1970) allowed for nonlinear bonds. In order to avoid such complex descriptions, it can be argued that the hydrogen bond is fundamentally no different from other intermolecular interactions, and should therefore be describable by functions similar to those used for the nonbonded potentials (except, of course, that the constants will be different). A number of authors have taken this approach (Lifson et al., 1979; Brooks et al., 1983; Sippl et al., 1984). A variety of other hydrogen-bond potentials has also been given.

In our force field the hydrogen bond is treated parametrically in terms of a local valence force field, with the contributing force constants being H···O s, N–H···O ib, and C=O···H ib. This approach cannot provide the degree of transferability that would be possible with an analytic function, but it is simpler to use and should not seriously affect the mid-range frequencies. In addition, this representation behaves in a physically reasonable way, in that we find that the H···O s force constant increases as the H···O distance decreases, i.e., as the hydrogen bond becomes stronger. (This is also accompanied by a decrease in the NH s force constant.) We believe that for purposes of studying the
influence of backbone conformation on the normal-mode frequencies, except perhaps in the low-frequency region, this is a satisfactory approximation.

A comment is in order about the possibility of C–H···O=C hydrogen bonds in polypeptide structures. Such a bond was suggested in the structure of polyproline II between a proline CH2 group and the C=O group on an adjacent chain (Sasisekharan, 1959), but the attractive nature of this interaction was subsequently questioned (Arnott and Dover, 1968). In a revision of the structure of collagen (Ramachandran and Sasisekharan, 1965), such a bond was proposed between a backbone (glycine) CH2 group and the C=O on an adjacent chain. A similar interaction was also suggested for (Gly)n II (Ramachandran et al., 1966a, 1967). Strong arguments were not possible in these cases because of the poor quality of the X-ray diffraction patterns of these polypeptides. However, spectroscopic evidence supported the presence of C–H···O hydrogen bonds in (Gly)n II (Krimm et al., 1967), as it has in a wide variety of other molecules (Green, 1974).

Evidence for C–H···O hydrogen bonds has also been sought from the structures of small molecules. From an early survey of X-ray crystal structures (Sutor, 1963) it was concluded that a number of short C–H···O contacts could be ascribed to hydrogen bonds, but a similar consideration of the available data challenged this interpretation (Donohue, 1968). In an in-depth analysis of 113 neutron-diffraction crystal structures (Taylor and Kennard, 1982), however, it was concluded that the majority of the surveyed short contacts "are attractive interactions, which can reasonably be described as hydrogen bonds." It was also found that the shortest such contacts were particularly likely when the C–H was adjacent to a neutral or positively charged N atom.
Although the N atom of the peptide group has a net negative charge, a quantum-mechanical population analysis shows that the H$^\alpha$ atom between two peptide groups can have a net charge of +0.2 e (Hagler and Lapiccirella, 1976). This would favor an attractive interaction between this atom and the negatively charged O atom of the C=O group. As will be seen below, a detailed normal-mode analysis of crystalline (Gly)$_n$II (Dwivedi and Krimm, 1982c) provides strong evidence for the presence of C-H···O hydrogen bonds in this structure. It therefore seems likely that such hydrogen bonds can form under certain favorable conditions.

c. Transition Dipole Coupling. We noted in the early part of this section that force-field models are necessary in order to reduce the number of force constant parameters. In the case of polypeptides, it was only natural to try to limit the SGVFF description to the usual internal coordinates of a molecule plus hydrogen-bonding (and possibly nonbonded) interactions between molecules. Careful normal-mode analyses, however, have shown that these potential energy terms are not sufficient to account for certain spectroscopic details, and that another kind of interaction, which involves resonance transfer of excitation and is in general expected to be present (Hexter, 1960), is of particular relevance for polypeptide systems. The identification of this interaction has had significant consequences in enhancing the power of vibrational spectroscopy as a tool for determination of three-dimensional structure.

The main spectroscopic observation that required explanation was the ~60-cm$^{-1}$ splitting in the infrared-active amide I (mainly CO s) modes of antiparallel-chain pleated sheet (APPS) polypeptides. Miyazawa (1960a) proposed that such splittings must be a consequence of the interactions between similar oscillators within the repeat unit of the structure, namely, the four peptide groups in the present case. He showed by a perturbation treatment that the frequencies for the four possible coupled modes would depend on the relative phases of the vibrations and the magnitudes of the interactions between peptide groups according to the relation

$$\nu(\delta,\delta') = \nu_0 + D_{10}\cos\delta + D_{01}\cos\delta' \qquad (71)$$

where $\delta$ and $\delta'$ are the phase angles (0 or $\pi$) between adjacent groups in the same chain and in the neighboring chain connected by a hydrogen bond, respectively, $D_{10}$ and $D_{01}$ are the corresponding intrachain and hydrogen-bonding interchain interaction constants, and $\nu_0$ is the amide frequency in the absence of such interactions. Although this treatment provided important insights into the understanding of the spectra of polypeptides and proteins (Miyazawa, 1960a; Miyazawa and Blout, 1961; Krimm, 1962), it suffered from some basic difficulties that were resolved only when it was recognized that another term had to be added to Eq. (71), viz., one whose physical origin was due mainly to interacting dipole derivatives, i.e., transition dipoles (Krimm and Abe, 1972).

A resonance interaction can occur between two oscillators when one of them is in an excited state. The energy of this interaction is determined by that part of the total Hamiltonian that represents all pairwise coulombic interactions between electrons and nuclei in the two groups. At sufficiently large distances (probably over 3 Å) these interactions can be expanded in a multipole series, of which the first important term for a neutral system is that due to transition dipole coupling (TDC). Higher transition multipoles may be important in some cases (Cheam and Krimm, 1985), but we treat here only the TDC case (Krimm and Abe, 1972; Moore and Krimm, 1975; Cheam and Krimm, 1984c).
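The perturbation relation of Eq. (71) can be evaluated directly for the four phase combinations. The following is a minimal sketch; the function name and the numerical values of $\nu_0$, $D_{10}$, and $D_{01}$ are hypothetical and chosen only for illustration, not taken from any fit discussed here.

```python
import math

def coupled_frequencies(nu0, d10, d01):
    """Evaluate Eq. (71) for the four phase combinations (delta, delta').

    nu0 : unperturbed amide frequency (cm^-1)
    d10 : intrachain interaction constant (cm^-1)
    d01 : interchain (hydrogen-bonded) interaction constant (cm^-1)
    """
    phases = {"0": 0.0, "pi": math.pi}
    return {
        (a, b): nu0 + d10 * math.cos(pa) + d01 * math.cos(pb)
        for a, pa in phases.items()
        for b, pb in phases.items()
    }

# Hypothetical, illustrative parameters (not fitted values).
modes = coupled_frequencies(nu0=1660.0, d10=10.0, d01=-20.0)
for (delta, delta_prime), nu in sorted(modes.items()):
    print(f"nu({delta},{delta_prime}) = {nu:.1f} cm^-1")
```

With only two adjustable constants, the four frequencies are not independent: $\nu(0,0) + \nu(\pi,\pi) = \nu(0,\pi) + \nu(\pi,0)$ for any choice of $D_{10}$ and $D_{01}$.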
FIG. 4. Geometrical parameters in transition dipole coupling interaction.
Our goal is to evaluate the contribution to the potential energy due to the interaction between similar transition dipoles on peptide groups A and B. The potential for the interaction between two dipoles $\mu_A$ and $\mu_B$ a distance $R_{AB}$ apart is (Jackson, 1975)

$$V_{dd}^{AB} = (1/\varepsilon)\,|\mu_A|\,|\mu_B|\,[\hat e_A\cdot\hat e_B - 3(\hat e_A\cdot\hat e_{AB})(\hat e_B\cdot\hat e_{AB})]/R_{AB}^3 = (1/\varepsilon)\,|\mu_A|\,|\mu_B|\,X_{AB} \qquad (72)$$
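The geometrical factor $X_{AB}$ in Eq. (72) depends only on the orientations of the two dipoles and on the vector joining them. A minimal sketch of its evaluation follows; the function names and the test geometries are illustrative assumptions, with positions in Å so that $X_{AB}$ is in Å$^{-3}$.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _unit(v):
    norm = math.sqrt(_dot(v, v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / norm for c in v)

def geometrical_factor(dir_a, dir_b, pos_a, pos_b):
    """X_AB of Eq. (72): [eA.eB - 3 (eA.eAB)(eB.eAB)] / R_AB^3.

    dir_a, dir_b : dipole directions (need not be normalized)
    pos_a, pos_b : dipole centers (Angstrom)
    """
    e_a, e_b = _unit(dir_a), _unit(dir_b)
    r_ab = tuple(q - p for p, q in zip(pos_a, pos_b))
    dist = math.sqrt(_dot(r_ab, r_ab))
    e_ab = _unit(r_ab)
    return (_dot(e_a, e_b) - 3.0 * _dot(e_a, e_ab) * _dot(e_b, e_ab)) / dist ** 3

# Two parallel dipoles side by side, 2 Angstrom apart (hypothetical geometry).
side_by_side = geometrical_factor((0, 0, 1), (0, 0, 1), (0, 0, 0), (2, 0, 0))
# The same dipoles arranged head to tail along their common direction.
head_to_tail = geometrical_factor((0, 0, 1), (0, 0, 1), (0, 0, 0), (0, 0, 2))
print(side_by_side, head_to_tail)
```

The sign change between the two arrangements is what makes the coupling, and hence the splitting, sensitive to the relative placement of the peptide groups.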
where $\varepsilon$ is the dielectric constant (which we assume to be 1), $\hat e_A$, $\hat e_B$, and $\hat e_{AB}$ are unit vectors in the directions of the dipoles and the line joining their centers (see Fig. 4), and $X_{AB}$ is referred to as a geometrical factor. If the dipole moment is expanded in normal coordinates, then, assuming electrical harmonicity,

$$\mu = \mu_0 + \sum_i (\partial\mu/\partial Q_i)\,Q_i \qquad (73)$$

and we have for the force constant for this interaction in the $\alpha$th transition

$$(\partial^2 V_{dd}^{AB}/\partial Q_\alpha^2) = (0.1)(\partial\mu/\partial Q_\alpha)^2 X_{AB} \qquad (74)$$
in units of mdyn Å$^{-1}$ u$^{-1}$, where u is the atomic mass unit. We can also express TDC force constants in terms of local symmetry internal coordinates:

$$F_{ij} = (0.1)\,|\partial\mu/\partial r_i|\,|\partial\mu/\partial r_j|\,X_{ij} \qquad (75)$$
if we recognize that the coordinate transformation [Eq. (18)], namely Eq. (76), requires that we sum over terms $|\partial\mu/\partial r_i|\,|\partial\mu/\partial r_j|$. For a harmonic oscillator, the transition dipole moment, $\Delta\mu$ [in debyes (D)], is given by Eq. (77), where $\nu_0$ is the unperturbed frequency (in cm$^{-1}$) and $(\partial\mu/\partial Q)$ is in units of D Å$^{-1}$ u$^{-1/2}$. Thus, the frequency shift in cm$^{-1}$ for the $\alpha$th mode due to TDC is given by Eqs. (78) and (79):

$$\Delta\nu_\alpha = (\Delta V_{dd}^{AB}/hc) = 5034\,(\Delta\mu_\alpha)^2 X_{AB} \qquad (78)$$
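The numerical factors in this subsection can be checked from fundamental constants. The sketch below derives the factor 5034 of Eq. (78) from $hc$ and uses the factor 4225.47 of Eq. (81) as printed. For the link between $\Delta\mu$ and $\partial\mu/\partial Q$ it assumes the standard harmonic-oscillator matrix element, $\Delta\mu = (h/8\pi^2 c\nu_0)^{1/2}\,|\partial\mu/\partial Q|$; this form is an assumption made here for illustration, although it does reproduce the (Gly)$_n$I amide I values quoted in this section (0.348 D at 1672 cm$^{-1}$, 3.466 D Å$^{-1}$ u$^{-1/2}$, and about 50,760 cm mmol$^{-1}$). All function names are hypothetical.

```python
import math

H = 6.62607015e-27      # Planck constant, erg s
C = 2.99792458e10       # speed of light, cm s^-1
AMU = 1.66053907e-24    # atomic mass unit, g

# 1 D^2 Angstrom^-3 = 1e-36 esu^2 cm^2 / 1e-24 cm^3 = 1e-12 erg; dividing by hc gives cm^-1.
SHIFT_FACTOR = 1.0e-12 / (H * C)

# Assumed harmonic-oscillator factor sqrt(h / 8 pi^2 c), in Angstrom u^(1/2) cm^(-1/2).
OSC_FACTOR = math.sqrt(H / (8.0 * math.pi ** 2 * C) / (AMU * 1.0e-16))

def tdc_shift(delta_mu, x_ab):
    """Frequency shift (cm^-1) from Eq. (78); delta_mu in D, x_ab in Angstrom^-3."""
    return SHIFT_FACTOR * delta_mu ** 2 * x_ab

def dipole_derivative(delta_mu, nu0):
    """|dmu/dQ| (D Angstrom^-1 u^-1/2) from a transition moment (D) at nu0 (cm^-1)."""
    return delta_mu * math.sqrt(nu0) / OSC_FACTOR

def integrated_intensity(dmu_dq):
    """Integrated intensity (cm mmol^-1) from Eq. (81), using the printed factor."""
    return 4225.47 * dmu_dq ** 2

dmu_dq = dipole_derivative(0.348, 1672.0)
print(round(SHIFT_FACTOR), round(dmu_dq, 3), round(integrated_intensity(dmu_dq), -1))
```

The small difference between the computed intensity and the quoted 50,760 cm mmol$^{-1}$ comes from rounding $|\partial\mu/\partial Q|$ to 3.466 before squaring.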
It follows from Eq. (23) that $\Delta\nu_\alpha$ is also given by Eq. (80).
Finally, we note that integrated intensities, $A$, in cm mmol$^{-1}$ are also related to transition moments by (Person and Zerbi, 1982)

$$A = (N_A\pi/3c^2)\,|\partial\mu/\partial Q|^2 = 4225.47\,|\partial\mu/\partial Q|^2 \qquad (81)$$

where $N_A$ is Avogadro's number. Since $\Delta\nu$ and $A$ are measurable quantities, and the $\partial\mu/\partial r$ can be calculated by quantum-mechanical methods (Cheam and Krimm, 1985) (see Section VII,B,3), it is possible to evaluate from experiment the internal consistency of the TDC hypothesis for explaining band splittings in polypeptide systems. For (Gly)$_n$I, Moore and Krimm (1976a) found that amide I splittings could be accounted for by TDC using $|\Delta\mu|$ = 0.348 D oriented at 20° to the CO bond. From Eq. (77) we find that, for $\nu_0$ = 1672 cm$^{-1}$ (Moore and Krimm, 1976a), this corresponds to
$|\partial\mu/\partial Q|$ = 3.466 D Å$^{-1}$ u$^{-1/2}$, and from Eq. (81) we obtain an integrated intensity of 50,760 cm mmol$^{-1}$. Measured intensities (Chirgadze et al., 1973) fall in the range 47,000-61,000 cm mmol$^{-1}$ for ordered β structures and 30,000-51,000 cm mmol$^{-1}$ for random conformations. From ab initio calculations on a hydrogen-bonded NMA molecule (Cheam and Krimm, 1984c, 1985), values of $|\partial\mu/\partial Q|$ were found to be 3.144 D Å$^{-1}$ u$^{-1/2}$ (24° to CO) for the A-species mode and 3.065 D Å$^{-1}$ u$^{-1/2}$ (29° to CO) for the B-species mode. Thus, the transition dipole moment obtained from band splittings is in good agreement with experimentally measured band intensities and with values calculated by quantum-mechanical methods. It should be noted that similar agreement was obtained for TDC parameters of amide II modes (Cheam and Krimm, 1984c).

Two important conclusions emerge from these results. First, although the designation $\nu(\delta,\delta')$ can still be used to describe normal modes of the APPS structure, the formalism of Eq. (71) can no longer be considered an adequate basis for explaining the splittings, nor can it be extended in general to other structures. Since TDC interactions were summed over a sphere of radius 30 Å, and since the assumed value of $\Delta\mu$, which is in substantial agreement with that obtained from ab initio calculations, can account for the observed splittings, the contributions from valence force-field interactions represented by the $D_{10}$ and $D_{01}$ terms must be negligible. Second, since the frequency shifts depend on $X_{AB}$ in Eq. (72), which in turn is a function of the geometrical arrangement of the transition dipole moments, amide I (and amide II) splittings provide a sensitive measure of the relative spatial arrangement of peptide groups in a polypeptide chain. We will see in the various examples discussed below how this dependence manifests itself in practice. In this connection, it should be noted that the relative values of the $L_i$ in Eq. (76) depend on the relative phase of the vibrations in the interacting groups: If, as in the case of the β sheet, symmetry determines this angle to be 0 or $\pi$, then only the value of $X_{AB}$ is affected; if symmetry is not operative, then the relative magnitudes of the $L_i$ must be obtained from the eigenvector for the overall normal mode.

3. Least-Squares Refinement

Once a force-field model has been chosen, we need a mechanism whereby a set of force constants can be selected that gives optimal prediction of a (usually larger) set of observed frequencies. This could be done by a manual trial-and-error adjustment, but a least-squares fitting procedure is more satisfactory (for discussions of these methods see Duncan, 1975; Califano, 1976; Gans, 1977; Zerbi, 1977). In the present discussion we assume that we have “reasonable” starting force constants
and that observed bands have been properly assigned to calculated normal modes. Both of these points are discussed below. Suppose that we have a starting set of arbitrary force constants $f$ and we wish to minimize the sum of the weighted squared errors:

$$\chi_0 = \sum_p w_p(\nu_p^{obs} - \nu_p^{calc})^2 = \tilde\delta W\delta \qquad (82)$$

where $\delta_p = \nu_p^{obs} - \nu_p^{calc}$ and $w_p$ is a weighting factor. If we change the force constants by $\Delta f$, this leads to a change in $\nu^{calc}$ of $\Delta\nu$, and the new sum is

$$\chi = (\tilde\delta - \Delta\tilde\nu)W(\delta - \Delta\nu) = \tilde\delta W\delta - \tilde\delta W\,\Delta\nu - \Delta\tilde\nu\,W\delta + \Delta\tilde\nu\,W\,\Delta\nu \qquad (83)$$

We want to minimize $\chi$ not in terms of the frequencies but in terms of the force constants, which we are trying to adjust. We therefore assume (Long et al., 1963) that a linear relation holds for small changes, that is,

$$\Delta\nu = J\,\Delta f \qquad (84)$$
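where $J$ is the Jacobian matrix with elements $J_{pi} = \partial\nu_p^{calc}/\partial f_i$, so that Eq. (83) can be written as

$$\chi = \tilde\delta W\delta - \tilde\delta WJ\,\Delta f - \Delta\tilde f\,\tilde JW\delta + \Delta\tilde f\,\tilde JWJ\,\Delta f \qquad (85)$$

The minimum in $\chi$ is given by

$$(\partial\chi/\partial\,\Delta f) = 0 = -\tilde JW\delta - \tilde JW\delta + 2\tilde JWJ\,\Delta f \qquad (86)$$

from which

$$\Delta f = (\tilde JWJ)^{-1}\tilde JW\delta = (\tilde JWJ)^{-1}\tilde JW(\nu^{obs} - \nu^{calc}) \qquad (87)$$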
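Equation (87) is a weighted linearized least-squares step that is applied repeatedly. The sketch below illustrates one way to iterate it; it is not the refinement program of this article. The toy model (uncoupled oscillators with $\nu = 1302.79\sqrt{f/\mu}$ cm$^{-1}$, $f$ in mdyn/Å and $\mu$ in u), the reduced masses, the starting values, and the finite-difference Jacobian are all assumptions made for illustration.

```python
import math

CONV = 1302.79  # cm^-1 for sqrt((mdyn/Angstrom)/u)

# Hypothetical modes: (index of the force constant used, reduced mass in u).
MODES = [(0, 6.857), (0, 7.172), (1, 0.9231), (1, 1.7143)]

def calc_freqs(f):
    """Toy model: each mode is an independent oscillator."""
    return [CONV * math.sqrt(f[k] / mu) for k, mu in MODES]

def jacobian(f, h=1.0e-6):
    """Forward-difference estimate of J[p][i] = d nu_p / d f_i."""
    base = calc_freqs(f)
    jac = [[0.0] * len(f) for _ in base]
    for i in range(len(f)):
        shifted = list(f)
        shifted[i] += h
        new = calc_freqs(shifted)
        for p in range(len(base)):
            jac[p][i] = (new[p] - base[p]) / h
    return jac

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def refine(f, obs, w, max_cycles=20, tol=1.0e-10):
    """Iterate delta_f = (J~ W J)^-1 J~ W delta until chi < tol or cycles run out."""
    f = list(f)
    for _ in range(max_cycles):
        delta = [o - c for o, c in zip(obs, calc_freqs(f))]
        chi = sum(wp * d * d for wp, d in zip(w, delta))
        if chi < tol:
            break
        jac = jacobian(f)
        n, rows = len(f), range(len(obs))
        jwj = [[sum(w[p] * jac[p][i] * jac[p][j] for p in rows) for j in range(n)]
               for i in range(n)]
        jwd = [sum(w[p] * jac[p][i] * delta[p] for p in rows) for i in range(n)]
        f = [fi + dfi for fi, dfi in zip(f, solve(jwj, jwd))]
    return f, chi

# Synthetic "observed" frequencies generated from known force constants.
observed = calc_freqs([12.0, 6.0])
refined, chi = refine([10.0, 5.0], observed, [1.0] * len(observed))
print(refined, chi)
```

Because the synthetic data are exact, the iteration recovers the generating force constants; with real data the residual $\chi$ stays finite and the conditioning of $\tilde JWJ$ becomes the practical concern.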
Equation (87) permits us to calculate the $\Delta f$ that will minimize $\chi$. In practice, the procedure for getting a new set of $f$ is iterated until either a specified number of cycles is complete or $\chi$ becomes smaller than a preassigned value. The dispersions in the refined force constants after the $K$th cycle are given by Eq. (88), where $N_o$ is the number of observed frequencies and $N_f$ is the number of force constants being varied. The $J$ matrix can be obtained once we have defined the $F$ matrix in terms of $f$, which is done through a $Z$ matrix that depends on the force field being used. Thus, with

$$F = \sum_i Z^i f_i \qquad (89)$$

and using Eq. (23), $\Lambda = \tilde LFL$, we have Eqs. (90) and (91).
The scheme that we use for such a force-constant refinement is shown in Fig. 5. We will not go into the details, but it is important to be aware of the fact that there are basic problems in the least-squares refinement of force constants (see previous references). Aside from the limitations imposed by the force-field model and the assumption of its transferability between (assumed) common structures, there are two main categories of problems.

The first has to do with the handling and weighting of experimental data. We want as large a number of experimental frequencies as possible so as to reduce the dispersions in the force constants, as shown by Eq. (88). This can be achieved by using a large set of similar molecules as well as isotopic species (although the latter introduce problems associated with different anharmonicities). As seen from Eq. (82), we need to assign weights to the observed frequency differences, and this is often not a straightforward procedure. We usually assign large weights to well-assigned bands, particularly if they are not weak, but this procedure is difficult to quantify.

The second kind of problem is mathematical, and there are several of these. The minimization procedure is based on the assumed linearity of the Jacobian elements [Eq. (84)], and this may not be valid if the starting force field is a “poor” one. If the determinant of the matrix $(\tilde JWJ)$ is close to zero, then inverting it to get $\Delta f$ [viz., Eq. (87)] can lead to large errors, even singularities; the best way to minimize this possibility is to have at least three to four times as many frequencies as force constants, distributed so that enough observed frequencies occur in a region to which a particular force constant is contributing (Zerbi, 1977). Finally, the least-squares calculation may not have a unique solution. All of these issues need careful evaluation in arriving at a satisfactory force field.

4. Valence Force Field for the Polypeptide Chain
In developing a transferable SGVFF for the polypeptide chain, we have utilized strategies based on all of the considerations discussed thus far in this article. We review these developments before presenting the details of the force field.
[Figure 5: flowchart. Boxes, in order: Start; Input number of force constants, number of variable force constants ($N_f$), number of observed frequencies ($N_o$), number of cycles ($N_c$), and value of the refinement index; Read F matrix elements and G matrix elements; Read force constant names, values of observed frequencies, and weighting elements; Compare observed data with computed frequencies; Form the Jacobian matrix; Modify the variable force constants, $\Delta f = (\tilde JWJ)^{-1}\tilde JW\delta$; Convergence not possible within prescribed number of cycles; Print number of cycles, refinement index, and computed and observed frequencies.]

FIG. 5. Schematic diagram of force-field refinement program.
In order to have a reasonable starting set of peptide force constants, a complete analysis was done on NMA and its deuterated derivatives as well as on some nylons and their deuterated derivatives (Jakeš and Krimm, 1971a). Force constants for the hydrocarbon portions of these molecules were transferred from the elegant work of Schachtschneider and Snyder (1963) on n-paraffins. A total of 125 NMA frequencies and 306 nylon frequencies were fit with a force field of 27 transferred and 84 refined force constants, with an average error of $|\Delta\nu|$ = 4.3 cm$^{-1}$, and a total of 51 frequencies having an error greater than 10 cm$^{-1}$. The ratio of observed frequencies to refined force constants, $N_o/N_f$, was therefore 431/84 = 5.1. This force field was then used as a starting point in refining a force field for (Gly)$_n$I (Abe and Krimm, 1972a), with only modest success. A total of 109 frequencies of (Gly)$_n$I and its isotopic derivatives were fit using 34 transferred or reasonably fixed and 38 refined force constants, giving $N_o/N_f$ = 2.9. However, this refinement resulted in $|\Delta\nu|$ = 9.1 cm$^{-1}$, with 26 frequencies having $|\Delta\nu|$ > 10 cm$^{-1}$. When this force field was used as a starting point in a global refinement of (Gly)$_n$I, β-(Ala)$_n$, and β-poly(L-alanylglycine) [β-(AlaGly)$_n$], much better results were obtained. In this case, 131 frequencies were fit using 74 transferred or reasonably fixed and 47 refined force constants, for $N_o/N_f$ = 2.8, and we obtained $|\Delta\nu|$ = 4.9 cm$^{-1}$, with 16 frequencies for which $|\Delta\nu|$ > 10 cm$^{-1}$. In the most recent refinement (Dwivedi and Krimm, 1982a,b,c, 1984a), we have added (Gly)$_n$II and α-(Ala)$_n$ frequencies as well as those from an (unpublished) analysis of β-(AlaGly)$_n$. Frequencies of deuterated molecules were used only as guides in the initial stages of the refinement, and some force constants were allowed to vary slightly from one structure to another.
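The ratios $N_o/N_f$ quoted for the successive refinements can be tabulated in a few lines. This is only a bookkeeping sketch: the labels are shorthand introduced here, and the counts are those given in the text.

```python
# (label, number of observed frequencies N_o, number of refined force constants N_f)
REFINEMENTS = [
    ("NMA + nylons", 125 + 306, 84),
    ("(Gly)nI", 109, 38),
    ("(Gly)nI + beta-(Ala)n + beta-(AlaGly)n", 131, 47),
]

def frequency_to_parameter_ratio(n_obs, n_refined):
    """Ratio of observed frequencies to refined force constants."""
    return n_obs / n_refined

for label, n_obs, n_refined in REFINEMENTS:
    ratio = frequency_to_parameter_ratio(n_obs, n_refined)
    print(f"{label}: N_o/N_f = {ratio:.1f}")
```

Only the first refinement meets the guideline of three to four times as many frequencies as force constants cited from Zerbi (1977); the other two fall just below it.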
The set of 198 frequencies of the native molecules is reproduced with $|\Delta\nu|$ = 5.0 cm$^{-1}$ and 13 frequencies with $|\Delta\nu|$ > 10 cm$^{-1}$. This force field is given in Table VI, and is the basis for our structural analyses. For cases where the backbone conformation is the major interest, it has seemed desirable to have a force field in which the side chain is approximated by a point mass. We have refined such force fields for β-(Ala)$_n$ and α-(Ala)$_n$, starting from the detailed force field of Table VI (Dwivedi and Krimm, 1984b). This approximate force field gives frequency and eigenvector agreement comparable to that obtained with the full calculation (see Sections III,C,1 and IV,B,1). The criteria discussed so far for judging the suitability of a force field have involved the extent of agreement between observed and calculated frequencies (we have assumed proper assignments of bands, which we discuss below). Recent ab initio calculations of dipole derivatives
TABLE VI
General Valence Force Constants for Different Polypeptide Chains
Value$^b$ for different polypeptides$^{c,d}$
4.523 (4.823) 4. I60 6.415 9.882 5.674 e 4.4628
4.323
5.043
4.843
* *
* *
4.409
4.409
10.029 5.830
9.955 5.752
4.323
(*I
* *
5.840 4.564
4.523
4.564 4.980 (5.280) 4.800 0.150
* (5.080)
*
*
*
0.120
0.135
0.125
*
9.62 1 5.720 5.856 4.564 4.430 4.564 4.780
0.160 0.110
0.0027 0.819 0.765
* 1.119 (0.819)
1.119
*
0.050
0.819
1.150
0.715 0.715
0.715 0.7 15 0.785 0.715
*
*
(1.093) 1.446 1.033
1.446 1.033
1.246 1.400
1.166 1.300
1.046 0.687 0.556
1.046 0.687 0.556
1.246
1.166
0.527
0.5259 (0.5759) 0.566
0.826
0.826
0.687
0.537 0.532 0.487
0.556
0.556
0.527
0.684
0.654
1.193 (1.0783) 1.306 0.933 (0.833) 1.306 0.677 0.566
(*)
* 0.684
0.537 0.532
*
0.684 0.684 0.684
CC"C0 H"C"H H"C"CB HUH C@CUC@ C"H" ... 0 ib CO ... H ib CO -.* H" ib NH ... 0 ib NH ... O b ib CO ob Cob ob NH ob NC" t (2°C t CN t NHt
co t
C"C@t NC", C"C NC", C - V C"C, CN C"C, co C"C, C"C@ C"H", C"H C"H", C"Hb C"H", H" ... 0 C a p , Hn ... Ha C"C@,C"C0 CBH, C@H CN, NC" CN, CO NC", CNC" NC", NC"C NC", NC", NC", NC",
NC"H" NC"H CC"Ha CC"H
1.181 (0.8687)
*
*
(0.981) 0.584
0.5 175 (0.5275) 0.524
0.6147 (0.6647)
*
0.560
* 1.181
0.0 10
*
*
*
0.020
*
0.036
0.020
0.020
0.0506
0.621
0.657
0.687
0.587
0.159 0.037 0.037 0.680 0.0005 0.001 0.1 10 0.300 0.101 (0.300) 0.300 0.500 0.101
0.129 0.087 0.060
0.129 0.087 0.060
0.129
0.020 0.057 0.050 0.487 0.537 0.129
*
0.0003
0.00 15
0.0035
0.100
0.090 0.100
*
*
* *
* *
* *
* *
0.010
-0.015 -0.050 0.080
*
* * *
*
*
* *
0.026
*
* *
* * * (0.150)
*
*
0.100
*
* * *
* *
*
* * *
*
*
*
0.517 0.517
0.517 0.517
0.026
0.026
0.427
*
* * *
0.301
-0.0075 0.071 0.300 0.500 0.300 (0.600) 0.300 (0.600) 0.627
* * *
*
*
*
TABLE VI (Continued) Value*for different polypeptidesCsd Force constant?
P-(Ala).
NC', C"NH NC", NC"U
0.294 0.417 (0.717) 0.079 (0.129) 0.200
NC", H°CnCe NC", CC"C8 NC", CaC"C@ C"C, NC"C C"C, C"CN C"C, C"C0 C"C, NC"H" C"C, NC"H C"C. CC"H"
a-(Ala),
(Aib),
(Gly),I
(Gly). I1
*
* *
*
*
*
*
*
0.100
*
* *
*
*
0.026
0.026
0.205
0.205
* *
* *
*
*
* *
* * *
*
(0.217)
*
* * *
0.300 0.300 0.200 (0.300) 0.026
0.100
0.205
0.305
* 0.200
*
*
*
*
*
(0.100)
C"C, CC"H C"C, CC"U
C"C, NC"Q C"Ca, NC"H" Coca, NC"C@ Coca, CC"H" C"C0, CC"C0 Coca, H"C"C@ C"C@,C"CSH Coca(I ) , NC"Ca(2) C"Cq I ) , CC"CO(2) coca, C@C"Ce CN, C"CN CN, CNC" CN, NCO CN, CNH c o , C"C0 CO, NCO CO, C°CN NC"C, C"CN NC"C, NC"H"
0.367 (0.667) 0.079 (0.029) 0.000 0.079 0.6 17 (0.317) 0.079 0.4 17 0.415 0.353
0.300 0.300 (0.450) 0.200 0.294 0.450 0.450 0.050 (0.170) 0.000 -0.03 1
*
*
* 0.000
0.100
0.517 (0.317)
0.517
* * *
* * * * (0.600)
*
* 0.403 0.030 0.030 0.5 17
* *
* *
0.000 (0.150) 0.160
0.000
.0.150
0.160
*
*
*
*
* * *
*
*
-0.150
* *
TABLE VI (Continued) Valueb for different polypeptides'*" Force constant0 NC"C, NC"H NC"C, C"NH NC"C, NC"Ca NC"C, CCnCa NC"C, CO ob NC"C, NH ob NC"H", CC"H" NC"H", H"C"C@ NC"H", NC"H NC"H", HC"H" NC"H", NH ob NC"H, CC"H NC"H, HC"H" NC"H, NH ob NC"C@,CC"C@ NC"C@,H"C"0 trans-(NC"Q, CnCaH) gauche-(NC"C@,C"CaH) NC"Ca, CaC"C@ NC"C@,NH ob NCO, CNH C"CN, CNH C"C0, CC"Q C"C0, CC"H" C"C0, CC"H C"NH, CNH C"NH, NC"H" C"NH, NC"H C"NH, N C W CC"H", HnCuC@ CC"H", CC"H CC"H", HC"H" CC"H", CO ob CC"H, HCnHn CC"H, CO ob CC"C@,H"C"Ca CC"Ca, CO ob CC"C@,CC"C@ CC'CS, C@C"C@
P-(Ala). -0.100 0.200 (0.150) -0.041 (-0.141) -0.1725 0.1 10 0.019 0.043
0.1022
-0.041 -0.031 -0.049 0.040 0.120 0.251 0.200 0.150 0.038 0.100 0.000 0.000 (0.096)
a-(Ala),
(Aib).
* *
* 0.100
*
*
-0.073 0.160
-0.073 0.160
* *
* * *
*
*
* 0.000 -0.031 0.060
*
* 0.0065
* *
*
0.150
-0.100
-0.03 1 0.162
-0.050
(Gly). I1
-0.031
-0.031
-0.0725 0.1092
-0.0725 0.1092
0.0463 0.0615
0.000 0.06 15
0.019 0.0615 0.0456
0.0 19 0.0615 0.0456
*
*
*
0.000
* *
(Gly),,1
* *
0.200
0.0065 0.050
* * * 0.100 0.0065
*
0.031
-0.032 0.0398 0.100 0.0398 0.100
*
*
*
*
* * * 0.100 0.0065 0.05 1 0.06 1
-0.032 0.0398 0.100 0.0398 0.100
* 0.100 -0.031
C"C@H,C"C@H ~Y~TLS-(C"C@H, H"C"C@) gauche-(C"C@H,H"C"C@) trans-(C@C"C@, C"C@H) gauche-(C@C"C@, C"C@H) CNCa, C"NH CNC", NC"H" CNC", NC"H CNC", NC"C@ CO ob, NH ob NH 0 ib, NH ob CO ob, d N t NH ob? CN t
-0.045
0.122 0.100
-0.020
*
*
0.000
-0.040 0.100
* *
0.000 0.000 0.007 0.01 11 -0.1477
0.000 -0.050
* * *
0.100 0.010
*
*
*
0.000 0.000
0.000 0.270
*
0.010 0.000
0.010 -0.005
*
-0.1677
-0.1677
0.100 -0.050
*
*
*
$^a$ AB, AB bond stretch; ABC, ABC angle bend; X, Y, XY interaction; ib, in-plane bend; ob, out-of-plane bend; t, torsion.
$^b$ Units: mdyn/Å for stretch and stretch, stretch constants; mdyn for stretch, bend constants; and mdyn Å for all others.
$^c$ Values in parentheses are for force field with CH$_3$ replaced by point mass (all other constants are the same).
$^d$ Asterisk indicates that constant is the same as for β-(Ala)$_n$.
$^e$ Blank space indicates inapplicable or unused constant.
$^f$ Subscript b denotes constant applicable to bifurcated hydrogen bond.
[$(\partial\mu/\partial Q)$] of the peptide group (Cheam and Krimm, 1985) indicate that intensities and orientations of $\partial\mu/\partial Q$ provide a very sensitive test of a force field. For example, it was found that for NMA our (Gly)$_n$I force field gave good reproduction of intensities of amide modes and excellent agreement with measured directions of $\partial\mu/\partial Q$ for amide I and amide II modes (cf. discussion in Section II,D,2,c). Comparable agreement was not found for other force fields (Cheam and Krimm, 1985). A similar calculation gave excellent agreement with observed intensities of (Gly)$_n$I (see Section III,B,1). We expect that in the future such intensity information will also be utilized in the early stages of force-field refinements.

E. Band Assignments

The discussion thus far has made the assumption that observed bands in the IR and Raman spectra have been properly correlated with calculated normal modes, i.e., that we know that the atomic displacements associated with an observed frequency are essentially similar to those in the eigenvector of the corresponding calculated frequency. Otherwise, for example, there is no basis for obtaining a $(\nu_p^{obs} - \nu_p^{calc})$ term in Eq. (82). Proper band assignments are therefore a vital element in a vibrational analysis, and must be determined independently of a normal-mode calculation. We review here briefly the approaches that are used to make such band assignments.
1. Group Frequencies

Over the many years of studying IR and Raman spectra of molecules, many correlations have been established, from theoretical as well as experimental studies, between characteristic vibrational modes and the frequency regions in which they are found. A number of books and articles have discussed these correlations (Bellamy, 1975; Parker, 1983; Frushour and Koenig, 1975c; Spiro and Gaber, 1977; Tsuboi, 1977), and it is natural that we make general use of these in an initial approach to assigning bands. For the peptide group, the analysis of NMA has provided an important benchmark, and thus the results shown in Table II represent guidelines for identifying general regions in which certain types of vibrations are expected. In addition, we know where to expect other kinds of modes (CH stretch in the region 3000-2800 cm$^{-1}$, HCH bend in the region 1500-1400 cm$^{-1}$, etc.), how relevant frequencies will shift with the strength of hydrogen bonds, and some general effects of environmental factors. Specific questions can often be answered by studying small model compounds or simple variants of the molecule in question.

What the group-frequency approach usually cannot do is to provide a specific assignment for an arbitrary band in a complex spectrum. This becomes obvious when we recognize that side-chain modes overlap regions of main-chain modes, and that a multiplicity of peptide groups in a repeat unit (cf. the four in the APPS structure) in general multiplies the number of observable bands. In the latter case, it then becomes important to know with which vibrational phase angle a particular band is associated. We therefore need other methods to provide the required detailed band assignments.

2. Symmetry: Activity and Dichroism

Molecular symmetry imposes constraints on the nature of the normal modes of vibration, and these are reflected in observable characteristics in the spectrum. The theoretical basis for symmetry has been widely presented (see, for example, Wilson et al., 1955; Zak et al., 1969; Woodward, 1972), and will not be discussed here. We will only illustrate the results of its application in one case, that of (Gly)$_n$I.
TABLE VII
Symmetry Species and Selection Rules for Crystalline Polyglycine I

Pleated sheet (D2)
Species   ν(δ,δ')   C2^s(a)   C2^s(b)   C2(c)   Number of modes   Activity(a)   Lattice vibrations(b)
A         ν(0,0)      1         1         1          21           R             R, Tb
B1        ν(0,π)     -1        -1         1          20           R, IR(∥)      R
B2        ν(π,0)      1        -1        -1          20           R, IR(⊥)      Tc
B3        ν(π,π)     -1         1        -1          20           R, IR(⊥)      Ta

Rippled sheet (C2h)
Species   ν(δ,δ')   C2^s(b)   i         σg(ac)  Number of modes   Activity(a)   Lattice vibrations(b)
Ag        ν(0,0)      1         1         1          21           R             R, Tb
Au        ν(0,π)      1        -1        -1          20           IR(∥)         R
Bu        ν(π,0)     -1        -1         1          19           IR(⊥)
Bg        ν(π,π)     -1         1        -1          21           R             Ta, Tc

(a) R, Raman; IR, infrared.
(b) R, rotatory; T, translatory modes.
Two structures have been proposed for (Gly)$_n$I: an antiparallel-chain pleated sheet (APPS) and a similar rippled sheet (APRS) (see Section III,B,1). These structures have different symmetries: the APPS, with $D_2$ symmetry, has twofold screw axes parallel to the a axis [$C_2^s(a)$] and the b axis [$C_2^s(b)$], and a twofold rotation axis parallel to the c axis [$C_2(c)$]; the APRS, with $C_{2h}$ symmetry, has a twofold screw axis parallel to the b axis [$C_2^s(b)$], an inversion center, $i$, and a glide plane parallel to the ac plane, $\sigma_g^{ac}$. Once these symmetry elements are known, together with the number of atoms in the repeat, it is possible to determine a number of characteristics of the normal modes: the symmetry classes, or species, to which they belong, depending on their behavior (character) with respect to the symmetry operations; the numbers of normal modes in each symmetry species, both internal and lattice vibrations; their IR and Raman activity; and their dichroism in the IR. These are given in Table VII for both structures.

For the APPS structure, the modes divide into four symmetry species. The A species modes, of which there are a total of 21, are totally symmetric with respect to the symmetry operations (i.e., the characters are all 1), are only Raman-active, and include a rotatory and a translatory lattice mode. The B$_1$ species modes, 20 in number, are antisymmetric (i.e., with character -1) with respect to $C_2^s(a)$ and $C_2^s(b)$, can exhibit activity in both Raman and IR (of parallel dichroism; see below), and include a rotatory lattice mode. The B$_2$ and B$_3$ species modes, both 20 in number, also exhibit activity in Raman and in IR (of perpendicular dichroism), and each set includes a translatory lattice mode. The predictions for the APRS structure are different. The four symmetry species divide up into two (A$_g$, B$_g$) whose modes are only Raman-active and two (A$_u$, B$_u$) whose modes are only IR-active (this mutual exclusion is a result of the $i$ symmetry). Also, the relative number of $\nu(\pi,0)$ and $\nu(\pi,\pi)$ modes is different, and this difference is entirely in the (low-frequency) translatory lattice modes.

These predictions provide important guidelines in assigning bands. Of course, an assignment must be consistent with the prediction of activity for the mode in question, but an even more useful property is the expected IR dichroism. We noted in Eq. (81) that the integrated IR absorbance in a band is proportional to $(\partial\mu/\partial Q)^2$. If polarized radiation is incident on an oriented sample, another condition must be satisfied, viz., that there be a nonvanishing component of $(\partial\mu/\partial Q)$ along the direction of the electric field vector, $\mathbf{E}$; in fact, the absorbance is proportional to $[\mathbf{E}\cdot(\partial\mu/\partial Q)]^2$. The direction of maximum absorbance of a band with respect to the known axis of orientation of the polypeptide chain (i.e., the dichroism) can be measured experimentally. Thus, the predictions from symmetry analysis permit us to restrict, or in the case of parallel bands to identify, the symmetry species to be associated with an observed band. This can be an important tool in making band assignments.

3. Isotopic Substitution

Bands can often be assigned by studying their frequency behavior when an isotopic substitution is made in the molecule.
An obvious case is the effect of N-deuteration, which, for example, results in the replacement of an NH stretch mode near 3300 cm$^{-1}$ by an ND stretch mode in the 2500-2400-cm$^{-1}$ region, or in the disappearance of amide II, III, and V modes (the first two of which involve major contributions from NH ib and the third of which involves NH ob) and the appearance of ND modes at lower frequencies. For modes with minor contributions from NH deformation, normal-mode calculations are a very important guide in assigning bands: Calculations for the N-deuterated molecule indicate explicitly the behavior of the residual mode when the NH contribution is removed, as well as how the ND contribution may mix with other modes in its spectral region, both aspects of which may be specific to the particular structure. Substitution of ND for NH is a simple procedure (often just involving treatment with D$_2$O), and has been used frequently. Substitutions of $^{15}$N for $^{14}$N or $^{13}$C for $^{12}$C have been used much less frequently, but may be a more powerful tool for studying structure. As is discussed with respect to the γ turn (Section V,C,1), such isotopic substitution at strategic sites, although resulting in relatively smaller frequency shifts, produces shifts that are dependent on the conformation of the local region of the molecule. These subtle changes must be interpreted in terms of the results of normal-mode calculations, but they provide a deeper insight into structure as well as band assignments.

4. Overtone and Combination Bands: Fermi Resonance
All of the preceding discussions have dealt with bands that are associated with the excitation of individual normal modes, i.e., the fundamental frequencies. Although only such transitions are permitted for a harmonic oscillator, the vibrations of real molecules are anharmonic, and in such cases double excitations of a normal mode (resulting in overtone bands) and single excitations of two different normal modes (resulting in combination bands) are allowed. Analysis of such bands often leads to information on the assignments of the fundamentals, and is therefore of importance. In this connection, we need to know not only the rules for the appearance of such bands, but also that they are often perturbed by an interaction known as Fermi resonance.

Overtone and combination bands belong to symmetry species determined by the species of their fundamentals. We can determine this symmetry by multiplying the characters for the fundamentals. Thus, overtones of all species have the character of the totally symmetric species, A for $D_2$ symmetry and A$_g$ for $C_{2h}$ symmetry (see Table VII). For combinations, this rule implies that, for $D_2$ symmetry, a B$_1$ mode combining with a B$_3$ mode produces a combination of B$_2$ symmetry, etc., while for $C_{2h}$, B$_g$ combining with B$_u$ results in a band of A$_u$ symmetry, etc. (see Table VII). Overtone and combination bands are usually weak in comparison with the fundamentals. However, when the frequency of such a combination falls close to that of another fundamental of the same symmetry species, a Fermi resonance interaction occurs, which results in a sharing of intensity between the two modes as well as frequency shifts in both. This occurs, for example, in the interaction between the NH stretch mode, $\nu_A^0$, and overtones or combinations of amide II modes, $\nu_B^0$ (Miyazawa, 1960b).
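The character-multiplication rule for overtones and combinations is easy to mechanize. In the sketch below the characters are entered by hand for the three listed operations of each group, as read from Table VII, with the identity omitted; the dictionary layout and function name are illustrative choices, not part of the original treatment.

```python
# Characters under (C2s(a), C2s(b), C2(c)) for the pleated sheet (D2).
D2 = {
    "A": (1, 1, 1),
    "B1": (-1, -1, 1),
    "B2": (1, -1, -1),
    "B3": (-1, 1, -1),
}

# Characters under (C2s(b), i, glide plane) for the rippled sheet (C2h).
C2H = {
    "Ag": (1, 1, 1),
    "Au": (1, -1, -1),
    "Bu": (-1, -1, 1),
    "Bg": (-1, 1, -1),
}

def combination_species(table, first, second):
    """Species of a combination band: multiply characters operation by operation."""
    product = tuple(a * b for a, b in zip(table[first], table[second]))
    for species, characters in table.items():
        if characters == product:
            return species
    raise ValueError("product is not a species of this table")

print(combination_species(D2, "B1", "B3"))    # combination of two fundamentals
print(combination_species(D2, "B2", "B2"))    # overtone
print(combination_species(C2H, "Bg", "Bu"))
```

Multiplying a species by itself always returns the totally symmetric species, which is the statement in the text that overtones of all species are totally symmetric.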
From measurements of the frequencies and intensities of the observed bands, $\nu_A$ and $\nu_B$, it is possible to obtain the frequencies of the unperturbed fundamental, $\nu_A^0$, and combination, $\nu_B^0$. The relation is given by (Miyazawa, 1960b)

$$\nu_A = \tfrac{1}{2}\left[(\nu_A^0 + \nu_B^0) + s\right] \quad \text{and} \quad \nu_B = \tfrac{1}{2}\left[(\nu_A^0 + \nu_B^0) - s\right] \tag{92}$$

where $s = \nu_A - \nu_B$. It can also be shown that

$$(I_B/I_A) = (s - \delta)/(s + \delta) \tag{93}$$

where $\delta = \nu_A^0 - \nu_B^0$. Thus, measurements of the frequencies and intensities of the observed amide A and amide B bands permit a determination of $\nu_A^0$ and $\nu_B^0$, and from the latter it is usually possible to infer the fundamental frequencies involved (Krimm and Dwivedi, 1982a).
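As a worked illustration of the Fermi resonance analysis, Eq. (93) can be inverted to give the unperturbed separation, δ = s(1 − R)/(1 + R) with R = I_B/I_A, and Eq. (92) implies that the sum of the two frequencies is unchanged by the interaction. The sketch below applies these two steps; the function name and the input numbers are hypothetical and chosen only to make the arithmetic easy to follow, not taken from any measured spectrum.

```python
def unperturbed_frequencies(nu_a, nu_b, intensity_ratio):
    """Estimate the unperturbed frequencies (nu_A^0, nu_B^0) from the
    observed frequencies nu_a, nu_b and the intensity ratio I_B / I_A.

    Uses s = nu_a - nu_b, delta = s * (1 - R) / (1 + R), and the fact
    that nu_A^0 + nu_B^0 = nu_a + nu_b.
    """
    s = nu_a - nu_b
    delta = s * (1.0 - intensity_ratio) / (1.0 + intensity_ratio)
    mean = 0.5 * (nu_a + nu_b)
    return mean + 0.5 * delta, mean - 0.5 * delta


# Hypothetical observed values (cm^-1) and intensity ratio, for illustration only.
nu_a0, nu_b0 = unperturbed_frequencies(3300.0, 3100.0, 0.25)
print(nu_a0, nu_b0)
```

In the limit of a vanishing intensity ratio the function returns the observed frequencies unchanged, and for equal intensities it returns two coincident unperturbed frequencies, as expected for complete mixing.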
III. EXTENDED POLYPEPTIDE CHAIN STRUCTURES

A. Introduction
The extended form of the polypeptide chain, with lateral hydrogen bonds stabilizing a sheet-like arrangement, was recognized as a likely structure from early X-ray diffraction studies of silk (Meyer and Mark, 1928) and stretched mammalian (β) keratin (Astbury and Street, 1931; Astbury and Woods, 1933). Neighboring chains in a sheet can be directed, in a chemical sense, in an antiparallel or parallel manner, and detailed coordinates for such stereochemically acceptable β-sheet structures were first provided by Pauling and Corey (1951a, 1953b). Experimental evidence for the antiparallel-chain β-sheet structure has been obtained from X-ray diffraction studies of a synthetic polypeptide, (Ala)n (Arnott et al., 1967), a fibrous protein, β-keratin (Fraser et al., 1969), and from oligopeptides (Rao and Parthasarathy, 1973; Fawcett et al., 1975; Tanaka and Ashida, 1980; Yamada et al., 1980; Cruse et al., 1982; Admiraal and Vos, 1984; Yamane et al., 1985; Ashida et al., 1986) and proteins (Richardson, 1981). The parallel-chain structure has been found in some oligopeptides (Marsh and Glusker, 1961; Chatterjee and Parthasarathy, 1984; Lalitha et al., 1987) and in proteins (Richardson, 1981). The protein studies, as well as theoretical considerations (Chothia, 1973; Zimmerman and Scheraga, 1977; Raghavendra and Sasisekharan, 1979; Chou et al., 1982), have shown that finite β sheets have a twist instead of being essentially planar, as they are in the extended structures of synthetic polypeptides. This twist is nearly always "right-handed" when viewed along the polypeptide chain axes (Chothia, 1973), and is a result of energy minimization (Salemme, 1983). Even more complex, cylindrical "β-barrel" arrangements occur (Richardson, 1981), and the range of topographies is quite extensive (Salemme, 1983). The β sheet thus plays an important role in the structure of proteins.
In some it is the main secondary structural component (e.g., concanavalin A and Bence-Jones proteins); in others it is found in conjunction with α-helical segments; and in many proteins it occurs as a mixed sheet of parallel and antiparallel strands. To date, normal-mode analyses have been applied only to the infinite antiparallel-chain sheet structures, and this work is described in this section. It is clear that such analyses need to be extended to other types of β-sheet structures.

B. Antiparallel-Chain Rippled Sheet

Polyglycine I

a. Structure and Symmetry. Early X-ray diffraction studies suggested that (Gly)nI has an essentially extended chain conformation (Astbury et al., 1948; Astbury, 1949; Bamford et al., 1953). A specific model of this structure, namely the APPS developed from model-building studies (Pauling and Corey, 1951c), was believed to apply to (Gly)nI on the basis of "sufficiently good" agreement with the observed powder X-ray diffraction pattern (Pauling and Corey, 1953a). Because an oriented sample could not be obtained, no definitive structure determination existed for a long time, and the APPS structure was assumed as the basis for early analyses of IR spectra (Elliott and Malcolm, 1956; Miyazawa, 1960a, 1967; Abe and Krimm, 1972a) and in conformational energy calculations (Venkatachalam, 1968a; Hopfinger, 1971). The preparation of "single crystals" and of thin oriented films permitted Lotz (1974) to undertake for the first time an electron-diffraction analysis on an oriented structure. From considerations of unit cell symmetry and the results of conformational energy calculations (Colonna-Cesari et al., 1974), Lotz proposed that an APRS structure was a more reasonable model for (Gly)nI. In this structure, first suggested on the basis of model-building studies (Pauling and Corey, 1953b), alternate chains in the sheet consist of all L and all D residues. [Of course, this condition can be satisfied by (Gly)nI since its Cα atom is achiral.]
Although a distinction between APPS and APRS structures is difficult on the basis of calculated diffraction intensities (Lotz, 1974), it was possible to show that the APRS structure was consistent with the diffraction data and was able to account for the observed monoclinic geometry of the unit cell (the APPS structure generally giving an orthorhombic cell). The experimental evidence for APRS (Gly)nI was strengthened significantly by the results of the first vibrational analysis on this structure (Moore and Krimm, 1976a). The results showed that the ratios of differences in the three observed amide I mode frequencies could be accounted for by a TDC analysis of the proposed APRS structure, but that these observed ratios were in significant disagreement with calculated values based on the APPS structure. Other spectral features were also in better agreement with the APRS structure.
FIG. 6. Antiparallel-chain rippled sheet polyglycine I structure. (A, B) Schematic diagrams (Moore and Krimm, 1976a); (C) ORTEP drawing.
We therefore assume that (Gly)nI has an APRS structure. This structure is shown in Fig. 6, and its parameters are given in Table VIII. The angle α is that between the a axis and the projection on the ac plane of a line connecting successive Cα atoms of one chain. The quantity Δb is the shift in the b axis direction of the second chain in Fig. 6A with respect to the first; a positive value of Δb results in a decrease in the distance between nearest carbonyl oxygen atoms on adjacent chains compared to this distance for Δb = 0. We have chosen Δb = 0, even though energy
TABLE VIII
Structural Parameters of Crystalline Antiparallel-Chain Rippled Sheet Polyglycine I^a

Dihedral angles:           φ = −149.9°, ψ = 146.5°
Sheet parameters:          a/2 = 4.77 Å, b/2 = 3.552 Å (fiber axis), Δb = 0 Å, α = 76°
Hydrogen-bond parameters:  l(H···O) = 2.12 Å, l(N···O) = 2.91 Å, θ(NH, NO) = 31.4°, γ(NHO) = 134.4°
Intersheet parameters:     c = 3.67 Å, β = 113°

a See Table III for peptide-group geometry.
calculations suggest a value of −0.7 Å (Colonna-Cesari et al., 1974), since this is in best agreement with the results of vibrational analysis (Moore and Krimm, 1976a) and is more consistent with the electron-diffraction studies (Lotz, 1974). It is worth noting that the hydrogen bond in the APRS structure is longer [l(N···O) = 2.91 Å] than that in the APPS structure [l(N···O) = 2.73 Å] (Moore and Krimm, 1976b; Dwivedi and Krimm, 1982b), and that it is less linear: θ(NH, NO) being 31.4° (APRS) versus 9.8° (APPS). The Hα···Hα distance is 2.61 Å. The distribution of the normal modes of the APRS structure of (Gly)nI among the symmetry species, and their optical activity, are given in Table VII.

b. Vibrational Analysis. There have been a number of experimental studies of the IR (Elliott and Malcolm, 1956; Miyazawa, 1961a; Bradbury and Elliott, 1963; Suzuki et al., 1966; Krimm et al., 1967; Krimm and Kuroiwa, 1968; Fanconi, 1973) and Raman spectra (Smith et al., 1969; Small et al., 1970; Fanconi, 1973) of (Gly)nI. These have provided many kinds of information, including the effects of isotopic substitution (Suzuki et al., 1966), but only very limited data on IR dichroism because of the difficulty in obtaining oriented specimens (Bradbury and Elliott, 1963). There have also been some inelastic neutron-scattering measurements on (Gly)nI (Gupta et al., 1968). Infrared and Raman spectra of (Gly)nI are shown in Figs. 7 and 8, respectively.

Early normal-mode calculations were based on the approximation of taking the CH2 group as a point mass, in some cases with computations
FIG. 7. Infrared spectrum of polyglycine I. Top, experimental spectrum (KBr pellet); bottom, plot of calculated (∂μ/∂Q)² (numbers correspond to numbered modes in Cheam and Krimm, 1985).
of the modes of only a single chain (Fukushima et al., 1963; Gupta et al., 1968), while in other cases taking the structure to be a two-dimensional sheet (Miyazawa, 1967; Fanconi, 1972a,b). The first calculation with all atoms included was done on a hydrogen-bonded sheet (Abe and Krimm, 1972a); it used the (now known to be) incorrect APPS structure but incorporated TDC in order to account for splittings in amide I and amide II modes (Krimm and Abe, 1972). The APPS structure was also used in a calculation that determined intensities as well as frequencies
FIG. 8. Raman spectrum of polyglycine I (Small et al., 1970).
(Stepanyan and Gribov, 1979). The APRS structure was the basis of a calculation (Moore and Krimm, 1976a) in which the three-dimensional structure was used to incorporate intersheet TDC (Moore and Krimm, 1975). In a subsequent analysis, the force field was further refined for maximum transferability between different polypeptide molecules (Dwivedi and Krimm, 1982a). The following discussion is based on these two papers (Moore and Krimm, 1976a; Dwivedi and Krimm, 1982a).

The observed and calculated frequencies of (Gly)nI are compared in Table IX. For a detailed discussion of the assignments, and the calculations on isotopically substituted molecules, the original publications should be consulted. We consider here only some salient features of the results.

The relatively high value of the unperturbed amide A frequency, $\nu_A^0$, compared to that of the APPS structure (Moore and Krimm, 1976a; Krimm and Dwivedi, 1982a), is generally consistent with the longer hydrogen bond in (Gly)nI. The unperturbed amide B frequency, $\nu_B^0$, is now naturally accounted for by a combination between the observed 1517 cm⁻¹ Ag mode and an unobserved Bu mode calculated from TDC near 1600 cm⁻¹, substantiating an inference of such a mode derived independently from a Fermi resonance analysis (Tsuboi, 1964; Moore and Krimm, 1976a).

The large observed splittings in the amide I modes are very well accounted for by the TDC interactions. Since without TDC the normal-mode calculation gives a maximum splitting of 10 cm⁻¹ (Ag, 1684; Au, 1677; Bg, 1676; Bu, 1674 cm⁻¹), whereas the observed splitting is about 50 cm⁻¹, it is evident that this interaction is of major importance in accounting for the observations (see Section II,D,2,c). The same is even more true of the amide II modes: without TDC we find Ag, 1534; Au, 1535; Bg, 1559; Bu, 1559 cm⁻¹. The relatively low observed amide II frequencies (1517 and 1515 cm⁻¹) seem to be characteristic of the APRS structure of (Gly)nI.
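The size of the TDC effect quoted above is simple arithmetic on the frequencies in the text and in Table IX, and can be checked with a short sketch. The helper name `max_splitting` is hypothetical; the numbers are those quoted for (Gly)nI, with the three observed amide I bands taken as 1685, 1674, and 1636 cm⁻¹.

```python
def max_splitting(frequencies_cm1):
    """Largest minus smallest frequency of a set of modes, in cm^-1."""
    return max(frequencies_cm1) - min(frequencies_cm1)


# Frequencies (cm^-1) quoted for (Gly)nI.
amide_I_without_tdc = [1684, 1677, 1676, 1674]
amide_I_observed = [1685, 1674, 1636]
amide_II_without_tdc = [1534, 1535, 1559, 1559]

print("amide I, no TDC:  ", max_splitting(amide_I_without_tdc))
print("amide I, observed:", max_splitting(amide_I_observed))
print("amide II, no TDC: ", max_splitting(amide_II_without_tdc))
```

The amide I splitting without TDC comes out at 10 cm⁻¹ against 49 cm⁻¹ observed, which is the comparison on which the argument for the importance of TDC rests.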
The amide III region has been used as an indicator of chain conformation, although it has been pointed out (Hsu et al., 1976) that caution is advisable in this regard since these modes (NH ib plus CN s) are sensitive to side-chain composition. In (Gly)nI, NH ib is predicted to contribute from 1415 to 1152 cm⁻¹, with only calculated bands at 1304 and 1286 (observed 1295W IR) cm⁻¹ having this coordinate as the major contributor. The results on N-deuterated (Gly)nI support these assignments (Dwivedi and Krimm, 1982a), and show in particular that the 1236M cm⁻¹ IR band and its strong Raman counterpart at 1234 cm⁻¹, although in a region normally associated with amide III, should be assigned to CH2 tw rather than to an amide mode, as had been assumed (Smith et al.,
TABLE IX
Observed and Calculated Frequencies of Polyglycine I
Observed^a (cm⁻¹): Raman, IR
Calculated (cm⁻¹): Ag, Au, Bg, Bu
3272 3272 3271 3272S^c 2932S 2932S
3271 2934 2934 2929
2929W 2929W
2928 2865
2869M 2869M
2865 2869VW 2869VW
2861 2861 1695 1689
1685M 1674S
1677 1643
1636S
1602 1572
1517S
1515
1515W
1514
1460S
1454 1454 1441 1439
1432S 1410M
1415 1408W
1415
1341W
1341 1338
1338W 1304
1286
1295W 1255M
1253 1253
1234S
1243 1242
1236M 1220W
1213
Potential energy distribution^b
NH s (98)
NH s (98)
NH s (98)
NH s (98)
CH2 as (98)
CH2 as (98)
CH2 as (99)
CH2 as (99)
CH2 ss (98)
CH2 ss (98)
CH2 ss (99)
CH2 ss (99)
CO s (77), CN s (15), CαCN d (11)
CO s (75), CN s (20), CαCN d (11)
CO s (74), CN s (21), CαCN d (11)
CO s (69), CN s (22), CαCN d (11)
NH ib (56), CN s (19), CαC s (12)
NH ib (51), CαC s (16), CN s (14)
NH ib (35), CN s (28), CαC s (17), CO ib (14)
NH ib (35), CN s (27), CαC s (17), CO ib (14)
CH2 b (66), CH2 w (16)
CH2 b (65), CH2 w (17)
CH2 b (96)
CH2 b (96)
CH2 w (41), CH2 b (31), NH ib (14)
CH2 w (40), CH2 b (33), NH ib (13)
CH2 w (84)
CH2 w (79)
NH ib (30), CO ib (19), CN s (18), CαC s (16)
NH ib (39), CαC s (17), CO ib (16), CN s (12)
CH2 tw (76), CH2 w (17)
CH2 tw (76), CH2 w (17)
CH2 tw (93)
CH2 tw (92)
NCα s (29), NH ib (23), CH2 w (18), CH2 tw (16), CN s (13)
1214W 1162M
1212 1153 1152
1021VS
1015 1016M
1014 1002 1000
987W
980 979 946
936M
940
888W
890
884M
890 768 767
718 718 702 630
628W 614M
629 621 613 589
599W 589W
587 580 589M
327W
579 323
321W
NCα s (29), NH ib (23), CH2 w (18), CH2 tw (15), CN s (13)
NCα s (50), CαC s (13), NH ib (12)
NCα s (50), CαC s (14), NH ib (12)
NCα s (77), CαC s (10)
NCα s (77), CαC s (10)
CH2 r (45), CO s (11), CαC s (10)
CH2 r (49), CO s (10)
CH2 r (68), CN s (10)
CH2 r (70), CN s (10)
CH2 r (29), CN s (12), CαC s (11), NCαC d (10)
CH2 r (25), CN s (13), CαC s (12), NCαC d (10)
CαC s (29), CN s (21), CH2 r (14), CO s (13)
CαC s (31), CN s (24), CO s (12), CH2 r (12)
CO ib (16), NCα s (15), CαC s (15), CN t (12), NCαC d (11)
CαC s (19), CO ib (17), NCα s (16), NCαC d (11), CNCα d (11)
CN t (63), NH···O ib (15), NH ob (11), H···O s (11)
CN t (75), NH···O ib (19), NH ob (16), H···O s (10)
CN t (79), NH ob (26), NH···O ib (23), H···O s (10)
CN t (79), NH ob (29), NH···O ib (25), H···O s (15)
CO ib (36), CO ob (24), CαC s (10)
CO ib (37), CO ob (23), CαC s (10)
CO ob (67), CαCN d (15), NH ob (14), NCαC d (10)
CO ob (59), CαCN d (20), NH ob (17), NCαC d (11)
CαCN d (47), CO ob (17)
CαCN d (43), CO ob (24)
CO ob (45), CO ib (28), CαC s (12)
CO ob (45), CO ib (27), CαC s (11)
NCαC d (21), CO ib (16), NH ob (15)
NCαC d (21), CO ib (18), NH ob (15)
736
708S
320
291 285W
290 252
260W
250 217W
226
211W
214
170W
178
180 140M, br
140 135
112M
111 108
82S
88
71 37 31
12
Potential energy distribution^b
CαCN d (51), NCαC d (21), NCα s (10)
CαCN d (56), NCαC d (19), NCα s (12)
CNCα d (41), CO ib (28)
CNCα d (41), CO ib (30), NH ob (15)
CNCα d (68), CO ib (10), H···O s (10)
CNCα d (74)
NH ob (70), CO ob (20)
NH ob (67), CO ob (15), CαCN d (12), CH2 w (12)
H···O s (46), CN t (46), NH ob (37), CαC t (11)
H···O s (29), CN t (25), NCαC d (15)
H···O s (78), CN t (18)
NCαC d (46), CαC t (16), NH ob (16), NCα t (14)
NH ob (43), CN t (27), NCαC d (26), Hα···Hα s (13), NCα t (12), H···O s (11)
CαC t (35), NCα t (24), CN t (22), NH ob (22), NH···O ib (17)
NH···O ib (38), CO···H ib (28), NH ob (18), H···O s (17)
NH···O ib (35), CO···H ib (31), CN t (21), NH ob (16), NCα t (13)
NH t (52), CO t (34)
a S, strong; M, medium; W, weak; V, very; br, broad.
b s, stretch; as, antisymmetric stretch; ss, symmetric stretch; b, angle bend; ib, in-plane angle bend; ob, out-of-plane angle bend; w, wag; tw, twist; r, rock; t, torsion; d, deformation. Only contributions of 10 or greater are included.
c Unperturbed frequency.
1969; Small et al., 1970). Although TDC makes a smaller contribution to amide III than to amide II, it is not necessarily negligible; the unperturbed frequencies of the eight modes in this region that contain NH ib contributions are 1422 (Ag), 1422 (Au), 1285 (Bg), 1278 (Bu), 1223 (Ag), 1222 (Au), 1158 (Ag), and 1157 (Au) cm⁻¹.

The amide V mode, which seems to be very sensitive to chain conformation, is found as a strong IR band at 708 cm⁻¹ and is fairly well predicted by the calculation. Its assignment to the Bu(⊥) species is likely on the basis of the structure, but it could also have an Au(∥) component because of the orientation of the peptide group; this can be decided only from polarized IR spectra on an oriented sample.

In Section II,D,4 we mentioned that recent ab initio calculations of dipole derivatives for the peptide group in NMA have been used to calculate intensities of IR bands in (Gly)nI (Cheam and Krimm, 1985). Such calculated intensities are shown in Fig. 7, and it can be seen that they reproduce the observed intensities quite well. This kind of agreement indicates that the force field is a very satisfactory one, since intensities are a sensitive function of the eigenvectors. While (Gly)nI is the only polypeptide so far for which intensities have been calculated, it can be expected that this technique will be used in the future to provide additional information on polypeptide chain conformation.

C. Antiparallel-Chain Pleated Sheet

1. β-Poly(L-alanine)

a. Structure and Symmetry. Early X-ray diffraction studies of β-(Ala)n clearly demonstrated the extended nature of the chains in this structure (Bamford et al., 1953, 1954). Following a suggestion by Marsh et al.
(1955a) that the sheet structure corresponds to the APPS (Pauling and Corey, 1951c, 1953b), Brown and Trotter (1956) tested various packing arrangements of such sheets but were unable to find any that gave calculated structure factors in acceptable agreement with observed intensities in their fiber-diffraction pattern. In an X-ray refinement procedure that additionally allowed Δb to be a parameter, Arnott et al. (1967) were able to find an APPS structure with statistical packing of sheets that gave good agreement with the observed intensities. In this structure, Δb, although it could not be well refined, was felt to be probably between 0 and −0.65 Å. However, Δb can be obtained relatively accurately from a TDC analysis of the amide I modes (Moore and Krimm, 1976b), and was found to be −0.27 Å. This APPS structure of β-(Ala)n is shown in Fig. 9, and its parameters are given in Table X. The bond lengths are the same as for the standard
FIG. 9. ORTEP drawing of antiparallel-chain pleated sheet of poly(L-alanine). The CH3 group is represented by a point mass.
TABLE X
Structural Parameters of Crystalline Antiparallel-Chain Pleated Sheet Poly(L-alanine)^a

Bond lengths (Å):          l(Cα–C) = 1.53, l(C–N) = 1.32, l(N–Cα) = 1.47, l(C=O) = 1.24,
                           l(Cα–H) = 1.07, l(C–H) = 1.09, l(N–H) = 1.00
Bond angles (degrees):     CαCO = 121.0, CαCN = 115.4, CNH = 123.0, CNCα = 120.9
                           All angles about Cα and Cβ tetrahedral
Dihedral angles:           φ = −138.4°, ψ = 135.7°
Sheet parameters:          a/2 = 4.73 Å, b/2 = 3.445 Å (fiber axis), Δb = −0.27 Å, α = 80°
Hydrogen-bond parameters:  l(H···O) = 1.75 Å, l(N···O) = 2.73 Å, θ(NH, NO) = 9.8°, γ(NHO) = 164.6°
Intersheet parameters:     c/2 = 5.27 Å, β = 90°

a From Arnott et al. (1967).
geometry, but the CαCN and CNCα angles differ by 1.4 and 2.1°, respectively; this is not expected to have a significant effect. The hydrogen-bond length is slightly shorter than in (Gly)nI, as is the Hα···Hα distance (2.325 Å). The normal modes of APPS β-(Ala)n are distributed among the symmetry species, and have optical activity, as follows (Moore and Krimm, 1976b): A[ν(0, 0)], Raman, 30; B1[ν(0, π)], Raman, IR(∥), 29; B2[ν(π, 0)], Raman, IR(⊥), 29; B3[ν(π, π)], Raman, IR(⊥), 29.

b. Vibrational Analysis. Since specimens of β-(Ala)n can be well oriented, dichroic IR spectra were soon available (Elliott, 1954). Far-IR spectra have also been obtained (Itoh et al., 1968, 1969; Itoh and Katabuchi, 1972), as well as spectra of the N-deuterated molecule (Masuda et al., 1969; Dwivedi and Krimm, 1982b). Raman spectra are also available (Fanconi, 1973; Frushour and Koenig, 1974). Infrared and Raman spectra of β-(Ala)n are given in Figs. 10 and 11, respectively.

The first normal-mode calculation on β-(Ala)n to incorporate all of the atoms in the structure used a force field transferred from (Gly)nI and refined for a CH3 side chain (Moore and Krimm, 1976b). This force field was subsequently adjusted slightly (Dwivedi and Krimm, 1982b, 1983), and the results of this calculation, given in Table XI, are the basis of our present discussion. (The original paper should be consulted for the corresponding analysis of the spectra of the N-deuterated molecule.) Using this detailed force field, an "approximate" force field was derived for a structure in which the CH3 group is taken as a point mass (Dwivedi and Krimm, 1984b), and its calculated frequencies are also given in Table XI (as the second of the two entries for each mode).

The $\nu_A^0$ frequency, determined from a Fermi resonance analysis (Moore and Krimm, 1976a; Krimm and Dwivedi, 1982a), is significantly lower than that in (Gly)nI: 3242–3250 vs. 3272 cm⁻¹, consistent with the shorter hydrogen bond in β-(Ala)n.
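The hydrogen-bond parameters quoted for the two sheet structures can be cross-checked against one another with elementary trigonometry. The sketch below assumes that θ(NH, NO) is the angle at the nitrogen between the N–H bond and the N···O line, that γ(NHO) is the angle at the hydrogen, and that l(N–H) = 1.00 Å in both structures; these interpretations, and the function name, are assumptions made for illustration rather than statements from the original tables.

```python
import math


def h_bond_geometry(l_nh, l_no, theta_deg):
    """From l(N-H), l(N...O) and the angle theta at N (degrees), return
    l(H...O) and the N-H...O angle at H (degrees), via the law of cosines."""
    theta = math.radians(theta_deg)
    l_ho = math.sqrt(l_nh**2 + l_no**2 - 2.0 * l_nh * l_no * math.cos(theta))
    cos_gamma = (l_nh**2 + l_ho**2 - l_no**2) / (2.0 * l_nh * l_ho)
    return l_ho, math.degrees(math.acos(cos_gamma))


# Assumed N-H bond length of 1.00 A for both structures.
aprs = h_bond_geometry(1.00, 2.91, 31.4)  # rippled sheet, (Gly)nI
apps = h_bond_geometry(1.00, 2.73, 9.8)   # pleated sheet, beta-(Ala)n

print("APRS: l(H...O) = %.2f A, angle at H = %.1f deg" % aprs)
print("APPS: l(H...O) = %.2f A, angle at H = %.1f deg" % apps)
```

Under these assumptions the computed values agree with the tabulated l(H···O) distances (2.12 and 1.75 Å) and N–H···O angles (134.4° and 164.6°) to within rounding, which makes the longer and less linear hydrogen bond of the rippled sheet explicit.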
In the case of $\nu_B^0$ it seems possible to account for its frequency of 3096–3109 cm⁻¹ by two combinations of amide II modes, B1 + B3 and A + B2, compared to only one likely combination for (Gly)nI (Moore and Krimm, 1976a; Krimm and Dwivedi, 1982a).

The maximum observed amide I splitting in β-(Ala)n, 1694 − 1632 = 62 cm⁻¹, is significantly larger than that in (Gly)nI, 1685 − 1636 = 49 cm⁻¹, and this difference is very well reproduced by the calculation: 65 versus 46 cm⁻¹. (Again, of course, the amide I splittings are due to TDC. Without this interaction we calculate A, 1670; B1, 1673; B2, 1665; B3, 1670 cm⁻¹.)

Three assignable amide II modes are observed in the spectra, compared to two for (Gly)nI, and these agree well with the calculated frequencies. (Without TDC the computed frequencies are A, 1545; B1,
FIG. 10. Infrared spectrum of β-poly(L-alanine). (a) Mid-infrared region. (—) Electric vector perpendicular to the direction of stretching. (---) Electric vector parallel to the direction of stretching (Elliott, 1954). (b) Far-infrared region (Itoh and Katabuchi, 1972).
1549; B2, 1550; B3, 1554 cm⁻¹.) It is interesting that the maximum predicted amide II splitting for β-(Ala)n (64 cm⁻¹) is much less than that for (Gly)nI (88 cm⁻¹), yet the observed splitting of the two lowest frequency modes of β-(Ala)n (14 cm⁻¹) is much larger than that of (Gly)nI (2 cm⁻¹), and this difference is predicted very well.

Although amide III is generally assigned to the Raman bands at 1243 and 1226 cm⁻¹, and the calculation indeed shows NH ib + CN s contributions to these modes, we again see that NH ib is present in calculated frequencies that range from 1402 to 1195 cm⁻¹. Observed bands in the higher frequency region support such an assignment; for example, the 1402MW cm⁻¹ IR band disappears on N-deuteration (Dwivedi and
FIG. 11. Raman spectrum of β-poly(L-alanine) (Fanconi, 1973).
Krimm, 1982b). As we will see below, these higher frequency modes become important in characterizing β-turn structures.

The amide V mode, as in (Gly)nI, appears as a strong IR band at 706 cm⁻¹. The essential absence of any significant dichroism in this band in well-oriented specimens of β-(Ala)n (Itoh et al., 1968) suggests the near superposition of components having parallel and perpendicular polarization. As can be seen, this is supported very well by the results of the calculation, which additionally indicate that a weak Raman band at 698 cm⁻¹ should be assigned to amide V.

2. β-Poly(L-alanylglycine)

Since the Ala-Gly sequence is the major one in Bombyx mori silk, the structure of this sequential polypeptide is of importance in understanding the structure of silk. In early studies of β-(AlaGly)n it was found that the X-ray powder diffraction pattern and the IR spectrum resembled those from the crystalline component of B. mori silk. The proposed APPS structure of silk (Marsh et al., 1955b) led to a more detailed analysis of the X-ray pattern of β-(AlaGly)n (Fraser et al., 1965, 1966). This, together with conformational energy analysis (Colonna-Cesari et al., 1975),
TABLE XI
Observed and Calculated Frequencies of β-Poly(L-alanine)
Observed^a (cm⁻¹): Raman, IR
Calculated^b (cm⁻¹): A, B1, B2, B3
3243 3242 3243 3242
2984s
2984 2984
Potential energy distribution^c
NH s (97)
NH s (97)
NH s (97)
NH s (97)
NH s (97)
NH s (97)
NH s (97)
NH s (97)
CH3 as2 (50), CH3 as1 (49)
CH3 as2 (55), CH3 as1 (45)
-
2984 -
CHs as2 (53). CH3 as1 (46) 2984 -
2983 -
CHy as2 (52), CHJ as1 (48) CHy as1 (50), CH, as2 (49)
2983
CHy as1 (55), CHJ as2 (45)
-
2980 sh (I)
2983 -
CH, as1 (53). CH, as2 (46) 2983 -
CHS as1 (51), CHs as2 (48)
CHJ ss (100)
2929 -
CH:<ss (100)
2929 2929 2933s
2934W (I)
2929
-
2877 2875
2871 sh
2877 2875 2866 2864 2866 2864
2874VW (I)
1698 1698 1695 1695
1694W (11) 1669s
1670 1670 1632VS (I)
1630 1630 1592 1593
1553VW
1555MW (I)
1562
CH3 ss (100)
CαHα s (98)
CαHα s (98)
CαHα s (98)
CαHα s (98)
CαHα s (99)
CαHα s (99)
CαHα s (99)
CαHα s (99)
CO s (78), CN s (14)
CO s (79), CN s (13)
CO s (76), CN s (19)
CO s (77), CN s (20)
CO s (73), CN s (21)
CO s (74), CN s (22)
CO s (70), CN s (21)
CO s (72), CN s (21)
NH ib (57), CN s (21), CαC s (10)
NH ib (58), CN s (21)
NH ib (53), CN s (17), CαC s (14)
1563
NH ib (53), CN s (18), CαC s (13)
NH ib (48), CN s (22), CO ib (12), CαC s (11)
NH ib (48), CN s (22), CO ib (11), CαC s (10)
NH ib (41), CN s (26), CO ib (14), CαC s (13)
NH ib (42), CN s (26), CO ib (13), CαC s (12)
CH3 ab1 (44), CH3 ab2 (39)
1539 1542 1528 1531 1455 1455
CH,3 abl (46), CH:
1453 -
1446s (I)
1452
CH:
-
1451s
CH:3 abl (88),CHS rl (10)
1452
CH:
-
1454s (11)
1452
CHs ab2 (48),CH:
-
1446s (I)
1451
CHJ ab2 (86),CHs r2 (10)
-
I45I -
1399W
1402 1399 1402MW (11)
I399 1395
1386W (I)
1385
CH:
-
1383
CH, sb (74).H" bl (16), C"C0 s (10)
-
1368W
1372 -
CHs sb (64), H" b2 (16)
1372MW (∥) 1335W
1372 -
CHs sb (61). H" b2 (19)
1330W, br (I)
1332 1338 1317 1319 1305 1301
1311W
1236 1242 1231 1235 1 I98 1I07
1I96 1205 1195 1106
I I95 1201 1165W
Hα b2 (42), NH ib (25), CαC s (12), CH3 sb (11)
Hα b2 (36), NH ib (31), CαC s (16)
Hα b2 (64), NH ib (15)
Hα b2 (48), NH ib (29), CαC s (10)
Hα b2 (30), CN s (18), CO ib (15)
Hα b2 (38), CN s (16), CO ib (12), Hα b1 (10)
1309 sh 1299 1299
12438
Potential energy distribution'
1162
NH ib (23), CO ib (la), C-C s (14). CN s (13) H" b2 (23), CN s (14), CO ib (14), H" bl (12). C"C s (11) H" b2 (34), NH ib (19), NC" s (19), CN s (13) H" b2 (30). NH ib (25), H" bl (20), C"C@s (15) H" b2 (28), NC" s (24). NH ib (I@, CN s (1 1) NH ib (26), H" b2 (26), Ha bl (22), CaC6 s (19) NC" s (36). CHs r l (13), C"C s ( I I ) , NH ib (10) NC" s (54), C"C@s (27). H" bl (1 1) C"C@s (33). H" bl (27), NC" s (22), CHs sb (1 I ) , CHs r l (10) C"C6 s (57), NCn s (19). H" bl (19). CN s (10) NC" s (30), NH ib (13), C"C s (12), H" b2 ( l l ) , CHs r l (11) NC" s (53). C"C@s (28), H" bl (1 1) C"C@s (33), H" bl (26), NC" s (22), CHs r l (lo), CHS sb (10) C"C@s(54),NC"s(18), H " b l (18),CNs(10) H" bl (56), CHs sb (19). C"C@s (13)
C"CS s (43). H" bl (19). CN s (18) H" b l (54), CHs sb (18). C"C6 s (13), CN s (10) C"C@s (39). CN s (19). H" bl (17) CHs r2 (27), H" bl (26), CHs sb (1 1)
1 I68
1161 1 I66
1 120vw
1092s
i 1
1 I25
1125 -
H" bl (26), CHs r2 (26), CHs sb (11)
CHs r2 (55), C"C0 s (21)
1093 1092
CH, r2 (54), CpCS s (22)
-
1085
CHs r l (27). CHs r2 (25). C"CS s (14), CN s (10)
1084W (I)
1086 1065 1073
1052M (I)
1064 1073 1065M
1055 1050
966s
1054 1051 970 -
(11)
967M
C"CS s (23), H" bl (21), CHs r l H" bl (67), C"C@s (15) C"C@s (22), H" bl (21), CHs r2 H" bl (67), C"CS s (15) C"CS s (50). CHs r l (14), H" bl H" bl (44), NC" s (28) C"C@s (50), CHs r l (13). H" bl H" bl (44), NC" s (27), CHs r l (50), NC" s (25)
(18), CHs r2 (17) (19), CHs r l (17)
(1 1)
( l l ) , CHs r2 (10)
CHs r l (50), NC" s (25)
969 925M (I)
CH3 r l (26), CHs r2 (26), C"CS s (13), CN s (10)
918 940
NC" s (27). CHs r l (23), CN s (15) NC" s (79). CN s (16) (Continued)
909VS
913 906 913 938 904 895 846 847 844 851 775 79 1 775 788 708 712
706 708 705 708 698VW
C"C s (15). CN s (14), CH3 1-2(14). CNC" d (13), COs(l1) (2°C s (20), CNCa d (la), CO s (14), CN s (14), C"CN d (13). NC"C d (13) NC" s (30). CH3 rl (23). CN s (14) NC" s (82), CN s (14) c"C s (15), CN s (14), CH3 r2 (14), CNC"d (13), co s (12) C"C s (20), CNC" d (19). NC"C d (15), C"CN d (14), CN s (14), CO s (14) C"C s (34), NC" s (13), CN s (12), C"CS s (12) CN s (20), CO ib (20), CuC s (19) (2°C s (31), NC" s (19), CuC@s (14) CO ib (20), CN s (17). CuC s (15), CmCS s (10) CO ib (19), CuC s (15), NCas (13). CS b2 (13), NC"C d (12), CNC" d (1 1) C"C s (54), NCuC d (23), CS b2 (12), CNC" d (10) CO ib (IS), CuC s ( I @ , CS b2 (14), NCas (12), NC"C d (1I ) , CNC" d (1 1) C"C s (55), NC-C d (23), CS b2 (12), CNC" d (10) CN t (74), NH ob (28), NH ... 0 ib (23) CN t (75), NH ob (26). NH ... 0 ib (23) CN t (44), NH ob (41), NH ... 0 ib (19). CO ob (18), H" bl (10) CN t (50), NH ob (43), NH ... 0 ib (20), CO ob (15) CN t (74), NH ob (28), NH ... 0 ib (22) CN t (76), NH ob (26). NH ... 0 ib (22) CN t (48), NH ob (41), NH ... 0 ib (19), CO ob (13),
H" bl (10) CN t (55), NH ob (42), NH ... 0 ib (21), CO ob (1 1) CO ob (50), CN t (35), CS bl (10) CO ob (62). CN t (27). CS bl (15) CO ob (47), CN t (41), C@b l (10) CO ob (59), CN t (33), C@bl (14) CO ib (46), CaC s (16), CO ob (13) CO ib (51), CaC s (19) CO ib (49), CaC s (16), CO ob (12) CO ib (54), CaC s (19), CN s (10) C"CN d (38), CO ob (18). NC% d (16), NH ob (16), CS b2 (15) C"CN d (35). CO ob (23), C@b2 (21), NH ob (IS), NC"C d (12)
705 668 658 662 652 624 628 62 1 624
I
626 624
615W (I)
622 620 592 588 59 1 586 447 444
448M (I)
447 444 437w
440 439 432M
(11)
440 439 328 333
326W (11) 332VW
327
CαCN d (31), CO ob (30), NH ob (22), Cβ b2 (16), NCαC d (14)
CO ob (36), CαCN d (29), NH ob (24), Cβ b2 (21), NCαC d (10)
CO ob (52), CαCN d (19)
CO ob (53), CαCN d (19), Hα b2 (12)
CO ob (62), CαCN d (11), NH ob (11), Hα b2 (10)
CO ob (64), Hα b2 (15), CαCN d (12), NH ob (12)
Cβ b1 (55), NH ob (15)
Cβ b1 (53), NH ob (14), NCαC d (10)
Cβ b1 (54), NH ob (13)
Cβ b1 (52), NH ob (12), NCαC d (10)
Cβ b2 (75), NCαC d (13)
Cβ b2 (79), NCαC d (11), NH ob (10)
Cβ b2 (74), NCαC d (12)
Cβ b2 (78), NCαC d (11), NH ob (10)
NCαC d (25), Cβ b2 (11), CO ob (11), Cβ b1 (10)
NCαC d (26), Cβ b2 (19)
NCαC d (24), Cβ b2 (11), CO ob (11), Cβ b1 (10)
332 300M
286 289
300M
286 289 279 263
266VW, sh
27 1 259 252 254 252 254
235M. sh
240
-
NCαC d (25), Cβ b2 (21)
CO ib (33), NCαC d (19), CNCα d (15), Cβ b2 (15), Cβ b1 (13), CO ob (11)
CO ib (38), NCαC d (23), CNCα d (20), Cβ b1 (12), Cβ b2 (11)
CO ib (33), NCαC d (19), Cβ b2 (15), CNCα d (14), Cβ b1 (13), CO ob (10)
CO ib (38), NCαC d (23), CNCα d (19), Cβ b1 (12), Cβ b2 (12)
CαCN d (39), CαCβ t (26), NCαC d (11)
CαCN d (46), NCαC d (22), Cβ b1 (13)
CαCN d (38), CαCβ t (25), NCαC d (17)
CαCN d (38), NCαC d (29), Cβ b1 (15)
Cβ b2 (50)
Cβ b2 (68)
Cβ b2 (50)
Cβ b2 (68)
CαCβ t (67), NCαC d (11)
CαCβ t (49), NCαC d (16)
240 -
C"C@t (91)
238 238 21 1 220
Potential energy distributionc
C"C8 t (90) CNC" d (35), C"C8 t (23). H ... 0 s (23) CNC" d (43), H ... 0 s (27), NC"C d (1l), CO ib (11)
185VW
177 181 156 159 151 155
135s
147 151 122W, br
118 122 104 106 103 104
91M, sh
91 93 87 89 74 74 41 43 32 32 31 31
CNC" d (62).C"CN d (15) CNC" d (70), C"CN d (15) H ... 0 s (58), NC"C d (10) H ... 0 s (55), NC"C d ( l l ) , NH ob (10) NH ob (39), CO ob (13), NC" s (13), H" b2 (10) NH ob (38), CO ob (13), NC" s (10) NH ob (37), CO ob (16), NC' s (12), Hu b2 (10) NH ob (36), CO ob (16), CmCNd (10) CYJC" d (21), NC"C d (18), C"CN d (13), NH ob (12) CNC" d (21), NC"C d (19). C"CN d (13), NH ob (13) NH ... 0 ib (27), CN t (21), CS bl (15) NH ... 0 ib (26). CN t (22), CS bl (15) NH ... 0 ib (33). CN t (21), NH ob (20), CuC t (15). CO-Ht(l1) NH *.- 0 ib (33), CN t (22).NH ob (20), C"C t (15), CO-Ht(l1) Cebl (26),NHob(19),H*-*Os(14),CaCt(11) CO bl (26), NH ob (20), H ... 0 s (13), C"C t (10) H ... 0 s (33), CNC" d (17), CN t (1 1). NC"C d (10) H ... 0 s (36), CNC" d (17) C"C t (24), NC" t (22), H ... 0 s (22), CN t (15) C"C t (25). NC" t (22), H ... 0 s (22), CN t (15) CO ... H ib (21), NH ob (20), CS bl (16). NH ...0 ib (15), NH ... 0 t (12) CO ... H ib (21), NH ob (21), CB bl (16), NH ... 0 ib (16), NH ... 0 t (12) NH ... 0 ib (37). CO ... H ib (22). CN t (15) NH ... 0 ib (37), CO .-.H ib (22), CN t (16) H ... 0, s (15). CO ... H t (15), CO ... H ib (14). NH -0 ib (14), NH -0 t (12), NH ob (If), CN t (10) H ... 0 s (16), NH ... 0 ib (15). CO ... H ib (14). (Continued)
TABLE XI (Continued) Observed" (cm-') Raman
IR
Calculatedb(cm-I) A
19 19
BI
Bz
Bs
Potential energy distribution' C O - H t(13). NHob(12),NH--.Ot(12),CN t ( l 0 ) CO ... H t (35), NH ... 0 t (27), CO ... H ib (14). H" ... Ha s (12) CO ... H t (37), NH ... 0 t (28). CO ... H ib (13), H"--H"s(ll)
S, Strong; M, medium; W, weak; V, very; sh, shoulder; br, broad; ∥, parallel dichroism; ⊥, perpendicular dichroism. The first frequency in each vertical pair is from the detailed calculation, the second is from the approximate calculation. s, Stretch; as, antisymmetric stretch; ss, symmetric stretch; b, angle bend; ib, in-plane angle bend; ob, out-of-plane angle bend;
ab, antisymmetric angle bend; sb, symmetric angle bend; r, rock; d, deformation; t, torsion. In the approximate calculation Cβ represents the point mass (CβH₃). Only contributions of 10 or greater are included. Unperturbed frequency.
established the chain dimensions and sheet packing. These parameters were the basis for a normal-mode calculation (Moore and Krimm, 1976b), except that Δb = -0.25 Å, as determined from TDC analysis of the amide I modes, was used rather than the larger value given by conformational energy analysis (Colonna-Cesari et al., 1975). The APPS structure of β-(AlaGly)ₙ is analogous to that of β-(Ala)ₙ (see Fig. 7), except for the following differences in structural parameters: CαCN = 114.0°, CNCα = 123.0°, φ = -143°, ψ = 139°, a/2 = 4.71 Å, b/2 (fiber axis) = 3.475 Å, Δb = -0.25 Å, α = 71°, l(H···O) = 1.74 Å, l(N···O) = 2.73 Å, θ(NH, NO) = 8°, γ(NHO) = 168°, c = 8.87 Å (c_Gly = 3.79 Å, c_Ala = 5.08 Å), β = 90°. The lower chain symmetry results in a lower unit cell symmetry, giving rise to 50 normal modes of symmetry species A′ [ν(0,0), ν(π,π)] and 49 modes of symmetry species A″ [ν(0,π), ν(π,0)]. All of the modes are IR- and Raman-active, but we expect ν(0,0) to be weak in the IR, and ν(0,π) to exhibit parallel while ν(π,0) and ν(π,π) exhibit perpendicular IR dichroism with respect to the chain axis. Raman (Frushour and Koenig, 1975a; Moore and Krimm, 1976b) and polarized IR (Fraser et al., 1965) spectra have been obtained for β-(AlaGly)ₙ. The observed ν_A mode is at 3284 cm⁻¹, only 4 cm⁻¹ higher than that of β-(Ala)ₙ, but a Fermi resonance analysis (Moore and Krimm, 1976b) shows that the unperturbed frequency, 3258 cm⁻¹, is 8-16 cm⁻¹ higher than ν_A⁰ in β-(Ala)ₙ (Krimm and Dwivedi, 1982a). This indicates that, despite the fact that similar APPS structures are assumed for these two polypeptides, the hydrogen bond in β-(AlaGly)ₙ is slightly weaker than that in β-(Ala)ₙ. This may be related to a larger NHO angle in β-(AlaGly)ₙ (168°) than in β-(Ala)ₙ (163°). Interestingly, however, the ν_B⁰ frequencies are essentially the same, about 3109 cm⁻¹. This can be satisfactorily explained by ν(0,π) + ν(π,π) = ν(π,0) combinations in both cases.
Even though the ν(0,π) IR(∥) band is observed at 1524 cm⁻¹ in β-(Ala)ₙ and at 1535 cm⁻¹ in β-(AlaGly)ₙ (Fraser et al., 1965), the computed combination frequencies are about the same for the two structures, viz., 1524 + 1592 (calc) = 3116 and 1535 + 1580 (calc) = 3115 [the 1580 (calc) value being from an unpublished calculation by A. M. Dwivedi and S. Krimm]. Thus, spectral features associated with small structural differences are well accounted for by the normal-mode calculation. In the case of β-(AlaGly)ₙ it seems that all of the predicted amide I modes are observed (possibly because of the lower symmetry), and that assignments are unambiguous because of dichroic properties. Thus, we calculate (A. M. Dwivedi and S. Krimm, unpublished results) ν(0,π) = 1704 cm⁻¹: observed at 1702W cm⁻¹, IR(∥); ν(π,π) = 1699 cm⁻¹: observed at 1693W cm⁻¹, Raman; ν(0,0) = 1661 cm⁻¹: observed at 1665VS cm⁻¹, Raman, ~1665 cm⁻¹, IR, W(⊥?); ν(π,0) = 1630 cm⁻¹:
observed at 1636VS cm⁻¹, IR(⊥). It is interesting that, in view of the good frequency and species agreement, the highest frequency is predicted and observed to be the ν(0,π) mode, whereas in both (Gly)ₙI and β-(Ala)ₙ the highest predicted (though unobserved) frequency is the ν(π,π) mode. This particular frequency distribution is a consequence of the specific TDC interactions for β-(AlaGly)ₙ, and the good agreement must be considered as providing strong support for the presence of this interaction mechanism. The strong IR ν(0,π) amide II mode is observed at 1535 cm⁻¹ (and recently calculated at 1525 cm⁻¹), which is a higher frequency than the corresponding bands in (Gly)ₙI (1517 cm⁻¹) and β-(Ala)ₙ (1524 cm⁻¹). The calculations indicate that Raman bands at 1265W and 1230S cm⁻¹ both contain NH ib contributions (significantly larger in the former case). Thus, an interesting pattern of NH ib PED contributions is evident in the observed Raman bands (in cm⁻¹) usually assigned to amide III (PEDs in parentheses): β-(AlaGly)ₙ, 1265W (33), 1230S (16); β-(Ala)ₙ, 1243S (17), 1226M (15); (Gly)ₙI, 1234S (0), 1220W (23). This emphasizes the care needed in characterizing this mode even when dealing with essentially similar chain conformations (Hsu et al., 1976). The amide V mode is again found in the IR near 705 cm⁻¹, and would seem to be a relatively constant band for β structures.

3. β-Poly(L-glutamate)
The structure and spectra of β-poly(L-glutamic acid) [β-(GluH)ₙ] and its salts have been studied in some detail. X-Ray and electron-diffraction studies (Keith et al., 1969a,b), particularly on the Ca salt [β-(GluCa)ₙ], have indicated that this polypeptide forms an APPS structure. A model has been presented for the structure, and coordinates have been given for the atoms in the unit cell (Keith et al., 1969a), but a detailed test was not possible because of the paucity of diffraction data. This structure, which was the basis of a normal-mode calculation (Sengupta et al., 1984), has the following parameters: The unit cell is monoclinic, with a = 9.40 Å, b (fiber axis) = 6.83 Å, c = 12.82 Å, β = 100.3° (which makes the sheet separation 12.61 Å). With bond lengths and angles kept the same as in β-(Ala)ₙ (Arnott et al., 1967), it is found that φ = -134.8°, ψ = 132.0°, and α = 72.7°. A TDC calculation (Sengupta et al., 1984) shows that Δb = -0.27 Å [as in β-(Ala)ₙ] gives the best agreement with the observed amide I mode splittings. Under these conditions the hydrogen-bond parameters are l(H···O) = 1.70 Å, l(N···O) = 2.69 Å, θ(NH, NO) = 4.2°, γ(NHO) = 173.3°. The symmetry species are the same as for β-(Ala)ₙ.
Although there have been several IR and Raman studies on β-(GluNa)ₙ (Lenormant et al., 1958) and β-(GluH)ₙ (Lenormant et al., 1958; Zimmerman et al., 1975; Itoh et al., 1976; Fasman et al., 1978), the spectra of β-(GluCa)ₙ have been studied in detail only recently (Sengupta et al., 1984). These spectra and those of the N-deuterated derivative, together with normal-mode calculations on both structures, permit a very satisfactory analysis of the vibrational spectrum of β-(GluCa)ₙ, even though the main-chain force constants were transferred without modification from the β-(Ala)ₙ force field (Sengupta et al., 1984). A Fermi resonance analysis of ν_A = 3275 cm⁻¹ and ν_B = 3088 cm⁻¹ leads to unperturbed frequencies of ν_A⁰ = 3230 and ν_B⁰ = 3133 cm⁻¹. The former is significantly lower than the comparable NH s mode in β-(Ala)ₙ (viz., 3250-3242 cm⁻¹), indicating the presence of a stronger hydrogen bond in β-(GluCa)ₙ. This is consistent with the structural data: l(N···O) = 2.73 Å in β-(Ala)ₙ and 2.69 Å in β-(GluCa)ₙ. The higher value of ν_B⁰ compared to β-(Ala)ₙ can be accounted for by an A + B2 combination of amide II modes, although this explanation will have to await a more certain assignment of the IR-active amide II mode (Sengupta et al., 1984). The amide I modes are observed at ν(π,π) = 1693sh cm⁻¹ (IR), ν(0,0) = 1666S cm⁻¹ (Raman), and ν(π,0) = 1624VS cm⁻¹ (IR), and are well accounted for by the calculations. The lower ν(π,0) frequency, which is at 1632 cm⁻¹ in β-(Ala)ₙ, is probably due to the slightly stronger hydrogen bond in β-(GluCa)ₙ. The assignment of the amide II mode is somewhat uncertain, because of the presence in this region of the IR spectrum of the strong COO⁻ antisymmetric stretch mode at 1561 cm⁻¹. The amide III modes, assigned by their predicted NH ib contribution and observed disappearance on N-deuteration, are found in the Raman spectrum at (PEDs in parentheses) 1260M (11) and 1223M (11) cm⁻¹ and in the IR spectrum at 1260MW (10) and 1225sh (12) cm⁻¹.
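The Fermi resonance numbers quoted above for β-(GluCa)ₙ can be checked with a two-level coupling model. The sketch below is only an illustration under that assumption: it treats the observed pair (3275, 3088 cm⁻¹) as the eigenvalues of a 2 × 2 matrix whose diagonal holds the unperturbed frequencies (3230, 3133 cm⁻¹) and solves for the coupling W. The function names are hypothetical, and a full analysis typically also draws on the observed band intensities, which are not given here.

```python
import math

def fermi_coupling(nu_a_obs, nu_b_obs, nu_a0, nu_b0):
    """Coupling W (cm^-1) of a two-level Fermi resonance.

    The observed levels are taken as the eigenvalues of
    [[nu_a0, W], [W, nu_b0]], which gives
    (nu_a_obs - nu_b_obs)^2 = (nu_a0 - nu_b0)^2 + 4 W^2.
    """
    split_obs = nu_a_obs - nu_b_obs
    split_0 = nu_a0 - nu_b0
    return 0.5 * math.sqrt(split_obs ** 2 - split_0 ** 2)

def perturbed_levels(nu_a0, nu_b0, w):
    """Upper and lower eigenvalues of [[nu_a0, w], [w, nu_b0]]."""
    mean = 0.5 * (nu_a0 + nu_b0)
    half_split = 0.5 * math.sqrt((nu_a0 - nu_b0) ** 2 + 4.0 * w ** 2)
    return mean + half_split, mean - half_split

# Values quoted in the text for beta-(GluCa)n, in cm^-1.
NU_A_OBS, NU_B_OBS = 3275.0, 3088.0
NU_A0, NU_B0 = 3230.0, 3133.0

w = fermi_coupling(NU_A_OBS, NU_B_OBS, NU_A0, NU_B0)
upper, lower = perturbed_levels(NU_A0, NU_B0, w)
print(f"W = {w:.1f} cm^-1, levels = {upper:.1f}, {lower:.1f} cm^-1")
```

In this model the sum of the two frequencies is unchanged by the interaction, and the quoted values satisfy that condition (3275 + 3088 = 3230 + 3133 = 6363 cm⁻¹).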
Thus, a band of significant intensity is found in the high-frequency range usually associated with the α helix [see below; such bands are found in α-(Ala)ₙ at 1278W, 1271W, and 1261W cm⁻¹ in the Raman and 1270M and 1265M,sh cm⁻¹ in the IR]. Again we see the nature of amide III and its dependence on side-chain composition (Hsu et al., 1976). Despite our observations that amide V seems to be associated with a characteristically strong IR band near 705 cm⁻¹, no such band is found for β-(GluCa)ₙ, although a weak Raman band at 705 cm⁻¹ [cf. the comparable 698-cm⁻¹ band in β-(Ala)ₙ] does qualify as such a mode by its disappearance on N-deuteration. However, the mainly CO ob mode, which is observed at 614M cm⁻¹ in (Gly)ₙI and is not observed in β-(Ala)ₙ, appears as a strong band at ~653 cm⁻¹ in the IR spectrum of β-(GluCa)ₙ. We may therefore expect some variability for amide V in the IR, even though the chain conformation is that of a β sheet.

IV. HELICAL POLYPEPTIDE CHAIN STRUCTURES

A. Introduction
From X-ray studies on unstretched mammalian keratin, it was clear that the polypeptide chain can adopt some kind of “folded” conformation. Limited progress was made in understanding this structure until a systematic study of structural principles governing polypeptide chain conformation (Huggins, 1943) led to the concept of a helix as the most general repeating structure (Huggins, 1943; Crane, 1950). Possible helices having an integral number of residues per turn were characterized by Bragg et al. (1950), but only when this constraint was relaxed, and well-defined stereochemical criteria invoked, was a conformation, the α helix, discovered (Pauling et al., 1951; Pauling and Corey, 1951a) that could be related to actual structures in synthetic polypeptides (Pauling and Corey, 1951b), fibrous proteins (Perutz, 1951), and globular proteins (Kendrew et al., 1960). The α helix is the dominant helical conformation found in globular proteins (Richardson, 1981). However, other helical structures have been proposed (Donohue, 1953) and found to exist. Although normal-mode calculations have so far been done only on several of these (discussed below), we first briefly review the structural features of some of the proposed helices (which are tabulated in Table XII). Single-stranded intramolecularly hydrogen-bonded helices are often designated by the symbol n_m (Bragg et al., 1950), where n represents the number of residues per turn (negative for a left-handed helix) and m the number of atoms contained in the “ring” joined by the hydrogen bond. The standard α helix is a 3.6₁₃ helix. The hydrogen-bonding pattern is defined in terms of the direction from the residue containing the NH donor group to that containing the acceptor O(=C) atom. Thus, the α helix has 5 → 1 hydrogen bonds, meaning that the NH group on residue 5 along the chain, adopting the IUPAC convention for chain direction (Kendrew et al., 1970), bonds to the CO group back on residue 1.
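The ring size m in the n_m symbol follows from the hydrogen-bonding pattern by counting backbone atoms. The sketch below assumes the convention described above (NH donor on residue i + k, C=O acceptor on residue i): the ring then contains O and C of the acceptor residue, three backbone atoms (N, Cα, C) for each of the k - 1 intervening residues, and N and H of the donor residue, which gives m = 3k + 1. The function name is hypothetical.

```python
def ring_size(donor_residue, acceptor_residue):
    """Atoms in the ring closed by an N-H...O=C bond, donor -> acceptor.

    Counts O and C of the acceptor residue, N, C-alpha and C of each
    intervening residue, and N and H of the donor residue.
    """
    k = donor_residue - acceptor_residue
    if k < 2:
        raise ValueError("donor must be at least two residues after acceptor")
    return 3 * k + 1

# Hydrogen-bonding patterns named in the text and in Table XII.
for donor in (3, 4, 5, 6):
    print(f"{donor} -> 1 hydrogen bond: m = {ring_size(donor, 1)}")
```

The counts reproduce the m values listed in Table XII: 7 for the 3 → 1 pattern, 10 for 4 → 1, 13 for 5 → 1, and 16 for 6 → 1.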
A planar peptide group is possible in the α helix, but other helices can form only if some nonplanarity is allowed (see Table XII). The α helix can exist in two conformations, αI and αII, that have the same number of residues per turn, n, and unit axial translation, h. These are discussed in Section IV,B,1, the αII helix possibly being relevant in
TABLE XII
Standard Parameters of Various Helical Structures

Structure           n(a)     m(b)   h(c)     φ(d)    ψ(d)    ω(d)    H bond(e)   Reference(f)
Intramolecular
  αI helix          3.60     13     1.495    -57     -47     180     5 → 1       1
  αII helix         3.60     13     1.50     -70     -36     180     5 → 1       2
  3₁₀ helix         2.99     10     2.01     -45     -30     176     4 → 1       3
  ω helix          -4.00     13     1.325     64      55     175     5 → 1       4
  π helix           4.40     16     1.15     -57     -70     180     6 → 1       5
  π helix          -4.25     16     1.17      51      74     177     6 → 1       6
  2.2₇ helix        2.17      7     2.75     -78      59     180     3 → 1       7
Intermolecular
  3₁ helix          3.00            3.10     -77     145     180                 8
  Polyproline I     3.33            1.90     -83     158       0                 9
  Polyproline II   -3.00            3.10     -78     149     180                 10

(a) Number of residues per turn, positive for right-handed and negative for left-handed helices.
(b) Number of atoms in hydrogen-bonded "ring."
(c) Unit axial translation (Å).
(d) Dihedral angle (degrees).
(e) Hydrogen-bonding pattern, N-H → O=C (Kendrew et al., 1970).
(f) 1, Arnott and Dover (1967); 2, Dwivedi and Krimm (1984a); 3, Prasad and Sasisekharan (1979); 4, Bradbury et al. (1962); 5, Low and Baybutt (1952); 6, Sasaki et al. (1981); 7, Donohue (1953); 8, Crick and Rich (1955); 9, Traub and Shmueli (1963); 10, Arnott and Dover (1968).
the structure of bacteriorhodopsin (Krimm and Dwivedi, 1982b). The 3₁₀ helix, discussed in Section IV,C,1, is similar to the α helix, except that it has a 4 → 1 hydrogen-bonding pattern. It has been observed in proteins (Bernstein et al., 1977) and in oligomers of α-aminoisobutyric acid (Aib) (Shamala et al., 1977, 1978; Smith et al., 1977, 1981; Nagaraj et al., 1979; Prasad et al., 1980) and its polymer, (Aib)ₙ (Malcolm, 1977, 1983; Dwivedi et al., 1984). The ω helix is a left-handed helix with n = 4 (Bragg et al., 1950). It was proposed for the structure of poly(β-benzyl-L-aspartate) (Bradbury et al., 1962), although its exact conformation is still under discussion (Bradbury et al., 1968; Baldwin et al., 1973; Takeda, 1975; Nambudripad et al., 1981). The π helix, which can form if the NCαC angle is distorted by 4°, was originally proposed as a right-handed helix with n = 4.4 (Low and Baybutt, 1952; Low and Grenville-Wells, 1953). It has only been observed in some regions of globular proteins, for example, horse oxyhemoglobin (Perutz et al., 1968) and λ-immunoglobulin Fab′ (Soman and Ramakrishnan, 1983), although it has been suggested as a possible conformation for amphiphilic polypeptides (Kaiser and Kézdy, 1984). A left-handed form has been claimed on the basis of
experimental studies of poly(β-phenethyl-L-aspartate) (Sasaki et al., 1981). A 2.2₇ helix has been proposed (Donohue, 1953) but not yet observed, although this conformation has been identified in small peptides (Smith and Pease, 1980). An interesting new class of helices results when L and D residues alternate along a chain. These structures, known as β helices (Urry, 1971; Lotz et al., 1976), are relevant to the structure of the transmembrane ion-channel protein gramicidin A. Two single-stranded intramolecularly hydrogen-bonded structures (discussed below) have been given most attention, β⁴·⁴ and β⁶·³, and these have the interesting property that hydrogen bonds form only between L residues along the chain and similarly between D residues. The parameters of these structures are discussed in Section IV,E,1. In some cases, polypeptide chains adopt helical conformations involving intermolecular hydrogen bonds. A basic motif for this kind of structure is the threefold 3₁ helix of (Gly)ₙII (discussed below). It is found in left-handed poly(L-proline)II (Cowan and McGavin, 1955; Arnott and Dover, 1968), and in a slightly modified form in right-handed poly(L-proline)I, with cis peptide bonds (Traub and Shmueli, 1963). In an additionally coiled form, it is the basis for the structure of collagen (Ramachandran and Kartha, 1955). In addition, evidence from circular dichroism (Tiffany and Krimm, 1968; Krimm and Tiffany, 1974) and conformational energy calculations (Krimm and Mark, 1968) suggests that a slightly twisted form, a so-called extended helix (Tiffany and Krimm, 1968), is the local conformation present in solubilized polypeptides with charged side chains [a similar conformation was subsequently identified in α-chymotrypsin by Srinivasan et al. (1976)]. In alternating L,D-polypeptides, helices can form with intermolecular hydrogen bonds between two polypeptide chains, and these structures are discussed in Section IV,E,2.
It is clear that vibrational analyses of helical polypeptide chain conformations are of major importance to the study of protein structure, and the initial efforts made in this direction are discussed in the remainder of this section.

B. α Helix
1. α-Poly(L-alanine)

a. Structure and Symmetry. The α helix was first proposed as a possible stable conformation of the polypeptide chain by Pauling and his collaborators (Pauling et al., 1951; Pauling and Corey, 1951a). Subsequent X-ray diffraction studies designed to characterize this structure in detail concentrated on (Ala)ₙ (Brown and Trotter, 1956; Elliott and Malcolm, 1959), and the application of an X-ray refinement procedure has provided the best information on the atomic coordinates (Arnott and Wonacott, 1966; Arnott and Dover, 1967). This structure was the basis of a recent normal-mode analysis (Dwivedi and Krimm, 1984a). The Arnott and Dover (1967) structure, hereafter referred to as the αI helix, has n = 3.62, h = 1.50 Å, and a rotation per residue about the helix axis of t = 99.6°. Its dihedral angles are φ = -57.4° and ψ = -47.5°. For the vibrational analysis, the bond lengths and angles were taken to be the same as in β-(Ala)ₙ (cf. Table X), except that the NCαC angle had to be increased from 109.5 to 109.9°. The torsion angle, χ¹, about the CαCβ bond was assumed to be 60°, which puts a CH₃-group hydrogen atom in a staggered position with respect to the backbone N atom. A portion of this structure is shown in Fig. 12a. Its hydrogen bonds have the following parameters: l(N···O) = 2.86 Å (as compared to 2.87 Å for the refined structure), l(H···O) = 1.88 Å, NHO = 164.2°, HNO = 10.3°. While the above structure of α-(Ala)ₙ has a low conformational energy (Ramachandran et al., 1966b), other helices of this type have comparable energies. In fact, a trough of low energy extends from about φ = -47°, ψ = -60° to φ = -88°, ψ = -25° (Ramachandran and Sasisekharan, 1968). Two other types of α-helical structures in this region may be of
FIG. 12. ORTEP drawing of (a) αI helix and (b) αII helix. Methyl groups are represented by point masses.
interest. In one, designated the αII helix (Némethy et al., 1967; Dwivedi and Krimm, 1984a), φ and ψ change but n and h remain constant. This corresponds to the second of the two possible solutions for the dihedral angles of a helix at constant n and h (Miyazawa, 1961c) (see Fig. 12b). Corresponding to the above αI helix, the αII helix has the following parameters: φ = -70.5°, ψ = -35.8°, l(N···O) = 3.00 Å, l(H···O) = 2.12 Å, NHO = 145.7°, HNO = 23.5° (Dwivedi and Krimm, 1984a). Its weaker hydrogen bonds derive from having the plane of the peptide group tilted with respect to the helix axis (it is nearly parallel in the αI helix), with the NH group pointing toward the axis. In the second type, designated a goniomeric helix (Colonna-Cesari et al., 1977), the values of φ and ψ interchange exactly (thus to φ = -47.5° and ψ = -57.4° in the above case), with small changes resulting in the helical parameters. Although a normal-mode analysis has been done for the αII helix (Dwivedi and Krimm, 1984a), none has yet been done for the goniomeric structure.
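The helical parameters quoted in this section for the αI helix (n, h, and t) are linked by simple geometry: the number of residues per turn is 360° divided by the rotation per residue, and the pitch is n times the unit axial translation. The short sketch below, with hypothetical function names, applies these relations to the quoted values t = 99.6° and h = 1.50 Å as a consistency check.

```python
def residues_per_turn(rotation_per_residue_deg):
    """Residues per turn n for a helix with rotation t (degrees) per residue."""
    return 360.0 / rotation_per_residue_deg

def helix_pitch(n, h):
    """Axial rise per full turn (same length unit as h)."""
    return n * h

# Values quoted in the text for the alpha-I helix.
T_DEG = 99.6   # rotation per residue, degrees
H_ANG = 1.50   # unit axial translation, angstroms

n = residues_per_turn(T_DEG)
pitch = helix_pitch(n, H_ANG)
print(f"n = {n:.3f} residues per turn, pitch = {pitch:.2f} angstroms")
```

The value of n obtained this way agrees with the quoted n = 3.62 to within the rounding of t.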
FIG. 13. Infrared spectrum of α-poly(L-alanine). (a) Mid-infrared region (Elliott, 1954). (b) Far-infrared region. (-) Electric vector perpendicular to the direction of orientation; (---) electric vector parallel to the direction of orientation (Itoh and Shimanouchi, 1970). Abscissa: frequency (cm⁻¹).
The optically active modes of α-(Ala)ₙ are classified into A (δ = 0°), E1 (δ = 99.6°), and E2 (δ = 199.1°) species, where δ is the phase difference between motions in adjacent units of the helix. The A and E1 species are both Raman- and IR-active, with the former exhibiting parallel and the latter perpendicular dichroism in the IR. The E2 species is active only in the Raman. There are 28 A species modes, 29 doubly degenerate E1 species modes, and 30 doubly degenerate E2 species modes (Fanconi et al., 1969).

b. Vibrational Analysis. There have been extensive studies of the IR (Elliott, 1954; Bamford et al., 1956; Itoh et al., 1968, 1969; Itoh and Shimanouchi, 1970; Tipping et al., 1984) and Raman (Fanconi et al., 1969; Koenig and Sutton, 1969; Simons et al., 1972; Fanconi, 1973; Frushour and Koenig, 1974; Chen and Lord, 1974; Tipping et al., 1984) spectra of α-(Ala)ₙ and its N-deuterated derivative, including IR dichroic studies, which are particularly important in assigning A and E1 species modes. Infrared and Raman spectra of α-(Ala)ₙ are shown in Figs. 13 and 14, respectively. Although there have been several normal-mode calculations on α-(Ala)ₙ (Itoh and Shimanouchi, 1970; Fanconi et al., 1971; Rabolt et al., 1977), these either replaced the CH₃ group by a point mass, used a nontransferable force field, or were based on a structure other than the X-ray refined structure. The calculation of Dwivedi and Krimm (1984a) was the first that used the X-ray structure and was done with a detailed force field derived from that of β-(Ala)ₙ. The results of this calculation are given in Table XIII.
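The mode counts quoted above for the A, E1, and E2 species can be recovered from the number of atoms in the repeating unit. The sketch below assumes the usual counting for an isolated infinite helix: each species has 3p branches for p atoms per residue, of which two have zero frequency in the A species (translation along and rotation about the helix axis) and one in the E1 species (translation perpendicular to the axis). The function name and the atom count constant are illustrative.

```python
# Atoms in one alanine residue: N, H, C-alpha, H-alpha, C-beta, 3 H-beta, C, O.
ATOMS_PER_ALA_RESIDUE = 10

def helix_mode_counts(atoms_per_residue):
    """Nonzero-frequency modes per symmetry species of an isolated helix."""
    branches = 3 * atoms_per_residue
    return {
        "A": branches - 2,   # minus axial translation and axial rotation
        "E1": branches - 1,  # minus translation perpendicular to the axis
        "E2": branches,      # no zero-frequency modes
    }

counts = helix_mode_counts(ATOMS_PER_ALA_RESIDUE)
print(counts)
```

With 10 atoms per alanine residue this gives 28, 29, and 30 modes, the numbers cited from Fanconi et al. (1969).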
An “approximate” force field, based on a structure in which the CH₃ group is replaced by a point mass, has been derived from the detailed force field (Dwivedi and Krimm, 1984b); its calculated frequencies are also given in Table XIII (as the second of the two entries for each mode). A detailed analysis of the spectra of αI-(AlaND)ₙ has also been done, and is given in the original paper (Dwivedi and Krimm, 1984a). The ν_A⁰ frequency has been determined from a Fermi resonance anal-
FIG. 14. Raman spectra of α-poly(L-alanine). (-) Undeuterated; (--) N-deuterated (Frushour et al., 1976). Abscissa: frequency (cm⁻¹).
TABLE XI11 Observed and Calculated Frequencies of a-Poly(L-alanine) Calculatedb
Observed"
Rarnan
IR
A
Ei
Ez
3279 3279 2984 -
3279 3279
3279 3279
2984 2988s
m r.9
-
CHs as2 (99) 2984
-
CHJ ss (1 00)
2930 2930
CHlp ss (1 00)
2884 2883
COHO s (99) C"H" s (99) COHO s (99) COHO s (99) CoH" s (99) COHO s (99)
-
2880M
CHs a ~ 2 (98) CHJ ss (100)
2930 -
2942VS
CHs as1 (98) CHJ as2 (96)
2984 2984 -
2930M, sh
N H s (98) NH s (98) CHJ as1 (96) CH3 as1 (99)
2984 -
r.9
Potential energy distributionr
2884 2883 2884 2883
1658VS (11)
1657 1656 1655 1653
1655s
1645 1643 1540 1549 1543VW
1538 1546
1545vs (I) 1516M.sh
1519 1523 1452
CO s (82).CN s (lo), C"CN d (10) CO s (81) CO s (82),CN s ( I I ) , C"CN d (10) CO s (82) CO s (83),CN s (12).C"CN d (10) CO s (83) NH ib (46), CN s (31),CO ib (12), C"C s (1 1) NH ib (41).CN s (35),CO ib (12),C"C s (10) NH ib (46),CN s (33),CO ib (1 I), C"C s (10) NH ib (41),CN s (36),CO ib (12) NH ib (45),CN s (34),CO ib (1 1) NH ib (42),CN s (37),CO ib (12) CHJ abl (47).CHs ab2 (41).CHs rl (10)
CH, ab2 (54),CHy abl (31)
1452 -
I0
1452
CHy ab2 (59),CHJ abl (26)
-
1458s
1451
CHS ab2 (45).CH3 abl (41).CHy r2 (10)
-
1451
CHs abl (57),CH:%ab2 (31)
-
1451
CHs abl (62),CH3 ab2 (26)
-
1377W
1381s (I)
1379
1379
1379
-
-
-
1349 1354 1345 1343
1338M,sh 1326s
1328M,sh (11)
1334
CHy sb (100) H" bl (34),NH ib (17),C"C s (17),C"C@s (10) NH ib (27),H" bl (27),C"C s (17),C"C@s (12) H" bl (25),H" b2 (20).C"C s (17).NH ib (12) H" bl (26).NH ib (21),C"C s (17).C"C@s (11) H" b2 (58),C"C s (16) (Continwd)
TABLE XI11 (Continued) Observeda Rarnan
Calculated" IR
A
Ei
Ez
1317 1314 1300 1308 1299
I287 1275 Iu 01 4
1271W
1270M (I)
1261W
1265M. sh
1 167M
1170s (11, 1)
1278 1267 1262 1254 1178 1167 1160 1162 1158 1115 1127
1 105s
1108s (I)
1103 1094 1094 1075 1043 -
Potential energy distribution' H" b2 (52), CaC s (17), NH ib (lo), CN s (10) H" b2 (81) Ha b2 (83), Ha bl (10) Ha b2 (56), H" bl (23) H" b2 (73). H" bl (18) Ha bl (62) Ha bl (67), H" b2 (16), C"C@s (14) NH ib (23), NC" s (18), Ha bl (18) H" bl (42), NCa s (21), NH ib (20). Ha b2 (12) N H ib (28), H" b2 (15), NC"s (14). Ha bl (11) Ha bl (38), NH ib (25), NCu s (19), Ha b2 (14) NH ib (40), H a b2 (28) NH ib (38), Ha b2 (26), H" bl (19), NCa s (12) NCu s (32), CHs r l (20), CaCa s (16) NC" s (53), CaC s ( 1 7), C"CP s (1 1) C"C@s (32), NC's (21), CH3 rl (15). Ha bl (10) CaCY s (35), NC" s (31), Ha bl (16) CaCS s (41), NCa s (16), H" bl (12). CHs r l (12) CaCa s (41), NC" s (25). H" bl (19) CaC@s (64), CH3 1-2(15) CaCS s (62), CaC s (1 1) CuCS s (39), CHJ r2 (18) C"C@s (34), NC" s (19) CaCa s (26), CH3 r2 (21) C u e s (25). NCa s (22), CN s (13) CHs r l (47). H" bl (20)
1050W 1017w 970W
1037
CH:+r l (39), H" bl (22), CHJ r2 (15)
-
1026 962 -
CH3 1-2 (28), H" bl (25), CH3 r l (24) CHJ rl (34). NC" s (29). CHJ r2 (17). C°C s (15) 955 -
940VW 908VS
CH3 r2 (32), NC" s (20), CHJ r l (15) 95 1 -
910 904
90 1 900 882W rn
m u1
896 893
773vw
780 773 767 765
756W 693M
754 750 700 696 675 684
662W
660 663 637 640 608
CH3 r2 (41), NC" s (16) CN s (23). CNC" d (16). CO ib (12), CHS r2 ( I I ) , CO s (10) CN s (39), C°C s (14), CNC" d (13) CN s (IS), C"C s (12), CO ib (12), CuQ s (10) CN s (33), CuC s (21) CoQ s (19). C"C s (14), CN s (14), CO ib (12). NC" s (11) CN s (30), C"C s (26) CO ob (42), Ca bl (lo), CN t (10) CO ob (44). CN t ( 1 1) CO ob (52), CS bl (10) CO ob (53) CO ob (38). CN t (30) CO ob (42),CN t(35) NC°C d (36), C°CN d (26) NC°C d (37), C"CN d (30), NH ob (10) CN t (32), CO ib (IS), C"C s (14) CN t (25), CO ib (20), C°C s ( l l ) , NC°C d (10) CN t (37), NH ob (21), NCuC d (12) CN t (36), NH ob (21), NC"C d (14), CO ib (10) CN t (59), NH ob (43), NH ..-0 ib (13) CN t (66), NH ob (46), NH ... 0 ib (14) CN t (47). NH ob (23), CO ob (15). CO ib (12)
TABLE XI11 (Continued)
Observed"
IR
Rarnan
530VS
Calculatedh
526s (11, 1)
A
Ei
Ez
607
CN t (47), NH ob (24), CO ob (19). CO ib (12) CN t (68),NH ob (36), CO ob (26), NH ...0 ib ( 1 1) CN t (66).NH ob (36), CO ob (33), NH ... 0 ib ( 1 1) CO ib (29). C"CN d (21). C"C s (17). CS b2 (17), NH ob (11) CO ib (32). C"CN d (21), CY b2 (19), C"C s (I@, NC" s (10)
522 517
NC"Cd(31),C"Cs(14),CaCNd(11),COib(ll) NC"C d (32), C"C s (14), C"CN d (13), CO ib ( I t ) NC"C d (37), CO ib (17), (2°C s (16) NC"C d (40), CO ib (16), C"C s (15)
589 58 1 537 525
492 490 375s
375s (I)
374
369 -366M. sh
367 367 366 360
328W
3243 (I)
326 317
310s
310 300
294M
2 9 0 (~ 11)
260M
259W, sh 240W
Potential energy distribution<
307 305 264 249 245 -
CO ob (16), NH ob (16), CY b2 (15). C"CN d (15), CO ib (15), CNC" d (10) CY b2 (20), CO ib (17). C"CN d (15), NH ob (14), CO ob (13), CNC"d (12) CO ob (211, CS bl (17), NC"C d (14), NH ob (13), CS b2 (11) CO ob (19), C Y bl (18), NC"C d (14), CY b2 (13), NH ob (12) CS b2 (49), C"CN d (22). CS bl (16) Co b2 (51), C"CN d (26) CS b2 (42), CY bl (19). CO ib (16) CS b2 (41). CY bl (20), CO ib (15). CNC" d (12) CNC" d (30), CO ib (20), CS bl (20), CO ob (17) CNC" d (37). CY bl (25), CO ib (I@, CO ob (12) CS b2 (34), CO ib (29), CNC" d (15) CY b2 (39), CO ib (29). CNC" d (15) C-CY t (35), CS b2 (14) CY b2 (17). C"CN d (13). NH ob (11) C"CY t (91)
244 223VW
230 -
C"CY t (63)
209VW
205 20 1
189M
188M (I)
165M
163M (I)
197 200 155 155
159s
151 154 120s (11)
136 138
113M, sh (I) t 4
cn
-3
87W
84W
C"CY t (95)
96 96 94 95 87 85 49 48 40 39 38 38
C"CN d (27), CY b2 (25), CY bl (12), CO ob (10) C"CN d (31). CY b2 (26), CY bl (14), CO ob (10) C"CN d (15), CY b2 (12), CO ob (12) CaCN d (15), Ca b2 (14), NH ob (12), CO ob (11) CNC" d (33). C"CN d (19), CS bl (14), NH ob (14), NC"C d (12) CNC"O(30), C"CN d (25), CS bl (14). NH ob (13). NC-C d (11) NH ob (43), CNC" d (20), NC-C d (11) NH ob (48), CNC" d (17), NC"C d (10). C"CN d (10) CNC" d (34). C"CN d (21), NC"C d (16), CB bl (12) CNC" d (33), C"CN d (25), NC"C d (14), CY bl (10) NH ob (24), C"C t (20), NC" t (18). CN t (18),H ..-O s (10) NH ob (22). C"C t (21). NC" t (19), CN t (18). H -..O s (10) CN t (27), NC" t (23). C"C t (22). H ... 0 s (15), NH ob (10) CN t (28), NC" t (23), C"C t (21). H ... 0 s (15) NH ob (33), NH ... 0 ib (20). CY b l (12), NC"C d (lo), CNC" d (10) NH ob (30), NH ... 0 ib (22). CS bl (15). NCuC d (10) NH ob (38). NC" t (24), H ... 0 s (23), CN t (15), Ca bl (13) NH ob (40), H ... 0 s (25), NC" t (24), CS bl (18), CN t (15) NH ob (58), CS b l (20), H ... 0 s (11) NH ob (63), CY bl (28), NC"C d (1 l), H ... 0 s (11) C"C t (53), H ... 0 s (18) C"C t (54). H ... 0 s (15), NH ob (12)
S, Strong; M, medium; W, weak; V, very; sh, shoulder; ∥, parallel dichroism; ⊥, perpendicular dichroism. The first frequency in each vertical pair is from the detailed calculation, the second is from the approximate calculation. s, Stretch; as, antisymmetric stretch; ss, symmetric stretch; b, angle bend; ib, in-plane angle bend; ob, out-of-plane angle bend; ab, antisymmetric angle bend; sb, symmetric angle bend; r, rock; d, deformation; t, torsion. In the approximate calculation Cβ represents the point mass (CβH₃). Only contributions of 10 or greater are included. Unperturbed frequency.
ysis (Krimm and Dwivedi, 1982a), and its value of 3279 cm⁻¹ compared to 3242 cm⁻¹ for β-(Ala)ₙ reflects the weaker hydrogen bond in the α helix [l(N···O) = 2.86 Å] than in the β sheet [l(N···O) = 2.73 Å]. This is also manifested by the need for a larger value of the NH stretch force constant [F(NH)] in α-(Ala)ₙ (viz., 5.830) than in β-(Ala)ₙ (viz., 5.674). The Fermi resonance shift is well accounted for by an interaction of the fundamental with the overtone of the E1 species amide II mode. The unperturbed amide I modes are calculated at 1663 (A), 1662 (E1), and 1662 (E2) cm⁻¹. The TDC interactions give rise to a total splitting of 12 cm⁻¹, which is smaller than for the β sheet, where TDC results in a predicted splitting of 68 cm⁻¹. The strong 1658-cm⁻¹ IR band is observed to shift to 1650 cm⁻¹ on N-deuteration, and the calculations predict a downshift of 10 cm⁻¹, twice that for β-(Ala)ₙ. The unperturbed amide II modes are calculated at 1529 (A), 1532 (E1), and 1536 (E2) cm⁻¹; the 21-cm⁻¹ splitting predicted from TDC interactions is again much smaller than the 64-cm⁻¹ splitting calculated for the β sheet. The NH ib coordinate makes a significant contribution throughout the 1349- to 1262-cm⁻¹ region, although amide III is typically assigned to observed bands in the region 1280-1260 cm⁻¹. The significant increase in frequency of the latter bands over the comparable modes in the 1240- to 1220-cm⁻¹ region of β-(Ala)ₙ must be related to the structural differences between these systems, since F(CNH) and F(CαNH) as well as F(NH···O ib) are in fact smaller for the α helix than for the β sheet (Dwivedi and Krimm, 1984a). The much lower amide V frequencies in α-(Ala)ₙ (viz., the IR bands at 658 and 618 cm⁻¹) as compared to β-(Ala)ₙ (viz., 706 cm⁻¹) probably reflect the smaller effective value of F(NH ob) resulting from the weaker hydrogen bond in the α helix.
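The link between the NH stretch force constants and the unperturbed NH stretch frequencies quoted above can be illustrated with a rough estimate. The sketch below assumes an isolated harmonic N-H oscillator of fixed reduced mass, so that the frequency scales as the square root of the force constant; this is a simplification introduced here, not the full normal-mode calculation of the source, and the function name is hypothetical.

```python
import math

def scaled_frequency(nu_ref, f_ref, f_new):
    """Frequency of a harmonic oscillator after changing its force constant.

    Assumes the reduced mass is unchanged, so nu is proportional to sqrt(F).
    """
    return nu_ref * math.sqrt(f_new / f_ref)

# Values quoted in the text (force constants in the units of the source).
F_NH_BETA = 5.674    # F(NH) in beta-(Ala)n
F_NH_ALPHA = 5.830   # F(NH) in alpha-(Ala)n
NU_BETA = 3242.0     # unperturbed NH stretch of beta-(Ala)n, cm^-1

nu_alpha_est = scaled_frequency(NU_BETA, F_NH_BETA, F_NH_ALPHA)
print(f"estimated alpha-helix NH stretch: {nu_alpha_est:.0f} cm^-1")
```

The estimate comes out a few wavenumbers above the quoted 3279 cm⁻¹, which is reasonable for such a crude model: in the calculation itself the NH stretch is a nearly pure (98%) but not completely isolated coordinate.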
It is worth noting from Table XIII that the approximate force field gives a good reproduction of the frequencies and eigenvectors of the non-CH3 modes of the side-chain point-mass model of the α helix. This approximation should therefore be satisfactory for reproducing the amide modes of an α-helical polypeptide chain in, for example, a globular protein. The predicted normal modes of the αII helix indicate that even small conformational differences can lead to significant frequency differences (Dwivedi and Krimm, 1984a). The main differences result from the weaker hydrogen bond and the altered TDC interactions in the αII helix: the νA⁰ and amide I frequencies increase by 10 cm⁻¹, the amide I splitting increases from 2 to 7 cm⁻¹, and the amide II splitting increases from 19 to 25 cm⁻¹ (although the strong IR E1 mode is not predicted to change much in frequency). Other differences are more obviously seen in the lower frequency regions (Dwivedi and Krimm, 1984a). The observation of such effects in the amide A, I, and II regions of the IR spectra of bacteriorhodopsin led to the assignment of the αII-helix structure to this protein, and to the speculation that this conformation might be specifically associated with proton translocation through the protein (Krimm and Dwivedi, 1982b).

2. α-Poly(L-glutamic Acid)

It might be expected that amide and backbone bands in the spectra of α-(Ala)n would be representative of frequencies to be found in all α-helical polypeptides. Evidence has suggested, however, that the amide III frequencies are sensitive to side-chain composition (Hsu et al., 1976). It is, therefore, important that a detailed analysis of another α-helical polypeptide be available so that the influence of this factor can be assessed. A vibrational analysis of α-(GluH)n (Sengupta and Krimm, 1985) clearly shows that the side-chain structure does affect the main-chain vibrational modes. The normal modes of α-(GluH)n were calculated using the same backbone structure and force field as for α-(Ala)n, and a comparison of the results for the amide modes is shown in Table XIV. Some significant differences are observed, and are well predicted, between these two structures.

TABLE XIV Comparison of Amide and Skeletal Frequencies of α-Poly(L-alanine) and α-Poly(L-glutamic Acid)
                      α-Poly(L-alanine)                          α-Poly(L-glutamic acid)
Mode        Raman a   IR a,b        Calculated c      Raman a   IR a,b        Calculated c

Amide I     1655S     1658VS (∥)    1657 (A)          1652S     1653VS (∥)    1657 (A)
                                    1655 (E1)                                 1655 (E1)

Amide II    1543VW    1545VS (⊥)    1538 (E1)                   1550S (⊥)     1537 (E1)
                      1516M         1519 (A)                    1510W (∥)     1517 (A)

Amide III   1278W     1270M (⊥)     1287 (E2)         1296M     1283W (⊥)     1299 (E2)
            1271W     1265M         1278 (E1)                                 1287 (E1)
            1261W                   1262 (A)                                  1263 (A)

Skeletal    1167M

Skeletal    908VS     909M (∥)      910 (A)           924VS     928W          929 (E1)
                      893S (⊥)      901 (E1)                                  922 (A)

Amide V     662W      658S (⊥)      660 (E1)                    670MW         678 (E1)
                      618S (⊥)      608 (E1)                    618MW         626 (E1)

Skeletal    530VS                                     562W      567W          549 (A)
                                                                              515 (E1)

Skeletal    375S      375S (⊥)      374 (E1)                    409M          402 (E1)

a S, strong; M, medium; W, weak; V, very.
b ∥, parallel; ⊥, perpendicular polarization with respect to helix axis.
c A, E1, E2: symmetry species.
As might be expected, there are few differences for amide I and amide II modes. [The small downward shift in amide I and upward shift in amide II, taken together with a small downshift in νA⁰, may indicate a slightly stronger hydrogen bond in α-(GluH)n as compared to α-(Ala)n (Sengupta and Krimm, 1985).] However, in the traditional amide III region the differences are significant: In α-(Ala)n these bands are found at 1278W (Raman), 1271W (Raman), and 1270M (IR), and 1261W (Raman) and 1265M (IR) cm⁻¹; whereas in α-(GluH)n, the comparable modes are found at 1296M (Raman) and 1283W (IR) cm⁻¹. Similar prominent differences are found for skeletal frequencies: Predominantly CαCβ stretch modes are observed as strong Raman and IR bands near 1170 and 1105 cm⁻¹ in α-(Ala)n, but as medium-intensity bands at 1118 (IR) and 1080 (Raman) cm⁻¹ in α-(GluH)n, and the intense Raman, mainly CN s modes, are found at 908 and 924 cm⁻¹, respectively. The amide V modes of α-(Ala)n appear as strong IR bands at 658 and 618 cm⁻¹, but in α-(GluH)n they appear as medium-weak IR bands at 670 and 618 cm⁻¹. The supposedly characteristic strong Raman and IR band of the α helix near 530 cm⁻¹ in α-(Ala)n has a counterpart of only weak intensity near 565 cm⁻¹ in α-(GluH)n. And the strong Raman and IR mode at 375 cm⁻¹ in α-(Ala)n finds its counterpart in α-(GluH)n as a medium-intensity IR band at 409 cm⁻¹. It is therefore necessary to be careful in extrapolating from the commonality of backbone conformations to a commonality of vibrational modes. The exact nature of the similarities and differences can, however, be elucidated with the help of normal-mode analysis.

C. 3₁₀ Helix

Poly(α-aminoisobutyric acid)

a. Structure and Symmetry. The 3₁₀ helix differs from the α helix in that the hydrogen bond is of the 4 → 1 rather than the 5 → 1 type. A form of this helix was described in early considerations of possible polypeptide chain conformations by Huggins (1943).
A definitive structure was given subsequently by Donohue (1953), and characterized later by conformational energy calculations (Prasad and Sasisekharan, 1979; Paterson et al., 1981). The 3₁₀-helix conformation is found only occasionally in proteins, but has been established by X-ray structure analysis in peptides containing the α-aminoisobutyric acid (Aib) residue (Shamala et al., 1977; Smith et al., 1981; Bosch et al., 1983) and has been proposed for a polymer of Aib from electron-diffraction studies (Malcolm, 1977, 1983; Malcolm and Walkinshaw, 1986). However, since alamethicin, with 8 or 9 Aib residues out of 20, has an α-helical structure (Fox and Richards, 1982), it is of interest to be able to characterize the 3₁₀-helix structure of (Aib)n, particularly in comparison to the α-helix conformation, from which it differs little in energy (Paterson et al., 1981). Such a comparative normal-mode analysis has been done (Dwivedi et al., 1984), and the results are instructive not only for what they reveal about the possibility of defining the structure of (Aib)n but also for what they show concerning the effects of side-chain structure on the modes of the α helix.

Since the structure of the polymer has not been fully determined, normal-mode analyses were done for both the 3₁₀-helix and α-helix conformations of (Aib)n. [The so-called α′ helix (Prasad and Sasisekharan, 1979) was also analyzed (Dwivedi et al., 1984), but we will not consider it here since the vibrational analysis showed it to be an unlikely possibility for (Aib)n.] For the α-helix structure, the dihedral angles were the same as those for α-(Ala)n, viz., φ = -57.4°, ψ = -47.5°. For the 3₁₀-helix structure, dihedral angles calculated by Prasad and Sasisekharan (1979) were chosen (viz., φ = -45° and ψ = -30°) that gave helix parameters (n = 2.99 and h = 2.01 Å) closest to those of the standard structure (Donohue, 1953). This structure is shown in Fig. 15. The hydrogen bonds of this structure have the following characteristics: l(N···O) = 2.83 Å, l(H···O) = 1.83 Å, ∠NHO = 175.1°, ∠HNO = 3.2°.

b. Vibrational Analysis. Infrared and Raman spectra of (Aib)n are shown in Figs. 16 and 17, respectively. The force field for (Aib)n was transferred from that for α-(Ala)n, except that additional force constants were refined for the (CH3)2 group (Dwivedi et al., 1984). A Fermi resonance analysis of νA showed that νA⁰ is intermediate between that of α-(Ala)n and β-(Ala)n, consistent with the trend in l(N···O) distances. This provided a basis for modifying F(CO), F(NH), and F(H···O). The results of the full normal-mode calculations are given in Dwivedi et al. (1984).
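The hydrogen-bond parameters quoted for the 3₁₀ helix are linked by simple triangle geometry, which gives a quick consistency check on any such set of values. The following sketch solves the N-H···O triangle from the two distances and the angle at the hydrogen; the function name is ours, and the only inputs are the numbers given in the text.

```python
import math

def hbond_triangle(l_no, l_ho, angle_nho_deg):
    """Solve the N-H...O triangle.

    l_no, l_ho     : N...O and H...O distances (angstrom)
    angle_nho_deg  : N-H...O angle at the hydrogen (degrees)
    Returns (N-H bond length in angstrom, H-N...O angle in degrees).
    """
    gamma = math.radians(angle_nho_deg)
    # Law of cosines at H, solved for the N-H length (positive root).
    l_nh = l_ho * math.cos(gamma) + math.sqrt(l_no**2 - (l_ho * math.sin(gamma))**2)
    # Law of sines gives the (acute) angle at N.
    angle_hno = math.degrees(math.asin(l_ho * math.sin(gamma) / l_no))
    return l_nh, angle_hno

# Values quoted in the text for the 3-10 helix of (Aib)n.
l_nh, angle_hno = hbond_triangle(l_no=2.83, l_ho=1.83, angle_nho_deg=175.1)
print(f"N-H = {l_nh:.2f} angstrom, H-N...O = {angle_hno:.1f} degrees")
```

With these inputs the implied N-H bond length is about 1.00 Å and the H-N···O angle about 3.2°, in agreement with the quoted value.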
In Table XV we present the observed bands and calculated frequencies of the amide and some skeletal modes; the comparable calculated frequencies for observed bands of α-(Ala)n are included in order to assess how the change in side-chain structure in this case has affected the backbone frequencies.

FIG. 15. ORTEP drawing of 3₁₀-poly(α-aminoisobutyric acid). The CH3 group is represented by a point mass.

The purpose of the vibrational analysis of (Aib)n was to determine the structure of this polymer in the thin films used for the electron-diffraction studies (Malcolm, 1977, 1983). It is therefore of interest to note from the results that a 3₁₀-helix structure is clearly favored. This is not so evident from the amide I modes, even though a larger splitting is predicted and observed between A and E1 modes for the 3₁₀ than for the α helix. Similarly, a smaller splitting (as observed) is predicted for the amide II modes of the 3₁₀ helix. Somewhat better agreement with the 3₁₀ helix is also evident for the amide III mode at 1280 cm⁻¹, but significantly better predictions are evident for the amide V and lower frequency skeletal modes. The observed N-deuteration-sensitive bands at 694 and 680 cm⁻¹ are well predicted by the 3₁₀ helix but are very poorly matched by the α helix. And the pattern of the 367(⊥)- and 362(∥)-cm⁻¹ pair of IR bands is matched by the calculation for the 3₁₀ helix but is inverted by that for the α helix. This superior frequency agreement for the 3₁₀ helix, together with a more satisfactory explanation of several special features of the spectra (Dwivedi et al., 1984), provides strong evidence for the presence of the 3₁₀-helix structure in the thin-film samples. It also demonstrates the capabilities of vibrational analysis in distinguishing between similar conformations.

From the comparisons in Table XV between calculated frequencies of α-(Ala)n and α-(Aib)n, we can also gain further insight into the effect of (in this case unusual) side-chain variation. The amide I and II modes are hardly affected, but significant changes are seen for amide III modes. For α-(Ala)n, medium-intensity IR bands are observed corresponding to calculated 1278- (E1) and 1262-cm⁻¹ (A) modes; for α-(Aib)n, there is no
FIG. 16. Infrared spectra of poly(α-aminoisobutyric acid). (A) Undeuterated; (B) N-deuterated. (-) Electric vector perpendicular to orientation direction; (---) electric vector parallel to orientation direction (Dwivedi et al., 1984). Abscissa: frequency, 1800 to 400 cm⁻¹.

FIG. 17. Raman spectrum of poly(α-aminoisobutyric acid) (Dwivedi et al., 1984). (A) Undeuterated; (B) N-deuterated. Abscissa: frequency, cm⁻¹.
TABLE XV Observed and Calculated Amide and Skeletal Frequencies of Poly(α-aminoisobutyric Acid)

            Observed (cm⁻¹)       Calculated c (cm⁻¹)
Mode        Raman a, IR a,b       3₁₀ Helix     α Helix       α-(Ala)n

Amide I     1647S                 1665 (A)      1657 (A)      1657 (A)
                                  1661 (E)      1655 (E1)     1655 (E1)

Amide II    1531W                 1547 (E)      1535 (E1)     1538 (E1)
                                  1533 (A)      1514 (A)      1519 (A)

Amide III   1339W                 1346 (E)      1334 (E1)     1345 (E1)
            1313W                 1312 (A)      1311 (A)      1287 (E2)
            1280M                 1287 (A)      1295 (A)      1278 (E1)
                                                              1262 (A)

Skeletal    908VS                 905 (A)       902 (A)       910 (A)

Amide V                           701 (E)       659 (E1)      660 (E1)
                                  676 (A)       654 (E2)      637 (E2)
                                                628 (A)       608 (E1)

Skeletal    594M                  594 (A)       593 (A)       589 (A)
            568S                  557 (A)       561 (A)       537 (A)
            506W                  506 (E)       499 (E1)      522 (E1)

Skeletal    367M                  361 (E)       353 (E1)      374 (E1) d
                                  351 (A)       364 (A)       367 (A) d

a S, strong; M, medium; W, weak; V, very.
b ∥, parallel; ⊥, perpendicular polarization with respect to helix axis.
c A, E, E1, E2: symmetry species. Calculated frequencies corresponding to observed bands, except for 637 (E2) and 589 (A), for which there are no observed counterparts.
d Mode has different character.
comparable E1 mode, and the A mode is predicted to be ~30 cm⁻¹ higher. The characteristic α-helix skeletal mode at 910 (A) cm⁻¹ is only slightly downshifted by the change in side chain. The 660-cm⁻¹ (E1) amide V mode is relatively unchanged, but the pattern of remaining bands is quite different, particularly in the absence of any mode of α-(Aib)n near the observed 608-cm⁻¹ (E1) mode of α-(Ala)n and the large frequency discrepancy between the A modes. The supposedly characteristic α-helix bands, seen near 530-525 cm⁻¹ in α-(Ala)n and predicted in this vicinity at 537 (A) and 522 (E1) cm⁻¹, have shifted significantly in α-(Aib)n to 561 (A) and 499 (E1) cm⁻¹ (as if a splitting had increased), and a new mode with some of the same character is predicted at 593 (A) cm⁻¹. In summary, these results reinforce our previous discussion of the case of α-(GluH)n, namely, that caution is necessary in assessing the transferability of some backbone frequencies if the side chain is altered.
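The side-chain effects just described can be put in numbers by differencing the calculated α-helix frequencies of the two polymers. The short sketch below uses only values quoted in the text and in Table XV; the dictionary labels are ours and serve only as illustration.

```python
# Calculated alpha-helix frequencies (cm^-1) quoted in the text and Table XV.
ala = {"amide III (A)": 1262, "skeletal (A)": 910, "low A": 537, "low E1": 522}
aib = {"amide III (A)": 1295, "skeletal (A)": 902, "low A": 561, "low E1": 499}

# Shift of each mode on going from alpha-(Ala)n to alpha-(Aib)n.
shifts = {mode: aib[mode] - ala[mode] for mode in ala}

# A-E1 separation of the pair of bands near 530 cm^-1 in each polymer.
split_ala = ala["low A"] - ala["low E1"]
split_aib = aib["low A"] - aib["low E1"]

for mode, shift in shifts.items():
    print(f"{mode:15s} {shift:+d} cm^-1")
print(f"A-E1 separation: {split_ala} cm^-1 in (Ala)n, {split_aib} cm^-1 in (Aib)n")
```

The A-E1 separation grows from 15 to 62 cm⁻¹, which is the sense in which the text speaks of a splitting that appears to have increased.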
D. 3₁ Helix

Polyglycine II

a. Structure and Symmetry. Because of the difficulty in producing well-oriented specimens of (Gly)n II, an analysis of structure from fiber X-ray diffraction patterns has not been possible for this system. In fact, even mild mechanical treatment converts (Gly)n II to (Gly)n I (Elliott and Malcolm, 1956; Krimm, 1966). Nevertheless, early powder X-ray diffraction studies of this form of (Gly)n (Bamford et al., 1955) suggested a hexagonal packing of chains whose structure was definitely not α helical or β sheet in type. A molecular model-building study (Crick and Rich, 1955) led to the proposal that the chains are 3₁ helices completely hydrogen bonded to each other via N-H···O=C hydrogen bonds. Although the chemical sequence in all chains was taken to be the same (parallel), it was noted that the structure could accommodate oppositely directed (antiparallel) chains. A small modification of this structure was subsequently proposed (Ramachandran et al., 1966a) in order to allow for the additional formation of Cα-Hα···O=C hydrogen bonds, which were thought to be present in collagen (Ramachandran and Sasisekharan, 1965). [See Section II,D,2,b for a discussion of the crystallographic evidence for the existence of such bonds. We discuss below the spectroscopic evidence in (Gly)n II.] The relative ease of mechanical conversion of (Gly)n II to (Gly)n I, which clearly contains antiparallel chains, together with electron-diffraction evidence for folded-chain single crystals of relatively high molecular weight (Gly)n II (Padden and Keith, 1965), led to the observation (Krimm, 1966) that antiparallel chains must be present in single crystals, and by extension probably in normal preparations. This motivated a detailed analysis of the structural possibilities (Ramachandran et al., 1967), resulting in the proposal of specific chain-packing arrangements for antiparallel-chain (Gly)n II.
Because of the uncertainty in the crystal structure of (Gly)n II, and of the importance of possible spectroscopic evidence for the presence of Cα-Hα···O=C hydrogen bonds, normal-mode calculations were done on both parallel-chain and antiparallel-chain crystal structures of (Gly)n II (Dwivedi and Krimm, 1982c). These structures are shown in Figs. 18 and 19, respectively. The single-chain geometry is based on the same bond lengths and angles as in (Gly)n I (see Table VIII). The backbone is a 3₁ helix with an axial repeat of c = 9.30 Å, corresponding to dihedral angles of φ = -76.9° and ψ = 145.3°. The parallel-chain structure (Fig. 18) is based on a hexagonal cell with a = 4.80 Å. The unit cell has C3 symmetry, and the normal modes are classified into 20 A species and 20 E species, all of which are Raman- and IR-active. The antiparallel-chain structure may be variable (Ramachandran et al., 1967), but its main features are represented by an orthorhombic cell (Fig. 19) with a = 8.487 Å and b = 4.80 Å. The two oppositely directed chains are related by a twofold screw axis at a/4 and perpendicular to the ac plane, giving the unit cell C2 symmetry. The normal modes are classified into 62 A species and 61 B species, all of which are Raman- and IR-active.

FIG. 18. ORTEP drawing of parallel-chain polyglycine II structure (Dwivedi and Krimm, 1982c).

FIG. 19. ORTEP drawing of antiparallel-chain polyglycine II structure (Dwivedi and Krimm, 1982c).

It is important to note the significant differences in hydrogen-bond arrangements in the two structures. In both structures all of the N-H···O=C bonds are formed, although there are small differences in their parameters (see Table XVI). However, although in the parallel-chain structure all CH2 groups can participate in Cα-Hα···O=C hydrogen bonds, in the antiparallel-chain structure only every third CH2 group along a chain can form such a bond (only to a like-directed chain). The result is that each chain no longer has strict threefold symmetry, and therefore the degeneracy of the E species modes is broken. As we will see, this has important spectral consequences.

TABLE XVI Hydrogen-Bond Parameters in Crystalline Polyglycine II

Parameter                     P a       A a

l(H···O) (Å)                  1.73      1.75
l(N···O) (Å)                  2.68      2.69
θ(NH, NO) (degrees)           14.0      15.2
γ(NHO) (degrees)              158.0     156.2
l(Hα···O) (Å)                           2.36
l(Cα···O) (Å)                           3.33
θ(CαHα, CαO) (degrees)                  21.1
γ(CαHαO) (degrees)                      149.5

a P, parallel-chain structure; A, antiparallel-chain structure.

b. Vibrational Analysis. Following early IR studies of (Gly)n II (Elliott and Malcolm, 1956), a more detailed experimental investigation was undertaken of this polypeptide and its isotopic derivatives (Suzuki et al., 1966). Subsequent studies concentrated on variations in the CH2-stretching region (Krimm et al., 1967), effects of low temperature (Krimm and Kuroiwa, 1968), and far IR spectra (Fanconi, 1973). The Raman spectrum has also been studied (Smith et al., 1969; Small et al., 1970). Infrared and Raman spectra of (Gly)n II are shown in Figs. 20 and 21, respectively. These experimental data form the basis of comparison with the results of the normal-mode calculations. In early normal-mode calculations (Miyazawa, 1967; Small et al., 1970), the CH2 group was approximated by a point mass. A subsequent
FIGS. 20 and 21. Infrared and Raman spectra of (Gly)n II. Ordinate of the infrared spectra: transmission.
calculation (Singh and Gupta, 1971) included these H atoms but ignored data on isotopic molecules. A calculation that included all atoms and data from five isotopic species (Abe and Krimm, 1972b) was restricted to an isolated chain without explicit hydrogen bonds. Only recently (Dwivedi and Krimm, 1982c) has a calculation been done that incorporates the complete crystal structure and is based on a force field refined from the correct (APRS) structure of (Gly)n I (Moore and Krimm, 1976a; Dwivedi and Krimm, 1982a).

The development of a force field for (Gly)n II requires not only reliable force constants for the polypeptide chain but also satisfactory constants for the possible Cα-Hα···O=C hydrogen bonds. The force constants were appropriately derived from the latest refinement for (Gly)n I (Dwivedi and Krimm, 1982a). The development of force constants for a Cα-Hα···O=C interaction depends on a correct assignment of particular features in the observed spectra to the presence of such hydrogen bonds. Despite some skepticism (Small et al., 1970), the body of experimental evidence favors an assignment to such an interaction. First, two bands are observed in the NH s region of the Raman spectrum, at 3305 and 3278 cm⁻¹, which disappear on N-deuteration (Small et al., 1970) and can therefore be assigned to NH s. Calculations (Small et al., 1970; Dwivedi and Krimm, 1982c) show that essentially no splitting is expected between A and E species modes, and therefore these bands are presumably associated with NH groups in different environments. Second, although only two CH2-stretching frequencies would normally be expected, since again no frequency difference is predicted between A and E species modes (Abe and Krimm, 1972b; Dwivedi and Krimm, 1982c), C-deuteration studies (Suzuki et al., 1966) show that the four observed bands in this region are associated with CH2 vibrations. This suggests that CH2 groups also occur in different environments.
Third, different samples show pairwise intensity variations in these four CH2 s modes (Krimm et al., 1967), suggesting that there are two environments for the CH2 groups. This is also supported by the presence of two CH2 b frequencies with a separation of about 12 cm⁻¹, whereas only a 3- to 4-cm⁻¹ difference is predicted between A and E species modes of a chain having one type of CH2 group. Finally, IR spectra at liquid-N2 temperature (Krimm and Kuroiwa, 1968) show that, compared to almost no frequency shifts in (Gly)n I, amide and CH2 modes of (Gly)n II undergo large frequency shifts that indicate that N-H···O=C hydrogen bonds become stronger whereas differences between CH2 groups diminish. Thus, temperature can have relatively different effects on the structural environments of these two kinds of groups, highlighting the nonuniformity that was present at room temperature.

Such experimental observations are consistent with the presence of Cα-Hα···O=C hydrogen bonds, and led to the association of modified NH and CH force constants with such bonds (Dwivedi and Krimm, 1982c). In fact, the presence of two types of CH2 groups suggests the prevalence of the antiparallel-chain crystal structure (see above), and this was tested in detail by a calculation of the normal modes of both unit cells. The results are given in Table XVII, and compared with observed Raman and IR data on (Gly)n II.

The comparisons in Table XVII show clearly that the antiparallel-chain (Gly)n II structure is preferred over the parallel-chain structure. This is not because of the calculated multiplicities of NH s, CH2 s, and CH2 b modes, which are in effect built into the calculation by choosing different force constants for those groups involved in Cα-Hα···O=C hydrogen bonds. Rather, it is a consequence of a number of unique predictions throughout the remainder of the spectrum. Thus, aside from better predictability in the amide I and amide II regions, a previously unassignable band near 1333 cm⁻¹ can now be assigned, being totally absent for the parallel-chain structure, and medium-intensity Raman bands at 673 and 566 cm⁻¹ are easily accounted for, whereas no bands are predicted in these regions for the parallel-chain structure. Comparable patterns exist in other regions, if not as dramatic as these, and in particular provide an explanation for the larger number of observed bands than can be accounted for by the parallel-chain structure. It should be noted that this larger number of bands is not a consequence of the fact that there are two chains per unit cell for the antiparallel-chain structure; as Table XVII shows, crystal splittings are in fact very small. The larger number of modes, and in particular the prediction of modes in frequency regions where none is calculated for the parallel-chain structure, is a consequence of the loss of strict 3₁-helical symmetry, and therefore of degeneracies. This loss results from the fact that having every third residue form a Cα-Hα···O=C hydrogen bond destroys the exact equivalence of all Gly units. These results thus give strong support to the presence of Cα-Hα···O=C hydrogen bonds in (Gly)n II.

The calculations also provide a very good prediction of the observed bands, the average discrepancy being 4-5 cm⁻¹. The effects of N-deuteration are well accounted for (Dwivedi and Krimm, 1982c), for example, in showing that the mainly CH2 w mode at ~1380 cm⁻¹ has an NH ib component, thus explaining its disappearance on N-deuteration. [The downward shift and enhanced splitting of CH2 b may be better explained, if with somewhat poorer frequency agreement, if we assign observed bands in (GlyND)n II at 1448W, 1419S, and 1412W cm⁻¹ in the Raman spectrum (Small et al., 1970) to calculated modes at 1427, 1409, and 1404 cm⁻¹, respectively (Dwivedi and Krimm, 1982c).] This analysis thus shows how sensitive the
TABLE XVII Observed and Calculated Frequencies of Crystalline Polyglycine II
Calculated (cm-1) Observed"(cm-') Raman
IR
3279W'
3279s'
3257M' 2979s
2977W
2940VS
2935MW 2850MW
2805W 1654VS
1560W
-1655W
B
A
E
328 1 3254 (3253 2980 2936 I2936 2853 12853 2803 1656
3281 3254 3253 2980 2936 2936 2853 2853 2803
328 1
3281
298 1
2980
{
-
1432W
Parallel
A
1640VS 156OW
1550s
1421s
Antiparallel
{
2803 1658
2804
1649
1548
1548
1433
1433 1423
1551
1420M 1421
CHz as (99) CHz ss (99)
1651 1649 1565 1555
NH s (98) NH s (97) NH s (97) CHz as (78), CHz ss (18)
J
1654 1653
1645 1565 1552
Potential energy distributionh
1533 1435
1431
CHz ss (82), CHz as (22) CO s (71), CN s (20), CaCN d (10) CO s (74), CN s (20), C"CN d (10) CO s (72), CN s (20). CaCN d (10) CO s (73), CN s (19), CaCN d (10) CO s (73). CN s (20), C"CN d (10) CO s (74), CN s (19), CaCN d (10) NH ib (59), CN s (18) NH ib (56),CN s (20), CaC s (10) NH ib (51), CN s (21), CaC s (13) NH ib (54). CN s (21). CaC s (12) NH ib (55), CN s (18). CuC s (12) CHz b (90) CHz b (80) CHz b (82) CHz b (79)
1383MS
1377M
1383
1380
CHZw (58), C"C s (15), NH ib (13) CHz w (58), NH ib (18), C"C s (13) CHz w (68),NH ib (12), C"C s (1 1) CHz w (46), CHz b (16), NH ib (16), C"C s (14) CH2 w (50), NH ib (16), CHz b (14), C"C s (13) CHZtw (31), NH ib (14), CH2 w (13), CN s (12) CH2 tw (36), NH ib (22) CHZtw (29), CHZw (23), NH ib (13). CN s (12) CH:, tW (89) CH2 tw (86) CH, tw (47). CH, w (17), CN s (13) CHz tW (53). CH2 w (13), CN s (11) CHp w (2t3).CN s (25), NC" s (22), NH ib (18) CHp tW (67) CH, w (29), CH, tw (23), CN s (17), NCus (18), NH ib (10) NC" s (58), C"C s (14)
1374 1374
1334vw
1350
1350
1344
1345
1303
1304
1283111
1290
1290
1249M
1266 1249
1267 1249
-1332vw
1297 1283M
1274 1261s
1247
1243 1244MS 1134M
1132W
1031vs
1028M
968VW
971vw
952W 901M
1242 I1237
1242 1237
1131 1039 I037 974
I131 1038 1038
I
965 958 902
1 I31
1038
975 970 965 96 1
969
90 1
I
NC" s (70), C"C s (1 1) CHp r (63), CaC s (10) CH2 r (69) CH2 r (59), C"C s (12) CH2 r (61). C"C s (10) CH, r (66) C"C s (19), CH, r (17), CN s (17), co s (12) (Continued )
TABLE XVII (Continued)
Calculated (cm-') Observedo (cm-') Raman
IR
Antiparallel A
897VW
B
Parallel A
E
898 895
884VS
889 885
864W
862VW
872
87 1 762
752VW
751W
742VW
760
759
740 740M
738 729 716
707W
714 713 6983
Potential energy distribution* C"C s (19), CHt r (la), CN s (18), CO s (12) CHp r (26), C"C s (16), CN s (16), CO s (10) CHp r (21), CN s (la), C°C s (17), CO s (1 1) CHp r (23). C°C s (21). NC°C d ( I Z ) , CN s (12) CH2 r (33). C"C s (19), NC°C d (12). CN s (11) NH ob (17), NC"C d (15), CO ib (14), CNCsd (lo), CN s (10) CN t (20), NH ob (19), NH ... 0 ib (14), NC"C d (13), CO ib (12) CN t (53), NH ... 0 ib (29, NH ob (18) CN t (49), NH ... 0 ib (20). NH ob (16), CO ib (11) CN t (68), NH ob (31), NH ... 0 ib (30) CN t (69), NH ... 0 ib (23). NH ob (17). CO ib ( 1 1 ) CO ib (28), NC"C d (18) CO ib (25), NC"C d (17). CN t ( 1 I ) , CO ob (10) CO ib (26), NC"C d (18) CO ib (27). NCaC d (18)
[
673M
678 674 664
66 1 589 588
578W 573s 566111
I
584
58 1 569
566 563 562 50 1
496W 363s
498 496 355
498 494
353 353vw
352
340M
348
349 328
313vw
325
272W
270
324 276
267M
270
CN t (77), NH ob (27), NH t ( 1 I ) , CO ob (10) CN t (83), NH ob (28), NH t ( 1 I ) , NH ... 0 ib (10) CN t (88).NH ob (30), NH t (14) CO ob (64) CO ob (60). CO ib (14), C"C s (10) CO ob (59), CO ib ( 1 I ) , NH ob (10) CO ob (59), CO ib (1 1) CO ob (61), N H ob (13) CO ob (59), N H ob (13) CO ob (60). NH ob (13) CO ob (59). N H ob (15). CN t (10) C"CN d (38), CO ib (12), NC" s (12) C"CN d (39), CO ib (13). NC" s (12) C°CN d (37), CO ib (13), NC" s ( 1 1) NC"C d (26), C"CN d ( 1 7), CO ib ( 1 7), CO ob (12) NC"C d (27), C"CN d (IS), CO ib (18), COob(11) NC"C d (27). C"CN d (18). CO ib ( I @ , CO ob (13) NC"C d (27), C"CN d (18), CO ib (17), CO ob (13) CO ib (35), CNC" d (24), NCOC d (17), C"CN d (17) CO ib (34), CNC" d (25). NC°C d (17), C-CN d (16) NH ob (21), CO ob (21) NH ob (22),CO ob (22) N H ob (23). CO ob (21), CNC" d (lo), NC" s (10)
TABLE XVII (Continued)
Calculated (cm-') I
Observed" (cm-I) Raman
Anti parallel
IR
A
B
Parallel A
E
237 237 227
I
217W 203W
221
228
211
193 133
194 131 130
113M
115s
I
120 116 116
97 94
Potential energy distribution* CNC" d (60), CO ib (13). H ... 0 s (12) CNC" d (35), H ... 0 s (24), C"CN d (20) CNC" d (66), CO ib (12) CNC" d (72), CO ib (12) CNC" d (55), C'CN d (21) C"CN d (45). CNC" d (22), NC" s (12) H ... 0 s (34), CN t (23) CN t (31), H" ... 0 s (30), NH ob (14). NH ... 0 ib ( 1 1) CN t(29), CNC" d (17). NH-.Oib(15), H U - - O s ( 1 5 ) CN t (30), NH 0 ib (16), H ... 0 s (10) CN t (38). NH ob (19), H ... 0 s (13). NH ... 0 ib ( 1 1) CN t (38), NH -.-0 ib (12), NH ob (lo), H a - 0 s (10) H ... 0 s (28), NH ob (17), CN t (17), CO t (14), NH t ( ] I ) , CaC t (11) NH ob (35), CO t (18), NC" t (14), NC% d (13), CN t (10) .I-
83W
85 78
78 74 73 72 67 63 -50 VW
50 43 29
NH o b (24), NC°C d (16). HO-0 s (12), NCo t (11) NH o b (43), NH -..0 ib (20), H - 0 s ( l l ) , NH t (ll), CO ... H ib (10) NH ob (35), NH t (23), COHO ... 0 ib (19) NH ob (41), N C C d (16), C°C t (14). NH ..-0 ib (12), H ... 0 s (1 1) NH ob (49). NC°C d (20), C°C t (15) NH ob (45), C°C t (18), NC=Cd (17), NH ... 0 ib (10) NH ob (38), NH --.0 ib (31), CN t (13), NH t (10) C°C t (24), NH ob (18) NH ob (31), H ... 0 s (12), NC°C d (1 l), COHO ... 0 ib (lo), CO t (10) NH ob (34), H ... 0 s (20), NH ... 0 ib (16), CO ... H ib (11) NH t (41). CO t (16), NH ob (12), CO- H ib (11)
a S, strong; M, medium; W, weak; V, very.
b s, stretch; as, antisymmetric stretch; ss, symmetric stretch; b, angle bend; ib, in-plane angle bend; ob, out-of-plane angle bend; w, wag; r, rock; t, torsion; d, deformation; tw, twist. Only contributions of 10 or greater are included. In some cases where contributions to the PEDs of related A and B modes differ by less than 3 the average value has been given for both frequencies.
c Unperturbed frequency.
vibrational spectrum can be to chain conformation, and in particular how details of spectral changes can be understood from normal-mode calculations.
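The mode counts quoted in this section for the two (Gly)n II unit cells (20 A + 20 E and 62 A + 61 B) follow from a simple bookkeeping argument. The sketch below is our illustration, not part of the original analysis, and it assumes that every atom lies in a general position (none on the symmetry axis), that each glycine residue contributes seven atoms, and that only the three pure translations are removed from the zone-center modes of the crystal.

```python
def zone_center_mode_counts(atoms_per_asymmetric_unit, axis_order):
    """Count optically active zone-center modes by symmetry species.

    Assumes a cyclic unit-cell group (C3 or C2) with all atoms in general
    positions, so each species occurs 3 * (atoms per asymmetric unit)
    times before the three translations are subtracted.
    """
    per_species = 3 * atoms_per_asymmetric_unit
    if axis_order == 3:
        # Tz belongs to A; (Tx, Ty) form one doubly degenerate E pair.
        return {"A": per_species - 1, "E": per_species - 1}
    if axis_order == 2:
        # Tz (along the axis) belongs to A; Tx and Ty both belong to B.
        return {"A": per_species - 1, "B": per_species - 2}
    raise ValueError("only C2 and C3 are handled in this sketch")

ATOMS_PER_GLY = 7  # N, H, C-alpha, two H-alpha, C, O

# Parallel cell: one chain of three residues, asymmetric unit = one residue.
parallel = zone_center_mode_counts(ATOMS_PER_GLY, axis_order=3)
# Antiparallel cell: two chains of three residues, asymmetric unit = one chain.
antiparallel = zone_center_mode_counts(3 * ATOMS_PER_GLY, axis_order=2)
print(parallel, antiparallel)
```

Counting each E species twice, both results add up to 3N - 3 for the N atoms in the respective cell, as they must.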
E. L,D β Helices

The conformations considered above have referred to polypeptide chains all of whose residues are achiral, such as (Gly)n or (Aib)n, or have the same chirality, L or D, throughout. There is an important class of polypeptides, of which the transmembrane ion-channel gramicidin A (GA) is an example, in which the chirality of adjacent residues alternates along the chain. Although α-helix structures are possible (Hesselink and Scheraga, 1972), L,D sequences favor new kinds of conformations. The normal-mode analyses of these structures (Naik and Krimm, 1986a) permit a detailed characterization of their vibrational spectra.

Stereochemical and experimental studies of GA led to proposals that this L,D-pentadecapeptide is capable of adopting single-stranded (Urry, 1971) and double-stranded (Veatch et al., 1974) helical conformations known as β helices. Conformational energy analyses of these structures (Ramachandran and Chandrasekharan, 1972; Prasad and Chandrasekharan, 1977; Colonna-Cesari et al., 1977) have shown that they are stable, and have defined their parameters. Both kinds of helices have been found in poly(γ-benzyl-D,L-glutamate) by X-ray and electron-diffraction studies (Heitz et al., 1975; Lotz et al., 1976), and a single crystal analysis of t-Boc-(L-Val-D-Val)4-OCH3 (t-Boc = N-tert-butoxycarbonyl) (Benedetti et al., 1979) has demonstrated the existence of an antiparallel-chain double-stranded β-helix structure for this compound. Normal-mode analyses of these structures (Naik and Krimm, 1986a) show that their vibrational spectra have special characteristics.
1. Single-Stranded β Helices

a. Structures. Single-stranded β helices are designated by the symbol β^n, where n is the total number of L + D residues per turn. The ones that have been studied most, because of their relevance to the GA structure, have n = 4.4 and 6.3. Parameters of these helices are given in Table XVIII, and the structures are shown in Figs. 22 and 23. The normal modes of a goniomeric β^6.3 helix (symbolized β^6.3_g) (Colonna-Cesari et al., 1977) have also been computed in order to study the effects of small conformational changes. [The β^6.3_g helix is derivable from the β^6.3 helix by interchanging CO and NH groups and reversing the order of the dihedral angles.] The hydrogen bonds in these structures are of two types: type A, corresponding to NH(D3) → OC(D1) for β^4.4 and NH(D4) → OC(D1) for β^6.3; and type B, corresponding to NH(L1) → OC(L3) for
TABLE XVIII
Parameters of β Helices

                                                                       Hydrogen bonds
Helix          n(a)   h(b)   φL(c)    ψL(c)    φD(c)    ψD(c)    Type(f)  l(N···O)(d)  l(H···O)(d)  θ(e)
β^4.4          2.20   2.33   -84.0    100.8    136.1    -95.0    A        2.76         1.81         15.2
                                                                 B        2.85         1.93         17.7
β^6.3          3.13   1.53   -104.0   118.0    144.0    -132.0   A        2.88         2.06         28.7
                                                                 B        2.78         1.87         19.8
β^6.3_g (g)    3.14   1.59   -128.7   134.7    132.6    -109.4   A        3.00         2.07         16.8
                                                                 B        2.95         2.00         14.0
↑↓β^5.6        2.80   4.02   -116.0   141.1    159.2    -102.9   -        2.76         1.83         16.8
↑↓β^7.2        3.60   2.95   -127.2   145.8    154.0    -126.4   -        2.90         1.92         10.5
↑↑β^5.6        2.80   4.02   133.4    -93.0    -121.6   166.1    A        3.09         2.20         22.3
                                                                 B        2.56         1.88         38.5
↑↑β^7.2        3.60   2.97   135.1    -106.0   -144.4   166.4    A        3.31         2.50         30.3
                                                                 B        3.08         2.34         35.9

(a) Number of L,D units per turn of the helix.
(b) Projection (Å) per L,D unit on the helix axis.
(c) Dihedral angle (degrees).
(d) Hydrogen bond distance (Å).
(e) Angle between NH and N···O (degrees).
(f) A, D' → D hydrogen bond; B, L → L' hydrogen bond.
(g) Goniomeric structure (Colonna-Cesari et al., 1977).
SAMUEL KRIMM AND JAGDEESH BANDEKAR
FIG. 22. ORTEP drawing of β^4.4 helix, with side chain represented by a point mass (Naik and Krimm, 1986a).
β^4.4 and NH(L1) → OC(L4) for β^6.3. For purposes of the normal-mode calculations, the side chains are represented by point masses equivalent to a CH3 group.
b. Vibrational Analysis. The force field used in the normal-mode calculations was one refined for an APPS with a side-chain CH3 group approximated by a point mass (Dwivedi and Krimm, 1984b). Hydrogen-bond force constants were obtained by interpolation or extrapolation based on values refined for known structures. Transition dipole coupling was included for amide I and II modes, using the eigenvector components from the normal-mode analysis. This is important, since the L,D dipeptide is the asymmetric unit, and approaches that imply an equivalence of L and D peptide groups (Sychev et al., 1980) could give erroneous results.
VIBRATIONAL SPECTROSCOPY OF PEPTIDES AND PROTEINS
FIG. 23. ORTEP drawing of β^6.3 helix, with side chain represented by a point mass (Naik and Krimm, 1986a).
Since multiple modes occur for a helix, it can be difficult to determine which frequencies should be seen in the spectra. This problem can be resolved for IR-active modes for which the transition moment direction is known, such as amide I: It is only necessary to sum the transition moments in a translational repeat of the helix, taking account of the appropriate phase relationships between motions in adjacent L,D units. The intensity is proportional to the square of this summed transition moment. Table XIX presents the calculated amide I frequencies and the relative intensities for the A and E1 species modes (expected to be the observed ones) of the β^4.4 and β^6.3 helices. Details for the other amide modes are given in the original paper (Naik and Krimm, 1986a). The nature of the modes differs somewhat between β^4.4 and β^6.3, being more consistently mixed in the former. The intensity distributions indicate
TABLE XIX
Calculated A and E1 Species of Amide I Frequencies of β-Helix Structures

                               Relative intensity(a)
Structure    Species  ν      I∥     I⊥     Δν(b)           Groups(c)
β^4.4        A        1648   0.29   0.11   17              L (68), D (7)
                      1631   1.00   0.10                   D (68), L (7)
             E1       1653   0.01   0.02                   L (67), D (8)
                      1644   0.00   0.05                   D (68), L (8)
β^6.3        A        1652   0.13   0.17   9               D (63), L (10)
                      1643   1.00   0.02                   L (63), D (10)
             E1       1654   0.00   0.04                   D (69)
                      1645   0.00   0.00                   L (69)
β^6.3_g      A        1648   1.00   0.12   1               D (68), L (8)
                      1647   0.32   0.04                   L (68), D (8)
             E1       1650   0.00   0.05                   D (74)
                      1649   0.00   0.00                   L (73)
↑↓β^5.6      A        1672   0.00   0.04   39              D (53), D' (22)
                      1669   0.24   0.01                   L' (67), D' (7)
                      1666   0.36   0.00                   L (72)
                      1636   1.00   0.75                   D' (43), D (20), L' (9), L (5)
             E1       1675   0.00   0.22                   D' (38), D (33)
                      1669   0.00   0.04                   L' (71)
                      1666   0.00   0.04                   L (74)
                      1656   0.00   0.00                   D (43), D' (31)
↑↓β^7.2      A        1686   0.02   0.01   43-54           L (28), D (20), D' (16), L' (13)
                      1674   0.03   0.06                   L' (38), D (31)
                      1667   0.00   0.07                   D (42), L (28)
                      1632   1.00   0.17                   D' (27), L' (22), L (19), D (10)
             E1       1677   0.00   0.06                   D' (25), L' (22), L (18), D (12)
                      1674   0.00   0.04                   L' (38), D' (28)
                      1662   0.00   0.02                   D (43), L (28), D' (5)
                      1651   0.00   0.05                   L (33), D' (17), D (16), L' (11)
↑↑β^5.6      A        1705   0.00   0.17   12 (+15, +22)   D' (43), D (33)
                      1669   0.40   0.08                   L (72)
                      1668   0.37   0.73                   D (43), D' (33)
                      1656   1.00   0.13                   L' (72)
             E1       1697   0.00   0.02                   L (57), L' (18)
                      1685   0.00   0.06                   D' (51), D (25)
                      1683   0.00   0.23                   D (51), D' (25)
                      1664   0.01   0.05                   L' (54), L (18)
↑↑β^7.2      A        1724   0.00   0.02   +6, -6          D (28), D' (25), L (13), L' (11)
                      1686   0.02   0.01                   L (35), L' (19), D (15), D' (9)
                      1676   1.00   0.13                   L' (32), L (18), D' (18), D (11)
                      1670   0.00   0.30                   D' (25), D (23), L' (16), L (12)
             E1       1712   0.00   0.00                   L (39), L' (18), D (14), D' (7)
                      1698   0.00   0.01                   D (35), D' (21), L (14), L' (8)
                      1682   0.00   0.10                   D' (31), D (20), L' (17), L (9)
                      1669   0.00   0.04                   L' (35), D' (18), L (17), D (9)

(a) Normalized to highest A species intensity.
(b) Splitting between highest and next-highest intensity unpolarized bands.
(c) Number in parentheses is contribution to the potential energy distribution from CO stretch. L and D refer to the same chain, L' and D' to the second chain in double-stranded helices.
that, even in an unpolarized IR spectrum, a splitting should be observed that depends on the structure. Thus, for the β^4.4 helix the most intense band is expected at 1631 cm⁻¹, and the splitting between this and a weaker (~36% of the intensity) band at 1648 cm⁻¹ is 17 cm⁻¹; for the β^6.3 helix, the strong band is predicted at 1643 cm⁻¹ and the splitting is 9 cm⁻¹; and for the β^6.3_g helix, these values are 1648 and 1 cm⁻¹, respectively. It is more difficult to predict the position of the strong Raman mode, but experimental studies on crystalline GA (Naik and Krimm, 1984b,c, 1986b) suggest that it occurs near the midpoint of the range of weak IR modes; in this case all strong Raman bands would be predicted near 1650 cm⁻¹. Thus, the IR-band characteristics provide the stronger base for a distinction between the structures. We will discuss the application of these results to the analysis of the structures of GA after considering the double-stranded β helices.
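The summation of transition moments over a translational repeat, described above for IR-active modes, can be illustrated by a minimal sketch. It is not the procedure of the original paper: the function names, the idealized helix with an integral number of units per turn, and the unit transition moment are all hypothetical choices made for illustration. The helix axis is taken along z, each successive L,D unit is rotated about that axis, and the vibrational amplitude of unit k is weighted by the cosine of the accumulated phase.

```python
import math

def summed_transition_moment(unit_moment, n_units, rotation_deg, phase_deg):
    """Sum unit transition moments over one translational repeat of a helix.

    unit_moment  -- (mx, my, mz) transition moment of one unit, helix axis along z
    n_units      -- number of units in the translational repeat (assumed integral)
    rotation_deg -- rotation about the helix axis between adjacent units
    phase_deg    -- phase difference of the vibration between adjacent units
    """
    mx, my, mz = unit_moment
    total = [0.0, 0.0, 0.0]
    for k in range(n_units):
        rot = math.radians(k * rotation_deg)
        # Relative vibrational amplitude of unit k for this phase relationship
        amp = math.cos(math.radians(k * phase_deg))
        # Rotate the unit moment about z, then weight it by the amplitude
        total[0] += amp * (mx * math.cos(rot) - my * math.sin(rot))
        total[1] += amp * (mx * math.sin(rot) + my * math.cos(rot))
        total[2] += amp * mz
    return tuple(total)

def relative_intensity(moment):
    """Intensity is taken as proportional to the squared summed moment."""
    return sum(c * c for c in moment)

# Hypothetical helix: 5 units per turn (72 degrees per unit), arbitrary unit moment.
in_phase = summed_transition_moment((1.0, 0.0, 1.0), 5, 72.0, 0.0)
helix_phase = summed_transition_moment((1.0, 0.0, 1.0), 5, 72.0, 72.0)
print(in_phase, relative_intensity(in_phase))
print(helix_phase, relative_intensity(helix_phase))
```

In this idealized case the in-phase (A-type) sum retains only the component parallel to the helix axis, whereas a phase difference equal to the rotation per unit (E1-type) retains only a perpendicular component, which is the origin of the parallel and perpendicular polarizations listed in Table XIX.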
2. Double-Stranded β Helices
a. Structures. The double-stranded helices can have antiparallel (↑↓) or parallel (↑↑) chains, and structures that are most relevant to GA have n = 5.6 and 7.2. The parameters of such helices are given in Table XVIII, and the ↑↓β^5.6 and ↑↓β^7.2 structures are shown in Figs. 24 and 25, respectively (the parallel-chain helices are similar in appearance). For the antiparallel-chain helices, all hydrogen bonds are essentially equivalent because of a local dyad perpendicular to the helix axis; for the
FIG. 24. ORTEP drawing of a ↑↓β^5.6 helix, with side chain represented by a point mass (Naik and Krimm, 1986a).
parallel-chain structures, the hydrogen bonds are of the A and B types. Side chains are again represented by point masses.
b. Vibrational Analyses. The force fields and TDC parameters for the double-stranded helices were the same as for the single-stranded helices. Detailed descriptions of all amide modes are given in the original paper (Naik and Krimm, 1986a); the results for the amide I modes are presented in Table XIX. For the ↑↓β^5.6 and ↑↓β^7.2 helices: (1) Mixing of CO s is quite different, and certainly hardly approaches an equivalence of all four groups in the asymmetric unit; (2) the strong IR bands and splittings are slightly different: 1636 and 32-39 cm⁻¹ versus 1632 and 43-54 cm⁻¹, respectively (the range in the latter case depending on whether the highest frequency band is observable); (3) the strong Raman modes
FIG. 25. ORTEP drawing of a ↑↓β^7.2 helix, with side chain represented by a point mass (Naik and Krimm, 1986a).
are expected near 1666 and 1669 cm⁻¹, respectively. For the ↑↑β^5.6 and ↑↑β^7.2 helices: (1) The mixing of CO modes differs significantly, as well as being different from the mixing for the antiparallel-chain helices; (2) the strong IR bands and splittings are 1656 and 12 (+15, +22 for weaker bands) cm⁻¹ versus 1676 and +6, -6 cm⁻¹, respectively; (3) the strong Raman bands are expected near 1694 and 1695 cm⁻¹, respectively. It is clear that these predictions permit not only a distinction between the antiparallel- and parallel-chain double-stranded structures, but also between these and the single-stranded helices.
3. Gramicidin A
As noted earlier, various studies have indicated that GA can adopt β-helical structures, the exact form depending upon the state and physical environment of the molecule. Normal-mode analyses of the single- and
TABLE XX
Observed and Calculated Amide I Frequencies for Single- and Double-Stranded Gramicidin A Structures

             Calculated(a) (cm⁻¹)             Observed (cm⁻¹)
Structure    ν(IR)(b)  Δν(c)    ν(R)(d)      ν(IR)      Δν       ν(R)      State
β^4.4        1631      17       1663
β^6.3        1643      9        1650         ~1640(e)   ?        1654(f)   GA-Lipid
↑↓β^5.6      1636      32-39    1666         1638(g)    42(g)    1666(g)   GA (crystal)
↑↓β^7.2      1632      43-54    1669         1630(g)    55(g)    1668(g)   GA-CsSCN (crystal)

(a) Calculated values from Naik and Krimm (1986a).
(b) Strong parallel-polarized IR mode.
(c) Difference between strong parallel and strongest (though weaker) perpendicular IR modes.
(d) Midpoint of IR-weak frequencies, expected to be strong in Raman.
(e) Estimated from observations on vesicles (Naik and Krimm, 1986b).
(f) Naik and Krimm (1986b).
(g) Naik and Krimm (1984b).
double-stranded β helices have made it possible to use vibrational spectra to determine the specific conformations present under the various circumstances (Naik and Krimm, 1984b,c, 1986b). We illustrate these results by considering just the amide I region of the spectrum. Infrared and Raman frequencies of GA in various states are given in Table XX; the assignments to the several structures are based on the correspondences between the observed and calculated results. Thus, native and CsSCN-complexed GA exhibit IR splittings of over 40 cm⁻¹. These are consistent with the predictions for double-stranded β helices, but inconsistent with the maximum splitting of 17 cm⁻¹ calculated for a single-stranded helix. Furthermore, the observed spectral changes between native and CsSCN-GA (viz., the shift in the strong IR band from 1638 to 1630 cm⁻¹ and the increase in splitting from 42 to 55 cm⁻¹) are in good agreement with the calculated characteristics of a ↑↓β^5.6 helix converting to a ↑↓β^7.2 helix (viz., predicted frequency changes from 1636 to 1632 cm⁻¹ and an increase in splitting from 32-39 to 43-54 cm⁻¹). The values and the change in the Raman frequency are also consistent with this assignment. On the other hand, the spectra of GA in lipid vesicles are distinctly different: Although the IR band has increased only slightly over the frequency for the native crystalline GA (to ~1640 cm⁻¹), there is no observed splitting and the Raman band has shifted down from 1666 to 1654 cm⁻¹. These observations are consistent with a β^6.3 helix, for which
the predicted values are an IR frequency of 1643 cm⁻¹, a splitting of only 9 cm⁻¹, and a Raman frequency that has decreased to 1650 cm⁻¹. Such a level of predictability shows that vibrational analysis can be a major technique in determining polypeptide chain conformation.
V. REVERSE TURNS
A. Introduction
Reverse turns are structural features of polypeptides and proteins that involve four (for the β turn) or three (for the γ turn) successive amino acid residues, and are characterized by an intramolecular hydrogen bond that produces a reversal of the direction of the polypeptide chain by ~180°. In the terminology of Bragg et al. (1950), they may also be called C10 or C7 structures, respectively. Three types of β turns were predicted by Venkatachalam (1968b) on the basis of stereochemical criteria. Data from crystal-structure studies of more than 100 globular proteins (Bernstein et al., 1977) have since revealed that a substantial portion of the amino acid residues in globular proteins occur in β turns. For example, a survey of 38 nonhomologous proteins of known crystal structure (Bandekar and Krimm, 1979a) has shown that 29% of the residues occur in β turns, 39% are in helices, and 21% are in β sheets.
In recent years, significant efforts have been directed toward a better understanding of β turns, since they occur frequently and have the property of making an otherwise extended polypeptide chain compact (see, for example, reviews by Lewis et al., 1973; Chou and Fasman, 1977, 1979a,b; Smith and Pease, 1980; Geisow, 1978). This compactness is implicated in the specific functions that a given protein carries out. A substantial portion of surface residues, or those exposed to the solvent where most of the enzymatic reactions take place, are in reverse turns (Kuntz, 1972; Rossman and Argos, 1975; Richardson et al., 1976). Beely (1977) and Aubert et al. (1976) found that sugar chains seem to be attached to glycoproteins preferentially at β turns.
Carbohydrate moieties are present at β turns in the known structures of the human immunoglobulin heavy chain and RNase B (Geisow, 1978). Small et al. (1977) reported that about 80% of amino acids phosphorylated by protein kinases are predicted to be in β turns.
In contrast to the β turn, although the C7 structure has been known in the literature since 1943 (Huggins, 1943), the γ turn was proposed for the first time only in 1972 (Némethy and Printz, 1972). Compared to the frequent occurrence of β turns in proteins and peptides, only 10 γ turns have been identified in all of the globular proteins known to date (Baker
and Hubbard, 1984). Recent NMR and X-ray studies have shown that some of the biologically active cyclic peptides contain γ turns. The cyclic tetrapeptide dihydrochlamydocin is known from X-ray studies to contain a γ turn (Flippen and Karle, 1976). A crystal-structure study of the iodo derivative of cyclosporin A has shown it to contain a γ turn (Petcher et al., 1976). Host-specific toxin from Helminthosporium carbonum is claimed from NMR studies to have a γ turn (Kawai et al., 1983). NMR work has shown that the repeat unit of the elastin polypentapeptide helix contains a γ turn (Urry et al., 1975). And finally, the C7^eq and C7^ax structures that have been discussed in the literature (Bystrov et al., 1969; Lipkind et al., 1971; Neel, 1972; Pullman and Pullman, 1974) are nothing but the mirror-related and normal γ turn, respectively.
The above considerations show that the reverse turns form an important structural and functional part of globular proteins. It is therefore important that spectroscopic methods be developed to identify and characterize such structures in peptides and proteins. While other spectroscopic methods have been developed to study the reverse turns (Smith and Pease, 1980), vibrational analysis can provide definitive characterizations for such structures (Krimm and Bandekar, 1980; Lagant et al., 1984b).
B. β Turns
1. Standard Turns
a. Structure. The β turn is characterized by the presence of a 4 → 1 hydrogen bond (see Fig. 26). Using stereochemical criteria, Venkatachalam (1968b) described three main types of β turns: Whereas the type III turn corresponds to one turn of a 3₁₀ helix, types I and II are nonhelical conformations. The standard form of the type II turn requires a Gly residue at the C^α_3 position, there being no such restrictions on the other turns. Structures that have mirror-image arrangements of the backbone atoms were also found to be possible, and are designated types I', II', and III'. The Venkatachalam (1968b) study was restricted to L-amino acid residues; Chandrasekharan et al. (1973) extended such stereochemical studies to include D-amino acid residues. A subsequent analysis of eight protein structures (Lewis et al., 1973) revealed the existence of five additional types of β turns, designated IV, V, V', VI, and VII. A classification of the 11 turn types, together with their frequency of occurrence in 26 proteins, has been reported (Chou and Fasman, 1977).
Vibrational analyses have been done only for β turns of types I, II, III, I', II', and III' (Krimm and Bandekar, 1980). The model system chosen to represent a type I and a type III β turn was CH3CO-(Ala)4-NHCH3,
FIG. 26. Schematic illustration of a β turn of CH3-CO-(Ala)4-NH-CH3 with external hydrogen bonds included (Krimm and Bandekar, 1980).
shown schematically in Fig. 26. For a type II β turn, the model was CH3CO-(Ala)2-Gly-Ala-NHCH3. [Our results (Krimm and Bandekar, 1980) suggest that the latter structure, with Gly in the third position, is probably not a good general model for types I and III β turns, as had been assumed previously (Lagant et al., 1984b). This is because the CH2 group contributes in a special way to the normal modes, particularly in the amide III region.] Standard bond lengths and bond angles were used in generating these structures (see Table III). The dihedral angles at residues 1 and 4 were, in the standard β-turn types I-III, kept at values consistent with the continuation of the chains in an APPS structure, viz., (φ, ψ)1 = (φ, ψ)4 = -139°, 135°. For β-turn types I'-III', these
values of φ and ψ are sterically disallowed, and therefore (φ, ψ)1 = (φ, ψ)4 = -60°, 120° were used. For canonical β turns of types I-III (and their mirror-image conformations), standard turn dihedral angles (Venkatachalam, 1968b) were used: Type I: (φ, ψ)2 = -60°, -30°, (φ, ψ)3 = -90°, 0°; type II: (φ, ψ)2 = -60°, 120°, (φ, ψ)3 = 80°, 0°; type III: (φ, ψ)2 = -60°, -30°, (φ, ψ)3 = -60°, -30°. The negatives of these values were used for turns I'-III'.
Since a variation from these standard angles is found for β turns in proteins, calculations were also done on structures having different (φ, ψ)2 and (φ, ψ)3. Analysis of the structural data on 38 nonhomologous proteins (Bernstein et al., 1977) indicated that the majority of β turns of types I and II had average variations in φ2 of up to -30°, in ψ2 and φ3 of up to ±30°, and in ψ3 of up to ±35°. For type III β turns the comparable values were: φ2, up to -10°; ψ2, up to -25°; φ3, up to +15° and -25°; and ψ3, up to +15° and -20°. The normal vibration frequencies of such β turns were computed with each angle varied from the standard value by the above amounts.
Since TDC also involves peptide groups 1 and 5 (see Fig. 26), the effects of variations in (φ, ψ)1 and (φ, ψ)4 on the characteristic β-turn frequencies were also studied. Examination of the protein data led to the selection of the following additional values of these angles: φ1 = -60°, 120°; ψ1 = -60°, -120°; φ4 = -60°, 120°; and ψ4 = -60°, -120° for types I and II β turns; and φ1 = -60°; ψ1 = 40°, -40°; φ4 = -80°, -45°; and ψ4 = 60°, -40° for type III β turns.
Type II β turns are occasionally found with a residue other than Gly in the third position. A calculation was therefore also done for a type II turn with Ala in the third position. The dihedral angles were (φ, ψ)2 = -72°, 119° and (φ, ψ)3 = 77°, 16°, based on average values from protein structures, and (φ, ψ)1 = (φ, ψ)4 = -139°, 135°.
b. Vibrational Analysis.
The force field for these calculations was taken from an early refinement on polypeptides (Moore and Krimm, 1976a,b; Rabolt et al., 1977). In an initial calculation (Bandekar and Krimm, 1979a), external CO and NH groups were not hydrogen bonded, but in a subsequent calculation (Krimm and Bandekar, 1980) these groups had external H and O atoms, respectively, bonded to them (see Fig. 26) with constant values of F(H···O). On the other hand, values of F(H···O) for the internal hydrogen bonds were altered as l(H···O) changed with variations in dihedral angles, using a simple linear dependence (Krimm and Bandekar, 1980). Transition dipole coupling was incorporated for amide I and amide II modes, with all interactions included. Since no independent information was available on the magnitude of Δμeff for β turns, calculations were made for a range of values. The complete normal-mode frequencies for standard β turns of types
I-III have been presented elsewhere (see Tables I-III of Krimm and Bandekar, 1980). In Table XXI we present a summary of computed amide I-III frequencies for different values of Δμeff for β turns of types I-III; Table XXII gives similar data for β turns of types I'-III'. In Table XXIII we present the results for the type II turns with Ala in the third position. The effects of variations in dihedral angles on the amide frequencies are complex; rather than considering them in detail here, we refer the reader to the original paper for specifics (Krimm and Bandekar, 1980).
In discussing the characteristic amide modes of β turns, we will draw attention to the general features that emerged from these calculations; the specific predicted frequencies are very dependent on the particular dihedral angles of the turn. For purposes of the present discussion we also limit ourselves to turns of types I-III, and to Δμeff of 0.45 D for amide I and 0.40 D for amide II [which seemed to be most effective in accounting for these frequencies in a tetrapeptide β turn (Bandekar and Krimm, 1979b); see below]. In Table XXIV we give a comparison of the calculated amide I, II, III, and V modes of these β turns with similar modes calculated for the α helix and APPS structures [based on the earlier force field (Moore and Krimm, 1976a,b), since this was used for the β turns].
The salient feature about the predicted amide I modes of β turns is that some of their frequencies overlap with those of the α-helix and β-sheet structures. In particular, βI and βIII have a predicted frequency near 1690 cm⁻¹, which is characteristic of an IR mode of the APPS structure; and for βII there are predicted modes near 1666 and 1656 cm⁻¹, the former occurring near a strong Raman band of the APPS structure and the latter near a strong IR band of the α-helix structure. This observation suggests that caution is necessary in interpreting protein spectra in terms of a simple sum of component secondary conformations.
This is particularly true if we recognize that large frequency shifts are predicted even for a given type of β turn as the dihedral angles are varied. For example, if φ2 for a type I turn is changed from -60° to -75°, the highest frequency is predicted to drop from 1690 to 1675 cm⁻¹ (Krimm and Bandekar, 1980). Thus, while α-helix and β-sheet components of a protein may be expected to have relatively constant amide I frequencies, the same is not likely to be true of a "β-turn component."
The amide II modes of β turns do not appear to be distinctive, since they show overlaps with modes of the α helix and β sheet. However, they tend to be calculated at somewhat higher frequencies than those of the α helix and β sheet. On the other hand, it does seem that the amide III
TABLE XXI
Calculated Amide I, II, and III Frequencies with Transition Dipole Coupling Included for Types I, II, and III β Turns

Amide I: frequency (cm⁻¹) at Δμeff (D)
           Group(a)   0.0    0.30   0.35   0.45
Type I     5          1676   1678   1679   1680
           2+1        1673   1659   1650   1642
           1+2        1671   1685   1694   1702
           3+4        1666   1675   1681   1690
           4+3        1665   1654   1647   1640
Type II    5          1676   1680   1683   1686
           3          1675   1666   1661   1656
           2+1        1674   1671   1669   1666
           1+2        1671   1681   1689   1693
           4          1665   1661   1659   1657
Type III   5          1676   1679   1682   1684
           2+1        1674   1661   1649   1643
           1+2        1671   1680   1687   1694
           3+4        1667   1675   1680   1686
           4+3        1665   1657   1652   1648

Amide II: frequency (cm⁻¹) at Δμeff (D)
           Group(a)   0.0    0.20   0.27   0.40
Type I     5          1579   1575   1568   1559
           4          1569   1561   1555   1536
           1          1567   1563   1557   1547
           2          1554   1562   1567   1575
           3          1540   1543   1550   1558
Type II    5          1578   1573   1561   1555
           4+1        1568   1564   1557   1547
           1+4        1567   1561   1555   1548
           3          1563   1561   1560   1558
           2          1553   1550   1545   1540
Type III   5          1578   1569   1561   1553
           1          1563   1562   1560   1558
           2          1550   1554   1559   1562
           4          1543   1545   1549   1551
           3          1536   1536   1537   1539

Amide III: frequency (cm⁻¹)
           Group(a)   Frequency
Type I     1          1331
           4          1324
           3          1305
           2          1299
           5          1293
Type II    1          1330
           4          1329
           2+3        1303
           3+2        1297
           5          1293
Type III   1          1321
           4          1317
           3          1303
           2+3        1291
           5          1286

(a) Group numbers refer to the peptide groups of Fig. 26. The designation 2 + 1 indicates that both groups contribute to the mode, that of 2 being larger.
TABLE XXII
Calculated Amide I, II, and III Frequencies with Transition Dipole Coupling Included for Types I', II', and III' β Turns

Amide I: frequency (cm⁻¹) at Δμeff (D)
            Group(a)   0.0    0.30   0.35   0.45
Type I'     5          1676   1676   1676   1676
            2          1675   1677   1678   1680
            1          1672   1671   1670   1669
            3+4        1669   1676   1680   1684
            4+3        1685   1657   1651   1646
Type II'    2+5        1676   1682   1685   1687
            5+2        1676   1674   1673   1673
            1          1673   1673   1673   1673
            3          1672   1670   1668   1666
            4          1666   1665   1665   1664
Type III'   5          1676   1677   1677   1679
            2          1675   1671   1669   1667
            1          1671   1678   1685   1691
            3          1669   1674   1676   1678
            4          1666   1658   1654   1649

Amide II: frequency (cm⁻¹) at Δμeff (D)
            Group(a)   0.0    0.20   0.27   0.40
Type I'     5          1568   1563   1553   1546
            1          1555   1549   1543   1536
            4          1548   1531   1529   1521
            3          1541   1545   1550   1553
            2          1536   1538   1542   1544
Type II'    5          1569   1565   1559   1551
            1+2        1553   1548   1546   1541
            4          1548   1539   1531   1526
            1+2        1536   1530   1523   1517
            3          1525   1524   1523   1523
Type III'   5          1566   1551   1543   1531
            1          1555   1552   1550   1549
            4          1546   1549   1552   1553
            2          1542   1545   1548   1548
            3          1537   1535   1534   1532

Amide III: frequency (cm⁻¹)
            Group(a)   Frequency
Type I'     1          1318
            4          1311
            3+2        1290
            2+3        1273
            5          1268
Type II'    1          1311
            4          1309
            2+4        1300
            3+2        1273
            5          1267
Type III'   1          1311
            4          1303
            2+4        1288
            2+3        1274
            5          1271

(a) See footnote to Table XXI.
TABLE XXIII
Calculated Frequencies with Transition Dipole Coupling Included for a Type II β Turn of CH3-CO-(Ala)4-NH-CH3

Amide I: frequency (cm⁻¹) at Δμeff (D)
Group(a)   0.0    0.30   0.35   0.45
5+3        1676   1676   1677   1679
3+5        1676   1665   1658   1653
2+1        1673   1672   1672   1671
1+2        1671   1685   1694   1702
4          1665   1662   1659   1659

Amide II: frequency (cm⁻¹) at Δμeff (D)
Group(a)   0.0    0.20   0.27   0.40
5          1566   1563   1556   1551
4+3        1553   1548   1541   1536
1          1551   1548   1540   1533
4          1546   1541   1536   1531
2          1534   1530   1527   1523

Amide III: frequency (cm⁻¹)
Group(a)   Frequency
1          1311
4          1302
2+3        1284
3+2        1274
5          1271

(a) See footnote to Table XXI.
modes have features characteristic of β turns, viz., they are predicted at frequencies generally higher than those of the α helix and β sheet. For the latter two structures, the main NH ib + CN s contributions occur in the regions 1280-1260 and 1245-1220 cm⁻¹, respectively, where medium to strong Raman and N-deuteration-sensitive IR bands are found. For the β turns, such modes associated with peptide groups in the turn are consistently predicted above 1290 cm⁻¹, thus providing the possibility of special identification of such structures. However, a dependence of these frequencies on the dihedral angles must be noted (Krimm and Bandekar, 1980).
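The amide III regions quoted above can be collected into a simple screening rule. The following sketch is only an illustration of that rule and not a method taken from the original work: the function name is hypothetical, the treatment of the region boundaries as inclusive is an assumption, and, as noted in the text, the actual frequencies depend on the dihedral angles of the turn.

```python
# Amide III regions quoted in the text (cm^-1).
# Treating the region boundaries as inclusive is an assumption of this sketch.
ALPHA_HELIX_AMIDE_III = (1260.0, 1280.0)
BETA_SHEET_AMIDE_III = (1220.0, 1245.0)
BETA_TURN_LOWER_BOUND = 1290.0

def amide_iii_region(frequency):
    """Return the structure whose quoted amide III region contains the band.

    The band is assumed to be an N-deuteration-sensitive NH ib + CN s mode.
    """
    if frequency > BETA_TURN_LOWER_BOUND:
        return "beta turn"
    if ALPHA_HELIX_AMIDE_III[0] <= frequency <= ALPHA_HELIX_AMIDE_III[1]:
        return "alpha helix"
    if BETA_SHEET_AMIDE_III[0] <= frequency <= BETA_SHEET_AMIDE_III[1]:
        return "beta sheet"
    return "unassigned"

# Calculated type I turn modes from Table XXI fall in the turn region.
for band in (1324.0, 1305.0, 1299.0, 1270.0, 1230.0):
    print(band, amide_iii_region(band))
```

A band that falls between the quoted regions is left unassigned rather than forced into a class, since the text gives no guidance for such cases.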
TABLE XXIV
Calculated Amide Mode Frequencies for Various Structures
Structure
Mode
' a
Amide I [1663] 1659 [1650]
Amide I1 [1546] 1543 1520
P(mb [1701]c 1695 1669
1690
1630 [ 15871 1555
-1640 1575 1558
1534 1523
Amide I11 1287 1270
Amide V
1247 1228 1225 W91 713 702 [6991
PI1
PI
PI11
1686 1666 .1656
-
1646
*3
1536
1558 1547 1540
1562 1551 1539
1324 1305 1299
1329 1303 1297
1317 1303 1291
595 575 574 570
644 607 594 588 572 57 1
67 1 589 577 573 573
(a) From Rabolt et al. (1977).
(b) From Moore and Krimm (1976b).
(c) Values in brackets represent frequencies not observed.
The amide V modes also appear to be distinctive for β turns. Such modes are found, and calculated, in the α helix near 660 and 618 cm⁻¹, and in the APPS near 705 cm⁻¹. Except for the highest frequency, which seems to be characteristic of the turn type, these modes generally occur at lower frequencies: 575-570 for βI, 607-571 for βII, and 589-573 for βIII. The presence of N-deuteration-sensitive bands in these lower frequency regions may therefore be diagnostic for a β turn.
The significant variation of amide frequencies with dihedral angles in the turn (Krimm and Bandekar, 1980), however, raises the general question of whether β-turn frequencies can be considered to be as diagnostic as those of the α helix and β sheet, particularly in proteins where relatively large departures from standard values of φ and ψ can occur. Incidentally, as can be seen from Table XXI, the unperturbed amide I
and II frequencies of all turns are essentially the same; TDC, which is particularly sensitive to geometry, accounts for the frequency differences and their dependence on φ, ψ. It is likely that correlations such as those shown in Table XXIV will be useful in cases where it can be assumed that φ and ψ are close to the standard values. More generally, it will be desirable to deal with the specific conformations under consideration so that more relevant predictions can be made. In this connection, it is useful to see how well normal-mode calculations are able to predict the frequencies of actual β turns found in peptides and proteins.
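Whether a given turn is "close to the standard values" can be checked by comparing its dihedral angles with the standard (φ, ψ)2 and (φ, ψ)3 values quoted earlier for types I-III. The sketch below is a hypothetical helper and not part of the original analysis: the function names and the use of the largest single-angle deviation as the measure of closeness are assumptions made for illustration. The example angles are those given above for the type II turn with Ala in the third position.

```python
# Standard turn dihedral angles quoted in the text (degrees):
# ((phi2, psi2), (phi3, psi3)) for each turn type.
STANDARD_TURNS = {
    "I": ((-60.0, -30.0), (-90.0, 0.0)),
    "II": ((-60.0, 120.0), (80.0, 0.0)),
    "III": ((-60.0, -30.0), (-60.0, -30.0)),
}

def angle_difference(a, b):
    """Signed smallest difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def max_deviation(turn, standard):
    """Largest absolute deviation of any of the four angles (an assumed metric)."""
    return max(
        abs(angle_difference(x, y))
        for res_turn, res_std in zip(turn, standard)
        for x, y in zip(res_turn, res_std)
    )

def nearest_standard_type(turn):
    """Name of the standard turn type with the smallest maximum deviation."""
    return min(STANDARD_TURNS, key=lambda name: max_deviation(turn, STANDARD_TURNS[name]))

# Type II turn with Ala in the third position (angles quoted in the text).
ala_turn = ((-72.0, 119.0), (77.0, 16.0))
print(nearest_standard_type(ala_turn), max_deviation(ala_turn, STANDARD_TURNS["II"]))
```

For these angles the largest deviation from the standard type II values is 16°, in ψ3, and it is much larger for types I and III. How large a deviation still permits the use of the standard-turn correlations is not specified in the text.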
2. β Turns in Peptides
a. Type I β Turn: Z-Gly-Pro-Leu-Gly-OH.
The tetrapeptide p-bromocarbobenzoxylglycylprolylleucylglycine (BrZ-Gly-Pro-Leu-Gly-OH) is known from crystal-structure studies to have a type I β-turn conformation (Ueki et al., 1969). Its dihedral angles are (φ, ψ)2 = -58°, -33° and (φ, ψ)3 = -104°, 8°, which are very close to the standard values of (φ, ψ)2 = -60°, -30° and (φ, ψ)3 = -90°, 0°. In a normal-mode calculation (Bandekar and Krimm, 1979b), this molecule was modeled by CH3OCO-Gly-Ala-Ala-Gly-OCH3 with external hydrogen bonds. The force field was the same as that used for the standard β turns (Krimm and Bandekar, 1980), with the addition of CO s force constants to reproduce the observed ester (1743 cm⁻¹) and urethane (1686 cm⁻¹) frequencies.
Infrared and Raman spectra of Z-Gly-Pro-Leu-Gly-OH are shown in Figs. 27 and 28, respectively. The observed amide I, II, III, and V bands are listed in Table XXV, together with the calculated values of these frequencies.
As can be seen from Table XXV, the amide I bands are well predicted by the calculation. The highest peptide group frequency, calculated at 1681 cm⁻¹, is lower than the predicted frequency for the standard turn (viz., 1690 cm⁻¹); this is probably due to the combined effects of the changes in φ3 and ψ4 (Krimm and Bandekar, 1980), which also change the mixing from 3 + 4 to 3 + 2. The observed bands near 1674 cm⁻¹ are in reasonable agreement with the predicted frequency. The calculated frequencies at 1659 and 1647 cm⁻¹ are different from those for the comparable modes of a standard turn at 1640 and 1642 cm⁻¹, respectively (see Table XXI), but again the mixing is significantly different and the observed bands are well reproduced.
The amide II modes, identified by their disappearance on N-deuteration (see Figs. 27a,b), are quite well predicted by the calculation. Two points are worthy of note: (1) Bands at 1562 and 1544 cm⁻¹ are calculated at higher frequencies than their counterparts at 1558 and 1536
FIG. 27. Infrared spectra of Z-Gly-Pro-Leu-Gly-OH. (a) Undeuterated; (b) N-deuterated (J. Bandekar and S. Krimm, unpublished work). Abscissa: frequency (cm⁻¹).
FIG. 28. Raman spectrum of Z-Gly-Pro-Leu-Gly-OH (J. Bandekar and S. Krimm, unpublished work). Ordinate: intensity; abscissa: frequency (cm⁻¹).
cm-', respectively, in the standard turn, (see Table XXI), and are observed nearer these values. (2) The highest observed frequency, at 1568 cm-', is significantly higher than that seen for the a helix (1545 cm-') or p sheet (1555 cm-l) of (Ala),, a feature noted for the standard turn. The amide I11 modes calculated at 1313, 1291, and 1281 cm-' have a large contribution from NH ib CN s. The weak IR band at 1314 cm-' and the strong IR band at 1294 cm-' disappear on N-deuteration and are well accounted for by the first two of the above calculated frequencies. In addition, NH ib contributes to modes at 1391, 1331, 1326, and 1300 cm-', for the second of which there is an observed N-deuterationsensitive band observed at 1333 cm-'. The NH ob coordinate makes contributions to modes above -500 cm-' at 609,583,565, and 498 cm-'. A medium-intensity band in the IR at 599 cm-' is observed to disappear on N-deuteration, and is very well accounted for by the first of the above calculated modes. It is interesting that the general prediction that amide V frequencies of /3 turns are found below those of the a helix and p sheet is supported by the results on this molecule. The normal-mode calculations on Z-Gly-Pro-Leu-Gly-OH, together with IR and Raman spectra of this molecule, thus provide a strong basis for supporting the general conclusions drawn from a vibrational analysis
TABLE XXV
Observed and Calculated Amide Mode Frequencies of CH3-O-Gly-Ala-Ala-Gly-O-CH3 and Observed Bands of Type I β-Turn Z-Gly-Pro-Leu-Gly-OH

Calculated ν (cm⁻¹), with the contributing groupᵇ in parentheses
  Amide I:    1743 (5), 1688 (1), 1681 (3+2), 1659 (4), 1647 (2+3)
  Amide II:   1579 (2+1), 1562 (3), 1544 (4), 1534 (1+2)
  Amide III:  1391 (2), 1331 (4), 1326 (3), 1313 (1), 1300 (3), 1291 (4), 1281 (2)
  Amide V:    609 (1), 583 (3), 565 (1+2), 498 (4)

Observedᵃ (cm⁻¹), Raman
  Amide I:    1741MS, 1689S, 1674W, 1656S, 1644MW
  Amide II:   -
  Amide III:  1333M, 1325W, 1316W, 1291M
  Amide V:    -

Observedᵃ (cm⁻¹), infrared
  Amide I:    1741S, 1686S, 1673W, 1655VS, 1639VS
  Amide II:   1568MS, 1548MS, 1525M
  Amide III:  1333W, 1314W, 1294S
  Amide V:    599M

ᵃ S, strong; M, medium; W, weak; V, very.
ᵇ See footnote a, Table XXI, for general explanation of symbols.
of standard β turns (Krimm and Bandekar, 1980). They also enhance support for the conclusion (Kawai and Fasman, 1978) that Z-Gly-Ser(OBut)-Ser-Gly-O-stearyl ester, on the basis of the similarity of its IR spectrum to that of Z-Gly-Pro-Leu-Gly-OH, probably has a type I β-turn structure. A complete vibrational analysis has not been done for a type I' β-turn molecule, but the results of a Raman study of (Leu5)-enkephalin (Han et al., 1980), which is known from X-ray diffraction analysis to form a type I' β turn (Smith and Griffin, 1978), are consistent with the predictions made for the standard turn. The dihedral angles in this molecule [(φ, ψ)₂ = 59°, 25° and (φ, ψ)₃ = 97°, -7°] are quite close to the standard values of 60°, 30° and 90°, 0°, respectively, thus suggesting that the predictions for the standard type I' β turn (Table XXII) are likely to be applicable. Relevant amide I modes are calculated at 1684, 1680, 1676, and 1646
cm⁻¹; there are observed Raman bands at 1676VS and 1642W cm⁻¹ for the crystalline material, quite consistent with these predictions (no IR spectra were presented on the crystalline compound). No amide II data (from IR spectra) were given, but N-deuteration-sensitive amide III bands were identified at 1325, 1282, 1271, and 1255 cm⁻¹. These bands are in good agreement with predicted modes at 1311, 1290, 1273, and 1268 cm⁻¹. No IR data were presented on amide V modes, but the above results support the general predictive capabilities of the normal-mode calculation.
b. Type II β Turn
i. Pro-Leu-Gly-NH2. The C-terminal tripeptide of oxytocin, Pro-Leu-Gly-NH2, has been shown from crystallographic studies to have a type II β-turn structure (Reed and Johnson, 1973). Its dihedral angles are ψ₁(Pro) = 152.9°, φ₂(Leu) = -61.2°, ψ₂(Leu) = 127.8°, and φ₃(Gly) = 71.8°, which are close to the standard values of (φ, ψ)₂ = -60°, 120° and φ₃ = 80°. The normal modes of this molecule as well as its N-deuterated derivative have been calculated, and compared in detail with Raman and IR spectra (Naik et al., 1980; Naik and Krimm, 1984a). No structural approximations were made in this case, and the force field for the peptide group was transferred from more recently refined force fields for (Gly)n I (Dwivedi and Krimm, 1982a) and β-(Ala)n (Dwivedi and Krimm, 1982b). Force constants for the prolyl moiety were transferred from (Pro)n (Johnston, 1975), for the leucyl side chain from hydrocarbons (Schachtschneider and Snyder, 1963), and for the CONH2 group from acetamide (Uno et al., 1969, 1971). For the IR and Raman spectra of this molecule, and the detailed description of its normal modes, the original publication (Naik and Krimm, 1984a) should be consulted. The 51 Raman and 46 IR bands observed below 1700 cm⁻¹ could be assigned to 68 calculated normal modes with an average error of 6 cm⁻¹.
Comparable assignments could be made for 44 Raman and 50 IR bands observed in this region for the N-deuterated molecule. In Table XXVI we give only the results for the amide modes. The predictions for amide I and II modes are generally good, considering that force constants were transferred without further refinement. The large frequency difference between the 1680 (IR) and 1691 cm⁻¹ (Raman) bands may reflect the presence of intermolecular interactions in the crystal. In any event, we do not expect frequencies as high as these for a standard type II β turn. Their observation and prediction are undoubtedly related to the particular structure of this molecule, which emphasizes the caution required in assuming general characteristic β-turn frequencies. Incidentally, the calculation predicts that the 1658-
TABLE XXVI
Calculated and Observed Amide Frequencies of Type II β Turn Pro-Leu-Gly-NH2

              Calculated              Observedᵃ (cm⁻¹)
Mode          ν (cm⁻¹)    Groupᵇ      Raman       Infrared
Amide I       1680        2+3         1691MW      1680VS
              1664        3+2         1664sh      1662M
              1658        4           1652S       1650sh
Amide II      1568        2           -           1565sh
              1545        3           -           1556VS
Amide III     1375        2           1375W       1370M
              1354        3           1351MW      -
              1335        2           1345M       -
              1328        2           1338M       1336M
              1266        3           1271MS      1270sh
              1237        2           1241S       1241M
Amide V       699         2           695W        687MS
              657         2           647W        645S
              603         3           614W        612MS
              575         3           571M        570MS

ᵃ See footnote a, Table XXV, for explanation of symbols.
ᵇ See footnote b, Table XXV, for explanation of symbols.
cm⁻¹ mode, with a significant contribution from NH2 r and NH2 b, will be shifted down on N-deuteration by a larger amount (18 cm⁻¹) than the amide I modes of the other peptide groups (about 5 cm⁻¹). This uniquely large shift is observed, thus making unnecessary the interpretation of this shift in terms of conformational change (Hseu and Chang, 1980). In Table XXVI we also list the observed IR and Raman bands in the amide III and V regions that weaken or shift on N-deuteration, and the calculated modes containing NH ib or NH ob, respectively, that can be assigned to them. Despite the prediction for the standard turn that amide III frequencies should occur above about 1300 cm⁻¹ (see Table XXI), we find two clearly N-deuteration-sensitive bands below this value, at 1271 and 1241 cm⁻¹, that are well accounted for by the calculation. This arises from the different forms of the normal modes for the two molecules, particularly the CβCγ(Leu) s contribution to the 1237-cm⁻¹ mode. A similar consideration may apply to the 687-cm⁻¹ (IR) band, which is much higher than the highest predicted mode of the standard turn: The CH2 r contribution in the latter case is absent for the tripeptide.
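The level of agreement quoted for Pro-Leu-Gly-NH2 can be illustrated with a short numerical sketch. It is not part of the original analysis: the pairs below are the calculated amide frequencies and the infrared bands of Table XXVI, and pairing them by table row is an assumption of this sketch, as are the names `PAIRS`, `mean_abs_error`, and `worst_pair`.

```python
# Calculated amide-mode frequencies and the infrared bands paired with them,
# transcribed from Table XXVI (cm^-1). Pairing by table row is an assumption
# of this sketch; rows without an infrared band are omitted.
PAIRS = [
    (1680, 1680), (1664, 1662), (1658, 1650),                 # amide I
    (1568, 1565), (1545, 1556),                               # amide II
    (1375, 1370), (1328, 1336), (1266, 1270), (1237, 1241),   # amide III
    (699, 687), (657, 645), (603, 612), (575, 570),           # amide V
]


def mean_abs_error(pairs):
    """Return the mean of |calculated - observed| over all pairs, in cm^-1."""
    return sum(abs(calc - obs) for calc, obs in pairs) / len(pairs)


def worst_pair(pairs):
    """Return the first (calculated, observed) pair with the largest discrepancy."""
    return max(pairs, key=lambda p: abs(p[0] - p[1]))


if __name__ == "__main__":
    print(f"mean |calc - obs| = {mean_abs_error(PAIRS):.1f} cm^-1")
    print("largest discrepancy:", worst_pair(PAIRS))
```

For these 13 pairs the mean absolute deviation comes out a little above 6 cm⁻¹, of the same order as the average error quoted for the full set of 68 assigned modes; the largest deviations fall in the amide V region.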
From the results on Pro-Leu-Gly-NH2 we can conclude that the force fields are highly reliable in their ability to reproduce observed frequencies of this type II β-turn structure. In addition, we see again that β-turn frequencies depend strongly on the specific dihedral angles and side chains associated with the turn.
ii. Z-Gly-Pro-Gly-Gly-OMe. The tetrapeptide Z-Gly-Pro-Gly-Gly-OMe is found from NMR studies to adopt a type II β conformation (Perly et al., 1983), and its spectra have been analyzed with the help of normal-mode calculations (Lagant et al., 1984a). A standard type II β-turn structure was assumed in the calculations, which were done with a Urey-Bradley force field. The observed amide frequencies and their assignments to peptide groups were as follows: amide I, 1694 cm⁻¹ (3), 1656 cm⁻¹ (2), 1639 cm⁻¹ (4); II, 1560 cm⁻¹ (4), 1540 cm⁻¹ (3); III, 1280 cm⁻¹ (4), 1255 cm⁻¹ (3). These differ from those of the type II β turn of Pro-Leu-Gly-NH2, probably mainly because of the differences in side-chain structure.
iii. Cyclo(L-Ala-Gly-Aca). The potential for structure determination through vibrational analysis is demonstrated by a study of cyclo(L-alanylglycyl-ε-aminocaproyl) [cyclo(L-Ala-Gly-Aca)], a tripeptide cyclized by a (CH2)5 chain and therefore constrained to form a β turn. The structure of this molecule was not previously known, but conformational energy calculations suggested possible low-energy structures (Némethy et al., 1981). Since four of these had energies within about 1 kcal/mol of the minimum, it was difficult to be certain which structure prevailed. Normal-mode calculations were used to analyze Raman and IR spectra of this molecule (Maxfield et al., 1981), and from this it was possible to conclude which of the two type II β turns among the four is the predominant structure in the solid state and in solution.
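One step of the screening described for this molecule, the comparison of calculated amide I splittings with the observed splitting of about 50 cm⁻¹, can be sketched as follows. The splittings are those listed in Table XXVII with transition dipole coupling included; the acceptance window `TOLERANCE` and the function name are hypothetical choices made for illustration and do not come from the original study.

```python
# Maximum calculated amide I splitting (cm^-1) with transition dipole coupling
# for conformations 1-10 of cyclo(L-Ala-Gly-Aca), transcribed from Table XXVII.
SPLITTING_WITH_TDC = {1: 11, 2: 10, 3: 45, 4: 21, 5: 21,
                      6: 18, 7: 21, 8: 53, 9: 57, 10: 63}

OBSERVED_SPLITTING = 50.0  # cm^-1, approximate observed value quoted in the text
TOLERANCE = 15.0           # cm^-1, hypothetical acceptance window (assumption)


def compatible_conformations(calculated, observed, tolerance):
    """Return the conformation numbers whose splitting lies within the window."""
    return sorted(number for number, splitting in calculated.items()
                  if abs(splitting - observed) <= tolerance)


if __name__ == "__main__":
    print(compatible_conformations(SPLITTING_WITH_TDC,
                                   OBSERVED_SPLITTING, TOLERANCE))
```

With this window the sketch retains conformations 3, 8, 9, and 10. The splitting alone cannot separate them, which is why the amide II and amide V regions are also needed.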
As a general test of the method, calculations were done on the 10 lowest energy conformations calculated for this molecule (Némethy et al., 1981); these energies and the types of bends are given in Table XXVII. The calculated amide I, II, III, and V modes were compared with observed Raman and IR bands of the parent and N-deuterated molecules in order to determine which structure gives best agreement between observed and calculated bands. Results for the amide I and V modes are shown in Fig. 29. As shown in Table XXVII, the maximum predicted splittings of the amide I mode vary with the conformation of the turn (a result of differences in the TDC contributions). On the basis of the observed splitting of ~50 cm⁻¹, conformations 3, 8, 9, and 10 could be considered possible ones, although the frequencies of 3 are in better agreement with observed bands of the solid (Fig. 29a). For amide II, observed IR bands
TABLE XXVII
Parameters Characterizing the Theoretically Computed Minimum-Energy Conformations of cyclo(L-Ala-Gly-Aca) with Trans Peptide Bonds

                                            Maximum splitting (cm⁻¹)
                                            of amide I modesᵃ
Conformation    Energy ΔEᶜ     Turn
numberᵇ         (kcal/mol)     type         Absent      Present
1               0.00           II           10          11
2               0.74           I            7           10
3               0.93           II           10          45
4               1.07           III          12          21
5               1.22           III          11          21
6               1.25           III          10          18
7               1.59           I            15          21
8               2.80           III'         13          53
9               2.96           I'           18          57
10              3.08           I'           11          63

ᵃ Calculated value of maximum splitting of amide I modes, with and without transition dipole coupling.
ᵇ Némethy et al. (1981).
ᶜ ΔE = E - E0, where E0 is the computed energy of conformation 1.
show a maximum splitting of ~35 cm⁻¹; calculated splittings of conformations 1 (30 cm⁻¹) and 3 (~50 cm⁻¹) agree best with the observations. The amide V modes provide a strong selection criterion: as seen in Fig. 29b, only structure 3 predicts two modes in the region 650-550 cm⁻¹ that are compatible with the observed N-deuteration-sensitive bands. Thus, the combined evidence clearly favors conformation 3 as the likely one in the solid state. This structure has (φ, ψ)₂ = -85°, 74° and (φ, ψ)₃ = 132°, -62°, which are significantly different from the standard values of -60°, 120° and 80°, 0°, respectively. The angles for the type II β turn of conformation 1 are (φ, ψ)₂ = -89°, 85° and (φ, ψ)₃ = 81°, 74°, again emphasizing the dependence of the amide frequencies on the specific values of the dihedral angles (Krimm and Bandekar, 1980).
iv. Cyclo(L-Ala-D-Ala-Aca). A type II β turn has also been confirmed in another cyclic tripeptide, cyclo(L-Ala-D-Ala-Aca) (Bandekar et al., 1982). As in the case of cyclo(L-Ala-Gly-Aca), this conclusion resulted from a vibrational analysis of a number of conformations obtained from a conformational energy calculation. In this instance the spectroscopic study concluded that the solid-state structure corresponded predominantly to the lowest energy conformation (number 3), a structure having (φ, ψ)₂ = -89°, 72° and (φ, ψ)₃ = 134°, -61°, with the possibility of a
FIG. 29. (a) Calculated frequencies in the amide I region for the 10 lowest energy conformations of cyclo(L-Ala-Gly-Aca) (see Table XXVII). The observed infrared (solid bar) and Raman (open bar) bands are shown on the bottom line. Numbers above the computed frequencies represent the groups involved in the vibration (Maxfield et al., 1981). (b) Calculated frequencies in the amide V region for the 10 lowest energy conformations of cyclo(L-Ala-Gly-Aca) (see Table XXVII). The observed infrared and Raman bands occur at the same frequencies and are indicated by the shaded bars on the bottom line. Numbers above the calculated frequencies represent the groups involved in the vibration (Maxfield et al., 1981). Abscissa ticks: 1640 to 1710 in (a) and 450 to 800 in (b).
TABLE XXVIII Observed and Calculated Amide Modes of Type II β Turn in cyclo(L-Ala-D-Ala-Aca)
Observed" (cm-1) Mode
Raman
Infraredb
1669VS 1656sh 1643W
1668W 1656sh 1641VS 1565sh (1565sh) 1554M (1554MS)
Amide I
Amide I1
1546MS (1545M) Amide 111
1379MW 1310MW 1255M 1240MS
Amide V 729W
1303MW (1290MW) 1252M (1256M) 1236sh (1238sh) (767MW) 736sh (746M) 716MS (719MS) 697MS (700M)
Calculatedc (cm-I) No. 3
No. 1
1686 1676
1683
1651 1560 1555
1535 1372 1337 1300 1255 1242
1662 1646 1546 1545 1542 1370 1339 1299 1262 1237 773
749 739 690
670W
597MW
(668W) (650VW) (615VW) 592MW (592MW)
660 650 620 59 1 573
ᵃ See footnote a, Table XXV, for explanation of symbols. ᵇ Frequency values in parentheses were observed at low temperature. ᶜ The numbers in the heading refer to those of the computed conformations (Bandekar et al., 1982).
small amount of another type II β turn (number 1, 0.74 kcal/mol higher in energy) also being present, having (φ, ψ)₂ = -96°, 94° and (φ, ψ)₃ = 85°, 57°. The observed Raman and IR bands of this molecule and the calculated frequencies of these two structures are given in Table XXVIII. Since the (φ, ψ)₂ and (φ, ψ)₃ values of conformations 3 of cyclo(L-Ala-D-Ala-Aca) and cyclo(L-Ala-Gly-Aca) are essentially identical, it is of interest to compare the amide modes of these two molecules to see what effect the replacement of a Gly H by an Ala CH3 has on the frequencies. The strong ~1668 (Raman) and ~1642 (IR) cm⁻¹ amide I frequencies
are identical for both molecules. Of the amide II modes, a frequency at 1566 cm⁻¹ is common to both, though this is a medium-intensity band and represents a shoulder in the Gly and Ala molecules, respectively. The other two amide II modes are quite different: 1549MW and 1533M versus 1554M and 1546MS cm⁻¹ in the Gly and Ala molecules, respectively. While both molecules have modes with NH ib near 1375 and 1305 cm⁻¹ in common, the lower frequency amide III modes are significantly different: ~1280W and ~1230M versus ~1254M and 1240MS (Raman) cm⁻¹ in the Gly and Ala molecules, respectively. The amide V modes show similar differences. These results emphasize a point made above, viz., that β turns with a Gly residue at position 3 are not good general models for β turns. It should also be noted that the frequencies of cyclo(L-Ala-D-Ala-Aca) are not in good agreement with those of a standard type II β turn with Ala in position 3 (cf. Tables XXIII and XXVIII). This is probably a consequence of the significant differences in dihedral angles of these β turns.
The only complete vibrational analysis of a type II' β-turn peptide is that of gramicidin S (Naik et al., 1984). Although the dihedral angles of (φ, ψ)₂ = 60°, -137° and (φ, ψ)₃ = -75°, -18° are not too far from the standard values of (φ, ψ)₂ = 60°, -120° and (φ, ψ)₃ = -80°, 0°, the situation is complicated by the cyclosymmetric nature of the molecule and the presence of a Pro residue in position 3. Thus, although good agreement is obtained between observed and calculated frequencies for the gramicidin S structure, there is, as expected, poorer agreement with calculated modes of a standard type II' β turn (cf. Table XXII).
c. Type III β Turn. The only complete vibrational analysis of a system that adopts a type III β turn is that of cyclo(L-Ala-L-Ala-Aca) (Bandekar et al., 1982). This molecule was found by conformational energy calculations to be likely to assume either a type I or type III β-turn structure.
Comparison of the observed Raman and IR bands with calculated normal modes showed that the type III β turn, of 0.55 kcal/mol higher energy than the minimum-energy (type I β-turn) structure, predominated in the solid state, with some increase in type I structure occurring at low temperature. The type III β turn of cyclo(L-Ala-L-Ala-Aca) has (φ, ψ)₂ = -81°, -53° and (φ, ψ)₃ = -87°, -48°, compared to the standard values of -60°, -30° and -60°, -30°, respectively. Observed and calculated frequencies of this molecule are compared in Table XXIX; the agreement is reasonably good, except for some amide V modes. The strong amide I modes at 1670 (Raman) and 1650 (IR) cm⁻¹ are well predicted, despite their frequency differences from modes of the standard structure (cf. Table XXIV). The strong amide II modes at 1543 and 1530 cm⁻¹ are well accounted for, conformations 4 and 2
TABLE XXIX Observed and Calculated Amide Modes of Type III (No. 4) and Type I (No. 2) β Turns in cyclo(L-Ala-L-Ala-Aca)
Observedᵃ (cm⁻¹) Mode
Raman
Amide I
Infraredb 1682sh
1670VS 1652W 1641W
1650VS 1640sh 1574sh 1565sh 1554sh 1543s 1530s
No. 2
1686 1673 1651
1681 1674 1638
1379MW
(1572W)
(1536s)
1378MW (1377MW)
1242s 784W 757w 738W 726W 681W
1278W 1272M 1241M 781MW
723MS 682MS 595w
1578
(1554sh) ( 1548s)
1362VW 1330s
Amide V
No. 4
1585
Amide I1
Amide 111
Calculate& (cm-I)
(1 278W) (1272M) (1244M) (785MW) (738M) (726MS) (694W) (679MS) (595W)
1549 1537 1371 1365 1332 1281 1245 796
1548 1536 1379 1373
1271 1241 76 1 713
703 676
675
ᵃ See footnote a, Table XXV, for explanation of symbols. ᵇ Frequency values in parentheses were determined at low temperature. ᶜ The numbers in the heading refer to those of the computed conformations (Bandekar et al., 1982).
being the only ones to predict another mode below the 1543-cm⁻¹ band. All of the observed amide III modes are predicted, which is not the case for the other conformations. And while the frequency agreement for the 713- and 703-cm⁻¹ modes is not good (assuming the assignment is correct), other conformations do not even predict any bands between 775 and 696 cm⁻¹. In view of the overall frequency agreement, it is highly probable that cyclo(L-Ala-L-Ala-Aca) adopts a type III β-turn conformation. A type III β turn has been found in the crystal structure of benzoxycarbonyl-α-aminoisobutyryl-L-prolylmethylamide (Prasad et al., 1979),
but no detailed vibrational analysis has been made of this compound. Its dihedral angles, (φ, ψ)₂ = -51°, -38° and (φ, ψ)₃ = -65°, -25°, are close to the standard values of -60°, -30° and -60°, -30°, respectively, so it might be thought that its amide modes would be close to those of the standard structure. However, the presence of the two unusual residues, as well as the urethane groups, indicates that a simple correspondence may not occur. In addition, the presence of four molecules in the unit cell is likely to complicate the spectrum. Raman spectra of this molecule in the solid state (Ishizaki et al., 1981) show bands at 1693W and 1677M, and an amide III mode at 1286W cm⁻¹. The IR spectrum in CHCl3 (Rao et al., 1980) has bands at 1715S, 1658VS, and 1645VS cm⁻¹. The 1715- and 1693-cm⁻¹ bands are due mainly to the urethane group. The other two expected amide I modes are assignable to the 1677- and ~1650-cm⁻¹ bands (assuming in the latter case that the same type III β turn is preserved in solution). It is interesting that these frequencies are close to those of cyclo(L-Ala-L-Ala-Aca), and indeed not so far from those of the standard structure (see Table XXI). The amide III mode at 1286 cm⁻¹ may be related to the 1278-cm⁻¹ band found in the cyclic molecule. The incomplete deductions achievable in this case contrast clearly with the relatively powerful conclusions possible on the basis of a normal-mode analysis.
3. β Turns in Proteins
Efforts in the past to interpret the IR and Raman spectra of proteins have generally been based on the assumption of a three-state model, viz., components consisting of α-helix, β-sheet, and "random coil" structures. These attempts, which are based on the supposedly known characteristic frequencies of the above structures, have often led to controversial assignments (Yu et al., 1972, 1974; Spiro and Gaber, 1977; van Wart and Scheraga, 1978) or to incomplete assignments (Lord and Yu, 1970a,b; Chen and Lord, 1976; Chen et al., 1973; Frushour and Koenig, 1974; Craig and Gaber, 1977) for bands in the amide I and III regions of the Raman spectrum. As has been pointed out (Bandekar and Krimm, 1980), part of the reason for this failure is the neglect of the contributions of β turns, which we have seen are expected to be significant. In order to incorporate their presence in protein spectral analysis, it is necessary to know the characteristic frequencies of β turns. However, since a variety of such β-turn structures exist and since what one observes in proteins are β turns with dihedral angles that vary from the canonical values (Chou and Fasman, 1977; Venkatachalam, 1968a; Lewis et al., 1973), it is not easy to identify these structures by a study of model compounds, or by the assumption of characteristic frequencies
for a generalized β-turn component derived from a set of known protein structures (Williams, 1983). In this context, it is nevertheless important to know if β turns in proteins make specific contributions to the spectra. The only protein for which such a vibrational analysis has been done is insulin (Bandekar and Krimm, 1980). This protein is a particularly suitable one for such a study, since its structure has been solved (Blundell et al., 1972), it is relatively small, with only four β turns, and Raman spectra of single crystals have been reported (Yu et al., 1974). The normal-mode calculations (Bandekar and Krimm, 1980) permit a correlation of previously unassignable bands in the Raman spectrum with β turns in the structure, as well as showing that some of the computed β-turn frequencies lie in spectral regions previously associated exclusively with α-helix modes. These results thus emphasize our previous remarks that caution must be exercised in proposing unique assignments of bands to α-helix and β-sheet structures in proteins. The β turns in insulin are of four different types and have the following dihedral angles (Blundell et al., 1972):
A7-10: Cys-Thr-Ser-Ile (type IV), (φ, ψ)₂ = -89°, 20°; (φ, ψ)₃ = -141°, -134°.
A12-15: Ser-Leu-Tyr-Glu (type III), (φ, ψ)₂ = -76°, -40°; (φ, ψ)₃ = -69°, -47°.
B7-10: Cys-Gly-Ser-His (type II'), (φ, ψ)₂ = 84°, -107°; (φ, ψ)₃ = -92°, -24°.
B20-23: Gly-Glu-Arg-Gly (type I), (φ, ψ)₂ = -143°, 11°; (φ, ψ)₃ = -96°, -40°.
These dihedral angles were used in normal-mode calculations on the model system CH3CO-(Ala)4-NHCH3. The amide I and III frequencies of the above four β turns are given in Table XXX. We note that the angles for the type I turn are quite different from the standard values, and the frequencies (though not the ranges) are also different; whereas for the type III turn the angles are closer to the standard values, and so are the frequencies (though some group assignments differ).
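The remark that the type I turn of insulin departs much more from its standard geometry than the type III turn can be made quantitative with a small sketch. The insulin angles and the standard values for types I, II', and III are those quoted in this section; the root-mean-square measure, the angle wrapping, and all names in the code are assumptions introduced here for illustration. The type IV turn is omitted because no standard values are quoted for it.

```python
import math

# Standard (phi2, psi2, phi3, psi3) values in degrees, as quoted in the text.
STANDARD = {
    "I": (-60.0, -30.0, -90.0, 0.0),
    "II'": (60.0, -120.0, -80.0, 0.0),
    "III": (-60.0, -30.0, -60.0, -30.0),
}

# Insulin beta turns: (turn type, (phi2, psi2, phi3, psi3)) in degrees.
INSULIN_TURNS = {
    "A12-15": ("III", (-76.0, -40.0, -69.0, -47.0)),
    "B7-10": ("II'", (84.0, -107.0, -92.0, -24.0)),
    "B20-23": ("I", (-143.0, 11.0, -96.0, -40.0)),
}


def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def rms_deviation(angles, reference):
    """Root-mean-square angular deviation (degrees) between two angle sets."""
    squares = [angle_diff(a, r) ** 2 for a, r in zip(angles, reference)]
    return math.sqrt(sum(squares) / len(squares))


if __name__ == "__main__":
    for name, (turn_type, angles) in INSULIN_TURNS.items():
        deviation = rms_deviation(angles, STANDARD[turn_type])
        print(f"{name} (type {turn_type}): rms deviation {deviation:.1f} deg")
```

On this measure the type III turn (A12-15) lies closest to its standard geometry and the type I turn (B20-23) farthest, in line with the comparison of frequencies made above.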
The predicted amide I modes for groups 2-4 of the β turns center near two frequencies, 1652 ± 3 and 1680 ± 4 cm⁻¹. Bands are observed near these frequencies in the Raman spectra of single crystals of insulin, viz., at 1658 and 1681 cm⁻¹ (Yu et al., 1974). The 1658-cm⁻¹ band has been assigned (Yu et al., 1974; van Wart and Scheraga, 1978), on the basis of previous correlations, to the known 40-50% α-helix component of insulin (Blundell et al., 1972). This is a reasonable interpretation,
TABLE XXX Amide I and Amide III Frequencies of β-Turn CH3-CO-(Ala)4-NH-CH3 with Insulin Dihedral Angles
Amide I
P Turn
a
Amide 111
Group"
Frequency (cm-')
Group"
Frequency (cm-I)
1+2 4+5
1697 1677 1660 1656 1650 1696 1683 1674 1655 1648 1680 1677 1675 1655 1650 1684 1683 1671 1653 1646
1 + 3 4 2 3+1
1311 1302 1296 1290 1281 1310 1305 1299 1289 1283 1319 1311 1307 1289 1281 1315 1296 1290 1287 1281
5 + 4
B7-10 (type 11')
with
3 2+1 1+2 4 + 3 5 2+ 1 3+4 1 5 2 3 4 2+1 3+4 5 4 + 3 1+2
5
1 3+4 2+4 4+3 5 1+3 3+1 4 2 5 1 3+1 4+5 2+3 5+2
See footnote b, Table XXV, for explanation of symbols.
except that our calculations would now suggest that the β-turn component of the insulin structure also contributes in this region. The origin of the observed band at 1681 cm⁻¹ had previously been perplexing. It had been assigned (Yu et al., 1972; Spiro and Gaber, 1977) to a random-coil component, but this is difficult to support since it disappears in denatured insulin (Yu et al., 1972). van Wart and Scheraga (1978) have commented that "the shoulder at 1681 cm⁻¹ might be due to a state not encountered in model studies." The results of the normal-mode calculations make a strong case for assigning this band to the β turns in the native insulin structure. The disappearance of this band on denaturation (Yu et al., 1972) is certainly consistent with this assignment, as is its continued presence in a deuterated single crystal of insulin (Yu et al., 1974). The predicted amide III modes of groups 2-4 fall roughly into two groups: at 1289 ± 1 cm⁻¹ and fairly uniformly distributed in the range 1311-1296 cm⁻¹. If external hydrogen bonding is not included, these frequencies are at 1280 ± 2 and 1298-1287 cm⁻¹. The observed amide
III modes in the Raman spectrum (Yu et al., 1974) are found at 1240, 1269, 1284, and 1303 cm⁻¹. These bands have been assigned as follows: 1240 cm⁻¹ to random coil and β sheet (Yu et al., 1972), 1269 and 1284 cm⁻¹ to α helix (Yu et al., 1972), and 1303 cm⁻¹ to "the α-helical category" (Yu et al., 1974). On the other hand, while essentially agreeing with the assignments of the first three bands, other authors (van Wart and Scheraga, 1978) note that "the band at 1303 cm⁻¹ cannot be assigned on the basis of what is presently known from model studies." The reason behind this is that, although high-frequency amide III modes have been correlated with α-helix structures, the highest frequency that had been observed for such a band was 1295 cm⁻¹ in solid poly(L-lysine) HCl at 50% relative humidity (Chen and Lord, 1974). Thus amide III bands near 1300 cm⁻¹ and above cannot be correlated with α-helix, β-sheet, or unordered structures. The calculations on insulin β turns, however, clearly indicate that the band at 1303 cm⁻¹, and probably part of that at 1284 cm⁻¹, can be assigned to β turns. In addition, normal-mode calculations on canonical β turns (Bandekar and Krimm, 1979a; Krimm and Bandekar, 1980) have shown that they have characteristic amide III modes above 1300 cm⁻¹, in a region where such modes are not found for α-helix and β-sheet structures. The 1303-cm⁻¹ band assignment is thus strongly supported by these calculated results. Whereas the 1284-cm⁻¹ band could be associated with the α helix, the fact that a band is predicted near this position for the β turns of insulin indicates that, at the very least, this band should be considered to be partly due to the presence of the latter structures. Based on the assignment of amide III modes with frequencies near 1300 cm⁻¹ to β turns, heretofore unassignable bands in other proteins could be interpreted (Bandekar and Krimm, 1980). A band at 1300 cm⁻¹ in lysozyme (Chen et al., 1973) may also be due to β turns.
Similar assignments may be appropriate for the 1305-cm⁻¹ band in human carbonic anhydrase (Craig and Gaber, 1977), the 1305- and 1317-cm⁻¹ bands in ribonuclease (Koenig and Frushour, 1972), the 1314-cm⁻¹ band in ovalbumin (Koenig and Frushour, 1972), and the 1279-cm⁻¹ band in concanavalin A. In a study of the Raman spectra of Bence-Jones proteins (Kitagawa et al., 1979), amide III bands were observed at 1242, 1262, and 1318 cm⁻¹ in the solid state and at 1245, 1265, and 1322 cm⁻¹ in aqueous solution for the type A protein. Since crystal-structure analysis of this protein (Epp et al., 1974) shows that it contains about 50% β-sheet structure and no α helix, the strong band at 1242-1245 cm⁻¹ was assigned to the β-sheet structure by Kitagawa et al. (1979). But there were no reasonable assignments for the weak bands at 1262-1265
FIG. 30. ORTEP drawing of CH3-CO-(L-Ala)3-NH-CH3 model of a γ turn. The CH3 groups of the L-Ala residues are represented by point masses. External hydrogen bonds are included (Bandekar and Krimm, 1985a).
and 1318-1322 cm⁻¹ on the basis of the three-state model. Since this protein has nine β turns, and since bands are predicted above 1300 cm⁻¹ for β-turn types I-III (Krimm and Bandekar, 1980), it was proposed (Bandekar and Krimm, 1980) that the bands at 1262-1265 and 1318-1322 cm⁻¹ of the Bence-Jones proteins are assignable to their β-turn component. These assignments are supported by the disappearance of these three bands on thermal denaturation and on N-deuteration (Kitagawa et al., 1979). It must be noted, however, that β-turn frequencies can occur in the region 1262-1265 cm⁻¹ if the conformation results in weak hydrogen bonds (Bandekar and Krimm, 1979a; Kawai and Fasman, 1978).
C. γ Turns
1. Standard Turn
a. Structure. The γ turn is formed by three amino acid residues, i, i + 1, and i + 2, and is characterized by the presence of two hydrogen bonds (see Fig. 30). The 3 → 1 hydrogen bond between the CO of residue i and the NH of residue i + 2 forms a C7 structure (Bragg et al., 1950). Since the early study in 1972 (Némethy and Printz, 1972), the γ-turn structure has been further refined using improved energy parameters (G. Némethy, personal communication), and three different energetically stable conformations have been proposed: (1) the γ turn (γ, a C7 structure); (2) the mirror-related γ turn (γM, a C7eq structure); and (3) the inverse γ turn (γI, a C7 structure). In γ and γM there is a second (1 → 3) hydrogen bond between the NH of residue i and the CO of residue i + 2; in γI this bond is between the CO of residue i - 1 and the NH of residue i + 3 (a 5 → 1 bond).
TABLE XXXI
Dihedral Angles for γ-Turn Structures of CH3-CO-(L-Ala)3-NH-CH3ᵃ

Angleᵇ values for each turn, in row order:
  γ turn:    180, -152, 90, -161, 58, -74, 178, -76, 149, -179
  γM turn:   180, -155, -53, 177, -81, 74, -179, -154, 158, -180
  γI turn:   180, 59, -177, 172, -77, 68, -176, -163, -55, 179

ᵃ All angles given in degrees.
ᵇ See Fig. 30 for designation of angles.
The model systems for the normal-mode calculations (Bandekar and Krimm, 1985a) were CH3CO-(Ala)n-NHCH3 (n = 3 and 5) with external hydrogen-bonded atoms. The dihedral angles used for the various γ-turn conformations of CH3-CO-(L-Ala)3-NH-CH3 are given in Table XXXI. The dihedral angles of the additional residues for the n = 5 γ-turn model were taken to correspond to those of the APPS [viz., (φ, ψ, ω)₀ = (φ, ψ, ω)₄ = -139°, 133°, 180°]. For both structures, in order to use the refined force fields, we used the bond lengths and bond angles of β-(Ala)n (Dwivedi and Krimm, 1982b), while retaining the above dihedral angles. The side-chain Ala CH3 groups were replaced by point masses at the β-carbon atom. The terminal CH3 groups were treated completely.
b. Vibrational Analysis. The force field used for these calculations was based on a recent refinement for β-(Ala)n, in the approximation of the side-chain CH3 group taken as an equivalent point mass (Dwivedi and Krimm, 1984b). Transition dipole coupling was incorporated for amide I and II modes, using Δμeff = 0.45 D for amide I and 0.279 D for amide II. Detailed frequencies and potential energy distributions for both structures are given in the original paper (Bandekar and Krimm, 1985a). In Table XXXII we present the results for the amide modes of the n = 3 molecule; we indicate in the discussion the changes when n = 5.

TABLE XXXII
Calculated Amide Frequenciesᵃ of CH3-CO-(L-Ala)3-NH-CH3 in γ-Turn Conformations

Entries are ν (cm⁻¹)ᵃ with the contributing groupᵇ in parentheses.

γ
  Amide I:    1684 (0+3), 1670 (1), 1655 (2), 1653 (3+0)
  Amide II:   1552 (0), 1529 (1+3), 1526 (3+1), 1509 (2)
  Amide III:  1390 (0+1), 1367 (1), 1336 (1), 1327 (3), 1297* (3+2), 1282 (1), 1242* (0), 1225* (3+2)
  Amide V:    709 (3), 706 (3), 676 (1), 655 (1+2), 608 (2), 602 (2), 548 (0), 517 (0), 493 (0)

γM
  Amide I:    1675 (0+3), 1668 (2), 1656 (1), 1654 (3+0)
  Amide II:   1551 (2+1+0), 1540 (0+2), 1527 (3), 1518 (1)
  Amide III:  1387 (0), 1352 (2), 1331* (1), 1310* (2+3+1), 1261* (0), 1254* (3), 1248* (0+1)
  Amide V:    729 (2), 719 (1), 707 (3), 677 (2), 643 (1), 570 (0), 562 (0)

γI
  Amide I:    1675 (0+3), 1667 (2+1), 1660 (1+2), 1649 (3+0)
  Amide II:   1546 (1+2), 1540 (1+2), 1512 (3), 1503 (0)
  Amide III:  1375 (1), 1371 (2), 1351 (3), 1346 (0), 1323 (1), 1308* (2+1), 1268 (3+1), 1243* (3+2), 1232* (0)
  Amide V:    718 (1), 712 (2+3), 706 (3), 698 (0+1), 644 (2)

ᵃ Only frequencies with an asterisk have a CN stretch contribution, in all cases appropriate to the peptide group except the 1268 cm⁻¹ band of γI, which has a CN(2) stretch contribution.
ᵇ The numbers refer to the peptide groups of Fig. 30. The designation 0 + 3 indicates that both groups contribute to the mode, that of 0 being larger.

The amide I modes of all γ turns are predicted in the range 1685-1650 cm⁻¹, although for n = 5 this range is narrowed to 1675-1655 cm⁻¹. It should be noted that the lower part of this range overlaps the characteristic α-helix frequency. The possibility of distinguishing between the different γ turns clearly depends on identifying the groups contributing to different frequencies, since modes associated with groups 1 and 2 at the turn have frequencies near 1668 and 1657 cm⁻¹ for all γ turns (for n = 3 and 5). As we have shown (Bandekar and Krimm, 1985a), if isotopic substitution can be used, a distinction is possible: A calculation for a molecule with 14CO(1) indicates that, from the location of the shifted band in the original sequence of amide I frequencies plus the magnitude of the shift, it should be possible to identify the type of turn. Incidentally, this is aided by the fact that the amide II(1)
modes are also shifted by a ¹⁴CO(1) substitution (since amide II always contains CN s). The situation for the amide II modes is complicated by the different mixing of NH ib for the different turns. Thus, the frequencies for groups 1 and 2 are 1529 and 1509 cm⁻¹ in γ and 1518 and 1551 cm⁻¹ in γM, but these modes mix equally in γI to give frequencies at 1546 and 1540 cm⁻¹. (For n = 5, differential changes occur for groups 1 and 2: 1527 and 1509 cm⁻¹ for γ, 1547 and 1544 cm⁻¹ for γM, and mixed modes at 1541 and 1558, 1525 cm⁻¹ for γI, respectively.) This again suggests that backbone isotopic substitution can help to elucidate the structural origin of the frequencies, and a calculation for an ¹⁵NH(2) structure (Bandekar and Krimm, 1985a) indeed shows this to be the case: Amide II(2) drops by 12 cm⁻¹ for γ, by 25 cm⁻¹ for γM, and the equal mixing for γI is altered to give purer modes at 1542(1) and 1540(2) cm⁻¹. Backbone isotopic substitution thus provides a new dimension of conformational analysis when combined with normal-mode calculations. As shown in Table XXXII, a relatively large number of modes in the region 1400-1200 cm⁻¹ contain a significant NH ib contribution. The frequency distributions of these modes differ between the different γ-turn structures, and the utility of this region for structure determination will depend on the intensities of the bands in the Raman and IR spectra. A similar comment is applicable to the region of NH ob contribution (below ~730 cm⁻¹), although some of the differences are larger than for NH ib (Bandekar and Krimm, 1985a). While calculations on canonical structures can provide useful guidelines for correlating frequencies with conformations, actual γ-turn structures will often have dihedral angles that differ from the standard values. It is therefore important to do the normal-mode analysis on the actual structure under consideration, taking appropriate account of the side chains involved.
2. γ Turns in Peptides
In order to gain confidence in the predictions for the canonical γ-turn structures, it is important to have a satisfactory vibrational analysis on a molecule with a known γ-turn structure. Such an analysis has been done for cyclo(D-Phe-L-Pro-Gly-D-Ala-L-Pro) (Bandekar and Krimm, 1985b), a cyclic peptide known from X-ray studies (Karle, 1981) to contain a γI turn. Its dihedral angles are (φ, ψ)1 = 135°, -69°; (φ, ψ)2 = -82°, 59°; (φ, ψ)3 = 81°, -126°. The normal-mode calculations were done on a structure with the prolyl rings included, the side chains approximated by point masses, and external hydrogen bonds included (see Fig. 31). In order to maintain
FIG. 31. Schematic illustration of cyclo(D-Phe-L-Pro-Gly-D-Ala-L-Pro). Peptide groups are numbered as for canonical γ turn (see Fig. 30), and external hydrogen bonds are shown (Bandekar and Krimm, 1985b).
standard bond lengths and angles (so that the force field could be transferred), the dihedral angles had to be modified slightly, in most cases by less than 5°, but in the cases of one backbone dihedral angle of Pro-3 and one of Pro-1 by +12° and +11°, respectively. This modified the internal hydrogen-bond lengths only slightly. The force field for the backbone was one refined for a point mass approximation (Dwivedi and Krimm, 1984b), while that for the prolyl ring was transferred from one for (Pro)nII (Johnston, 1975). The observed Raman and IR amide bands, together with the calculated normal-mode frequencies, are given in Table XXXIII. The observed amide I modes are very well reproduced, including the fact that the two lowest frequencies are associated with the Pro groups. Only groups 0, 2, and 4 are expected to have amide II modes, and the observed (N-deuteration-sensitive) bands are reasonably well accounted for. A large number of modes in the region 1400-1200 cm⁻¹ are predicted to have NH ib contributions and therefore to be sensitive to N-deuteration. While many of the observed bands are weak and hard to assign definitively, they are well predicted. However, bands at 1319, 1304, 1248, and 1233 cm⁻¹ clearly weaken on N-deuteration and their frequencies are calculated very well. The amide V modes are also predicted quite well, despite the overlap with Phe modes at 750 and 700 cm⁻¹ (using an internal standard band, it could be shown that their intensities decrease on N-deuteration). The good predictions for the cyclic molecule indicate that the predictions for the canonical structures should be reliable. As expected, the frequencies of the cyclic structure are different from those of the standard structure, a result of the cyclic nature of the former system, the differences in dihedral angles, and the presence of prolyl residues.

TABLE XXXIII
Observed and Calculated Amide Frequencies of cyclo(D-Phe-L-Pro-Gly-D-Ala-L-Pro)
Observed (a) bands in cm⁻¹; calculated ν in cm⁻¹ with the contributing group (b) in parentheses.

Amide I
  Raman:       1685 M, 1664 S, 1634 M, 1618 W
  Infrared:    1689 S, 1666 VS, 1638 MS, 1621 S
  Calculated:  1693 (2), 1667 (4+0), 1659 (0+4), 1640 (1+3), 1628 (3+1)
Amide II
  Raman:       -
  Infrared:    1563 sh, 1542 MW, 1522 MS
  Calculated:  1561 (4+2), 1549 (2+4), 1531 (0)
Amide III
  Raman:       1389 VW, 1340 W, 1327 W, 1303 W, -, 1278 M, 1273 M, 1252 S, -
  Infrared:    1386 sh, 1341 W, 1319 MW, 1304 MW, 1280 sh, 1272 MW, 1266 W, 1248 MW, 1233 W
  Calculated:  1386 (4), 1336 (2), 1313 (4), 1301 (0), 1293 (2), 1286 (0), 1272 (0), 1266 (4), 1249 (2), 1237 (4)
Amide V
  Raman:       747 W, 724 VW, 701 VW, 588 MW
  Infrared:    750 M, 724 W, 700 M, 653 sh, 593 MW
  Calculated:  756 (2+Phe), 731 (4), 655 (0), 593 (4)

(a) See footnote a, Table XXV, for explanation of symbols.
(b) Group numbers refer to the peptide groups in Fig. 31.
VI. CHARACTERISTICS OF POLYPEPTIDE CHAIN MODES
A. Introduction
The goal of a vibrational spectroscopic study of a polypeptide molecule is to derive structural information from spectral parameters, such as band frequencies, intensities, and polarizations. In the past, the frequencies of the amide modes were the main diagnostic quantities, with structural insights being obtained from correlational studies based on observed spectra of known polypeptide chain structures. It is apparent from the preceding discussions that we now have a rigorous basis for understanding the normal modes of a polypeptide chain. Instead of speculating on the meaning of differences in the spectrum, it is now possible to provide a detailed prediction of the effects of structural changes on the normal modes. It is appropriate at this stage to examine what such calculations tell us about general characteristics of vibrational frequencies of the polypeptide chain. We do this on the basis of the structures studied thus far by normal-mode analyses. The discussion of the peptide group modes in NMA (Section II,B,2) serves as a useful background for the present considerations.
B. Amide and Skeletal Modes of the Polypeptide Chain
1. NH Stretch Mode
Although NH s is a highly localized mode, and therefore not likely to be sensitive to chain conformation, its frequency depends strongly on the strength of the N-H···O=C hydrogen bond, and it can be expected that this will be a sensitive reflection of structure and its variations. The observed NH s band, normally seen between 3310 and 3270 cm⁻¹ (Krimm and Dwivedi, 1982a), represents a modification of ν_A^0 by at least two factors: TDC and Fermi resonance. The former must be expected since the dipole derivative for this mode is large (Cheam and Krimm, 1985); however, to date no splittings assignable to this effect have been observed, such as are seen for amide I and II modes. Fermi resonance (see Section II,E,4) results in an upward shift in frequency because of the interaction with the lower frequency overtone or combination band, ν_B^0, involving amide II modes. In Table XXXIV we give data on observed spectra that have been analyzed, and on the assignment of the combination that interacts with the fundamental. It is interesting that, in general for helical structures, the combination interacting with the NH s
TABLE XXXIV
Observed and Unperturbed NH Stretch Frequencies and Combinations (a) and Their Assignments

Structure      ν_A    ν_B    ν_A^0   ν_B^0   Combination (b)        Ref. (c)
(Gly)n I       3300   3080   3272    3108    1517 + 1602 = 3119     1
(Gly)n II      3278   3086   3257    3108    2 × 1550 = 3100        2
β-(Ala)n       3280   3072   3242    3109    1524 + 1592 = 3116     1
α-(Ala)n       3307   3058   3279    3086    2 × 1545 = 3090        1
β-(GluCa)n     3275   3088   3230    3133    1552 + 1576 = 3128     3
3₁₀-(Aib)n     3272   3060   3261    3070    2 × 1545 = 3090        4
                      3030           3043    2 × 1531 = 3062

(a) In cm⁻¹.
(b) Italicized frequencies are observed IR bands; others are calculated values.
(c) 1, Krimm and Dwivedi (1982a); 2, Dwivedi and Krimm (1982c); 3, Sengupta et al. (1984); 4, Dwivedi et al. (1984).
fundamental is an overtone of an amide II mode, whereas for β-sheet structures it is a combination of amide II modes of different species. Having data on an experimentally determined NH s fundamental enables us to ask how well this frequency correlates with the geometry of the hydrogen bond. [It should be noted that the varying anharmonicities in the combination bands, i.e., the differences between ν_B^0 and the combination frequency, suggest that ν_A^0 cannot be obtained accurately from ν_A + ν_B − 2ν_II (Fraser and MacRae, 1973, p. 205).] Incidentally, the frequency depends on both F(NH) and F(H···O), and a correlation of these force constants with hydrogen-bond geometry is therefore a more fundamental relation; however, both force constants have the expected dependences on the strengths of the hydrogen bonds [viz., 5.674 and 0.150 versus 5.830 and 0.120 for β-(Ala)n and α-(Ala)n, respectively]. Previous authors (Nakamoto et al., 1955; Pimentel and Sederholm, 1956) have shown that there is a dependence of ν_A or Δν ≡ ν_free − ν_A on l(N···O), and this relationship has been used to relate the NH s frequency with l(N···O) in the α helix (Fraser and MacRae, 1973, p. 205). This correlation suffers from several deficiencies. First, the original relationship was based on ν_A rather than ν_A^0 values (Pimentel and Sederholm, 1956). Second, this relationship was derived from data on a broad range of crystalline compounds having N-H···O=C hydrogen bonds, without regard to the influences of crystal packing forces and hydrogen-bond geometry other than l(N···O). Third, it is necessary to know the frequency of the free NH group, and it is not clear whether the molecule chosen to provide this (C2H5CONHC2H5) is necessarily the appropriate representative of a polypeptide chain. It is, therefore, better to correlate l(N···O) with ν_A^0, recognizing that this still does not provide a complete dependence on hydrogen-bond geometry (Cheam and Krimm, 1986). In Fig. 32 we show a plot of l(N···O) versus ν_A^0 for the non-glycine-containing polypeptides listed in Table XXXIV. [The data for (Gly)nI and (Gly)nII depart significantly from the curve of Fig. 32, the reason for which is not clear.] The hydrogen bonds involved all have the HNO angle in the range of about 3-10° and the NHO angle in the range of about 165-175°. Under these conditions, and in the range of l = 2.70-2.90 Å, it appears that the relationship is relatively linear. While this curve coincides at the α-(Ala)n point with one given previously (Fraser and MacRae, 1973, p. 204), the latter deviates significantly at the lower end [e.g., predicting ν_A^0 = 3180 cm⁻¹ for l(N···O) = 2.70 Å]. The relationship in Fig. 32 should permit the determination of l(N···O) from ν_A^0 in the region of about 3200-3300 cm⁻¹, where the variation corresponds to 0.0035 Å/cm⁻¹.

FIG. 32. Relationship between N···O distance [l(N···O)] in hydrogen bond and unperturbed NH stretch frequency (ν_A^0, in cm⁻¹). From lower left, points correspond to β-(GluCa)n, β-(Ala)n, 3₁₀-(Aib)n, and α-(Ala)n.
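The Fermi resonance bookkeeping of Table XXXIV and the linear relation of Fig. 32 can be checked numerically. The following sketch is illustrative only. It assumes a simple two-level picture, in which the sum of the two perturbed levels equals the sum of the two unperturbed levels, and it anchors the linear relation at the α-(Ala)n point, taking l(N···O) = 2.86 Å from Table XXXIX. The function names and the choice of anchor point are assumptions introduced here and are not part of the original analysis.

```python
# Illustrative check of Table XXXIV and Fig. 32 (all names are hypothetical).
# Each entry: (nu_A, nu_B, nu_A0, nu_B0) in cm^-1, transcribed from Table XXXIV.
# 3_10-(Aib)n is omitted because it involves two combination levels.
TABLE_XXXIV = {
    "(Gly)n I": (3300, 3080, 3272, 3108),
    "(Gly)n II": (3278, 3086, 3257, 3108),
    "beta-(Ala)n": (3280, 3072, 3242, 3109),
    "alpha-(Ala)n": (3307, 3058, 3279, 3086),
    "beta-(GluCa)n": (3275, 3088, 3230, 3133),
}


def sum_rule_residual(nu_a, nu_b, nu_a0, nu_b0):
    """Return (nu_A + nu_B) - (nu_A0 + nu_B0).

    For an ideal two-level Fermi resonance this residual is zero.
    """
    return (nu_a + nu_b) - (nu_a0 + nu_b0)


# Assumed anchor for the linear region of Fig. 32: the alpha-(Ala)n point.
SLOPE_A_PER_CM = 0.0035  # Angstrom per cm^-1, as quoted in the text
REF_NU_A0 = 3279.0       # cm^-1, alpha-(Ala)n, Table XXXIV
REF_L_NO = 2.86          # Angstrom, alpha-(Ala)n, Table XXXIX


def estimate_l_no(nu_a0):
    """Estimate l(N...O) in Angstrom from nu_A0 (cm^-1).

    Only meaningful in the roughly linear region of about 3200-3300 cm^-1.
    """
    if not 3200.0 <= nu_a0 <= 3300.0:
        raise ValueError("nu_A0 is outside the roughly linear region")
    return REF_L_NO + SLOPE_A_PER_CM * (nu_a0 - REF_NU_A0)


if __name__ == "__main__":
    for name, levels in TABLE_XXXIV.items():
        print(f"{name:14s} residual = {sum_rule_residual(*levels):+d} cm^-1")
    print(f"beta-(Ala)n: l(N...O) ~ {estimate_l_no(3242):.2f} A")
```

With the tabulated values the residuals do not exceed 1 cm⁻¹ in magnitude, and the β-(Ala)n estimate comes out near 2.73 Å. A different anchor point would shift every estimate by a constant, so the result should be read as a consistency check rather than as a calibrated curve.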
2. Amide I Mode
The amide I mode in polypeptides is still predominantly CO s plus CN s, but it can also contain significant contributions from CαCN d and minor contributions from CαC s, CNCα d, Hα b, and NH ib. (The presence of the latter is mainly responsible for the downshifts in the amide I
frequency on N-deuteration.) Therefore, we cannot expect even an "unperturbed" frequency (defined, for example, as an average of the frequencies calculated without TDC) to necessarily reflect the strength of the hydrogen bond, as was true for the NH s frequency. In fact, although the F(CO) force constants of β-(Ala)n and α-(Ala)n of 9.822 and 10.029, respectively, are in the proper order of the respective hydrogen-bond strengths, the "unperturbed" frequencies of 1670 and 1662 cm⁻¹, respectively, are not, demonstrating this point. This is probably related to the fact that amide I in β-(Ala)n is on average a CO s (74), CN s (19) mode, whereas in α-(Ala)n it is essentially a CO s (82), CN s (11), CαCN d (10) mode. This eigenvector difference probably also accounts for the difference in intensities between the amide I modes of β-sheet (Chirgadze et al., 1973) and α-helix (Chirgadze and Brazhnikov, 1974) structures. The main perturbing influence on the amide I mode is TDC, although Fermi resonance interactions can occur in special cases (Dellapiane et al., 1980). For the structures analyzed thus far by normal-mode analyses, the calculated frequencies obtained with such coupling and their observed counterparts are collected together in Table XXXV. While there are common special features within structural groupings, small observed differences result from real structural differences, showing that generalized perturbation treatments (Miyazawa and Blout, 1961; Krimm, 1962) cannot provide a correct description of the actual situation. In the case of the extended chain structures, the smaller splitting for (Gly)nI (1685 - 1636 = 49 cm⁻¹) than for β-(Ala)n (62 cm⁻¹) and β-(GluH)n (69 cm⁻¹) results from different TDC interactions in the APRS as compared to the APPS structures. This also accounts for the higher frequency of the strong Raman band in (Gly)nI at 1674 cm⁻¹, which in APPS structures is generally found close to 1669 cm⁻¹ (Frushour et al., 1976).
The slightly lower value in β-(GluH)n (1665 cm⁻¹) as compared to β-(Ala)n (1669 cm⁻¹) is probably related to the slightly stronger hydrogen bond in the former (Sengupta et al., 1984), as is the lower value of the strong IR band (1624 versus 1632 cm⁻¹). (No allowance was made for this difference in hydrogen-bond strength in the calculated frequencies.) For the helical structures, it is interesting that, despite the significantly different conformations, the strong Raman band is found near 1652 cm⁻¹ for all structures. Small frequency differences may again reflect real differences in hydrogen-bond strengths: The ν_A^0 value for α-(GluH)n is between those of α-(Ala)n and 3₁₀-(Aib)n (Sengupta and Krimm, 1985), and the mean values of Raman and IR modes are also in this range.
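The amide I splittings quoted above for the extended chain structures are simple differences between the highest and lowest observed amide I components. The short sketch below reproduces that arithmetic. The (Gly)nI pair is stated explicitly in the text; the upper components used for the two APPS structures (1694 and 1693 cm⁻¹) are those implied by the quoted splittings and strong IR bands, and the dictionary and function names are hypothetical.

```python
# Highest and lowest observed amide I components, in cm^-1.
# Keys and names are illustrative; the upper APPS values are implied by the
# splittings (62 and 69 cm^-1) and strong IR bands (1632 and 1624 cm^-1)
# quoted in the text.
AMIDE_I_EXTREMES = {
    "(Gly)n I (APRS)": (1685, 1636),
    "beta-(Ala)n (APPS)": (1694, 1632),
    "beta-(GluH)n (APPS)": (1693, 1624),
}


def amide_i_splitting(high, low):
    """Return the amide I splitting (cm^-1) between the extreme components."""
    return high - low


SPLITTINGS = {
    name: amide_i_splitting(high, low)
    for name, (high, low) in AMIDE_I_EXTREMES.items()
}

if __name__ == "__main__":
    for name, value in SPLITTINGS.items():
        print(f"{name:22s} {value} cm^-1")
```

Tabulating the splittings side by side makes the APRS versus APPS contrast easy to see, but it says nothing about its origin, which in the text is attributed to the different TDC interactions.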
TABLE XXXV Observed (a) and Calculated Frequencies (b) of Backbone Modes of Polypeptides
Gly.1 Mode Amide I
Obs.
Calc.
1685
1689 1677 1643 1515 1514
1674
Amide I1
1636 1517 (1515)
Amide 111
1410
(1408) (1295) (1220) (1214) Skeletal Amide V
1162 884
708
1415 1415 1286 1213 1212 1153 890 718
P-(Ala),, Obs.
a-(Ala),,
a-(GluH),
310-(Aib),~
Calc.
Obs.
Calc.
Obs.
Calc.
Obs.
Calc.
Obs.
Calc.
(1694) 1695 1669 1670 1632 1630 (1555) 1562 (1538) 1539 1524 1528 (1399) 1402 (1402) 1399 (1333) 1332 1243 1236 1224 1231
(1693) 1665 1624 (1597) (1568) 1560 1260 (1225) 1223
1692 1668 1630 1607 1576 1550 1249 1221 1222
1654
1654 1645
1658
1652
1657 1655
1656
1655
1657 1655
1653
1640
1647
1665 1661
(1560) 1550
1565 1551
1545 (1516)
1538 1519
1550 (1510)
1537 1517
(1531)
1380 (1333) 1283
1382 1344 1290
(1338) (1278)
1345 1287 1278 1262
1326 1299 (1283) 1287
913 706 704
956 (705)
944 713
884
889 738 669
908
706 698
a
(Gly).II
Obs.
909
Calc.
P-(GluCa).
740 673
1270 1265
658 618
910 660 608
134@ 1296
924
(670) (618)
922 678 626
1545
(1339) (1313)
1480
908
694 (680)
(a) Raman bands, italic; IR and Raman bands, bold; IR bands, regular type. Intensities weaker than medium are in parentheses. (b) In cm⁻¹. Obs., observed; calc., calculated. (c) Overlapped mode.
1547 1533 1346 1312 1287
905 701 676
3. Amide II Mode
The amide II mode is predominantly NH ib plus CN s, but it always has a significant contribution from CαC s and smaller contributions from CO ib and NCα s (see Table XXXVI). It is always strong in the IR spectrum and weak or absent in the Raman. On N-deuteration, it disappears from the IR spectrum, with ND ib contributing to and mixing with other modes in the region 1070-900 cm⁻¹ (and appearing in the spectra usually in the region 1040-940 cm⁻¹), and CN s moving to the region 1490-1460 cm⁻¹, where it mixes with CαC s and CO ib to give modes that are usually observed in the region 1480-1465 cm⁻¹. As we have noted before, amide II is perturbed by TDC, and a collection of observed and calculated frequencies is given in Table XXXV. For the extended chain structures, the significant frequency differences seem to result from the influence of the structure on the form of the normal vibration. Thus, for the strong IR bands at 1517, 1524, and 1560 cm⁻¹ in (Gly)nI, β-(Ala)n, and β-(GluCa)n, respectively, there is a definite trend of increasing frequency with increasing relative contribution from NH ib. This contribution of course depends on the relative force constants in the cases of (Gly)nI and β-(Ala)n, and the relative structures, particularly in the influence of the side chains, in the cases of β-(Ala)n and β-(GluCa)n. The fact that the observed frequencies are well reproduced suggests that these factors are properly accounted for. For the helical structures, it is interesting that the strong IR band is found consistently at 1550-1545 cm⁻¹, despite the large differences in structures. Again, there seems to be a correlation with the relative contribution of NH ib in the bands of α-(Ala)n, 3₁₀-(Aib)n, and (Gly)nII at 1545, 1545, and 1550 cm⁻¹, respectively. [The higher frequency for α-(GluH)n compared to α-(Ala)n, even though the PEDs are the same, is due to the slightly stronger hydrogen bond in the former α-helix structure (Sengupta and Krimm, 1985).]
If these frequencies are plotted together with those of the extended chain structures as a function of the NH ib contribution, there is an essentially linear relation between the two quantities. Interestingly, this relation extrapolates to near 1460 cm⁻¹ when the NH ib contribution is zero, roughly where the CN s mode is found for an N-deuterated system!
4. Amide III Mode
Although the so-called amide III mode has been described as the localized counterpart to amide II (Fraser and Price, 1952), and, as we have seen, contains NH ib and CN s in NMA, the situation is in fact much more complex for the polypeptide chain. The main point is that
TABLE XXXVI
Potential Energy Distributions (a) for Observed (b) Strong Amide II Modes

Coordinates: NH ib, CN s, CαC s, CO ib, NCα s
Entries for each mode, in source order:
35 28 17 14
41 26 13 14
58 18 8
55 21 11 8 6
46 33 10 11 6
46 33 10 11 6
47 29 11 11 6

(a) Contributions ≥ 5.
(b) In cm⁻¹.
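The near-linear relation between the strong amide II frequency and the NH ib contribution, described in the amide II discussion, can be explored with a least-squares line. The sketch below is a rough check under stated assumptions, not a reproduction of the original plot: the pairing of each NH ib percentage in Table XXXVI with a structure and frequency is inferred here from the order in which the text lists them, α-(GluH)n is left out because the text attributes its higher frequency to a stronger hydrogen bond, and the names `POINTS` and `linear_fit` are hypothetical.

```python
# (NH ib contribution in %, observed strong amide II frequency in cm^-1).
# ASSUMPTION: the pairing of Table XXXVI entries with structures is inferred
# from the order used in the text; it is not stated in the table as extracted.
POINTS = [
    (35, 1517),  # (Gly)n I
    (41, 1524),  # beta-(Ala)n
    (58, 1560),  # beta-(GluCa)n
    (55, 1550),  # (Gly)n II
    (46, 1545),  # alpha-(Ala)n
    (47, 1545),  # 3_10-(Aib)n
]


def linear_fit(points):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = linear_fit(POINTS)

if __name__ == "__main__":
    print(f"slope     = {slope:.2f} cm^-1 per % NH ib")
    print(f"intercept = {intercept:.0f} cm^-1 at zero NH ib contribution")
```

With these six points the intercept falls in the mid-1450s cm⁻¹, close to the value near 1460 cm⁻¹ quoted for zero NH ib contribution. Six points spanning a narrow range constrain the line only loosely, so the agreement is suggestive rather than conclusive.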
NH ib is a significant component of a number of modes in the 1400- to 1200-cm⁻¹ region, mixing differently in different parts of this region and as a function of the side-chain structure. As has been noted (Hsu et al., 1976), it is therefore not possible to expect a simple general relationship between such a frequency and the backbone conformation (Lord, 1977). In NMA, the main contributions (≥10%) to the PED of this mode, in addition to NH ib, come from CN s, CαC s, and CO ib (Rey-Lafon et al., 1973). Additional contributions are also made by NCα s and CO s, and the dipole moment derivative, and therefore the intensity of this mode and the orientation of ∂μ/∂Q, depends strongly on the details of the force field (Cheam and Krimm, 1985). However, in a polypeptide chain other coordinates can make major contributions, and these are influenced by the side chain. In addition, if the main criterion for an amide III mode is a significant contribution from NH ib (since CN s is certainly not a common feature), and therefore a sensitivity to N-deuteration, we must broaden our outlook to include all modes in the 1400- to 1200-cm⁻¹ region. Such observed bands, and their calculated counterparts, are given in Table XXXV, and PEDs for nonweak observed bands are given in Table XXXVII. It is clear from these results that the nature of the amide III mode depends very much on whether or not the side-chain structure involves a Cα-Hα group. For the Gly polypeptides, CH2 modes mix extensively with NH ib. For 3₁₀-(Aib)n, there is some mixing with side-chain modes. In the cases of side chains with Cα-Hα groups, the predominant contribution is from Hα b2, in which Hα moves essentially perpendicular to the HαCαCβ plane (Hα b1 being the in-plane mode). It is clear that this is the most important coupling to NH ib, as can be seen from the observed bands of β-(Ala)n and α-(Ala)n. It is interesting that Hα b1 makes large contributions in the case of the α helix and none in the APPS structure.
This is probably due to the fact that in the extended chain structure the HαCαCβ plane is more nearly parallel to the NH bond, making the NH ib and Hα b2 coordinates nearly parallel and therefore strongly coupled through the CN bond, whereas in the α helix the angle deviates more from parallelism, and maximum coupling requires contributions from Hα b1 as well as Hα b2. This would suggest that there may be a sensitivity of NH ib modes to the backbone φ angle, rather than to ψ (Lord, 1977), and in fact the lowest frequency NH ib mode correlates somewhat better with φ than ψ. However, such relations must presently be viewed with caution, since the lowest frequency NH ib mode is not necessarily the one containing the largest contribution from this coordinate. Certainly, associating characteristic frequency ranges
TABLE XXXVII Potential Energy Distributions (a) for Observed (b) Nonweak Amide III Modes
Coordinate NH ib CN s CH2 w CHZ b CHp t W NC" s (2°C s HI b2 H" bl CSH, W t CYHp W t
1410
1162
1243
1224
1260
1223
1380
1283
1270
1265
1296
1280
14
12
19 13
18 11
11 15
11
13
28 6
40 9
15
10 6
41 31
5
13 12 23
8 8 24
20 24
58
29 7
50 13
19 34
24 28
17 23
10 7 22
6 14
16 15
15
14 8 15 11
28
23 13 14
ccp w
CHS r CO ib C"CS s C"CN d CSH, r
6
7 9
6 8
(a) Contributions ≥ 5.
(b) In cm⁻¹.
9
9
TABLE XXXVIII Potential Energy Distributions (a) for Observed (b) Skeletal Stretching Modes
C"C s CN s co s CNC" d C"CN d NC"C d CHs r CHs r CK7 s C7C6 s CO ib
31 24 12 8
15 14 11 13 8 7
17 18 11
6 23 10 16
9 27 9 12
5
12
13 9 22 11 6
21 14
11
9 10
12
cc*ss a
18 13 11 9
8
11
7
(a) Contributions ≥ 5. (b) In cm⁻¹.
with conformation can be dangerous when we observe that bands of comparable intensities are found in comparable regions for β-(GluCa)n (1260 cm⁻¹) and α-(Ala)n (1265 cm⁻¹).
5. Skeletal Stretch Mode
In addition to the amide modes, all polypeptide chain conformations appear to have a characteristic skeletal stretching mode that is of relatively common origin and that gives rise to a strong Raman band, generally in the region 960-880 cm⁻¹. (The counterpart skeletal stretch mode, found near 1100 cm⁻¹ in NMA, does not show up as a characteristic band in polypeptides; rather, its NCα s contribution is distributed broadly in the region 1180-920 cm⁻¹, depending on the side-chain composition.) It is important to determine whether this mode has any sensitivity to conformation. The observed and calculated frequencies of this mode are given in Table XXXV, and the PEDs for observed bands are given in Table XXXVIII. In (Gly)nI and (Gly)nII this skeletal stretch frequency is observed at 884 cm⁻¹, and the calculated modes at ~890 cm⁻¹ have quite different PEDs. Thus, despite different structures and different eigenvectors, the frequencies are the same, and apparently insensitive to conformation. The same seems to be true of (Ala)n: for β-(Ala)n, the observed band is at 909 cm⁻¹ (calculated at 913 cm⁻¹), and for α-(Ala)n the observed band is at 908 cm⁻¹ (calculated at 910 cm⁻¹). Even 3₁₀-(Aib)n has an observed
band at 908 cm⁻¹ (calculated at 905 cm⁻¹). The only coordinate in common is CN s, and it seems as if the frequency of this mode is determined only by the side-chain composition, being 884 cm⁻¹ for Gly and 908 cm⁻¹ for Ala (or its related Aib), and is independent of main-chain conformation. This is both confirmed and modified by the results on (Glu)n. For α-(Glu)n this mode is found at 924 cm⁻¹ (calculated at 922 cm⁻¹), an increase from α-(Ala)n that seems to indicate a dependence on the number of carbon atoms in the side chain. [This trend is continued in α-(Lys)n, where the comparable band is observed at 945 cm⁻¹ (Chen and Lord, 1974; Yu et al., 1973).] However, the frequency for β-(Glu)n is not the same as for α-(Glu)n [as was true of α-(Ala)n], being observed at 956 cm⁻¹ (calculated at 944 cm⁻¹). [The comparable frequency of β-(Lys)n is 1002 cm⁻¹ (Yu et al., 1973; Frushour and Koenig, 1975b).] These results would seem to indicate that for side chains longer than CH3 the frequency of this skeletal stretch mode may depend on main-chain conformation. Calculations on (Glu)n (Sengupta and Krimm, 1987) bear this out. Sengupta and Krimm (1987) calculated the normal modes of (Glu)n in conformations varying from the extended APPS structure (a 2₁ helix) through 2.4₁- and 3₁-helix structures to the α-helix conformation (a 3.6₁ helix). For these calculations they used several force fields, based on those for β-(Ala)n, (Gly)nII, and α-(Ala)n. Although the frequency values depend on force field, it was found that the frequency of this skeletal stretch mode varied essentially linearly with the backbone φ angle. It would appear that, for side chains longer than CH3, there is mixing of backbone and side-chain stretching motions that depends on the "extension" of the backbone. It remains to be seen whether a linear relation with φ is valid for all side chains, but it seems clear that the frequency of this mode can be an indicator of backbone conformation.
6. Amide V Mode
The amide V mode in NMA consists of CN t plus NH ob, although CO ob can make a small contribution (Rey-Lafon et al., 1973). In the polypeptide chain, CN t and NH ob are also the main components but other coordinates contribute significantly. Thus, the frequency of this mode depends not only on the strength of the hydrogen bond (Miyazawa, 1962), but also on the side-chain structure. We present in Table XXXIX the PEDs, together with their N···O and H···O bond lengths, for observed amide V modes of structures for which the normal-mode calculations have been done.
TABLE XXXIX Potential Energy Distributions (a) for Observed (b) Amide V Modes
(G1Y)"I
P-(AW,,
Coordinate
708
I(N ... O)d l(H ... O)d
2.91 2.12
CNt NH ob NH ... 0 ib CO ob CO ib NCaC d CaCN d C"C s NC" s NH t NCa t Ha bl H ... 0 s C8CyC8 d C'CS s CCpr
75
44-749
16 19 5
41-28K 20
706
698<
18h
705<
(Gly),tII 740
2.69 1.70
2.73 1.75 48 41 19 13 5
42 31 10
673 2.69' 1.74'
49 16 20
5
a-(Ala),4 658
77-88C 29 10h 1O h
11
37 21 7
7 8 12 7 9
5 9
618
a-(GluH), 670'
2.86 1.88
8
618'
2.86' 1.88'
47 23 7 15 12 7 6 8
19 11
10
3dAib). 694
680
2.83 1.83 42 17 6
54 30 11
11
8 9 12 5
37 40 10 27
5
12 5
10
5
7
10
(a) Contributions ≥ 5. (b) In cm⁻¹. (c) Weak or very weak bands. (d) In Å. (e) Average value. (f) Assumed equal to α-(Ala)n. (g) Two possible assigned modes. (h) For one mode.
B-(GluCa),8
6 5 13
As can be seen by comparing (Gly)nI and β-(Ala)n, there is no simple dependence of the amide V frequency on hydrogen-bond strength alone: (Gly)nI has the weaker hydrogen bond and therefore should have the lower frequency, but the opposite is true. A similar result holds in comparing β-(Ala)n with β-(GluCa)n. It seems that for β-sheet structures an amide V mode is found near 700 cm⁻¹, independent of the side chain or the strength of the hydrogen bond. However, for β-(Ala)n and α-(Ala)n, which have the same side chain, the frequencies are correlated as expected with hydrogen-bond strengths. This also seems to be true of α-(GluH)n, for which there is evidence (Sengupta and Krimm, 1985) that the hydrogen bond is slightly stronger than that in α-(Ala)n. And this correlation again holds true for the comparison between (Gly)nII and (Gly)nI. There is thus no obvious correlation between the frequency and any aspect of the PED. It is likely that a combination of main-chain conformation, hydrogen-bond strength, and side-chain structure is involved in determining the frequency of the amide V mode.
7. Other Amide Modes and Skeletal Deformations
For NMA it made sense to define CO ib (amide IV), CO ob (amide VI), CN t (amide VII), and CαCN d and CNCα d as characteristic modes. Such a classification is too simplistic for a polypeptide chain, where many of these modes, together with NCαC d and Cβ b, mix strongly and very differently depending on the main-chain conformation and the side-chain structure. For example, bands to which CO ib makes the major contribution (with the next-highest contributing coordinate given in parentheses) occur at 628 (CO ob) and 614 (CαCN d) cm⁻¹ in (Gly)nI, at ~703 (NCαC d) and 313 (CNCα d) cm⁻¹ in (Gly)nII, at 629 (CO ob) and 300 (Cβ b) cm⁻¹ in β-(Ala)n, and at ~528 (CαCN d) cm⁻¹ in α-(Ala)n. It might seem that a band near 628 cm⁻¹ is characteristic of β-sheet structures, but a 625-cm⁻¹ band observed in β-(GluCa)n (Sengupta et al., 1984) is mainly CαCN d with no CO ib contribution. And the comparable band in α-(GluH)n is found at ~565 cm⁻¹ (Sengupta and Krimm, 1985), with CαCN d predominating and CO ib being only the third largest contributor to the PED. Similarly, bands to which CO ob makes the major contribution (with the next-highest contributing coordinate in parentheses) are found at 589 (CO ib) cm⁻¹ in (Gly)nI; at 578 (CO ib) and ~570 (NH ob) cm⁻¹ in (Gly)nII; not at all for β-(Ala)n; and at 774 (Cβ b), 756 (CN t), and 375 (NH ob) cm⁻¹ in α-(Ala)n. In contrast, a strong band is found in β-(GluCa)n at 653 cm⁻¹ (CO ib) (Sengupta et al., 1984), and no well-defined band is found in α-(GluH)n (Sengupta and Krimm, 1985) comparable to any of the three bands of α-(Ala)n.
These examples make it clear that we cannot expect to find characteristic bands for the CO and skeletal deformation modes in the polypeptide chain. These coordinates contribute to normal modes in ways specific to the structures in question. In fact, this means that this spectral region should be particularly sensitive to conformation, and that normal-mode analysis should provide a useful approach to the determination of structural differences.
VII. VIBRATIONAL SPECTROSCOPY OF PROTEINS
A. Introduction
In the spirit of the preceding discussions, it would be most appropriate to approach the study of the vibrational spectrum of a protein in the context of a normal-mode analysis of the molecule. In fact, despite the progress achieved on small peptides and regular polypeptide chain structures, such analyses on globular proteins are only in the beginning stages. The reasons have been twofold. First, a satisfactory force field for the polypeptide chain had not been available. This is no longer the case: the force field developed for the side-chain point-mass approximation (Dwivedi and Krimm, 1984b) should provide an appropriate basis for correlating the observed spectrum with the polypeptide chain conformation. Second, a normal-mode calculation on a very large nonregular molecule presents formidable computational problems. These are also being overcome, in terms of special computer programs adapted to the calculation of a large polypeptide chain (Tasumi et al., 1982) as well as the availability of high-speed computers and supercomputers. It can be expected that such calculations on globular protein molecules will be routine in the not-too-distant future. We discuss below the start that has been made in this direction. Even with the ability to calculate vibrational frequencies of a given structure of a protein molecule, still other problems will have to be resolved. We saw in Section II,B that a molecule with N atoms has, in general, 3N - 6 normal modes of vibration.
For a regular repeating structure, such as a helix, the number of interest to us does not increase indefinitely with the length of the structure, since the IR- and Raman-active modes of an infinite structure consist of only three sets of phase-related values for the modes of a single chemical repeat unit (see Section II,B,3). No such restriction exists for an arbitrary protein molecule, and we must therefore expect all 3N - 6 modes to be potentially active. It therefore becomes very important to have some idea as to which modes can be expected to be strong and which weak, i.e., to be able to predict
spectral intensities. An important step has been taken in this direction by the demonstration that reliable IR intensities can be obtained from ab initio calculations of $\partial\mu/\partial S_i$ (Cheam and Krimm, 1985), and this will be considered further below. Another problem that must be faced if we concentrate only on polypeptide chain vibrations is the contribution of side-chain modes to the spectrum. At the very least, we must be able to identify characteristic frequencies associated with side-chain groups and S-S bridges. We first review below what is known about these structures.
B. Side-Chain and S-S Modes
Amino acid side chains, particularly those with aromatic groups, exhibit characteristic frequencies that often are very useful in probing the local environment of the group in the protein (Spiro and Gaber, 1977). From our present point of view, however, we are interested in characterizing spectral features of backbone chain conformation. It is therefore important to know the locations of such bands so that their contribution to the spectrum is not confused with amide and backbone vibrations. We discuss below some features (in the nonstretching region) of such side-chain modes; these are summarized in Table XL.
1. Characteristic Side-Chain Residue Group Frequencies
For side chains composed of aliphatic groups, such as alanine, valine, leucine, and isoleucine, the most prominent characteristic frequencies will be those associated with the CH2 and CH3 groups. Thus, the CH2 b frequency is generally found with reasonable intensity at 1465 ± 20 cm⁻¹ in both Raman and IR spectra, and is a relatively localized group mode (Bellamy, 1975). The same is not true of other CH2 modes, which tend to mix with local backbone modes and thus have frequencies that depend on local structure. The CH3 ab mode is found in Raman and IR spectra at 1450 ± 20 cm⁻¹. The CH3 sb mode shows up as a characteristic IR band at 1375 ± 5 cm⁻¹ for a single group, and as a doublet at 1395-1385 and 1365-1360 cm⁻¹ when two CH3 groups are bonded to the same C atom [cf. (Aib)n]. The hydroxyl group of serine and threonine should have a characteristic deformation mode, probably in the region 1350-1250 cm⁻¹ in the IR (Bellamy, 1975). In β-(Ser)n, it has been assigned to a Raman band at 1399 cm⁻¹ (Koenig and Sutton, 1971). Since this mode is likely to be mixed with other backbone vibrations, it is probably a poor group frequency. The COOH groups of aspartic and glutamic acids have characteristic frequencies depending on the state of ionization. In the un-ionized state,
TABLE XL
Characteristic Side-Chain Frequencies

                               Frequencies (cm⁻¹)
Residue             Group      Raman                          Infrared                Assignment (a)  Reference (b)
Ala, Val, Leu, Ile  CH2        ~1465                          1465                    CH2 b           1
                    CH3        ~1450                          ~1450                   CH3 ab          1
                                                              ~1375                   CH3 sb          1
Ser, Thr            OH                                        1350-1250               OH d            1
Asp, Glu            COOH       1720                           1720                    CO s            2
                    COO⁻       -                              1560                    CO2⁻ as         3
                               ~1425                          ~1415                   CO2⁻ ss         3
Asn, Gln            CONH2      ~1650                          ~1650                   CO s            1
                               ~1615                          ~1615                   NH2 b           4
Lys                 NH3⁺       1640-1600                      1640-1610               NH3⁺ d          1
                               1550-1480                      1550-1485               NH3⁺ d          1
                               ~1160                          ~1160                   NH3⁺ r          5
                               ~1100                          ~1100                   NH3⁺ r          5
His                 Imidazole  1491                                                   Ring            6, 7
                               1409 (D2O)                                             Ring            6, 7
Phe                 Phenyl     1605, 1585, 1207, 1006, 622    1602, ~1450, 760, 700   Ring            6, 7, 8
Tyr                 Phenyl     ~1600, 850, 830                1600, ~1590, ~1450      Ring            6, 7
Trp                 Indole     1582, 1553, 1363, 1338, 1014,                          Ring            6, 7
                               879, 761, 577, 544
Cys                 S-S        540                                                    TGT             9
                               525                                                    GGT             9
                               510                                                    GGG             9
                    C-S        745-700                                                T               9
                               670-630                                                G               9

(a) s, stretch; as, antisymmetric stretch; ss, symmetric stretch; b, bend; ab, antisymmetric bend; sb, symmetric bend; d, deformation; r, rock; T, trans; G, gauche.
(b) 1, Bellamy (1975); 2, Sengupta and Krimm (1985); 3, Sengupta et al. (1984); 4, Naik and Krimm (1984a); 5, Lagant et al. (1983); 6, Lord and Yu (1970a); 7, Lord and Yu (1970b); 8, Krimm (1960); 9, Sugeta et al. (1972).
the CO s mode is generally expected in the region 1725-1700 cm⁻¹ (Bellamy, 1975), and is found in α-(Glu)n in the Raman and IR spectra at 1720 cm⁻¹. The ionized form has two stretching frequencies: CO2⁻ as, expected in the region 1610-1550 cm⁻¹ and found in β-(GluCa)n as a strong IR band at 1560 cm⁻¹, and CO2⁻ ss, expected in the region 1420-1300 cm⁻¹ and found in β-(GluCa)n in the Raman spectrum at ~1425 cm⁻¹ and in the IR at ~1415 cm⁻¹. When the carboxyl group is converted to an amide, as in asparagine and glutamine, the CO s frequency is at ~1650 cm⁻¹ (Bellamy, 1975), and seems to be particularly strong in the Raman spectrum (Naik and Krimm, 1984a). In this system the NH2 b mode is expected in the region 1650-1620 cm⁻¹ (Bellamy, 1975), and has been observed at 1615 cm⁻¹ (Naik and Krimm, 1984a). The NH2 b of lysine is expected in the same region as that of asparagine and glutamine. In the case of NH3⁺, deformation modes are expected in the regions 1640-1610 and 1550-1485 cm⁻¹ (Bellamy, 1975); these modes are found in GlyGly at 1628 (IR, Raman), 1611 (Raman), 1598 (IR), ~1500 (IR, Raman), and ~1479 (IR, Raman) cm⁻¹ (Lagant et al., 1983). Rocking modes are found near 1160 and 1100 cm⁻¹ (Lagant et al., 1983). For histidine, as for side chains with aromatic residues, ring modes are the most characteristic bands. The normally intense Raman band of histidine at 1491 cm⁻¹, associated with the imidazole ring, is not observed in ribonuclease, but in D2O at pD = 1.0 its N-deuterated counterpart at 1409 cm⁻¹, due to imidazolium, is strong and sharp (Lord and Yu, 1970b). This band can serve as a probe of the state of ionization of the ring.
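The carboxyl regions quoted above can be turned into a simple screening rule. The following is a minimal sketch, not a procedure from the original work: the function name, the decision logic, and the returned labels are illustrative assumptions, and only the three wavenumber regions come from the text. Because amide and other backbone bands also fall in these regions, such a rule can only suggest a candidate ionization state.

```python
# Regions (cm^-1) quoted in the text for the side-chain carboxyl group.
COOH_CO_STRETCH = (1700.0, 1725.0)      # un-ionized COOH, CO s
COO_ANTISYM_STRETCH = (1550.0, 1610.0)  # ionized COO-, CO2- as
COO_SYM_STRETCH = (1300.0, 1420.0)      # ionized COO-, CO2- ss


def in_region(band, region):
    """True if a band position (cm^-1) lies inside a (low, high) region."""
    low, high = region
    return low <= band <= high


def carboxyl_state(bands):
    """Suggest a carboxyl ionization state from observed band positions.

    The decision logic is a hypothetical illustration: the ionized form
    is suggested only when both of its stretching regions are populated.
    """
    has_cooh = any(in_region(b, COOH_CO_STRETCH) for b in bands)
    has_as = any(in_region(b, COO_ANTISYM_STRETCH) for b in bands)
    has_ss = any(in_region(b, COO_SYM_STRETCH) for b in bands)
    has_coo = has_as and has_ss
    if has_cooh and has_coo:
        return "mixed"
    if has_cooh:
        return "un-ionized"
    if has_coo:
        return "ionized"
    return "undetermined"


if __name__ == "__main__":
    print(carboxyl_state([1720.0]))          # band quoted for the acid form
    print(carboxyl_state([1560.0, 1415.0]))  # IR bands quoted for the salt form
```

Requiring both CO2⁻ regions before reporting the ionized form is a deliberately conservative choice, since a single band near 1560 cm⁻¹ could equally be a backbone mode.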
In phenylalanine, ring stretching modes give rise to strong Raman bands at 1605 and 1585 cm⁻¹, ring "breathing" modes are associated with Raman bands at 1006 and 622 cm⁻¹ and an IR band at 1450 cm⁻¹, CH in-plane deformation accounts for a Raman band at 1207 cm⁻¹, and CH out-of-plane deformations result in strong IR bands near 760 and 700 cm⁻¹ (Lord and Yu, 1970a,b; Krimm, 1960). These bands are obvious in spectra of peptides containing phenylalanine (Naik et al., 1984; Bandekar and Krimm, 1985b). The ring stretching and breathing modes of tyrosine are expected to be at frequencies similar to those of phenylalanine. For a para-disubstituted phenyl ring, the CH out-of-plane deformation occurs characteristically in the IR in the region 860-800 cm⁻¹ (Bellamy, 1975). In tyrosine, however, a doublet is found at 850 and 830 cm⁻¹ in the Raman spectrum and has been assigned to a Fermi resonance between a ring breathing mode and the overtone of an out-of-plane ring-bending vibration (Siamwiza et al., 1975). The intensity ratio of the two components seems to be sensitive to the environment of the ring (Dunker et al., 1979). The indole ring of tryptophan gives rise to many ring modes, of which Raman bands at 1582, 1553, 1363, 1338, 1014, 879, 761, 577, and 544 cm⁻¹ stand out (Lord and Yu, 1970a). The band near 1360 cm⁻¹ seems to be very sensitive to the environment of the group (Yu, 1974).
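Since the purpose of cataloguing these ring modes is to avoid confusing them with amide and backbone vibrations, a small lookup can flag observed Raman bands that coincide with a known side-chain ring band. This is a minimal sketch: the band positions are the Raman values quoted in the text, while the function name and the 5 cm⁻¹ matching tolerance are assumptions made for illustration.

```python
# Raman ring-mode positions (cm^-1) quoted in the text for each residue.
RAMAN_RING_BANDS = {
    "His": [1491],
    "Phe": [1605, 1585, 1207, 1006, 622],
    "Tyr": [850, 830],
    "Trp": [1582, 1553, 1363, 1338, 1014, 879, 761, 577, 544],
}


def side_chain_candidates(observed, tolerance=5.0):
    """Return (residue, band) pairs within `tolerance` cm^-1 of `observed`.

    The tolerance is a hypothetical matching window, not a value taken
    from the text. Results are ordered by closeness to the observed band.
    """
    hits = []
    for residue, bands in RAMAN_RING_BANDS.items():
        for band in bands:
            offset = abs(observed - band)
            if offset <= tolerance:
                hits.append((offset, residue, band))
    hits.sort()
    return [(residue, band) for _, residue, band in hits]


if __name__ == "__main__":
    for nu in (1006, 1583, 1655):
        print(nu, side_chain_candidates(nu))
```

A band near 1583 cm⁻¹ matches both a phenylalanine and a tryptophan entry, which illustrates why such a lookup can only narrow the possibilities rather than make an assignment.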
2. Disulfide Bridge
The disulfide bridge of cystine deserves a separate discussion because, in addition to its having characteristic frequencies, there has been a question of whether the spectrum is sensitive to the side-chain conformation. Hordvik (1966) reported a relationship between the bond length l(S-S) and the torsion angle τ(S-S), but this was subsequently disputed by Shefter (1970). Lord and Yu (1970a) suggested that Raman spectroscopy should provide direct evidence concerning the presence and number of disulfide cross-links in proteins as well as the local conformation of the C-S-S-C group. This suggestion was soon tested (van Wart and Scheraga, 1976a,b; van Wart et al., 1973, 1975a,b, 1976; Bastian and Martin, 1973; Martin, 1974; Sugeta et al., 1972, 1973; Sugeta, 1975; Higashi et al., 1978). Bastian and Martin (1973) reported that the SS s frequency is independent of τ(S-S), and they found no correlation between the intensity ratio of the SS s and CS s Raman bands and the CSS angles. On the other hand, van Wart et al. (1973) proposed a linear relationship between SS s and τ(S-S). The independence of SS s and τ(S-S) was reasserted by Martin (1974). Sugeta (1975) carried out normal-mode calculations on a series of dialkyl disulfides, and found that SS s varies little with τ(S-S) but does depend on the dihedral angles
τ(C-S). Higashi et al. (1978) studied the reported crystal structures of 21 symmetrical disulfides, and concluded that l(S-S) does not appear to correlate with τ(S-S) for aromatic disulfides in the equatorial conformation, but does show a dependence on τ(S-S) for aromatic disulfides in the axial conformation. Thus, the present view (Sugeta, 1975; van Wart et al., 1976; Higashi et al., 1978) seems to be that SS s is not in general directly related to τ(S-S) in the range 65-85°, but some dihedral angle dependence is found in the case of aromatic disulfides in the axial conformation and in the case of highly strained disulfides. The observed variations in the SS s frequency for alkyl disulfides have been interpreted by Sugeta (1975) as being a result of weak coupling between the SS s and SCC d modes, the SS s frequency being sensitive to the conformations about the C-S bonds. The conformations TGT, GGT (or TGG), and GGG (referring to conformations in which, respectively, two carbons, one carbon and one hydrogen, or two hydrogens are at positions trans to the distal sulfur across the C-S bond) have the following SS s frequencies: 540, 525, and 510 cm⁻¹, respectively (Sugeta et al., 1972). The CS s frequencies give rise to bands at 745-700 (trans) and 670-630 (gauche) cm⁻¹ (Sugeta et al., 1972). van Wart and Scheraga (1976a,b) have proposed modifications of these correlations to consider effects due to substituents at the β carbon. They also found that deviations from ideal trans and gauche conformations led to further deviations. Boyd (1974) has performed all-valence-electron calculations by the EH and CNDO/2 methods on disulfide (RSSR) compounds. He found that variation in the calculated strength of the S-S bond as a function of τ(S-S) correlates well with observed SS s frequencies. Although SS s and CS s modes are certainly useful in studies of -S-S- conformation in proteins, it seems that more work is needed before their potential can be fully realized.
C. Normal Modes of Proteins
1. Glucagon
Glucagon is one of the few small proteins on which a normal-mode analysis has been done (Tasumi et al., 1982). It contains 29 amino acid residues, and its structure has been determined to 3 Å resolution (Bernstein et al., 1977). The force field used in the calculation was that refined for a side-chain point-mass approximation (Dwivedi and Krimm, 1984b). The computer program was one specifically designed to handle such large molecules (Tasumi et al., 1982). At the time it could not include hydrogen bonds, but more recent versions are able to do so (Ataka and Tasumi, 1986). The present results must, therefore, be taken as mainly illustrative.
FIG. 33. Histogram (at 5-cm⁻¹ intervals) of normal-mode frequencies of glucagon (Tasumi et al., 1982). Horizontal axis: frequency (cm⁻¹).
The normal-mode calculation produced 597 frequencies and eigenvectors. We represent the former in Fig. 33 by giving a histogram of the calculated frequencies below 1700 cm⁻¹, plotting the number of normal modes in each 5-cm⁻¹ interval. Such a density of vibrational states, which can be probed by inelastic neutron scattering (Jacrot et al., 1982), is not observed directly by IR or Raman. As we noted, and will consider further below, the IR spectrum is determined by the $\partial\mu/\partial Q$ in each normal mode, which remains to be calculated. Since hydrogen bonding and TDC have also not been included in this calculation, the values of the amide frequencies and the distribution of the low-frequency modes cannot be considered finalized as yet. A representation of the eigenvectors of the normal modes is much more difficult, since so many internal coordinates are involved. It is instructive, however, to examine the amide I modes, which can be represented simply in terms of only the CO s coordinate. This is done in Fig. 34, where a bar to the right represents the stretching of a C=O bond in a particular amino acid residue and a bar to the left represents the contraction of such a bond. The most striking aspect of this figure is that most of the modes are highly localized, with the displacements being confined to several adjacent residues. Greater delocalization is shown for modes associated with the more consistently α-helical region between residues 20 and 25. A similar pattern is exhibited by the amide II modes, but the amide III modes (1270-1250 cm⁻¹) are in general delocalized. In the low-frequency region some modes are highly localized while others are delocalized. Much more remains to be learned about the normal modes of glucagon, but this will follow as more detailed calculations are undertaken.
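The bookkeeping behind such a histogram is simple and can be sketched as follows. This is an illustration under stated assumptions, not the program of Tasumi et al. (1982): the function names are hypothetical, and the observation that 597 = 3N - 6 for N = 201 is only an arithmetic consistency check on the mode count, presumably reflecting the number of mass points in the point-mass model.

```python
def mode_count(n_atoms):
    """Number of normal modes, 3N - 6, for a nonlinear molecule of N atoms."""
    return 3 * n_atoms - 6


def frequency_histogram(frequencies, bin_width=5.0, upper=1700.0):
    """Count normal modes in consecutive bins of `bin_width` cm^-1.

    Bin k covers [k * bin_width, (k + 1) * bin_width); frequencies at or
    above `upper` (or below zero) are ignored, mirroring a plot that is
    restricted to the region below 1700 cm^-1.
    """
    n_bins = int(upper / bin_width)
    counts = [0] * n_bins
    for nu in frequencies:
        if 0.0 <= nu < upper:
            counts[int(nu // bin_width)] += 1
    return counts


if __name__ == "__main__":
    # 597 modes is what 3N - 6 gives for 201 mass points.
    print(mode_count(201))
    # Made-up frequencies, used only to exercise the binning.
    demo = [2.0, 4.9, 5.0, 1650.3, 1699.9, 1700.0]
    counts = frequency_histogram(demo)
    print(len(counts), sum(counts))
```

A histogram of this kind is a density of states only; as the text stresses, it carries no intensity information and so is not itself a predicted IR or Raman spectrum.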
2. Bovine Pancreatic Trypsin Inhibitor
Bovine pancreatic trypsin inhibitor (BPTI) is a protein of 58 amino acid residues. Normal-mode calculations have been done on it, designed primarily to study the low-frequency vibrations. In one kind of study (Gō et al., 1983), the bond lengths and angles were held fixed and only main-chain and side-chain dihedral angles were allowed to vary. Hydrogen atoms were included in the atom to which they are bonded, except for those involved in hydrogen bonding. Most of the modes in this model occur below 200 cm⁻¹, and in the range up to 120 cm⁻¹ the number of normal modes is consistent with predictions based on treating the molecule as a continuous elastic object of Young's modulus 10¹¹ dyn/cm² (Suezaki and Gō, 1975; Gō, 1978). The modes in the region 120-200 cm⁻¹ tend to be localized, while those below 120 cm⁻¹ are almost completely delocalized. In a similar study (Levitt et al., 1983), it was shown that most of the α-carbon displacements are accounted for by the eight lowest modes (all below 10 cm⁻¹). While modes above 50 cm⁻¹ behave harmonically, below this frequency the modes show some anharmonicity. Such low-frequency modes, because of their large contributions to the atomic displacements, make a significant contribution to the entropy of the system, and therefore to thermodynamic properties (Peticolas, 1979). While such a treatment can give a general picture of low-frequency collective motions of a protein molecule, it cannot reveal detailed aspects of chain conformation, since for such features the spectra are sensitive to accurate descriptions of the local vibrational force field. This problem also exists for a quasi-harmonic virtual-bond description of the low-frequency modes (Levy et al., 1984).
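The weight of low-frequency modes in the entropy can be made concrete with the standard statistical-mechanical expression for a single quantum harmonic oscillator, S/k = x/(eˣ - 1) - ln(1 - e⁻ˣ) with x = hcν/kT. This expression is textbook material rather than something taken from the studies cited above, and the temperature of 300 K and the function name are assumptions made for this sketch.

```python
import math

# Second radiation constant h*c/k in cm K, so that x = HC_OVER_K * nu / T
# is dimensionless when nu is in cm^-1 and T is in K.
HC_OVER_K = 1.438777


def reduced_entropy(wavenumber, temperature=300.0):
    """Entropy of one quantum harmonic oscillator, in units of k.

    Uses S/k = x / (exp(x) - 1) - ln(1 - exp(-x)); expm1 keeps the
    evaluation accurate for the small x typical of low-frequency modes.
    """
    x = HC_OVER_K * wavenumber / temperature
    return x / math.expm1(x) - math.log(-math.expm1(-x))


if __name__ == "__main__":
    for nu in (5.0, 10.0, 50.0, 200.0, 1000.0):
        print(f"{nu:7.1f} cm^-1  S/k = {reduced_entropy(nu):.3f}")
```

Under these assumptions a mode near 10 cm⁻¹ carries roughly two orders of magnitude more entropy than one near 1000 cm⁻¹, which is the sense in which the lowest modes dominate the thermodynamic contribution. The harmonic expression ignores the anharmonicity noted above for modes below 50 cm⁻¹.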
A normal-mode calculation of BPTI has also been done in the full conformational space of the molecule, i.e., including bond length and bond angle as well as dihedral angle changes (Brooks and Karplus, 1983). Nonpolar hydrogens were absorbed into the atoms to which they are bonded, and the force constants were obtained from a potential energy function developed for energy minimization and molecular dynamics (Brooks et al., 1983). The frequencies are distributed essentially continuously between 3 and 1200 cm⁻¹, with a grouping of modes between 1200 and 1800 cm⁻¹. Most of the low-frequency modes were found to be significantly anharmonic in character and to be delocalized over the molecule. The atomic fluctuations of the main chain, which are mostly dominated by frequencies below 30 cm⁻¹, were found to compare well with those calculated from a molecular dynamics simulation. This analysis showed that, despite the anharmonic contributions to the potential
(through nonbonded and electrostatic interactions), a harmonic (i.e., normal-mode) treatment can provide satisfactory descriptions of the internal motions in a protein molecule. However, the applicability of this calculation to the analysis of the vibrational spectrum of BPTI remains to be demonstrated. The potential function used has only terms that are diagonal in the internal coordinates, and no detailed test of its ability to reproduce vibrational spectra of polypeptides has yet been presented.
3. Infrared Intensities
The normal-mode calculations on glucagon and BPTI demonstrate that, in order to analyze the vibrational spectra of proteins, we need to be able to predict more than just a density of vibrational states. In particular, we need to know what IR and Raman intensities are to be expected for a normal vibration. While a calculation of Raman intensities has yet to be done, important progress has been made in calculating IR intensities of amide and backbone modes of the polypeptide chain (Cheam and Krimm, 1985). The IR intensity of a normal mode is, from Eq. (79), proportional to $(\partial\mu/\partial Q)^2$. One approach would be to determine experimentally a set of bond moment (electro-optical) parameters from which the $\partial\mu/\partial Q$ could be calculated (Person and Zerbi, 1982). For molecules with as low a symmetry as peptides this presents serious difficulties. Our approach has been to use Eq. (76), $\partial\mu/\partial Q_a = \sum_i (\partial\mu/\partial S_i) L_{ia}$, to calculate the $\partial\mu/\partial S_i$ from NMA by ab initio Hartree-Fock methods, and to use the $L_{ia}$ obtained from our empirical force field for the particular polypeptide. The dipole derivatives of NMA were calculated in terms of the local symmetry internal coordinates, $S_i$, of the peptide group (see Table I). In order to have the results relevant to hydrogen-bonded systems, the structure used was an NMA molecule (Fig. 2) to which two formamide (FA) molecules were hydrogen bonded.
The $\partial\mu/\partial S_i$ values were calculated by displacing atoms along $S_i$ and doing a numerical differentiation: $\Delta\mu/\Delta S_i = [\mu(\Delta S_i = \Delta r) - \mu(\Delta S_i = 0)]/\Delta r$. Bonds were distorted by 0.01 Å and angles by 0.025 rad. The basis set was chosen by calculating static dipole moments and derivatives with respect to CO s for formaldehyde, formamide, and NMA, and comparing the results for different basis sets with experimental results (Cheam and Krimm, 1985). The results for the NMA-FA2 complex are given in Table XLI, where θ is the angle (counterclockwise being positive) from the x axis in Fig. 2. These $\partial\mu/\partial S_i$ are visualized in Fig. 35, where all vectors are arbitrarily centered on the CO carbon, and the numbers refer to the $S_i$ values of Table XLI (the vectors for CN s and CO s have been drawn at half their lengths). When these $\partial\mu/\partial S_i$ were used together with $L_{ia}$ for (Gly)n I, the
TABLE XLI
Dipole Derivatives $\partial\mu/\partial S_i$ for NMA-FA2

Symmetry coordinate          $|\partial\mu/\partial S_i|$ (a)   θ (b)
S1: CC stretch               0.732                              -49.9
S2: CN stretch               4.611                              2.3
S3: NC stretch               2.869                              -61.7
S4: CO stretch               6.953                              47.0
S5: NH stretch               1.878                              61.1
S6: CCN deformation          0.748                              -4.0
S7: CO in-plane bend         1.997                              -35.6
S8: CNC deformation          0.536                              -7.7
S9: NH in-plane bend         0.980                              11.4
S10: CO out-of-plane bend    0.143
S11: NH out-of-plane bend    1.803
S12: CN torsion              0.624

(a) In D/Å or D/rad.
(b) From x axis in Fig. 2, counterclockwise being positive.
calculated intensities shown in Fig. 7 were obtained. From the good agreement with the observed spectrum, it is clear that these $\partial\mu/\partial S_i$ should serve to calculate IR intensities in other polypeptides and in proteins.
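The way internal-coordinate dipole derivatives combine into a mode intensity can be sketched numerically. In the example below, the magnitudes and angles for CN s and CO s are read from Table XLI on the assumption that the nine listed angles belong, in order, to the nine in-plane coordinates; the eigenvector elements are purely hypothetical numbers chosen for illustration and are not taken from any force field, and the function names are likewise illustrative.

```python
import math

# (magnitude, angle in degrees from the x axis) for two in-plane
# coordinates, as read from Table XLI (assumed angle assignment).
DIPOLE_DERIVATIVES = {
    "CN s": (4.611, 2.3),
    "CO s": (6.953, 47.0),
}


def to_vector(magnitude, angle_deg):
    """Convert a magnitude and in-plane angle to (x, y) components."""
    angle = math.radians(angle_deg)
    return (magnitude * math.cos(angle), magnitude * math.sin(angle))


def mode_dipole_derivative(eigenvector):
    """Sum (dmu/dS_i) * L_i over the coordinates present in `eigenvector`."""
    mx = my = 0.0
    for coordinate, l_element in eigenvector.items():
        dx, dy = to_vector(*DIPOLE_DERIVATIVES[coordinate])
        mx += l_element * dx
        my += l_element * dy
    return (mx, my)


def relative_intensity(eigenvector):
    """Quantity proportional to the IR intensity: |dmu/dQ| squared."""
    mx, my = mode_dipole_derivative(eigenvector)
    return mx * mx + my * my


if __name__ == "__main__":
    # Hypothetical eigenvector elements, for illustration only.
    pure_co = {"CO s": 1.0}
    out_of_phase = {"CO s": 1.0, "CN s": -0.5}
    in_phase = {"CO s": 1.0, "CN s": 0.5}
    for label, vec in (("pure CO s", pure_co),
                       ("CO s - CN s", out_of_phase),
                       ("CO s + CN s", in_phase)):
        print(label, round(relative_intensity(vec), 2))
```

Because the contributions add as vectors, the relative phase of the CN s admixture changes the result substantially even for a fixed CO s amplitude, which is why the eigenvectors from the force field matter as much as the dipole derivatives themselves.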
FIG. 35. In-plane internal coordinate dipole derivatives, $\partial\mu/\partial S_i$, for the N-methylacetamide molecule (cf. Fig. 2) in an N-methylacetamide-(formamide)2 complex. Numbers refer to $S_i$ of Table I. Vectors 2 and 4 are drawn at half their lengths (Cheam and Krimm, 1985).
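The one-sided numerical differentiation used for the dipole derivatives in this section can be sketched as follows. The dipole function here is a made-up polynomial standing in for an ab initio calculation, and both function names are hypothetical; only the forward-difference form and the 0.01 Å step for bond displacements come from the text.

```python
def forward_difference(dipole, step):
    """Approximate dmu/dS as [mu(step) - mu(0)] / step, component by component.

    `dipole` is any callable that maps a displacement along one internal
    coordinate to a dipole vector (mx, my, mz).
    """
    mu_zero = dipole(0.0)
    mu_step = dipole(step)
    return tuple((b - a) / step for a, b in zip(mu_zero, mu_step))


def toy_dipole(displacement):
    """Hypothetical dipole (in D) as a function of a bond displacement (in A)."""
    return (
        1.0 + 3.0 * displacement + 2.0 * displacement ** 2,
        0.5 * displacement,
        0.0,
    )


if __name__ == "__main__":
    # 0.01 A is the bond displacement quoted in the text.
    print(forward_difference(toy_dipole, 0.01))
```

For this toy function the exact derivative at zero displacement is (3.0, 0.5, 0.0); the forward difference differs in the first component by the curvature term times the step, which shows the kind of error a small step keeps under control.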
VIII. PROSPECTS FOR THE FUTURE
Modern vibrational spectroscopy of polypeptides and proteins, as outlined in the previous pages, has made a significant initial contribution as a tool for the detailed analysis of conformation in such molecules. Yet much more remains to be done, both with respect to further refinements in the inputs to the normal-mode calculations as well as in applications to the many general and specific structures that need to be studied. We consider below only briefly some aspects of such future developments. The most important question associated with the normal-mode calculations involves the force field. Our SGVFF parameters have been generally successful in predicting observed IR and Raman bands, but it is still important to establish their particular values in specific cases. Thus, we have ascertained that certain intramolecular interaction force constants differ somewhat between α-helix and β-sheet structures; it remains to define what is the detailed dependence of force field on conformation. The assumption we have made is that this dependence is small, and that variation in the spectrum arises primarily from variation in three-dimensional structure, which is substantially true. But if we wish to interpret detailed aspects of the IR and Raman spectra, we must expect to have a comparably rigorous understanding of the force field. A similar situation exists in defining that part of the force field that involves the hydrogen bond. Although we have parameterized these force constants, we have at present no specific algorithms that describe how such force constants vary with the geometry of the hydrogen bond. Such improvements in the force field will have to be made if we wish to refine our predictive capabilities. Another approach to improving the input to the normal-mode calculation is to select a model for the force field different from the SGVFF that we have used.
The most reasonable alternative is the CFF (mentioned in Section II,D,1), since in principle such a potential function permits the calculation of other molecular properties in addition to frequencies. However, the CFF functions presently in use give relatively poor reproduction of frequencies of polypeptide molecules, with average errors of 30-50 cm⁻¹, compared to errors of ~5 cm⁻¹ for our SGVFF. It is not clear whether this is due to the fact that frequencies were not used in the parameterization, or to more fundamental problems in formulating the model. The CFF generally contains VFF terms [such as in Eq. (65)], as well as terms for nonbonded interactions [such as represented by Eqs. (66)-(70)] and coulombic interactions between partial charges on atoms. In most cases, off-diagonal terms, such as in the second line of Eq. (65), and nonpolar hydrogen atoms are excluded
(Brooks et al., 1983; Levitt, 1983). The exact form taken for the CFF potential, and the manner in which its parameters are determined, will clearly affect its ability to reproduce correct normal-mode frequencies. For example, coulombic terms, which are important in determining minimum energy structures, are much less important at equilibrium, where their effect can be incorporated into the SGVFF parameters. However, charge flux effects, which presently are not incorporated in CFF potentials, are of importance in determining IR spectral properties (as evidenced by TDC interactions), and cannot be obtained from the static partial charges, as has been done (Levitt et al., 1983). [For example, the dipole derivative for CO s in formaldehyde is 2.66 D/Å from the static charge distribution but 3.85 D/Å from ab initio calculations (Cheam and Krimm, 1985).] At the present time, our SGVFF is the most effective approach to interpreting and reproducing vibrational frequencies of polypeptide molecules. Whether modifications, such as the addition of certain nonbonded interactions, will be required in order to deal with very low frequencies of globular proteins, particularly their anharmonicity, remains to be determined. Additions to the present SGVFF that are relevant to polypeptides are a natural extension of current work. We can expect that, on the basis of preliminary studies (Cheam and Krimm, 1984a,b), a force field will be available for the cis peptide group. An analysis of CH3CON(CH3)2 (Johnston, 1975) has provided a force field for the N-methylated peptide group. And a current refinement of the force field of CH3COOCH3 (Dybal and Krimm, 1987) should provide the means for extending the present SGVFF to depsipeptides, which contain the -CO-NH-CHR'-CO-O-CHR- structure. We have noted the importance of incorporating calculations of IR intensities in the analysis of spectra.
This approach is certain to prove fruitful in a number of areas: determination of the dependence of amide mode intensities on conformation; influence of size and perfection of structure on intensities; correlation of intensities with hydrogen-bond geometry (Cheam and Krimm, 1986). Just as it is possible to develop a conformational (φ, ψ)-frequency map (Hsu et al., 1976), it should be possible to compute a conformational (φ, ψ)-intensity map, which could be useful in analyzing the spectra of unordered polypeptide chain structures. Of course, nothing has yet been done on the calculation of Raman intensities of polypeptides, and this area is ripe for future development. Much remains to be done in characterizing the vibrational spectra of known polypeptide chain structures. Although some preliminary studies were done on the parallel-chain pleated sheet (Krimm and Abe, 1972; Moore and Krimm, 1975), a full analysis of this structure found in
proteins is appropriate (Bandekar and Krimm, 1987). Of course, since mixed parallel- and antiparallel-chain structures occur in proteins, normal-mode analysis should be useful in understanding their vibrational properties. The twisted β sheet needs to be studied to see what effect this variation in structure has on the spectrum. Calculations of "β-barrel" arrangements could help to characterize this particular structure. Calculations will also be important in characterizing the various finite-sheet arrangements found in proteins. Among helical structures yet to be analyzed are the ω and π helices. Preliminary analyses have already been done on the polyproline I and polyproline II helices (Johnston, 1975), and these studies can provide the basis for a detailed analysis of the triple-stranded collagen helix. Since helices in proteins are found in short segments and often with distorted axes, it will be important to study the influence of such "defects" on the vibrational spectrum. The application of normal-mode analysis to the study of vibrational spectra of proteins is in its infancy, and we may expect this area to develop significantly. In this connection, certain general studies will be useful: the nature of vibrations in combined α and β structures; the extent of localization of modes in structures which contain "hinge" regions; the correlation of the calculated spectrum of a protein with that obtained from a sum of its constituent secondary structural components. In view of the progress to date, it is clear that IR and Raman spectroscopy, together with normal-mode analysis, will have a major contribution to make to the detailed study of conformation in polypeptides and proteins.
ACKNOWLEDGMENTS
This research program was supported by the National Science Foundation, most recently by grants PCM-8214064 and DMR-8303610. The continued support by NSF has made the progress in this field possible, and it is deeply appreciated.
The results could not have been achieved without the dedicated efforts of many, and one of us (S. K.) wishes to express his deep indebtedness to, and pleasure in working with, his colleagues in these studies. Discussions with Dr. Toon C. Cheam in the course of writing this article were very helpful and are much appreciated. As usual, Ms. Shirley Mieras provided devoted support in preparing the manuscript.
REFERENCES
Abe, Y., and Krimm, S. (1972a). Biopolymers 11, 1817-1839.
Abe, Y., and Krimm, S. (1972b). Biopolymers 11, 1841-1853.
Admiraal, G., and Vos, A. (1984). Int. J. Pept. Protein Res. 23, 151-157.
Arnott, S., and Dover, S. D. (1967). J. Mol. Biol. 30, 209-212.
Arnott, S., and Dover, S. D. (1968). Acta Crystallogr., Sect. B 24, 599-601.
Arnott, S., and Wonacott, A. J. (1966). J. Mol. Biol. 21, 371-383.
Arnott, S., Dover, S. D., and Elliott, A. (1967). J. Mol. Biol. 30, 201-208.
Ashida, T., Kojima, T., Tanaka, I., and Yamane, T. (1986). Int. J. Pept. Protein Res. 27, 61-69.
Astbury, W. T. (1949). Nature (London) 163, 722.
Astbury, W. T., and Street, A. (1931). Philos. Trans. R. Soc. London, Ser. A 230, 75-101.
Astbury, W. T., and Woods, H. J. (1933). Philos. Trans. R. Soc. London, Ser. A 232, 333-394.
Astbury, W. T., Dalgliesh, C. E., Darmon, S. E., and Sutherland, G. B. B. M. (1948). Nature (London) 162, 596-600.
Ataka, S., and Tasumi, M. (1986). J. Mol. Struct. 143, 445-448.
Aubert, J.-P., Biserte, G., and Loucheux-Lefebvre, M.-H. (1976). Arch. Biochem. Biophys. 175, 410-418.
Baker, E. N., and Hubbard, R. E. (1984). Prog. Biophys. Mol. Biol. 49, 97-179.
Balasubramanian, R., Chidambaram, R., and Ramachandran, G. N. (1970). Biochim. Biophys. Acta 221, 196-206.
Baldwin, J. P., Bradbury, E. M., McLuckie, I. F., and Stephens, R. M. (1973). Macromolecules 6, 83-91.
Bamford, C. H., Brown, L., Elliott, A., Hanby, W. E., and Trotter, I. F. (1953). Nature (London) 171, 1149-1151.
Bamford, C. H., Brown, L., Elliott, A., Hanby, W. E., and Trotter, I. F. (1954). Nature (London) 173, 27-29.
Bamford, C. H., Brown, L., Cant, E. M., Elliott, A., Hanby, W. E., and Malcolm, B. R. (1955). Nature (London) 176, 396-397.
Bamford, C. H., Elliott, A., and Hanby, W. E. (1956). "Synthetic Polypeptides." Academic Press, New York.
Bandekar, J., and Krimm, S. (1979a). Proc. Natl. Acad. Sci. U.S.A. 76, 774-777.
Bandekar, J., and Krimm, S. (1979b). In "Peptides: Structure and Biological Function" (E. Gross and J. Meienhofer, eds.), pp. 241-244. Pierce Chemical Co., Rockford, Illinois.
Bandekar, J., and Krimm, S. (1980). Biopolymers 19, 31-36.
Bandekar, J., and Krimm, S. (1985a). Int. J. Pept. Protein Res. 26, 407-415.
Bandekar, J., and Krimm, S. (1985b). Int. J. Pept. Protein Res. 26, 158-165.
Bandekar, J., and Krimm, S. (1987). Submitted.
Bandekar, J., Evans, D. J., Krimm, S., Leach, S. J., Lee, S., McQuie, J. R., Minasian, E., Némethy, G., Pottle, M. S., Scheraga, H. A., Stimson, E. R., and Woody, R. W. (1982). Int. J. Pept. Protein Res. 19, 187-205.
Bastian, E. J., and Martin, R. B. (1973). J. Phys. Chem. 77, 1129-1133.
Beely, J. G. (1977). Biochem. Biophys. Res. Commun. 76, 1051-1055.
Bellamy, L. J. (1975). "The Infrared Spectra of Complex Molecules," 3rd Ed. Chapman & Hall, London.
Benedetti, E., DiBlasio, B., Pedone, C., Lorenzi, G. P., Tomasic, L., and Gramlich, V. (1979). Nature (London) 282, 630.
Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M. (1977). J. Mol. Biol. 112, 535-542.
Blundell, T., Dodson, G., Hodgkin, D., and Mercola, D. (1972). Adv. Protein Chem. 26, 279-402.
Born, M., and Huang, K. (1954). "Dynamical Theory of Crystal Lattices." Oxford University Press, London and New York.
Bosch, R., Jung, G., and Winter, W. (1983). Acta Crystallogr., Sect. C 39, 776-778.
Boyd, D. B. (1974). Int. J. Quantum Chem. Quantum Biol. Symp. 1, 13-19.
SAMUEL KRIMM AND JAGDEESH BANDEKAR
Bradbury, E. M., and Elliott, A. (1963). Polymer 4, 47-49.
Bradbury, E. M., Brown, L., Downie, A. R., Elliott, A., Fraser, R. D. B., and Hanby, W. E. (1962). J. Mol. Biol. 5, 230-247.
Bradbury, E. M., Carpenter, B. G., and Stephens, R. M. (1968). Biopolymers 6, 905-915.
Bragg, L., Kendrew, J. C., and Perutz, M. F. (1950). Proc. R. Soc. London Ser. A 203, 321-357.
Brant, D. A., and Flory, P. J. (1965a). J. Am. Chem. Soc. 87, 663-664.
Brant, D. A., and Flory, P. J. (1965b). J. Am. Chem. Soc. 87, 2791-2800.
Brillouin, L. (1953). “Wave Propagation in Periodic Structures.” Dover, New York.
Brooks, B., and Karplus, M. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 6571-6575.
Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., and Karplus, M. (1983). J. Comput. Chem. 4, 187-217.
Brown, L., and Trotter, I. F. (1956). Trans. Faraday Soc. 52, 537-548.
Burkert, W., and Allinger, N. L. (1982). “Molecular Mechanics.” American Chemical Society, Washington, D.C.
Bystrov, V. F., Portnova, S. L., Tsetlin, V. I., Ivanov, V. T., and Ovchinnikov, Yu. A. (1969). Tetrahedron 25, 493-515.
Califano, S. (1976). “Vibrational States.” Wiley, New York.
Califano, S. (1977). In “Vibrational Spectroscopy-Modern Trends” (A. J. Barnes and W. J. Orville-Thomas, eds.), pp. 285-304. Elsevier, Amsterdam.
Chandrasekharan, R., Lakshminarayanan, A. V., Pandya, U. V., and Ramachandran, G. N. (1973). Biochim. Biophys. Acta 303, 14-27.
Chatterjee, A., and Parthasarathy, R. (1984). Int. J. Pept. Protein Res. 24, 447-452.
Cheam, T. C., and Krimm, S. (1984a). Spectrochim. Acta Part A 40, 481-501.
Cheam, T. C., and Krimm, S. (1984b). Spectrochim. Acta Part A 40, 503-517.
Cheam, T. C., and Krimm, S. (1984c). Chem. Phys. Lett. 107, 613-616.
Cheam, T. C., and Krimm, S. (1985). J. Chem. Phys. 82, 1631-1641.
Cheam, T. C., and Krimm, S. (1986). J. Mol. Struct., in press.
Chen, M. C., and Lord, R. C. (1974). J. Am. Chem. Soc. 96, 4750-4752.
Chen, M. C., and Lord, R. C. (1976). Biochemistry 15, 1889-1897.
Chen, M. C., Lord, R. C., and Mendelsohn, R. (1973). Biochim. Biophys. Acta 328, 252-260.
Chidambaram, R., Balasubramanian, R., and Ramachandran, G. N. (1970). Biochim. Biophys. Acta 221, 182-195.
Chirgadze, Yu. N., and Brazhnikov, E. V. (1974). Biopolymers 13, 1701-1712.
Chirgadze, Yu. N., Shestopalov, B. V., and Venyaminov, S. Y. (1973). Biopolymers 12, 1337-1351.
Chothia, C. (1973). J. Mol. Biol. 75, 295-302.
Chou, K.-C., Pottle, M., Némethy, G., Ueda, Y., and Scheraga, H. A. (1982). J. Mol. Biol. 162, 89-112.
Chou, P. Y., and Fasman, G. D. (1977). J. Mol. Biol. 115, 135-175.
Chou, P. Y., and Fasman, G. D. (1979a). Biophys. J. 26, 367-384.
Chou, P. Y., and Fasman, G. D. (1979b). Biophys. J. 26, 385-400.
Colonna-Cesari, F., Premilat, S., and Lotz, B. (1974). J. Mol. Biol. 87, 181-191.
Colonna-Cesari, F., Premilat, S., and Lotz, B. (1975). J. Mol. Biol. 95, 71-82.
Colonna-Cesari, F., Premilat, S., Heitz, F., Spach, G., and Lotz, B. (1977). Macromolecules 10, 1284-1288.
Corey, R. B., and Pauling, L. (1953). Proc. R. Soc. London Ser. B 141, 10-20.
Cowan, P. M., and McGavin, S. (1955). Nature (London) 176, 501-503.
Craig, W. S., and Gaber, B. P. (1977). J. Am. Chem. Soc. 99, 4130-4134.
Crane, H. R. (1950). Sci. Mon. 70, 376-389.
VIBRATIONAL SPECTROSCOPY OF PEPTIDES AND PROTEINS
Crick, F. H. C., and Rich, A. (1955). Nature (London) 176, 780-781.
Cruse, W. B. T., Egert, E., Viswamitra, M. A., and Kennard, O. (1982). Acta Crystallogr. Sect. B 38, 1758-1764.
Decius, J. C., and Hexter, R. M. (1977). “Molecular Vibrations in Crystals.” McGraw-Hill, New York.
Dellapiane, G., Abbate, S., Bosi, P., and Zerbi, G. (1980). J. Chem. Phys. 73, 1040-1047.
Donohue, J. (1953). Proc. Natl. Acad. Sci. U.S.A. 39, 470-478.
Donohue, J. (1968). In “Structural Chemistry and Molecular Biology” (A. Rich and N. Davidson, eds.), pp. 459-463. Freeman, San Francisco.
Duncan, J. L. (1975). In “Molecular Spectroscopy, A Specialist Report,” Vol. 3, pp. 104-162. The Chemical Society, London.
Dunker, A. K., Williams, R. W., and Peticolas, W. L. (1979). J. Biol. Chem. 254, 6444-6448.
Dwivedi, A. M., and Krimm, S. (1982a). Macromolecules 15, 177-185.
Dwivedi, A. M., and Krimm, S. (1982b). Macromolecules 15, 186-193.
Dwivedi, A. M., and Krimm, S. (1982c). Biopolymers 21, 2377-2397.
Dwivedi, A. M., and Krimm, S. (1983). Macromolecules 16, 340.
Dwivedi, A. M., and Krimm, S. (1984a). Biopolymers 23, 923-943.
Dwivedi, A. M., and Krimm, S. (1984b). J. Phys. Chem. 88, 620-627.
Dwivedi, A. M., Krimm, S., and Malcolm, B. R. (1984). Biopolymers 23, 2025-2065.
Dybal, J., and Krimm, S. (1987). In preparation.
Elliott, A. (1954). Proc. R. Soc. London Ser. A 226, 408-421.
Elliott, A., and Malcolm, B. R. (1956). Trans. Faraday Soc. 52, 528-536.
Elliott, A., and Malcolm, B. R. (1959). Proc. R. Soc. London Ser. A 249, 30-41.
Epp, O., Colman, P., Felhammer, H., Bode, M., Schiffer, M., Huber, R., and Palm, W. (1974). Eur. J. Biochem. 45, 513-524.
Fanconi, B. (1972a). J. Res. Natl. Bur. Stand. Sect. A 76, 351-359.
Fanconi, B. (1972b). J. Chem. Phys. 57, 2109-2116.
Fanconi, B. (1973). Biopolymers 12, 2759-2776.
Fanconi, B., Tomlinson, B., Nafie, L. A., Small, W., and Peticolas, W. L. (1969). J. Chem. Phys. 51, 3993-4005.
Fanconi, B., Small, W., and Peticolas, W. L. (1971). Biopolymers 10, 1277-1298.
Fasman, G. D., Itoh, K., Liu, C. S., and Lord, R. C. (1978). Biopolymers 17, 1729-1746.
Fawcett, J. K., Camerman, N., and Camerman, A. (1975). Acta Crystallogr. Sect. B 31, 658-665.
Flippen, J. L., and Karle, I. (1976). Biopolymers 15, 1081-1092.
Fox, R. O., Jr., and Richards, F. M. (1982). Nature (London) 300, 325-330.
Fraser, R. D. B., and MacRae, T. P. (1973). “Conformation in Fibrous Proteins and Related Synthetic Polypeptides.” Academic Press, New York.
Fraser, R. D. B., and Price, W. C. (1952). Nature (London) 170, 490-491.
Fraser, R. D. B., MacRae, T. P., Stewart, F. H., and Suzuki, E. (1965). J. Mol. Biol. 11, 706-712.
Fraser, R. D. B., MacRae, T. P., and Stewart, F. H. (1966). J. Mol. Biol. 19, 580-582.
Fraser, R. D. B., MacRae, T. P., Parry, D. A. D., and Suzuki, E. (1969). Polymer 10, 810-826.
Frushour, B. G., and Koenig, J. L. (1974). Biopolymers 13, 455-474.
Frushour, B. G., and Koenig, J. L. (1975a). Biopolymers 14, 2115-2135.
Frushour, B. G., and Koenig, J. L. (1975b). Biopolymers 14, 363-377.
Frushour, B. G., and Koenig, J. L. (1975c). Adv. Infrared Raman Spectrosc. 1, 35-97.
Frushour, B. G., Painter, P. C., and Koenig, J. L. (1976). J. Macromol. Sci. Rev. Macromol. Chem. C15, 29-115.
Fukushima, K., Ideguchi, Y., and Miyazawa, T. (1963). Bull. Chem. Soc. Jpn. 36, 1301-1307.
Gans, P. (1977). Adv. Infrared Raman Spectrosc. 3, 87-126.
Geisow, M. (1978). Nature (London) 274, 642.
Gō, N. (1978). Biopolymers 17, 1373-1379.
Gō, N., Noguti, T., and Nishikawa, T. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 3696-3700.
Goldstein, H. (1950). “Classical Mechanics.” Addison-Wesley, Reading, Massachusetts.
Green, R. D. (1974). “Hydrogen Bonding by C-H Groups.” Wiley, New York.
Gupta, V. D., Trevino, S., and Boutin, H. (1968). J. Chem. Phys. 48, 3008-3015.
Hagler, A. T., and Lapiccirella, A. (1976). Biopolymers 15, 1167-1200.
Hagler, A. T., Lifson, S., and Dauber, P. (1979a). J. Am. Chem. Soc. 101, 5122-5130.
Hagler, A. T., Dauber, P., and Lifson, S. (1979b). J. Am. Chem. Soc. 101, 5131-5141.
Han, S.-L., Stimson, E. R., Maxfield, F. R., and Scheraga, H. A. (1980). Int. J. Pept. Protein Res. 16, 173-182.
Heitz, F., Lotz, B., and Spach, G. (1975). J. Mol. Biol. 92, 1-13.
Herzberg, G. (1945). “Molecular Spectra and Molecular Structure. II. Infrared and Raman Spectra of Polyatomic Molecules.” Van Nostrand-Reinhold, Princeton, New Jersey.
Hesselink, F. T., and Scheraga, H. A. (1972). Macromolecules 5, 455-463.
Hexter, R. M. (1960). J. Chem. Phys. 33, 1833-1841.
Higashi, L. S., Lundeen, M., and Seff, K. (1978). J. Am. Chem. Soc. 100, 8101-8106.
Higgs, P. W. (1953a). Proc. R. Soc. London Ser. A 220, 472-485.
Higgs, P. W. (1953b). J. Chem. Phys. 21, 1131-1134.
Hopfinger, A. J. (1971). Biopolymers 10, 1299-1315.
Hordvik, A. (1966). Acta Chem. Scand. 20, 1885-1891.
Hseu, T. H., and Chang, H. (1980). Biochim. Biophys. Acta 624, 340-345.
Hsu, S. L., Moore, W. H., and Krimm, S. (1976). Biopolymers 15, 1513-1528.
Huggins, M. L. (1943). Chem. Rev. 32, 195-218.
Ishizaki, H., Balaram, P., Nagaraj, R., Venkatachalapathi, Y. V., and Tu, A. T. (1981). Biophys. J. 26, 509-517.
Itoh, K., and Katabuchi, H. (1972). Biopolymers 11, 1593-1605.
Itoh, K., and Shimanouchi, T. (1970). Biopolymers 9, 383-399.
Itoh, K., Nakahara, T., Shimanouchi, T., Oya, M., Uno, J., and Iwakura, Y. (1968). Biopolymers 6, 1759-1766.
Itoh, K., Shimanouchi, T., and Oya, M. (1969). Biopolymers 7, 649-658.
Itoh, K., Foxman, B. M., and Fasman, G. D. (1976). Biopolymers 15, 419-455.
Jackson, J. D. (1975). “Classical Electrodynamics,” 2nd Ed. Wiley, New York.
Jacrot, B., Cusack, S., Dianoux, A. J., and Engelman, D. M. (1982). Nature (London) 300, 84-86.
Jakeš, J., and Krimm, S. (1971a). Spectrochim. Acta Part A 27, 19-34.
Jakeš, J., and Krimm, S. (1971b). Spectrochim. Acta Part A 27, 35-63.
Jakeš, J., and Schneider, B. (1968). Collect. Czech. Chem. Commun. 33, 643-655.
Johnston, N. H. (1975). Ph.D. Dissertation, University of Michigan, Ann Arbor.
Kaiser, E. T., and Kézdy, F. J. (1984). Science 223, 249-255.
Karle, I. (1981). In “Perspectives in Chemistry” (A. Eberle, R. Geiger, and T. Weiland, eds.), pp. 261-271. Karger, Basel.
Kawai, M., and Fasman, G. D. (1978). J. Am. Chem. Soc. 100, 3630-3632.
Kawai, M., Rich, D. H., and Watson, J. D. (1983). Biochem. Biophys. Res. Commun. 111, 398-403.
Keith, H. D., Padden, F. J., and Giannoni, G. (1969a). J. Mol. Biol. 43, 423-438.
Keith, H. D., Giannoni, G., and Padden, F. J. (1969b). Biopolymers 7, 775-792.
Kendrew, J. C., Dickerson, R. E., Strandberg, B. E., Hart, R. G., Davies, D. R., Phillips, D. C., and Shore, V. C. (1960). Nature (London) 185, 422-427.
Kendrew, J. C., Klyne, W., Lifson, S., Miyazawa, T., Némethy, G., Phillips, D. C., Ramachandran, G. N., and Scheraga, H. A. (1970). Biochemistry 9, 3471-3479.
Kitagawa, T., Azuma, T., and Hamaguchi, K. (1979). Biopolymers 18, 451-465.
Kitaigorodsky, A. I. (1961). “Chemical Organic Crystallography.” Plenum, New York.
Kitaigorodsky, A. I. (1973). “Molecular Crystals and Molecules.” Academic Press, New York.
Kitaigorodsky, A. I. (1978). Chem. Soc. Rev. 7, 133-163.
Kittel, C. (1969). “Introduction to Solid State Physics.” Wiley, New York.
Koenig, J. L., and Frushour, B. G. (1972). Biopolymers 11, 2505-2520.
Koenig, J. L., and Sutton, P. L. (1969). Biopolymers 8, 167-171.
Koenig, J. L., and Sutton, P. L. (1971). Biopolymers 10, 89-106.
Krimm, S. (1960). Adv. Polymer Sci. 2, 51-172.
Krimm, S. (1962). J. Mol. Biol. 4, 528-540.
Krimm, S. (1966). Nature (London) 212, 1482-1483.
Krimm, S. (1983). Biopolymers 22, 217-225.
Krimm, S., and Abe, Y. (1972). Proc. Natl. Acad. Sci. U.S.A. 69, 2788-2792.
Krimm, S., and Bandekar, J. (1980). Biopolymers 19, 1-29.
Krimm, S., and Dwivedi, A. M. (1982a). J. Raman Spectrosc. 12, 133-137.
Krimm, S., and Dwivedi, A. M. (1982b). Science 216, 407-408.
Krimm, S., and Kuroiwa, K. (1968). Biopolymers 6, 401-407.
Krimm, S., and Mark, J. E. (1968). Proc. Natl. Acad. Sci. U.S.A. 60, 1122-1129.
Krimm, S., and Tiffany, M. L. (1974). Isr. J. Chem. 12, 189-200.
Krimm, S., Kuroiwa, K., and Rebane, T. (1967). In “Conformation of Biopolymers” (G. N. Ramachandran, ed.), Vol. 2, pp. 439-447. Academic Press, New York.
Kuntz, I. D. (1972). J. Am. Chem. Soc. 94, 4009-4012.
Lagant, P., Vergoten, G., Loucheux-Lefebvre, M.-H., and Fleury, G. (1983). Biopolymers 22, 1267-1283.
Lagant, P., Vergoten, G., Fleury, G., and Loucheux-Lefebvre, M.-H. (1984a). Eur. J. Biochem. 139, 137-148.
Lagant, P., Vergoten, G., Fleury, G., and Loucheux-Lefebvre, M.-H. (1984b). Eur. J. Biochem. 139, 149-154.
Lalitha, V., Murali, R., and Subramanian, E. (1987). Int. J. Pept. Protein Res., in press.
Lenormant, H., Baudras, A., and Blout, E. R. (1958). J. Am. Chem. Soc. 80, 6191-6195.
Levitt, M. (1983). J. Mol. Biol. 168, 595-620.
Levitt, M., Sander, C., and Stern, P. S. (1983). Int. J. Quantum Chem. Quantum Biol. Symp. 10, 181-199.
Levy, R. M., Srinivasan, A. R., Olson, W. R., and McCammon, J. A. (1984). Biopolymers 23, 1099-1112.
Lewis, P. M., Momany, F. A., and Scheraga, H. A. (1973). Biochim. Biophys. Acta 303, 211-229.
Lifson, S., and Stern, P. S. (1982). J. Chem. Phys. 77, 4542-4550.
Lifson, S., and Warshel, A. (1968). J. Chem. Phys. 49, 5116-5129.
Lifson, S., Hagler, A. T., and Dauber, P. (1979). J. Am. Chem. Soc. 101, 5111-5121.
Lipkind, G. M., Arkhipova, S. F., and Popov, E. M. (1971). Mol. Biol. (Moscow) 4, 409-414.
Lippert, J. L., Tyminski, D., and Desmeules, P. J. (1976). J. Am. Chem. Soc. 98, 7075-7080.
Lippincott, E. R., and Schroeder, R. (1955). J. Chem. Phys. 23, 1099-1106.
Long, D. A., Gravenor, R. B., and Woodger, M. (1963). Spectrochim. Acta Part A 19, 937-949.
Lord, R. C. (1977). Appl. Spectrosc. 31, 187-194.
Lord, R. C., and Yu, N.-T. (1970a). J. Mol. Biol. 51, 203-213.
Lord, R. C., and Yu, N.-T. (1970b). J. Mol. Biol. 50, 509-524.
Lotz, B. (1974). J. Mol. Biol. 87, 169-180.
Lotz, B., Colonna-Cesari, F., Heitz, F., and Spach, G. (1976). J. Mol. Biol. 106, 915-942.
Low, B., and Baybutt, R. B. (1952). J. Am. Chem. Soc. 74, 5806-5807.
Low, B., and Grenville-Wells, H. J. (1953). Proc. Natl. Acad. Sci. U.S.A. 39, 785-801.
Malcolm, B. R. (1977). Biopolymers 16, 2591-2592.
Malcolm, B. R. (1983). Biopolymers 22, 319-321.
Malcolm, B. R., and Walkinshaw, M. D. (1986). Biopolymers 25, 607-625.
Marsh, R. E., and Glusker, J. P. (1961). Acta Crystallogr. 14, 1110-1116.
Marsh, R. E., Corey, R. B., and Pauling, L. (1955a). Acta Crystallogr. 8, 710-715.
Marsh, R. E., Corey, R. B., and Pauling, L. (1955b). Biochim. Biophys. Acta 16, 1-34.
Martin, R. B. (1974). J. Phys. Chem. 78, 855-856.
Masuda, Y., Fukushima, K., Fujii, T., and Miyazawa, T. (1969). Biopolymers 8, 91-99.
Maxfield, F. R., Bandekar, J., Krimm, S., Evans, D. J., Leach, S. J., Némethy, G., and Scheraga, H. A. (1981). Macromolecules 14, 997-1003.
Meyer, K. H., and Mark, H. F. (1928). Ber. Dtsch. Chem. Ges. B61, 1932-1936.
Miyazawa, T. (1958). J. Chem. Phys. 29, 246-248.
Miyazawa, T. (1960a). J. Chem. Phys. 32, 1647-1652.
Miyazawa, T. (1960b). J. Mol. Spectrosc. 4, 168-172.
Miyazawa, T. (1961a). Bull. Chem. Soc. Jpn. 34, 691-696.
Miyazawa, T. (1961b). J. Chem. Phys. 35, 693-713.
Miyazawa, T. (1961c). J. Polymer Sci. 55, 215-231.
Miyazawa, T. (1962). In “Polyamino Acids, Polypeptides, and Proteins” (M. A. Stahmann, ed.), pp. 201-217. University of Wisconsin Press, Madison.
Miyazawa, T. (1967). In “Poly-α-Amino Acids” (G. D. Fasman, ed.), pp. 69-103. Dekker, New York.
Miyazawa, T., and Blout, E. R. (1961). J. Am. Chem. Soc. 83, 712-719.
Miyazawa, T., Shimanouchi, T., and Mizushima, S. (1958). J. Chem. Phys. 29, 611-616.
Miyazawa, T., Ideguchi, Y., and Fukushima, K. (1963). J. Chem. Phys. 38, 2709-2720.
Moore, W. H., and Krimm, S. (1975). Proc. Natl. Acad. Sci. U.S.A. 72, 4933-4935.
Moore, W. H., and Krimm, S. (1976a). Biopolymers 15, 2439-2464.
Moore, W. H., and Krimm, S. (1976b). Biopolymers 15, 2465-2483.
Nagaraj, R., Shamala, N., and Balaram, P. (1979). J. Am. Chem. Soc. 101, 16-20.
Naik, V. M., and Krimm, S. (1984a). Int. J. Pept. Protein Res. 23, 1-24.
Naik, V., and Krimm, S. (1984b). Biophys. J. 45, 109-112.
Naik, V., and Krimm, S. (1984c). Biochem. Biophys. Res. Commun. 125, 919-925.
Naik, V., and Krimm, S. (1986a). Biophys. J. 49, 1131-1145.
Naik, V., and Krimm, S. (1986b). Biophys. J. 49, 1147-1154.
Naik, V. M., Bandekar, J., and Krimm, S. (1980). Proc. Int. Conf. Raman Spectrosc. 7th, pp. 596-597.
Naik, V. M., Krimm, S., Denton, J., Némethy, G., and Scheraga, H. A. (1984). Int. J. Pept. Protein Res. 24, 613-626.
Nakamoto, K., Margoshes, M., and Rundle, R. E. (1955). J. Am. Chem. Soc. 77, 6480-6486.
Nambudripad, R., Bansal, M., and Sasisekharan, V. (1981). Int. J. Pept. Protein Res. 18, 374-382.
Neel, J. (1972). Pure Appl. Chem. 31, 201-225.
Némethy, G., and Printz, M. P. (1972). Macromolecules 5, 755-758.
Némethy, G., Phillips, D. C., Leach, S. J., and Scheraga, H. A. (1967). Nature (London) 214, 363-365.
Némethy, G., McQuie, J. R., Pottle, M. S., and Scheraga, H. A. (1981). Macromolecules 14, 975-985.
Neto, N., Taddei, G., Califano, S., and Walmsley, S. H. (1976). Mol. Phys. 31, 457-468.
Padden, F. J., and Keith, H. D. (1965). J. Appl. Phys. 36, 2987-2995.
Parker, F. S. (1983). “Applications of Infrared, Raman, and Resonance Raman Spectroscopy in Biochemistry.” Plenum, New York.
Paterson, Y., Rumsey, S. M., Benedetti, E., Némethy, G., and Scheraga, H. A. (1981). J. Am. Chem. Soc. 103, 2947-2955.
Pauling, L., and Corey, R. B. (1951a). Proc. Natl. Acad. Sci. U.S.A. 37, 235-240.
Pauling, L., and Corey, R. B. (1951b). Proc. Natl. Acad. Sci. U.S.A. 37, 241-250.
Pauling, L., and Corey, R. B. (1951c). Proc. Natl. Acad. Sci. U.S.A. 37, 729-740.
Pauling, L., and Corey, R. B. (1953a). Proc. R. Soc. Ser. B 141, 21-33.
Pauling, L., and Corey, R. B. (1953b). Proc. Natl. Acad. Sci. U.S.A. 39, 253-256.
Pauling, L., and Wilson, E. B., Jr. (1935). “Introduction to Quantum Mechanics.” McGraw-Hill, New York.
Pauling, L., Corey, R. B., and Branson, H. R. (1951). Proc. Natl. Acad. Sci. U.S.A. 37, 205-211.
Perly, B., Helbecque, N., Forchioni, A., and Loucheux-Lefebvre, M.-H. (1983). Biopolymers 22, 1853-1868.
Person, W., and Zerbi, G. (1982). “Vibrational Intensities in Infrared and Raman Spectroscopy.” Elsevier, Amsterdam.
Perutz, M. F. (1951). Nature (London) 167, 1053-1054.
Perutz, M. F., Muirhead, H., Cox, J. M., and Goaman, L. C. G. (1968). Nature (London) 219, 131-139.
Petcher, T. J., Weber, H.-P., and Rüegger, A. (1976). Helv. Chim. Acta 59, 1480-1489.
Peticolas, W. L. (1979). In “Methods in Enzymology” (S. Colowick and N. Kaplan, eds.), Vol. 61, pp. 425-458. Academic Press, New York.
Pezolet, M., Pigeon-Gosselin, M., and Coulombe, L. (1976). Biochim. Biophys. Acta 453, 502-512.
Pimentel, G. C., and Sederholm, C. H. (1956). J. Chem. Phys. 24, 639-641.
Piseri, L., and Zerbi, G. (1968). J. Chem. Phys. 48, 3561-3572.
Prasad, B. V. V., and Chandrasekharan, R. (1977). Int. J. Pept. Protein Res. 10, 129-138.
Prasad, B. V. V., and Sasisekharan, V. (1979). Macromolecules 12, 1107-1110.
Prasad, B. V. V., Shamala, N., Nagaraj, R., Chandrasekharan, R., and Balaram, P. (1979). Biopolymers 18, 1635-1646.
Prasad, B. V. V., Shamala, N., Nagaraj, R., and Balaram, P. (1980). Acta Crystallogr. Sect. B 36, 107-110.
Pullman, B., and Pullman, A. (1974). Adv. Protein Chem. 28, 347-526.
Rabolt, J. F., Moore, W. H., and Krimm, S. (1977). Macromolecules 10, 1065-1074.
Raghavendra, K., and Sasisekharan, V. (1979). Int. J. Pept. Protein Res. 14, 326-338.
Ramachandran, G. N., and Chandrasekharan, R. (1972). Indian J. Biochem. Biophys. 9, 1-11.
Ramachandran, G. N., and Kartha, G. (1955). Nature (London) 176, 593-595.
Ramachandran, G. N., and Sasisekharan, V. (1965). Biochim. Biophys. Acta 109, 314-316.
Ramachandran, G. N., and Sasisekharan, V. S. (1968). Adv. Protein Chem. 24, 283-438.
Ramachandran, G. N., Sasisekharan, V., and Ramakrishnan, C. (1966a). Biochim. Biophys. Acta 112, 168-170.
Ramachandran, G. N., Venkatachalam, C. M., and Krimm, S. (1966b). Biophys. J. 6, 849-872.
Ramachandran, G. N., Ramakrishnan, C., and Venkatachalam, C. M. (1967). In “Conformation of Biopolymers” (G. N. Ramachandran, ed.), Vol. 2, pp. 429-438. Academic Press, New York.
Rao, Ch. P., Nagaraj, R., Rao, C. N. R., and Balaram, P. (1980). Biochemistry 19, 425-431.
Rao, S. N., and Parthasarathy, R. (1973). Acta Crystallogr. Sect. B 29, 2379-2388.
Reed, L. L., and Johnson, P. L. (1973). J. Am. Chem. Soc. 95, 7523-7524.
Rey-Lafon, M., Forel, M. T., and Garrigou-Lagrange, C. (1973). Spectrochim. Acta Part A 29, 471-486.
Richardson, J. S. (1981). Adv. Protein Chem. 34, 167-339.
Richardson, J. S., Richardson, D. C., Thomas, K. A., Silverton, E. W., and Davies, D. R. (1976). J. Mol. Biol. 102, 221-235.
Rossman, M. G., and Argos, P. (1975). J. Biol. Chem. 250, 7525-7532.
Salemme, F. R. (1983). Prog. Biophys. Mol. Biol. 42, 95-133.
Sandeman, I. (1955). Proc. R. Soc. London Ser. A 232, 105-113.
Sasaki, S., Yasumoto, Y., and Uematsu, E. (1981). Macromolecules 14, 1797-1801.
Sasisekharan, V. (1959). Acta Crystallogr. 12, 897-903.
Schachtschneider, J. H., and Snyder, R. G. (1963). Spectrochim. Acta 19, 117-168.
Scheraga, H. A. (1968). Adv. Phys. Org. Chem. 6, 103-184.
Schrader, B. (1978). In “Molecular Spectroscopy, A Specialist Report,” Vol. 5, pp. 235-270. The Chemical Society, London.
Schroeder, R., and Lippincott, E. R. (1957). J. Phys. Chem. 61, 921-928.
Schuster, P., Zundel, G., and Sandorfy, C. (1976). “The Hydrogen Bond.” North-Holland Publ., Amsterdam.
Sengupta, P., and Krimm, S. (1985). Biopolymers 24, 1479-1491.
Sengupta, P., and Krimm, S. (1987). Biopolymers, in press.
Sengupta, P., Krimm, S., and Hsu, S. L. (1984). Biopolymers 23, 1565-1594.
Shamala, N., Nagaraj, R., and Balaram, P. (1977). Biochem. Biophys. Res. Commun. 79, 292-298.
Shamala, N., Nagaraj, R., and Balaram, P. (1978). J. Chem. Soc., Chem. Commun., pp. 996-997.
Shefter, E. (1970). J. Chem. Soc. B, pp. 903-906.
Siamwiza, M. N., Lord, R. C., Chen, M. C., Takamatsu, T., Harada, I., Matsuura, H., and Shimanouchi, T. (1975). Biochemistry 14, 4870-4876.
Simons, L., Bergström, G., Blomfelt, G., Forss, S., Stenbach, H., and Wansen, G. (1972). Commentat. Phys. Math. 42, 125-207.
Singh, R. D., and Gupta, V. D. (1971). Spectrochim. Acta Part A 27, 385-393.
Sippl, M. J., Némethy, G., and Scheraga, H. A. (1984). J. Phys. Chem. 88, 6231-6233.
Slater, J. C., and Kirkwood, J. G. (1931). Phys. Rev. 37, 682-697.
Small, D., Chou, P. Y., and Fasman, G. D. (1977). Biochem. Biophys. Res. Commun. 79, 341-346.
Small, E. W., Fanconi, B., and Peticolas, W. L. (1970). J. Chem. Phys. 52, 4369-4379.
Smith, G. D., and Griffin, J. F. (1978). Science 199, 1214-1216.
Smith, G. D., Duax, W. L., Czerwinski, E. W., Kendrick, N. E., Marshall, G. R., and Matthews, F. S. (1977). Pept. Proc. Am. Pept. Symp. 5th, pp. 277-279.
Smith, G. D., Pletnev, V. Z., Duax, W. L., Balasubramanian, T. M., Bosshard, H. E., Czerwinski, E. W., Kendrick, N. E., Matthews, F. S., and Marshall, G. R. (1981). J. Am. Chem. Soc. 103, 1493-1501.
Smith, J. A., and Pease, L. G. (1980). CRC Crit. Rev. Biochem. 8, 315-399.
Smith, M., Walton, A. G., and Koenig, J. L. (1969). Biopolymers 8, 29-43.
Soman, K. V., and Ramakrishnan, C. (1983). J. Mol. Biol. 170, 1045-1048.
Spiro, T. G., and Gaber, B. P. (1977). Annu. Rev. Biochem. 46, 553-572.
Srinivasan, R., Balasubramanian, R., and Rajan, S. S. (1976). Science 194, 720-721.
Stepanyan, S. A., and Gribov, L. A. (1979). Opt. Spektrosk. 47, 291-296. [Opt. Spectrosc. (Engl. Transl.) 47, 165-167.]
Suezaki, Y., and Gō, N. (1975). Int. J. Pept. Protein Res. 7, 333-334.
Sugeta, H. (1975). Spectrochim. Acta Part A 31, 1729-1737.
Sugeta, H., Go, A., and Miyazawa, T. (1972). Chem. Lett., pp. 83-86.
Sugeta, H., Go, A., and Miyazawa, T. (1973). Bull. Chem. Soc. Jpn. 46, 3407-3411.
Sutherland, G. B. B. M. (1952). Adv. Protein Chem. 7, 291-318.
Sutor, D. J. (1963). J. Chem. Soc., pp. 1105-1110.
Suzuki, E. (1967). Spectrochim. Acta Part A 23, 2303-2308.
Suzuki, S., Iwashita, Y., Shimanouchi, T., and Tsuboi, M. (1966). Biopolymers 4, 337-350.
Sychev, S. V., Nevskaya, N. A., Jordanov, S. J., Shepel, E. N., Miroshnikov, A. I., and Ivanov, V. T. (1980). Bioorg. Chem. 9, 121-151.
Tadokoro, H. (1960). J. Chem. Phys. 33, 1558-1567.
Takeda, Y. (1975). Biopolymers 14, 891-893.
Tanaka, I., and Ashida, T. (1980). Acta Crystallogr. Sect. B 36, 2164-2167.
Tasumi, M., Takeuchi, H., Ataka, S., Dwivedi, A. M., and Krimm, S. (1982). Biopolymers 21, 711-714.
Taylor, R., and Kennard, O. (1982). J. Am. Chem. Soc. 104, 5063-5070.
Taylor, R., Kennard, O., and Versichel, W. (1983). J. Am. Chem. Soc. 105, 5761-5766.
Taylor, R., Kennard, O., and Versichel, W. (1984a). J. Am. Chem. Soc. 106, 244-248.
Taylor, R., Kennard, O., and Versichel, W. (1984b). Acta Crystallogr. Sect. B 40, 280-288.
Tiffany, M. L., and Krimm, S. (1968). Biopolymers 6, 1379-1382.
Tipping, M., Viras, K., and King, T. A. (1984). Biopolymers 23, 2891-2899.
Traub, W., and Shmueli, U. (1963). In “Aspects of Protein Structure” (G. N. Ramachandran, ed.), pp. 81-92. Academic Press, New York.
Tsuboi, M. (1964). Biopolym. Symp., No. 1, 527-547.
Tsuboi, M. (1977). In “Vibrational Spectroscopy-Modern Trends” (A. J. Barnes and W. J. Orville-Thomas, eds.), pp. 405-412. Elsevier, Amsterdam.
Ueki, T., Ashida, T., Kakudo, M., Sasada, Y., and Katsube, Y. (1969). Acta Crystallogr. Sect. B 25, 1840-1849.
Uno, T., Machida, K., and Saito, Y. (1969). Bull. Chem. Soc. Jpn. 42, 897-904.
Uno, T., Machida, K., and Saito, Y. (1971). Spectrochim. Acta Part A 27, 833-844.
Urry, D. W. (1971). Proc. Natl. Acad. Sci. U.S.A. 68, 672-676.
Urry, D. W., Mitchell, L. W., Ohnishi, T., and Long, M. M. (1975). J. Mol. Biol. 96, 101-117.
van Wart, H. E., and Scheraga, H. A. (1976a). J. Phys. Chem. 80, 1812-1823.
van Wart, H. E., and Scheraga, H. A. (1976b). J. Phys. Chem. 80, 1823-1832.
van Wart, H. E., and Scheraga, H. A. (1978). In “Methods in Enzymology” (S. Colowick and N. Kaplan, eds.), Vol. 44, pp. 124-149. Academic Press, New York.
van Wart, H. E., Lewis, A., Scheraga, H. A., and Saeva, F. D. (1973). Proc. Natl. Acad. Sci. U.S.A. 70, 2619-2623.
van Wart, H. E., Shipman, L. L., and Scheraga, H. A. (1975a). J. Phys. Chem. 79, 1428-1435.
van Wart, H. E., Shipman, L. L., and Scheraga, H. A. (1975b). J. Phys. Chem. 79, 1436-1447.
van Wart, H. E., Scheraga, H. A., and Martin, R. B. (1976). J. Phys. Chem. 80, 1832.
Veatch, W. R., Fossel, E. T., and Blout, E. R. (1974). Biochemistry 13, 5249-5256.
Venkatachalam, C. M. (1968a). Biochim. Biophys. Acta 168, 411-416.
Venkatachalam, C. M. (1968b). Biopolymers 6, 1425-1436.
Warshel, A., Levitt, M., and Lifson, S. (1970). J. Mol. Spectrosc. 33, 84-99.
Williams, D. E. (1981). Top. Curr. Phys. 26, 3-40.
Williams, R. W. (1983). J. Mol. Biol. 166, 581-603.
Williams, R. W., and Dunker, A. K. (1981). J. Mol. Biol. 152, 783-813.
Wilson, E. B., Decius, J. C., and Cross, P. C. (1955). “Molecular Vibrations.” McGraw-Hill, New York.
Winkler, F. K., and Dunitz, J. D. (1971). J. Mol. Biol. 59, 169-182.
Woodward, L. A. (1972). “Introduction to the Theory of Molecular Vibrations and Vibrational Spectroscopy.” Oxford University Press, London and New York.
Yamada, Y., Tanaka, I., and Ashida, T. (1980). Acta Crystallogr. Sect. B 36, 331-335.
Yamane, T., Shiraishi, Y., and Ashida, T. (1985). Acta Crystallogr. Sect. B 41, 946-950.
Yu, N.-T. (1974). J. Am. Chem. Soc. 96, 4664-4668.
Yu, N.-T., Liu, C. S., and O’Shea, D. C. (1972). J. Mol. Biol. 70, 117-132.
Yu, N.-T., Jo, B. H., Chang, R. C. C., and Huber, J. D. (1974). Arch. Biochem. Biophys. 160, 614-622.
Yu, T. J., Lippert, J. L., and Peticolas, W. L. (1973). Biopolymers 12, 2161-2176.
Zak, J. (1975). In “Lattice Dynamics and Intermolecular Forces” (S. Califano, ed.). Academic Press, New York.
Zak, J., Asher, A., Glück, M., and Gur, Y. (1969). “The Irreducible Representations of Space Groups.” Benjamin, New York.
Zerbi, G. (1977). In “Vibrational Spectroscopy-Modern Trends” (A. J. Barnes and W. J. Orville-Thomas, eds.), pp. 261-284. Elsevier, Amsterdam.
Zimmerman, S. S., and Scheraga, H. A. (1977). Biopolymers 16, 811-843.
Zimmerman, S. S., Clark, J. C., and Mandelkern, L. (1975). Biopolymers 14, 585-596.
AUTHOR INDEX
Numbers in italics refer to the pages on which the complete references are listed.
A
Abbate, S., 331, 357
Abdel-Meguid, S. S., 80, 103, 107
Abe, Y., 184, 212, 219, 230, 233, 280, 353, 354, 359
Adair, G. S., 31, 42, 66
Adams, G. A., 173, 174
Admiraal, G., 229, 354
Akiyama, Y., 139, 174
Alberts, A. W., 130, 180
Alberts, B. M., 74, 75, 76, 77, 82, 85, 86, 90, 93, 94, 95, 96, 105
Allinger, N. L., 207, 356
Amar-Costesec, A., 136, 137, 174, 177
Ambros, V., 92, 103
Amos, L. A., 43, 51, 66
Amphlett, G. W., 13, 24, 28, 41, 49, 60, 62
Anba, J., 149, 178
Andersson, T., 16, 60
Andrews, D. W., 133, 174
Apirion, D., 141, 177
Applebaum, S. W., 118, 177
Arentzen, R., 78, 83, 93, 107
Argos, P., 297, 362
Arkhipova, S. F., 298, 359
Arnott, S., 211, 229, 238, 239, 254, 257, 258, 259, 354, 355
Asai, K., 74, 93, 104
Asakura, S., 44, 64
Asano, A., 6, 65
Ashe, B. M., 130, 180
Asher, A., 225, 364
Ashida, T., 229, 306, 355, 363, 364
Astbury, W. T., 229, 230, 355
Ataka, S., 341, 346, 347, 348, 363
Aubert, J.-P., 297, 355
Austen, B. M., 120, 126, 137, 154, 155, 166, 167, 171, 174, 179
Aviv, H., 111, 179
Azuma, T., 321, 322, 359
B
Bailey, K., 7, 31, 33, 42, 60, 66
Bailey, S. C., 141, 177
Bailin, G., 15, 60
Baker, E. N., 210, 298, 355
Bakker, E. P., 129, 151, 174
Balaram, P., 257, 270, 317, 318, 358, 360, 361, 362
Balasubramanian, R., 210, 258, 355, 356, 363
Balasubramanian, T. M., 257, 270, 362
Baldi, M. I., 94, 103
Baldwin, J. P., 257, 355
Baldwin, R. L., 70, 71, 106
Baltimore, D., 92, 103
Bamford, C. H., 184, 230, 238, 261, 275, 355
Bandekar, J., 298, 299, 300, 301, 304, 305, 306, 308, 309, 310, 312, 313, 314, 315, 316, 317, 318, 319, 321, 322, 323, 324, 325, 326, 345, 354, 355, 359, 360
Bankaitis, V. A., 122, 137, 140, 150, 153, 169, 174, 179
Bansal, M., 257, 360
Barr, P. J., 118, 153, 179
Barry, C. D., 41, 49, 63
Bartlett, S. G., 148, 176
Bassford, P. J., 115, 122, 127, 137, 140, 149, 153, 169, 174, 179
Bassüner, R., 118, 174
Bastian, E. J., 345, 355 Bates, M., 147, 180 Baty, D., 116, 153, 174 Baudras, A., 255, 359 Bauer, W. R., 70, 94, 103, 104 Baybutt, R. B., 257, 360 Beaudette, N. V., 154, 155, 179 Bebrin, W. R., 107 Becherer, K., 102, 107 Bechtel, P. J., 5, 60 Beckwith, J., 115, 123, 127, 137, 138, 139, 140, 141, 174, 175, 176, 177, 178
Bedoulle, H., 119, 121, 127, 156, 174 Bedwell, D. M., 139, 175 Beely, J. G., 297, 355 Been, M. D., 84, 88, 89, 92, 103 Bellamy, L. J., 225, 342, 344, 345, 355 Bendzko, P., 137, 158, 166, 174 Benedetti, E., 270, 271, 288, 355, 361 Benedetti, P., 94, 103 Benjamin, H. W., 105 Benson, S. A., 115, 123, 125, 131, 140, 141, 149, 169, 171, 175, 179 Beppu, N., 132, 176 Bergstrom, G., 261, 362 Bergstrom, R., 17, 49, 66 Berman, M. L., 139, 140, 179 Bernadac, A., 149, 178 Bernstein, F. C., 257, 297, 300, 346, 355 Berzofsky, J. A., 153, 175 Better, M., 79, 103 Bielka, H., 133, 177 Biserte, G., 297, 355 Blobel, G., 110, 111, 112, 116, 118, 123, 129, 131, 133, 134, 135, 136, 141, 142, 143, 147, 148, 149, 150, 168, 173, 175, 176, 177, 178, 179, 180 Blomberg, C., 144, 180 Blomfelt, G., 261, 362 Blout, E. R., 212, 255, 288, 331, 359, 360, 364 Blundell, T., 319, 355 Bochkareva, E. S., 133, 177 Bode, M., 321, 357 Bogdanov, M. V., 145, 147, 175 Böhni, P., 119, 148, 152, 176 Boime, I., 123, 125, 150, 176 Bole, D. G., 129, 151, 175 Bonven, B. J., 88, 103, 104 Boquet, P.-L., 141, 175
Born, M., 201, 355 Bosch, R., 270, 355 Bosi, P., 331, 357 Bosshard, H. E., 257, 270, 362 Botstein, D., 112, 146, 175, 177 Bott, K. F., 90, 106 Bougis, P., 161, 175 Boutin, H., 232, 233, 358 Boyd, D. B., 355 Bradbury, E. M., 195, 197, 232, 257, 355, 356
Braell, W. A., 116, 175 Bragg, L., 256, 257, 297, 322, 356 Branson, H. R., 203, 210, 256, 258, 361 Brant, D. A., 209, 356 Brazhnikov, E. V., 331, 356 Brekke, C., 27, 62 Bremel, R. D., 35, 60 Brennan, M. D., 118, 130, 175 Brice, M. D., 257, 297, 300, 346, 355 Brickman, E. R., 138, 140, 175 Briggs, M. S., 154, 155, 156, 157, 158, 159, 160, 161, 163, 164, 165, 175 Brillouin, L., 200, 356 Brockman, H. L., 159, 161, 178 Bronson, D. D., 31, 60 Brooks, B., 208, 210, 349, 353, 356 Brosius, J., 117, 180 Brougham, M. J., 94, 106 Brown, L., 230, 238, 257, 258, 355, 356 Brown, P. A., 126, 175 Brown, P. O., 75, 81, 84, 85, 90, 93, 94, 95, 99, 100, 101, 103, 106, 107 Brownlee, G. G., 111, 178 Bruccoleri, R. E., 208, 349, 353, 356 Brutlag, D., 93, 96, 97, 104, 106 Buchanan, J. M., 104 Bullard, B., 10, 13, 17, 24, 29, 39, 62, 64 Burgess, R. R., 88, 103 Burkert, W., 207, 356 Burstein, Y., 167, 177 Busch, H., 98, 104, 105 Bystrov, V. F., 298, 356
C Califano, S., 185, 191, 201, 207, 215, 356, 361 Camerman, A., 229, 357
AUTHOR INDEX Camerman, N., 229,357 Cannon, L. E., 112,178 Cant, E., 275,355 Carew, E. B.,20,60 Carlson, M., 112,175 Carpenter, B. G.,257,356 Carpenter, M. R., 24, 27,49,65 Carvell, M.,59,63 Cashmore, A. R., 115, 173,180 Caspar, D.L. D., 33,60,61 Caulfield, M. P., 142,I75 Cerretti, D. P.,139,175, 177 Chaidez, J., 115, 153, 177 Chalovich, J. M., 51,60 Champion, J., 118,177 Champoux, J. J., 76,84,88,89,91,92, 97,99,103,105 Chan, S.J., 171,I79 Chandrasekharan, R.,288,298, 317,356, 361 Chang, C. N., 118, 133,178 Chang, H., 31 1,358 Chang, R. C. C., 318,319,320,321,364 Chang, S.,117, 144,176 Chang, S.-Y., 117, 144,I76 Charles, A. D., 115,177 Charlton, S. C., 16,62 Chatterjee, A.,229,356 Cheam, T.C., 195, 197, 198,205,209, 212,215, 224,233,238,328,330, 335,342,350,351, 353,356 Chen, G.L.,83,85,91,93,103, 105, 107 Chen, L.,130, 141, 149, 151, I75 Chen, M.C., 261,318,321,338,345, 356,362 Cheung, W. Y., 41,60 Chidambaram, R., 210,355,356 Chirgadze, Yu. N., 215, 331,356 Chong, P.C. S., 14,28,29,30,35, 60,61 Chothia, C., 229,356 Chou, K.-C., 229,356 Chou, P. Y.,126, 127,175, 297,298,318, 356,362 Chowdry, V., 78,83,93,107 Chua, N.-H., 148,I76 Chung, S., 79,105 Clare, D. M.,19,65 Clark, J. C., 255,364 Clement, J.-M., 125, 127,176 Coffman, D. M. D., 20,63
Cohen, C., 27,31,33,44,47,60,61,63, 67 Colacicco, G., 159,175 Cole, H. A., 10,24,27,37,64,65 Coleman, J., 112,175 Collins, J. H., 17,49,61 Colman, A., 118,177 Colman, P.,321,357 Colonna-Cesari, F., 230, 232,242, 253, 258,260,288,289,356,360 Cwke, C. A., 83,104 Copeland, B. R., 129,178 Corey, R. B., 203,229,230, 238, 242, 256,25a,356,360,361 Cornell, D. G., 158, 163, 164, 165,175 Couch, J., 64 Coulombe, L.,184,361 Coussens, L., 135, 136,177 Coutelle, Ch., 117, 171,176 Cowan, P. M.,258,356 Cox, J. M.,257,361 Cozzarelli, N. R., 70,75, 78, 79,80,81, 82,83,84,85,87,89,90,91,93,94, 95,96,97,99, 100, 101, 103, 104, 105,106,107 Craig, N. L., 79,84,89,I03 Craig, R.,3, 61,118,177 Craig, S. W.,5,65 Craig, W.S., 318,321,356 Crane, H. R.,256,356 Crick, F. H. C., 70,103, 257,275,357 Cross, P. C., 185, 188, 191,207,225,364 Cruse, W.B. T., 229,357 Cummins, P., 31,61 Cusack, S.,347,358 Czerwinski, E. R.,257,362
D Dalbey, R., 132,175 Dalgliesh, C.E., 230,355 Daniels, C. J., 129, 151,175 Darby, M. K., 86,103 Darkin, S., 86,I05 Darmon, S. E., 230,355 Dassa, E., 141,I75 Date, T.,151,175 Dauber, P.,208,209, 210,358,359 Davies, D. R.,256,297,359,362
Davis, B. D., 129, 130, 141, 142, 149, 150, 151, 152, 157, 167, 168, 175, 179 Davis, J. L., 87, 88, 92, 93, 104 Davis, N. G., 173, 175 Dawson, R. M. C., 161, 179 Dean, D., 139, 175 Dean, F. B., 78, 83, 87, 99, 100, 103 Decius, J. C., 185, 188, 191, 201, 207, 225, 357, 364 Dedman, J. R., 13, 41, 61 DeFoor, P., 148, 178 DeGrado, W. F., 157, 159, 161, 175 DeHaas, G. H., 158, 177 Dellepiane, G., 331, 357 Demel, R. A., 161, 177, 180 Dennis, J. E., 4, 61 Denton, J., 316, 345, 360 Depew, R. E., 70, 78, 83, 86, 96, 103 Der, r. K., 132, 178 Desiderio, S. V., 92, 103 Desmeules, P. J., 184, 359 Dial&, V., 34, 66 Dianoux, A. J., 347, 358 DiBlasio, B., 288, 355 DiCarlo, E. D., 133, 176 Dickerson, R. E., 256, 359 DiDonato, J. A., 94, 107 Dierstein, R., 153, 175 Ding, J., 118, 175 DiRienzo, J. M., 130, 157, 176 Dluhy, R. A., 158, 163, 164, 165, 175 Dobberstein, B., 111, 129, 134, 135, 143, 148, 175, 178 Dobrovolskii, A. B., 44, 62 Dodson, G., 319, 355 Doetschman, T. C., 3, 62 Donohue, J., 211, 256, 257, 258, 270, 271, 357 dos Remedios, C. G., 52, 61 Dover, S. D., 211, 229, 238, 239, 254, 257, 258, 259, 354, 355 Downie, A. R., 257, 356 Drabikowski, W., 13, 18, 19, 21, 22, 23, 30, 34, 42, 61, 62, 63, 64 Drakenberg, T., 16, 60 Drlica, K., 70, 104 Duax, W. L., 257, 362 Duguet, M., 74, 105 Dulbecco, R., 76, 99, 103 Duncan, J. L., 207, 215, 357
Duncan, M. C., 150, 179 Dungan, J. M., 105 Dunitz, J. D., 203, 364 Dunker, A. K., 184, 345, 357, 364 Durban, E., 98, 104, 105 Dwivedi, A. M., 184, 193, 195, 204, 211, 219, 229, 232, 234, 240, 241, 253, 257, 259, 260, 261, 268, 271, 272, 273, 275, 276, 280, 281, 290, 310, 323, 326, 329, 341, 346, 347, 348, 357, 359, 363 Dybal, J., 353, 357 Dyson, P., 79, 80, 105
E Earnshaw, W. C., 83, 104 Eaton, B. L., 34, 37, 61 Ebashi, F., 4, 7, 15, 36, 41, 61 Ebashi, S., 4, 7, 8, 9, 10, 11, 12, 15, 21, 23, 24, 30, 34, 36, 37, 39, 40, 41, 42, 44, 61, 63, 64, 65, 66 Echols, H., 79, 103, 105 Edge, M. D., 115, 177 Edwards, K. A., 87, 88, 92, 93, 94, 104 Egert, E., 229, 357 Eguchi, G., 4, 52, 64 Eisenberg, E., 34, 37, 51, 60, 61, 67 Elliott, A., 184, 195, 197, 229, 230, 232, 238, 239, 240, 241, 254, 257, 258, 259, 261, 275, 277, 355, 356, 357 Emr, S. D., 123, 125, 127, 128, 131, 139, 140, 141, 149, 150, 156, 169, 171, 176, 179 Endo, T., 7, 10, 41, 42, 61 Enequist, H. G., 151, 176 Engelman, D. M., 116, 122, 129, 144, 150, 176, 347, 358 England, P. J., 15, 61, 65 Englund, P. T., 75, 77, 96, 105 Enquist, L. W., 79, 106 Epp, O., 321, 357 Eppenberger, H. M., 3, 62, 66 Erickson, L. C., 86, 107 Evans, D. J., 312, 313, 314, 315, 316, 317, 355, 360 Evans, J. S., 22, 61 Evans, R. R., 6, 61
F Fanconi, B., 198, 200, 201, 232, 233, 238, 240, 242, 261, 277, 279, 280, 281, 357,362 Fasman, G. D., 126, 127, 154, 155, 156, 175, 176, 179, 255, 297, 298, 309, 318, 322,356,357,358,362 Fawcett, J. K., 229,357 Felhammer, H., 321,357 Felsenfeld, G., 80, 81, 105 Fendler, J. H., 159, 176 Feng, I. M., 4, 64 Fennick, B., 167, 177 Ferenci, T., 153, 176 Fernandes, R., 16,63 Ferro, A. M., 98,104 Ferro-Novick, S., 137, 139, 176 Field, C., 137, 139, 176, 178 Fiil, N., 139, 140, 179 Fikes, J. D., 150, 179 Finch, J. T., 43, 66 Fischman, D. A., 4, 61 Fisher, L. M., 75, 77, 81, 85, 89, 90, 94, 101,104,106 Fleischer, S., 148, 178 Fleury, G., 298, 299, 312, 344,359 Flicker, P. F., 47, 61 Flippen, J. L., 298,357 Flory, P. J., 209,356 Foglesong, P. D., 94, 104 Forchioni, A., 312,361 Forel, M. T., 193, 197, 335, 336,362 Forsen, S.,16, 60 Forss, S., 261, 362 Forterre, P., 74, 105 Fossel, E. T., 288, 364 Fournier, M. J., 141, 177 Fowler, A. V., 115, 127, 174, 178 Fox, R. O., Jr., 270,357 Foxman, B. M., 255,358 Franchesini, T., 120,176 Franke, W. W., 6, 61 Franzini-Armstrong, C., 45, 61 Fraser, R. D. B., 229, 242, 253, 257, 329, 330, 333,356,357 Fried, V. A., 159,179 Friedlander, M., 112, 173, 176 Frushour, B. G., 225, 240, 253,261,318, 321,331,338,357,359
Fujii, K., 54, 61 Fujii, T., 10, 11, 37, 61, 240, 360 Fujimoto, Y., 130, 181, 176 Fukushima, K., 198, 201, 283,240,358, 360 Fukuzawa, T., 7, 64 Fuller, F. B., 70, 74, 104 Funatsu, T., 4, 61 Furie, B., 153, 177 Furie, B. C., 141, I77 Furlong, D., 142, 175
G Gabay, J., 120, 176 Gaber, B. P., 225, 318, 320, 321, 342, 356, 363 Gagnon, J., 116, 178 Gans, P., 215, 358 Garcia, P. D., 135, 136, 148, 152, 176, 177 Garrigou-Lagrange, C., 193, 197, 335, 336, 362 Garwin, J. L., 138, 140, 175 Gasser, S., 119, 148, 152, 176 Gautier, A. E., 115, 177 Geiger, B., 5, 61 Geisler, N., 6, 61 Geisow, M., 297, 358 Gellert, M., 70, 74, 75, 77, 81, 82, 84, 85, 88, 89, 90, 93, 94, 95, 96, 101, 102, 104, 105 Gergely, J., 8, 10, 12, 13, 15, 16, 18, 19, 20, 21, 22, 23, 27, 30, 36, 37, 47, 48, 60, 61, 62, 63, 64, 65 Gerlt, J. A., 118, 153, 179 Gerrard, S. P., 70, 85, 101, 103, 106 Giam, C.-Z., 117, 144, 150, 176, 180 Giannoni, G., 254, 358, 359 Gierasch, L. M., 154, 155, 156, 157, 158, 159, 161, 163, 164, 165, 175 Gilbert, W., 118, 179, 180 Gillis, J. M., 64 Gilmore, R., 111, 129, 135, 136, 147, 168, 176, 180 Gilmour, D., 52, 61 Girshovich, A. S., 133, 177 Gluck, M., 225, 364 Glusker, J. P., 229, 360 Go, A., 344, 345, 346, 363 Gō, N., 347, 349, 358, 363
Goaman, L. C. G., 257, 361 Gocke, E., 88, 103, 104 Goldberg, A. R., 98, 107 Goldstein, H., 358 Goldstein, M. A., 62 Gomer, R. H., 5, 62 Goodenough, D. A., 118, 178 Goodman, J. M., 149, 151, 175, 176, 180 Goto, T., 102, 104 Grabarek, J., 19, 62 Grabarek, Z., 13, 18, 19, 21, 22, 23, 30, 61, 62, 63 Gramlich, V., 288, 355 Grand, R. J. A., 10, 11, 12, 13, 14, 37, 62, 64, 66, 67 Granger, B. L., 4, 6, 7, 53, 62 Gravenor, R. B., 360 Greaser, M. L., 8, 17, 27, 49, 59, 61, 62, 66, 67 Grebenau, R. C., 136, 177 Green, R. D., 211, 358 Greenville-Wells, H. J., 257, 360 Gribov, L. A., 234, 363 Griffin, J. F., 309, 362 Griffith, J. D., 82, 106 Grindley, N. D. F., 71, 79, 80, 84, 89, 92, 103, 104, 106 Groarke, J. M., 127, 176 Gross, C., 125, 150, 179 Grossman, A. R., 148, 176 Grossman, L., 74, 107 Grove, B. K., 3, 62 Gruen, L. C., 55, 62 Guarente, L., 115, 123, 178 Gumport, R. I., 101, 107 Gundelfinger, E. D., 133, 176 Gupta, V. D., 232, 233, 280, 358, 362 Gur, Y., 225, 364 Gusev, N. B., 24, 62 Guyer, R., 111, 179
H Habener, J. F., 166, 167, 176, 177 Hagler, A. T., 208, 209, 210, 21 1,358, 359 Hahn, V., 117, 171, 176 Halegoua, S., 119, 120, 143, 170, 176 Hall. M. N., 115, 120, 176, 178
Halligan, B. D., 83, 85, 87, 88, 92, 93, 94, 103, 104, 105 Halvorson, H. O., 122, 125, 126, 131, 175, 178 Hamaguchi, K., 321, 322,359 Hamilton, D., 79, 104 Han, S.-L., 309,358 Hanby, W. E., 184, 230, 238, 257, 261, 275,355,356 Handa, S., 4, 52, 64 Hanley-Way, S., 137, 139, 140, 176 Hansen, W., 137, 139, 148, 152,176 Hanson, J., 44, 62 Harada, I., 345,362 Harayama, S., 151, 176 Hardy, S. J. S., 146, 151, 168, 176, 179 Harkins, R. N., 135, 136, 177 Harrison, S. C., 82, 105 Harrison, T. M., 111, 178 Hart, R. G., 256,359 Hartshorne, D. J., 8, 10, 62 Hartwig, J. H., 5, 52, 66 Haselgrove, J. C., 43, 62 Hashimoto, K., 24, 25, 38, 64, 65 Hattori, A., 4, 66 Hay, R., 119, 148, 152, 176 Hayashi, S., 117, 144, 150, 176, 180 Hazelbauer, G. L., 127, 176 Head, J. F., 10, 12, 15, 20, 21, 37, 62, 65 Heck, M. M. S., 83, 104 Hedgpeth, J., 115, 125, 127, 153, 176, 177 Heffron, F., 79, 105 Heitz, F., 258, 260, 288, 289, 356, 358, 360 Helbecque, N., 312, 361 Herasymowych, 0. S., 3, 63 Hermon-Taylor, J., 166, 167, 174 Hermoso, J. M., 92, 104 Herrera-Estrella, L., 115, 173, 180 Herzberg, G., 185, 207, 358 Herzberg, O., 17, 49, 62 Hesselink, F. T., 288, 358 Hexter, R. M., 201, 212,357,358 Hibler, D. W., 118, 153, 179 Higashi, L. S., 345, 346, 358 Higachi-Fujime, S., 33, 34, 62 Higgins, N. P., 81, 82, 84, 91, 93, 95, 97, 98, 104, 106,107 Higgins, S., 118, 177
Higgs, P. W., 198, 199, 200, 358 Higuchi, H., 56, 57, 59, 60, 62, 64 Hill, T. L., 96, 104 Hincke, N. T., 17, 62 Hirst, T. R., 151, 176 Hitchcock, S. E., 10, 12, 13, 14, 15, 21, 23, 29, 30, 31, 37, 62 Hodges, R. S., 11, 12, 14, 19, 28, 29, 30, 31, 35, 60, 61, 65, 66 Hodgkin, D., 319, 355 Hofnung, M., 119, 121, 127, 156, 174 Holloman, W. K., 94, 106 Holroyde, M. J., 15, 62 Honma, M., 139, 176 Hopfinger, A. J., 230, 358 Hordvik, A., 345, 358 Horiuchi, S., 142, 175 Horn, M. J., 17, 49, 61 Horowitz, D. S., 70, 71, 104 Hortin, G., 123, 125, 150, 176 Hortsch, M., 135, 176 Horwitz, J., 10, 13, 24, 29, 39, 62 Houghton, M., 118, 177 Hseu, T. H., 311, 358 Hsieh, T., 85, 90, 93, 96, 104, 106, 107 Hsu, S. L., 234, 254, 255, 269, 329, 335, 340, 344, 353, 358, 362
Hu, D. H., 56, 62 Huang, K., 201, 355 Huang, W. M., 82, 85, 104, 105 Huang, Y. P., 52, 56, 60, 63 Hubbard, B. D., 6, 63 Hubbard, R. E., 210, 298, 355 Huber, J. D., 318, 319, 320, 321, 364 Huber, R., 321, 357 Hudson, B., 76, 104 Huggins, M. L., 256, 270, 297, 358 Hurt, E. C., 115, 173, 176 Hussain, M., 131, 132, 176 Huth, A., 117, 118, 134, 174, 180 Huxley, H. E., 43, 51, 62, 66
I Ibrahimi, I., 118, 123, 133, 135,178,180 Ichihara, S., 132,176 Ideguchi, Y.,198,201,233,358,360 Iida, A.,127,176 Iio, T.,16,62 Iizuka, Y.,155,158,177
Ikeda, H., 90, 104, 106 Ikeya, H., 59, 62 Inouye, M., 112, 118, 119, 120, 126, 131, 143, 153, 157, 170, 175, 176, 179, 180 Inouye, S., 120, 176, 180 Inukai, M., 112, 175 Isenberg, G., 5, 62 Ishii, T., 4, 64 Ishiwata, S., 4, 61 Ishizaki, H., 318, 358 Itakura, K., 120, 176, 180 Ito, H., 120, 180 Ito, K., 137, 139, 141, 147, 174, 176, 177, 178, 179, 180 Itoh, K., 240, 241, 242, 255, 260, 261, 357, 358 Itoh, T., 84, 93, 94, 96, 104 Ittel, M.-E., 98, 104 Ivanov, V. T., 290, 298, 356, 363 Iwai, H., 7, 66 Iwakura, Y., 240, 242, 261, 358 Iwashita, Y., 232, 277, 278, 280, 363
J Jackman, N., 17,49,61 Jackson J. D.,213,358 Jackson, P.,24,28,49,62 Jackson, R. C.,131, 150,177,I79 Jackson, R. L.,161,177 Jacrot, B.,347,358 Jaenisch, R.,77,104 Jain, M. K.,158,177 Jake&J., 184, 193,219,358 James, M.N. G., 17,49,62 James, T. C.,118,177 Javaherian, K.,83, 101,104, 107 Jo,B. H., 318,319,320,321,364 Jockusch, B., 5, 62 Johnson, B. L.,140,177 Johnson, J. D.,15, 16,62 Johnson, P.,24,27,49,65 Johnson, P.L.,310,362 Johnston, N. H., 310,326, 353,354,358 Jongeneel, C.V.,94,105 Jongstra-Bilen, J., 98,104 Jordanov, S.J,, 290,363 Josefsson, L.-G., 149, 150,177 Jung, G., 270,355
K Kiiriiinen, L., 118, 153, 178 Kaderbhai, M. A., 137, 166, 167,174, 179 Kadonaga,J. T., 115, 138, 146, 149,177 Kaiser, E. T., 153, 154, 155, 157, 158, 179, 257,358 Kakiuchi, S., 41, 62 Kakudo, M., 306,363 Kanazawa, H., 150,177 Karle, I., 298, 325, 357, 358 Karplus, M., 210, 349,356 Kartha, G., 258,361 Katabuchi, H., 240,241,358 Katakai, R., 155, 158, 177 Katayama, E., 11, 12, 13, 22, 24, 29, 37, 62, 63 Katsube, Y., 306,363 Katunurna, N., 53, 56, 64 Kaufman, J., 118,180 Kausch, A. P., 115, 173, 180 Kawai, M., 298, 309, 322,358 Kawasaki, I., 90, 104 Kawasaki, Y., 20, 63, 66 Kay, C., 3, 17, 62, 63 Keith, H. D., 254, 275,358, 359,361 Kelly, C. M., 16, 63 Kelly, T. J., 92, 103 Kemper, B., 166,176 Kendrew, J. C., 256, 257, 297,322,356, 359 Kendrick, N. E., 257,362 Kennard, O., 211, 229, 257, 297, 300, 346,355,356,363 Kkzdy, F. J., 257,358 Kikuchi, A., 74, 93, 104 Kikuchi, M., 54, 64 Kikuchi, Y., 79, 104 Kimura, M., 53, 56, 64 Kimura, S., 1, 2, 4, 45, 52, 53, 54, 55, 56, 57, 59, 60, 63, 64 King, N. L., 53, 55, 56, 62, 63 King, T. A., 261,363 Kirchhausen, T., 82,105 Kirkegaard, K., 72, 78, 81, 83, 87, 89, 91, 99, 101,105,107 Kirkwood, J. G., 209,362 Kishi, K., 4, 64
Kitazawa, T., 23, 41, 63. 321, 322,359 Kitaigorodsky,A. I., 208,359 Kittel, C., 200, 359 Kitts, P. A., 79, 80, 105 Klausner, R. D., 126, 179 Klevan. L., 81, 82, 91, 95, 101, 105, 106, 107 Klug, A., 43, 51, 66 Klyne, W., 256, 257,359 Knowles, J. R., 115, 177 Kobayashi, H., 31, 42, 65 Kobayashi, K., 7, 64 Kodama, A., 7, 15, 34, 36, 41, 43, 61 Koenig, J. L., 225, 232, 234, 240, 253, 261, 277, 318, 321, 331, 338, 342, 357,359,363 Koetzle, T. F., 257, 297, 300, 346, 355 Kohama, K., 1, 2, 16, 22, 40, 41, 63, 64 Kohn, K., 86,107 Kojima, T., 229, 355 Kometani, K., 20, 67 Kominz, D. R., 34, 37, 61, 63 Kondo, H., 16,62 Kontis, T., 59, 63 Koren, R., 167, 177 Korn, E. D., 34, 67 Koshland, D., 146,176 Kostriken, R., 79, 105 Kotewicz, M., 79, 101, 105, 107 Kourides, I., 118, 177 Kourilsky, P., 116, 174 Krasnow, M. A., 70, 78, 79, 80, 103, 105 Krause, E., 134, 135, 178 Kreibich, G., 136, 137, 174, 177 Kreil, G., 119, 148, 152, 177 Kretsinger, R. H., 17, 41, 49, 63 Kreuzer, K. N., 82, 84, 85, 86, 90, 94, 101,105, 106,107 Krimm, S., 184, 193, 195, 197, 198, 204, 205, 209, 211, 212, 214, 215, 219, 224, 229, 230, 231, 232, 233, 234, 238, 240, 241, 257, 258, 259, 260, 261, 268, 269, 270, 271, 272, 273, 275, 276, 277, 280, 281, 288, 290, 291, 293, 296, 297, 298, 299, 300, 301, 305, 306, 308, 309, 310, 312, 313, 314, 315, 316, 317, 318, 321, 322, 323, 324, 325, 326, 328, 329, 331, 333, 335, 338, 340, 341, 342,
344,345,346,347,348,350,351, 353,354,354, 355,356,357,358, 359,360,361,362,363 Kronenberg, H. M., 166, 167,176,177 Kulaev, I. S., 145,175 Kumamoto, C.,138, 139, 140,175,177 Kumon, A., 24,63 Kung, V. T., 97,105 Kuntz, I. D., 297,359 Kurer, V., 3, 62 Kuroda, M., 4,5, 63,64 Kuroiwa, K., 211, 232, 277,280,359 Kurzchalia, T. V., 133,177 Kuwano, Y.,52,57,64
L Lagant, P., 298,299,312,344,359 Laki, K., 34,63 Lakshminarayanan, A. V.,298,356 Lalitha, V.,229,358 Landick, R., 129,178 Landy, A., 72,107 Lane, C. D.,118,177 Lapiccirella, A., 21 1, 358 Lauffer, L.,135, 136,177 Lauth, M. R., 79,104 Lazarides, E., 4,5, 6,7,53,62,63 Lazdunski, C.,129,149, 151, 152, 153,
157,174,178 Leach, S. J., 260,312,313,314,315, 316,
317,355,360,361 Lear, J. D., 157, 159, 161,175 Leavis, P. C.,13, 18,19,20,21,22,23, 30,47,48,60,61,62,63,65 Leder, P., 1 1 1, 179 Lee, A,, 313,315, 316, 317,355 Lee, C. A., 141,177 Lee, S. Y.,141,177 Lehrer, C.,3, 62 Lehrer, S. S.,19,20,27,28,32,?5,63,
64,65 Lehtovaara, P., 118, 153, 178 Lenormant, H., 255,359 Leonard, K., 5, 62 Leunissen, J., 131,180 Levin, B. A., 11, 12, 13, 22,61,62 Levine, A. I., 77,104 Levine, B. k.,16,20,63
Levitt, M., 207,208, 349,353,359,364 Levy, R. M.,349,359 Lewis, A.,345,363 Lewis, P. M.,297,298,318,359 Lewis, R. M., 153,177 Liebscher, D.-H., 117, 171,176 Lifson, S., 207,208,209,210,256, 257,
358,359,364 Lin, J. J. C., 150,177 Lin, S.,4,6, 63, 67 Lingappa, J. R., 116,177 Lingappa, V. R., 115, 116, 117, 144, 153,
177,178 Lipkind, G . M., 298,359 Lippert, J. L., 184,338,359,364 Lippincott, E. R., 210,359,362 Liss, L. R., 138, 140, 147,177, 178 Liu, C.-C.,74,75,76,77,93,94,95,96,
105 Liu, C.S., 318,320,321,364 Liu, L. ,F., 74,75, 76,77,78,80,83,85,
86,87,88,91,92,93,94,95,96,97, 103,104,105,106,107 Lively, M. O., 130, 177 Locker, R. H.,52,59,63 Lockshon, D., 89,105 Lodish, H. F.,116, 148,175, 178, 179 Long, B. H.,86,105 Long, D.A., 360 Long, M. M., 298,363 Longley, W., 33, 60,61 Loranger, J. M.,130,180 Lord, R. C.,255,261,318,321,335,338, 344,345,356,357,360,362 Lorenzi, G. P., 288,355 Lory, S.,118,175 Lother, H., 82,105 Lotz, B.,230,232,242,253,258,260, 288,289,356,358,360 Loucheux-Lefebvre, M.-H., 297,298, 299, 312,344,355,359,361 Low, B., 257,360 Lowey, S., 3, 67 Lowy, J., 44,62 Lu, C.,79,103 Lucas, R. M.,33,61 Lucaveche, C.,59,63 Ludmerer, S., 151,175 Lugtenberg, B., 115, 117, 118,180
Lundeen, M., 345,346,358 Lurz, R., 82,105 Lusby, M. L., 55,56,63
M McCarnrnon, J. A., 349,359 McCarron, B. G. H., 78,83,93, 107 McClure, J., 52,53, 55,57,67 McConaughy, B. L.,97,105 McCubbin, W.D., 17,62 McGavin, S.,258,356 McGhee, J. D., 80,81,105 Machida, K.,310,363 McKean, D. J., 1 1 1, 179 McKenzie, I. J., 55,62 McLachlan, A. D., 31,32,33, 34,49,63 McLuckie, I. F., 257,355 McQuie, J. R.,312,313,315, 316,317, 355,361 MacRae, T. P., 229,242,253,329,330, 357 Maeda, Y.,44,63 Magid, A., 59,63 Magner, J. A,, 117,177 Maher, P. A., 6,66 Mahowald, A. P., 118, 130,175 Majzoub, J. A., 167,177 Mak, A. S.,31,33,35,63,65 Malcolm, B. R., 230,232,257, 270,271, 272,273,275,277,355,357,360 Mandel, G., 130, 147,177,180 Mandel, P.,98,104 Mandelkern, L.,255,364 Mani, R. S., 3, 63 Manteuffel, R., 118,174 Marcantonio, E. E.,136,177 Margoshes, M., 329,360 Margossian, S. S., 27,63 Marini, J. C.,77,105 Mark, J. E.,229,258,359,360 Marsh, J., 171,179 Marsh, R. E.,229, 238,242,360 Marshall, B.,86,105 Marshall, G . R., 257,362 Martin, R. B., 345,346,355,360,363 Martonosi, A., 34,63 Maruyama, K.,1, 2,4,7, 10, 11, 12, 15, 21,23,27,34,37,40,45,52,53,54,
55,56,57,59,60,61,62,63,64,65, 67 Masaki, T., 3, 5, 8,42,63,64,65 Masuda, Y.,240,360 Mathews, M. B., 1 1 1, 178 Matsubara, I., 44,63 Matsubara, S., 4,52,64 Matsumoto, T., 90,104 Matsurnura, M., 18,27,28,50,64 Matsuura, H., 345,362 Matthews, F. S.,257, 270,362 Matthews, J., 118,177 Mattoccia, E., 94,103 Matzuk, M. M., 78,103, 105 Maunus, R., 167,177 Maxfield, F. R., 309,312,314,358,360 Maxwell, A., 81,82,94,95,105 Mayer, L. D.,159, 161,178 Means, A. R., 13, 41,61 Meek, R. L., 116,178 Melli, M., 133,176 Mendelsohn, R., 318,321,356 Mercereau-Puijalon, O.,116,174 Mercola, D.,10, 13, 16, 17,24,29,39,62, 63,64,319,355 Meyer, D. I., 134, 135, 148, 152,176, 178,179 Meyer, E. F., 257, 297, 300,346,355 Meyer, K. H., 229,360 Michaelis, S., 115, 123,178 Michaels, S., 86,107 Mifflin, B.J., 118,177 Mihashi, K.,8, 10, 15, 23,24,30,31, 39, 42, 61, 65 Mikawa, T., 7,65 Miller, K. G., 75,77,96,105 Mills, J. S., 98,104, 105 Milstein, C., 1 1 1, 178 Mirnura, N.,6,65 Minasian, E.,313,315,316,317,355 Minocha, A., 86,105 Mirambeau, G., 74,105 Miroshnikov, A. I., 290,363 Mitchell, L. W.,298,363 Mitsui, T., 44,64 Miura, A., 137, 139,177 Miyahara, M., 4,64 Miyamoto, S., 18,27,28,50,64 Miyata, T., 28,65
AUTHOR INDEX Miyazawa, T., 184, 191, 193, 195, 197, 198, 201, 212, 228, 230, 232, 233, 240, 256, 257, 260, 277, 331, 338, 344, 345,346,358,359,360,363 Mizushima, S., 131, 132, 176, 184, 193, 195, 197,360 Mizuuchi, K., 72, 74, 75, 77, 81, 84, 89, 93, 94, 95, 96, 101, 102, 104, 105, 106 Mizuuchi, M., 102, 106 Model, P., 150, 173, 175, 179 Mohun, T., 118,177 Moir, A. j. G., 12, 13, 15, 24, 27, 64 Moldave, K., 74, 107 Momany, F. A., 297, 298,318,359 Moore, C. L., 82, 106 Moore, W. H., 184, 204, 209, 212, 214, 230. 231, 232, 234, 238, 240, 253, 254. 255, 261, 269, 280, 300, 301, 305,335,353,358,360,361 Moos, C., 3, 4, 54, 64 Moreno, F., 115,178 Morita, C., 79,105 Moriya, K., 90,104 Morris, D. R., 89,105 Morns, E. P., 27, 28, 35, 64 Morrison, A., 81, 82, 84, 85, 89, 95, 97, 101,106 Morser, J., 118, 177 Moser, C. D., 91, 92, 106 Mostov, K. E., 148, 178 Mueckler, M., 148,178 Mueller, H., 8, 10,62 Muguruma, M., 7, 64 Muirhead, H., 257,361 Muller, M., 118, 133, 141, 142, 149, 168, 178 Muller, M.T., 93, 94, 107 Mumford, R. A., 130,180 Murakami, F., 4, 52, 64 Murali, R., 229,359 Muramatsu, S., 4, 64 Murphy, E., 77, 106 Murray, J. M., 51, 67
N Nafie, L. A., 261,357 Nagano, K., 18, 27, 28, 39, 50, 64, 65
Nagaraj, R., 154, 155, 157, 158, 178, 179, 257, 270, 317, 318,358,360, 361, 362 Nagashima, H., 44, 64 Nagy, B., 19, 20,64 Naik, V., 288, 290, 291, 293, 294, 295, 296,310,316,344,345,360 Naito, S., 90, 106 Nakahara, T., 240, 242, 261,358 Nakajima, M.,41, 62 Nakarnoto, K., 329,360 Nakamura, F.,4, 66 Nakamura, K., 120,176 Nakamura, S., 24, 25, 27, 64, 66 Nakase, M., 51, 67 Nakata, A., 149,178 Namba, K., 44,64 Nambudripad, R., 257,360 Nash, N. A., 72, 74, 77, 79, 84, 89, 93, 96, 103,104,105,106 Nashimoto, H., 137, 139, 177 Natio, A., 90,106 Natori, R., 4, 52, 60, 64 Nazos, P., 129, 178 Neel, J., 298,360 Nelsestuen, G. L., 159, 161, 178 Nelson, E. M., 83, 85, 86, 91, 105, 106, 107 Nemethy, G., 210, 229, 256, 257, 260, 270, 271, 297, 312, 313, 314, 315, 316, 317, 322, 345,355,356,359, 360,361,362 Nesmayanova, M. A., 129, 145, 147,175, 178 Neto, N., 201,361 Nevskaya, N. A., 290,363 Newman, B. J., 92,106 Nichols, M., 86, 107 Niedergang, C., 98,104 Nishikawa, T., 349,358 Nishino, N., 130, 180 Nishiyama, K., 51, 67 Nivera, N. L., 87, 88, 104 Nockolds, C. E., 17, 49, 63 Noda, H., 4, 64 Noguchi, H., 83,106 Noguti, T., 349,358 Nokelainen, M., 147, 180 Nolan, J. M.,98, 106
Nomura, M., 137, 139, 175, 177 Nonami, Y., 7, 66 Nonomura, Y., 4, 8, 42, 52, 64, 65 Novick, P., 137, 139, 176, 178 Novick, R. P., 77, 106 Novak, P., 132, 178 Nozaki, S., 12, 22, 63 Nürnberg, P., 137, 166, 179
O Obinata, T., 1, 2, 40, 42, 61, 64 O'Brien, E. J., 64 O'Dea, M. H., 74, 75, 77, 81, 84, 85, 89, 90, 93, 94, 95, 96, 101, 102, 104, 106 Offer, G., 3, 4, 54, 61, 64, 66 Ogawa, Y., 16, 64 Ohara, O., 23, 30, 35, 64, 66 Ohashi, K., 4, 7, 52, 53, 55, 56, 57, 59,
62,64,65,67 Ohlsson, R. I., 118,177 Ohmori, H.,81,89,104 Ohnishi, S., 12, 15, 21,23,65 Ohnishi, T.,298,363 Ohno-Iwashita, Y.,142, 147,178 Ohtaki, T.,6,65 Ohtsuki, I., 7,8,9,10, 13, 15, 18,23,24,
25,27,28,29,30,94,36,38,39,42, 43,44,45,46,47,50,61,64,65,66, 67 Olafson, B. D., 208, 349,353,356 Oliver, D.B., 137, 138, 140, 141, 147, 175,177,178 Olivera, B. M., 98,104 Ollis, D. L.,107 Olson, W.R., 349,359 Ono, A., 13,23,24,25,28,29,30,66 Onoyama, Y.,24,25,27,38,65,66 Ooi, T.,23, 30, 31, 33,34,35,42,62,64, 65,66 Oosawa, F., 51,67 Ordidge, M.,12, 13, 64 Orr, E., 82,94, 105, 106 Osborn, M.,6,61 OShea, D.C.,318,320,321,364 Osheroff, N., 97,106 Ottensmeyer, F. P., 133, 174 Otter, R., 78,85,101,103, 106 Ovchinnikov, Yu.A., 298,356 Overduin, P., 131,180
Oxender, D. L., 129, 151,175,178 Oya, M., 240,242,261,358 Ozaki, M., 130, 131, 176 Ozawa, Y.,131, 132,176 Ozols, J., 150,177
P Padden, F. J., 254,275,358,359,361 Pages, J.-M., 129, 141, 145, 147, 149, 150,
151, 152, 157,178 Painter, P. C.,261,331,357 Palade, G.,110,178 Palm, W.,321,357 Palmiter, R. D.,116,178 Palter, D.,52,53,54,67 Palva, I., 118, 153,178 Pandya, U. V.,298,356 Pardee, A. B., 83,106 Pardo, J. V.,5,65 Park, S.,127,176 Parker, F. S.,225,361 Parrish, F. C.,Jr., 55,56,63 Parry, D.A. D., 33,61,229,357 Parthasarathy, R.,229,356,362 Pato, M. D.,35,65 Pattus, F., 159, 161,177,180 Paucha, E., 118,I77 Paul, D.L.,118,178 Pauling, L.,185,203,210,229, 230,238,
242,256,258,356,360,361 Pearlstone, J. R., 13, 24,27,28,30,39,
49.65 Pease, L. G.,258, 297, 298,362 Pedone! C.,288,?55 Peebles, C.L., 81,84,85,90,94,95,103,
106,107 Perara, E., 117, 144,178 Perlman, D.,112, 122, 125,126, 131, 175,
178 Perly, B., 312,361 Pernet, A. G.,84,106 Perriard, J. C.,3,62 Perrin, D.,116,174 Perry, S. V., 10, 11, 12, 13, 15, 22,24,27, 31,36,37,41,48,60,61,62,64,65,
66, 67 Person, W., 214,361 Perutz, M. F.,256,257,297,322,356,
361
AUTHOR INDEX Pesold-Hurt, B., 115, 173, 176 Petcher, T. J., 298,361 Pethica, B. A., 159, 179 Peticolas, W. L., 198, 200, 232, 233, 238, 261, 277, 279, 280, 281, 338, 345, 347,357,361,362,364 Pezolet, M., 184,361 Pfeil, W., 137, 158, 166, 174 Pflugfelder, G., 83, 87, 99, 105 Phillips, D. C., 256, 257, 260, 359,361 Phillips, G. N., Jr., 47, 61 Phillips, M. C., 159,179 PiCroni, G., 161, 175 Pierzchala, P. A., 130, 180 Pigeon-Gosselin, M., 184,361 Pimentel, G. C., 329, 361 Pincus, M. R., 126, 179 Piovant, M., 129, 152, 157, 178 Piseri, L., 198,361 Pletnev, V. Z., 257, 270, 362 Pluckthun, A., 138, 146, 149, 177 Pollock, T. J., 79, 106 Popov, E. M., 298,359 Portnova, S. L., 298,356 Potter, J. D., 10, 12, 13, 15, 16, 17, 19, 20, 21, 23, 27, 36, 37, 41, 49, 61, 62, 64, 65, 67 Pottle, M. S., 229, 312, 313, 315, 316, 317,355,356,361 Potts, J. T., Jr., 166, 167, 176, 177 Powers, J. C., 130, 180 Powers, S., 138, 179 Prasad, B. V. V., 257, 270, 271, 288, 317, 361 Prehn, S., 137, 158, 166, 174, 179 Prell, B., 84, 106 Premilat, S., 230, 232, 242, 253, 260, 288, 289,356 Price, W. C., 333,357 Priest, J., 17, 64 Printz, M. P., 297, 322,360 Pulleyblank, D. E., 70, 96, 106 Pullman, A., 298,361 Pullman, B., 298,361
Q Quay, S. C., 129, 151, 175 Quinn, P. J., 161, 179 Quinn, P. S., 171, 179
R Rabolt, J. F., 184, 261, 300, 305,361 Raghavendra, K., 229,361 Raja,, S. S., 258,363 Ralph, R. K., 86, 105 Ramachandran, G. N., 209, 210, 211, 256, 257, 258, 259, 275, 277, 288, 298, 355,356,359,361,362 Ramakrishnan, C., 21 1, 257, 275, 277, 361,362,363 Ramirez-Mitchell, R., 52, 53, 54, 67 Randall, L. L., 127, 129, 146, 147, 149, 150, 151, 153, 168, 174, 176, 177, 179 Raney, P., 126, 175 Rao, Ch. P., 318,362 Rao, C. N. R.,318,362 Rao, S. N., 229,362 Rao, S. T., 17, 49, 66 Rapoport, T. A., 117, 118, 129, 134, 137, 144, 148, 158, 166, 171, 174, 176, 179,180 Rasmussen, B. A., 122, 153, 169,174 Ray, K. P., 15, 65 Ray, P. H., 132, 178 Rebane, T., 211, 232, 277,280,359 Reddy, G. L., 154, 155,179 Reddy, Y. S., 15, 65 Redman, C. M., 110,179 Reed, L. L., 310,362 Reed, R. R., 71, 77, 79, 80, 84, 89, 91, 92, 104, 106 Reid, R. E., 19, 65 Reinach, F. C., 4, 61 Reuvers, F. A. M., 161, 180 Rey-Lafon, M., 193, 197, 335, 336,362 Rhoads, D., 129, 130, 141, 142, 149, 150, 151, 152, 157, 167, 168,175 Rich, A., 166,176, 257,275,357 Rich, D. H., 298,358 Richards, F. M., 270,357 Richardson, D. C., 297,362 Richardson, J. S.,229, 256, 297,362 Ridd, D. H., 120, 154, 155, 160, 174 Ridpath, J. F., 55, 56, 63 Robertson, S. P., 15, 62 Robinson, A., 137, 166, 167, 179 Robson, R. M., 6, 55, 56, 61, 63 Rochat, H., 161, 175
Rodgers, J. R.,257, 297, 300, 346,355 Rogers, J., 158, 177 Roll, D., 98, 104 Rose, J. K., 173, 174 Rosenblatt, M., 154, 155, 166, 167, 176, 177,179 Rosenfeld, S. S., 13, 18, 19, 21, 22, 23, 30, 62, 63 Rosenthal, S., 117, 171, 176 Rossman, M. G., 297,362 Rothblatt, J. A,, 148, 152, 179 Rothfield, L. I., 159, 179 Rothman, J. E., 148, 179 Rowe, T. C., 85,91, 93,94,103,105, 106 Roychowdhury, P., 17, 49, 66 Riiegger, A., 298,361 Rumsey, S. M., 270, 271,361 Rundle, R. E., 329,360 Rusche, J. R., 94, 106 Russell, M., 150, 179 Ryan, J. P., 123, 149, 150, 179
S Sabatini, D. D., 110, 111, 136, 175, 177, 179 Saeva, F. D., 345,363 Saito, M., 7, 66 Saito, Y., 310,363 Sakakibara, Y., 77, 106 Salas, M., 92, 104 Salemme, F. R., 229, 362 Salvo,J. J., 79, 104 Sandeman, I., 195,362 Sander, M., 85,90,98,106, 107 Sandorfy, C., 210,362 Sarvas, M., 118, 153, 178 Sasada, Y., 306,363 Sasaki, S., 257, 258,362 Sasisekharan, V., 209, 21 1, 229, 257, 259, 270, 271, 275,360,361,362 Sass, R. L., 62 Sauer, R. T., 146, 177 Sawada, H., 54, 56, 57, 64 Schachat, F. H., 31, 60 Schachtschneider,J. H., 207, 219, 310, 362 Schatz, G., 115, 173, 176 Schaub, M. C., 8, 10, 65
Schauer, I., 125, 137, 139, 150,176, 179 Schecter, I., 111, 179 Schedl, P., 90, 107 Scheele, G., 111, 179 Schekman, R., 125, 137, 139, 150, 176, 178,179 Scheraga, H. A., 209,210,229,256,257, 260, 270, 271, 288, 297, 298, 309, 312, 313, 314, 316, 317, 318, 319, 320, 321, 345, 346,355, 356, 358, 359,360,361,362,363,364 Schiffer, M., 321, 357 Schill, G., 70, 106 Schlosser, T., 3, 66 Schmid, E., 6, 61 Schmidt, G. W., 148, 176 Schneider, B., 193,358 Schoechlin, M., 17, 37, 67 Schrader, B., 201,362 Schroeder, R., 210,359,362 Schroeter, J. P., 62 Schultz,J., 139, 140, 179 Schuster, P., 210,362 Schwartz, M., 115, 120, 176, 178 Sederholm, C . H., 329,361 Seff, K., 345, 346,358 Seidel, J. C., 19, 20, 65 Seki, N., 53, 55, 65 Sengupta, P., 254, 255, 268, 279, 329, 331, 333, 338, 340, 344,362 Severin, S. E., 24,62 Shamala, N., 257, 270, 317,360, 361, 362 Sheehy, R.J., 77, 106 Shefter, E., 345,362 Shelton, E. R., 97, 106 Shen, L. L., 84, 106 Shepel, E. N., 290,363 Sherratt, D. J. 79, 80, 105 Shestopalov, B. V., 215, 331, 356 Shiba, K., 137, 139, 141, 177, 179 Shimanouchi, T., 184, 193, 195, 197, 232, 240, 242, 257, 260, 261, 277, 278, 280, 297, 300, 345, 346,355, 358, 360,362,363 Shimizu, T., 4, 61 Shinagawa, H., 149, 178 Shinnar, A. E., 153, 154, 155, 157, 158, 179 Shipman, L. L., 345, 363
AUTHOR INDEX Shiraishi, F., 28,65 Shiraishi, Y.,229,364 Shive, M.,70,96,106 Shmueli, U.,257,363 Shore, V. C.,256,359 Shuman, H.,23,63 Siamwiza, M. N., 345,362 Sibakov, M., 118, 153,178 Siegel, V.,134, 135,179 Silhavy, T.J., 115, 123, 125, 127, 128, 131, 137, 139, 140, 141, 149,156,
169, 171,174,175, 176, 178, 179 Siliciano,J,, 5,65 Silver, P., 131, 132, 142, 147,179, 180 Silverton, E. W., 297,362 Simons, L.,261,362 Singer, S. I., 6,66 Singh, R. D.,280,362 Sippl, M.J., 210,362 Sjostrand, F. S., 52,57,65 Slater,J. C.,209,362 Small, D.,261,297,357,362 Small, E. W.,198,200, 232,233, 238,
277,279,280,281,362 Smillie, L. B., 13, 24,27,28,30,31,35,
39,49,63,65,66 Smith, G. D., 257,270,309,362 Smith, J. A., 258,297,298,362 Small,J. V.,6,66 Smith, K.,77,106 Smith, M.,232,234,277,363 Smith, W.P., 147, 149,179 Snyder, R. G.,207,219,310,362 Soberon, X.,120,176 Sobieszek, A., 6,66 Sodek, J., 31, 66 Solaro, R. J., 15,62,64 Soman, K.V.,257,363 Somlyo, A. P.,23,63 Soreq, H.,167,177 Spach, G.,258, 260,288,289,356,358,
360 Sparks, C. E., 159,179 Spengler, S.J., 78,103 Spiro, T.G., 225,318,320,342,363 Spudich,J. A., 34,43,66 Srinivasan, R., 258, 349,359,363 Stahl, S., 118,179
Stanley, H. E., 20,60 Starr, R., 3,4, 66 States, D.J., 208,349,353,356 Staudenbauer, W.L.,94,I06 Stein, E. A., 17,37,67 Steiner, D.F., 171,179 Steitz, T.A,, 80, 103, 107, 116, 122, 129,
144, 150,176 Stenbach, H.,261,362 Stepanyan, S. A., 234,363 Stephens, R. M.,257,355,356 Stern, J. B., 131, 179 Stern, P.S.,208,349,353,359 Stewart, F. H.,242, 253,357 Stewart, G.R., 31,33,63 Stewart, M.,31, 32,33, 34,49,63,66 Stimson, E. R., 309,313,315,316,317,
355,358 Stone, D.,31, 66 Stossel, T.P.,5,52,66 Strandberg, B. E., 256,359 Strasburg, G. M.,17,49,66 Straus, D.R., 115,177 Strauss, A. W.,130,180 Streb, M.,158,177 Street, A., 229,355 Stromer, M. H.,6,61 Subramanian, E., 229,359 Suenaga, N., 28,65 Suezaki, Y.,347,363 Sugeta, H.,344, 345,346,363 Sugino, A., 81,84,90,91,93,94,95,96, 106 Sugita, H., 1, 2,40,64 Sundaralingam, M.,17,49,66 Sutherland, G.B. B. M.,183,230,355,
363 Sutor, D. J., 211, 363 Sutton, P. L., 261, 342, 359 Suzina, N. E., 145, 175 Suzuki, A., 7, 66 Suzuki, E., 195, 229, 242, 253, 357, 363 Suzuki, K., 53, 56, 64 Suzuki, K., 77, 106 Suzuki, S., 232, 277, 278, 280, 363 Swaminathan, S., 208, 349, 353, 356 Swan, D., 111, 179 Sychev, S. V., 290, 363 Symington, L. S., 79, 80, 105
Syska, H., 10, 11, 12, 13, 37,66 Szent-Gyorgyi, A. G.,31,61
T Taddei, G., 201,361 Tadokoro, H.,198,363 Tager, H. S., 171,179 Tai, P. C., 118, 129, 130, 141, 142,149,
150, 151, 152, 157, 167, 168,175, 179 Takahara, M., 118, 153,179 Takahashi, K.,4,49, 66 Takahashi, S.,23,30. 64 Takaiti, O.,3,64 Takamatsu, T., 345,362 Takeda, Y.,79,105,257,363 Takeuchi, H., 341,346,347,348,363 Talbot, J. A., 11, 12,66 Talmadge, K.,117, 118,179,180 Tanaka, I., 229,355,363,364 Tanaka, T., 5,63 Tanford, C., 121,180 Tang, D., 70,96,106 Tanokura, M.,13, 23,24,25,27,28,29, 30,39,65,66 Tasumi, M., 257,297, 300,341,346,347, 348,355,363 Tawada, K.,35,66 Tawada, Y., 13,23,24,25,27,28,29,30, 35,66 Taylor, R., 210, 211,363 Templeton, N. S., 80, 103 Ter-Minassian-Saraga, L.,161,180 Terry, W.,111, 179 Tewey, K. M.,85,86,91,93,103, 105, 106,107 Thom, J., 127,176 Thomas, K. A., 297,362 Thompson, R. C., 149,179 Thornton, J. M.,16,20,63 Thulin, E., 16,60 Tiffany, M. L.,258,359,363 Timko, M. P., 115, 173,180 Ting-Beall, H. P., 59,63 Tipping, M., 261,363 Tocchini-Valentini, G. P., 94,103 Todd, J. A., 136, 137,174 Tokunaga, H.,150,180 Tokunaga, M., 130, 150,180
Tokuyasu, K. T., 6, 66 Tomasic, L., 288, 355 Tomizawa, J., 77, 84, 93, 94, 96, 104,
106 Tomlinson, B., 261,357 Tommassen, J., 115, 117, 118, 131,180 Toyoshima, C., 44,51,66 Trask, D. K.,93,94,106 Traub, W., 257,363 Trayer, I. P., 11, 12, 13, 64,66 Trevino, S., 232, 233,358 Trinick, J., 52,53,54,56,59,66 Trotter, I. F.,230, 238, 258,355,356 Tsarnaloukas, A.,137, 166,179 Tsao, T.-C., 31,42,66 Tse, Y.-C., 76,83,84,87,91,95,99,105,
106 Tse-Dinh, Y.-C., 78,83,93,98,106 Tsetlin, V. I., 298,356 Tsfasman, I. M.,145, 147,175 Tsuboi, M., 225, 232, 234,277,278, 280,
363 Tsukui, R., 40,66 Tu, A.,52,53,55,57,67, 318,358 Turner, D. C., 3,66 Tyrninski, D., 184,359
U Uchida, M., 130, 131,176 Udvardy, A.,90,107 Ueda, Y.,229,356 Ueki, T., 306,363 Uematsu, E., 257,258,362 Ullrich, A., 135, 136,177 Ulrich, B.L.,136,177 Umazume, Y.,56,57,60,62, 64 Ungerleider, R. S.,86,107 Uno, J., 240,242,261,358 Uno, T., 310,363 Urry, D. W.,258,298,363
V Vanaman, T. C., 13,41,60 van Damme-Jongsten, M., 131, 180 van Deenen, L. L. M., 161,180 van den Broek, G., 115, 173,180 van Eerd, J.-P., 20,49,63,66 van Montagu, M.,115, 173,180
van Tol, H., 115, 117, 118, 180 van Wart, H. E., 318, 319, 320, 321, 345, 346, 363 van Zoelen, E. J. F., 161, 180 Varenne, S., 129, 152, 157, 178 Veatch, W. R., 288, 364 veer Reddy, G. P., 83, 106 Venkatachalam, C. M., 211, 230, 259, 275, 277, 297, 298, 318, 362, 364 Venkatachalapathi, Y. V., 318, 358 Venyaminov, S. Y., 215, 331, 356 Verger, R., 159, 161, 175, 180 Vergoten, G., 298, 299, 312, 344, 359 Versichel, W., 210, 363 Vibert, P. J., 44, 67 Villar-Palasi, C., 24, 63 Vinograd, J., 70, 76, 96, 104, 106 Viras, K., 261, 363 Viswamitra, M. A., 229, 357 Vlasuk, G. P., 126, 180 von Heijne, G., 119, 120, 121, 122, 125, 131, 144, 150, 180 Vos, A., 229, 354 Vosberg, H.-P., 70, 74, 84, 86, 96, 98, 103, 104, 106, 107
W Wakabayashi, K., 1, 2, 44, 56, 64 Wakabayashi, T., 43, 44, 51, 65, 66 Wallimann, T., 3, 66 Walmsley, S. H., 201, 361 Walsh, K. A., 116, 130, 177, 178 Walter, P., 111, 118, 133, 134, 135, 136, 141, 148, 152, 168, 174, 176, 177, 178, 179, 180 Walton, A. G., 232, 234, 277, 363 Wang, B. C., 17, 49, 66 Wang, J. C., 70, 71, 72, 76, 78, 80, 81, 82, 83, 87, 89, 91, 94, 96, 97, 99, 100, 101, 102, 103, 104, 105, 106, 107 Wang, K., 52, 53, 54, 55, 57, 59, 66, 67 Wang, M. S., 59, 67 Wansen, G., 261, 362 Warren, T. G., 118, 130, 175 Warshel, A., 207, 208, 359, 364 Watanabe, T., 53, 55, 65 Watanabe, Y., 130, 131, 176 Watson, J. D., 298, 358 Watson, M. E. E., 112, 115, 117, 173, 180 Watt, S., 34, 66 Watts, C., 131, 132, 142, 147, 149, 176, 179, 180
Weber, A., 35,51,60,67 Weber, H.-P., 298,361 Weber, K.,6,61 Weber, P. C.,107 Weeks, R. A., 13, 22,67 Weisberg, R. A.,72,77,79, 106, 107 f Wells, R. D., 79,104 Westergaard, O.,88,103, 104 White, J. H., 70,103, 107 White, W. R., 131,177 Whiting, A., 52,53,54,56,59,66 Wickner, W., 129, 130, 131, 132, 142,
147, 149, 151, 153, 169,175, 176, 177, 178,179,180 Wiedmann, M., 117, 118, 133, 134,177, 180 Wilkins, J. A., 6,67 Wilkinson, J. M.,10, 11, 12, 13, 14,37, 49,66,67 Williams, D. E., 208,364 Williams, G.J. B., 257, 297,300,346,355 Williams, R. C.,79,103 Williams, R. W., 184,319, 345,357,364 Williamson, C. L., 59,67 Wilshire, G., 17,49, 61 Wilson, E. B., 185, 188, 191,207, 225, 361,364 Wilson, F. J., 10,37,65 Winkler, F. K.,203,364 Winkler, J., 117, 171,176 Winter, W., 270,355 Wittekind, M., 137, 139,177 Wityk, R.J., 79,104 Wnuk, W., 17,37,67 Wolfe, P., 130, 131, 147,178, 180 Wonacott, A.J., 259,355 Wong, T.W., 98,107 Woodger, M.,360 Woodhead, J. L.,3, 67 Woods, E. F., 31,67 Woods, H. J., 229,355 Woodward, L.A., 185,207,225,364 Woody, R.W., 313,315,316,317,355 Wray, J. S.,44,67 Wu, H. C.,117, 144, 150,176,177,180 Wu, R.,74,107 Wyborny, L.E., 15,65
Y Yagi, N., 44, 63 Yamada, K., 20, 67 Yamada, Y., 229, 364 Yamaguchi, M., 27, 62 Yamamoto, K., 1, 2, 4, 13, 15, 24, 25, 27, 38, 40, 56, 64, 65, 67 Yamane, T., 229, 355, 364 Yamanoue, M., 4, 66 Yamazaki, R., 41, 62 Yanagida, T., 51, 67 Yang, L., 85, 91, 93, 103, 105 Yang, Y.-Z., 34, 67 Yasumoto, Y., 257, 258, 362 Yates, L. D., 2, 67 Yoshidomi, H., 54, 55, 56, 57, 63, 64, 67 Yoshioka, T., 57, 59, 64 Yost, C. S., 115, 153, 177 Young, L. S., 97, 105 Yu, N.-T., 318, 319, 320, 321, 338, 344, 345, 360, 364
Yuan, R., 79,104 Yura, T., 137, 139, 141,177, 179
Z Zabicky, J. H., 127, 176 Zabin, I., 115, 127, 174, 178 Zak, J., 201, 225, 364 Zerbi, G., 214, 215, 217, 331, 357, 361, 364 Zimmerman, C. J., 29, 62 Zimmerman, M., 130, 180 Zimmerman, S. S., 229, 255, 364 Zlotnick, A., 157, 159, 161, 175 Zopf, D., 133, 176 Zot, H. G., 23, 67 Zundel, G., 210, 362 Zwaal, R. F. A., 161, 180 Zwelling, L. A., 86, 107 Zwizinski, C., 130, 131, 132, 151, 175, 180
A Actin interaction with connectin, 57 regulatory proteins, 4-5 Actomyosin, effects on tropomyosin, 35 Amide modes amide I, 194-197 amide II, 197 amide III, 197 amide IV, VI, VII, 198 amide V, 197-198 skeletal stretch, 197 Amino acid side-chains, see Side chains Amphiphilic tunnel hypothesis for protein secretion, 144-145 Antiparallel-chain pleated sheet structures β-poly(L-alanine), 238-242 β-poly(L-alanylglycine), 242-254 β-poly(L-glutamate), 254-256 Antiparallel-chain rippled sheet structures, polyglycine I, 230-238 ATP hydrolysis, topoisomerases, 93-97
B Beta turns in peptides, 306-322 type I (Z-Gly-Pro-Leu-Gly-OH), 306-310
type II cyclo(L-Ala-D-Ala-Aca), 313-318 cyclo(L-Ala-Gly-Aca), 312-313 Pro-Leu-Gly-NH2, 310-312 Z-Gly-Pro-Gly-Gly-OMe, 312 type III, 316-318 in proteins, 318-322 standard turns calculated amide mode frequencies (various structures), 305 structure, 298-300 type II, calculated frequencies, 304
types I, II, III: calculated amide I, II, and III frequencies, 302 types I', II', III': calculated amide I, II, and III frequencies, 303 vibrational analysis, 300-306 Bovine pancreatic trypsin inhibitor, normal-mode calculations, 349-350
C Calcium binding sites on troponin C, 15-17 binding structure in troponin C, 17-20 regulation in thin filament, 51-52 regulatory mechanisms, see Calcium-regulatory mechanisms structural change induced in troponin C, 20-21
Calcium-regulatory mechanisms hybrid troponins, 40-42 troponin components, comparative aspects, 40-42 troponin I inhibitory action, 36-37 troponin T regulatory role, 38-40 Calcium regulatory proteins, 7-10; see also Tropomyosin; Troponin C; Troponin I; Troponin T CH3-CO-(L-Ala)3-NH-CH3 calculated amide frequencies in γ-turn conformations, 324 dihedral angles for γ-turn structures, 323
γ turn model, 322 CH3-O-Gly-Ala-Ala-Gly-O-CH3, observed and calculated amide mode frequencies, 309 Chymotryptic subfragments, troponin T, 24-27
Conformation isolated signal sequences in membranes, 162- 166
signal sequences, 126-128, 153-157
SUBJECT INDEX
Connectin, 52 function, 59-60 hydrolysis, 55-56 interaction with actin, 57 localization, 57-59 molecular size and shape, 53-54 myofibrillar content, 53 physicochemical properties, 54-55 preparation, 53 Covalent modification, topoisomerases, 98-99 Crystal(s) molecular, normal vibrational modes, 201-203 tropomyosin, 33-34 Cyclo(L-Ala-D-Ala-Aca), type II β turns, 313-318 Cyclo(L-Ala-L-Ala-Aca), observed and calculated amide modes of types III and I β turns in, 317 Cyclo(L-Ala-Gly-Aca) calculated frequencies in amide regions I and V, 314 with trans peptide bonds, minimum-energy conformations, 313 type II β turns, 312-313 Cyclo(D-Phe-L-Pro-Gly-D-Ala-L-Pro), observed and calculated amide frequencies, 327 Cysteine, disulfide bridge, 345-346 Cytoskeletal proteins, 5-7 intermediate, 6 membrane-attachment, 5-6 Z-line structure, 6-7
D Disulfide bridge, cysteine, 345-346 DNA binding properties of topoisomerases, 78-83 cleavage by a topoisomerase cleavage reaction, 83-86 DNA-protein bond, 91-92 site specificity, 86-91 reunion with topoisomerases, 92-93 DNA supercoiling, twist, writhe, and linking number, 70-72 DNA topoisomerases, see Topoisomerases Domain model for protein secretion, 146 Domain transfer in protein secretion, 148-150
E Escherichia coli mutations in signal sequences resulting in export-defective proteins, 123 secretion apparatus, 140-142 wild-type and mutant lambda-receptor protein signal sequences, 128
F Fermi resonance, 228-229 Filaments, thin calcium regulation in, 51-52 tropomyosin arrangement, 42-45 troponin arrangement, 42-45
G Gamma turns in peptides, 325-327 standard turns CH3-CO-(L-Ala)3-NH-CH3, calculated amide frequencies, 324 structure, 322-325 Genes prlA (secY), 139-140 prlB, prlC, prlD, 140 secA, 138 secB and secC, 138-139 Glucagon, amide I modes, frequencies and CO stretch coordinates in eigenvectors, 348 Gramicidin A, observed and calculated amide I frequencies, 296 Group frequencies, in vibrational spectroscopic band assignments, 225
H Helical hairpin hypothesis of protein secretion, 144 Hybrid troponins, 40-42 Hydrolysis ATP, topoisomerases, 93-97 connectin, 55-56
I Infrared intensity, vibrational spectra of proteins and, 350-351 Infrared spectra poly(α-aminoisobutyric acid), 273
polyglycine II, 278 Z-Gly-Pro-Leu-Gly-OH, 307 Insulin β turns in, 319-321 signal sequence, 173 Intermediate filaments, 6 Isotopic substitution, in vibrational spectroscopic band assignments, 227-228
L Linking number, in DNA supercoiling, 70-72 Lipids fluidity, effects on protein secretion, 129-130 interactions with signal sequences, 157-162 membrane, active role in protein secretion, 145-146 Loop model of protein secretion, 143-144
M Membrane-attachment proteins, cytoskeletal, 5-6 Membranes conformations of isolated signal sequences in, 162-166 lipids, active role in protein secretion, 145-146 fluidity, protein secretion and, 129-130 signal sequence initial interaction with, model, 170-171 Membrane trigger hypothesis of protein secretion, 143 N-Methylacetamide, normal vibrations amide I mode, 194-197 amide II mode, 197 amide III mode, 197 amide IV, VI, and VII modes, 198 amide V mode, 197-198 NH stretch, 194 skeletal stretch, 197 Molecules, vibrational spectroscopy helical, isolated, 198-201 N-methylacetamide, 193-198 molecular crystal, 201-203 small, isolated, 185-193 Myofibrils, connectin content, 53
Myosin interaction with connectin, 56-57 regulatory proteins, 3-4 Myosin-actin interaction, troponin I inhibitory action, 10-1 1
P Paracrystals, tropomyosin, 33-34 Peptides, β turns in type I, Z-Gly-Pro-Leu-Gly-OH, 306-310 type II cyclo(L-Ala-D-Ala-Aca), 313-316 cyclo(L-Ala-Gly-Aca), 312-313 Pro-Leu-Gly-NH2, 310-312 Z-Gly-Pro-Gly-Gly-OMe, 312 type III, 316-318 Poly(α-aminoisobutyric acid) infrared spectra, 273 observed and calculated amide and skeletal frequencies, 274 Raman spectra, 273 structure and symmetry, 270-271 vibrational analysis, 271-274 Poly(L-alanine) internal coordinates, 205 symmetric coordinates, 206 α-Poly(L-alanine) observed and calculated frequencies, 262-267 Raman spectra, 261 structure and symmetry, 258-261 vibrational analysis, 261-269 β-Poly(L-alanine) observed and calculated frequencies, 243-252 Raman spectrum, 242 structure and symmetry, 238-242 β-Poly(L-alanylglycine), structure, 242, 253-254 β-Poly(L-glutamate), structure and spectra, 254-256 α-Poly(L-glutamic acid), vibrational analysis, 269-270 Polyglycine I crystalline, symmetry species and selection rules, 226 observed and calculated frequencies, 235-237 structure and symmetry, 230-232 vibrational analysis, 232-234, 238
Polyglycine II crystalline hydrogen-bond parameters, 277 observed and calculated frequencies, 282-287 infrared spectra, 278 Raman spectra, 279 structure and symmetry, 275-277 vibrational analysis, 277-288 Polypeptide chains amide I mode, 330-332 amide II mode, 333 amide III mode, 333-337 amide V mode, 338-340 extended structures, 229-256 antiparallel-chain pleated sheet, 238-256 antiparallel-chain rippled sheet, 230-238 force field, 204-224 general valence constants, 220-224 intermolecular function, 208-215 hydrogen bonding, 210-211 nonbonded interactions, 208-210 transition dipole coupling, 211-215 intramolecular potential functions, 205-208 least-squares refinement, 215-217 valence, 217-219, 224 helical structures, 256-297 3₁ helix, 275-288 3₁₀ helix, 270-274 α helix, 258-270 L,D β helices, 288-297 calculated A and E1 species of amide I frequencies, 292 double-stranded, 293-295 gramicidin A, 295-297 single-stranded, 288-290 vibrational analysis, 290-293 internal symmetry and coordinates, 204 NH stretch mode, 328-330 potential energy distributions amide V mode, 339 nonweak amide III modes, 336 skeletal stretching modes, 337 strong amide II modes, 334 skeletal stretch mode, 337-338 standard geometry, 203-204
Polypeptides backbone modes, observed and calculated frequencies, 332 general valence force constants, 220-224 reverse turns β turns, see Beta turns γ turns, see Gamma turns Potential energy distribution amide V modes, 339 nonweak amide III modes, 336 normal-mode frequencies, 190-191 strong amide II modes, 337 Processivity, in topoisomerase reactions, 97-98 Pro-Leu-Gly-NH2, type II β turn, 310-312 Proteins β turns in, 318-322 calcium-regulatory, see Calcium-regulatory proteins membrane-attachment, cytoskeletal, 5-6 precursor, and isolated signal sequences, 152-153 Protein secretion energy required for, 150-152 eukaryotic, 132-137 ribophorins, 136-137 signal-recognition particle, 133-135 SRP receptor, 135-136 models active role for membrane lipids, 145-146 amphiphilic tunnel hypothesis, 144-145 domain, 146 helical hairpin hypothesis, 144 loop, 143-144 membrane trigger hypothesis, 143 signal hypothesis, 142 pre-signal hypothesis, 110-111 prokaryotic, 137-142 prlA (secY), 139-140 prlB, prlC, prlD, 140 secA, 138 secB and secC, 138-139 secretory apparatus components, 128-142 E. coli, 140-142 membrane, 129-130
proteins in, 137 signal peptide, 130-132 signal-peptide peptidase, 132 signal hypothesis proposal, 111-113 signal sequences, see Signal sequences translocation site, 146-148 vectorial and domain transfer, 148-150
R Raman spectra α-poly(L-alanine), 261 β-poly(L-alanine), 242 poly(α-aminoisobutyric acid), 273 polyglycine II, 279 Z-Gly-Pro-Leu-Gly-OH, 308 Regulatory proteins, 3-5 actin-associated, 4-5 myosin-associated, 3-4 Reverse turns, 297-298 β, see Beta turns γ, see Gamma turns Ribophorins, protein secretion and, 136-137
S Secular equation, for normal-mode frequencies, 186-190 Side chains disulfide bridge, 345-346 residue group frequencies, 342-345 Signal hypothesis, 142 proposal, 111-113 Signal peptidase, 132 cleavage site on signal sequence, 125 Signal-peptide peptidase, 130-132 Signal-recognition particle, 133-135 Signal sequences charged region, 119-120 cleavage site for signal peptidase, 125 conformation, predictions of, 126-128 conformational studies, 153-157 discovery, 111 E. coli wild-type and mutant λ receptor, 128 hydrophobic region, 120-125 initial interactions with membranes, model, 170-171 on insulin, 173
interactions with lipids, 157-162 with proteins, 166-168 interchangeability, 117-119 internal, 116-117 isolated in membranes, conformations, 162-166 precursor proteins and, 152-153 length, 119 as membrane-interacting sequences, 171-174 necessity for secretion, 113-115 representative, table, 114-115 roles of, summary, 168-169 Skeletal muscle (vertebrate) calcium-regulatory mechanisms, 36-42 connectin, 52-60 cytoskeletal proteins, 5-7 regulatory proteins, 3-5 structural aspects of troponin and tropomyosin, 42-52 tropomyosin, 31-36 troponin C, 15-23 troponin I, 10-15 troponin T, 24-31 SRP receptor, 135-136 S-S modes of proteins, 342-346 Symmetry, in vibrational spectroscopic band assignments, 225-227
T Titin, see Connectin Transfer, vectorial vs. domain in protein secretion, 148-150 Translocation site, nature of, 146-148 Topoisomerases ATP hydrolysis, 93-97 covalent modification, 98-99 DNA binding, 78-83 DNA bond reunion, 92-93 DNA cleavage cleavage reaction, 83-86 cleavage-site specificity, 86-91 DNA-protein bond, 91-92 mechanistic models, 99-102 processivity in reactions of, 97-98 properties, 73 reactions, 72-78
Tropomyosin actomyosin effects, 35 arrangement in thin filament, 42-45 calcium regulation in thin filament, 51-52 crystals and paracrystals, 33-34 interactions, 34-35 with troponin I, 11-12 with troponin T, 27-29 primary structure, 31-33 structure prediction, 48-50 subunits, 31-33 Troponin arrangement in thin filament, 42-45 calcium regulation in thin filament, 51-52 components, arrangement, 45-48 structure prediction, 48-50 Troponin C Ca2+ binding, 15-17 Ca2+-binding structure, 17-20 Ca2+-induced structural change, 20-21 interaction with troponin I, 21-23 troponin C, 12-13 troponin T, 23, 30-31 Troponin I cardiac, 14-15 inhibitory action of tropomyosin, 36-37 on myosin-actin interaction, 10-11 interaction with tropomyosin, 11-12 troponin C, 12-13, 21-23 troponin T, 13-14, 29-30 Troponins, hybrid, 40-42 Troponin T chymotryptic subfragments, 24-27 interaction with tropomyosin, 27-29 troponin C, 23, 30-31 troponin I, 13-14, 29-30 regulatory role, 38-40 Twist, in DNA supercoiling, 70-72
V Vectorial transfer, in protein secretion, 148-150 Vibrational spectroscopy, see also Infrared spectra; Raman spectra
band assignments, 224-229 Fermi resonance, 228-229 group frequencies, 225 isotopic substitution, 227-228 overtone and combination bands, 228-229 symmetry, 225-227 extended polypeptide chain structures, 229-256 antiparallel-chain pleated sheet, 238-256 antiparallel-chain rippled sheet, 230-238 helical molecule, isolated, 198-201 helical polypeptide chain structures, 256-297 3₁ helix, 275-288 3₁₀ helix, 270-274 α helix, 258-270 L,D β helices, 288-297 standard parameters, 257 N-methylacetamide amide I mode, 194-197 amide II mode, 197 amide III mode, 197 amide IV, VI, and VII modes, 198 amide V mode, 197-198 NH stretch, 194 observed and calculated frequencies, 195 peptide group local symmetry coordinates, 194 skeletal stretch, 197 molecular crystal, 201-203 polypeptide chain internal and symmetry coordinates, 204 standard geometry, 203-204 polypeptide chain modes, 328-341 amide I, 330-332 amide II, 333 amide III, 333-337 amide V, 338-340 NH stretch, 328-330 skeletal stretch, 337-338 polypeptide force field, 204-224 general valence force constants, 220-224 hydrogen bonding, 210-211 intramolecular potential functions, 205-208
least-squares refinement, 215-217 nonbonded interactions, 208-210 transition dipole coupling, 211-215 valence, 217-219, 224 of proteins infrared intensities and, 350-351 normal modes, 346-351 side-chain and S-S modes, 342-346 reverse turns, 297-341 β turns, see Beta turns γ turns, see Gamma turns small molecule, isolated, 185-193
W Writhe, in DNA supercoiling, 70-72
Z Z-Gly-Pro-Gly-Gly-OMe, type II β turn, 312 Z-Gly-Pro-Leu-Gly-OH infrared spectra, 307 observed bands of type I β turn, 309 Raman spectra, 308 Z-line structure, cytoskeletal proteins, 6-7