ADVANCES IN QUANTITATIVE STRUCTURE-PROPERTY RELATIONSHIPS Volume 2
1999
ADVANCES IN QUANTITATIVE STRUCTURE-PROPERTY RELATIONSHIPS Editors: MARVIN CHARTON Department of Chemistry Pratt Institute Brooklyn, New York
BARBARA I. CHARTON St. John's University Science Library New York, New York
JAI PRESS INC. Stamford, Connecticut
Copyright © 1999 by JAI PRESS INC. 100 Prospect Street Stamford, Connecticut 06904-0811 All rights reserved. No part of this publication may be reproduced, stored on a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, filming, recording, or otherwise without prior permission in writing from the publisher. ISBN: 0-7623-0067-1
Manufactured in the United States of America
CONTENTS
LIST OF CONTRIBUTORS
PREFACE Marvin Charton and Barbara Charton
EXPLORING THE ENERGETICS OF BINDING IN CHROMATOGRAPHY AND RELATED EVENTS Philip S. Magee
STRUCTURAL EFFECTS ON GAS-PHASE REACTIVITIES Gabriel Chuchani, Masaaki Mishima, Rafael Notario, and José-Luis M. Abboud
THE PREDICTION OF MELTING POINT John C. Dearden
THE APPLICATION OF THE INTERMOLECULAR FORCE MODEL TO PEPTIDE AND PROTEIN QSAR Marvin Charton
INDEX
LIST OF CONTRIBUTORS
José-Luis M. Abboud
Instituto de Química Física Rocasolano, C.S.I.C., Madrid, Spain
Marvin Charton
Department of Chemistry, Pratt Institute, Brooklyn, New York
Gabriel Chuchani
Centro de Química, Instituto Venezolano de Investigaciones Científicas, Caracas, Venezuela
John C. Dearden
School of Pharmacy and Chemistry, John Moores University, Liverpool, England
Philip S. Magee
BIOSAR Research Project, Vallejo, California
Masaaki Mishima
Institute for Fundamental Research of Organic Chemistry, Kyushu University, Fukuoka, Japan
Rafael Notario
Instituto de Química Física Rocasolano, C.S.I.C., Madrid, Spain
PREFACE
Quantitative structure property relationships (QSPR) have become a major method of chemical research. In the course of this development the field has suffered from fragmentation. Applications of QSPR are found in all major chemical disciplines including physical organic, physical, medicinal, agricultural, biological, environmental, and polymer chemistry. Frequently workers in one area are unaware of parameterizations and models used in other areas which they might well find useful. There is a common thread which runs through these widely diverse areas. The basic principles, parameterizations, and methodology are the same or similar throughout.
The object of this series is to provide interesting and timely reviews covering all aspects of the field. It is our hope that this will encourage the transfer of new methods, techniques, and parameterizations from the area in which they were developed to other areas that can make good use of them. In view of the widespread use of QSPR we believe that this is an important objective. We hope that this series will provide the cross-fertilization which we believe to be so sorely needed.
Marvin and Barbara I. Charton
Editors
EXPLORING THE ENERGETICS OF BINDING IN CHROMATOGRAPHY AND RELATED EVENTS
Philip S. Magee
I. Introduction to Adsorption Binding
II. The Modeling of Intermolecular Binding Forces
III. Binding of Organic Compounds on Inorganic Polymers
    A. Heats of Adsorption on Clay, Silica, and Alumina
    B. Adsorption Chromatography on Silica and Alumina
IV. Binding of Organic Compounds to Organic Polymers
    A. Heats of Adsorption on Cellulose and Activated Carbon
    B. Adsorption Chromatography on Cellulose and Paper
V. Binding of Organic Compounds on Bioorganic Polymers
VI. Conclusions
Note
References
Advances in Quantitative Structure Property Relationships Volume 2, pages 1-33. Copyright © 1999 by JAI Press Inc. All rights of reproduction in any form reserved. ISBN: 0-7623-0067-1
I. INTRODUCTION TO ADSORPTION BINDING
Binding between organic compounds and both simple and complex inorganic and organic polymers is a common event in chemistry, biochemistry, medicine, and agriculture. Experimentally, each event is accompanied by a measurable heat of adsorption, equilibrium constant, or physical retardation, as in chromatography. Under a given set of conditions, these measures will depend intimately on some function of the intermolecular forces expressed by the interaction of the known molecular structure with the generally unknown surface structure(s). As the molecular structure of the adsorbate varies, so also will the binding measure, and when these differences are expressed in energetic terms, the opportunity for mechanistic insight presents itself.
Correlating the differences in binding energetics with parameters derived from molecular structure can be accomplished in many ways, perhaps too many ways. Only the careful selection of descriptors that clearly reveal the nature of the intermolecular forces (imf) has any chance of revealing the underlying mechanisms of the binding process. Successful correlations with mechanistic descriptors should identify and quantify the contribution of each imf and leave a residual that roughly matches the experimental error of the binding measure. Ideally, the information contained in the QSPR equation should reveal a plausible mechanism of binding for the organic structure and some information about the nature of the surface structure(s). Although correlation is never clearly causal, an examination of many different binding QSPRs can lead to a high degree of mechanistic confidence that can be applied to new experimental designs and the crafting of either stronger or weaker adsorbents.
This review will explore directly measured heats of adsorption and equilibrium constants as well as retardation (R_f) in chromatography, where the energetics of adsorption are balanced against the solvation forces of the eluent. In all cases, the data from each source will be reduced to energetic differences and correlated with descriptors that clearly define the possible imf interactions. From these equations, conclusions will be drawn or inferred about the intimate relations between the adsorbate and the polymeric surface.
II. THE MODELING OF INTERMOLECULAR BINDING FORCES
Adsorption binding is complex and may express a full menu of intermolecular forces (imf). These range from simple ion-pairing to several forms of van der Waals forces: (1) dipole-dipole or Keesom interactions, (2) dipole-induced dipole or Debye interactions, and (3) induced dipole-induced dipole or London interactions. Beyond these interactions, which decrease progressively in specificity from ion pairing to London forces, most polar groups are capable of forming specific hydrogen bonds with suitably located donor and acceptor atoms. To confound the unraveling of the mechanism even more, each of these binding interactions is
subject to attenuation by strategically located groups capable of exerting a steric effect. Assessing the weighted mix of these diverse forces is the goal of QSPR modeling, with the frequent benefit of illuminating the binding mechanism to the point of a good working hypothesis. Let us clarify once more that structure-property modeling is not unambiguously causal. The relations are probabilistic in nature and, depending on the set under study, may contain accidental colinearities that conceal or oversimplify the picture. However, consistency in correlation over a substantial number of sets can provide very strong insights that lead to virtual mechanistic certainty.
There are a vast number of descriptors favored by various researchers, and this needs to be simplified for the purpose of this chapter. As the author has analyzed many of the most interesting data sets, it seems appropriate to cast all of the studies into a consistent descriptor format to facilitate cross comparisons. When citing completed work by other authors, an effort will be made to translate their description into compatible terms. In most cases, alternate descriptors for particular effects are colinear and the information derived from the QSPR is equivalent. The descriptors used by the author to model many of the data sets in this review will now be described and justified.
One of the most frequent forces expressed in binding events is the mutual polarization of complementary induced dipole moments (London forces). This interaction depends on polarizable volume and is well modeled by molar refraction (MR), Bondi's volume (Vw), or molecular surface area (Aw), all of which are highly colinear and can be considered in general as "bulk" descriptors. The choice of this author is MR for ease of calculation of both simple and complex structures and concordance with a major portion of the QSPR literature.
Polarizable volume is considered to be a fundamental property at the foundation of most intimate binding events. In addition to London forces, it should also model Debye interactions when the dipole moment is induced in the organic structure by preexisting dipoles on the polymer surface. For nonpolar compounds, MR is colinear with the extrathermodynamic measure, log P (octanol/water), and is commonly used to model nonspecific binding events, e.g., binding to bovine serum albumin. In data sets containing a preponderance of polar compounds, log P will frequently correlate better than MR and give the appearance of greater mechanistic meaning. Log P, however, is a highly composite and constitutive measure based on the difference in free energy of creating cavities in octanol saturated with water and water saturated with octanol. In addition to analyzable factors, such as polarizable volume, inductive effects, and all forms of hydrogen bonding, conformational changes and complex entropic effects accompany the phase transfer. Far from being a passive "shake flask" descriptor as many believe, Guy and Honda have shown that the interfacial transfer of methyl nicotinate from water to octanol is accompanied by an activation energy of 10.3 kcal/mol, which strongly suggests a mechanistic process. Nevertheless, log P can be a useful diagnostic when dealing with nonspecific binding to complex polymers such as
organic soils. It has recently been found that factoring log P into lipophilic (PL) and hydrophilic (PH) descriptors (PL + PH = log P) can lead to better correlations and stronger insight when combined with other intermolecular descriptors. Thus, many "simple" log P correlations reveal, on factoring, both a preference for either PL or PH substructures and the need for inclusion of supplementary MR and hydrogen-bond descriptors to complete the picture.
All of the dipolar interactions are subject to modification by the electronic properties of substructures and substituents. This applies particularly to Keesom and Debye interactions where dipoles are built into the substructures. Effects on London interactions would necessarily be weaker and possibly undetectable. In addition, both nucleophilic and electrophilic sites in adsorbed compounds, as well as potential sites for hydrogen bonding, are directly impacted by changes in substructure and substituents. A full range of electrical effect substituent constants as described by Charton, Hansch and Leo, and Magee is appropriate for defining substituent effects in binding. Factoring of the composite sigmas into inductive and resonance contributions can provide further insight into the mechanism.
One of the characteristic features of polar molecules binding to polar surfaces is the formation of one or more specific hydrogen bonds between acceptor and donor sites. Essentially electrostatic in nature, these bonds are flexibly directional and can contribute 2-10 kcal/mol to the total binding energy. Electron pairs on oxygen and nitrogen form the majority of H-bond acceptors, while donor groups are typically O-H, N-H, S-H, and specially activated C-H bonds. There is more than one way to describe the hydrogen bond in QSAR and QSPR relationships, with honest differences of opinion.
Hydrogen bonds can be described as continuous strength parameters based on experimental equilibria measured in nonpolar solvents or as simple chemical potentials based on the count of acceptor pairs on N and O, and on the count of X-H donors. A tremendous effort has been made to support the continuous H-bond model, which has many distinguished adherents. However, both methods are effective in correlation and no definitive study has been made to determine which of the two approaches is more effective. The question of H-bond potential or H-bond strength remains unresolved at this writing. As the author has had unusual success with the simple hydrogen bond potentials, this approach is used exclusively in the new work presented here.
All of the intermolecular forces are additionally subject to modification by steric interactions. For the purpose of mechanistic description, there are two equivalent measures of the steric effect, namely Taft's Es and Charton's upsilon (υ), both of which are derived from the kinetics of acid esterification and the hydrolysis of acid derivatives. Charton's observation that symmetrical and spinning-top substituents (H, X, CH3, CX3) show a linear correlation between Es and the van der Waals radii (r_v) has led to a convenient rescaling of steric effects, with υ directly and positively scaled to van der Waals projection. The projective size of substituents is visually clear and correlations with upsilon (υ) are substantially easier to interpret
than with the negatively scaled Es. The use of MR as a steric descriptor for substituents is seriously flawed, as MR is a volume function of r_v.
We can now summarize the descriptors used in this review in a general imf relation:
Binding Energy = f(MR, log P, PL, PH, sigmas, υ, HBA, HBD, I's)
Sigmas refer to composite and factored electronic effects, HBA and HBD are the hydrogen-bond acceptor and donor counts, and I's refer to the occasional use of indicator variables to reset the intercept for discontinuous changes in molecular structure. As most of the descriptors are on a free energy scale (RT log X), the experimental binding energy must be expressed on a related scale to maximize correlation and allow direct comparisons between QSPRs.
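As a rough illustration of what placing a measured equilibrium constant on the same free energy scale as the descriptors can look like, the sketch below converts log K to RT ln K in kcal/mol. The function name, the 298.15 K default, and the sign convention (larger values mean stronger binding) are assumptions made for illustration and are not taken from the studies reviewed here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def free_energy_scale(log10_k, temperature_k=298.15):
    """Return RT ln K in kcal/mol for a base-10 log equilibrium constant.

    Hypothetical helper: the sign convention (positive = stronger binding)
    is an assumption of this sketch.
    """
    return R_KCAL * temperature_k * math.log(10.0) * log10_k

# One log unit of K corresponds to roughly 1.36 kcal/mol near room temperature.
one_log_unit = free_energy_scale(1.0)
print(f"RT ln 10 at 298.15 K = {one_log_unit:.3f} kcal/mol")
```

On this scale a tenfold change in an equilibrium constant is worth about 1.4 kcal/mol, which gives a feel for the size of the regression coefficients quoted in the equations of this chapter.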
III. BINDING OF ORGANIC COMPOUNDS ON INORGANIC POLYMERS
The binding of organic compounds to inorganic polymers where suitable data exist is limited mainly to clays, alumina, and silica. Moreover, the data derive from two principal sources, namely the direct measurement of heats of adsorption and the indirect measurement of binding energy through retention in adsorption chromatography. These measures are sufficiently different to require separate treatment.
A. Heats of Adsorption on Clay, Silica, and Alumina
Clay
Despite the broad uses of clays in catalytic and adsorptive processes, there are very few references to comparative adsorption studies, although a vast literature of experimental studies must reside in company files. In general, clays have alternating layers of hydrated alumina and silica with a broad assortment of Group IA and IIA cations to balance the negative charge of the layers. Intercalation and a variety of bonding mechanisms have been studied by FT infrared and by differential scanning calorimetry. Intercalation is proposed for nitrobenzene and trichloroethylene, while phenol, 4-chloroaniline, and triethanolamine show clear evidence of hydrogen bonding. As will be seen from studies of silica and alumina, it is highly probable that these alternating layers in clay participate in additional modes of complex binding.
Silica
Pure silica has a great number of different crystalline and amorphous forms based on the tetrahedral SiO4 unit that is internally constructed of polymeric Si-O-Si bonds. At the surface, however, these bonds are broken by varying degrees of hydration, with silicon preserving a tetrahedral structure by forming single SiOH and gem-Si(OH)2 structures, the density of which depends on the degree of hydration. Under vacuum, water is slowly removed in the following approximate stages: physically adsorbed water at temperatures up to 400 K, hydrogen-bonded water up to 500 K, hydrogen-bridged geminal silanols and vicinal silanols up to 900-1000 K, and isolated silanols up to 1000-1200 K. Hydrogen-bonding opportunities for adsorbates to SiOH groups are generally classified as isolated, geminal, and vicinal. Moreover, the surface silanol groups appear to function exclusively as H-bond donors.
A series of simple compounds of varying shape, basicity, and dipole moment were selected for adsorption onto hydroxylated and dehydroxylated silica. Heats of adsorption from hexane on silica were measured by microcalorimetry for n-butylamine, n-butanol, n-butyraldehyde, n-butyric acid, n-nitropropane, and DMF. The dominant descriptor on both silicas was basicity (pKa of the adsorbate conjugate acid), with no measurable dependence on shape or dipole moment. Heats of adsorption were lower for the dehydroxylated silica, and the weaker mechanism of Lewis acid binding is thought to replace silanol H-bonding.
Binding of alcohols from the gas phase and from carbon tetrachloride on silica has been studied by infrared spectroscopy to measure the weak H-bonding to surface silanol groups. Most of the alcohols studied showed evidence of reactive chemisorption by cleavage of Si-O-Si groups. The strongest H-bonds were formed between single alcohol molecules adsorbed on adjacent silanol groups. This type of double H-bridge has also been observed for methyl mercaptan on porous silica gel. In other binding studies on partially dehydroxylated silica, the geminal hydroxyl groups are shown to provide the most stable adsorption sites for ammonia and pyridine, both of which form weak hydrogen bonds.
The study of aromatic adsorbates provides additional detail, as substituent effects provide insight into the details of the binding process.
Spectroscopic shifts of adsorbed ester carbonyl groups show linear relations with the pKa of the parent acids in binding from carbon tetrachloride onto activated silica. Of particular interest is the finding that silanol binding can be single to the carbonyl oxygen or the aromatic π system and, in the case of benzyl benzoate, both types of H-bonds are formed by adjacent silanol groups. Silica immersed in heptane solutions of six substituted anisoles shows H-bonding interactions between surface silanols and methoxy groups as well as ring π systems. A plot of the H-bonded SiOH frequencies is linear with the Hammett sigma of the anisole substituents. In the same study, surface silanol groups were found to form hydrogen bonds to the nitro groups of 4-nitroanisole, nitrobenzene, and 4-nitrotoluene. In direct comparison with this study, the H-bond interactions of triphenylsilanol in carbon tetrachloride with the same set of substituted anisoles and nitro compounds show nearly identical behavior. Substituent sigma plots of anisoles, nitrobenzenes, and other substituted benzenes with the shift in silanol frequency show very similar slopes for both surface and solution silanols, clearly dispelling the need for a special surface effect. Studies of simple aromatics on silica and chlorinated silica confirm π system
adsorption binding and also detect H-bonding of silanol groups to fluorine in fluorobenzenes. Phenols in heptane bind to silica through H-bonds formed by silanol donors to either the hydroxyl group electron pairs or the π system of the ring. There is no evidence for phenol as an H-bond donor. The eleven phenols in this study show a complex plot of sigma against the SiOH frequency, reflecting changes in binding mechanism with the variation in substituents. For phenols with large 2,6-groups, the predominant mode of adsorption is silanol binding to the aromatic π system, with no evidence of steric hindrance for 2,6-di-n-butylphenol. For pentamethylphenol, however, adsorption is exclusively through silanol binding to the hydroxy group electron pairs.
Alumina
Although the alumina surface is in some ways analogous to that of silica, there are significant differences. While silanol groups function only as H-bond donor groups, the surface oxygen groups on alumina have higher ionic character and can also function as H-bond acceptors. In addition, alumina activated at 500 °C is also populated with incompletely coordinated aluminum ions with strong Lewis acid properties.
The free energies of adsorption of 22 phenylarenes and 29 fused aromatic hydrocarbons in pentane on alumina (1.3% water) have been studied by Snyder. The data show that phenylarenes are all nonplanar in solution but tend to adsorb in the plane of the adsorbent surface. The adsorption energies correlate with the physical size (carbon count, minimum width, maximum length) of the phenylarenes. An analogous relation holds for the fused aromatic hydrocarbons, the list of which extends from benzene to coronene. The author has correlated the free energies of adsorption of the first 15 hydrocarbons (benzene to perylene) with polarizable volume (MR) in order to relate size to energy in a more mechanistic relation. One outlier binding more strongly than predicted (naphthacene) was deleted. This simple correlation is suggestive of London forces due to polarizable volume with binding to Lewis acid sites.
Hydrocarbons: benzene, naphthalene, azulene, acenaphthylene, phenanthrene, anthracene, pyrene, fluoranthene, triphenylene, 1,2-benzanthracene, chrysene, naphthacene, 1,2- and 1,3-benzpyrene, and perylene.
$$\Delta F\ (\mathrm{kcal/mol}) = -0.0849\,\mathrm{MR}\ (-22.42) + 0.238 \qquad (1)$$
$$n = 14, \quad s = 0.230, \quad r^2 = 0.977, \quad F = 502.85$$
(Student T value in parentheses)
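A minimal sketch of how Eq. 1 can be applied and cross-checked is given below. The MR inputs are borrowed from the MRAr column of Table 2 on the assumption that both sets use the same MR scale, and the function name is hypothetical. The final lines use the general fact that, for a regression with a single descriptor, the F statistic equals the square of the descriptor's T value.

```python
def delta_f_alumina(mr):
    """Eq. 1: free energy of adsorption on alumina (kcal/mol) from MR."""
    return -0.0849 * mr + 0.238

# MR values taken from Table 2 (assumed to be on the same scale as Eq. 1).
for name, mr in [("naphthalene", 41.80), ("anthracene", 57.21), ("pyrene", 64.08)]:
    print(f"{name:12s} MR = {mr:5.2f}  dF = {delta_f_alumina(mr):6.2f} kcal/mol")

# Consistency check on the quoted statistics: with one descriptor, F = T^2.
t_value, f_reported = -22.42, 502.85
print(f"T^2 = {t_value ** 2:.2f}, reported F = {f_reported}")
```

The squared T value (about 502.7) agrees with the reported F to within rounding, which is a quick way to confirm that a single-descriptor equation has been transcribed correctly.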
In Eq. 1 and subsequent equations, n indicates the number of data points in the set, s is the standard error from the regression line, r² is the explained variance, and F is the Fisher statistic that measures the overall strength of the relation. The number
in parentheses after the descriptor is the Student T value. It represents the ratio of the regression coefficient to the statistical error in the coefficient. Minimal values for significance are T = 2.00 and F = 4.00.
In another revealing study, Snyder measured the free energies of adsorption of 66 nitrogen compounds (pyridines, anilines, and pyrroles) from pentane on alumina (3.6% water). Excellent linear plots with Hammett's sigma are obtained for non-ortho-substituted pyridines, quinolines, anilines, and indoles. All have negative slopes except the indoles, the slope of which is strongly positive. Snyder concludes that all but the indoles are binding by nucleophilic transfer. The indoles/pyrroles bind by proton transfer to the alumina surface (H-bonding). The largest and most diverse group are the 3,4-substituted pyridines, and these were regressed by the author (Eq. 2, Table 1). This relation clearly shows that nucleophilic binding dominates, with electron donor groups lowering the binding free energy.
$$\Delta F\ (\mathrm{kcal/mol}) = -0.0480\,\mathrm{MR}\ (-2.74) + 2.06\,\sigma\ (12.18) - 1.14\,\mathrm{HB}\ (-10.27) - 5.44 \qquad (2)$$
$$n = 14, \quad s = 0.185, \quad r^2 = 0.949, \quad F = 62.47$$
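To make the roles of the three descriptors in Eq. 2 concrete, the sketch below evaluates the equation for a few entries of Table 1 and compares the result with the tabulated estimates. The function and variable names are hypothetical; the coefficients and descriptor values are those quoted in Eq. 2 and Table 1, and small differences are expected because the published coefficients are rounded.

```python
def delta_f_pyridine(mr, sigma, hb):
    """Eq. 2: adsorption free energy (kcal/mol) of 3,4-substituted pyridines."""
    return -0.0480 * mr + 2.06 * sigma - 1.14 * hb - 5.44

# (compound, MR, sigma, HB, estimate listed in Table 1)
rows = [
    ("Pyridine", 1.03, 0.00, 0, -5.49),
    ("4-Cl", 6.03, 0.23, 0, -5.26),
    ("3-NH2", 5.42, -0.16, 1, -7.17),
]

for name, mr, sigma, hb, tabulated in rows:
    calc = delta_f_pyridine(mr, sigma, hb)
    print(f"{name:9s} calculated {calc:6.2f}  tabulated {tabulated:6.2f}")
```

The positive sigma coefficient means that electron-withdrawing groups make the free energy less negative, while an H-bonding substituent (HB = 1) strengthens binding by about 1.1 kcal/mol.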
Table 1. Adsorption Energies of Pyridines from Pentane on Alumina (3.6% Water), 24 °C

Pyridines    ΔF, kcal/mol   Yest(a)    MR       σ      HB
3,4-DiMe        -6.48        -6.48    11.30   -0.24    0
4-Me            -6.05        -6.06     5.65   -0.17    0
4-Et            -6.19        -6.24    10.27   -0.15    0
3-NH2           -6.84        -7.17     5.42   -0.16    1
3-Me            -5.99        -5.86     5.65   -0.07    0
Pyridine        -5.57        -5.49     1.03    0.00    0
4-Cl            -5.40        -5.26     6.03    0.23    0
3-Acetyl        -6.69        -6.34    11.18    0.38    1
3-Formyl        -6.27        -6.19     6.88    0.35    1
3-Cl            -4.94        -4.97     6.03    0.37    0
3-Br            -5.02        -5.07     8.88    0.39    0
4-CN            -5.43        -5.53     6.33    0.66    1
3-CN            -5.74        -5.74     6.33    0.56    1
3,5-DiCl        -4.29        -4.50    12.06    0.74    0

Note: (a) Equation 2.
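The statistics quoted with each equation (n, s, r², F) can be recovered approximately from the observed and estimated columns of a table. The sketch below does this for Table 1 using the standard definitions for a linear regression with k descriptors. The helper name is hypothetical, the values are transcribed from Table 1, and because the tabulated estimates are rounded to two decimals the results only approximate the figures printed with Eq. 2.

```python
import math

# Observed and estimated adsorption free energies from Table 1 (kcal/mol).
observed = [-6.48, -6.05, -6.19, -6.84, -5.99, -5.57, -5.40,
            -6.69, -6.27, -4.94, -5.02, -5.43, -5.74, -4.29]
estimated = [-6.48, -6.06, -6.24, -7.17, -5.86, -5.49, -5.26,
             -6.34, -6.19, -4.97, -5.07, -5.53, -5.74, -4.50]

def regression_stats(obs, est, k):
    """Return (n, s, r2, F) for a fit with k descriptors plus an intercept."""
    n = len(obs)
    mean = sum(obs) / n
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))   # residual sum of squares
    sst = sum((o - mean) ** 2 for o in obs)             # total sum of squares
    dof = n - k - 1                                     # residual degrees of freedom
    s = math.sqrt(sse / dof)
    r2 = 1.0 - sse / sst
    f = ((sst - sse) / k) / (sse / dof)
    return n, s, r2, f

n, s, r2, f = regression_stats(observed, estimated, k=3)  # MR, sigma, HB
print(f"n = {n}, s = {s:.3f}, r2 = {r2:.3f}, F = {f:.1f}")
```

Agreement with the reported s = 0.185 and r² = 0.949 is a useful check that a table has been read correctly before its data are reused.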
However, other intermolecular forces are also at work, with both London forces (MR) and the H-bonding of polar substituent groups (amino, acetyl, formyl, and cyano) making a contribution.
The binding of phenols to an aluminum oxide surface was studied by Holmes-Farley using a novel procedure. Thermally evaporated aluminum deposited on clean glass slides was exposed to oxygen to generate an aluminum/aluminum oxide surface for adsorption of a broad selection of 2-, 3-, and 4-substituted phenols in competition with acetic acid. Adsorption was measured by evaluating water contact angles. Plots of the binding constants (log 1/K) against phenol pKa were linear for the 3,4-substituents and roughly linear against substituent radius for the 2-substituents. Binding of the 3,4-substituted phenols gives a reasonable linear plot against pKa when three more strongly binding acids are included (acetic, benzoic, and 4-trifluoromethylbenzoic). These observations provide strong evidence for a binding event dominated by the formation of H-bonds. As binding increases with pKa, the phenols are acting as H-bond donors to aluminum oxide groups, in contrast to binding on silica.
In comparative studies, Glass and Ross have explored the differences in binding of hydrogen sulfide, methanethiol, ethanethiol, and dimethyl sulfide on silica gels and alumina. Heat-treated silica gel (20 h each at 240, 550, and 700 °C) and heat-treated γ-alumina (20 h at 700 °C) were used in the adsorption experiments. Limiting heats of adsorption were substantially greater for alumina than for silica, though both display the same order, with ΔH(ads) increasing with methyl substitution (Al/Si): H2S (16.0/5.5 kcal/mol), MeSH (16.5/7.0), EtSH (18.4/10.0), Me2S (20.7/12.2). In each case, the data are consistent with donor H-bonds from AlOH and SiOH surface hydroxyl groups. The difference in binding strength is consistent with the greater acidity of the AlOH groups.
B. Adsorption Chromatography on Silica and Alumina
Adsorption chromatography is an indirect measure of binding energy, depending on the careful selection of eluent to separate the compounds from both the origin and the solvent front. The observed R_f (compound/solvent travel) can be cast as a relative binding energy by the following transformation:
$$R_M = \log\left[\frac{1}{R_f} - 1\right]$$
The resulting scale extends from diminishing positive values through zero at R_f = 0.5 to increasing negative values as R_f increases and approaches 1.0. The scale is therefore positively related to binding energy. Reproducibility is a major problem with adsorption chromatography, as discussed by Dallas in reference to thin layer techniques. Moreover, the error of observation increases as the spot nears either the origin or solvent front, a fact which can lead to unusual residuals in a QSPR analysis. Despite these problems, the data remain consistently analyzable with excellent results because the order of the R_f and R_M values is never in doubt. Whether or not
two successive TLC plates are exact duplicates is irrelevant if the relative R_M values lead to correlations that differ only in the intercept. A large number of TLC studies have been analyzed by Magee in mechanistic QSPR terms. In addition, Magee has found that rank transform regression on ranked R_M values and descriptors can be substantially stronger than regression on real values. This supports the concept that TLC order as visually observed is absolute, while the measured spot positions increase in error as they recede from the midpoint in either direction.
Silica
The general theory for the correlation and prediction of R_f values in TLC has been extensively reviewed by Snyder, one of the pioneers of binding energetics for adsorption of organic compounds on silica and alumina. Snyder was the first to relate R_f to an equilibrium distribution coefficient, K, and to model K as a function of adsorbent, solvent, and solute properties. The parameter S° is a dimensionless adsorption energy of the solute from pentane solution onto an adsorbent of standard activity. The value is positively scaled to increasing binding energy. It is a function only of the solute with respect to silica or alumina and can be calculated by additivity for a vast number of additional solutes from those experimentally measured. Although Snyder also considered other descriptors, such as the molecular area of the solute, A_s, and the eluent strength of the solvent, it is S° that largely determines the variation of R_f with solute structure. His descriptors and concepts have been widely used by others in systematizing TLC observations. An excellent example is the work of Vernin and Vernin in applying Snyder's theory to the linear adsorption chromatography of 100 thiazoles on silica and alumina. They were able to separate polarization and steric effects in addition to demonstrating the additive nature of the interactions.
Analysis of a diverse set of aromatic hydrocarbons developed on silica gel G with diisobutylene as eluent is reported by Magee (Eq. 3, Table 2). Many of the aromatics are substituted with C1-C18 alkyl chains. The observed R_f values are transformed to the energy-scaled R_M for correlation and found to cover a broad range from -0.45 to 0.39. The only reasonable descriptor for hydrocarbon adsorption is the polarizable volume, MR, which proved to be uncorrelated with R_M (r = 0.09). Factoring MR into contributions from the aromatic rings, MRAr, and the aliphatic side chains, MRAl, led to the dramatic discovery of opposing volume effects.
Only the aromatic groups are bound to silica, while the aliphatic side chains are strongly repelled. A further improvement was realized with an indicator variable, ICh, for aliphatic chains longer than C5, which could be reasonably assumed to lose contact with the silica surface through flexibility. The resulting correlation accommodates the entire set and clearly reveals the nature of the binding process.
$$R_M = 0.0078\,\mathrm{MRAr}\ (8.48) - 0.0027\,\mathrm{MRAl}\ (-2.31) - 0.190\,\mathrm{ICh}\ (-3.36) - 0.205 \qquad (3)$$
$$n = 36, \quad s = 0.077, \quad r^2 = 0.910, \quad F = 107.9$$
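The R_f to R_M transformation and Eq. 3 are simple enough to sketch together. The block below converts between R_f and R_M and evaluates Eq. 3 for two entries of Table 2. The function names are hypothetical, the inverse transformation is just the algebraic rearrangement of the R_M definition, and base-10 logarithms are assumed.

```python
import math

def rm_from_rf(rf):
    """R_M = log10(1/R_f - 1); valid for 0 < R_f < 1."""
    return math.log10(1.0 / rf - 1.0)

def rf_from_rm(rm):
    """Inverse transformation: R_f = 1 / (1 + 10**R_M)."""
    return 1.0 / (1.0 + 10.0 ** rm)

def rm_eq3(mr_ar, mr_al, long_chain):
    """Eq. 3: R_M on silica gel G from aromatic and aliphatic MR contributions."""
    return 0.0078 * mr_ar - 0.0027 * mr_al - 0.190 * long_chain - 0.205

# The scale passes through zero at R_f = 0.5 and turns negative above it.
for rf in (0.2, 0.5, 0.8):
    print(f"R_f = {rf:.1f} -> R_M = {rm_from_rf(rf):+.3f}")

# Descriptor values from Table 2: naphthalene and n-C12-phenyl.
print(f"naphthalene  R_M(est) = {rm_eq3(41.80, 0.00, 0):+.2f}")
print(f"n-C12-phenyl R_M(est) = {rm_eq3(25.36, 56.47, 1):+.2f}")
```

The opposite signs of the MRAr and MRAl terms reproduce the opposing volume effects: adding aromatic volume raises R_M (stronger binding), whereas adding aliphatic volume lowers it.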
Table 2. TLC of Aromatic Hydrocarbons on Silica Gel G (eluent = diisobutylene)

Aromatic HC                  R_M    Yest(a)   MRAr    MRAl    ICh
n-C15-phenyl                -0.45   -0.40    25.36   74.95   1.0
n-C12-phenyl                -0.37   -0.35    25.36   56.47   1.0
n-C8-phenyl                 -0.35   -0.30    25.36   37.99   1.0
n-C6-phenyl                 -0.27   -0.27    25.36   28.75   1.0
1,3,5-TriEt-phenyl          -0.27   -0.11    23.30   30.81   0.0
Cyclo-C6-phenyl             -0.19   -0.27    25.36   26.89   1.0
1,2,4-TriMe-phenyl          -0.18   -0.07    23.30   16.95   0.0
1-Pr-2,4,6-TriMe-phenyl     -0.12   -0.12    22.27   31.84   0.0
PentaMe-phenyl              -0.05   -0.12    21.24   28.25   0.0
Durene                      -0.02   -0.09    22.27   22.60   0.0
TetraH-naphthalene           0.02   -0.07    24.33   18.48   0.0
2-C18-naphthalene           -0.27   -0.31    24.33   84.19   1.0
Naphthalene                  0.00    0.12    41.80    0.00   0.0
Acenaphthene                 0.03    0.19    49.70    0.00   0.0
1,4-DiMe-naphthalene         0.03    0.08    39.74   11.30   0.0
1,5-DiMe-naphthalene         0.03    0.08    39.74   11.30   0.0
2,4,6-TriMe-naphthalene      0.12    0.05    38.71   16.95   0.0
4-Me-diphenyl                0.12    0.17    49.73    5.65   0.0
2,3-DiMe-naphthalene         0.18    0.08    39.74   11.30   0.0
Diphenyl                     0.18    0.19    50.76    0.00   0.0
Fluorene                     0.21    0.17    48.70    4.62   0.0
Diphenylmethane              0.27    0.18    50.72    4.62   0.0
1-Phenylnaphthalene          0.19    0.31    66.13    0.00   0.0
9-Methylanthracene           0.21    0.22    56.18    5.65   0.0
Anthracene                   0.29    0.24    57.21    0.00   0.0
Phenanthrene                 0.29    0.24    57.21    0.00   0.0
2-Phenylnaphthalene          0.31    0.31    66.13    0.00   0.0
2,3-Benzofluorene            0.39    0.29    64.07    4.62   0.0
1,4-Diphenylbenzene          0.39    0.39    75.09    0.00   0.0
Pyrene                       0.31    0.30    64.08    0.00   0.0
Chrysene                     0.33    0.37    72.87    0.00   0.0
2,2'-Dinaphthyl              0.41    0.44    81.54    0.00   0.0
1,2-Dihydronaphthalene       0.07    0.04    34.29    9.24   0.0
9,10-Dihydrophenanthrene     0.27    0.15    48.70    9.24   0.0
Fluoranthene                 0.35    0.30    64.07    0.00   0.0
Perylene                     0.39    0.42    79.48    0.00   0.0

Note: (a) Equation 3.
Klemm and coworkers report the TLC study of nitroarenes on both silica and alumina with benzene as the eluent (Eqs. 4 and 5, Table 3).'*^ The author analyzes their set of 15 nitrobenzenes substituted with 1-4 methyl groups and with MeO and a second nitro group. The variation is not great, but 7/15 have ortho substitution with the possibility of steric effects. Descriptors for the study are the summation of substituent MR, σ, υ, and an indicator variable, HBA, for the H-bond acceptor qualities of the MeO and NO2 groups. The data are colinear for both silica and alumina (r² = 0.914) and both depend on the same descriptors. Strong positive binding through the nitro and methoxy acceptor groups by silanol and aluminol overcomes a small negative bulk effect. No steric or electronic effects are observed. The H-bonding is stronger on alumina, consistent with the more acidic AlOH groups, while the negative bulk effect is somewhat larger.

Silica: R_M = -0.0162 ΣMR (-4.92) + 0.223 HBA (6.65) + 0.212
n = 14 (1 outlier)   s = 0.061   r² = 0.836   F = 27.98   (4)

Alumina: R_M = -0.0261 ΣMR (-5.38) + 0.348 HBA (7.23) - 0.116
n = 15   s = 0.091   r² = 0.851   F = 34.17   (5)
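As a minimal sketch of how Eqs. 4 and 5 are applied, the snippet below evaluates both regressions for two entries of Table 3. The coefficients are copied from the equations; the function names and the choice of example compounds are illustrative assumptions, not part of the original analysis. With ΣMR = 5.65, HBA = 0 (3-Me) and ΣMR = 7.36, HBA = 2 (4-NO2), the results agree with the tabulated estimates (0.120 and -0.263; 0.539 and 0.388).

```python
# Coefficients taken from Eqs. 4 (silica) and 5 (alumina).
# Function names are hypothetical, chosen for this illustration only.

def rm_silica(sum_mr, hba):
    """Estimated R_M on silica (Eq. 4)."""
    return -0.0162 * sum_mr + 0.223 * hba + 0.212


def rm_alumina(sum_mr, hba):
    """Estimated R_M on alumina (Eq. 5)."""
    return -0.0261 * sum_mr + 0.348 * hba - 0.116


# (sum of substituent MR, H-bond acceptor indicator) as listed in Table 3
examples = {"3-Me": (5.65, 0), "4-NO2": (7.36, 2)}

for name, (sum_mr, hba) in examples.items():
    print(name,
          round(rm_silica(sum_mr, hba), 3),
          round(rm_alumina(sum_mr, hba), 3))
```

The HBA term shifts the estimate by the same increment regardless of bulk, which is why the two nitro/methoxy-bearing groups separate so cleanly from the methyl-only compounds in both equations.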
Table 3. TLC of Nitrobenzenes on Silica and Alumina (eluent = benzene), 28.3 °C
Nitrobenzene         R_M (silica)   Yest(a)   R_M (alumina)   Yest(b)   ΣMR     HB
2,6-DiMe             -0.017          0.046    -0.501          -0.384    10.27   0
2,4,6-TriMe          -0.017         -0.029    -0.477          -0.505    14.89   0
4-NO2-2,3,5,6-Me4     0.000          0.239    -0.477          -0.094    25.84   2
2-Me                  0.070          0.120    -0.368          -0.263     5.65   0
2,3-DiMe              0.052          0.046    -0.348          -0.384    10.27   0
3-Me                  0.105          0.120    -0.308          -0.263     5.65   0
4-Me                  0.140          0.120    -0.231          -0.263     5.65   0
3,4-DiMe              0.140          0.046    -0.213          -0.384    10.27   0
3-NO2-2-Me            0.176          0.464    -0.176           0.267    11.98   2
4-NO2                 0.250          0.539    -0.052           0.388     7.36   2
3-NO2-4-Me            0.308          0.464     0.000           0.267    11.98   2
3-NO2                 0.327          0.539     0.017           0.388     7.36   2
4-MeO                 0.410          0.531     0.035           0.375     7.87   2
2-NO2                 0.288          0.539     0.087           0.388     7.36   2
2-MeO                 0.550(c)       0.531     0.140           0.375     7.87   2
Notes: (a) Equation 4. (b) Equation 5. (c) Outlier. Deleted from Eq. 4.
Pyridines provide an interesting departure in adsorption behavior as modeled by Magee.^^'"^ A set of 25 pyridines developed on silica gel with acetone includes a number of 2- and 6-substitutions that permit analysis of the steric effect in addition to electronic and bulk effects. Binding is complex and both MR and π (from log P) are supported as bulk effects. The negative electronic and steric effects clearly identify nucleophilic binding by the pyridine nitrogen, either to silanol hydroxyl groups or hypervalently to silicon. Hypervalent binding to silicon is suggested, as H-bonding to SiOH groups would be less likely to show a significant steric effect.

R_M = 0.00687 MR (2.36) - 0.139 π (7.03) - 0.583 σ (10.36) - 0.222 υ2,6 (4.09) - 0.0892
n = 25   s = 0.0695   r² = 0.884   F = 38.17   (6)
In the review article already cited,"*^ Snyder provides experimental adsorption energies for 29 pyridines (Eq. 7, Table 5). These values are positively scaled to binding energy. About half of the substituents, 15/29, are potential H-bonding groups and are coded HB = 1. This set is analyzed by the author and confirms the strong electronic effect supporting nucleophilic binding by pyridines on silica. The four deletions from Snyder's set are the 2- and 4-hydroxy- and aminopyridines, which are not true pyridines.

Adsorption energy = -2.66 σ (-11.75) + 3.69 HB (24.98) + 7.58
n = 25 (4 deleted)   s = 0.360   r² = 0.969   F = 339.1   (7)
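The statistics quoted with each equation (n, r², F) are not independent, which offers a quick way to check a reported correlation. The sketch below assumes each equation is an ordinary least-squares fit with an intercept and k descriptors, so that F = (r²/k) / ((1 - r²)/(n - k - 1)); this assumption and the function name are mine, not stated in the text. Under it, the reported F values for Eqs. 4, 6, and 7 reproduce the reported r² values to three decimals.

```python
# Assumes ordinary least squares with an intercept; k = number of descriptors.
# r2_from_f is a hypothetical helper name used only for this illustration.

def r2_from_f(f_stat, n, k):
    """r^2 implied by the overall F statistic for n points and k descriptors."""
    return k * f_stat / (k * f_stat + (n - k - 1))


# label: (F, n, k, reported r^2), values as quoted with each equation
reported = {
    "Eq. 4": (27.98, 14, 2, 0.836),
    "Eq. 6": (38.17, 25, 4, 0.884),
    "Eq. 7": (339.1, 25, 2, 0.969),
}

for label, (f_stat, n, k, r2) in reported.items():
    print(label, round(r2_from_f(f_stat, n, k), 3), "reported:", r2)
```

The same relation can be used to cross-check any of the later equations where one of the printed statistics looks doubtful.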
Alumina
The similarity of the silanol and aluminol surfaces is apparent in Eqs. 4 and 5, which reveal differences in binding energetics but not in the basic mechanism of binding. It is reasonable to infer that the binding events on one will be mirrored to a significant extent on the other. These analogies are especially evident in the large comparative study on 100 thiazoles by Vernin and Vernin, where they find parallel trends in the energy of binding to alumina and silica."*^ However, the sensitivity of the thiazoles to steric effects of alkyl groups is more important on alumina than silica, in accord with stronger and closer binding to the surface.
Snyder has done extensive work on the retention volumes of mono- and polyhalo-substituted benzenes on slightly hydrated alumina (0.7% water) with pentane as an eluent."^^ In addition to substituent MR, steric effects were tested for adjacent halogens, and the electronic effect (sum of sigmas) is referenced to the nearest hydrogen adjacent to the smallest halogen. Thus, the electronic effect of 1-fluoro-2-chlorobenzene is the sigma sum of p-fluoro and m-chloro as though the compound were 1[H]-2-fluoro-3-chlorobenzene. This treatment was found by the author to be superior to methods reversing the positional effect of the halo groups. Bulk and electronic effects are strongly supported with no evidence of a steric effect.
Table 4. Substituted Pyridines Developed by Acetone on Silica Gel, TLC
Pyridine   R_M   Yest(a)   Σπ   ΣMR   Σσ   υ2,6
2-Aceto
-0.35
-0.32
-0.55
14.59
0.50
0.50
3-Amino
0.09
0.21
-1.23
8.51
-0.16
0.0
2-Benzoyl
-0.39 -0.41
-0.46 -0.37
0.95
0.50
0.23
3-Bromo
-0.31
-0.34
0.86
34.30 11.97 11.97
0.43
2-Bromo
0.65 0.00
2-Chloro
-0.37
-0.34
0.71
9.12
-0.27
-0.31
0.71
9.12
0.23 0.37
0.55
3-Chloro 2,4-Dimethyl
0.02
-0.06
1.02
13.36
-0.34
0.52
2,6-Dimethyl
-0.16
1.02
13.36
-0.34
1.04
2-Ethyl
-0.21
-0.16 -0.17
1.02
13.36
-0.15
0.56
2-Fluoro 3-Hydroxy
-0.35
-0.13
0.14
0.27
-0.01
-0.67
4.01 5.94
0.06
-0.05 -0.37
0.12
0.00
-0.35
1.12
-0.07
-0.09
0.51
17.03 8.74
0.35 -0.17
0.00 0.52
3-Methyl 4-Methyl
-0.09
-0.05
0.51
8.74
-0.07
0.00
0.03
0.51
8.74
-0.17
0.00
2-n-Propyl
1.55
18.05
-0.13
0.68
Pyridine
-0.25 -0.07
0.00 -0.27
0.00
0.00
-0.31
4.12 9.97
0.00
2-Formyl(CHO)
-0.03 -0.27
0.42
0.50
3-Formyl 4-Formyl
-0.14 -0.10
-0.14 -0.17
9.97 9.97
0.35 0.42
0.00 0.00
0.09 0.19
0.00 0.10 0.10 -0.03
10.28 10.28 10.28 17.98
0.00 0.00 0.00 -0.51
0.53 0.00 0.00 0.52
3-Iodo 2-Methyl
2-Hydroxymethyl 3-Hydroxymethyl 4-Hydroxymethyl 2,4,6-Trimethyl
0.19 -0.02
0.86
-0.65 -0.65 -0.65 -1.03 -1.03 -1.03 1.53
0.39
0.00
Note: (a) Equation 6.
The level of correlation and the irregular pattern of the residuals suggest that other factors may be involved, perhaps H-bonding to the smaller halogen substituents. As log R increases with binding, the positive bulk effect is opposed by increasing electron withdrawal from the ring, consistent with binding to Lewis acid sites of this highly activated alumina.

Halobenzenes: Mono-F, Cl, Br, I and all combinations of 1,2-, 1,3-, and 1,4-disubstitution, 1,2,3- and 1,2,4-triCl, 1,2,4,5-tetraCl, 1,2-diCl-4-Br, 1,2-diCl-4-I, 1,3,5-triBr, 1,2,4,5-tetraBr, hexaCl

log R = 0.0376 ΣMR (9.69) - 0.614 Σσ (-6.00) + 0.366
Table 5. Adsorption Energies of Substituted Pyridines on Silica
Pyridine   Adsorption energy   Yest(a)   σ   HB
Pyridine
7.7
7.6
0.00
2-Methyl
8.1
8.0
-0.17
0
3-Methyl
7.8
-0.07
0
4-Methyl
7.8 8.2
8.0
-0.17
0
2,4-Dimethyl
8.5
8.5
-0.34
0
2,6-Dimethyl
8.1
8.5
-0.34
0
2,4,6-Trimethyl
9.1
8.9
-0.51
0
2-Ethyl
8.0
8.0
-0.15
0
2-n-Propyl
7.5
7.9
-0.13
0
2-Hydroxy
12.4^
12.3
-0.37
3-Hydroxy
11.0
0.12
4-Hydroxy
10.8 15.2^
-0.37
2-Amino
10.9^
-0.66
3-Amino 4-Amino
11.3 12.9^
12.3 13.0 11.7 13.0
-0.66
2-Hydroxymethyl
11.7
11.3
0.00
3-Hydroxymethyl
12.1
11.3
0.00
4-Hydroxymethyl
12.7
11.3
0.00
9.5
10.2
0.42
2-Formyl(CHO)
0
-0.16
3-Formyl
10.2
10.3
0.35
4-Formyl
10.1
10.2
0.42
2-Aceto
9.8
9.9
0.50
10.3 6.4 6.5 6.5 6.9
10.1 11.1 7.0 7.0 6.6
0.43 0.06 0.23 0.23 0.37
0 0 0
7.0 7.0
6.5 6.6
0.39 0.35
0 0
2-Benzoyl 2-Fluoro 2-Chloro 2-Bromo 3-Chloro 3-Bromo 3-Iodo
Notes: (a) Equation 7. (b) Not included in Eq. 7.
n = 42   s = 0.152   r² = 0.711   F = 48.01   (8)
Another retention study by Snyder concerns substituted phenols adsorbing from 20% i-PrOH in pentane onto hydrated alumina (3.9% water)."^^ The set is small and log R depends only on the substituent sigma values. Note that this dependence is opposite in direction to that of the halobenzenes in Eq. 8 and strongly suggests that phenol acidity as an H-bond donor is responsible for most of the binding energy.
The degree of alumina hydration suggests that the surface holds sufficient AlOH groups to provide H-bond acceptor sites. This work is consistent with Holmes-Farley's study of phenols binding to oxidized aluminum surfaces.^"* In support of sigma as a single descriptor, the residuals closely approach a normal distribution.

Phenols: phenol, 4-methyl, 3,4- and 3,5-dimethyl, 3- and 4-methoxy, 3- and 4-chloro, 3- and 4-aceto, 4-formyl, 4-nitro

log R = 2.08 Σσ (6.21) + 0.464
n = 12   s = 0.373   r² = 0.794   F = 38.52   (9)
The TLC development of simple mono-, di-, and triaminoanthraquinones on alumina with 3:1 hexane/acetone is analyzed by the author.'^^ Descriptors tested are summations of π (from log P), MR, σ, υ for 1,4,5,8-substituents, and an indicator variable for monoaminoanthraquinones. In this rather large set, the π values dominate over the simple bulk factor, MR. As π is a composite descriptor, additional factors such as H-bonding are implied. The second most important factor is electronic, with the positive coefficient suggesting enhancement of amino group donor bonds to alumina. The indicator variable for monoaminoanthraquinones was unexpected and may suggest a different binding geometry for this subset. No steric interactions were observed.

Substitution pattern: position 1-H, NH2, CH3, position 2-H, NH2, CH3, Br, position 3- and 4-H, NH2, CH3, Cl, Br, position 5- and 8-H, NH2, Cl, position 6- and 7-H, NH2

R_M = -0.358 Σπ (-7.02) + 0.454 Σσ (6.20) - 0.386 MONO (-3.61) + 0.118
n = 60   s = 0.227   r² = 0.797   F = 73.04   (10)
The TLC development of substituted anilines shows complex bulk effects (MR, π), steric hindrance of amino-group binding, and, most surprising, no significant electronic effect.'*^ These observations relate to 60 anilines developed on neutral activated alumina with benzene as the eluent. All positions are mono- and poly-substituted. Interpretation is difficult. If the steric effect is blocking nucleophilic or H-binding to aluminol sites as observed with phenols, then a strong electronic effect should modify the nitrogen electron pair or the acidity of the NH groups. The only suggestion of an electronic effect is the need for an indicator variable to accommodate para-substitution by nitro, aceto, and carbomethoxy groups. These and other substituent effects were not handled by σ⁻ (r = -0.23) despite the large values for these groups.
Substitution pattern: position 2-H, CH3, Cl, OH, OCH3, position 3-H, CH3, Cl, Br, OH, acetamido, position 4-H, CH3, Cl, Br, OCH3, NH2, NO2, phenyl, acetamido, aceto, COOCH3, position 5-H, CH3, Cl, OCH3, position 6-H, Cl

R_M = -0.480 Σπ (-11.30) + 0.028 ΣMR (4.06) - 0.360 υ2,6 (-3.69) + 0.239 INO2 (4.30) + 0.121
n = 60   s = 0.201   r² = 0.790   F = 55.56   (11)
IV. BINDING OF ORGANIC COMPOUNDS TO ORGANIC POLYMERS
A. Heats of Adsorption on Cellulose and Activated Carbon
Cellulose
The literature on binding of organic compounds to cellulose is strong in the area of paper and cellulose thin-layer chromatography, but very weak in direct binding studies. Some work has been stimulated by the need to understand the binding of vat dyes to cellulose fiber and some rather specialized descriptors have been developed by Giles and Hassan."*^ Although no regression was applied to the measured binding affinity (kcal/mol) of over 80 anthraquinone dyes to viscose rayon, plots against dye solubilities and the longest conjugate chain length were used to develop several conclusions. For high cellulose binding, dyes must have planar structures and long conjugate systems. Binding is enhanced by hydrogen bonding and this is inhibited in the presence of water. More recent work by Timofei and coworkers quantifies and refines the work of Giles and Hassan by regression analysis and related techniques.^^'^^ Their work on sets of 46 and 49 anthraquinone vat dyes clearly shows the presence of steric, electronic, and hydrophobic effects in the dyeing process. Hydrogen bonding by proton donor groups of the dye molecule is also important. The main structural feature, however, is the descriptor of Giles and Hassan (number of bonds of the conjugated chain along the main axis, r² = 0.835). As this descriptor is roughly proportional to molecular size, the operation of London forces is strongly inferred. Although no one equation incorporates all of the findings of Timofei and coworkers, the considerable complexity and specificity of the dye adsorption process is revealed through a full range of mechanistic effects.
Carbon
Carbon is the ultimate degradation product of cellulose and the many woody natural products used in the manufacture of activated carbon via incomplete combustion. In contrast to cellulose, the scientific data are the complete reverse: the majority of information is found in direct adsorption studies, rather than indirectly through adsorption chromatography. While cellulose has a relatively uniform surface composed of repeating glucose units, activated carbon presents a much more complex surface of mixed aromatic and aliphatic structures in varying states of partial oxidation, depending on the biomass used and the conditions of incomplete combustion. Adsorptive binding from solution is potentially complex in the global sense, as different classes of compounds might be expected to seek different structurally compatible binding sites. That the expected complexity does not emerge from analysis of the data is a mystery that still awaits future insight. Some of the expected surface complexity of activated carbon is revealed by studies on the irreversible adsorption of phenolic compounds by Grant and King.^^ The observation that phenolic compounds react on carbon surfaces (chemisorption) and are difficult to remove was related to oxidative coupling promoted by high pH and oxygen availability. While the role of carbon in the mechanism of oxidative coupling remains speculative, it is known that carbon can catalyze oxidation reactions. The situation is much less complex in reversible adsorption, as group contribution methods appear to predict well for simple adsorbates^^ and these are supported by correlations based on a count of carbon, hydrogen, halogen, nitrogen and oxygen atoms.^"^ While these methods and correlations do not directly address mechanism, the implication is that polarizable volume (MR) is the dominant descriptor and this, in turn, implies adsorption by a nonspecific mechanism.
Kamlet and coworkers have applied their experimentally evaluated solvatochromic parameters to the binding analysis of 37 simple aliphatic compounds (alcohols, aldehydes, amines, chlorocarbons, esters, ethers, and ketones) from aqueous solution onto activated carbon.^^ It is beyond the scope of this chapter to discuss the solvatochromic approach in detail. However, it encompasses a full mechanistic approach to intermolecular forces by including polarizable volume, dipolarity, and both types of scaled H-bonding descriptors (donor/acceptor). As such, the approach is well suited to describing a range of mechanistic contributions to any set of data based on kinetic or equilibrium measures. It has the disadvantage of being experimentally intensive and is most appropriate when the compound parameters are previously tabulated. In the present study, the partitioning between adsorbed and solution phases, log α, is found to correlate strongly (r² = 0.949) with polarizable volume, dipolarity and the H-bond acceptor basicities of the adsorbates. This is of exceptional interest in showing the sensitivity of the carbon surface to both the dipolarity and polarizability of adsorbates as well as revealing the presence of H-bond donors that the authors evaluate as somewhat stronger than that of n-octanol. While not directly related to activated carbon, Grate and coworkers use the same approach to show the essential identity of vapor adsorption on graphite and fullerene surfaces for a diverse selection of aliphatic and aromatic compounds.^^ Of interest is the identity of descriptors for binding on graphite, pure fullerene, and on crude activated carbon.
The Freundlich equation relates the amount of solute adsorbed (X mg/g of adsorbent) to the equilibrium solute concentration (C mg/l) through two adsorption constants (k and 1/N) as follows:

log X = log k + (1/N) log C

Abe and coworkers have shown a linear relation between (1/N) and log k (r² = 0.947) for adsorption on activated carbon.^^ The same authors measured the adsorption of 15 simple alcohols from water onto three activated carbons with greatly different pore size distribution (A = wood, B = coal, C = coconut shell) (Eqs. 12-14, Table 6).^^ Good linear relations were obtained between the Freundlich adsorption constant, log k, and the molecular connectivity index, χ (r² = 0.973). As the connectivity index is not directly interpretable, the data have been reanalyzed in terms of polarizable volume (MR) and two indicator variables for branching (IBRCH = 0, 1, 2) and for primary, secondary, and tertiary C-OH (IOH = 1, 2, 3). This treatment clearly shows the dominance of London forces (MR) and the negative contributions of both C-C and C-O branching. However, only Eq. 12 approaches the level of correlation shown by the single connectivity descriptor.

log k(A) = 0.122 MR (8.40) - 0.201 IBRCH (-3.19) - 0.256 IOH (-4.96) - 2.44
Table 6. Adsorption of Alcohols from Water onto Activated Carbons
Alcohol               Log kA(a)   Log kB(b)   Log kC(c)   Chi     MR      IBRCH   IOH
1-Butanol             -0.262      0.505        0.910      2.414   19.51   0       1
2-Butanol              *          0.396        *          2.270   19.51   0       2
2-Me-1-Propanol       -0.600      0.439        0.609      2.270   19.51   1       1
2-Me-2-Propanol       -1.114      0.170       -0.013      2.000   19.51   1       3
1-Pentanol             0.328      1.021        1.408      2.914   24.13   0       1
2-Pentanol             *          0.995        *          2.770   24.13   0       2
3-Pentanol             *          0.824        *          2.808   24.13   0       2
2-Me-1-Butanol         0.025      0.953        1.228      2.808   24.13   1       1
3-Me-1-Butanol         0.188      0.981        1.241      2.770   24.13   1       1
2-Me-2-Butanol        -0.341      0.840        1.045      2.561   24.13   1       3
3-Me-2-Butanol        -0.074      0.678        0.983      2.643   24.13   1       2
2,2-DiMe-1-Propanol   -0.301      0.564        0.703      2.561   24.13   2       1
Cyclopentanol         -0.356      0.671        0.754      2.394   22.07   0       2
1-Hexanol              0.772      1.408        1.770      3.414   28.75   0       1
Cyclohexanol           0.117      0.899        1.185      2.894   26.69   0       2
Notes: (a) A = wood, Eq. 12. (b) B = coal, Eq. 13. (c) C = coconut shell, Eq. 14.
n = 12   s = 0.134   r² = 0.943   F = 44.53   (12)

log k(B) = 0.0934 MR (7.66) - 0.121 IBRCH (-2.35) - 0.096 IOH (-2.08) - 1.19
n = 15   s = 0.123   r² = 0.877   F = 26.25   (13)

log k(C) = 0.110 MR (5.08) - 0.201 IBRCH (-2.14) - 0.209 IOH (-2.71) - 1.12
n = 12   s = 0.200   r² = 0.857   F = 16.00   (14)
Studies by Abe and coworkers on complex adsorbates such as local anesthetics and saccharides have led to results of surprising simplicity, as alluded to in the introduction to this section. One is generally accustomed to seeing the complexity of a correlation increase with the complexity of molecular structure. In fitting seven local anesthetics to the Freundlich equation, they find a linear relationship between 1/N and molecular weight.^^ The correlation with MR is slightly lower in quality (r² = 0.914). As 1/N is linear in log k for adsorption on carbon, London forces appear to dominate the binding process for these moderately complex drugs. Equally surprising is their study of 13 saccharides and 4 polyhydric alcohols.^^ The Freundlich constant, log k, correlates highly with the carbon and oxygen count and acceptably well with MR. There is no evidence for other significant descriptors that might imply a complex binding mechanism.

Local anesthetics: procaine, lidocaine, tetracaine, dibucaine, mepivacaine, chloroprocaine, benzocaine

1/N = -l,6S X 10-^MW (11.6) + 0.286
n = 10   r² = 0.951   F = 135   (15)

Saccharides: D-(+)-xylose, D-(-)-arabinose, D-(-)-2-deoxyribose, D-(+)-glucose, D-(+)-mannose, D-(-)-fructose, D-(+)-galactose, L-(+)-rhamnose, α-methyl-D-(+)-glucoside, α-methyl-D-(-)-mannoside, D-(+)-maltose, D-(+)-sucrose, D-(+)-lactose
Polyhydric alcohols: glycerol, meso-erythritol, D-xylitol, D-(-)-mannitol

log k = 0.867 N_C (6.46) - 0.610 N_O (-4.05) - 2.31
n = 17   s = 0.232   r² = 0.949   F = 129.6   (16)
log k = 0.0572 MR (9.40) - 2.68
n = 17   s = 0.378   r² = 0.855   F = 88.26   (17)
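Since Eqs. 12-17 all model the Freundlich constants, a short sketch of how log k and 1/N come out of isotherm data may help. The snippet fits the linear form log X = log k + (1/N) log C by ordinary least squares. The data are synthetic, generated from assumed constants (k = 2.0, 1/N = 0.5) purely to show that the fit recovers them; none of the numbers are taken from the studies discussed, and the function names are hypothetical.

```python
import math


def freundlich_amount(k, inv_n, conc):
    """Amount adsorbed, X = k * C**(1/N); same as log X = log k + (1/N) log C."""
    return k * conc ** inv_n


def fit_freundlich(concs, amounts):
    """Least-squares line through (log C, log X); returns (log k, 1/N)."""
    xs = [math.log10(c) for c in concs]
    ys = [math.log10(x) for x in amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                    # this is 1/N
    intercept = mean_y - slope * mean_x  # this is log k
    return intercept, slope


# Synthetic isotherm (hypothetical values, not experimental data)
concs = [1.0, 10.0, 100.0, 1000.0]
amounts = [freundlich_amount(2.0, 0.5, c) for c in concs]

log_k, inv_n = fit_freundlich(concs, amounts)
print(round(log_k, 4), round(inv_n, 4))
```

The intercept of this fit is the log k that Eqs. 12-14, 16, and 17 regress against structural descriptors, and the slope is the 1/N modeled in Eq. 15.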
B. Adsorption Chromatography on Cellulose and Paper
Paper chromatography was a highly developed art/technique long before thin-layer plates with powdered cellulose were available to simplify the procedure. Both adsorbents are predominantly cellulose with low amounts of additives to improve physical properties, and one would expect similar results in relative performance. However, the longer tank times and migration distance of paper chromatography, with less control over lateral diffusion, suggest separate treatment from TLC-cellulose studies. Accordingly, we treat the presumably more precise powdered cellulose studies before the older technique of whole paper chromatography.
Powdered Cellulose
A dramatic demonstration of the difference between binding to inorganic and organic polymer surfaces is provided by Sawicki and coworkers.^^ Developing a set of 22 polynuclear ring-carbonyl compounds (fluorenone, coumarin, anthrone, indanone, etc.) on alumina with toluene and on cellulose with DMF-water (35:65 v/v), they observe a radically different sequence. This is, of course, consistent with the expectations of completely different mixes of intermolecular forces for the same compounds binding to very different polymeric surfaces. Powdered cellulose plates are used for much the same separations that gave paper chromatography special advantages, namely for the separation of polar compounds such as amines, acids, heterocyclics, steroids, and complex biochemicals such as nucleic acid derivatives.^^ The technique is especially effective in separating simple aliphatic acids and amino acids. By rescaling the R_f data into the binding-energy-related log form, R_M (see Section III.B), a QSPR analysis can be performed in mechanistic terms. One very interesting set of aliphatic acids with exceptional variation in structure has been analyzed by the author.^'^^ The set of 49 acids is composed of simple aliphatic structures with hydroxy, amino, halo, and mercapto substituents. Development on cellulose plates with diethylamine-n-butanol-water (1:85:14) resulted in an R_f range of 0.07-0.97. By regression of R_M against descriptors of the aliphatic group of RCOOH, an excellent correlation is obtained. The Σf term is derived from the partial calculation of log P (octanol/water) and the negative dependence is expected, as cellulose is hydrophilic and would repel lipophilic structure. The electronic effect, Σσ_I, is that delivered to the α position of the acid to modify the acidity of RCOOH. The HB descriptor is a simple count of both types of H-bonding by substituents on RCOOH. While H-bonding is expected to the OH groups of cellulose, the effect is weak and of the wrong sign. The indicator variables, IOH and INH2, suggest special behavior for hydroxy- and amino acids beyond that of the log P and inductive contributions. It seems highly probable that HB, IOH, and INH2 are strongly confounded in an accidental correlation despite the size of the set. In brief, this is a classic example of a statistical disaster that might have gone unnoticed but for the incorrect sign of the HB descriptor.

Aliphatic acids: C2-C10 RCOOH with substituents - C2(α): F, Cl, Br, I, CH3, OH, SH, NH2, C3(β): Cl, CH3, OH, NH2, C4(γ): CH3, OH, NH2

R_M = -0.367 Σf (-23.24) - 0.104 Σσ_I (-2.69) - 0.392 HB (-5.62) + 0.265 IOH (3.74) + 0.811 INH2 (10.86) + 0.333
n = 49   s = 0.096   r² = 0.913   F = 314   (18)
This set is of sufficient interest to analyze by an alternate method developed by the author.^"^ In this approach, a hypermolecule is developed and each position is analyzed by using positional descriptors to describe lipophilicity, f (from partial log P), and the electronic effect, χ, for electronegativity. The atomic electronegativity is known to be directly related to atomic sigma charge and the inductive effect.^"^ Five of the longer alkanoic acids were too branched to accommodate and were deleted from the set. It was further necessary to combine some positions into small regions to have statistically significant loading of the matrix. Positions P1, P3, P4, and P6 are sufficiently populated to retain their identity, but P22, P55, and P789 have been merged to define small regions. All positions and small regions were tested for lipophilicity (f), charge (χ), and H-bonding (HB). As the analysis is positional, no special indicators were tested for hydroxy or amino acids. The result is strikingly different from that achieved in Eq. 18. While Eq. 19 is somewhat weaker statistically, it is far more credible. The lipophilic interactions are complex, with P6 showing an unexpected positive slope, while P3 and P5 respond as expected. The electronic effect is quite interesting in being distributed over all positions rather than localized near the COOH group to influence acidity. This suggests that dipolar binding is significant at nearly every position regardless of distance from the COOH group. Finally, the H-bond term is not only strong as expected but positive as demanded by bonding to a hydroxylic surface. While Eq. 18 is not entirely incorrect, positional analysis provides a more incisive measure of mechanistic detail.

Same RCOOH less 5 deletions. Positional diagram:
8 6 1 5.5.4.3.C-COOH 9 7 2 2

R_M = -0.177 f3 (-3.02) - 0.156 f5 (-3.21) + 0.394 f6 (2.79)
      - 0.0619 χ1 (-2.65) - 0.0933 χ2 (-4.54) - 0.140 χ4 (-4.83)
      - 0.125 χ6 (-2.57) - 0.0663 χ789 (-2.45) + 0.934 HB (12.97) + 0.0190
n = 44   s = 0.176   r² = 0.920   F = 43.39   (19)
The migrating species of an amino acid can be strongly affected by the basicity or acidity of the developing solvent and consequently alter the chromatographic pattern and the binding mechanism. A set of 38 amino acids with an exceptional range of structure (keto, carboxylic acid, carboxamide, mercapto, amino, thio, imidazole, hydroxy, sulfonyl, and aromatic substituents) was developed on cellulose plates with a basic eluent (n-butanol-acetone-diethylamine-water [10:10:2:5]) and with an acidic eluent (isopropanol-formic acid-water [20:1:5]).^^ This set provides a unique opportunity to compare binding of amino acids to cellulose under both protonation and deprotonation conditions. The results are dramatically different. The descriptors tested are Σf (excluding the amino and carboxylic group), MR (same basis), and the combined count of both types of H-bonds (HB). Equation 20 shows the set developed with the basic eluent. Neither MR nor H-bonding has any significance in this relation, which depends only on the calculated lipophilicity. Several outliers were detected and deleted. Four of the five outliers are basic (amino[3] and imidazole); the other is the only mercaptoamino acid. The correlation is of acceptable strength for a set of this diversity and the residuals approach a symmetrical distribution with central tendency, suggesting that only experimental error remains.

Amino acids: α- and β-alanine, α-, β- and γ-butyric and isobutyric acids, ε-aminocaproic acid, α,γ-diaminobutyric acid, aspartic acid, citrulline, glutamic acid, glutamine, glycine, histidine, β-hydroxyglutamic acid, hydroxyproline, β-hydroxyvaline, leucine, isoleucine, lysine, methionine, methionine sulfone, norleucine, norvaline, ornithine, α-phenylalanine, α-phenylglycine, proline, sarcosine, serine, threonine, tryptophan, tyrosine, valine

R_M = -0.269 Σf (-11.59) + 0.196
n = 33   s = 0.186   r² = 0.812   F = 134.3   (20)
The simplicity of this relation suggests that amino acid anions somehow inhibit the formation of hydrogen bonds between substituents and cellulose, perhaps by engaging in intramolecular H-bonds with the carboxylate group. Developed under acidic conditions, the same set displays a radically different binding mechanism as shown in Eq. 21. As in Eq. 20, five outliers were detected and deleted. Only one outlier was common to each eluent, namely, the mercapto-amino acid. The others were two terminal carboxylic acid groups, one keto- and one hydroxy-substituent. The correlation is dominated by the bulk descriptor, MR, and by H-bonding to cellulose (HB). The lipophilicity descriptor, Σf, has no significance. It is interesting to note that the bulk descriptor has a negative coefficient while that of HB is positive, as expected. This binding of the side-chain polar substituents is, of course, supplementary to that of the amino and carboxylic acid groups, which are assumed to provide a constant binding energy through relatively strong H-bonds to cellulose. The correlation indicates that binding of the polar side chains is controlled by H-bonds to cellulose hydroxyl groups in opposition to the repulsion of the predominantly hydrocarbyl structures. The difference in binding mechanism may be attributable to the additional strength of the neutral carboxylic acid bond to cellulose. The strength of this bond may force the side chain into closer contact with the cellulose surface to effect specific interactions unavailable to the amino acid anion. The equation is substantially weaker than Eq. 20, but displays a similar distribution of residuals with central tendency.

Same amino acids as Eq. 20:

R_M = -0.0279 MR (-4.99) + 0.167 HB (6.82) + 1.05
n = 33   s = 0.269   r² = 0.665   F = 29.72   (21)
Paper
The use of paper for chromatography has a longer history than that of powdered cellulose plates. It also differs in the preparation process in that powdered cellulose has suffered more physical abuse than cellulose papers. In the following examples, we begin with the chromatography of aliphatic acids followed by studies of substituted 2-amino-1-alkanols and simply substituted chloro- and alkylphenols. Unfortunately, there is no way to obtain a direct comparison between related sets run on paper and powdered cellulose. The response of the descriptors provides the best evidence that binding in each case is essentially to cellulose and not to impurities therein. To minimize the variance, each of these studies was developed on the same paper, Whatman No. 1. An interesting small set of diversely structured mono-, di- and tricarboxylic acids was developed on paper with an acidic eluent (phenol-water-formic acid [75:25:1 v/v]).^^'^^ With only 15 members, no more than 3 descriptors can be used to correlate the set (Eq. 22, Table 7). Chosen for testing were MR and Σf of the aliphatic non-carboxylic structure. As each carboxylic acid was expected to bind strongly, two indicator variables, I2 and I3, were used to distinguish the di- and tricarboxylic acids from the singly binding monoacids. In agreement with Eq. 20, Σf (partial log P) proved to be substantially stronger than the simple bulk descriptor, MR. In addition, both indicator variables show strong positive coefficients, supporting additional H-bonding by each carboxylic acid group. It is interesting that the coefficient for I3 is substantially larger than for I2.

R_M = -0.308 Σf (-6.46) + 0.753 I2 (5.32) + 1.13 I3 (6.32) - 0.650
n = 15   s = 0.231   r² = 0.872   F = 24.98   (22)

The correlation is robust and the negative coefficient of Σf is similar to that of Eq. 20 for a much different set of acids developed on powdered cellulose. The term is basically repulsive for forcing lipophilic structures onto a hydrophilic surface. Substituted 2-amino-1-alkanols were developed on paper with n-butanol saturated with 0.1% ammonium hydroxide (Eqs. 23 and 24, Table 8).^^ The set is small but of exceptional structural variation. Descriptors selected for analysis are the MR and π values (aromatic partial log P) of the substituents, several of which were estimated as the groups are quite unusual (guanidylpropyl, imidazolemethyl, 4-hydroxyphenylmethyl, etc.). Binding of the 2-amino and 1-hydroxy groups is expected to be strong and to dominate orientation on the cellulose surface. As these associations are constant for all members, the analysis concerns the secondary effect on binding of the residual structure. In addition, indicator variables for aromatic structure and the capacity for forming additional H-bonds were tested. Consistent with the binding of other aliphatic structures to cellulose, the lipophilic descriptor, π, correlates with much greater strength than the bulk descriptor, MR. Plotting
Table 7. Chromatography of Mono-, Di- and Tricarboxylic Acids on Whatman No. 1 (eluent = phenol:water:formic acid)
Aliphatic Acid   R_M     Yest(a)   Σf      I2   I3
Aconitic          0.25    0.21      0.87   0    1
Adipic           -0.79   -0.71      2.64   1    0
Citric            0.45    0.52     -0.12   0    1
Fumaric          -0.23   -0.03      0.44   1    0
Glutaric         -0.55   -0.51      1.98   1    0
Glycolic         -0.16   -0.35     -0.98   0    0
Lactic           -0.41   -0.55     -0.32   0    0
Levulinic        -1.00   -0.75      0.31   0    0
Malic             0.14    0.31     -0.55   1    0
Malonic           0.03   -0.10      0.66   1    0
Oxalic            0.66    0.10      0.00   1    0
Succinic         -0.29   -0.31      1.32   1    0
Syringic         -1.28   -1.20      1.78   0    0
Tartaric          0.63    0.85     -2.43   1    0
Tricarballylic   -0.03   -0.06      1.75   0    1
Note: (a) Equation 22.
26
PHILIPS. MAGEE
indicated curvature and additional strength is gained in the parabolic correlation. As $\pi$ is colinear with $\Sigma f$, the magnitude of the negative coefficient is in perfect agreement with that of other sets of mainly hydrocarbyl groups binding to cellulose (Eqs. 18, 20-22). Forced binding of aliphatic structure to cellulose is clearly repulsive in nature.

$$R_M = -0.237\,\pi\,(-5.97) + 0.421 \tag{23}$$

$$n = 15 \qquad s = 0.260 \qquad r^2 = 0.733 \qquad F = 35.68$$

$$R_M = -0.301\,\pi\,(-6.72) - 0.0428\,\pi^2\,(-2.26) + 0.549 \tag{24}$$

$$n = 15 \qquad s = 0.227 \qquad r^2 = 0.813 \qquad F = 26.03$$
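To make the difference between the linear and parabolic fits concrete, the sketch below evaluates Eqs. 23 and 24 for a few aminoalcohols from Table 8 and locates the stationary point of the parabola. The function names are illustrative; only the coefficients and the $\pi$ values come from the text and table.

```python
def rm_eq23(pi):
    """Linear fit, Eq. 23."""
    return -0.237 * pi + 0.421

def rm_eq24(pi):
    """Parabolic fit, Eq. 24."""
    return -0.301 * pi - 0.0428 * pi ** 2 + 0.549

# pi value where dR_M/dpi = 0 for Eq. 24
pi_vertex = -0.301 / (2 * 0.0428)

# name: pi of the substituent, as listed in Table 8
aminoalcohols = {"Prolinol": 1.20, "Serinol": -1.03, "Threoninol": -0.52, "Valinol": 1.40}

for name, pi in aminoalcohols.items():
    print(f"{name:11s} Eq. 23 = {rm_eq23(pi):5.2f}   Eq. 24 = {rm_eq24(pi):5.2f}")
print(f"Eq. 24 stationary point at pi = {pi_vertex:.2f}")
```

Under these coefficients the stationary point falls close to the most hydrophilic members of the set (argininol and lysinol), so across nearly the whole data range both equations predict that $R_M$ falls as $\pi$ increases.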
Data for 22 multiply substituted phenols developed on paper with xylene saturated with formamide were analyzed (Eqs. 25 and 26, Table 9). Descriptors tested were MR, $\pi$, and Hammett's $\sigma$ summed over all the substituents. Due to the simple nature of the substituents (Cl, CH3, C2H5), there is a natural colinearity between MR and $\pi$ (r = 0.991) that makes precise selection of the key descriptor difficult. Consistent with other cellulose and paper correlations, $\pi$ is selected over MR. For 2,6-substituted phenols, Charton's upsilon ($\upsilon$) is selected to describe
Table 8. Chromatography of 2-Amino-1-Alcohols on Whatman No. 1 (eluent = n-butanol-0.1% ammonium hydroxide)

Aminoalcohol       R_M     Y_est^a   Y_est^b      π       π²
Alaninol           0.52      0.30      0.38      0.51     0.26
Argininol          1.12      1.32      1.08     -3.80    14.44
Aspartidol         0.52      0.60      0.76     -0.77     0.59
Ethanolamine       0.75      0.42      0.55      0.00     0.00
Glutamidiol        0.52      0.48      0.62     -0.26     0.07
Histidinol         0.87      0.32      0.41      0.43     0.18
Isoleucinol       -0.23     -0.01     -0.14      1.82     3.31
Leucinol          -0.23     -0.02     -0.16      1.87     3.50
Lysinol            1.12      1.19      1.08     -3.23    10.43
Phenylalaninol    -0.37     -0.05     -0.23      2.01     4.04
Prolinol           0.45      0.14      0.13      1.20     1.44
Serinol            0.75      0.67      0.81     -1.03     1.06
Threoninol         0.35      0.54      0.69     -0.52     0.27
Tyrosinol         -0.07      0.10      0.07      1.34     1.80
Valinol            0.02      0.09      0.04      1.40     1.96

Notes: ^a Equation 23. ^b Equation 24.
Table 9. Chromatography of Substituted Phenols on Whatman No. 1 (eluent = xylene saturated with formamide)

Phenol                        R_M      Y_est^a     Σπ     υ2,6
3-Me-4-Chloro               -0.087     -0.120     1.22    0.00
2-Me-4-Chloro               -0.288     -0.351     1.22    0.52
3-Me-6-Chloro               -0.432     -0.365     1.22    0.55
2-Me-6-Chloro^b             -0.908     -0.596     1.22    1.07
2,3-DiMe-4-Chloro           -0.432     -0.566     1.73    0.52
2,5-DiMe-4-Chloro           -0.501     -0.566     1.73    0.52
3,5-DiMe-4-Chloro           -0.410     -0.335     1.73    0.00
3,4-DiMe-6-Chloro           -0.575     -0.580     1.73    0.55
3-Me-5-Et-4-Chloro          -0.630     -0.550     2.24    0.00
3-Methyl                     0.288      0.180     0.51    0.00
2-Methyl                     0.000     -0.052     0.51    0.52
3-Me-4,6-Dichloro           -0.720     -0.664     1.93    0.55
2-Me-4,6-Dichloro           -1.005     -0.896     1.93    1.07
3,5-DiMe-2,4-Dichloro       -1.005     -0.871     2.42    0.55
3,4-DiMe-2,6-Dichloro       -1.061     -1.116     2.42    1.10
3-Me-5-Et-2,4-Dichloro      -1.061     -1.095     2.95    0.55
2,3-Dimethyl                -0.213     -0.267     1.02    0.52
2,5-Dimethyl                -0.176     -0.267     1.02    0.52
3,4-Dimethyl                 0.105     -0.035     1.02    0.00
3,5-Dimethyl                -0.158     -0.035     1.02    0.00
3-Methyl-5-Ethyl            -0.368     -0.251     1.53    0.00
3-Methyl-2,4,6-Trichloro    -0.954     -1.209     2.64    1.10

Notes: ^a Equation 25. ^b Deleted from Eq. 26.
potential effects on phenolic H-bonding to cellulose. This excellent correlation again supports repulsive binding for lipophilic substituents and presumably for the phenyl ring as well. In addition, the 2,6-steric effect clearly identifies the phenolic group H-bond to cellulose as the primary binding mechanism. It is unfortunate that electronic support for this mode of binding was not significant for the limited selection of substituents in this set. Deletion of one outlier, 2-methyl-6-chlorophenol, leads to a significant improvement in statistical strength but without change in interpretation.

$$R_M = -0.422\,\Sigma\pi\,(-9.12) - 0.445\,\upsilon_{2,6}\,(\text{illegible}) + 0.395 \tag{25}$$

$$n = 22 \qquad s = 0.129 \qquad r^2 = 0.906 \qquad F = 91.30$$

$$R_M = -0.463\,\Sigma\pi\,(-12.03) - 0.337\,\upsilon_{2,6}\,(\text{illegible}) + 0.429 \tag{26}$$

$$n = 21 \qquad s = 0.102 \qquad r^2 = 0.941 \qquad F = 142.6$$
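The reported F values can be cross-checked against $r^2$, n, and the number of descriptors. The sketch below uses the standard overall F statistic for a least-squares fit with an intercept, $F = (r^2/k)\,/\,[(1 - r^2)/(n - k - 1)]$; the function name is illustrative, and small deviations from the printed values are expected because $r^2$ is quoted to only three figures.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic for a least-squares fit with k descriptors and an intercept."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# label: (r^2, n, number of descriptors, F as reported)
reported = {
    "Eq. 25": (0.906, 22, 2, 91.30),
    "Eq. 26": (0.941, 21, 2, 142.6),
}

for label, (r2, n, k, f_reported) in reported.items():
    print(f"{label}: F recomputed = {f_from_r2(r2, n, k):.1f}, reported = {f_reported}")
```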
V. BINDING OF ORGANIC COMPOUNDS ON BIOORGANIC POLYMERS

The binding of pesticides and ordinary organic chemicals to organic soils is a necessary field of research for understanding the complex process of soil binding and release in the application of chemicals to solve agricultural problems. Excellent experimental work has been performed and the physical chemistry of soils is well documented. Measured values of soil/water partitioning, K(OM/W), are corrected for the organic matter (OM) content on the reasonable assumption that nonactivated sand/clay will have little affinity for binding organic chemicals. There are some exceptions, such as the strong ionic binding of paraquat dication to clay, but such cases are rare. The usual treatment of K(OM/W) is that of a simple partitioning event consistent with high log P(o/w) correlations. The inaccuracy of this treatment was demonstrated by Magee through the application of log P factoring (see Section II). It is useful to review some of this work as a special extension of binding to organic polymers.

In some interesting work by Briggs, 21 commercial pesticides were chromatographed on thin-layer plates composed of finely divided soil (Eqs. 27 and 28, Table 10; refs. 12, 70). The $R_M$ values correlate very well with measured log P values ($r^2 = 0.960$) and the factoring of log P does not reveal any additional information. The coefficients of PL and PH are nearly identical with that of log P. Note also that neither s nor $r^2$ has changed and that F is simply halved due to the addition of a second descriptor. This is, in fact, a perfect example of a verified log P relation and of the harmlessness of factoring. It is also an excellent example of the effect of grinding the complex humic acid structures in the soil organic matter. The situation with physically intact organic matter is quite different.

$$R_M = 0.522\,\log P\,(21.52) - 0.943 \tag{27}$$

$$n = 21 \qquad s = 0.109 \qquad r^2 = 0.960 \qquad F = 463.1$$
$$R_M = 0.502\,PL\,(15.56) + 0.531\,PH\,(20.43) - 0.837 \tag{28}$$

$$n = 21 \qquad s = 0.110 \qquad r^2 = 0.960 \qquad F = 230.9$$
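The statement that factoring is harmless here can be illustrated numerically. In the sketch below, three pesticides from Table 10 are run through both Eq. 27 and Eq. 28; for these entries the lipophilic and hydrophilic factors sum to log P, and because the two coefficients in Eq. 28 are nearly equal, the two estimates differ by far less than the standard error of either fit. Function and variable names are illustrative only.

```python
def rm_eq27(log_p):
    """Unfactored log P relation, Eq. 27."""
    return 0.522 * log_p - 0.943

def rm_eq28(pl, ph):
    """Factored relation, Eq. 28 (PL = lipophilic part, PH = hydrophilic part)."""
    return 0.502 * pl + 0.531 * ph - 0.837

# name: (log P, PL, PH) as listed in Table 10
pesticides = {
    "Diuron": (2.74, 5.10, -2.36),
    "Pentanochlor": (3.70, 5.21, -1.51),
    "Fluorodifen": (4.40, 4.46, -0.06),
}

for name, (log_p, pl, ph) in pesticides.items():
    unfactored = rm_eq27(log_p)
    factored = rm_eq28(pl, ph)
    print(f"{name:13s} PL + PH = {pl + ph:4.2f}   Eq. 27 = {unfactored:5.3f}   Eq. 28 = {factored:5.3f}")
```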
A smaller set of 14 pesticides was measured in equilibrium with whole soil and water by Briggs (ref. 70) and factored by Magee (Eqs. 29 and 30, Table 11). Correlation with log P is again satisfactory, but factoring now shows substantial improvement with selectivity for the hydrophilic substructures. Note that s and $r^2$ are enhanced and F greatly exceeds one half of the unfactored F. It is also interesting to note how
Table 10. Thin-Layer Chromatography of Commercial Pesticides on Finely Divided Soil

Pesticide              R_M      Y_est^a    Log P     PL       PH
Cycloheximide         -0.908    -0.712     0.55     5.79    -5.24
Oxycarboxin           -0.432    -0.482     0.90     4.27    -3.37
Fenuron               -0.348    -0.421     0.96     3.25    -2.29
Monuron                0.035     0.017     1.84     4.25    -2.41
Simazine               0.087     0.041     1.85     3.62    -1.77
Pyrazon                0.105    -0.140     1.50     3.46    -1.96
Captan                 0.194     0.371     2.35     3.23    -0.78
Carbaryl               0.213     0.306     2.36     3.81    -1.45
Picloram (Me ester)    0.269     0.259     2.30     4.36    -2.06
Metobromuron           0.348     0.296     2.38     4.54    -2.16
2,4-Dichlorophenol     0.368     0.557     2.80     3.24    -0.44
Diuron                 0.454     0.471     2.74     5.10    -2.36
Amiben (Me ester)      0.477     0.535     2.80     4.00    -1.20
Propanil               0.501     0.516     2.80     4.63    -1.83
3,4-Dichloroaniline    0.550     0.530     2.78     3.78    -1.00
Linuron                0.689     0.598     2.98     5.10    -2.12
Chlorbromuron          0.788     0.695     3.17     5.25    -2.08
Fenac (Me ester)       1.005     1.028     3.80     5.29    -1.49
Chloroxuron            1.005     1.028     3.85     6.21    -2.36
Pentanochlor           1.061     0.978     3.70     5.21    -1.51
Fluorodifen            1.380     1.370     4.40     4.46    -0.06

Note: ^a Equations 27 and 28. The listed Y_est values reproduce Eq. 28.
closely the log P coefficient of this equilibrium measure agrees with that of the thin-layer procedure (Eq. 27).

$$\log K(\mathrm{OM/W}) = 0.557\,\log P\,(14.57) + 0.525 \tag{29}$$

$$n = 14 \qquad s = 0.239 \qquad r^2 = 0.947 \qquad F = 212.2$$

$$\log K(\mathrm{OM/W}) = 0.521\,PL\,(15.05) + 0.640\,PH\,(14.17) + 0.831 \tag{30}$$

$$n = 14 \qquad s = 0.197 \qquad r^2 = 0.966 \qquad F = 158.5$$
While Eq. 30 is indicative of a mechanism beyond simple partitioning, the set is too small to define any specific effects beyond the imbalance of PL and PH. For this purpose, we are fortunate to have a major study by Sabljić on the soil adsorption coefficients of 128 polar compounds (ref. 71). The collection is extremely diverse with anilines, nitrobenzenes, acetanilides, ureas, and carbamates in addition
Table 11. Distribution of Pesticides Between Soil Organic Matter and Water

Pesticide          Log K(OM/W)^a    Log P     PL       PH
Dimethoate             0.72         0.79     3.33    -2.54
Aldicarb               1.39         1.57     3.76    -2.19
Simazine               1.44         1.85     3.62    -1.77
Carbaryl               1.78         2.32     3.77    -1.45
Captan                 2.06         2.54     3.32    -0.78
Diazinon               2.12         3.49     5.98    -2.49
Chlorfenvinphos        2.23         3.17     7.24    -4.07
Fenamiphos             2.28         3.18     6.20    -3.02
Phorate                2.58         3.59     5.14    -1.55
Parathion              2.78         3.93     4.39    -0.46
Folpet                 3.03         3.63     2.70     0.93
Captafol               3.08         3.83     4.61    -0.78
Dieldrin               3.87         6.2      7.74    -1.54
Aldrin                 4.45         7.4      7.40     0.00

Note: ^a Corrected for sand content.
to 56 commercial pesticides. This set already deviates substantially from simple partitioning, as found by Magee in Eq. 31. This equation is then subjected to log P factoring as shown in Eq. 32.

Polar compounds: 56 pesticides, 32 arylureas, 14 acetanilides, 8 anilines, 7 N-phenylcarbamates, 6 nitrobenzenes, 5 miscellaneous compounds

$$\log K(\mathrm{OM/W}) = 0.365\,\log P\,(10.00) + 0.0175\,\mathrm{MR}\,(5.95) - 0.385\,\mathrm{HBD}\,(4.99) + 0.513 \tag{31}$$

$$n = 128 \qquad s = 0.276 \qquad r^2 = 0.874 \qquad F = 288.6$$

$$\log K(\mathrm{OM/W}) = 0.256\,PL\,(5.31) + 0.401\,PH\,(10.95) + 0.0257\,\mathrm{MR}\,(6.84) - 0.386\,\mathrm{HBD}\,(4.96) + 0.542 \tag{32}$$

$$n = 128 \qquad s = 0.265 \qquad r^2 = 0.886 \qquad F = 231.5$$
There are significant improvements in s, $r^2$, and F (expected value = 216.4) along with a clear demonstration of selection for hydrophilic substructures. Hydrogen-bond acceptors (HBA) also appear to play a role, but were just under statistical significance (T = 1.87). Equation 32 clearly shows the mechanistic complexities of the binding of polar compounds to complex soil structures and should serve to eliminate the oversimplified concept of passive partitioning. From a statistical viewpoint, a set of this size provides the additional opportunity to deduce completeness from the residual pattern. In this case, the residual distribution is a symmetrical gaussian, indicating that the significant information has been extracted.
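The "expected value" of F quoted above appears to follow a simple rule of thumb: if an added descriptor contributes no new information, F shrinks roughly in proportion to the ratio of descriptor counts. The sketch below applies that rule (an interpretation of the text, not a stated formula) to the three factoring comparisons in this section; the function name is illustrative.

```python
def expected_f(f_old, k_old, k_new):
    """F expected when extra descriptors add no information (rule of thumb)."""
    return f_old * k_old / k_new

# (comparison, F before factoring, descriptors before, descriptors after, F observed after)
cases = [
    ("Eq. 27 -> Eq. 28", 463.1, 1, 2, 230.9),
    ("Eq. 29 -> Eq. 30", 212.2, 1, 2, 158.5),
    ("Eq. 31 -> Eq. 32", 288.6, 3, 4, 231.5),
]

for label, f_old, k_old, k_new, f_observed in cases:
    f_exp = expected_f(f_old, k_old, k_new)
    verdict = "gain" if f_observed > 1.02 * f_exp else "no gain"
    print(f"{label}: expected F = {f_exp:6.1f}, observed F = {f_observed:6.1f} ({verdict})")
```

The 2% margin used for the verdict is an arbitrary illustrative threshold. With it, the ground-soil set (Eq. 28) shows no gain from factoring, while the whole-soil sets (Eqs. 30 and 32) both exceed the expected value, in line with the discussion in the text.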
VI. CONCLUSIONS

All of the studies reviewed in this chapter, many of which are previously unpublished, have one thing in common. Each of the binding events can be described in mechanistic terms without compromising the quality of the correlation. No additional descriptors are necessary to account for the bulk of the experimental variance. In order to account for the energetics of binding to inorganic, organic, and bioorganic polymers, nothing more than descriptors modeling known intermolecular forces is required. Within the full range of examples presented, nearly every known intermolecular force (imf), including dispersion forces, electronic and steric effects, and both common types of H-bonding (HBA, HBD), has played a critical role in dissecting the energetics of each event. Even the complex descriptor, log P(o/w), can be made to show structural selectivity by various surfaces, although the effects are still composite. It now seems safe to state that any binding event for both related and unrelated compounds can be analyzed in mechanistic terms, provided the data are well measured and the compound set is of sufficient size and diversity. While the choice of descriptors will change over time to reflect scientific advances, the key to consistency will always be selection of the best current descriptors that model each of the known intermolecular forces.
NOTE

The author recognizes that a few readers may have sufficient interest in the raw data and descriptors to wish to repeat the work or perform a variation on it. The tables included in the text (Tables 1-11) are those of manageable size (n = 14-36). The tables for Equations 8, 10, 11, 18, 19, 31 and 32 have not been included due to excessive size in length or breadth (n = 38-128). Any or all are available from the author by simple request.
REFERENCES 1. Israelachvili, J. N. Intermolecular and Surface Forces', Academic Press: London, 1985, pp 45-85. 2. Smith, D. A., Ed., Modeling the Hydrogen Bond; American Chemical Society: Washington, DC, 1994. 3. Newman, M. S., Ed. Steric Effects in Organic Chemistry, John Wiley & Sons: New York, 1956. 4. Martin, Y. C. Quantitative Drug Design] Marcel Dekker: New York, 1978, pp 80-81. 5. Bondi, J. Phys. Chem. 1964, 68,441-451. 6. Moriguchi, I.; Kanada, V.; Komatsu, K. Chem. Pharm. Bull. 1976,24, 1799-1806. 7. Magee, P. S. In Rational Approaches to Structure, Activity, and Ecotoxicology of Agrochemicals; Draber, W; Fujita, T., Eds.; CRC Press: Boca Raton, PL 1992, pp 79-101. 8. Charton, M.; Charton, B. I. J. Org. Chem. 1979,44, 2284-2288. 9. Vandenbelt, J. M.; Hansch, C ; Church, C. J. Med Chem. 1972, 75,787-789.
10. Charton, M.; Charton, B. J. Theor Biol. 1982, 99, 629-644. 11. Guy, R. H.; Honda, D. H. Int. J. Pharm. 1984, 79, 129-137. 12. Magee, P. S. In QSAR in Environmental Toxicology-IV; Hermens, J. L. M.; Opperhuizen, A., Eds.; Elsevier: Amsterdam, 1991, pp 155-178. 13. Charton, M. In Advances in Quantitative Structure—Property Relationships, Charton, M., Ed.; JAI Press: Greenwich, CT, 1996, pp 171-219. 14. Hansch, C ; Leo, A. Exploring QSAR; American Chemical Society: Washington, DC, 1995, Chaps. 1-2. 15. Kamlet, M. J.; Abboud, J.-L. M.; Abraham, M. H.; Taft, R. W. J. Org. Chem. 1983,48,2877-2887. 16. Raevsky, O. A.; Grigor'ev, V. Yu.; Kireev, D. B.; Zefirov, N. S. Quant. Struct.—Act. Relat. 1992, 77,49-63. 17. Reference 3, Chap. 13, pp 556-675. 18. Charton, M. In Topics in Current Chemistry. Charton, M.; Motoc, I., Eds.; Springer: Berlin, 1983, pp 57-91. 19. Charton, M. J. Am. Chem. Soc. 1969, 91, 615-618. 20. Gibbons, J. J.; Soundararajan, R. American Laboratory 1988, July, 38-46. 21. Jednacak-Biscan, J.; Cukman, D. Colloids and Surfaces 1989,41, 87-95. 22. Jeziorowski, H.; Knozinger, H.; Meye, W.; Muller, H. D. J. Chem. Soc, Faraday Trans. 11973, 69, 1744-1758. 23. Acosta Saracual, A. R.; Pulton, S. K.; Vicary, G. J. Chem. Soc, Faraday Trans. I 1982, 78, 2285-2296. 24. Meyer, C ; Bastick, J. Bull. Soc Chim. Fr 1978, 9-70, 359-362. 25. Hirva, R; Kakkanen, T. A. Surface Sci. 1992, 277, 530-538. 26. Cross, S. N. W.; Rochester, C. H. J. Chem. Soc, Faraday Trans. 11981, 77, 1027-1038. 27. Rochester, C. H.; Trebilco, D.-A. J. Chem. Soc, Faraday Trans. 11978, 74, 1125-1136. 28. Acosta Saracual, A. R.; Rochester, C. H. J. Chem. Soc, Faraday Trans. 11982, 78, 2787-2791. 29. Pohle, W. / Chem. Soc, Faraday Trans. 11982, 78, 2101-2109. 30. Rochester, C. H.; Trebilco, D.-A. J. Chem. Soc, Faraday Trans. 11978, 74, 1137-1145. 31. Snyder, L. R. J. Phys. Chem. 1963, 67, 240-248. 32. Snyder, L. R. J. Phys. Chem. 1963, 67, 234-240. 33. Snyder, L. R. / Phys. Chem. 
1963, 67, 2344-2353. 34. Holmes-Farley, S. R. Langmuir 1988, 4,166-11 A. 35. Glass, R. W.; Ross, R. A. J. Phys. Chem. 1973, 77, 2571-2576. 36. Glass, R. W.; Ross, R. A. J. Phys. Chem. 1973, 77, 2576-2578. 37. Bate-Smith, E. C ; Westall, R. G. Biochim. Biophys. Acta 1950,4, 427-438. 38. Dallas, M. S. J. J. Chromatog. 1965,17, 267-277. 39. Magee, R S. Quant. Struct.—Act. Relat. 1986, 5, 158-165. 40. Snyder, L. R. \n Advances in Chromatography: Giddings, J. C ; Keller, R. A., Eds.; Marcel Dekker: New York, 1967, pp 3-46. 41. Vemin, G.; Vemin, Mrs. G. J. Chromatog. 1970,46, 48-65. 42. Vemin, G.; Vernin, Mrs. G. J. Chromatog. 1970, 46, 66-78. 43. Klemm, L. H.; Chia, D. S. W.; Kelly, H. R J. Chromatog. 1978,150, 129-134. 44. Zweig, G.; Sherma, J., Eds. Handbook of Chromatography: General Data and Principles', CRC Press: Boca Raton, FL 1972, Table TLC 60. 45. Snyder, L. R. J. Chromatog. 1965, 20, 463-495. 46. Snyder, L. R. J. Chromatog. 1964,16, 55-88. 47. Reference 44, Table TLC 69. 48. Reference 44, Table TLC 119. 49. Giles, C. H.; Hassan, A. S. A. J. Soc Dyers Colour 1958, 74, 846-857. 50. Timofei, S.; Schmidt, W.; Kurunczi, L.; Simon, Z.; Sallo, A. Dyes and Pigments 1994, 24, 267-279.
51. Timofei, S.; Kurunczi, L.; Schmidt, W.; Fabian, W. M. R; Simon, Z. Quant. Struct.—Act. Relat. 1995,14, 444-449. 52. Grant, T. M.; King, C. J. Ind. Eng. Chem. Res. 1990, 29, 264-271. 53. Chitra, S. P.; Govind, R. AIChEJ. 1986, 32, 167-169. 54. Abe, I.; Hayashi, K.; Kitagawa, M. Kagaku to Kogyo (Osaka) 1981,55,441-442. 55. Kamlet, M. J.; Doherty, R. M.; Abraham, M. H.; Taft, R. W. Carbon 1985, 23, 549-554. 56. Grate, J. W.; Abraham, M. H.; Du, C. M.; McGill, R. A.; Shuely, W. J. Langmuir 1995, 11, 2125-2130. 57. Abe, I.; Hayashi, K.; Hirashima, T.; Kitagawa, M. Colloids and Surfaces 1984, 8, 315-318. 58. Abe, I.; Hayashi, K.; Hirashima, T.; Kitagawas, M. J. Colloid Interface Set. 1983, 94, 201-206. 59. Abe, I.; Kayama, H.; Ueda, I.; J. Pharm. Sci. 1990, 79, 354-358. 60. Abe, I.; Hayashi, K.; Kitagawa, M. Carbon 1983, 21, 189-191. 61. Sawicki, E.; Stanley, T. W.; Elbert, W. C ; Morgan, M. Talanta 1965,12, 605-616. 62. Reference 44, p 283-436. 63. Reference 44, Table TLC29. 64. Magee, R S. Quant. Struct.—Act. Relat. 1990, 9, 202-215. 65. Reference 44, Table TLC13. 66. Reference 44, Table PCI3. 67. Reference 44, Table PC57. 68. Reference 44, Table PC31. 69. Hartley, G. S.; Graham-Bryce, I. J. Physical Principles of Pesticide Behavior; Academic Press: London, 1980, pp 236-331. 70. Briggs, G. G. J. Agric. Food Chem. 1981, 29, 1050-1059. 71. SabljiC, A. Environ. Sci. Technol. 1987, 21, 358-366.
STRUCTURAL EFFECTS ON GAS-PHASE REACTIVITIES*
Gabriel Chuchani, Masaaki Mishima, Rafael Notario, and Jose-Luis M. Abboud
I. Introduction
II. Correlation Models and Substituent Constants
III. Reactions Involving Ionic Reagents and Products
    A. Experimental Methods
    B. Complexes between Bromide Ion and Substituted Benzenes (SB)
    C. Li+ Complexes
    D. Halogen Cations as Lewis Acids in the Gas Phase
    E. The Power of LFER: Ionization of Brønsted Acids and the Discovery of "New" Substituents
    F. Structural Effects on the Stability of Carbocations
    G. SE on the Intrinsic Basicity of Carbonyl and Thiocarbonyl Compounds
    H. Solvent Effects on Selected Proton Transfer Equilibria
    I. Correlation between Carbocation Stability in the Gas Phase and Kinetics of Carbocation Formation Reactions in Solution
*Dedicated in memoriam to Prof Robert W. Taft Advances in Quantitative Structure Property Relationships Volume 2, pages 35-126. Copyright © 1999 by JAI Press Inc. Ail rights of reproduction in any form reserved. ISBN: 0-7623-0067-1 35
IV. Reactions Involving Neutral Reagents and Products
    A. Experimental Considerations
    B. Esters
    C. Halides
    D. Carbonates
    E. Carbamates
    F. Thionocarbamates
    G. β-Hydroxyolefins
    H. α-Keto Acids
    I. Methanesulfonates
    J. Alcohols
    K. Addition of Ketene to Carboxylic Acids
Acknowledgments
References
I. INTRODUCTION

The quantitative study of structural and substituent effects (SE) in organic chemistry (often by means of linear free energy relationships) may provide important clues for the assignment and interpretation of reaction mechanisms. Difficulties met in the analysis of these effects frequently arise from the involvement of solvent. In contrast, SE on gas-phase chemical reactivity (both kinetic and thermodynamic) are intrinsic, that is, free from perturbations originating in solvent-solute interactions. Comparison of SE on the same reaction taking place in solution and in the gas phase allows the influence of solvation to be quantified. In the case of molecules involving long alkyl chains, the situation is obviously more complicated, as the molecule can solvate itself intramolecularly in a suitable conformation.

Because of technical difficulties, there are relatively few systematic experimental studies of substituent effects on gas-phase reactivity. As to reactions of neutral species, some reviews on gas-phase pyrolysis are available but there seems to be no monographic treatment of SE in these processes. In the case of reactions involving anions and cations, Taft and Topsom (ref. 13) and Gal and Maria (ref. 14) published two major surveys in 1987 and 1991, respectively. The first one specifically addresses the quantitative study of SE; the latter is more general and focuses on quantitative treatments of acid-base reactions involving neutral bases and a variety of charged electron acceptors.

Here, a survey of some recent studies of SE on gas-phase reactivity is presented. Both neutral and charged reagents/products are treated. We try to cover material not included in these reviews and, where a minor overlap occurs, the treatment of the experimental data is somewhat different from, and complementary to, that given in refs. 13 and 14. In several cases, structural effects on gas-phase and solution reactivities are compared.
Because of the highly specialized and widely different techniques used in the experimental study of charged and neutral species, we shall examine separately both groups of reactions. As we shall see, however, correlation techniques give a surprisingly unified picture of SE on these systems.
II. CORRELATION MODELS AND SUBSTITUENT CONSTANTS

Hammett's classical definition of $\sigma$ parameters through Eq. 1 is an appropriate starting point:

$$\sigma_X = \log K_X - \log K_0 \tag{1}$$

$K_0$ and $K_X$, respectively, stand for the ionization constants in water at 25 °C of benzoic acid and a meta- or para-X-substituted benzoic acid. For each substituent, two families of substituent parameters, $\sigma_m$ and $\sigma_p$, are thus obtained. Beyond this, several models have been used by different groups of workers. For the sake of conceptual unity, and because of its breadth, we consider Charton's general treatment of the electrical effect $Q_X$ induced by a substituent X on closed-shell active sites ranging from cations, such as carbenium ions, to anions, such as carbanions, in systems with or without a skeletal group. According to this triparametric model, $Q_X$ is given by Eq. 2,

$$Q_X = L\sigma_{lX} + D\sigma_{dX} + R\sigma_{eX} + h \tag{2}$$

where $\sigma_l$ represents the electrical effect observed when one or more sp3-hybridized carbon atoms separate the active site from the substituent. In this type of system, delocalization of substituent valence electrons is thus minimal. This constant has been called the "inductive" or "field effect" constant. Charton refers to it as the "localized electrical effect constant." The constant $\sigma_d$ represents the resonance effect of the substituent (Charton's "intrinsic delocalized" effect). The constant $\sigma_e$ reflects the electronic demand of the system under scrutiny; h is a generalized intercept. It is important to notice that for a system in which the electronic demand remains constant, Eq. 2 reduces to the biparametric Eq. 3,

$$Q_X = L\sigma_{lX} + D\sigma_{DX} + h \tag{3}$$

wherein $\sigma_D$ has the form of Eq. 4,

$$\sigma_D = \eta\,\sigma_e + \sigma_d \tag{4}$$

and $\eta$ is determined by the electron demand. This equation reflects a very important fact: the necessity of using resonance or delocalized effects appropriate to different kinds of reaction centers. Charton takes $\sigma_l$ as identical to $\sigma_I$, defined by means of Eq. 5,

$$\sigma_I = \Delta pK_a / 1.56 \tag{5}$$
using the $pK_a$ values for the 4-X-bicyclooctane carboxylic acids. In a thorough review of SE parameters, Hansch et al. showed that these $\sigma_I$ values are very close to those, $\sigma_F$, measuring field/inductive effects and obtained by Taft and Topsom by averaging values of determinations by numerous methods. These values shall be used here, for the sake of consistency with previous studies on gas-phase reactivity of ionic species. It is a reassuring fact that the Gibbs energy changes for the ionization of 4-X-bicyclooctane carboxylic acids in the gas phase are linearly related to $\sigma_F$ to a very high degree of precision.

Different $\sigma_R$ values are appropriate for situations involving different electron demand. Here, the following parameters shall be used:

1. $\sigma_R$, as determined by Taft through Eq. 6:

$$\sigma_R = \sigma_p - \sigma_I \tag{6}$$
2. $\sigma_{R^+}$, also determined by Taft, largely on the basis of the $\sigma_R^\circ$ parameters (in turn obtained by Taft and coworkers from the 13C NMR spectra of monosubstituted benzenes). They are appropriate for the treatment of electron-deficient systems. $\sigma_{R^+}$ values are fairly close to the corresponding $\sigma_R^\circ$ values. The main difference is that $\sigma_{R^+} = 0$ for electron-acceptor (+R) groups.

3. $\sigma_{R^-}$ parameters are appropriate for the study of electron-rich systems. They are based on SE on gas-phase acidities of neutral acids such as phenols and anilines. For electron-acceptor groups, $\sigma_{R^-}$ and $\sigma_R^\circ$ are practically indistinguishable. The differences appear in the case of strong electron-donor (-R) groups.

The general Taft-Topsom treatment of substituent effects (referred to hydrogen) on a thermodynamic or kinetic property Pr in the gas phase involves the contributions of polarizability (P), field (F), and resonance (R) effects and is given by Eq. 7 or 8, depending on whether the systems involved are electron-deficient or electron-rich, respectively:

$$\delta Pr = Pr(\text{substituent}) - Pr(\mathrm{H}) = \rho_\alpha\sigma_\alpha + \rho_F\sigma_F + \rho_{R^+}\sigma_{R^+} \tag{7}$$

$$\delta Pr = Pr(\text{substituent}) - Pr(\mathrm{H}) = \rho_\alpha\sigma_\alpha + \rho_F\sigma_F + \rho_{R^-}\sigma_{R^-} \tag{8}$$
4. $\sigma^\circ$ are the "normal" substituent parameters, defined by Taft and quantifying substituent effects in systems wherein direct interaction between the substituent and the reaction center is absent.

An important alternative biparametric model used in this work is that developed by Yukawa and Tsuno (Y-T) in 1959. It was originally intended to deal with the influence of para π-donor substituents on reactions that are more electron-demanding than the ionization of benzoic acid. These authors suggested that the values of $\sigma^+ - \sigma$ would provide a scale of enhanced resonance effects and modified the Hammett equation to incorporate this feature in Eq. 9,

$$\log(K/K_0) = \rho\,(\sigma + r^+\,\Delta\bar{\sigma}_R^+) \tag{9}$$

where the enhanced resonance effect ($\sigma^+ - \sigma$) is written as $\Delta\bar{\sigma}_R^+$. The $\sigma^+$ parameters are those defined by Brown and Okamoto on the basis of the solvolytic rates of cumyl chlorides; $r^+$ measures the contribution of the enhanced resonance effect of -R substituents. Later, this equation was modified and the normal substituent constant $\sigma^\circ$ was used instead of $\sigma$ in Eq. 10,

$$\log(K/K_0) = \rho\,(\sigma^\circ + r^+\,\Delta\bar{\sigma}_R^+) \tag{10}$$

where $\Delta\bar{\sigma}_R^+$ is now ($\sigma^+ - \sigma^\circ$). This form of the equation may be held to be conceptually more correct than the original one since the $\sigma$ scale itself involves enhanced resonance effects. When $r^+ = 0$, $\log(K/K_0) = \rho\sigma^\circ$, while if $r^+ = 1$, it corresponds to straightforward correlation with $\sigma^+$. This modification of the parameter scale does not affect the original meaning and the applicability of the equation. The same idea leads to Eq. 11 for describing the enhanced resonance effect of +R substituents on an electron-rich reaction system such as the ionization of phenols (protonation of phenoxide anions),

$$\log(K/K_0) = \rho\,(\sigma^\circ + r^-\,\Delta\bar{\sigma}_R^-) \tag{11}$$

where $\Delta\bar{\sigma}_R^-$ equals $\sigma^- - \sigma^\circ$. The $r^-$ value indicates the contribution of the enhanced p-π interaction between a para π-acceptor substituent and a negative charge.

In this review the Y-T Eq. 10 is mostly applied to the study of substituent effects on the stabilities of electron-deficient systems. With this equation, the concept of varying resonance demand of reactions was introduced into the field of correlation analysis of SE. In the general application of this equation, the $r^+$ value has been found to change widely with the reaction, and not to be limited to values lower than unity. Indeed, values significantly higher than one are met in reactions more electron-demanding than the solvolysis of tert-cumyl chlorides. These $r^+$ values shed light on the nature of the transition state, and have been widely applied to the assignment and interpretation of reaction mechanisms. A thorough review on the use of the Y-T equation and the concept of varying electron demand has recently been published.

A fundamental contributor to SE in many gas-phase reactions of charged species is polarizability. Physically, it reflects the stabilization of the charge (positive or negative) by the substituent through ion-induced dipole interaction. In the Taft-Topsom scheme this effect is quantified by the parameter $\sigma_\alpha$.
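The role of $r^+$ in the Y-T treatment can be illustrated with a short sketch. It evaluates the effective substituent constant $\sigma^\circ + r^+\Delta\bar{\sigma}_R^+$ of Eq. 10 for the 4-OMe group, using the two constants listed for it in Table 1B. The $r^+$ values other than 0 and 1, and the $\rho$ value, are hypothetical numbers chosen only for illustration; the function names are likewise illustrative.

```python
def yt_sigma(sigma0, delta_sigma_r_plus, r_plus):
    """Effective substituent constant of Eq. 10: sigma0 + r+ * delta_sigma_R+."""
    return sigma0 + r_plus * delta_sigma_r_plus

def yt_log_ratio(rho, sigma0, delta_sigma_r_plus, r_plus):
    """log(K/K0) predicted by Eq. 10 for a given reaction constant rho."""
    return rho * yt_sigma(sigma0, delta_sigma_r_plus, r_plus)

# 4-OMe constants as listed in Table 1B
SIGMA0_OME = -0.10
DELTA_OME = -0.70

RHO_HYPOTHETICAL = -5.0  # assumed reaction constant, for illustration only

for r_plus in (0.0, 0.5, 1.0, 1.3):  # 0.5 and 1.3 are hypothetical resonance demands
    sigma_eff = yt_sigma(SIGMA0_OME, DELTA_OME, r_plus)
    log_ratio = yt_log_ratio(RHO_HYPOTHETICAL, SIGMA0_OME, DELTA_OME, r_plus)
    print(f"r+ = {r_plus:3.1f}   effective sigma = {sigma_eff:6.2f}   log(K/K0) = {log_ratio:5.2f}")
```

At $r^+ = 0$ the effective constant is simply $\sigma^\circ$, and at $r^+ = 1$ it equals $\sigma^+$; values of $r^+$ above unity extrapolate beyond the $\sigma^+$ scale, which is the situation described for highly electron-demanding reactions.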
We present in Tables 1A and 1B the values of the various parameters used in this study. Most of them are taken from refs. 15, 27, 28, and 29. It is a remarkable fact that for effects other than polarizability, no serious differences exist between substituent constant values appropriate for gas and solution phases, except for some particular substituents which have strong specific interaction with the solvent (e.g. hydrogen bonding). This allows us to directly compare results of correlation analyses of SE in the gas phase and in solution.

Table 1A. Substituent Parameters^a
Substituent
^F
%
%
<
^R^
^R-
^m
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
CH3
-0.35
0.00
-0.08
0.03
-0.07
-0.17
-0.31
-0.17
CH2CH3
-0.49
0.00
-0.07
0.02
-0.07
-0.15
-0.30
-0.19
CH2CH2CH3
-0.54
0.00
-0.07
0.02
-0.06
-0.13
-0.29
-0.06
CH(CH3)2
-0.62
0.00
-0.07
0.01
-0.04
-0.15
-0.28
-0.16
CH2CH2CH2CH3
-0.57
0.00
-0.07
0.02
-0.08
-0.16
-0.29
-0.12
0.01
-0.08
-0.12
0.02
-0.07
-0.12
H
^ g
— —
—
CH(CH3)CH2CH3
-0.68
0.00
-0.07
CH2CH(CH3)2
-0.61
0.00
-0.07
C(CH3)3
-0.75
0.00
-0.06
0.00
-0.10
-0.20
-0.26
CH2C(CH3)3
-0.67
0.00
-0.07
0.02
-0.05
-0.17
-0.31
C-C3H5
-0.62
0.00
-0.15
-0.07
-0.21
-0.41
-0.09
-0.05
-0.15
-0.29
-0.14
0.06
-0.04
-0.16
-0.11
-0.14
-0.22
-0.18 0.53
—
0.01 -0.13
—
C-QH11
-0.76
0.00
-0.06
0.01
CH=CH2
-0.50
0.06
-0.16
(0.16)
CH2CH=CH2
-0.57
0.03
-0.07
0.02
C=CH
-0.60
0.23
(0.00)
0.21
0.23
0.18
CeHs
-0.81
0.10
-0.22
0.22
0.06
-0.01
-0.18
0.02
CH2C6H5
-0.70
0.05
-O.05
0.02
-0.08
-0.09
-0.28
-0.09
CH2CH2C5H5
-0.65
0.03
-0.07
0.02
-0.07
-0.12
-0.28
-0.12
0.13
0.44
-0.25
-0.25
0.34
0.06
-0.07
-0.03
CI
-0.43
0.45
-0.17
-0.12
0.37
0.23
0.11
0.19
Br
-0.59
0.45
-0.15
-0.10
0.39
0.23
0.15
0.25
0.35
0.18
0.14
0.27
F
1
—
—
(0.00)
—
—
—
— —
CH2F
-0.30
0.22
-0.03
0.02
0.12
0.11
CHF2
-0.27
0.36
0.00
0.04
0.29
0.32
CF3
-0.25
0.44
0.00
0.07
0.43
0.54
0.61
CH2CI
-0.54
0.23
-0.05
0.02
0.11
0.12
-0.01
CHCI2
-0.62
0.36
0.00
0.02
0.31
0.32
0.40
0.46
— — 0.65
— — —
— — — —
CCI3
-0.70
0.44
0.00
0.02
CH2CH2CI
-0.57
0.12
-0.07
0.02
OH
-0.03
0.30
-0.38
-0.28
0.12
-0.37
-0.92
-0.37
OCH3
-0.17
0.25
-0.42
-0.27
0.12
-0.27
-0.78
-0.26
OCH2CH3
-0.23
0.25
-0.45
-0.27
0.10
-0.24
-0.81
-0.28
0.25
-0.03
-0.50
-0.10
0.08
0.01
-0.05
—
OCeHs
-0.38
0.38
-0.32
CH2OCH3
-0.42
0.14
-0.06
0.02
CH2CH2OCH3
-0.52
0.07
-0.07
0.02
—
—
—
—
— —
—
NH2
-0.16
0.14
-0.52
-0.28
-0.16
-0.66
-1.30
-0.15
N(CH3)2
-0.44
0.10
-0.64
-0.26
-0.16
-0.83
-1.70
-0.12
COCH3
-0.55
0.26
(0.00)
0.17
0.38
0.50
CO2CH3
-0.49
0.24
0.00
0.16
0.37
0.45
0.49
0.37
0.45
0.48
CO2CH2CH3
—
—
—
—
—
0.84 0.75 0.75
1[continued)
Gas-Phase Reactivities
41 Table 1A.
Substituent
^a
^F
CN
-0.46
NO2 SCH3
Continued
-;
%
^R^
^R-
^m
0.60
(0.00)
0.10
0.56
^P 0.66
0.66
-0.26
0.65
0.00
0.18
0.71
0.78
0.79
1.27
-0.68
0.25
-0.27
0.15
0.00
-0.60
0.06
0.23 -0.04
0.07
-0.55
0.18
-0.07
.02
0.60 0.27
0.72 0.44
—
SQHs
-0.88
0.34
-0.10
0.03
Si(CH3)3
-0.72
-0.02
0.00
0.06
SO2CH3
-0.62
0.59
0.00
0.12
4-pyridyl
—
—
—
—
— —
1.00
— 1.13 0.81
Note: ^From ref. 15.
Table 1B. Substituent Constants Used for Analysis of Gas-Phase Substituent Effects by Means of the Y-T Equation 10^a

Substituent        σ°       Δσ̄R+
4-NMe2           -0.43     -1.30
4-NH2            -0.19     -1.00
3-CH2CH2O-4      -0.19     -0.75
4-OMe            -0.10     -0.70
3-Cl-4-OMe        0.22     -0.72
3-F-4-OMe         0.22     -0.72
3-CN-4-OMe        0.47     -0.73
4-SMe             0.04     -0.73
3-Cl-4-SMe        0.25     -0.73
3-CN-4-SMe        0.60     -0.73
4-OH             -0.05     -0.50
4-Bu             -0.27     -0.17
4-Me             -0.13     -0.20
4-F               0.20     -0.17
4-Cl              0.20     -0.15
3,5-Me2          -0.28      0.00
3-Me             -0.12      0.00
3-F               0.39      0.00
3-Cl              0.36      0.00
4-COCH3           0.17      0.00
4-CO2Me           0.14      0.00
4-CHO             0.43      0.00
3,5-F2            0.65      0.00
3-CF3             0.50      0.00
4-CF3             0.56      0.00
3-CN              0.69      0.00
4-CN              0.73      0.00
3-NO2             0.73      0.00
4-NO2             0.80      0.00
3,5-(CF3)2        0.98      0.00
H                 0.00      0.00

Note: ^a Values from refs. 27, 28, and 29.
III. REACTIONS INVOLVING IONIC REAGENTS AND PRODUCTS A. Experimental Methods
Consider the equilibrium constant $K_p$ for the ion-molecule reaction 12 in the gas phase:
$$\mathrm{A^{\pm}(g) + C(g) \rightleftharpoons D^{\pm}(g) + E(g)} \qquad K_p \tag{12}$$
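The link between an equilibrium constant and the thermodynamic quantities discussed here can be sketched numerically. The following minimal example uses the standard relations $\Delta G^{\circ} = -RT \ln K$ and the van't Hoff treatment of $\ln K$ against $1/T$. The function names and the equilibrium constants are hypothetical (the constants are generated from an assumed enthalpy and entropy, not taken from any experiment), and $\Delta H^{\circ}$ and $\Delta S^{\circ}$ are assumed constant over the temperature range.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def gibbs_from_k(k_eq, temperature):
    """Standard Gibbs energy change (kcal/mol) from an equilibrium constant."""
    return -R_KCAL * temperature * math.log(k_eq)


def vant_hoff_fit(temperatures, k_values):
    """Least-squares fit of ln K against 1/T.

    Returns (dH, dS) in kcal/mol and kcal/(mol K), assuming both are
    temperature independent over the range covered.
    """
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(k) for k in k_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # slope = -dH/R
    intercept = mean_y - slope * mean_x  # intercept = dS/R
    return -R_KCAL * slope, R_KCAL * intercept


# Hypothetical data: K values generated from assumed dH and dS (illustration only)
temps = [400.0, 450.0, 500.0]
true_dh, true_ds = -10.0, -0.020
ks = [math.exp(-(true_dh - t * true_ds) / (R_KCAL * t)) for t in temps]

dh, ds = vant_hoff_fit(temps, ks)
dg_400 = gibbs_from_k(ks[0], temps[0])
print(f"dH = {dh:.2f} kcal/mol, dS = {1000 * ds:.1f} cal/(mol K), dG(400 K) = {dg_400:.2f} kcal/mol")
```

With real measurements, the scatter of $\ln K$ about the fitted line gives a first indication of the uncertainty in the derived enthalpy.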
Knowledge of $K_p$ at a given temperature leads to the standard Gibbs energy change for the reaction. From $K_p$ values at various temperatures, the corresponding standard enthalpy changes are obtained. In all cases, the determination of $K_p$ requires the simultaneous determination of the ratio of the partial pressures of the ions, $p(\mathrm{D^{\pm}})/p(\mathrm{A^{\pm}})$, as a function of the reaction time. The pressures of the neutral reagents must also be known; they can be determined by standard methods. The ratios $p(\mathrm{D^{\pm}})/p(\mathrm{A^{\pm}})$ are given by the ratios of ion intensities. They are generally determined by three main techniques: pulsed electron-beam high-pressure mass spectrometry (HPMS),^^ ion cyclotron resonance spectrometry (ICR/FT ICR),^^'^^ and flowing afterglow methods (FA).^"^ Comparison of the results obtained by these techniques has generally shown quite satisfactory agreement. The details of the experimental methodology for the study of gas-phase ion-molecule reactions lie outside the scope of this review and are well covered in the literature.
B. Complexes between Bromide Ion and Substituted Benzenes (SB)
The standard Gibbs energy changes, $\Delta G^{\circ}_{\mathrm{Br^-}}$, for the formation of complexes between monosubstituted benzenes (SB) and bromide anion in the gas phase (reaction 13a) at 423 K have been determined by Paul and Kebarle^^ by means of HPMS:
$$\mathrm{SB(g) + Br^-(g) \rightleftharpoons (SB{\cdot}Br)^-(g)} \qquad \Delta G^{\circ}_{\mathrm{Br^-}} \tag{13a}$$
SE in this reaction, relative to unsubstituted benzene, are measured by $\delta\Delta G^{\circ}_{\mathrm{Br^-}}$, the standard Gibbs energy change for reaction 13b:
$$\mathrm{(SB{\cdot}Br)^-(g) + C_6H_6(g) \rightleftharpoons (C_6H_6{\cdot}Br)^-(g) + SB(g)} \qquad \delta\Delta G^{\circ}_{\mathrm{Br^-}} \tag{13b}$$
Experimental results are given in Table 2. In principle, several structures (I-IV) can be expected for these complexes.
[Structures I-IV of the halide-benzene complexes; drawings not reproduced.]
Ab initio theoretical calculations on appropriate systems (using $\mathrm{Cl^-}$ as the reference halide)^^ suggest that structure I is the most stable one for benzene and singly substituted benzenes. For strong -R substituents, the complexes are predicted to be of type I with $\mathrm{Cl^-}$ interacting with the C-H hydrogen meta to the substituent, although the interaction with the para C-H hydrogen should be extremely close in stability. In the case of single strong +R substituents, structure I, with a preferred interaction between $\mathrm{X^-}$ and the para C-H hydrogen, is expected. We present in Table 3 the results of a treatment of $\delta\Delta G^{\circ}_{\mathrm{Br^-}}$ in terms of Taft's Eq. 8. The correlation coefficient is excellent. The $\rho_\alpha$ value is very small and barely significant. Thus, exclusion of polarizability has very little effect both on the quality
Table 2. Substituent Effects on $\delta\Delta G^{\circ}_{\mathrm{Br^-}}$, Reaction 13b

Substituent X    $\delta\Delta G^{\circ}_{\mathrm{Br^-}}$ ^a,b
H                (0.0)
F                2.1
Cl               2.6
CCl3             3.9
CF3              4.1
CHO              4.7
NO               5.2
CN               6.2
NO2              6.5

Notes: ^a All values in kcal mol^-1. ^b From ref. 35.
Table 3. Treatment of $\delta\Delta G^{\circ}_{\mathrm{Br^-}}$, $\delta\Delta G^{\circ}_{\mathrm{Cl^-}}$, $\delta\Delta G^{\circ}_{\mathrm{ArCOOH}}$, and $\delta\Delta G^{\circ}_{\mathrm{ArOH}}$ by Means of Equation 8^a
Property SAC^r'
5ACJ,-^
P/?-
Pg
PF
7.35(0.91)
8.29 (0.48)
7.51 (0.80)
8.48 (0.28)
7.77 (0.89)
8.46 (0.42)
7.53 (0.99)
8.47 (0.32)
- 0 . 3 0 (0.60) 0 -0.35 (0.70)
r
sd
0.987
0.9
0.987 0.998
0.3
0
0.995
0.4
SAC^rCOOH'^'
14.6(0.7)
15.0(0.6)
0.1 (1.0)
0.997
0.4
SAC^KDH'''
49.0(1.5)
18.6(0.5)
0.6 (0.8)
0.999
0.4
Notes: ^a All values in kcal mol^-1. Uncertainties in parentheses. ^b Data from ref. 35. ^c Data from ref. 13. ^d Correlation from ref. 13.
of the correlation and on the values of $\rho_{R^-}$ and $\rho_F$. We also present in this table similar analyses for the effects of +R substituents on the stabilities of complexes involving chloride anion and monosubstituted benzenes in the gas phase ($\delta\Delta G^{\circ}_{\mathrm{Cl^-}}$) as well as on the acidities of 4-substituted benzoic acids ($\delta\Delta G^{\circ}_{\mathrm{ArCOOH}}$) and phenols ($\delta\Delta G^{\circ}_{\mathrm{ArOH}}$) bearing the same +R substituents. From these results, and largely following Kebarle and Paul,^^ the following conclusions can be drawn:
1. The energetics of these interactions is largely determined by electrostatic field effects, as shown by the sizes of $\rho_{R^-}\sigma_{R^-}$ and $\rho_F\sigma_F$ for the same systems (Table 3). Thus, even for substituents like NO2, which are strong $\pi$ acceptors, the resonance contribution is about one-fifth of that originating in field effects. It is noteworthy that $\delta\Delta G^{\circ}_{\mathrm{Br^-}}$ values are fairly well correlated with the molecular dipole moments of the corresponding SBs.
2. The ratio $\rho_{R^-}/\rho_F$ for chloride and bromide complexes is very close to that for the acidities of benzoic acids and much smaller than that for the acidities of phenols (see Table 3). These facts seem to indicate that the relevant resonance structures (shown in Chart 1 for the CHO substituent) are much less important than the direct conjugation occurring in the case of the phenoxide. In general, the role of resonance (particularly in the case of strong +R groups) in these complexes is two-fold: (i) the favorable charge distribution induced in the neutral SB, and (ii) the enhancement of the resonance structures on approach of the anion.
3. The success of this analysis of the equilibrium (reaction 13b) strongly suggests that the relative positions of the halide and the substituents remain constant throughout the series of complexes. In the case of strong +R substituents, the results support the existence of a C-H···X^- bond para to the substituent. In the case of -R substituents such as F or Cl, the field effect again favors this structure, but $\pi$ donation favors a meta orientation, as it increases the negative charge in the para position. Theoretical calculations show, however, that the difference in energy is very small. It is likely, therefore, that the actual situation involves an equilibrium mixture of both complexes, which explains the validity of the treatment.
[Chart 1. Resonance structures of the complex for the CHO substituent; drawing not reproduced.]
C. $\mathrm{Li^+}$ Complexes
A substantial body of experimental information^^ is available on the standard Gibbs energy change for reaction 14,
$$\mathrm{B(g) + Li^+(g) \rightarrow BLi^+(g)} \qquad \Delta G^{\circ}_{\mathrm{Li^+}} \tag{14}$$
wherein B is a neutral base. As shown by Gal and Maria^"*'^^ as well as by Taft^^ and coworkers, there are fairly good linear relationships between $\Delta G^{\circ}_{\mathrm{Li^+}}$ and $\Delta G^{\circ}_{\mathrm{H^+}}$, the standard Gibbs energy change for reaction 15:
$$\mathrm{B(g) + H^+(g) \rightarrow BH^+(g)} \qquad \Delta G^{\circ}_{\mathrm{H^+}} \tag{15}$$
(Notice that the negative of $\Delta G^{\circ}_{\mathrm{H^+}}$ is known as the "gas-phase basicity of B" and is represented by GB. The negative of the standard enthalpy change, $\Delta H^{\circ}_{\mathrm{H^+}}$, is known as the "proton affinity of B" and is represented by PA.) These relationships are family-dependent and have slopes generally in the range 0.37-0.52. This behavior is consistent with the principal component analysis of basicity,^'^^ which indicates the existence of two main components. Reaction 14 seems to involve mostly electrostatic interactions, while reaction 15 is best considered a blend of covalent and electrostatic contributions. Yáñez, Taft, and coworkers^^ have published a thorough experimental and theoretical study of reactions 14 and 15 for a series of methyldiazoles. This work is an excellent example of the combination of quantum-mechanical and modeling methods with correlation analysis. The relevant database is given in Table 4. Figure 1 portrays the linear relationship between $\Delta G^{\circ}_{\mathrm{Li^+}}$ and $\Delta G^{\circ}_{\mathrm{H^+}}$.
Table 4. Experimental $\Delta G^{\circ}_{\mathrm{Li^+}}$ and $\Delta G^{\circ}_{\mathrm{H^+}}$ Values for Reactions 14 and 15 of Selected Heterocyclic Bases^a,b

Compound                              $\Delta G^{\circ}_{\mathrm{H^+}}$    $\Delta G^{\circ}_{\mathrm{Li^+}}$
(1) 1,2,4-triazole                    203.0      35.4
(2) thiazole                          205.9      35.9
(3) pyrazole                          204.7      36.3
(4) 1-methylpyrazole                  208.8      37.0
(5) 4-methylpyrazole                  207.7      38.5
(6) 3(5)-methylpyrazole               208.0      38.1
(7) 1,4-dimethylpyrazole              212.7      39.7
(8) 1,5-dimethylpyrazole              214.0      40.3
(9) 3,4,5-trimethylpyrazole           216.8      41.3
(10) 1,3,5-trimethylpyrazole          217.4      41.2
(11) 1,3,4,5-tetramethylpyrazole      220.5      41.6
(12) imidazole                        215.6      41.2
(13) 1-methylimidazole                219.4      42.8
(14) 1,2-dimethylimidazole            225.1      44.8
(15) 2,4,5-trimethylimidazole         225.3      45.2

Notes: ^a All values in kcal mol^-1. ^b Data taken from ref. 39.
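The linear relationship between the lithium and proton data can be checked directly from the entries of Table 4. The sketch below is an ordinary least-squares fit written out by hand; the variable and function names are illustrative choices, and the numbers are the tabulated values, used with the signs as printed.

```python
import math

# (proton value, lithium value) in kcal/mol for entries 1-15 of Table 4,
# with the signs as printed in the table
table4 = [
    (203.0, 35.4), (205.9, 35.9), (204.7, 36.3), (208.8, 37.0), (207.7, 38.5),
    (208.0, 38.1), (212.7, 39.7), (214.0, 40.3), (216.8, 41.3), (217.4, 41.2),
    (220.5, 41.6), (215.6, 41.2), (219.4, 42.8), (225.1, 44.8), (225.3, 45.2),
]


def linear_fit(points):
    """Ordinary least squares y = a + b*x; returns (b, a, r, sd of fit)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    sd = math.sqrt(sse / (n - 2))  # standard deviation of the fit
    return slope, intercept, r, sd


slope, intercept, r, sd = linear_fit(table4)
print(f"slope = {slope:.3f}, r = {r:.3f}, sd = {sd:.1f} kcal/mol")
```

The fit can be compared with the statistics quoted alongside Figure 1 (slope 0.419, R = 0.981, sd = 0.6).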
Because the entropy terms are essentially constant within each of these series of reactions, the lithium and proton affinities (LiA and PA, that is, the negatives of the standard enthalpy changes for reactions 14 and 15, respectively) are also linearly related, with essentially the same slope, 0.419 (0.023). This slope is close to that found in a similar LFER applying to unsubstituted azoles. For each compound, $\Delta G^{\circ}_{\mathrm{Li^+}}/\Delta G^{\circ}_{\mathrm{H^+}} \approx \mathrm{LiA/PA} \approx 0.19$. This small value reflects a fundamental difference between lithiation and protonation. Its physical origin appears very clearly from a Bader analysis of the charge densities, and of their Laplacians, for the neutral, protonated, and lithiated bases.^^ In the lithiated bases, the N-Li bond is largely electrostatic, as it corresponds to an interaction between two closed-shell systems, and the structures of the bases are seen to change very little upon lithiation. In the case of protonation, the N-H bond is largely covalent and the bases undergo substantial structural changes. It therefore seems that the LFER portrayed in Figure 1 hides substantial mechanistic differences, although no parameters seem to be at hand to unravel them. The authors"^^ applied the following simplified model:
1. The main contributor to the stabilization of the B-Li^+ adducts was considered to be the interaction of the ion (a point charge) with the molecule (a polarizable dipole).
Figure 1. $-\Delta G^{\circ}_{\mathrm{Li^+}}$ vs. $-\Delta G^{\circ}_{\mathrm{H^+}}$ for reactions 14 and 15 (slope = 0.419 (0.023); R = 0.981; sd = 0.6).
The polarizability of the diazole molecule was in turn assumed to be the sum of two components: the polarizability of the azolic ring (constant along the series) plus the polarizability of the methyl groups. With these assumptions, the energies (or enthalpies) of the various complexes relative to that of the parent compound (imidazole or pyrazole) could be estimated as arising from the interaction of the molecular dipole moments of the relevant species with the point charge and from the ion-induced dipole interaction between this point charge and the various methyl groups. This very simple model led to an excellent description of the experimental results.
2. For the protonated species, the same approach was followed in order to estimate the energies (or enthalpies) of the protonated forms with respect to the parent compounds. Here, however, the -R effect of the methyl groups is likely to be operative and to further increase the stability of the protonated species.
On the other hand, the good quality of the correlation between $\Delta G^{\circ}_{\mathrm{Li^+}}$ and $\Delta G^{\circ}_{\mathrm{H^+}}$ indicates that resonance effects, relevant only in protonation, are either constant or steadily increase with the number of methyl groups. In the first case, inclusion of this term would not affect the slope of the correlation equation. In the latter case, resonance stabilization should lead to a larger slope. The computed value for the slope of the correlation between LiA and PA is 0.50, about 20% larger than the experimental value, 0.42. This is consistent with the concept that resonance stabilization by the methyl groups is significant in the case of protonation but not in the case of lithiation. An interesting feature revealed by this study is that the contributions from ion-dipole (whole molecule) and ion-induced dipole (methyl groups) interactions are of nearly equal importance in the case of protonation, while the latter are almost nil in the case of lithiation. Clearly, in systems such as these, wherein a "traditional" dissection of effects in terms of the various $\sigma$ parameters is not possible, theoretical treatments lead to quantitative rationalizations of SE which are in excellent conceptual agreement with the models discussed in Section II.
D. Halogen Cations as Lewis Acids in the Gas Phase
Abboud and coworkers'^ reported in 1989 that $\mathrm{I_2^{+\bullet}}$, obtained by electron ionization of $\mathrm{I_2(g)}$, is able to react with n-donor bases according to reaction 16:
$$\mathrm{B(g) + I_2^{+\bullet}(g) \rightarrow (B{-}I)^+(g) + I^{\bullet}(g)} \tag{16}$$
The adducts $\mathrm{(B{-}I)^+(g)}$ were shown to reversibly exchange $\mathrm{I^+}$ according to reaction 17:
$$\mathrm{(B_1{-}I)^+(g) + B_2(g) \rightleftharpoons (B_2{-}I)^+(g) + B_1(g)} \tag{17}$$
This feature allows the experimental construction of a scale of iodine cation basicity, ICB, defined as the standard Gibbs energy change for reaction 18, $\Delta G^{\circ}_{\mathrm{I^+}}$:
$$\mathrm{(B{-}I)^+(g) \rightarrow B(g) + I^+(g)} \qquad \Delta G^{\circ}_{\mathrm{I^+}} \tag{18}$$
A number of ICB values for several organic bases, notably pyridines, were determined in that study. Some years later, Cooks et al."^^ extended these studies to the determination of chlorine cation affinities (CLCA), that is, the standard enthalpy changes for reaction 19:
$$\mathrm{(B{-}Cl)^+(g) \rightarrow B(g) + Cl^+(g)} \qquad \Delta H^{\circ}_{\mathrm{Cl^+}} \tag{19}$$
Their study also focused on substituted pyridines. Differential ICB and CLCA values, that is, values relative to unsubstituted pyridine, can be safely compared on account of the essentially constant entropy changes for reactions 18 and 19 within the same family of compounds. In what follows, we present the various structural effects relative to unsubstituted pyridine, taken as a reference: $\delta\Delta G^{\circ}_{\mathrm{Y^+}} = \Delta G^{\circ}_{\mathrm{Y^+}}(\text{X-pyridine}) - \Delta G^{\circ}_{\mathrm{Y^+}}(\text{pyridine})$, where $\mathrm{Y^+ = H^+, Cl^+, I^+}$, etc. With this definition, $\delta\Delta G^{\circ}_{\mathrm{Y^+}}$ is the standard Gibbs energy change for the $\mathrm{Y^+}$ exchange reaction 20:
$$\text{X-Pyridine(g)} + (\text{Pyridine-Y})^+\text{(g)} \rightleftharpoons (\text{X-Pyridine-Y})^+\text{(g)} + \text{Pyridine(g)} \qquad \delta\Delta G^{\circ}_{\mathrm{Y^+}} \tag{20}$$
Furthermore, as indicated above, for such a process $\delta\Delta G^{\circ}_{\mathrm{Y^+}} = \delta\Delta H^{\circ}_{\mathrm{Y^+}}$. The experimental database is given in Table 5.
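Table 5. $\delta\Delta G^{\circ}_{\mathrm{H^+}}$, $\delta\Delta G^{\circ}_{\mathrm{I^+}}$, and $\delta\Delta H^{\circ}_{\mathrm{Cl^+}}$ for Reaction 20 with $\mathrm{Y^+ = H^+, Cl^+}$, and $\mathrm{I^+}$
Substituent X
$\delta\Delta G^{\circ}_{\mathrm{H^+}}$
$\delta\Delta G^{\circ}_{\mathrm{I^+}}$
$\delta\Delta H^{\circ}_{\mathrm{Cl^+}}$
(1) 2-F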
+10.2
+6.8
(2) 2-CI
+6.6
+4.6
(3) 3-CI
+6.1
(4) 4-CI
+3.4
(5) 4-COMe
+3.4^
+1.6
(6) 4-C02Me
+2.2
+1.0
0
0
(7)H (8) 2-Me
-3.8
+3.7 +1.7
0 -2.2
(9) 3-Me
-2.7
-1.1
-2.2
(10)4-Me
-3.5
-1.4
-2.6
(ll)2-Et
^.5^
(12)3-Et
-3.5^
(13)4-Et
-4.3
(14)2-n-Pr
-5.2^
(15)4-i-Pr
-5.2
-2.2
(16)4-t-Bu
-5.8
-2.4
(17)4-OMe
-7.2
-3.2
(18)2,6-diMe
-6.7^
-A.2
(19)3,5-diMe
-5.1^
^.5
(20) 2,3-diMe
-5.9^
(21)2A6-triMe
-9.9^
(22) 2-OMe
-0.9^
Notes: ^a All values in kcal mol^-1. ^b From ref. 42 unless stated otherwise. ^c From ref. 40. ^d From ref. 41.
-3.1 -3.1 -1.7
-4.0 -3.7
-4.6 -606 +0.2
A detailed theoretical (ab initio) study of the bonds between first- and second-row organic bases and $\mathrm{H^+}$, $\mathrm{F^+}$, $\mathrm{Cl^+}$, and $\mathrm{Br^+}$ was published in 1991."^^ There it was shown, inter alia, that the bonds involving halogen cations are strong and largely covalent. Indeed, it had been shown experimentally that reaction 18 should be endothermal by at least 65 kcal mol^-1. It thus seems appropriate to compare ICB and CLCA values with the corresponding gas-phase basicities (GB) of the same compounds (GB values are the negatives of the standard Gibbs energy changes for reaction 15; see above). In Figure 2 we have plotted $\delta\Delta G^{\circ}_{\mathrm{I^+}}$ and $\delta\Delta H^{\circ}_{\mathrm{Cl^+}}$ against $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ for the unhindered pyridines, using the data given in Table 5. The quality of the correlations is exceptionally good, with standard deviations of fit of the size of the experimental error.
Figure 2. $\delta\Delta G^{\circ}_{\mathrm{I^+}}$ (squares) and $\delta\Delta H^{\circ}_{\mathrm{Cl^+}}$ (circles) for reactions 18 and 19 vs. $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ (in kcal mol^-1) for the protonation of pyridines.
The corresponding correlation equations are Eqs. 21 and 22:
$$\delta\Delta G^{\circ}_{\mathrm{I^+}} = (0.10 \pm 0.03) + (0.443 \pm 0.007)\,\delta\Delta G^{\circ}_{\mathrm{H^+}} \tag{21}$$
Values in kcal mol^-1; n = 10; r = 0.999; sd = 0.08
$$\delta\Delta H^{\circ}_{\mathrm{Cl^+}} = (0.45 \pm 0.12) + (0.73 \pm 0.22)\,\delta\Delta G^{\circ}_{\mathrm{H^+}} \tag{22}$$
Values in kcal mol^-1; n = 9; r = 0.997; sd = 0.36
The slopes of the correlations are smaller than one, that for $\delta\Delta H^{\circ}_{\mathrm{Cl^+}}$ being the largest, as expected on the basis of the theoretical study."^^ It is remarkable that the slope of $\delta\Delta G^{\circ}_{\mathrm{I^+}}$ against $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ is very close to that (0.49) for the correlation between $\delta\Delta G^{\circ}_{\mathrm{Li^+}}$ and $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ for the same compounds, lithium complexation being a predominantly electrostatic interaction (see above). It is known^-^ that the analysis of $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ in terms of $\sigma_F$, $\sigma_{R^+}$, and $\sigma_\alpha$ shows that the three contributions are of substantial size and statistically significant for 2-, 3-, and 4-substituted pyridines. Here, the database does not allow a detailed study. However, the difference in $\delta\Delta G^{\circ}_{\mathrm{I^+}}$ between 4-methyl- and 4-tert-butylpyridine can be taken as mostly reflecting differences in polarizability (see Table 5). Acetyl and methoxy groups have very nearly the same $\sigma_F$ parameters (0.26 and 0.25), but the $\delta\Delta G^{\circ}_{\mathrm{I^+}}$ values for the corresponding 4-substituted derivatives are, respectively, +1.6 and -3.2 kcal mol^-1. Inasmuch as the $\sigma_\alpha$ values (-0.55 and -0.17) would substantially favor the acetyl derivative, it follows that the stabilization by -R substituents is quite important ($\sigma_{R^+}$ = 0 for acetyl and -0.42 for methoxy). Obviously, the large deactivating effects of the acetyl and carbomethoxy groups reflect the influence of the field effect. It thus seems reasonable to infer that the relative importance of the various electric effects in $\mathrm{I^+}$ adducts (and likely in those involving $\mathrm{Cl^+}$) is quite similar to that in protonated pyridines. On the other hand, as shown in ref. 40, a broader comparison of $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ and $\delta\Delta G^{\circ}_{\mathrm{I^+}}$ displays a pattern of family dependence, originating, inter alia, in the softness^ of the halogen cation. Taft and Topsom^^ showed that $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ values for 2-substituted pyridines are amenable to a very accurate dissection in terms of $\sigma_F$, $\sigma_{R^+}$, and $\sigma_\alpha$, which indicates the absence of significant differential steric effects between the N-H bond and the various substituents.
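As a rough check of Eq. 21, the equation can be evaluated for those 3- and 4-substituted pyridines whose proton and iodine-cation values can be read as a clear pair in Table 5. This is only a sketch: the function and dictionary names are illustrative, and the selection of rows is an assumption about how the table is read, not the full data set (n = 10) used for the published fit.

```python
def eq21(ddg_h):
    """Eq. 21: predicted iodine-cation value (kcal/mol) from the proton value."""
    return 0.10 + 0.443 * ddg_h


# substituent: (proton value, iodine-cation value), kcal/mol, as read from Table 5
rows = {
    "3-Me": (-2.7, -1.1),
    "4-Me": (-3.5, -1.4),
    "4-i-Pr": (-5.2, -2.2),
    "4-t-Bu": (-5.8, -2.4),
    "4-OMe": (-7.2, -3.2),
    "4-COMe": (3.4, 1.6),
    "4-CO2Me": (2.2, 1.0),
}

residuals = {name: obs - eq21(ddg_h) for name, (ddg_h, obs) in rows.items()}
max_abs_residual = max(abs(v) for v in residuals.values())

for name, res in residuals.items():
    print(f"{name:8s} predicted {eq21(rows[name][0]):6.2f}  observed {rows[name][1]:5.1f}  residual {res:+.2f}")
```

For these rows the residuals stay within roughly 0.1 kcal/mol, in line with the quoted standard deviation of 0.08.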
Cooks' work"^^ showed that, with respect to $\mathrm{Cl^+}$, fluorine and chlorine in the 2 position do not lead to a significant steric interaction, i.e., their data points fall nicely on the line defined by Eq. 22. Other substituents, including Me (barely) and OMe, are off the line. Larger departures are observed for other groups and for multiple substitutions, as a consequence of steric and buttressing effects."^^^ These departures from the behavior predicted by Eq. 22 can be used to estimate steric effects for the various systems. Cooks did not strictly apply Eq. 22 but used a conceptually similar approach to determine a series of steric parameters $S^k$ for these substituents. These parameters are presented in Table 6. For comparison
Table 6. Steric Parameters $S^k$ and $S^{\circ}$ Derived from Reaction 19 and the Menshutkin Reaction in Solution

Substituent        $S^k$ ^a    $S^{\circ}$ ^b
H                   0          0
2-MeO              -0.84       -
2-Me               -0.43      -0.73
2-Et               -1.1       -1.08
2-n-Pr             -1.2       -1.20
2,3-dimethyl       -0.64      -0.92
2,6-dimethyl       -1.6       -1.98
2,4,6-trimethyl    -1.7       -1.28

Notes: ^a From ref. 41. ^b From ref. 45b,c.
purposes we include Gallo's'^^*''^ steric parameter $S^{\circ}$. This parameter is based on the kinetics of the Menshutkin reaction"^^ between substituted pyridines and methyl iodide and seems an excellent reference model. The two sets of values follow the same trend, with the exception of the 2-OMe substituent, which has a value of $S^k$ slightly smaller than expected. According to Cooks, this hints at a direct, through-space interaction between the incoming chlorine cation and the substituent."^^
E. The Power of LFER: Ionization of Brønsted Acids and the Discovery of "New" Substituents
Consider the dissociation of hydroxylic acids XOH in the gas phase, Eq. 23:
$$\mathrm{XOH(g) \rightarrow XO^-(g) + H^+(g)} \qquad \Delta G^{\circ}_{\mathrm{acid}} \tag{23}$$
The generalized proton transfer reaction of Eq. 24 gives the acidity-increasing effect, $-\delta\Delta G^{\circ}(\mathrm{g})$, of the substitution of X for an H atom of water,
$$\mathrm{XOH(g) + OH^-(g) \rightleftharpoons XO^-(g) + H_2O(g)} \qquad \delta\Delta G^{\circ}(\mathrm{g}) \tag{24}$$
with $-\delta\Delta G^{\circ}(\mathrm{g}) = \Delta G^{\circ}_{\mathrm{acid}}(\mathrm{H_2O}) - \Delta G^{\circ}_{\mathrm{acid}}(\mathrm{XOH})$. We report in Table 7 experimental values of $\Delta G^{\circ}_{\mathrm{acid}}$ for a set of 25 different OH acids, including alcohols, phenols, carboxylic acids, and inorganic species such as nitrous and nitric acids. These data are taken from ref. 47. Analysis of these data by Taft and coworkers"^^ leads to Eq. 25:
$$-\delta\Delta G^{\circ}(\mathrm{g}) = -(23.4 \pm 0.8)\,\sigma_\alpha + (73.4 \pm 0.9)\,\sigma_F + (72.8 \pm 2.0)\,\sigma_{R^-} \tag{25}$$
where n = 25, r = 0.999, and sd = 0.8 kcal mol^-1.
Table 7. Experimental Gas-Phase Acidities of X-OH Brønsted Acids^a,b,c

Acid                      X               $\Delta G^{\circ}_{\mathrm{acid}}$
(1) H2O                   H               384.5
(2) CH3OH                 CH3             374.0
(3) C2H5OH                C2H5            371.4
(4) C3H7OH                C3H7            369.4
(5) i-C3H7OH              i-C3H7          368.8
(6) t-C4H9OH              t-C4H9          368.0
(7) s-C4H9OH              s-C4H9          367.6
(8) CH3O(CH2)2OH          CH3O(CH2)2      366.8
(9) c-C6H11OH             c-C6H11         366.1
(10) t-C4H9CH2OH          t-C4H9CH2       366.0
(11) F(CH2)2OH            F(CH2)2         363.5
(12) C6H5CH2OH            C6H5CH2         363.4
(13) F2CHCH2OH            F2CHCH2         359.2
(14) CF3CH2OH             CF3CH2          354.1
(15) C6H5OH               C6H5            342.3
(16) CH3CO2H              CH3CO           340.7
(17) HCO2H                HCO             338.0
(18) (CF3)2CHOH           (CF3)2CH        338.3
(19) t-C4H9CH2OH          t-C4H9CH2       337.7
(20) C6H5CO2H             C6H5CO          332.6
(21) HNO2                 NO              330.1
(22) CF3CH2CO2H           CF3CH2CO        327.3
(23) (CF3)3COH            (CF3)3C         324.0
(24) HNO3                 NO2             317.1
(25) CF3CO2H              CF3CO           316.0

Notes: ^a All values in kcal mol^-1. ^b Values statistically corrected as needed. ^c All values from ref. 47.
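The acidity-increasing effect relative to water follows from Table 7 by simple subtraction. The short sketch below takes it as the gas-phase acidity of water minus that of XOH, so that a stronger acid gives a larger positive number; the dictionary holds a few entries copied from the table, and the names are illustrative.

```python
# Gas-phase acidities (kcal/mol) copied from selected rows of Table 7
DG_ACID = {
    "H2O": 384.5,
    "CH3OH": 374.0,
    "C2H5OH": 371.4,
    "C6H5OH": 342.3,
    "CH3CO2H": 340.7,
    "CF3CO2H": 316.0,
}


def acidity_gain(acid):
    """Acidity-increasing effect of X relative to H, in kcal/mol (positive = more acidic than water)."""
    return DG_ACID["H2O"] - DG_ACID[acid]


gain_ethanol = acidity_gain("C2H5OH")
gain_acetic = acidity_gain("CH3CO2H")
print(f"ethanol:     {gain_ethanol:.1f} kcal/mol")
print(f"acetic acid: {gain_acetic:.1f} kcal/mol")
print(f"difference:  {gain_acetic - gain_ethanol:.1f} kcal/mol")
```

On this scale ethanol lies about 13 kcal/mol and acetic acid about 44 kcal/mol above water, which is the gap that the three-term dissection of Eq. 25 sets out to explain.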
These authors also established that (i) the independent variables for this data set show a high degree of nonlinearity, (ii) the three variables are statistically significant, and (iii) Eq. 25 is quite "robust." This study sheds light on many important features of structural effects on acidities. A good example is the origin of the difference in acidity between alcohols (such as ethanol) and carboxylic acids (such as acetic acid). For this particular couple, the polarizability effects are very similar (12-13 kcal mol^-1), and the resonance contribution in acetic acid (12 kcal mol^-1) is large compared to that in ethanol (1 kcal mol^-1). What is most remarkable, however, is the fact that the electrostatic field/inductive effects are, respectively, 19 kcal mol^-1 and 0 kcal mol^-1. This very strong influence of the polarity of the carbonyl group has been nicely confirmed by careful quantum-mechanical studies by Streitwieser, Wiberg, and coworkers."*^ The enormous quantitative importance of field and resonance effects revealed by Eq. 25 guided the search for very strong neutral Brønsted acids. In a massive study, Koppel, Taft, Yagupolskii, and numerous coworkers"^^ reported the intrinsic acidity of 90 such compounds. Here we single out the effects of "superacceptor substituents," which were developed by Yagupolskii and his group. Formally, they originate in the replacement of an oxygen atom doubly bonded to S, P, and I systems by the =NSO2CF3 group. Some of these substituents are -S(O)(=NSO2CF3)CF3, -P(=NSO2CF3)(C3H7)2, and -I=NSO2CF3. The experimental $\Delta G^{\circ}_{\mathrm{acid}}$ of the aniline 4-CF3(O)(=NSO2CF3)SC6H4NH2 was reported in ref. 49 (see also ref. 50). It amounts to 313.4 ± 0.4 kcal mol^-1, which makes it some 34 p$K_a$ units more acidic than aniline itself and 13.1 p$K_a$ units more acidic than the most acidic previously measured substituted aniline, 4-CF3SO2C6H4NH2. $\sigma_F$ and $\sigma_{R^-}$ values (1.17 and 0.38) had been determined for the CF3(O)(=NSO2CF3)S substituent by means of the $^{19}$F NMR shifts of the corresponding 3- and 4-substituted fluorobenzenes in CH2Cl2 solution.^^'^^ Some years earlier, Eq. 26 had been set forth to describe substituent effects on the acidity of 4-(+R)-substituted anilines relative to aniline:
$$-\delta\Delta G^{\circ}_{\mathrm{acid}} = (0.4 \pm 0.2)\,\sigma_\alpha + (19.4 \pm 0.4)\,\sigma_F + (54.9 \pm 1.7)\,\sigma_{R^-} \tag{26}$$
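The field and resonance terms of Eq. 26 can be evaluated for the superacceptor substituent with the two constants quoted above (1.17 and 0.38). The sketch below leaves out the small polarizability term, because no polarizability constant is quoted for this substituent in the text; that omission is an assumption of the example, as is the temperature of 298.15 K used to convert the experimental relative acidity reported in the text (45.7 kcal/mol) into p$K_a$ units.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal mol^-1 K^-1
TEMPERATURE = 298.15   # K, assumed for the pKa conversion

SIGMA_F = 1.17         # field constant quoted for CF3(O)(=NSO2CF3)S
SIGMA_R_MINUS = 0.38   # resonance constant quoted for the same substituent

# Field and resonance terms of Eq. 26 (polarizability term omitted, see lead-in)
field_term = 19.4 * SIGMA_F
resonance_term = 54.9 * SIGMA_R_MINUS
partial_sum = field_term + resonance_term

# Conversion of a Gibbs energy difference into pKa units: 2.303 RT per unit
kcal_per_pk_unit = math.log(10) * R_KCAL * TEMPERATURE
pk_shift = 45.7 / kcal_per_pk_unit

print(f"field term      = {field_term:.1f} kcal/mol")
print(f"resonance term  = {resonance_term:.1f} kcal/mol")
print(f"sum (no polarizability term) = {partial_sum:.1f} kcal/mol")
print(f"45.7 kcal/mol corresponds to about {pk_shift:.1f} pKa units")
```

The two terms are of comparable size, which illustrates why both a very large field constant and a sizable resonance constant are needed to reach such acidities.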
Substitution of the above values for $\sigma_F$ and $\sigma_{R^-}$ into Eq. 26 leads to $-\delta\Delta G^{\circ}_{\mathrm{acid}}$ = 44.0 ± 1.5 kcal mol^-1 for 4-CF3(O)(=NSO2CF3)SC6H4NH2, compared to the experimental value of 45.7 ± 0.4 kcal mol^-1. Superacceptor substituents also have a strong influence on basicity. Notario and colleagues^"^ experimentally determined SE on the intrinsic basicity of 4-substituted pyrazoles. Their experimental results are given in Table 8. Let $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ and $\delta\Delta H^{\circ}_{\mathrm{H^+}}$, respectively, stand for the Gibbs energy and enthalpy changes for reaction 27 (for systems like these, $\delta\Delta G^{\circ}_{\mathrm{H^+}} = \delta\Delta H^{\circ}_{\mathrm{H^+}}$):
$$(\text{4-X-pyrazole-H})^+ + \text{pyrazole} \rightleftharpoons (\text{pyrazole-H})^+ + \text{4-X-pyrazole} \qquad \delta\Delta G^{\circ}_{\mathrm{H^+}},\ \delta\Delta H^{\circ}_{\mathrm{H^+}} \tag{27}$$
Equation 28 was found to hold:
$$-\delta\Delta G^{\circ}_{\mathrm{H^+}} = (3.74 \pm 0.78)\,\sigma_\alpha + (24.8 \pm 1.3)\,\sigma_F + (12.9 \pm 1.9)\,\sigma_{R^+} \tag{28}$$
n = 8; r = 0.991; sd = 1.0 kcal mol^-1
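The way Eq. 28 dissects a substituent effect can be illustrated for the nitro group. In this sketch the three substituent constants (-0.26, 0.65, and 0.00 for polarizability, field, and resonance) are the values read for NO2 from Table 1A; that reading of the flattened table is an assumption of the example, and the function name is illustrative.

```python
def eq28_terms(sigma_alpha, sigma_f, sigma_r_plus):
    """Polarizability, field, and resonance terms of Eq. 28 (kcal/mol)."""
    return (3.74 * sigma_alpha, 24.8 * sigma_f, 12.9 * sigma_r_plus)


# Substituent constants for NO2 as read from Table 1A (assumed reading)
NO2 = {"sigma_alpha": -0.26, "sigma_f": 0.65, "sigma_r_plus": 0.00}

polarizability, field, resonance = eq28_terms(**NO2)
minus_ddg = polarizability + field + resonance   # this sum is the negative of the relative value
predicted = -minus_ddg                           # relative Gibbs energy for reaction 27
experimental = -16.5                             # 4-nitropyrazole, Table 8

print(f"polarizability {polarizability:+.2f}, field {field:+.2f}, resonance {resonance:+.2f}")
print(f"predicted {predicted:.1f} vs experimental {experimental:.1f} kcal/mol")
```

Nearly the whole effect of the nitro group comes from the field term, and the prediction differs from experiment by an amount comparable to the quoted standard deviation of the fit.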
Table 8. Experimental and Calculated (AM1) Values of $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ and $\delta\Delta H^{\circ}_{\mathrm{H^+}}$ for Reaction 27^a,b

X                      $\delta\Delta G^{\circ}_{\mathrm{H^+}}$ (exp.)    $\delta\Delta H^{\circ}_{\mathrm{H^+}}$ (AM1)
NO2                    -16.5
F                       -7.2
Cl                      -5.8
CO2Et                   -3.0
H                       (0.0)       (0.0)
C6H5                     3.0
CH3                      3.1
1-Adamantyl              4.5
SOCF3(=NSO2CF3)                     -29.8
SO2CF3                              -23.3
t-C4F9                              -14.7
COCF3                               -14.5
CF3                                 -12.3
CN                                  -10.4
NO                                   -9.2
CHO                                  -7.5
CO2CH3                               -6.9
COCH3                                -6.1
Br                                   -5.0
OH                                   -3.0
OCH3                                 -0.5
C2H5                                  2.1
i-C3H7                                2.5
t-C4H9                                3.1
N(CH3)2                               4.6

Notes: ^a All values in kcal mol^-1. ^b From ref. 53.
A set of 17 $\delta\Delta H^{\circ}_{\mathrm{H^+}}$ values for other substituents was computed by means of the AM1 semiempirical method. They are also given in Table 8. Treatment of all the available SE (experimental and calculated) leads to Eq. 29:
$$-\delta\Delta H^{\circ}_{\mathrm{H^+}} = (2.4 \pm 1.0)\,\sigma_\alpha + (26.5 \pm 1.3)\,\sigma_F + (13.5 \pm 1.9)\,\sigma_{R^+} \tag{29}$$
n = 25; r = 0.911; sd = 1.9 kcal mol^-1
The coefficients of Eqs. 28 and 29 agree well within their limits of uncertainty. Furthermore, using the $\delta\Delta H^{\circ}_{\mathrm{H^+}}$ value for CF3(O)(=NSO2CF3)S, neglecting the polarizability contribution, and taking $\sigma_{R^+}$ = 0 for this (+R) substituent, a value of 1.15 was obtained for $\sigma_F$, in remarkable agreement with the value of 1.17 obtained by $^{19}$F NMR.
F. Structural Effects on the Stability of Carbocations
These are generally very reactive species. Gas-phase studies have provided (and are still providing) a wealth of information on their intrinsic thermodynamic stability. The position of reaction 30, determined by the standard Gibbs energy change, $\Delta G^{\circ}_{(30)}$, is of great chemical importance:
$$\mathrm{R_0^+(g) + R_i{-}H(g) \rightleftharpoons R_i^+(g) + R_0{-}H(g)} \qquad \Delta G^{\circ}_{(30)},\ \Delta H^{\circ}_{(30)} \tag{30}$$
The corresponding standard enthalpy change, $\Delta H^{\circ}_{(30)}$, also provides a very good estimate of this position, on account of the fact that $\Delta G^{\circ}_{(30)} \approx \Delta H^{\circ}_{(30)}$ for these hydride-exchange processes. An alternative process of great usefulness is the halide-exchange reaction 31:
$$\mathrm{R_0^+(g) + R_i{-}X(g) \rightleftharpoons R_i^+(g) + R_0{-}X(g)} \qquad \Delta G^{\circ}_{(31)},\ \Delta H^{\circ}_{(31)} \tag{31}$$
Generally, X = Cl, Br. Here again, one generally has $\Delta G^{\circ}_{(31)} = \Delta H^{\circ}_{(31)}$. The halide-exchange equilibrium method has some advantages over the hydride-transfer equilibrium, such as the well-defined position of the positive charge in the ion and the higher rate of equilibration.^'^"^^ Recently, using the capabilities offered by FT ICR spectrometry, the "dissociative proton-attachment method" (DPA) has been developed; it allows the indirect determination of $\Delta G^{\circ}_{(31)}$ under extremely mild conditions, well suited for strained or otherwise unstable ions.^^'^^ Notice that rankings according to these two reactions are linked through Eq. 32:
$$\mathrm{R_i{-}H(g) + R_0{-}X(g) \rightleftharpoons R_0{-}H(g) + R_i{-}X(g)} \qquad \Delta G^{\circ}_{(32)},\ \Delta H^{\circ}_{(32)} \tag{32}$$
This is an isodesmic process involving the neutral species only. In principle, $\Delta G^{\circ}_{(32)}$ and $\Delta H^{\circ}_{(32)}$ can be obtained from experimental thermochemical data for the neutral species (furthermore, $\Delta G^{\circ}_{(32)} = \Delta H^{\circ}_{(32)}$). In practice, this information is quite scarce. In general, however, $\Delta G^{\circ}_{(32)}$ and $\Delta H^{\circ}_{(32)}$ are very small. Whenever they are significant, they can be reliably computed by ab initio methods of relatively modest level.5« Other means are available to rank the stabilities of carbocations quantitatively. For example, equilibrium proton exchange between ethylenic compounds is ideally suited whenever isomerization and/or other processes cannot compete with proton exchange. Again, because the contributions from the neutral species are essentially constant, the rankings are practically identical. For example, the free energy change of reaction 33 is in excellent agreement with the corresponding value for the proton transfer equilibrium (34).'^^
$$\mathrm{C_6H_5CH(CH_3)Cl + \mathit{t}\text{-}C_4H_9^+ \rightleftharpoons C_6H_5CH(CH_3)^+ + \mathit{t}\text{-}C_4H_9Cl} \qquad \Delta G^{\circ} = -7.7\ \mathrm{kcal\ mol^{-1}} \tag{33}$$
Figure 3. Comparison between chloride ion affinities of carbocations and gas-phase basicities of the corresponding olefins. [Plot not reproduced; x-axis: $\Delta$GB of the corresponding olefins / kcal mol^-1; regression line: $\delta\Delta G^{\circ}$ = 1.03 $\Delta$GB (R = 0.998).]
$$\mathrm{C_6H_5CH{=}CH_2 + \mathit{t}\text{-}C_4H_9^+ \rightleftharpoons C_6H_5CH(CH_3)^+ + (CH_3)_2C{=}CH_2} \qquad \Delta G^{\circ} = -7.5\ \mathrm{kcal\ mol^{-1}} \tag{34}$$
The same results were generally observed for other carbocations, and there is indeed an excellent linear relationship between these quantities, as shown in Figure 3. This indicates that the chloride ion affinities of carbocations and the proton affinities of the corresponding olefins respond identically to substituent perturbation. Gibbs energy changes for proton and halide exchange can generally be determined within ±0.2 kcal mol^-1. Similar data obtained by means of DPA have inherent uncertainties of ca. ±2 kcal mol^-1. We consider below structural effects on two large families of carbonium ions. First, we examine substituent effects on the thermodynamic stability of benzylic, benzenium, and phenonium cations. This provides information on the role of $\pi$ delocalization. Next, we treat structural effects on the stability of bridgehead cations. This sheds light on the treatment of strain in these species.
Substituent Effect in Benzylic Carbocations
Substituent effect on stabilities of benzylic carbocations can be given by the Gibbs energy changes of the proton transfer and chloride ion transfer equilibria:^^'^^"'^^
[Scheme not reproduced: proton-transfer and chloride-ion-transfer equilibria of the benzylic systems; the legible substituent labels are R = CH3, H, CF3; R = H, Me, CF3; and R = H, CF3.]
As mentioned above, there is a good linear relationship, with a slope near unity, between the chloride-ion affinities of various carbocations and the gas-phase basicities of the olefins that give the corresponding carbocations on protonation. The chloride-ion affinities and the gas-phase basicities can therefore be combined to construct a single scale of relative stabilities of the carbocations. The intrinsic stability of the unsubstituted member of the respective series, including a vinyl cation, varies widely with the $\alpha$-substituent(s), as shown below ($\Delta\Delta G^{\circ}_{\mathrm{X=H}}$ in kcal mol^-1, in parentheses):
PhC+(Et)Me (-0.4) > PhC+(Me)2 (0.0) > PhCH+Me (5.2) > PhC+=CH2 (7.5) > PhCH2+ (12.0) > PhC+(CF3)Me (16.2) > PhCH+CF3 (19.5)
Figure 4 shows the plot of the relative stabilities of substituted benzyl cations against those of the corresponding $\alpha$-cumyl cations. This plot can be regarded as a gas-phase $\sigma^+$ plot. There is neither a simple linear relationship nor a monotonic curvature of the kind seen for the substituent effect on the solvolysis of this system.^ ^'^^ In this figure, a good linear relationship with a slope of unity is observed for meta substituents and para $\pi$-electron-withdrawing substituents, but all para $\pi$-donor substituents deviate significantly upward from the line of unit slope. The linear relationship with unit slope for nonconjugative substituents clearly suggests the same contribution of inductive/field effects in both systems. Therefore, the significant deviations of the para $\pi$-donor substituents must be due to a different contribution of the resonance effect in the two systems. The same pattern of LFER can be observed for the relative stabilities of 1-aryl-1-(trifluoromethyl)ethyl cations, shown in Figure 5. The upward deviations of the para $\pi$-donor substituents in these figures are systematic, i.e., the stronger the para $\pi$-donor substituent, the greater the deviation. This suggests that the resonance stabilization by para $\pi$-donor substituents must be greater in the benzyl cation and 1-aryl-1-(trifluoromethyl)ethyl cation systems than in the $\alpha$-cumyl cation. These trends are consistent with those observed for the gas-phase basicities of aromatic carbonyl compounds, as shown below.
Figure 4. Plot of the relative chloride ion affinities of substituted benzyl cations against the relative gas-phase basicities ($\Delta$GB, in kcal mol^-1) of the corresponding $\alpha$-methylstyrenes.
The Y-T Eq. 10 is equally applicable to the treatment of these substituent effects, as shown in Figures 6, 7, and 8. The correlation results for the stabilities of benzylic carbocations, given by well-behaved substituents, are summarized in Table 9 (refs. 29, 59-70). The resonance demand (r⁺) value varies significantly with substitution at the benzylic carbon, from 1.00 for the stable α-cumyl cation system to 1.53 for the highly electron-deficient 2,2,2-trifluoro-1-phenylethyl cation system. It is found that r⁺ increases as the stability (ΔΔG°) of the unsubstituted member of the respective series of benzylic carbocations decreases. Including an sp-hybridized carbocation, a vinyl cation, there is an excellent linear relationship between these two quantities, with a correlation coefficient of 0.997 and a standard deviation of ±0.02 (Eq. 35 and Figure 9):
\[ r^{+} = 0.0261\,\Delta\Delta G^{\circ} + 1.00 \tag{35} \]
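As a quick numerical check of Eq. 35, the following sketch evaluates the predicted r⁺ for each ΔΔG° entry of Table 9 and compares it with the tabulated r⁺. The function and variable names are illustrative assumptions and not part of the original analysis; only the numbers are taken from Table 9.

```python
# Illustrative check of Eq. 35 against Table 9 (all names here are hypothetical).

# (R1, R2, ddG in kcal/mol, tabulated r+) copied from Table 9;
# the last three rows are the 1-phenylvinyl cations.
TABLE_9 = [
    ("CF3", "H", 19.5, 1.53),
    ("CF3", "Me", 16.2, 1.41),
    ("H", "H", 12.2, 1.29),
    ("H", "Me", 4.9, 1.14),
    ("Me", "Me", 0.0, 1.00),
    ("Me", "Et", -0.4, 1.00),
    ("=CH2", "", 7.5, 1.18),
    ("=CH-CH3", "", 5.7, 1.12),
    ("=CH-CF3", "", 14.4, 1.39),
]


def r_plus_eq35(ddg):
    """Resonance demand predicted by Eq. 35 from the parent-cation ddG."""
    return 0.0261 * ddg + 1.00


def deviations(rows):
    """Return (R1, R2, predicted r+, predicted - tabulated) for each row."""
    out = []
    for r1, r2, ddg, r_obs in rows:
        pred = r_plus_eq35(ddg)
        out.append((r1, r2, pred, pred - r_obs))
    return out


if __name__ == "__main__":
    for r1, r2, pred, dev in deviations(TABLE_9):
        print(f"{r1:8s} {r2:3s} predicted r+ = {pred:.2f} (deviation {dev:+.3f})")
```

With these entries the predictions stay within a few hundredths of the tabulated r⁺ values, which is the order of the quoted standard deviation.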
This correlation clearly demonstrates that the resonance demand substantially varies with the intrinsic stability of a given carbocation, showing a continuous
Figure 5. Plot of gas-phase stabilities of 1-aryl-1-(trifluoromethyl)ethyl cations against those of the corresponding α-cumyl cations (x-axis: relative stabilities of α-cumyl cations / kcal mol⁻¹). Open circles: para π-donor substituents; closed circles: meta substituents.
Table 9. Results of the Y-T Analysis for Gas-Phase Stability of Benzylic Cations ArC(R¹)R²
R¹            R²     ΔΔG°ᵃ     ρᵇ               r⁺
CF3           H      19.5      -10.6 (-14.2)    1.53
CF3           Me     16.2      -10.0 (-13.7)    1.41
H             H      12.2      -10.3 (-14.0)    1.29
H             Me     4.9       -10.1 (-13.8)    1.14
Me            Me     0.0       -9.5 (-13.0)     1.00
Me            Et     -0.4      -9.5 (-13.0)     1.00
=CH2ᶜ                7.5       -10.3 (-14.0)    1.18
=CH-CH3ᶜ             5.7       -9.7 (-13.2)     1.12
=CH-CF3ᶜ             14.4      -9.9 (-13.5)     1.39
Notes: ᵃ In kcal mol⁻¹. Relative stabilities of the unsubstituted member of the respective series, based on free energy changes of proton-transfer or chloride ion-transfer equilibria. ᵇ Values in parentheses are obtained by multiplying the ρ of log K/K₀ by the factor 2.303RT/1000, i.e., in kcal mol⁻¹ σ̄⁻¹ units. ᶜ 1-Phenylvinyl cations.
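The unit conversion described in note b can be sketched numerically. The snippet below assumes T = 298.15 K, which the notes do not state, and the function names are illustrative only.

```python
import math

R_CAL = 1.987204      # gas constant in cal mol^-1 K^-1
T_ASSUMED = 298.15    # K; an assumption, the table notes do not give the temperature


def log_to_kcal_factor(temperature=T_ASSUMED):
    """2.303RT/1000: kcal/mol equivalent of one log10 unit of K."""
    return math.log(10) * R_CAL * temperature / 1000.0


def rho_in_kcal(rho_log, temperature=T_ASSUMED):
    """Convert a rho expressed in log(K/K0) units to kcal/mol per sigma unit."""
    return rho_log * log_to_kcal_factor(temperature)


factor = log_to_kcal_factor()
rho_cumyl_kcal = rho_in_kcal(-9.5)  # log-unit rho of the alpha-cumyl series in Table 9
```

Under this temperature assumption the factor is close to 1.36 kcal mol⁻¹ per log unit, so a ρ of -9.5 corresponds to roughly -13.0, as tabulated; small differences for other rows may reflect rounding in the source.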
Figure 6. The Y-T plot of gas-phase stabilities of substituted benzyl cations against σ⁺ (open circles), σ° (closed circles), and σ̄ with r⁺ = 1.29 (squares).
spectrum of the r⁺ values. This fact also suggests that the origin of the varying resonance demand is the intrinsic stability of the parent carbocations. In addition, the variation of the r⁺ value can be described by the σ° and Δσ̄_R⁺ substituent constants of the α-substituents (R¹ and R²) with satisfactory precision (R = 0.9992, SD = ±0.01), Eq. 36,
\[ r^{+} = 0.45\,\Sigma\sigma^{\circ} + 0.40\,\Sigma\Delta\bar{\sigma}_{R}^{+} + 1.28 \tag{36} \]
where Σσ° = σ°(R¹) + σ°(R²) and ΣΔσ̄_R⁺ = Δσ̄_R⁺(R¹) + Δσ̄_R⁺(R²). This result indicates that the r⁺ value, as well as the intrinsic stability of the parent carbocation, is affected by both field/inductive and π-electronic effects of the R¹ and R² substituents, in spite of the variation in the central carbon from primary to tertiary character. This correlation may also be of practical use for estimating the r⁺ value of a new system of unknown resonance demand. Furthermore, it was found that the r⁺ values are correlated linearly with theoretical parameters given by ab initio molecular-orbital calculations at the RHF/6-31G(d)
level, such as the charge (Mulliken population) on the para position of the phenyl ring and the Wiberg bond order or bond length of the Ph-C⁺ bond, which are associated with the concept of a resonance interaction. Thus, the r⁺ value has physical significance for characterizing the intrinsic nature of a carbocation itself. The π delocalization of the charge into the aryl π ring competes with the stabilization from the α-substituent(s). This conclusion is also consistent with the fact that the r⁺ value for the gas-phase stability of the conjugate acid of the R-substituted benzoyl (ArCOR) system decreases with increasing electron-donating ability of the R substituent, as discussed later.
Substituent Effects in Benzenium and Phenonium Cations
There are other important kinds of π-delocalized cationic systems, for example benzenium and phenonium ions, which intervene as intermediates in the
Figure 7. The Y-T plot of the intrinsic stabilities of 1-aryl-1-(trifluoromethyl)ethyl cations against σ⁺ (open circles), σ° (closed circles), and σ̄ with r⁺ = 1.41 (squares).
Figure 8. The Y-T plot of the relative stabilities of substituted phenylvinyl cations against σ⁺ (open circles), σ° (closed circles), and σ̄ with r⁺ = 1.18 (squares).
electrophilic aromatic substitutions and the solvolysis of 2-arylethyl systems via the neighboring phenyl group participation mechanism, respectively. The relative stabilities of these ions could also be determined from proton- or bromide ion-transfer equilibria.
The results of the Y-T analysis of the substituent effects are summarized in Table 10. The ρ values for both systems are significantly larger than those observed for ordinary benzylic carbocation systems, e.g., -9.5 for the α-cumyl cation. Such large
Figure 9. Plot of the r⁺ values against the relative stabilities of the unsubstituted members of the respective carbocation series (x-axis: relative stability of the parent carbocation / kcal mol⁻¹).
ρ values appear to be characteristic of the benzenium ion structure, which bears the positive charge in the phenyl ring itself. The r⁺ of 1.30 for the benzenium ion, higher than unity, reveals a large π delocalization of the positive charge into the para π-donor substituent. In contrast, the r⁺ value of 0.63 for the phenonium ion clearly indicates that the degree of π delocalization in the phenonium ion is intermediate between σ⁺ and σ° (Figure 10). This value, which is significantly smaller than that of the benzenium ion, may be attributed to its high stability due to the strong electron-releasing effect of the cyclopropane-like ring. Although this trend of the resonance demand is consistent with that for the benzylic carbocation system, the r⁺ values for the benzenium and phenonium ions do not fit the linear relationship between the r⁺ value and the stability of the parent carbocation shown in Figure 9. This is due to the framework of the π system, which is very different from that of the benzylic cation system.
Structural (Strain) Effects in Bridgehead Carbonium Ions
Table 11 presents the standard Gibbs energy changes for reaction 37, the bromide exchange between the 1-adamantyl cation (1-Ad⁺) and a variety of bridgehead (or heavily congested tertiary) bromides, R-Br. These values were recently obtained by means of the DPA technique and by direct bromide exchange:
Table 10. Results of the Y-T Analysis for Gas-Phase Stability of Phenonium Ion and Benzenium Ion
System           ΔG°(X=H)ᵃ    ρᵇ               r⁺
Phenonium ion    -2.4         -12.3 (-16.7)    0.63
Benzenium ion    12.7         -13.2 (-18.0)    1.30
Notes: ᵃ Relative to t-butyl cation. ᵇ Values in parentheses are obtained by multiplying the ρ of log K/K₀ by the factor 2.303RT/1000, i.e., in kcal mol⁻¹ σ̄⁻¹ units.
\[ \mathrm{R^{+}(g) + Ad{-}Br(g) \rightarrow R{-}Br(g) + Ad^{+}(g)} \qquad \Delta G^{\circ}_{(37)} \tag{37} \]
Until recently, much of our knowledge of carbenium ion stabilities was derived from solvolytic studies. The early empirical force-field calculations of Schleyer et al., correlating solvolytic reactivity with strain changes between bridgehead derivatives and the corresponding carbenium ions, suggest that the transition state
Figure 10. The Y-T plot of the relative stabilities of substituted phenonium ions against σ⁺ (open circles), σ° (closed circles), and σ̄ with r⁺ = 0.63 (squares).
Table 11. Standard Gibbs Energy Changes for Reaction 37 and Differential Strain Energy, ΔΔE_st, for Selected Bridgehead Speciesᵃ
Compound                            ΔG°(37)ᵇ,ᶜ    ΔΔE_stᵇ,ᵈ
(1) 1-bromoadamantane               (0.0)         (0.0)
(2) 1-bromobicyclo[2.2.2]octane     -8.1          6.50
(3) bromocubane                     -14.5         13.54
(4) 3-bromonoradamantane            -15.0         15.05
(5) 1-bromonorbornane               -24.3         19.96
Notes: ᵃ See text. ᵇ All values in kcal mol⁻¹. ᶜ From ref. 58. ᵈ From ref. 78.
for solvolysis should occur late on the reaction coordinate and resemble the carbenium ion with respect to structure and energy. Müller and coworkers compared the thermodynamic stabilities of a number of bridgehead carbocations (as defined by Eq. 37) with the difference in strain energy, ΔE_st(R⁺ - RBr), computed for each R⁺/RBr couple by means of the MM2 and UNICAT4 methods. The results were encouraging, but the gas-phase stability data available at that time were rather uncertain. We present in Figure 11 a somewhat unusual correlation, that of ΔE_st(R⁺ - RBr) - ΔE_st(1-Ad⁺ - 1-AdBr) = ΔΔE_st(R⁺ - RBr), taken from ref. 78, against ΔG°(37) from ref. 58. The correlation, Eq. 38, is:
\[ \Delta\Delta E_{st}(\mathrm{R^{+}} - \mathrm{RBr}) = (0.4 \pm 1.2) - (0.856 \pm 0.080)\,\Delta G^{\circ}_{(37)} \tag{38} \]
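The regression in Eq. 38 can be reproduced from the five entries of Table 11 with an ordinary least-squares fit. The sketch below is only a consistency check; the helper name `linear_fit` and the data layout are assumptions made for illustration.

```python
import math

# (compound, dG(37), ddE_st) in kcal/mol, copied from Table 11
TABLE_11 = [
    ("1-bromoadamantane", 0.0, 0.0),
    ("1-bromobicyclo[2.2.2]octane", -8.1, 6.50),
    ("bromocubane", -14.5, 13.54),
    ("3-bromonoradamantane", -15.0, 15.05),
    ("1-bromonorbornane", -24.3, 19.96),
]


def linear_fit(xs, ys):
    """Ordinary least squares: returns slope, intercept, r, and sd of fit."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sd = math.sqrt(ss_res / (n - 2))  # two fitted parameters
    return slope, intercept, r, sd


dg37 = [row[1] for row in TABLE_11]
dde_st = [row[2] for row in TABLE_11]
slope, intercept, r, sd = linear_fit(dg37, dde_st)
```

Because ΔΔE_st rises as ΔG°(37) becomes more negative, the fitted slope and correlation coefficient come out negative; the text quotes the magnitude of r.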
All values are in kcal mol⁻¹; n = 5; r = 0.987; sd = 1.4 kcal mol⁻¹. The correlation spans a range of nearly 30 kcal mol⁻¹ and strongly supports the importance of strain in determining the intrinsic stability of bridgehead carbonium ions. The slope of the linear regression is slightly smaller than unity. The 1-adamantyl cation, the least strained ion in this series, has a framework with 10 carbon atoms, and 1-norbornyl, the most strained one, has seven. Other ions in the correlation have an intermediate number. This suggests that, besides strain, there is a small contribution from the polarizability of the hydrocarbon framework to the differential stability of these cations.
G. SE on the Intrinsic Basicity of Carbonyl and Thiocarbonyl Compounds
Benzoyl Compounds
Substituent effects on the gas-phase basicities of benzoyl compounds are given by the standard Gibbs energy change for reaction 39, ΔG°(39). This family is
Figure 11. Differential strain energy ΔΔE_st vs. ΔG°(37) (x-axis: ΔG°(37) / kcal mol⁻¹).
particularly interesting because basicities in aqueous solution are also available for comparison.
[Reaction 39: proton-transfer equilibrium of the ring-substituted benzoyl compounds ArCOR, R = N(CH3)2, OCH3, CH3, H, CF3]    (39)
SE on ΔG°(39) (with the exception of R = CF3) have been successfully analyzed by means of Eq. 7. This treatment shows that polarizability contributions are very small, while field and resonance effects are quite large. Here we shall focus on the treatment of these systems by means of the Y-T Eq. 10. These systems are well suited to this approach because of the limiting structure II for the protonated species, which shows the "benzylic" character of these ions.
Chart 2. Limiting structures (I) and (II) of the protonated species.
The relative GB values of benzaldehydes are plotted in Figure 12 against those of the corresponding α-methylstyrenes as a σ⁺ plot in the gas phase, because the gas-phase stabilities of the α-cumyl cations could be linearly correlated with the ordinary σ⁺ values in solution, as mentioned above. This figure shows that there is a good linear relationship between the two systems, indicating that the stabilities of α-hydroxybenzyl cations can be described by σ⁺. The slope of 0.9 suggests that the response of the stability of the cation to the ring substituents is somewhat reduced in the α-hydroxybenzyl cation system compared with that in the α-cumyl cation. In contrast, the plot of the relative GB values of methyl benzoates versus α-methylstyrenes shows no simple linear relationship for the whole set of substituents (Figure 13). If limited only to nonconjugated substituents, meta substituents
Figure 12. Plot of the gas-phase basicities of benzaldehydes against relative stabilities of α-cumyl cations (x-axis: -δΔG° / kcal mol⁻¹).
Figure 13. Plot of the gas-phase basicities of methyl benzoates against relative stabilities of α-cumyl cations (x-axis: -δΔG° / kcal mol⁻¹).
and para π-electron acceptors, there exists a good linear relationship. All para π donors show negative deviations from this line. Similar behavior of these π donors is observed in a mutual comparison between two benzoyl compound series, as shown in Figure 14. The deviations of the para π donors in these figures are systematic, i.e., the stronger the para π-donor substituent, the greater the deviation, suggesting that the resonance-stabilization effects due to the para π-donor substituents vary with the system. The deviations of the p-tert-butyl, 3,5-dimethyl, and m-methyl groups shown in Figure 14 cannot be explained in terms of a different contribution of the resonance effect between the two series but may be interpreted in terms of the enhanced contribution of the polarizability effect in the more electron-deficient carbocation system compared with that in relatively stable carbocations. Excluding these particular substituents, the deviations of the para π-donor substituents are satisfactorily related to the resonance substituent constant, Δσ̄_R⁺ (= σ⁺ - σ°). In fact, the application of the Y-T Eq. 10 to these SE using the gas-phase substituent constants listed in Table 1B provided excellent correlations, as shown in Figures 15-18. Table 12 shows that the r⁺ value decreases widely, from 1.24 for R = CF3, higher than the value of unity for σ⁺, to 0.28 for R = NMe2, close to the r⁺ value involved
Figure 14. Plot of the gas-phase basicities of N,N-dimethylbenzamides against those of benzaldehydes (x-axis: -δΔG° / kcal mol⁻¹).
Table 12. Results of the Y-T Analysis for Gas-Phase Basicities of Aromatic Carbonyl Compounds, ArCOR
R       ρᵃ              r⁺      GBᵇ      Δσ̄_R⁺ᶜ    ΔSE(Ph)ᵈ
NMe2    -8.2 (-11.1)    0.29    213.8    -1.30      10.8
OMe     -8.2 (-11.1)    0.50    195.7    -0.70      14.8
Me      -8.5 (-11.5)    0.82    197.3    -0.20      19.3
H       -8.5 (-11.6)    1.06    192.1    0.00       27.8
CF3     -8.3 (-11.3)    1.24    184.4    0.00       29.1
Notes: ᵃ Values in parentheses are obtained by multiplying the ρ of log K/K₀ by the factor 2.303RT/1000, i.e., in kcal mol⁻¹ σ̄⁻¹ units. ᵇ Gas-phase basicity of the unsubstituted member of the respective series, in kcal mol⁻¹. ᶜ Resonance effect substituent constant of R in the gas phase. ᵈ Stabilization effect of the phenyl group given by [GB(PhCOR) - GB(HCOR)], in kcal mol⁻¹. The GB value of CF3CHO is 155.3 kcal mol⁻¹: Koppel, I. A.; Anvia, F.; Taft, R. W. J. Phys. Org. Chem., 1994, 7, 717-724.
Figure 15. The Y-T plot of gas-phase basicities of substituted acetophenones (x-axis: σ scale).
in σ (r⁺ = 0.27). The order of the decrease in the r⁺ value seems to be related to the electron-donating ability of the R group. Indeed, there is a good linear relationship between the r⁺ values and the differential GB values between PhCOR and HCOR, which measure the stabilization effect of the phenyl group on the cation HC⁺(OH)R (Figure 19). This result is consistent with the basic concept introduced by Yukawa and Tsuno that the r⁺ value is a measure of the π interaction between a positive charge and the phenyl ring. In conclusion, the charge formed at the benzylic position by the addition of a proton is stabilized through competitive π delocalization by the aryl group and the R group. In contrast to the high response of the r⁺ value to the variation of the R group, the ρ values are nearly constant in this system. Such constancy of the ρ value was also observed for a series of substituent effects on the GB of the α-substituted styrene system, PhC(R)=CH2. The identical ρ value within the homologous series suggests that the response of the stability of a cation to the polar effect of substituents is primarily determined by the distance between the charge center and a substituent.
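The linear dependence of r⁺ on the phenyl stabilization effect can be illustrated with the five entries of Table 12. The sketch below first recomputes ΔSE(Ph) for R = CF3 from the GB values quoted in the table and its notes, then fits a straight line; the names `fit_line` and `TABLE_12` are hypothetical.

```python
# Hypothetical sketch: r+ versus phenyl stabilization, data copied from Table 12.

# R: (r+, GB of PhCOR in kcal/mol, dSE(Ph) in kcal/mol)
TABLE_12 = {
    "NMe2": (0.29, 213.8, 10.8),
    "OMe": (0.50, 195.7, 14.8),
    "Me": (0.82, 197.3, 19.3),
    "H": (1.06, 192.1, 27.8),
    "CF3": (1.24, 184.4, 29.1),
}

GB_CF3CHO = 155.3  # kcal/mol, quoted in the notes to Table 12


def fit_line(xs, ys):
    """Least-squares slope and intercept of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# dSE(Ph) = GB(PhCOR) - GB(HCOR), shown here for R = CF3
dse_cf3 = TABLE_12["CF3"][1] - GB_CF3CHO

dse = [v[2] for v in TABLE_12.values()]
r_plus = [v[0] for v in TABLE_12.values()]
slope, intercept = fit_line(dse, r_plus)
```

With these data the fit gives a slope of roughly 0.05 per kcal mol⁻¹ and an intercept near -0.2, so each additional kcal mol⁻¹ of phenyl stabilization raises r⁺ by about 0.05.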
Figure 16. The Y-T plot of gas-phase basicities of substituted methyl benzoates.
Aliphatic and Alicyclic Carbonyl and Thiocarbonyl Compounds
We present in Table 13 the gas-phase basicities of 13 carbonyl and thiocarbonyl compounds, that is, the standard Gibbs energy changes for reactions 40 and 41. These data mostly originate in work by Abboud's and Gal's groups.
\[ \mathrm{[XC(=OH)Y]^{+}(g) \rightarrow XC(=O)Y(g) + H^{+}(g)} \qquad \Delta G^{\circ}_{\mathrm{H^{+}}}(\mathrm{CO}) \tag{40} \]
\[ \mathrm{[XC(=SH)Y]^{+}(g) \rightarrow XC(=S)Y(g) + H^{+}(g)} \qquad \Delta G^{\circ}_{\mathrm{H^{+}}}(\mathrm{CS}) \tag{41} \]
Figure 17. The Y-T plot of gas-phase basicities of substituted N,N-dimethylbenzamides.
Figure 20 is a plot of ΔG°_H⁺(CS) against ΔG°_H⁺(CO) for all the available data. The quality of the correlation is seen to be excellent. The breadth of structural effects involved (59.1 and 72.1 kcal mol⁻¹ for thiocarbonyl and carbonyl compounds, respectively) is possibly the largest ever reported for any LFER. Carbonyl compounds are known to protonate on the carbonyl oxygen in the gas phase. This LFER strongly suggests that the homologous thiocarbonyl compounds also have a constant basic center, namely the sulfur atom of the CS group. This is confirmed by ab initio calculations at a substantial level of theory. The slope of the correlation is close to 0.80, indicating that differential SE are 20% smaller in the thiocarbonyl series. Notice, however, that thiocarbonyl compounds are consistently more basic than their carbonyl homologues over the entire range of reactivity examined in this work. Isomerization problems preclude the study of many thiocarbonyl compounds, notably thials and thiones. This notwithstanding, ref. 82 reports a series of ΔH°_H⁺(CS) values computed at the MP2/6-31+G(d,p)//6-31G(d) + ZPE level for the protonation of the HC(=S)X series. With respect to thioformaldehyde, SE on ΔH°_H⁺(CS) follow Eq. 42 to a high degree of precision:
\[ \Delta H^{\circ}_{\mathrm{H^{+}}}(\mathrm{HCSX}) - \Delta H^{\circ}_{\mathrm{H^{+}}}(\mathrm{H_{2}CS}) = (19.0 \pm 1.6)\,\sigma_{\alpha} + (48.5 \pm 1.8)\,\sigma_{F} + (45.6 \pm 1.9)\,\sigma_{R^{+}} \tag{42} \]
n = 9; r = 0.998; sd = 1.1 kcal mol⁻¹. Here, as in the case of carbonyl compounds, the three effects are large and statistically significant. Field and resonance contributions are of the same order of
Figure 18. The Y-T plot of gas-phase basicities of substituted α,α,α-trifluoroacetophenones (x-axis: σ scale).
Figure 19. Plot of the r⁺ values against the stabilization effect of the phenyl group (x-axis: ΔSE(Ph) / kcal mol⁻¹; fitted line: r⁺ = 0.049 ΔSE(Ph) - 0.2).
Table 13. Gas-Phase Basicities (GB) for Thiocarbonyl and Carbonyl Compounds
Substituents                          GB (kcal mol⁻¹)ᵃ
X                 Y                   X(CO)Yᵇ     X(CS)Yᶜ
(1) N(CH3)2       N(CH3)2             214.2       218.1
(2) CH3           N(CH3)2             209.0       213.7
(3) NHCH3         NHCH3               208.3       213.2
(4) 1-C10H15      1-C10H15            205.5       209.4
(5) H             N(CH3)2             203.8       208.0 (207.9)ᵈ
(6) CH3O          N(CH3)2             201.9       205.7
(7) c-C3H5        c-C3H5              201.4       207.1
(8) NH2           NH2                 201.0ᵈ      205.1
(9) t-C4H9        t-C4H9              198.4       202.2
(10) camphor      thiocamphor         197.3       201.7
(11) CH3          OC2H5               191.4       197.0
(12) H            H                   162.3       177.0
(13) F            F                   142.1       159.0
Notes: ᵃ All values in kcal mol⁻¹. ᵇ Values from ref. 82. ᶜ Values from ref. 84. ᵈ Values from ref. 83.
importance. Notice that the formal analogy suggested by Figure 20 is somewhat misleading, since high-level ab initio calculations reveal that the charge redistributions undergone by protonated carbonyl and thiocarbonyl compounds are very different.
H. Solvent Effects on Selected Proton Transfer Equilibria
Benzoyl Systems
The relative basicities of the benzaldehydes and acetophenones in aqueous solution are plotted against the corresponding values in the gas phase in Figures 21 and 22, respectively. In an analysis of solvent effects, it is convenient to separate them into two classes, i.e., common solvent effects on the whole system and specific solvent effects arising from a specific interaction with a particular substituent. Since the latter correspond to solvent modifications of the substituents, in order to explore the solvent effects on the whole system it is reasonable to exclude such substituents, for example the hydroxyl group, from a comparative analysis of -δΔG°(aq) versus -δΔG°(gas). In fact, by excluding these substituents we find a good linear relationship in both Figures 21 and 22.
Figure 20. ΔG°_H⁺(CS) vs. ΔG°_H⁺(CO) for reactions 40 and 41 (x-axis: ΔG°_H⁺(CO) / kcal mol⁻¹).
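The correlation plotted in Figure 20 can be recomputed from the GB values of Table 13. The sketch below estimates the least-squares slope of the thiocarbonyl series against the carbonyl series, the span of each series, and whether every thiocarbonyl compound is more basic than its carbonyl homologue. Names such as `ls_slope` are illustrative assumptions, and the primary value (208.0) is used for entry 5.

```python
# (X, Y, GB of X(CO)Y, GB of X(CS)Y) in kcal/mol, copied from Table 13
TABLE_13 = [
    ("N(CH3)2", "N(CH3)2", 214.2, 218.1),
    ("CH3", "N(CH3)2", 209.0, 213.7),
    ("NHCH3", "NHCH3", 208.3, 213.2),
    ("1-C10H15", "1-C10H15", 205.5, 209.4),
    ("H", "N(CH3)2", 203.8, 208.0),
    ("CH3O", "N(CH3)2", 201.9, 205.7),
    ("c-C3H5", "c-C3H5", 201.4, 207.1),
    ("NH2", "NH2", 201.0, 205.1),
    ("t-C4H9", "t-C4H9", 198.4, 202.2),
    ("camphor", "thiocamphor", 197.3, 201.7),
    ("CH3", "OC2H5", 191.4, 197.0),
    ("H", "H", 162.3, 177.0),
    ("F", "F", 142.1, 159.0),
]

gb_co = [row[2] for row in TABLE_13]
gb_cs = [row[3] for row in TABLE_13]


def ls_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


slope = ls_slope(gb_co, gb_cs)                 # CS series against CO series
span_co = max(gb_co) - min(gb_co)              # breadth of the carbonyl series
span_cs = max(gb_cs) - min(gb_cs)              # breadth of the thiocarbonyl series
always_more_basic = all(cs > co for co, cs in zip(gb_co, gb_cs))
```

The slope comes out close to 0.80 and the spans match the 72.1 and 59.1 kcal mol⁻¹ quoted in the text, which supports the tabulated pairing of the carbonyl and thiocarbonyl values.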
The existence of such a linear relationship between the gas and aqueous solution phases suggests that there is a common set of substituent constants for the respective series in both phases. Table 14 shows the results of the analysis of these substituent effects in aqueous solution, using a larger number of substituents than in the present comparative analysis of -δΔG°(aq) versus -δΔG°(gas). The r⁺ values for the benzaldehyde and acetophenone series in aqueous solution practically agree with those for the corresponding gas-phase basicities, consistent with the graphical analysis described above. Considering the similarity of the electron-donating ability between the hydroxyl and methoxyl groups and between the NMe2 and NH2 groups, it is likely that the r⁺ values of the benzoic acid and benzamide series in aqueous solution are also identical to those in the gas phase. Consequently, the r⁺ value is essentially the same in aqueous solution and in the gas phase. That is, the degree of stabilization of the positive charge through π delocalization into the aryl ring relative to that by an inductive/field effect is independent of the solvation of the
Figure 21. Aqueous solution versus gas-phase basicities of substituted benzaldehydes (x-axis: -δΔG°(gas) / kcal mol⁻¹).
Figure 22. Aqueous solution versus gas-phase basicities of substituted acetophenones.
Table 14. Results of the Y-T Analysis of Basicities in Aqueous Solutionᵃ
       Benzamide         Benzoic Acid      Acetophenone      Benzaldehyde
ρᵇ     -1.243 (-1.67)    -1.146 (-1.56)    -2.200 (-2.99)    -1.764 (-2.37)
r⁺     0.36              0.55              0.76              1.16
Notes: ᵃ Taken from ref. 24. ᵇ Values in parentheses are obtained by multiplying the ρ of log K/K₀ by the factor 2.303RT/1000, i.e., in kcal mol⁻¹ σ̄⁻¹ units.
cation, and the r⁺ value is a function of the structure of the ion. In contrast, the ρ values of the solution basicities are remarkably smaller than those of the gas-phase basicities. This is easily explained by the effective dispersal of the positive charge of the ion to solvent molecules. In conclusion, the solvation of a cation reduces the central charge, and this lowers the response to substituent perturbation, essentially without changing the nature of the intramolecular charge delocalization.
Aliphatic and Alicyclic Carbonyl and Thiocarbonyl Compounds
Experimental evidence exists showing that most ketones, esters, amides, and ureas also protonate on the carbonyl oxygen in acidic solutions. The same is known to happen for the homologous thiono compounds. At variance with the gas-phase results, whenever a direct comparison can be carried out between the pKa values of the corresponding conjugate acids (as is the case for amides/thioamides), one finds that the thiocarbonyl compound is more basic by 1.5-2.0 pK units. This is a consequence of solvation effects (pKa values are referred to a standard state of pure water). The matter is discussed in detail in refs. 83, 85, 89.
I. Correlation between Carbocation Stability in the Gas Phase and Kinetics of Carbocation Formation Reactions in Solution
Solvolysis of Benzylic Substrates
\[ \mathrm{PhC(R^{1})(R^{2})L \xrightarrow{slow} PhC^{+}(R^{1})(R^{2}) \xrightarrow{fast} Product} \]
The ρ and r⁺ values for the SN1 solvolysis of a series of benzylic substrates are summarized in Table 15 (refs. 26, 72, 90-95). It is clear that the ρ values for the solvolysis are significantly reduced compared with those for the gas-phase carbocation stabilities. This is reasonably interpreted in terms of the solvent stabilization of the transition state and intermediate cation in the solvolysis. Most importantly, the r⁺ value for the SN1 solvolysis is found to be in complete agreement with that for the gas-phase stabilities of the corresponding benzylic carbocations.
Table 15. Results of the Y-T Analysis for the Solvolysis of the Benzylic Substrates ArC(R¹)(R²)L
                              Solvolysis
R¹        R²                  ρ         r⁺       r⁺ (gas)
CF3       H                   -6.05     1.53     1.53
CF3       Me                  -6.29     1.39     1.41
H         H                   -5.20     1.30     1.29
H         Me                  -5.45     1.15     1.14
Me        Me                  -4.59     1.00     1.00
Me        Et                  -4.69     1.04     1.00
=CH2ᵃ                         -4.10     1.20     1.18
CH2CH2 (kΔ-process)ᵇ          -3.87     0.63     0.63
Notes: ᵃ 1-Phenylvinyl tosylates. ᵇ 2-Phenylethyl tosylates.
Since the solvation of a cation reduces the central charge and thereby lowers the response to substituent perturbation, essentially without changing the magnitude of the r⁺ value, as noted already, the identity of the r⁺ value between the carbocation stabilities and the solvolysis rates means that the degree of charge delocalization in the rate-determining transition state of the solvolysis is very close to that of the carbocation intermediate. This result provides important information for the analysis of substituent effects in solvolysis. The extremely large r⁺ value of 1.53, observed for the solvolysis of 1-aryl-2,2,2-trifluoroethyl tosylates, is not a correlational artifact, but must be the resonance demand reflecting a highly electron-deficient cationic transition state of the limiting SN1 ionizing process, in the same manner as that of the solvolysis of ordinary benzylic substrates to give relatively stable carbocations. Similarly, the exalted r⁺ value of 1.3 obtained for the solvolysis of benzyl tosylates with electron-donating substituents is not a correlational artifact arising from the nonlinearity caused by the k_c-k_s mechanistic transition, as suggested by Shorter, but must be an intrinsic feature characterizing the nature of the transition state of the k_c solvolysis of benzyl substrates. The less stable primary benzyl cation should have an inherent resonance demand distinctly higher than the value of r⁺ = 1.0 of the tertiary α-cumyl cation system. Furthermore, the r⁺ value of 0.63 for the phenonium ion is also in complete agreement with the value observed for the corresponding solvolysis via a phenonium ion intermediate. The intermediate r⁺ value is characteristic of its unique bridged structure.
The agreement of the r⁺ value between the cationic transition state and the intermediate cation for all series of benzylic systems, including the phenonium ion and phenylvinyl cations, leads us to the conclusion that the geometry of the transition state in the ionizing process of the SN1 solvolysis, which is a highly
endothermic reaction, closely resembles the high-energy product, an intermediate cation. Clearly, these results have confirmed that the r⁺ value is an inherent characteristic of the carbocation structure itself. Thus, the intrinsic behavior of carbocations in the gas phase provides an important basis for a better understanding of the real features of the transition state of organic reactions in solution.
Acid-Catalyzed Hydration Reactions of Olefins
Acid-catalyzed hydration of a carbon-carbon double or triple bond, reaction 43, is an alternative route to generate a carbocation intermediate in solution.
\[ \mathrm{PhC(R){=}CH_{2} + H^{+} \xrightarrow{slow} PhC^{+}(R)CH_{3} \xrightarrow{fast} Product} \tag{43} \]
The results of the Y-T analysis of the substituent effects on the acid-catalyzed hydration of the styrene and phenylacetylene substrates in acidic media are summarized in Table 16. The ρ values, as large as those of the ordinary benzylic SN1 solvolysis, are consistent with the currently accepted mechanism of rate-determining formation of a benzylic carbocation. In contrast, the r⁺ values for the
Table 16. Results of the Y-T Analysis for Acid-Catalyzed Hydration of Double Bond and Triple Bondᵃ
System            ρ(hyd)      r⁺(hyd)    r⁺(gas)    r⁺(hyd)/r⁺(gas)
PhCH=CH2          -3.11ᵇ      0.80ᵇ      1.14       0.83
                  -3.94ᶜ      0.70ᶜ                 0.61
                  -3.30ᵈ      0.79ᵈ                 0.69
                  -3.56ᵉ      0.94ᵉ                 0.82
                  -5.45ᶠ      0.67ᶠ                 0.59
PhC(Me)=CH2       -3.36ᵍ      0.74ᵍ      1.00       0.74
PhC(CF3)=CH2      -4.77ʰ      1.15ʰ      1.41       0.82
PhC≡CH            -4.30ⁱ      0.87ⁱ      1.18       0.74
                  -4.20ʲ      0.92ʲ                 0.78
Notes: ᵃ Calculated using data in the literature. In aq H2SO4 at 25 °C unless otherwise noted. ᵇ Ref. 97a. ᶜ Ref. 97b. ᵈ Ref. 97c. ᵉ In HClO4 at 25 °C, Ref. 97d. ᶠ The addition of CF3COOH in CCl4, Ref. 97e. ᵍ Ref. 97f. ʰ Ref. 97g. ⁱ In acetic acid-water-sulfuric acid at 50.2 °C, Ref. 97h. ʲ Ref. 97i.
hydration are noticeably smaller than those of the corresponding cations in the gas phase and of the solvolysis. Although the data used for the present correlation involve only a few substituents, the small r⁺ value seems unlikely to be a correlational artifact, because the reduction of the r⁺ value is observed for all substrates. The disagreement of the r⁺ value between the hydration rates and the gas-phase carbocation stabilities or solvolysis rates therefore suggests that the structure of the transition state of the acid-catalyzed hydration is appreciably different from that of the corresponding stable cationoid intermediates or SN1 transition states with respect to π delocalization of the positive charge at the reaction center. These results demonstrate that the Yukawa-Tsuno equation is applicable to the gas-phase substituent effects on the intrinsic stabilities of benzylic cations in exactly the same manner as to the solution-phase substituent effects.
Solvolysis of Bridgehead Derivatives
We report in Table 17 the standardized rates of solvolysis (as Δlog k values in 80% EtOH at 70 °C, relative to 1-adamantyl p-toluenesulfonate) of the tosylates of a group of bridgehead and heavily hindered tertiary groups. The thermodynamic stabilities of the corresponding carbocations, as defined by Eq. 37, are also given. Figure 23 is a plot of Δlog k against ΔG°(37). The correlation spans 23 log units for k. Taking into account that at 70 °C one order of magnitude in rate constants corresponds to 1.57 kcal mol⁻¹ in Gibbs energy of activation, this amounts to 36.1 kcal mol⁻¹ and almost 50 kcal mol⁻¹ in Gibbs energy of bromide exchange. It covers
Table 17. Experimental Values of ΔG°(37) and Δlog k(solv)
Compound                                        ΔG°(37)ᵃ    Δlog k(solv)ᵇ
(1) 2-tert-butyl-2-bromoadamantane              15.9        8.8
(2) 9-tert-butyl-9-bromobicyclo[3.3.1]nonane    15.1        8.6
(3) 1-bromoadamantane                           0.0         0.0
(4) 1-bromobicyclo[2.2.2]octane                 -8.1        -3.6
(5) 4-bromohomocubane                           -10.6       -5.9ᶜ
(6) bromocubane                                 -14.5       -7.3
(7) 3-bromonoradamantane                        -15.0       -6.9
(8) 1-bromohomocubane                           -23.7       -11.0ᶜ
(9) 1-bromonorbornane                           -24.3       -10.1
(10) 6-bromotricyclo[3.2.1.0³,⁶]octane          -29.6       -13.9ᶜ
Notes: ᵃ In kcal mol⁻¹. ᵇ Relative to 1-bromoadamantane. ᶜ Extrapolated from triflate solvolysis.
Figure 23. Differential effects on solvolysis rates, Δlog k(solv) vs. ΔG°(37) (x-axis: ΔG°(37) / kcal mol⁻¹). Regression shown in the plot: r = 0.9957; slope = 0.492 (0.016); intercept = 0.55 (0.29); sd = 0.77.
practically the full experimental rate range for solvolytic bridgehead reactivities, including the previously inaccessible 1-homocubyl (8⁺), 1-norbornyl (9⁺), and 6-tricyclo[3.2.1.0³,⁶]octyl (10⁺) cations. To our knowledge, this is the widest range ever reported for a correlation of gas-phase data and solution kinetics. The correlation coefficient (0.996) and the standard deviation of fit (0.77 on log k) are very satisfactory. The slope of the correlation between log k and the ion stabilities (-0.49) implies that 77% of the energy difference between the bromides and the respective cations is expressed in the rates of solvolysis. This slope compares nicely with that of -0.39 relating log k with strain changes between R⁺ and R-Br. The self-consistency of all these results fully supports the basic mechanistic concepts on bridgehead solvolysis.
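The arithmetic behind these figures can be checked with a short sketch. It assumes only the gas constant and the magnitude of the fitted slope quoted with Figure 23 (0.492 log units per kcal mol⁻¹); the function and variable names are illustrative and are not part of the original work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def kcal_per_log_unit(temp_celsius):
    """Gibbs energy of activation equivalent to one decade in k, i.e. 2.303*R*T."""
    return math.log(10) * R_KCAL * (temp_celsius + 273.15)

per_decade = kcal_per_log_unit(70.0)  # energy per log unit at 70 C
rate_span = 23 * per_decade           # the 23 log units of k, in kcal/mol
slope = 0.492                         # magnitude of the fitted slope (Figure 23)
fraction = slope * per_decade         # share of the stability difference seen in the rates

print(f"{per_decade:.2f} kcal/mol per log unit")
print(f"{rate_span:.1f} kcal/mol across 23 log units")
print(f"{100 * fraction:.0f}% of the ion-stability difference")
```

Multiplying the slope by 2.303RT converts it from log units per kcal mol⁻¹ into a dimensionless energy ratio, which is how the 77% figure in the text arises.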
IV. REACTIONS INVOLVING NEUTRAL REAGENTS AND PRODUCTS
A. Experimental Considerations
Kinetic experiments on the gas-phase pyrolysis or elimination of neutral organic molecules may lead to complicated interpretations and erroneous Arrhenius parameters unless special precautions are taken, such as seasoning the reaction vessel and, in most cases, working in the presence of a free-radical inhibitor. In the following sections only homogeneous gas-phase processes are considered. The literature coverage is careful but by no means exhaustive. Previous studies are briefly reviewed and reexamined from the standpoint of the Taft-Topsom model.
B. Esters
The mechanism generally accepted for the gas-phase pyrolysis of esters of carboxylic acids may be represented as in reaction 44:
[Reaction 44: ester → six-membered cyclic transition state → carboxylic acid + olefin]   (44)
For molecular cis elimination, the presence of a β-hydrogen at the alkyl moiety of the ester is necessary. Excellent reviews have accounted for the substituent effects in several series of aliphatic and aromatic carboxylic esters.
Substituents in Aliphatic Systems
β-Substituted ethyl acetates: CH3COOCH2CH2Z. The pyrolysis of acetates with alkyl and polar substituents separated from the $C_\alpha$-O bond by at least three methylene groups (Table 18, 1-16) was considered to be subject to a slight steric acceleration. The best approximately linear correlation was obtained by plotting log $k/k_0$ against Hancock's steric parameter $E_s^c$ ($\delta$ = -0.12, r = 0.916, at 400 °C). Electron-withdrawing substituents Z directly attached to the β-carbon of ethyl acetate reduced the pyrolysis rate according to their electronegative character (Table 18, 1, 2, 18-23). A linear correlation of log $k/k_0$ versus Taft's original inductive effect parameter, $\sigma^*$, was obtained with a $\rho^*$ value of -0.19 (r = 0.961) at 400 °C. Likewise, plotting log $k/k_0$ against $\sigma_I$ values gave an approximately linear relationship with a slope $\rho_I$ = -1.03 (r = 0.960) at 400 °C. Note that although $\sigma^*$ essentially reflects field/inductive effects, it also includes a small but significant resonance effect. The negative slope of the lines suggested, in both cases, a transition state somewhat deficient in electrons.
Table 18. Kinetic Parameters for ZCH2CH2OAc Pyrolysis, at 400 °C

Z                        Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)   10⁴k_H (s⁻¹)
(1) H                        200.4          12.55         10.00          3.33
(2) CH3                      204.1          12.77          8.51          4.26
(3) CH3CH2                   199.5          12.50         10.47          5.24
(4) CH3CH2CH2                194.1          12.20         13.80          6.90
(5) (CH3)2CH                 202.5          12.73         10.23          5.12
(6) CH3CH2CH2CH2             200.8          12.54          9.12          4.56
(7) (CH3)3C                  194.1          12.34         19.05          9.53
(8) CH3CH2CH(CH3)            211.9          13.62         15.31          7.66
(9) (CH3)2CHCH2              203.1          12.82         11.86          5.93
(10) c-C6H11                 207.4          13.20         14.44          7.22
(11) c-C5H9                  208.1          13.30         13.30          6.65
(12) CH3OCH2                 203.3          12.69          8.13          4.07
(13) C6H5OCH2                198.0          12.52         14.22          7.11
(14) C6H5CH2                 203.1          12.80         10.91          5.46
(15) CH3COCH2                198.9          12.74         20.14         10.07
(16) CH3OCH2CH2              209.5          13.37         12.88          6.44
(17) (CH3)3SiCH2ᵃ            189.6          12.49         58.88         29.44
(18) F                       211.2          12.68          1.95          0.98
(19) Cl                      202.0          12.14          2.88          1.44
(20) CH3O                    199.9          11.96          2.82          1.41
(21) CH3CH2O                 200.4          12.09          3.47          1.74
(22) C6H5O                   206.6          12.50          2.95          1.48
(23) (CH3)2N                 220.4          13.90          6.17          3.09
(24) CH3S                    179.0          11.27         23.99         12.00
(25) CH2=CH                  200.8          13.20         41.69         20.85
(26) CH≡C                    197.3          13.12         64.57         32.28
(27) C6H5                    191.6          12.48         41.34         20.67
(28) NC                      171.9          11.51        147.9          73.95
(29) CH3CO                   153.9          10.90        912.0         456.0
(30) (CH3)3Siᵇ               175.4          12.19        380.2         190.1
(31) (CH3CH2)3Siᵇ            173.5          12.17        501.2         250.6
(32) (CH3CH2)3Geᵇ            178.0          12.35        338.8         169.4
(33) C6H5(CH3)2Siᵇ           174.7          12.19        426.6         213.3
(34) CH3SCH2CH2              192.1          12.30         24.55         12.27

Notes: ᵃ Values taken from ref. 99b. ᵇ Values taken from ref. 101.
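The rate coefficients in Table 18 follow from the tabulated Arrhenius parameters. The sketch below recomputes two of them as a consistency check; the helper name `arrhenius_k` is illustrative, and the division by three for ethyl acetate assumes that the last column is a per-β-hydrogen rate coefficient.

```python
import math

R_KJ = 8.314462e-3  # gas constant, kJ mol^-1 K^-1

def arrhenius_k(log_a, ea_kj, temp_celsius):
    """Rate coefficient (s^-1) from log A and Ea (kJ/mol) at a Celsius temperature."""
    temp_k = temp_celsius + 273.15
    return 10 ** (log_a - ea_kj / (math.log(10) * R_KJ * temp_k))

# Arrhenius parameters read from Table 18 (400 C)
k_h = arrhenius_k(12.55, 200.4, 400.0)  # entry 1, Z = H (ethyl acetate)
k_f = arrhenius_k(12.68, 211.2, 400.0)  # entry 18, Z = F

# Ethyl acetate has three equivalent beta-hydrogens (assumed statistical factor).
k_h_per_hydrogen = k_h / 3

print(f"Z = H: 1e4*k = {1e4 * k_h:.2f}, per beta-H = {1e4 * k_h_per_hydrogen:.2f}")
print(f"Z = F: 1e4*k = {1e4 * k_f:.2f}")
```

The same function can be pointed at any row of the later tables, provided the table's own temperature is used.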
π-Bonded substituents at the β-carbon caused a very large increase in rates (Table 18, 25-29), due to a resonance effect. Moreover, the β-organometallic substituents were found to strongly accelerate the elimination process (Table 18, 30-33) because of a combination of increased acidity of the β-hydrogen, stabilization of the incipient positive carbon by carbon-metal hyperconjugation, and steric acceleration (ref. 101). Given the large size of the data base of substituents, it is interesting to examine their effects by using the Taft-Topsom treatment of substituent effects, Eqs. 7 and 8. In this case, Eq. 45 is obtained:

$$\log (k/k_0) = -(0.450 \pm 0.041)\,\sigma_\alpha - (1.29 \pm 0.11)\,\sigma_F \tag{45}$$

At 400 °C, r = 0.959 and sd = 0.086 (Table 18, 1-10, 12, 14, 16, 18-23). Substituent parameter values for 11, 13, 15, and 17 were not available, and the CH3S group, as already described, assists the elimination process anchimerically. Consequently, they were not included in the treatment. The negative value of $\rho_\alpha$ indicates that the elimination reaction is favored by the polarizability of the β-substituent Z, while the size of the negative $\rho_F$ suggests stabilization of the transition state by the field/inductive effect. The influence of $\sigma_R$, as $\sigma_R^+$ or $\sigma_R^-$, is insignificant. The series of +R substituents (Table 18, 1, 26, 28-30) yielded Eq. 46:

$$\log (k/k_0) = -(1.81 \pm 0.02)\,\sigma_\alpha - (0.38 \pm 0.03)\,\sigma_F + (7.34 \pm 0.12)\,\sigma_R^- \tag{46}$$

At 400 °C, r = 0.999, sd = 0.015. This result implies appreciable polarizability and resonance effects on the rates. The high quality of the correlation is not an artifact due to the use of three parameters with a limited set of data. Indeed, the use of two parameters (excluding the small $\rho_F$ term) leads to an excellent correlation (Eq. 47):

$$\log (k/k_0) = -(1.74 \pm 0.13)\,\sigma_\alpha + (6.70 \pm 0.74)\,\sigma_R^- \tag{47}$$

At 400 °C, r = 0.995, sd = 0.101. Phenyl and vinyl substituents were not included due to lack of coplanarity with the reaction center. No parameters are available for substituents 31-34 of Table 18. It is interesting that these satisfactory correlations with $\sigma_\alpha$ do not contradict previous regression equations involving steric parameters, the reason being that, at least for alkyl groups, $\sigma_\alpha$ and the $E_s$, $E_s^c$, and $\upsilon$ parameters are significantly correlated. At this point, it is difficult to ascertain whether the physical contribution arises from one of these two effects or a combination thereof.
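To make the use of such multiparameter equations concrete, the following sketch evaluates the Taft-Topsom form with the coefficients of Eqs. 45 and 47. The substituent constants passed in are hypothetical placeholders chosen only to show the arithmetic; they are not values from Table 1A, and the function name is illustrative.

```python
def taft_topsom_log_ratio(sigma_alpha, sigma_f, sigma_r=0.0,
                          rho_alpha=-0.450, rho_f=-1.29, rho_r=0.0):
    """log(k/k0) as a sum of polarizability, field and resonance terms.

    The default rho values are the coefficients of Eq. 45 (no resonance term).
    """
    return rho_alpha * sigma_alpha + rho_f * sigma_f + rho_r * sigma_r

# Hypothetical substituent constants, for illustration only.
alkyl_like = taft_topsom_log_ratio(sigma_alpha=-0.50, sigma_f=0.00)
polar_like = taft_topsom_log_ratio(sigma_alpha=-0.20, sigma_f=0.45)

# Two-parameter form of Eq. 47 (polarizability and sigma_R^- only).
eq47_example = taft_topsom_log_ratio(sigma_alpha=-0.46, sigma_f=0.0, sigma_r=0.10,
                                     rho_alpha=-1.74, rho_f=0.0, rho_r=6.70)

print(f"alkyl-like: log(k/k0) = {alkyl_like:.3f}, k/k0 = {10 ** alkyl_like:.2f}")
print(f"polar-like: log(k/k0) = {polar_like:.3f}, k/k0 = {10 ** polar_like:.2f}")
print(f"Eq. 47 form: log(k/k0) = {eq47_example:.3f}")
```

With negative $\rho_\alpha$ and $\rho_F$, a more negative $\sigma_\alpha$ (more polarizable group) raises the rate and a positive $\sigma_F$ lowers it, which is the qualitative reading given in the text.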
α-Substituted ethyl acetates: CH3COOCH(Z)CH3. α-Alkyl substitution (Table 19, 1-9) enhanced the elimination rates of these acetates, this effect being attributed to steric acceleration. The correlations obtained when plotting log $k/k_0$ against Taft's steric parameter $E_s$ ($\delta$ = -0.21, r = 0.858 at 320 °C) and Charton's $\upsilon$ values ($\psi$ = 0.46, r = 0.842 at 320 °C) were rather modest in quality. They showed, however (within the experimental uncertainties of product distribution analyses), that the bulkier the α-alkyl group, the larger the k value. This is reasonable, because the hybridization change at both the $C_\alpha$ and $C_\beta$ atoms from sp³ to sp² releases the steric interactions between the substituents of these C atoms. The effect of electron-withdrawing substituents directly attached to the α-carbon was believed to be electronic in nature. Thus, plots of log $k/k_0$ versus $\sigma^*$ or $\sigma_I$ values approximate straight lines, indicating that the field/inductive effect has a significant influence on elimination rates ($\rho^*$ = -0.32, r = 0.878 and $\rho_I$ = -2.18, r = 0.898, at 320 °C).
Table 19. Kinetic Parameters for ZCH(OAc)CH3 Pyrolysis, at 320 °C

Z                          Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)   10⁴k (1-olefin) (s⁻¹)
(1) CH3                        193.8          13.42          2.24          1.12
(2) CH3CH2                     197.4          13.70          2.06          1.16
(3) (CH3)2CH                   190.6          13.12          2.15          1.63
(4) (CH3)3C                    184.3          12.54          2.03          2.03
(5) (CH3)3CCH2                 181.2          12.87          8.14          2.52
(6) CH2=CHCH2                  178.2          12.34          4.42          1.10
(7) CH3CH2CH2                  182.8          12.73          4.27          1.98
(8) CH3CH2CH(CH3)              180.7          12.60          4.84          3.37
(9) c-C3H5                     176.9          12.19          4.07          2.12
(10) CH2=CH                    174.9          11.63          1.68          1.68
(11) cis-trans-CH3CH=CH        183.6          13.11          8.70          8.70
(12) C6H5                      182.8          12.75          4.47          4.47
(13) CH3COCH2                  156.4          11.88        127.4           --
(14) CH3OCH2                   194.9          13.05          0.77          0.44
(15) CH3CO                     202.7          13.40          0.35          0.35
(16) COOCH3                    209.5          13.45          0.10          0.10
(17) Cl3C                      193.7          12.12          0.11          0.11
(18) ClCH2                     197.4          12.95          0.37          0.24
(19) FCH2                      197.8          12.83          0.26          0.19
(20) NC                        203.3          12.88          0.09          0.09
(21) (CH3)2NCH2                185.9          12.66          1.94          1.19
(22) C6H5CH2                   180.0          12.53          4.75          1.10
(23) C6H5CH2CH2                179.8          12.33          3.12          0.44
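Because the discussion leans on product distribution, it can help to express the two rate columns of Table 19 as a product share. The sketch below assumes that the last column is the partial rate coefficient for 1-olefin formation and that the third column is the overall rate coefficient; that reading of the headers, and the names used, are assumptions made for illustration.

```python
# (overall 1e4*k1, 1e4*k for 1-olefin), values read from Table 19 at 320 C
table19 = {
    "(CH3)3C": (2.03, 2.03),     # entry 4: no beta-hydrogen on the tert-butyl side
    "(CH3)3CCH2": (8.14, 2.52),  # entry 5
    "C6H5CH2": (4.75, 1.10),     # entry 22
}

def one_olefin_share(k_total, k_one_olefin):
    """Fraction of the overall elimination attributed to 1-olefin formation."""
    return k_one_olefin / k_total

shares = {z: one_olefin_share(*ks) for z, ks in table19.items()}
for z, share in shares.items():
    print(f"Z = {z}: {100 * share:.0f}% 1-olefin")
```

Under this reading, a substituent with no β-hydrogen of its own gives the 1-olefin exclusively, while the neopentyl and benzyl cases direct most of the elimination toward the internal olefin.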
Alkyl groups Z at the β-carbon in CH3COOCH(CH2Z)CH3 showed alkyl-alkyl interactions in the cis conformation and alkyl-hydrogen interactions in the trans conformation. In the former, the k value decreased due to steric hindrance, while in the latter the rate increased because of steric acceleration. When Z is an electron-withdrawing substituent, the rate decreased (Table 19, 13-21). Plotting log $k/k_0$ versus $\sigma^*$ and $\sigma_I$ values gave good linear correlations ($\rho^*$ = -0.26, r = 0.996, and $\rho_I$ = -1.39, r = 0.995, at 320 °C). In view of the experimental difficulties in analyzing the product distribution of α-substituted ethyl acetates, the possibility that the elimination proceeds under kinetic control with some degree of equilibration cannot be completely ruled out.
α-Substituted tertiary acetates: CH3COOC(CH3)2Z. Table 20 reports the kinetic parameters for the gas-phase pyrolysis of tertiary acetates, CH3COOC(CH3)2Z. The alkyl groups (Table 20, 1-7, 9, 12) affected the elimination process, likely through steric acceleration. This was deduced by correlating log $k/k_0$ against the steric parameters $E_s$ of Taft ($\delta$ = -0.55, r = 0.956 at 280 °C) and $E_s^c$ of Hancock ($\delta$ = -0.38, r = 0.964 at 280 °C). When considering the polar Z groups directly attached to the α-carbon, their effects were found to be electronic in nature (Table 20, 1, 13-17). This conclusion was reached when
Table 20. Kinetic Parameters for CH3COOC(CH3)2Z Pyrolysis, at 280 °C

Z                        Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)   10⁴k (1-olefin) (s⁻¹)
(1) CH3                      167.2          13.13         21.88         14.59
(2) CH2CH3                   168.7          13.46         33.88         24.60
(3) CH2CH2CH3                169.8          13.85         64.57         41.20
(4) CH2CH2CH2CH3             166.1          13.35         45.97         32.00
(5) CH(CH3)2                 170.2          13.59         32.33
(6) C(CH3)3                  172.0          14.45        162.18        162.18
(7) CH2CH(CH3)2              154.1          12.42         73.42         50.81
(8) C6H5                     170.2          13.64         37.15         37.15
(9) CH2CH2C6H5               151.5          11.97         45.85         30.54
(10) CH=CH2                  169.8          13.59         35.48         35.48
(11) CH2CH=CH2               171.0          13.69         34.67         18.72
(12) c-C3H5                  170.7          13.95         67.61         67.61
(13) CH2COCH3                160.6          12.30         13.49          --
(14) COCH3                   180.9          13.47          2.40          2.40
(15) COOCH3                  174.6          12.42          0.85          0.85
(16) CN                      198.6          14.45          0.49          0.49
(17) CCl3                    188.8          13.86          1.07          1.05
plotting log $k/k_0$ versus $\sigma^*$ and $\sigma_I$ values ($\rho^*$ = -0.45, r = 0.950 and $\rho_I$ = -3.11, r = 0.964, at 280 °C). This result was taken to indicate that the greater the electron-withdrawing character of the polar substituent, the slower the elimination rate. In the case of multiple or π-bonded substituents such as CH2=CH and C6H5 as Z (Table 20, 8 and 10), the rates were affected by simultaneous steric and resonance effects. The data of Table 20 give a very good correlation (Eq. 48) by means of the Taft-Topsom method:

$$\log (k/k_0) = -(3.24 \pm 0.27)\,\sigma_\alpha - (2.16 \pm 0.46)\,\sigma_F - (6.8 \pm 1.8)\,\sigma_R^+ \tag{48}$$

At 280 °C, r = 0.965, sd = 0.26. The phenyl datum is excluded due to lack of coplanarity with the reaction center. The polarizability and the field/inductive effects are significant, although the main contributor is $\sigma_R^+$, which indicates the existence of an electron-deficient center. This fact helps explain why some neutral substrates are rather unstable even at room temperature. Tertiary acetates are more sensitive to polarizability effects than primary ones, and this is reflected when the substituent is directly attached to the reaction site. More significant are the size and the sign of $\rho_R^+$, which is quite comparable to that for the correlation with $\rho_R^-$ for the substituent series in ZCH2CH2OAc with +R groups. Both cases support the concept that the $C_\alpha$-O bond polarization in the transition state is the limiting factor, followed by the $C_\beta$-H bond assistance in the elimination process of these esters.
Acyl-substituted carboxylic esters: ZCOOR. Data for the homogeneous unimolecular gas-phase pyrolysis of ethyl (ZCOOCH2CH3), isopropyl [ZCOOCH(CH3)2], and tert-butyl [ZCOOC(CH3)3] α-substituted carboxylic esters are given in Table 21. Correlating log $k/k_0$ versus $\sigma^*$ values yielded, for the ethyl esters, $\rho^*$ = 0.315 and r = 0.976 at 400 °C; for the isopropyl esters, $\rho^*$ = 0.464 and r = 0.963 at 330 °C; and for the tert-butyl esters, $\rho^*$ = 0.635 and r = 0.972 at 250 °C. It is important to point out that the k values for several isopropyl α-alkyl-substituted esters given in the above-mentioned work have been estimated. In this respect, the reported rate coefficient at a single temperature was used here to determine the $E_a$ parameter by taking log A = 13.10 (Table 21). This value is believed to be reasonable for a six-membered cyclic transition state for the elimination of these isopropyl esters.
These studies supported the general concept that electron-withdrawing groups at the acyl side of ethyl, isopropyl, and tert-butyl esters enhance the elimination rate, while electron-releasing groups appear to reduce it. In addition, the slopes of the lines for the above-mentioned esters indicated, by extrapolation to one temperature ($\rho_{T_2}/\rho_{T_1} \approx T_1/T_2$), that the negative nature of the acidic carbon and the polarity in the transition state increase slightly from primary to tertiary esters.
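The temperature scaling invoked here can be written out explicitly. The sketch below applies $\rho_{T_2} \approx \rho_{T_1} T_1/T_2$ to the three $\rho^*$ values quoted above; the common temperature of 600 K is an arbitrary choice made for this illustration, and the function name is hypothetical.

```python
def rho_at(rho_measured, temp_measured_celsius, temp_target_kelvin=600.0):
    """Scale a reaction constant to another temperature, assuming rho is proportional to 1/T."""
    return rho_measured * (temp_measured_celsius + 273.15) / temp_target_kelvin

# rho* values and measurement temperatures quoted for the ZCOOR series
series = {
    "ethyl": (0.315, 400.0),
    "isopropyl": (0.464, 330.0),
    "tert-butyl": (0.635, 250.0),
}

scaled = {name: rho_at(rho, t_c) for name, (rho, t_c) in series.items()}
for name, rho in scaled.items():
    print(f"{name}: rho* at 600 K = {rho:.3f}")
```

After scaling, the reaction constants still rise from the primary to the tertiary ester, so the trend is not an artifact of the different measurement temperatures.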
Table 21. Kinetic Parameters for ZCOOR Pyrolysis

Z                          Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)

R = CH2CH3 (ZCOOCH2CH3), at 400 °C
(1) CH3                        200.4          12.55          9.93
(2) CH3CH2                     202.9          12.72          9.40
(3) CH3CH2CH2                  207.1          13.04          9.27
(4) (CH3)2CHCH2                202.5          12.70          9.64
(5) (CH3)3CCH2                 207.1          13.04          9.27
(6) (CH3)3C                    184.1          11.24          8.96
(7) C6H5                       199.5          12.70         16.49
(8) C6H5CH2                    200.0          12.60         11.98
(9) trans-CH3CH=CH             195.9          12.25         11.13
(10) FCH2                      194.0          12.57         32.66
(11) F2CH                      195.5          12.81         47.86
(12) F3C                       184.0          12.13         70.80
(13) F3CF2C                    183.1          12.16         98.18
(14) F3CF2CF2C                 183.6          12.29        121.15
(15) ClCH2                     197.0          12.70         25.77
(16) Cl2CH                     193.9          12.62         37.30
(17) Cl3C                      185.1          12.27         80.29
(18) ClCH2CH2                  196.8          12.54         18.48
(19) ClCH2CH2CH2               198.7          12.67         17.75
(20) BrCH2                     195.7          12.62         27.04
(21) BrCH2CH2CH2               205.2          12.83          8.95
(22) HOCH2                     201.4          12.75         13.17
(23) NCCH2                     191.8          12.29         25.32
(24) C6H5NHᵃ                   169.4          13.30      14188
(25) C6H5Oᵃ                    189.5          12.70         97.7

R = CH(CH3)2 (ZCOOCH(CH3)2), at 330 °C
(1) CH3                        191.1          13.21          4.54
(2) CH3CH2                     189.9          13.06          4.08
(3) CH3CH2CH2                  193.7          13.39          4.07
(4) (CH3)2CHCH2                189.5          13.01          3.94
(5) (CH3)3CCH2                 197.0          13.65          3.85
(6) (CH3)3C                    189.5ᵇ         13.10          4.79
(7) FCH2                       182.8          12.83          9.91
(8) ClCH2                      179.0          12.63         13.34
(9) BrCH2                      181.1          12.84         14.26
(10) ICH2                      181.1          13.09         25.35
(11) HOCH2                     179.9          12.56          9.48
(12) CH3OCH2                   187.8          13.04          5.93
(13) C6H5CH2                   194.1          13.63          6.58
(14) NCCH2                     180.3          13.01         24.72
(15) ClCH2CH2                  180.8          12.57          8.13
(16) Cl2CH                     176.2          12.78         32.96
(17) Cl3C                      178.3          13.55        127.6
(18) CH3CH2CH2CH2              190.9ᵇ         13.10          3.63
(19) (CH3)2CH                  190.3ᵇ         13.10          4.17
(20) (CH3CH2)2CH               190.1ᵇ         13.10          4.27
(21) C6H5CH2CH2                189.8ᵇ         13.10          4.57
(22) (C6H5)2CH                 187.5ᵇ         13.10          7.24
(23) CH3CH=CH                  190.0ᵇ         13.10          4.37
(24) C6H5CH=CH                 189.3ᵇ         13.10          5.01
(25) (CH3CH2CH2)2CH            190.2ᵇ         13.10          4.17
(26) F3C                       171.5          12.70         69.18
(27) C6H5                      187.0ᵇ         13.10          7.96
(28) C6H5NH                    166.1          12.10         51.47
(29) C6H5O                     180.3          13.50         76.12

R = C(CH3)3 (ZCOOC(CH3)3), at 250 °C
(1) CH3                        166.0          13.06          3.04
(2) CH3CH2                     160.7          12.56          3.24
(3) CH3CH2CH2                  163.9          12.77          2.51
(4) (CH3)2CHCH2                170.5          13.42          2.45
(5) (CH3)3CCH2                 174.3          13.77          2.29
(6) (CH3)3C                    169.1          13.44          3.55
(7) (CH3)3Si                   181.1          14.63          3.51
(8) C6H5CH2                    164.7          13.15          5.05
(9) C6H5                       165.4          13.63          6.97
(10) C6H5NH                    167.1          14.03         22.05
(11) C6H5O                     153.2          13.20         79.43
(12) CH3OCH2                   168.2          13.49          4.90
(13) BrCH2                     154.6          12.64         15.85
(14) ClCH2                     153.1          12.49         15.85
(15) Cl2CH                     150.0          12.67         48.98
(16) Cl3C                      141.1          12.41        208.9
(17) F3C                       105.4          10.58      11220
(18) NCCH2                     137.8          11.31         35.48

Notes: ᵃ Values taken from ref. 104b. ᵇ Ea value obtained by scaling log A = 13.10. ᶜ Values taken from ref. 105b. ᵈ Values taken from ref. 105c.
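The activation energies obtained by fixing log A = 13.10 can be reproduced from a single-temperature rate coefficient by inverting the Arrhenius equation. The sketch below does this for two isopropyl entries of Table 21; the function name is illustrative, and the small differences from the tabulated values presumably reflect rounding.

```python
import math

R_KJ = 8.314462e-3  # gas constant, kJ mol^-1 K^-1

def ea_from_single_k(k, temp_celsius, log_a=13.10):
    """Ea (kJ/mol) implied by one rate coefficient k (s^-1) and an assumed log A."""
    temp_k = temp_celsius + 273.15
    return math.log(10) * R_KJ * temp_k * (log_a - math.log10(k))

# Isopropyl esters at 330 C, rate coefficients read from Table 21
ea_pivalate = ea_from_single_k(4.79e-4, 330.0)   # entry 6, Z = (CH3)3C
ea_pentanoate = ea_from_single_k(3.63e-4, 330.0) # entry 18, Z = CH3CH2CH2CH2

print(f"Z = (CH3)3C: Ea = {ea_pivalate:.1f} kJ/mol")
print(f"Z = CH3CH2CH2CH2: Ea = {ea_pentanoate:.1f} kJ/mol")
```

Because log A is held constant, every difference in k within this subset maps directly onto a difference in Ea, which is worth keeping in mind when comparing these entries with fully measured Arrhenius parameters.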
The interposition of a methylene group between the substituent and the carboxylate reaction center greatly reduces resonance interactions. Moreover, the crucial $C^{\delta+} \cdots O^{\delta-}$ bond is rather far from Z, which means that polarizability effects are minimal and may be neglected. Consequently, and according to the Taft-Topsom treatment, the field/inductive effect, $\sigma_F$, appears to be the main factor affecting the elimination rates of these esters (Eqs. 49-51).

For ZCOOCH2CH3:

$$\log (k/k_0) = (2.09 \pm 0.11)\,\sigma_F \tag{49}$$

At 400 °C, r = 0.979, sd = 0.078.

For ZCOOCH(CH3)2:

$$\log (k/k_0) = (2.98 \pm 0.22)\,\sigma_F \tag{50}$$

At 330 °C, r = 0.958, sd = 0.145.

And for ZCOOC(CH3)3:

$$\log (k/k_0) = (3.76 \pm 0.23)\,\sigma_F \tag{51}$$

At 250 °C, r = 0.979, sd = 0.135.

Estimation of $\rho_F$ at a single temperature, as above, confirms the increase of the negative character of the acidic carbon in the transition state from primary to tertiary esters.
Substituents in Cyclic Systems
The sequence of relative rate coefficients for gas-phase monocyclic acetates is presented in Table 22. The pattern is analogous to that found by Sicher for amine oxide eliminations.
Table 22. Kinetic Parameters for CH3COOZ Pyrolysis, at 330 °C

Z                   Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)
(1) (CH3)2CH            191.1          13.20          4.47
(2) c-C5H9              179.8          12.68         12.88
(3) c-C6H11             203.9          14.02          2.29
(4) c-C7H13             178.3          12.62         15.14
(5) c-C8H15             177.5          12.81         26.92
(6) c-C10H19            168.7          12.55         87.10
(7) c-C12H23            181.0          13.09         25.70
(8) c-C15H29            177.1          12.46         13.18
Table 23. Ring Strain of Cycloalkyl Acetate Pyrolyses

Acetate             Ea (kJ mol⁻¹)   Es (kJ mol⁻¹)ᵃ   E′s (kJ mol⁻¹)ᵇ   ΔEs (kJ mol⁻¹)ᶜ
cyclohexyl              203.9            0.0              5.9               0.0
cyclopentyl             179.8           26.4             24.7               7.6
cycloheptyl             178.3           26.8             22.6              10.1
cyclooctyl              177.5           41.4             25.1              22.2
cyclodecyl              168.7           52.7             37.5              21.1
cyclododecyl            181.0           18.4              --                --
cyclopentadecyl         177.1            6.3              --                --

Notes: ᵃ Es = strain energy in cycloalkane. ᵇ E′s = strain energy in cycloalkene. ᶜ ΔEs = (Es - E′s) - [(Es - E′s) for cyclohexyl].
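The strain-energy difference defined in the notes to Table 23 is a simple double difference, and it can be recomputed from the cycloalkane and cycloalkene strain energies. The sketch below does so for the five rings with complete data; the assignment of the garbled table columns to rings is a reconstruction, and the variable names are illustrative.

```python
# (strain energy in cycloalkane, strain energy in cycloalkene), kJ/mol, from Table 23
strain = {
    "cyclohexyl": (0.0, 5.9),
    "cyclopentyl": (26.4, 24.7),
    "cycloheptyl": (26.8, 22.6),
    "cyclooctyl": (41.4, 25.1),
    "cyclodecyl": (52.7, 37.5),
}

# Reference term: the same difference evaluated for the cyclohexyl system.
reference = strain["cyclohexyl"][0] - strain["cyclohexyl"][1]

delta_es = {
    ring: round((alkane - alkene) - reference, 1)
    for ring, (alkane, alkene) in strain.items()
}

for ring, value in delta_es.items():
    print(f"{ring}: delta Es = {value:.1f} kJ/mol")
```

The recomputed values agree with the last column of the table, which supports the column assignment used in the reconstruction.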
The relatively low k value of cyclohexyl acetate probably reflects the difficulty of the six relevant atoms in assuming an optimum planar or chair conformation in the transition state. It was considered that this requirement for a cyclic array of six key atoms was the most important factor in determining the relative pyrolysis rates of the other members of this series. The strain energy data given in Table 23 indicated that the strain energy difference, ΔEs, except for cyclodecyl, increased in the same sequence as the rates given in Table 22. After a study of the rates of pyrolysis of cycloalkyl chlorides, Dakubu and Holmes concluded that ring strain may affect the rates in two ways: (1) strain enhances the energy of the ground state relative to the transition state, thereby lowering the activation energy, and (2) the presence of strain in a ring system facilitates the attainment of the geometry of the transition state. This study did not report the kinetically controlled product ratio of cis- and trans-olefin, and so extensive speculation concerning the reasons for the enhanced rate of pyrolysis of cyclodecyl acetate was not warranted. It was thought possible that part of both the Baeyer strain and the intra-annular repulsions are relieved in proceeding to the transition state for elimination.
Substituents in Alicyclic Systems
The results for the gas-phase unimolecular elimination of 4-substituted isobornyl acetates are given in Table 24. The schematic representation of this elimination reaction is shown in Scheme 1. The rates were followed by CH3COOH titration. Electron-withdrawing polar substituents at C4 caused a decrease in k values, with $\rho_I$ = -0.70 and r = 0.903 at 340 °C. The effect of these polar substituents on the elimination rate was found to be modest. The negative $\rho_I$ was associated with that of β-substituted ethyl acetates.

[Scheme 1: elimination of CH3COOH from a 4-substituted isobornyl acetate.]

In the case of this work, however, there is an interposition of a tertiary carbon containing the substituent. In the application of the Taft-Topsom treatment, it is rather surprising to find the $\sigma_R^+$ parameter to be of paramount importance in the CH3COOH elimination (Eq. 52).

$$\log (k/k_0) = -(0.76 \pm 0.07)\,\sigma_F - (1.75 \pm 0.26)\,\sigma_R^+ \tag{52}$$

At 340 °C, r = 0.977, sd = 0.072. This result seems to indicate that the isobornyl moiety bears an overall positive charge, which is stabilized by electron-donating groups and destabilized by the field effect.
Substituents in Aromatic Systems
β-Aryl ethyl acetates: CH3COOCH2CH2C6H4Z. The kinetic parameters for the gas-phase thermal decomposition of β-aryl ethyl acetates are shown in Table
Table 24. Kinetic Parameters for 4-Substituted Isobornyl Acetates Pyrolysis, at 340 °C

Substituent       Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)
(1) H                 189.2          12.82          5.01
(2) CH3               186.7          12.72          6.50
(3) C6H5              181.5          12.57         12.76
(4) CH3CO             190.9          12.79          3.35
(5) Cl                191.8          12.89          3.53
(6) CN                192.6          12.66          1.78
(7) NO2               191.5          12.56          1.75
25. On plotting log $k/k_0$ values against Hammett's $\sigma$, a reasonable correlation was obtained with $\rho$ = 0.2 at 377 °C. This work suggested that electron-supplying substituents on the aromatic nucleus decreased the rate, while electron-withdrawing substituents increased it. Use of the Y-T equation (Eq. 11) yielded a better linear relationship. Apparently, resonance interactions of the substituent with the $C_\beta$-H bond are important for the overall reactivities. If the 4-Cl substituent of Table 25 is excluded from the correlation against the resonance parameter $\sigma^-$, a good relationship is obtained (Eq. 53).

$$\log (k/k_0) = (0.16 \pm 0.017)\,\sigma^- \tag{53}$$

At 377 °C, r = 0.978, sd = 0.011. In spite of the limited number of substituents and the small differences in k values, their influence on the β-hydrogen assistance for elimination may apparently be rationalized as above.
α-Aryl ethyl acetates: CH3COOCH(C6H4Z)CH3. The effects of a considerable number of substituents on the aromatic ring in α-aryl ethyl acetate pyrolyses have been reported in various papers. The pyrolysis of α-aryl ethyl acetates was thought to serve as a model reaction for determining quantitative electrophilic reactivities in the absence of solvents, catalysts, etc. Glyde and Taylor investigated the gas-phase elimination kinetics of several polymethyl- and polychloro-substituted α-aryl ethyl acetates. The methyl and chloro substituent effects were found not to be additive. In addition to these studies, several papers on the effect of heteroaromatic and heterocyclic groups at the α-position of ethyl acetates were published.
Table 25. Kinetic Parameters for CH3COOCH2CH2C6H4Z Pyrolysis, at 377 °C

Z                      Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁* (s⁻¹)
(1) 2-CF3                  191.2          12.55         15.19
(2) 3-CF3                  189.9          12.41         13.99
(3) 3-F                    189.9          12.39         13.37
(4) 4-Cl                   191.2          12.41         11.00
(5) H                      191.6          12.48         12.00
(6) 2,3,4,5,6-F5           191.6          12.46         11.46
(7) 4-F                    191.6          12.46         11.46
(8) 4-CH3                  192.4          12.50         10.84
(9) 4-CH3O                 192.4          12.51         11.09
(10) 2-F                   192.8          12.44          8.77

Note: * Our calculated k values from the parameters of this table disagree with data reported in ref. 113.
In many cases, the pyrolysis experiments were carried out at a single temperature. Collecting such information into a single large table and extrapolating to one common temperature yielded unreliable and contradictory rate coefficients. Consequently, very poor correlations were observed. However, most of these studies reached the conclusion that electron-donating substituents in the benzene ring increased the k values and electron-withdrawing substituents reduced them.
α-Aryl-α′-methyl ethyl acetates: CH3COOC(CH3)2C6H4Z. In contrast with the effect of the aryl group in α-aryl ethyl acetates, the influence of substituents in the aromatic ring of α-aryl-α′-methyl ethyl acetates was better described (Table 26). As in the α-aryl ethyl acetates, the electron-donating substituents enhanced the rate while electron-withdrawing substituents decreased it. These tertiary esters showed large elimination rates because of the more positive character of the α-carbon in the transition state. A good Hammett correlation with the original $\sigma^+$ values was obtained ($\rho^+$ = -0.74 at 550 K (277 °C)). The good correlation and the corresponding interpretation of substituent effects described above are confirmed with the substituent constants from Table 1A (Eq. 54).

$$\log (k/k_0) = -(0.86 \pm 0.05)\,\sigma^+ \tag{54}$$

At 277 °C, r = 0.993, sd = 0.03.
Substituted ethyl benzoates: ZC6H4COOCH2CH3. The relative rates of elimination of substituted ethyl benzoates were determined in a flow system at 515 °C (Table 27). Linearity of the correlation against Taft's $\sigma^\circ$ values ($\rho^\circ$ = 0.21 at 515 °C) was better than that against Hammett $\sigma$ values. The authors claimed that this means that the build-up of negative charge in the C-O bond of the ester on going from reagents to the transition state in ester pyrolysis is much smaller than that
Table 26. Kinetic Parameters for CH3COOC(CH3)2C6H4Z Pyrolysis, at 277 °C

Z                  Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)
(1) 4-CH3              156.4          12.87        103.0
(2) 3-CH3              159.8          13.03         70.79
(3) H                  160.6          13.00         55.46
(4) 4-Cl               161.9          13.08         50.18
(5) 3-Cl               166.1          13.22         27.64
(6) 3-pyridyl          165.2          13.23         34.44
(7) 4-pyridyl          173.2          13.56         12.80
(8) 2-pyridyl          174.0          13.56         13.22
Table 27. Relative Rate of ZC6H4COOCH2CH3 Pyrolysis, at 515 °C (Method: Flow System)

Z                 Relative Rate
(1) 4-NH2             0.86
(2) 4-OH              0.91
(3) 4-OCH3            0.96
(4) 3-OH              0.98
(5) H                 1.00
(6) 3-NH2             1.01
(7) 4-CH3             1.03
(8) 3-OCH3            1.07
(9) 3-CH3             1.08
(10) 4-Br             1.25
(11) 3-Br             1.27
(12) 3-Cl             1.28
(13) 4-Cl             1.28
(14) 3-I              1.32
(15) 4-I              1.32
(16) 4-NO2            1.42
(17) 3-NO2            1.50
corresponding to the ionization of benzoic acid. The relevant correlation equations are:

$$\log (k/k_0) = (0.21 \pm 0.02)\,\sigma^\circ \tag{55}$$

At 515 °C, r = 0.931, sd = 0.03.

$$\log (k/k_0) = (0.21 \pm 0.03)\,\sigma \tag{56}$$

At 515 °C, r = 0.923, sd = 0.05. The difference between the quality of fit in Eqs. 55 and 56 is not large enough to permit any conclusion to be drawn.
Substituted isopropyl benzoates: ZC6H4COOCH(CH3)2. A series of meta- and para-substituted isopropyl benzoates were pyrolyzed at the single temperature of 337.4 °C and, as expected, the results were similar to those for substituted ethyl benzoates. The rate of formation of propene is increased by electron-withdrawing substituents and reduced by electron-releasing substituents (Table 28). The log $k/k_0$ values correlated well with Taft's $\sigma^\circ$ values. The $\rho^\circ$ = 0.33 was reported to be slightly higher than that observed for ethyl benzoates ($\rho^\circ$ = 0.20). The authors
Table 28. Rate Coefficients for ZC6H4COOCH(CH3)2 Pyrolysis, at 337.4 °C

Z                    10⁴k (s⁻¹)
(1) 4-C(CH3)3           10.8
(2) 4-CH3               11.0
(3) 3-NH2               11.0
(4) 4-OCH3              11.1
(5) 3-CH3               11.9
(6) H                   12.4
(7) 3-OCH3              12.6
(8) 4-F                 13.8
(9) β-naphthyl          14.0
(10) 4-Cl               15.6
(11) 3-F                16.0
(12) 3-Cl               16.4
(13) 3-NO2              20.9
(14) 4-NO2              21.6
were surprised by the fact that ethyl benzoate pyrolyses showed a resonance-free $\sigma^\circ$ correlation, as there is no insulating methylene bridge between the reaction center and the benzene ring, especially as the transition state involves a degree of charge separation formally similar to that in the benzoate anion. In this respect, the difference in resonance stabilization between the reagent and the transition state becomes important when the carboxylate anion is fully developed; with incipient species the resonance effect may apparently be small. Among this series of aryl esters, several interesting pyrolytic eliminations of isopropyl (hetero)aryl carboxylate esters were described, in which new $\sigma^\circ$ substituent constants for hetero-substituents were defined. Since isopropyl benzoates pyrolyze at much lower temperatures than ethyl benzoates, the transition state is more polar in nature. Therefore, the substituents on the aromatic ring must show a more pronounced effect on the reaction center. The Hammett equation gives a good correlation (Eq. 57):

$$\log (k/k_0) = (0.310 \pm 0.010)\,\sigma \tag{57}$$

At 337.4 °C, r = 0.987, sd = 0.02. Taft's $\sigma^\circ$ values perform slightly better:

$$\log (k/k_0) = (0.310 \pm 0.003)\,\sigma^\circ \tag{58}$$

At 337.4 °C, r = 0.993, sd = 0.01.
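A Hammett correlation of this kind is an ordinary least-squares fit of log(k/k₀) against σ. The sketch below applies one to the rate coefficients of Table 28. The σ values are commonly tabulated Hammett constants supplied here as an assumption; they are not the chapter's Table 1A values, so the result is only expected to land close to Eq. 57. The β-naphthyl entry is left out because no σ value is assumed for it, and the fit includes an intercept, which Eq. 57 does not show.

```python
import math

# (substituent, 1e4*k from Table 28, assumed Hammett sigma constant)
data = [
    ("4-C(CH3)3", 10.8, -0.20),
    ("4-CH3", 11.0, -0.17),
    ("3-NH2", 11.0, -0.16),
    ("4-OCH3", 11.1, -0.27),
    ("3-CH3", 11.9, -0.07),
    ("H", 12.4, 0.00),
    ("3-OCH3", 12.6, 0.12),
    ("4-F", 13.8, 0.06),
    ("4-Cl", 15.6, 0.23),
    ("3-F", 16.0, 0.34),
    ("3-Cl", 16.4, 0.37),
    ("3-NO2", 20.9, 0.71),
    ("4-NO2", 21.6, 0.78),
]

k_ref = 12.4  # unsubstituted isopropyl benzoate (Z = H)
xs = [sigma for _, _, sigma in data]
ys = [math.log10(k / k_ref) for _, k, _ in data]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

rho = sxy / sxx                   # slope: the reaction constant
intercept = mean_y - rho * mean_x
r = sxy / math.sqrt(sxx * syy)    # correlation coefficient

print(f"rho = {rho:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")
```

The fitted slope and correlation coefficient come out near the values reported with Eq. 57, which suggests the conclusion is not sensitive to the exact choice of σ scale for these substituents.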
Substituted tert-butyl benzoates: ZC6H4COOC(CH3)3. Earlier work on the pyrolysis of substituted tert-butyl benzoates at 274.4 °C was found difficult to analyze because of nonreproducible rates. However, a later kinetic investigation of tert-butyl benzoates encountered fewer difficulties, and normal Arrhenius parameters were obtained. The log $k/k_0$ values gave a good correlation with $\sigma^\circ$ values, with $\rho^\circ$ = 0.58 corrected to 600 K (327 °C) (Table 29). The magnitude of this value, compared to the previously reported values (corrected to 600 K) for ethyl and isopropyl benzoates of 0.26 and 0.34, respectively, was taken to confirm that the transition-state polarity of esters increases in the order primary < secondary < tertiary, with the biggest polarity difference occurring between secondary and tertiary esters. In the study of the pyrolysis of tert-butyl heteroaryl carboxylate esters, the Hammett correlations with the literature $\sigma^\circ$ values of heteroaryl substituents showed a reaction constant $\rho^\circ$ compatible with the ethanoate molecular frame rather than with the carboxylate structure. The data of Table 29 for tert-butyl benzoates lead to an excellent correlation with Taft's $\sigma^\circ$ (Eq. 59). The result with $\sigma$ values is quite fair (Eq. 60).

$$\log (k/k_0) = (0.62 \pm 0.02)\,\sigma^\circ \tag{59}$$

At 311.9 °C, r = 0.996, sd = 0.02.

$$\log (k/k_0) = (0.62 \pm 0.04)\,\sigma \tag{60}$$

At 311.9 °C, r = 0.989, sd = 0.05.
α-Arylethyl benzoates: C6H5COOCH(CH3)C6H4Z and tert-butyl α-arylacetates: ZC6H4CH2COOC(CH3)3. Rate data for the pyrolysis of α-arylethyl benzoates, C6H5COOCH(CH3)C6H4Z, given in Table 30, gave a good correlation
Table 29. Rate Coefficients for ZC6H4COOC(CH3)3 Pyrolysis (scaled k, s⁻¹)

Z                 k at 311.9 °C   k at 297.8 °C
(1) 4-OCH3             3.23            1.44
(2) 4-CH3              3.42            1.48
(3) 3-CH3              3.58            1.57
(4) H                  3.83            1.66
(5) 3-OCH3             4.37            1.88
(6) 4-F                4.86            2.23
(7) 4-Cl               5.94            2.62
(8) 3-Cl               6.60            3.02
(9) 3-NO2             11.17            5.00
(10) 4-NO2            11.80            5.08
Table 30. Kinetic Parameters for C6H5COOCH(CH3)C6H4Z Pyrolysis, at 641 K (368 °C)

Z                 Ea (kJ mol⁻¹)   log A (s⁻¹)   10⁴k₁ (s⁻¹)
(1) 4-CH3             167.6          12.15        308.7
(2) 3-CH3             175.4          12.64        220.7
(3) H                 173.4          12.38        176.5
(4) 4-Cl              175.9          12.53        156.0
(5) 3-Cl              180.7          12.70         93.72
(6) 4-CF3             181.1          12.63         74.00
(7) 3-NO2             176.1          12.11         57.12
(8) 4-NO2             177.8          12.21         52.26
with $\sigma^+$ values, with $\rho^+$ = -0.68 at 641 K (368 °C). This result suggested the $\rho$ factor to be between those for acetates and phenyl carbonates, and nearer to the value of the former. Previous work of Smith and coworkers on a series of α-arylethyl benzoates had laid major emphasis on obtaining LFERs involving ortho-substituents in order to demonstrate that proximity effects were minimal or nonexistent in gas-phase reactions. In addition, several substituent constants were defined. In the case of tert-butyl α-aryl acetates, ZC6H4CH2COOC(CH3)3, the rate data gathered in Table 31 led to a good correlation of log $k/k_0$ versus $\sigma^\circ$ values at 600 K (327 °C), with $\rho^\circ$ = 0.39. Amin and Taylor considered that, in principle, $\sigma^\circ$ values are supposed to represent conjugation-free interaction of the substituent with the reaction site, and were defined from reactions in which the substituent and site
Table 31. Kinetic Parameters for ZC6H4CH2COOC(CH3)3 Pyrolysis, at 600 K (327 °C)

Z             Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹
(1) 4-OCH3    162.5           12.90          562.2
(2) 4-CH3     166.3           13.22          548.2
(3) 3-CH3     164.9           13.12          576.6
(4) H         164.9           13.17          647.0
(5) 4-F       162.5           13.02          741.1
(6) 4-Cl      159.0           12.73          766.7
(7) 3-Cl      161.0           12.96          872.0
(8) 4-NO2     164.0           13.42         1378.0
G. CHUCHANI, M. MISHIMA, R. NOTARIO, and J.-L. M. ABBOUD
are insulated from each other by a saturated chain. However, they estimated that such saturated chains can transmit conjugation effects, so that σ° values are themselves exalted. Nevertheless, good correlations have generally been obtained with these parameters in reactions where the resonance-transmitting ability of the saturated linkage is comparable to that in the standard reactions. This led Wepster and coworkers^"^^ to investigate the problem further. In contrast to the poor correlation for α-arylethyl acetates, CH3COOCH(C6H4Z)CH3, an excellent linear relationship is obtained for the α-arylethyl benzoates (C6H5COOCH(C6H4Z)CH3, Table 30) against σ⁺ values (Eq. 61).

$$\log(k/k_0) = -(0.71 \pm 0.02)\,\sigma^{+} \tag{61}$$

At 368 °C, r = 0.997, sd = 0.02

The Hammett correlation is fairly good (Eq. 62).

$$\log(k/k_0) = -(0.73 \pm 0.06)\,\sigma \tag{62}$$

At 368 °C, r = 0.986, sd = 0.07

For tert-butyl α-arylacetates, ZC6H4CH2COOC(CH3)3, the interposition of CH2, as defined, does not appear to insulate the substituent Z dramatically from the reaction center. This fact, noticed and well documented by Amin and Taylor,^'*^ is shown by means of Eqs. 63 and 64:

$$\log(k/k_0) = (0.40 \pm 0.02)\,\sigma^{\circ} \tag{63}$$

At 327 °C, r = 0.995, sd = 0.02

$$\log(k/k_0) = (0.39 \pm 0.03)\,\sigma \tag{64}$$

At 327 °C, r = 0.983, sd = 0.03

C. Halides

Gas-phase pyrolysis of alkyl halides leads to molecular dehydrohalogenation, as shown in reaction 65:
H-Cβ-Cα-X → Cβ=Cα + HX   (65)
Here again, the presence of a hydrogen β to the C-X bond is necessary for molecular elimination to take place. Many contributions since 1953 have advanced our knowledge of alkyl halide pyrolyses with respect to the interpretation of substituent effects, LFER, trans elimination, elimination-cyclization, the occurrence of neighboring group participation, and rearrangement reactions.^^^

Aliphatic Halides
β-Substituted ethyl chlorides: ZCH2CH2Cl. Kinetic parameters for the gas-phase elimination of β-substituted ethyl chlorides are given in Table 32.^"^^ A fair linear correlation with σ* parameters can be obtained in the case of alkyl groups, with ρ* = -1.17, r = 0.971 at 440 °C (Figure 24). The inductive effect of the alkyl groups appeared to stabilize a modest amount of positive charge on the α C-atom in the transition state, and therefore enhanced the rate of HCl elimination. However, polar electron-withdrawing substituents decreased the rate, giving rise to an inflexion point at σ*(CH3) = 0.00 and to another good straight line with ρ* = -0.30, r = 0.992 at 440 °C (Figure 24). The small ρ* value of -0.30 suggested a less developed polarization of the Cδ+···Clδ- bond in the transition state. The inflexion and the observation of two slopes were attributed to a simultaneous effect operating at the transition state during these eliminations, particularly with electron-withdrawing substituents. This means that, as these groups destabilize the positive carbon reaction center in the transition state, the hydrogens adjacent to Z in ZCH2CH2Cl may become more acidic or labile, and thus assist the removal of the leaving chloride ion. Consequently, the inflexion point at σ*(CH3) = 0.00 was attributed to a slight alteration in the polarity of the transition state due to changes of electronic transmission at the carbon reaction center. Several additional substituents at the β-position of ethyl chlorides have been reported to greatly enhance these elimination rates through resonance effects or anchimeric assistance. Equation 66 applies to all the available data:

$$\log(k/k_0) = -(1.35 \pm 0.05)\,\sigma_{\alpha} - (1.86 \pm 0.13)\,\sigma_{F} - (1.14 \pm 0.18)\,\sigma_{R^{+}} \tag{66}$$

At 440 °C, r = 0.987, sd = 0.09

The contributions of σα, σF, and σR+ are statistically significant, and ρα is quite comparable to ρR+. This is the first time that polarizability has been recognized to play an important role in the rate of elimination in reactions of gas-phase neutral species. This fact unifies the results for alkyl and polar electron-withdrawing groups into a single expression. Of course, the possibility of some steric contribution being hidden behind the σα term cannot be entirely ruled out at this point.

α-Substituted ethyl chlorides: CH3CH(Z)Cl. The effect of substituent Z in the gas-phase elimination of α-substituted ethyl chlorides, CH3CH(Z)Cl, is listed in Table 33.^^^
Table 32. Kinetic Parameters for ZCH2CH2Cl Pyrolysis, at 440 °C

Z                       Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹    10⁴k_H, s⁻¹
(1) H                   241.8           13.83          1.34          0.45
(2) CH3                 229.2           13.44          4.47          2.23
(3) CH3CH2              230.7           13.63          5.37          2.69
(4) CH3CH2CH2           244.0           14.61          5.44          2.72
(5) CH3CH2CH2CH2        236.2           14.09          6.13          3.06
(6) (CH3)2CHCH2         232.7           13.91          7.31          3.65
(7) (CH3)2CH            235.3           14.12          7.64          3.82
(8) c-C6H11             216.9           12.83          8.71          4.35
(9) c-C5H9              231.8           13.87          7.76          3.88
(10) CH3CH2CH(CH3)      223.7           13.40         10.30          5.15
(11) (CH3)3C            218.8           13.08         11.34          5.67
(12) CH2=CHCH2          238.4           14.25          6.11          3.06
(13) CH3O               244.7           14.06          1.36          0.68
(14) HO                 229.6           12.80          0.96          0.48
(15) Cl                 192.4            9.80          0.51          0.26
(16) NC                 241.0           13.20          0.35          0.18
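The rate coefficients in Table 32 can be cross-checked against the tabulated Arrhenius parameters. The sketch below assumes the usual form k = A·exp(-Eₐ/RT), with log A read as a decimal logarithm; the function name and the choice of rows are illustrative only, and small differences from the tabulated values are to be expected from the rounding of Eₐ and log A.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def arrhenius_k(log10_a: float, ea_kj_mol: float, temp_c: float) -> float:
    """Return k (s^-1) from log10(A / s^-1), Ea (kJ/mol) and T (deg C)."""
    temp_k = temp_c + 273.15
    return 10.0 ** log10_a * math.exp(-ea_kj_mol * 1000.0 / (R * temp_k))


# (Z, Ea in kJ/mol, log A, tabulated 10^4 k1 in s^-1), copied from Table 32
rows = [
    ("H", 241.8, 13.83, 1.34),
    ("CH3", 229.2, 13.44, 4.47),
    ("(CH3)3C", 218.8, 13.08, 11.34),
    ("NC", 241.0, 13.20, 0.35),
]

for z, ea, log_a, k_tab in rows:
    k_calc = 1e4 * arrhenius_k(log_a, ea, 440.0)
    print(f"{z:10s} calc 10^4 k = {k_calc:6.2f}   table = {k_tab:6.2f}")
```

The same function can be applied to any of the tables in this section that list Eₐ and log A together with a rate coefficient at a stated temperature.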
Figure 24. Log (k/k0) for the pyrolysis of ZCH2CH2Cl vs. σ*.

A Taft correlation of log k/k0 for alkyl substituents versus σ* values gave a good straight line with ρ* = -3.58, r = 0.996, at 360 °C. This result suggested that branching of the alkyl group at Cα enhanced HCl elimination through an electron-releasing effect. The negative ρ* hinted at a positive charge at Cα in the transition state. The overall log k/k0 vs. σ* plot was again found to show a change of slope at σ*(CH3) = 0.0, followed by a good straight line with ρ* = -0.46, r = 0.972 at 360 °C. The value of -0.46 for ρ* implied a small polarization of the C-Cl bond in the transition state. The figure found in this work resembled the Taft correlations found in the gas-phase elimination of β-substituted ethyl chlorides,/^^ ZCH2CH2Cl. The bimodal correlation was explained in terms of a slight change in the polarity of the transition state due to changes of electronic transmission at the carbon reaction center. In the case of -R substituents attached to the C-Cl bond (see Table 33), electron delocalization through resonance with the positively charged carbon atom in the transition state was thought to be important. Consequently, they could not be plotted in the Taft figure for CH3CH(Z)Cl. In addition, the log k/k0 values of these substituents were not correlated with σ⁺ values, because σ⁺ values had been defined for substituents on the benzene ring and not adjacent to the reaction site. Equation 67 gives a fair description of all the available substituent effects:

$$\log(k/k_0) = -(2.42 \pm 0.45)\,\sigma_{\alpha} - (3.01 \pm 0.87)\,\sigma_{F} - (12.6 \pm 1.2)\,\sigma_{R^{+}} \tag{67}$$

At 360 °C, r = 0.932, sd = 0.75

It may be compared to Eq. 66 for ZCH2CH2Cl; ρα and ρF are about twice as large as those previously found. However, ρR+ is now about 10 times larger than in Eq. 66, which means an enormous increase in electron demand in the transition state. This result may suggest the onset of a modest "intimate ion-pair" character.

α-Substituted α-chloropropanes: ZC(CH3)2Cl. The Taft plot for the alkyl substituents of tertiary alkyl chlorides [ZC(CH3)2Cl, Z = alkyl] (see Table 34^"^) was a good straight line with a slope ρ* = -4.75, r = 0.993 at 300 °C. The inclusion of data for polar electron-withdrawing substituents (Table 34) also gave rise to an inflexion point, with a second line of ρ* = -0.73, r = 0.912 at 300 °C. This result was found to resemble similar correlations for α- and β-substituted ethyl chlorides, and
the alteration in the polarity of the transition state was rationalized as before.^"^^'^"^^ The datum for the vinyl group was not included in the plot, because the π electrons of the double bond may be delocalized by resonance. The overall regression, Eq. 68, applies to the entire data set:

$$\log(k/k_0) = -(4.28 \pm 0.27)\,\sigma_{\alpha} - (6.75 \pm 0.57)\,\sigma_{F} - (7.7 \pm 1.5)\,\sigma_{R^{+}} \tag{68}$$

At 300 °C, r = 0.986, sd = 0.27

It appears to confirm the trend of increased zwitterionic character of the transition state in the elimination of ZC(CH3)2Cl. The ρR+ value is smaller than in the case of CH3CH(Z)Cl. This is reasonable on account of the saturation induced by the two methyl groups. The size of ρα and ρF clearly suggests a substantial extent of positive charge development in the tertiary chloride.
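The comparisons drawn between Eqs. 66, 67, and 68 can be made explicit by tabulating ratios of the fitted coefficients. The sketch below only rearranges numbers quoted in those equations; the dictionary layout and the labels alpha, F, and R+ (for the σα, σF, and σR+ terms) are naming choices made here, and because the three fits refer to different temperatures the ratios are a qualitative guide at best.

```python
# Magnitudes of the fitted coefficients (rho values) quoted in Eqs. 66-68.
coefficients = {
    "Eq. 66, ZCH2CH2Cl, 440 C": {"alpha": 1.35, "F": 1.86, "R+": 1.14},
    "Eq. 67, CH3CH(Z)Cl, 360 C": {"alpha": 2.42, "F": 3.01, "R+": 12.6},
    "Eq. 68, ZC(CH3)2Cl, 300 C": {"alpha": 4.28, "F": 6.75, "R+": 7.7},
}


def ratios(numerator: dict, denominator: dict) -> dict:
    """Term-by-term ratio of two sets of coefficients."""
    return {term: numerator[term] / denominator[term] for term in numerator}


primary = coefficients["Eq. 66, ZCH2CH2Cl, 440 C"]
for label, rho in coefficients.items():
    relative = ratios(rho, primary)
    print(label, {term: round(value, 2) for term, value in relative.items()})
```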
Table 33. Kinetic Parameters for CH3CH(Z)Cl Pyrolysis, at 360 °C

Z                         Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹    10⁴k(CH3),ᵃ s⁻¹
(1) H                     241.8           13.83             0.008         0.008
(2) CH3                   211.6           13.47             1.02          0.51
(3) CH3CH2                211.7           13.99             3.31          1.42
(4) CH3CH2CH2             212.1           14.10             3.97          1.71
(5) (CH3)2CH              207.8           13.80             4.51          2.07
(6) (CH3)3C               197.3           12.99             5.13          4.62
(7) Cl                    224.0           13.28             0.063         0.032
(8) F                     239.7           14.02             0.018         0.018
(9) Cl3C                  225.1           13.50             0.084         0.057
(10) CH3CO                205.3           11.77             0.068         0.068
(11) CH3OCO               217.0           12.22             0.021         0.021
(12) NC                   236.1           13.45             0.009         0.009
(13) CH3OCOCH2            214.9           13.65             0.83          0.20
(14) CH2=CH               203.8           13.42             4.02          4.02
(15) CH2=C(CH3)           197.4           13.39            12.65         12.65
(16) cis-CH3CH=CH         131.3            9.85          1040          1040
(17) trans-CH3CH=CH       196.8           13.86            41.85         41.85
(18) CH3CH=CHCH=CH        144.5           10.43           320.0         320.0
(19) C6H5                 187.8           12.63            13.62         13.62
(20) CH3O                 139.0           11.44          9400          9400
(21) CH3CH2O              128.2           10.68         12600         12600

Note: ᵃ k value referred to a CH3 group of CH3CH(Z)Cl.
β-Substituted ethyl bromides: ZCH2CH2Br. The kinetic data from the maximally inhibited gas-phase elimination of β-substituted ethyl bromides (ZCH2CH2Br → ZCH=CH2 + HBr) are given in Table 35.^^^'^"^^ The log k/k0 of the electron-releasing alkyl groups listed in Table 35 versus σ* values gave a very good straight line (ρ* = -1.87, r = 0.991 at 400 °C). Yet, the Taft plot of the few electron-withdrawing substituents shown in Table 35 produced an inflexion point at σ*(CH3) = 0.00 and gave another straight line with slope ρ* = -0.24, r = 0.988 at 400 °C. The result of one slope with electron-releasing alkyl groups and another slope at σ*(CH3) = 0.00 with CN-substituted groups resembled the Taft correlation in the gas-phase pyrolysis of alkyl and polar substituted primary,/"^^ secondary,/"^^ and tertiary^"^ alkyl chlorides. A similar explanation has been given for the change of the transition state. The overall Eq. 69 is:

$$\log(k/k_0) = -(1.19 \pm 0.10)\,\sigma_{\alpha} - (1.32 \pm 0.16)\,\sigma_{F} - (4.56 \pm 0.90)\,\sigma_{R^{+}} \tag{69}$$

At 400 °C, r = 0.991, sd = 0.07

This equation is comparable to Eq. 66 for the homologous chlorides, except for the larger size of ρR+ for the bromides.

α-Substituted ethyl bromides: CH3CH(Z)Br. The data for the gas-phase elimination kinetics of several secondary alkyl bromides are given in Table 36.^"^^ Data for alkyl substituents Z could not be correlated, because the corresponding olefin products are subject to HBr-catalyzed isomerization. The absence of kinetic control prevented an adequate rationalization of the factor by which alkyl Z in CH3CH(Z)Br affected the direction of elimination. Yet, an
Table 34. Kinetic Parameters for ZC(CH3)2Cl Pyrolysis, at 300 °C

Z                Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹    10⁴k₁ (1-olefin), s⁻¹
(1) H            213.8           13.64           0.017          0.017
(2) CH3          188.2           13.77           4.13           2.75
(3) CH3CH2       184.1           13.77           9.76           6.54
(4) (CH3)2CH     175.3           13.33          22.47          17.77
(5) (CH3)3C      171.5           13.80         147.19         147.19
(6) CH2=CH       178.2           13.30          11.48          11.48
(7) CH3CO        190.8           12.56           0.15           0.15
(8) CH3OCO       215.2           13.81           0.016          0.016
(9) ClCH2        207.6           14.29           0.23           0.23
(10) Cl          199.0           12.88           0.055          0.0275
Table 35. Kinetic Parameters for ZCH2CH2Br Pyrolysis, at 400 °C

Z                      Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹
(1) H                  224.6           13.19          0.58
(2) CH3                212.1           12.90          2.75
(3) CH3CH2             212.9           13.18          4.57
(4) CH3CH2CH2          211.2           13.09          5.01
(5) CH3CH2CH2CH2       211.2           13.14          5.62
(6) (CH3)2CHCH2        209.3           13.04          6.31
(7) (CH3)2CH           214.0           13.57          9.12
(8) CH3CH2CH(CH3)      220.1           13.94          7.24
(9) (CH3)3C            214.8           13.66          9.77
(10) NC                231.9           13.56          0.36
(11) NCCH2             233.5           14.12          1.00
(12) NCCH2CH2          225.7           13.79          1.86
increase in the electron-releasing effect of Z (Table 36) produced a small but significant increase in the overall rate of the dehydrobromination process. The Taft-Topsom treatment could not be fully implemented because of the limitations described above. However, the treatment of the overall rate of HBr elimination leads to Eq. 70:

$$\log(k/k_0) = -(3.81 \pm 0.07)\,\sigma_{\alpha} - (14.74 \pm 0.53)\,\sigma_{R^{+}} \tag{70}$$

At 340 °C, r = 0.999, sd = 0.02

ρα and ρR+ are comparable to those found in the study of CH3CH(Cl)Z, which means that charge development is similar. Notice that the data set includes hydrogen and alkyl groups only. Therefore, great circumspection is needed in the interpretation of the results.

Aromatic Halides

Substituted α-phenylethyl chlorides, ZC6H4CH(Cl)CH3, pyrolyses were reported to be homogeneous and unimolecular, and to obey a first-order rate law.^"^^ The Arrhenius parameters are listed in Table 37. Log k/k0, when plotted against the σ⁺ values of Brown and Okamoto, yielded a straight line of slope ρ⁺ = -1.36 at 335 °C with r = 0.998 and sd = 0.05. According to this result, a moderate degree of charge separation in the transition state was suggested.

D. Carbonates

Carbonates with at least one β-hydrogen atom are known to decompose at high temperature to olefins, alcohols (or phenols), and carbon dioxide. The commonly
Table 36. Kinetic Parameters for CH3CH(Z)Br Pyrolysis, at 340 °C

Z                     Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹
(1) H                 224.6           13.19          0.012
(2) CH3               199.9           13.62          3.89
(3) CH3CH2            188.3           13.04          9.88
(4) CH3CH2CH2         187.0           13.08         14.13
(5) CH3CH2CH2CH2      185.7           13.08         18.20
(6) (CH3)2CHCH2       183.4           13.08         28.38
accepted mechanism is a six-membered transition state with the initial formation of the unstable intermediate bicarbonate, which decomposes extremely rapidly to give alcohol (or phenol) and CO2 (reaction 71).

ROC(O)O-C-C-H → ROC(O)OH + C=C;   ROC(O)OH → ROH + CO2   (71)

Aryl-Substituted Carbonates
α-Arylethyl methyl carbonates: CH3OCOOCH(C6H4Z)CH3. The rates for the pyrolysis of the meta and para isomers of α-arylethyl methyl carbonates (Table 38)^116a were correlated with the equation log k/k0 = ρ⁺σ⁺ using standard σ⁺ values. The result of ρ⁺ = -0.71 suggested that charge separation occurs in the transition state. In addition, LFER involving ortho substituents were generated in order to demonstrate that proximity effects in the gas phase are constant or perhaps even negligible. This work defined several of these ortho σ⁺ values. However, later work on the pyrolysis of several additional substituted α-arylethyl methyl carbonates^^^ (Table 39) indicated, contrary to the previous paper,/^^^ that there is no correlation of rate with activation energy, so that an LFER may be a fortuitous result and the determination of ortho σ⁺ values cannot be justified. On this basis,^^^ ortho substituents were excluded in the correlation of the data given in Table 38. Using only data for meta and para substituents, an excellent correlation (Eq. 72) is obtained.

Table 37. Kinetic Parameters for ZC6H4CH(Cl)CH3 Pyrolysis, at 335 °C

Z            Eₐ, kJ mol⁻¹    log A, s⁻¹    k_Z/k_H
(1) 4-CH3    199.1           14.1          2.89
(2) 4-F      190.3           12.9          1.27
(3) 3-CH3    178.2           12.0          1.46
(4) H        187.0           12.5          1.00
(5) 4-Cl     184.5           12.2          0.77
(6) 4-Br     184.5           12.2          0.72
(7) 4-CN     196.6           12.5          0.14

Table 38. Kinetic Parameters for CH3OCOOCH(C6H4Z)CH3 Pyrolysis, at 307.2 °C

Z              Eₐ, kJ mol⁻¹    10ⁿk₁, s⁻¹
(1) H          172.3            18.5
(2) 4-OCH3     171.9           106
(3) 2-OCH3     158.5            39.2
(4) 4-CH3      171.5            39.2
(5) 2-CH3      178.6            32.2
(6) 3-CH3      184.9            —
(7) 2-F        —                 9.65
(8) 4-F        —                19.4
(9) 4-Br       —                13.5
(10) 2-Br      —                 7.35
(11) 3-Br      184.1            —
(12) 4-Cl      —                15.2
(13) 2-Cl      179.5             7.54
(14) 3-Cl      184.1             8.42
(15) 4-NO2     177.4            —
(16) 2-NO2     172.3            —
(17) 3-NO2ᵃ    174.4             4.26

Note: ᵃ Value taken from ref. 116b.
Table 39. Kinetic Parameters for CH3OCOOCH(C6H4Z)CH3 Pyrolysis, at 306.7 °C

Z           10ⁿk₁, s⁻¹
(1) H       22.3
(2) 4-I     17.3
(3) 2-I      9.8
(4) 3-F     11.2
(5) 3-Br    11.0
(6) 3-I     10.35
$$\log(k/k_0) = -(0.88 \pm 0.04)\,\sigma^{+} \tag{72}$$

At 307.2 °C, r = 0.991, sd = 0.06

α-Arylethyl phenyl carbonates: C6H5OCOOCH(C6H4Z)CH3. The data for the thermal decomposition of phenyl α-arylethyl carbonates (Table 40)^"^^ were found to correlate with σ⁺ values, giving ρ⁺ = -0.84 at 600 K (327 °C). According to this result, the transition state for the phenyl carbonate was found to be slightly more polar than that of the corresponding acetate pyrolysis (ρ⁺ = -0.66). The greater polarity of the carbonate transition state was confirmed by the kinetic isotope effect, which is smaller for α-phenylethyl phenyl carbonate (kH/kD = 2.11 at 600 K) than for α-phenylethyl acetate (kH/kD = 2.32 at 600 K). The Hammett correlation was reported to give slight curvature, attributable to -R substituents activating more, and +R substituents deactivating less, than expected. Similar behavior is met in the pyrolysis of α-arylethyl acetates.^^^'^-^^ The same pattern was also observed within a series of monosubstituted α-arylethyl acetates, where the ρ factor was believed to be an average value representing the range of transition state structures. A better correlation was found by the use of the Y-T Eq. 10.^^ Values of -0.80 for ρ and 1.3 for r⁺ were obtained. The presence of the phenoxy group must favor Cα-O bond polarization in the sense Cαδ+····Oδ- in the transition state. Consequently, it has to be more polar than the corresponding acetate. The positive carbon is therefore expected to be stabilized or destabilized through delocalization to substituents in the benzene ring, as is indeed the case. Notice that a straightforward correlation of log k/k0 with σ⁺ already provides a very good correlation (Eq. 73).

$$\log(k/k_0) = -(0.82 \pm 0.02)\,\sigma^{+} \tag{73}$$
At 327 °C, r = 0.998, sd = 0.03

Table 40. Kinetic Parameters for C6H5OCOOCH(C6H4Z)CH3 Pyrolysis, at 600 K (327 °C)

Z            Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹
(1) 4-CH3    166.7           13.17         451.0
(2) 3-CH3    165.9           12.89         277.9
(3) H        170.7           13.21         221.7
(4) 4-Cl     168.0           12.90         186.6
(5) 3-Cl     171.1           12.94         109.9
(6) 4-CF3    173.8           12.99          71.8
(7) 3-NO2    173.1           12.81          54.6
(8) 4-NO2    175.9           13.05          54.1
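The Yukawa-Tsuno (Y-T) treatment mentioned above interpolates between a σ° and a σ⁺ correlation. Eq. 10 is not reproduced in this section, so the sketch below assumes the widely used form log(k/k0) = ρ[σ° + r⁺(σ⁺ - σ°)]; the function name and the two substituent constants passed to it are hypothetical placeholders chosen for illustration, not values taken from this chapter.

```python
def yukawa_tsuno_log_ratio(rho: float, r_plus: float,
                           sigma0: float, sigma_plus: float) -> float:
    """log10(k/k0) = rho * (sigma0 + r_plus * (sigma_plus - sigma0))."""
    return rho * (sigma0 + r_plus * (sigma_plus - sigma0))


# Fitted parameters quoted in the text for the phenyl carbonates.
RHO, R_PLUS = -0.80, 1.3

# Hypothetical constants for a resonance-donating substituent (placeholders).
sigma0_demo, sigma_plus_demo = -0.10, -0.30

log_ratio = yukawa_tsuno_log_ratio(RHO, R_PLUS, sigma0_demo, sigma_plus_demo)
print(f"log(k/k0) = {log_ratio:.3f}, k/k0 = {10 ** log_ratio:.2f}")

# Limiting cases: r+ = 0 reduces to rho*sigma0, r+ = 1 to rho*sigma+.
print(yukawa_tsuno_log_ratio(RHO, 0.0, sigma0_demo, sigma_plus_demo))
print(yukawa_tsuno_log_ratio(RHO, 1.0, sigma0_demo, sigma_plus_demo))
```

An r⁺ above 1, as quoted here, means that the resonance demand in this assumed form exceeds that of the σ⁺ reference scale.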
Ethyl substituted-phenyl carbonates: ZC6H4OCOOCH2CH3 and ethyl substituted-benzyl carbonates: ZC6H4CH2OCOOCH2CH3. The pyrolyses of a series of ethyl ring-substituted phenyl and benzyl carbonates in the gas phase were carried out at one temperature only (Tables 41 and 42).^^^ The results for substituted phenyl ethyl carbonates (Table 41) suggested that the rate of elimination is retarded by electron-releasing substituents and accelerated by electron-withdrawing substituents. Log k/k0 was found to be better described by Taft's σ° (ρ° = 0.20 at 363.4 °C) than by Hammett's σ values. Substituent effects on the gas-phase pyrolysis of benzyl ethyl carbonates at 384.5 °C were very small (Table 42). The rate data were believed to follow a σ° relation. However, most of these values fall within the experimental errors. This fact was attributed to the insulation due to the interposition of a methylene group. The treatment of the data given in Table 41 leads to Eq. 74:

$$\log(k/k_0) = (0.19 \pm 0.01)\,\sigma^{\circ} \tag{74}$$

At 363.4 °C, r = 0.990, sd = 0.01
Table 41. Kinetic Parameters for ZC6H4OCOOCH2CH3 Pyrolysis, at 363.4 °C

Z                 10ⁿk₁, s⁻¹
(1) 4-NH2         10.7
(2) 4-OCH3        11.3
(3) 4-C(CH3)3     11.3
(4) 4-CH3         11.5
(5) 3-CH3         11.7
(6) H             12.1
(7) 2-OCH3        12.3
(8) 3-OCH3        12.3
(9) 4-C6H5        12.7
(10) 3,4-C6H4     13.0
(11) 2-CH3        13.2
(12) 4-F          13.2
(13) 4-Cl         13.6
(14) 3-Cl         14.1
(15) 2-Cl         16.1
(16) 4-NO2        16.7
(17) 3-NO2        17.2
(18) 2-NO2        35.6
Table 42. Kinetic Parameters for ZC6H4CH2OCOOCH2CH3 Pyrolysis, at 384.5 °C

Z              10ⁿk₁, s⁻¹
(1) 2-OCH3     25.7
(2) 3-OCH3     25.7
(3) H          26.3
(4) 4-CH3      26.4
(5) 2-CH3      26.7
(6) 3-CH3      26.7
(7) 4-OCH3     26.9
(8) 3-Cl       28.3
(9) 4-Cl       28.4
(10) 2-Cl      29.0
(11) 4-NO2     29.8
(12) 3-NO2     30.4
(13) 2-NO2     64.9
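The statement that substituent effects in the benzyl series are very small can be quantified from the tables themselves. The sketch below compares the spread of log k for meta and para substituents in Tables 41 and 42; the ortho entries and the 3,4-fused entry of Table 41 are left out, and the variable and function names are illustrative.

```python
import math

# Rate coefficients for meta and para substituents, copied from Tables 41
# and 42 (each table is on its own scale; ortho entries are omitted).
phenyl = {"4-NH2": 10.7, "4-OCH3": 11.3, "4-C(CH3)3": 11.3, "4-CH3": 11.5,
          "3-CH3": 11.7, "H": 12.1, "3-OCH3": 12.3, "4-C6H5": 12.7,
          "4-F": 13.2, "4-Cl": 13.6, "3-Cl": 14.1, "4-NO2": 16.7,
          "3-NO2": 17.2}
benzyl = {"3-OCH3": 25.7, "H": 26.3, "4-CH3": 26.4, "3-CH3": 26.7,
          "4-OCH3": 26.9, "3-Cl": 28.3, "4-Cl": 28.4, "4-NO2": 29.8,
          "3-NO2": 30.4}


def log_spread(rates: dict) -> float:
    """Range of log10(k) across a series, i.e. log10(k_max / k_min)."""
    return math.log10(max(rates.values()) / min(rates.values()))


print(f"phenyl ethyl carbonates: {log_spread(phenyl):.3f}")
print(f"benzyl ethyl carbonates: {log_spread(benzyl):.3f}")
```

Because only ratios within each table enter the calculation, the unknown scale factor of the rate columns does not affect the result.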
The range of experimental rates for ZC6H4CH2OCOOCH2CH3 (Table 42) is small and, as discussed by the authors,/^^ the values themselves may be unreliable. This fact is possibly reflected by the low quality of the correlations obtained.

O-Ethyl S-aryl thiocarbonates: ZC6H4SCOOCH2CH3. Substituted O-ethyl S-aryl thiocarbonates^^^ (Table 43) were found to be less reactive than the corresponding ethyl aryl carbonates.^^^ The relative rates derived from Table 43 at 700 K (427 °C) showed a good correlation with σ° values, with ρ° = 0.26. The ρ° value estimated at 600 K (327 °C) is 0.30, larger than the value of 0.20 for carbonates, suggesting that sulfur is a better transmitter of substituent effect than
Table 43. Kinetic Parameters for ZC6H4SCOOCH2CH3 Pyrolysis, at 700 K (427 °C)

Z             Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹
(1) H         195.3           12.74         145.6
(2) 4-OCH3    194.1           12.58         123.8
(3) 2-OCH3    192.0           12.40         117.4
(4) 4-CH3     196.6           12.81         136.8
(5) 4-Cl      191.3           12.49         162.8
(6) 3-Cl      189.0           12.37         183.4
(7) 4-NO2     189.2           12.48         228.3
oxygen. If sulfur is to be a better transmitter than oxygen towards the benzene ring, then σ° should be a less appropriate parameter than Hammett σ values. Although differences in the estimations are small, they are significant. The correlations with σ° and σ values, Eqs. 75 and 76, are as follows:

$$\log(k/k_0) = (0.24 \pm 0.005)\,\sigma^{\circ} \tag{75}$$

At 427 °C, r = 0.989, sd = 0.02

$$\log(k/k_0) = (0.25 \pm 0.009)\,\sigma \tag{76}$$

At 427 °C, r = 0.996, sd = 0.01

E. Carbamates

The rate constants for the pyrolyses of tert-butyl N-aryl carbamates at 548.8 K (275.8 °C) are given in Table 44.^^^ Log k/k0 correlates linearly with σ° values (ρ° = 0.55), and with the rate of thermolysis in diphenyl ether at 450.7 K (177.7 °C). The data of Table 44 suggested that electron-withdrawing substituents enhanced the rate, while electron-donating substituents decreased the k values. Consequently, this result led to the suggestion that a negative charge is developing at the γ-carbon atom (reaction 77).
ZC6H4NH-Cγ(=O)-O-Cα-Cβ-H → ZC6H4NHCOOH + Cα=Cβ;   ZC6H4NHCOOH → ZC6H4NH2 + CO2   (77)
The similar values of the ρ factor for the gas-phase and diphenyl ether pyrolyses led us to believe that the substituent effect (and therefore the ρ factor) in the elimination reaction is largely independent of solvation. The data of this work compared with those for carbonates indicated that the NH group may be more effective at transmitting the conjugative (resonance) effect than O. Additional consideration was given to the fact that the Hammett ρ value depends not only upon the charge developing at the reaction center but also upon the ability of the system
Table 44. Kinetic Parameters for ZC6H4NHCOOC(CH3)3 Pyrolysis, at 548.8 K (275.8 °C)

Z             Eₐ, kJ mol⁻¹    log A, s⁻¹    10ⁿk₁, s⁻¹
(1) H         166.9           13.8          129
(2) 3-CH3     —               —             121
(3) 4-CH3     —               —             110.5
(4) 3-OCH3    —               —             139.5
(5) 4-OCH3    —               —             107
(6) 4-F       —               —             157
(7) 4-Cl      —               —             182
(8) 3-NO2     —               —             328
(9) 4-NO2     —               —             482
to transmit an electronic effect. In comparing the ρ factor between the carbamates and benzoates, the transition state of the latter is less polar. As already advanced,/^^ a very good correlation is obtained with σ° values (Eq. 78).

$$\log(k/k_0) = (0.63 \pm 0.02)\,\sigma^{\circ} \tag{78}$$

At 275.8 °C, r = 0.993, sd = 0.03

Since the development of negative charge at the carbamoyl moiety in the transition state is a determining factor, the σ⁻ parameters yield a fair relationship (Eq. 79).

$$\log(k/k_0) = (0.48 \pm 0.03)\,\sigma^{-} \tag{79}$$

At 275.8 °C, r = 0.982, sd = 0.05

F. Thionocarbamates

The kinetics of the thermal rearrangement of thionocarbamates, R2NC(S)OC6H4Z, to thiolcarbamates, R2NC(O)SC6H4Z, in the absence of solvent were determined at 170 °C or 190 °C by following the reaction by UV spectrophotometry.^^"^ The reaction was found to be first order, and the k values shown in Table 45 indicated the rearrangement process to be favored by electron-withdrawing substituents. Use of the Y-T equation (Eq. 11)^^ with ρ = 1.92 and r⁻ = 1.60 led to an excellent correlation of log k/k0. The large positive r⁻ value suggested that the transition state may be greatly stabilized by an additional resonance effect of the para substituent. The influence of the alkyl group R in 4-nitrophenyl N,N-dialkyl thionocarbamates at 170 °C is also shown in Table 45.
Table 45. Kinetic Parameters for R2NC(S)OC6H4Z Pyrolysis, at 170 and 190 °C

                                  10ⁿk, min⁻¹
R               Z                190 °C     170 °C
(1) CH3         4-Cl              0.07      —
(2) CH3         3-Cl              0.07      —
(3) CH3         3-CF3             0.09      —
(4) CH3         4-COOC2H5         0.54      —
(5) CH3         4-COCH3           1.24      —
(6) CH3         4-CN              2.67      —
(7) CH3         3-NO2             0.32      —
(8) CH3         4-SO2CH3          2.71      —
(9) CH3         4-NO2            18.12      4.44
(10) C2H5       4-NO2             —         7.47
(11) n-C3H7     4-NO2             —         7.71
(12) i-C4H9     4-NO2             —         9.41
For these substituents, plots of log k versus Taft σ* values gave a good straight line with ρ* = 1.09. The electron-releasing effect of the R group appeared to enhance the nucleophilic character of the sulfur atom, thus increasing the rate of rearrangement in the order i-C4H9 > n-C3H7 > C2H5 > CH3. This rearrangement was assumed to proceed through a four-membered cyclic transition state resulting from nucleophilic attack of the sulfur atom on C1 of the aromatic ring (reaction 80).

R2NC(S)OC6H4Z → R2NC(O)SC6H4Z   (80)

G. β-Hydroxyolefins

Pyrolyses of β-hydroxyolefins are known to proceed through a six-membered cyclic transition state, as described below in reaction 81:

R2C(OH)CH2CH=CR2 → R2CO + CH2=CH-CR2H   (81)
3-Aryl-3-buten-1-ols: HOCH2CH2C(C6H4Z)=CH2. The results on the pyrolyses of 3-aryl-3-buten-1-ols (Table 46)^^"^ appeared to add further support to the six-membered cyclic transition state involved in the elimination process of β-hydroxyolefins in the gas phase.^^^ The Hammett plot of log k/k0 versus σ values was shown to be approximately linear with ρ = -0.59 at 619 K (346 °C). The small ρ value suggested a modest substituent effect, with little charge being developed at the 3-position. The authors were of the belief that conjugation of the olefin from pyrolysis and the acidity of the alcohol are the major factors controlling the rate of elimination. If HOCH2····CH2 bond polarization is rate-determining, the C3 bearing the aromatic ring may acquire a positive charge. This means that substituent Z may affect the rate, and the more appropriate parameter should be the σ⁺ value. In this sense, an acceptable correlation is obtained (Eq. 82).

$$\log(k/k_0) = -(0.53 \pm 0.10)\,\sigma^{+} \tag{82}$$

At 346 °C, r = 0.926, sd = 0.05

Perhaps a reason for the small r is the possibility that the styrene product polymerizes slightly during the reaction.

1-Aryl-3-buten-1-ols: ZC6H4CHOHCH2CH=CH2, and 1-aryl-3-butyn-1-ols: ZC6H4CHOHCH2C≡CH. The pyrolyses of several 1-aryl-3-buten-1-ols, ZC6H4CHOHCH2CH=CH2, in a static system were found to be first-order, homogeneous, unimolecular elimination processes at 610-644 K (337-371 °C) (Table 47).^^^ The Hammett ρσ plots yielded a small value of ρ = -0.26, suggesting a minor substituent effect for the meta and para isomers. Consequently, little or no charge appeared to have developed at the C1 position in the transition state. Later work on the thermolyses of 1-aryl-3-butyn-1-ols, ZC6H4CHOHCH2C≡CH, and 1-aryl-3-buten-1-ols, ZC6H4CHOHCH2CH=CH2, in dilute solution in an inert solvent^^^ led to the conclusion that this reaction proceeds through an intramolecular six-membered cyclic transition state analogous to that of the corresponding olefinic system in the gas phase.^^^ The experimental results of this work^^^ showed that aryl substituents at the C1 position of both systems gave almost identical rate accelerations relative to the corresponding phenyl derivatives. The Hammett equation for both reactions gave a ρ value of -0.35 over the range 250-280 °C. The rationale for this lack of electronic differences in rate accelerations in the two systems is that of steric factors. Furthermore, because of the high nucleophilicity of the oxygen of the OH, the bond polarization ZC6H4C(OH)····CH2 appeared to overshadow the effect of any substituent in the aromatic ring. The Hammett correlation for the few available substituents in 1-aryl-3-buten-1-ols (Table 47) is Eq. 83:

$$\log(k/k_0) = -(0.37 \pm 0.04)\,\sigma \tag{83}$$

At 346 °C, r = 0.975, sd = 0.02
Table 46. Kinetic Parameters for HOCH2CH2C(C6H4Z)=CH2 Pyrolysis, at 619 K (346 °C)

Z             Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹
(1) H         148.5           10.1          39.8
(2) 4-F       151.8           10.4          33.4
(3) 3-CH3     149.8           10.2          38.9
(4) 4-CH3     153.5           10.6          44.4
(5) 3-OCH3    153.9           10.5          31.5
(6) 3-Br      162.7           11.0          18.8
(7) 4-Cl      160.6           11.0          28.2
H. α-Keto Acids

Taylor^^^ was able to correlate the relative rates of a few substituents in the pyrolysis of α-keto acids, ZCαOCβOOH → ZCOH + CO2, at 600 K (327 °C) (Z = CH3, C6H5, OH), using σI values. According to his results, a negative charge appeared to develop at the α-carbon in the transition state, which may be stabilized by the inductive/field effect of the Z group. The sequence of the relative rates was found to be 1:19:145 for pyruvic, benzoylformic, and oxalic acid, respectively. This notwithstanding, the data set has only three points and no firm conclusions can be drawn.

I. Methanesulfonates

The kinetic parameters for the homogeneous, unimolecular gas-phase elimination of alkyl and polar β-substituted ethyl methanesulfonates, CH3SO3CH2CH2Z,
Table 47. Kinetic Parameters for ZC6H4CHOHCH2CH=CH2 Pyrolysis, at 619 K (346 °C)

Z             Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹
(1) H         151.4           10.8          105.1
(2) 4-CH3     146.0           10.4          119.5
(3) 4-Cl      151.0           10.7           90.2
(4) 4-OCH3    148.9           10.7          135.7
(5) 3-OCH3    154.4           11.0           93.0
(6) 3-CH3     146.0           10.4          119.5
(7) 2-CH3     152.7           11.0          129.3
(8) 2-Cl      160.6           11.6          110.9
(9) 2-OCH3    131.3            9.1          104.2
are given in Table 48.^^^ The log kZ/kCH3 of alkyl substituents versus Taft σ* values gave an approximate straight line with ρ* = -0.82, r = 0.967, at 320 °C. This result suggested that branching of alkyl groups increased the rates owing to the electron-releasing effect. However, the plot of polar substituents gave an inflexion point at σ*(CH3) = 0.00 and another good straight line with ρ* = -0.29, r = 0.994, at 320 °C (Figure 25). The value of ρ* = -0.29 was thought to be due to a very small polarization of the C···O bond in the transition state. The occurrence of one slope with the alkyl electron-releasing groups and another slope at σ*(CH3) = 0.00 with polar electron-withdrawing groups was explained in terms of a slight change in the polarity of the transition state due to changes in electron transmission at the positive carbon reaction center. The authors suggested that a simultaneous effect may be operating at the transition state during the process of pyrolysis, especially with polar electron-withdrawing substituents. In other words, as the polar groups decrease the reaction rates, the hydrogen adjacent to Z in ZCH2CH2OSO2CH3 becomes more acidic and thus assists the leaving CH3SO3 group. The points for C6H5 (Table 48, entry 13) and C6H5CH2CH2 (Table 48, entry 15) lie far above the line. These two apparent exceptions were attributed to the anchimeric assistance of the phenyl group at the β- and δ-positions (with rearrangement).
Table 48. Kinetic Parameters for CH3SO3CH2CH2Z Pyrolysis, at 320 °C

Z                      Eₐ, kJ mol⁻¹    log A, s⁻¹    10⁴k₁, s⁻¹    10⁴k_H, s⁻¹
(1) H                  171.7           12.18         11.48          3.83
(2) CH3                171.6           12.36         17.78          8.89
(3) CH3CH2             168.7           12.16         19.95          9.98
(4) CH3CH2CH2          169.4           12.25         21.38         10.69
(5) CH3CH2CH2CH2       168.9           12.21         21.38         10.69
(6) (CH3)2CH           174.7           12.74         22.39         11.20
(7) CH3CH2CH(CH3)      167.9           12.28         30.90         15.45
(8) (CH3)3C            165.2           12.14         38.90         19.45
(9) Br                 172.8           11.70          3.02          1.51
(10) Cl                173.9           11.67          2.25          1.13
(11) CH3CH2O           167.3           11.52          6.03          3.02
(12) ClCH2             171.7           12.01          7.70          3.85
(13) C6H5              167.1           12.18         28.84         14.42
(14) C6H5CH2           167.1           11.87         14.17          7.09
(15) C6H5CH2CH2        168.9           12.33         28.18         14.09
(16) CH3OCH2           163.3           11.50         13.21          6.61
(17) ClCH2CH2          166.1           11.78         14.03          7.02
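The last column of Table 48 appears to be the overall rate coefficient divided by the number of equivalent β-hydrogens available for elimination (three for Z = H, two otherwise). This reading is inferred from the tabulated numbers rather than stated in the text, so the sketch below should be taken as a consistency check under that assumption; the function name is illustrative.

```python
def per_hydrogen(k_overall: float, n_beta_h: int) -> float:
    """Statistically corrected rate coefficient per beta-hydrogen."""
    return k_overall / n_beta_h


# (Z, 10^4 k1, assumed number of beta-hydrogens, tabulated 10^4 kH),
# with the rate data copied from Table 48
rows = [
    ("H", 11.48, 3, 3.83),
    ("CH3", 17.78, 2, 8.89),
    ("(CH3)3C", 38.90, 2, 19.45),
    ("Cl", 2.25, 2, 1.13),
]

for z, k1, n_h, k_h_tab in rows:
    print(f"{z:8s} k1/n = {per_hydrogen(k1, n_h):6.2f}   table = {k_h_tab:6.2f}")
```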
Figure 25. Log (k/k0) for the pyrolysis of CH3SO3CH2CH2Z vs. σ*.
The mechanism of this reaction was considered to be very polar in nature and to proceed through an intimate ion-pair intermediate. Here, the general Taft-Topsom treatment leads to Eq. 84:

$\log (k/k_0) = -(0.85 \pm 0.06)\,\sigma_\alpha - (1.90 \pm 0.20)\,\sigma_F - (0.14 \pm 0.33)\,\sigma_R$    (84)

At 320 °C, r = 0.964, sd = 0.10.

In view of the small size of ρ_R and its large uncertainty, the treatment was further limited to two parameters, leading to Eq. 85:

$\log (k/k_0) = -(0.85 \pm 0.05)\,\sigma_\alpha - (1.90 \pm 0.15)\,\sigma_F$    (85)

At 320 °C, r = 0.964, sd = 0.10.
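As a rough illustration of how the two-parameter correlation (Eq. 85) is applied, the sketch below evaluates it for a single substituent. It assumes that the two terms are the polarizability (σ_α) and field (σ_F) constants of the Taft-Topsom scheme. The function name and the numerical substituent constants passed to it are illustrative placeholders, not values taken from this chapter.

```python
def log_relative_rate(sigma_alpha: float, sigma_f: float,
                      rho_alpha: float = -0.85, rho_f: float = -1.90) -> float:
    """Two-parameter correlation: log(k/k0) = rho_alpha*sigma_alpha + rho_f*sigma_f."""
    return rho_alpha * sigma_alpha + rho_f * sigma_f


# Placeholder constants for a polarizable, non-polar substituent (assumed values)
example = log_relative_rate(sigma_alpha=-0.35, sigma_f=0.00)
print(f"log(k/k0) = {example:.3f}")
```

Because both coefficients are negative, a more polarizable substituent (more negative σ_α) raises the predicted rate, while a stronger field effect (more positive σ_F) lowers it.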
Table 49. Kinetic Parameters for Z(CH3)2COH Pyrolysis, Catalyzed by HCl, at 430 °C

Z                 Ea, kJ mol⁻¹   log A, s⁻¹   k, cm³ mol⁻¹ s⁻¹
(1) H             -              -            3.1
(2) CH3           136.8          12.30        136.5
(3) CH3CH2        142.2          12.83        183.7
(4) CH3CH2CH2     145.3          13.26        290.9
In spite of an activation energy lower than that for the decomposition of the methanesulfonates, the ρ_α and ρ_F values are similar.

J. Alcohols
The pyrolysis of alcohols has been found to be complicated and to proceed by mechanisms ranging from radical chain to molecular. In this sense, few alcohols have been shown to undergo acid-catalyzed homogeneous unimolecular elimination in the gas phase. The elimination kinetics of several tertiary alcohols catalyzed by hydrogen chloride between 385 and 445 °C are reported in Table 49. Some of these decompositions were also catalyzed by hydrogen bromide at 320-392 °C (Table 50). According to the data reported in these two tables, an increase in the bulkiness of the alkyl group gave rise to a significant increase in the overall rate. The effect of alkyl substituents in these tertiary alcohols could not be correlated because the olefin products are subject to isomerization in the presence of HX. Consequently, this pyrolysis reaction did not proceed under kinetic control. Although no such correlation exists, a plot of log k for HCl catalysis (Table 49) against log k for HBr catalysis (Table 50) gives an excellent straight line with a slope of 1.17 ± 0.01, r = 0.999, and sd = 0.05. This result suggests that these tertiary alcohols decompose through the mechanism in Scheme 2.
Scheme 2. HX-catalyzed elimination of water from Z(CH3)2COH, giving the olefin Z(CH3)C=CH2, H2O, and HX.    (86)
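The linear relation between the HCl- and HBr-catalyzed rates can be checked from the four substituents common to Tables 49 and 50. The sketch below is an illustration only: it assumes the rate coefficients as read from those tables (3.1, 136.5, 183.7, 290.9 for HCl; 2.7, 200, 311, 631 for HBr) and uses a plain least-squares fit written out by hand. With log k (HBr) as ordinate and log k (HCl) as abscissa, the fit returns a slope close to the quoted 1.17 and r close to 0.999.

```python
import math

# Rate coefficients for the substituents common to Tables 49 and 50
k_hcl = {"H": 3.1, "CH3": 136.5, "CH3CH2": 183.7, "CH3CH2CH2": 290.9}  # HCl, 430 deg C
k_hbr = {"H": 2.7, "CH3": 200.0, "CH3CH2": 311.0, "CH3CH2CH2": 631.0}  # HBr, 350 deg C


def linear_fit(xs, ys):
    """Ordinary least squares; returns (slope, intercept, correlation coefficient)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)


substituents = list(k_hcl)
log_k_hcl = [math.log10(k_hcl[z]) for z in substituents]
log_k_hbr = [math.log10(k_hbr[z]) for z in substituents]

slope, intercept, r = linear_fit(log_k_hcl, log_k_hbr)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.3f}")
```

Exchanging the two axes gives the reciprocal slope (about 0.85), so the direction of the regression matters when comparing with the quoted value.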
Table 50. Kinetic Parameters for Z(CH3)2COH Pyrolysis, Catalyzed by HBr, at 350 °C

Z                     Ea, kJ mol⁻¹   log A, s⁻¹   k, cm³ mol⁻¹ s⁻¹
(1) H                 138.0          12.00        2.7
(2) CH3               127.2          12.96        200
(3) CH3CH2            113.4          12.00        311
(4) (CH3)2CH          115.2          12.26        398
(5) CH3CH2CH2         112.5          12.23        631
(6) CH3CH2CH2CH2      107.4          11.77        589
K. Addition of Ketene to Carboxylic Acids

The homogeneous gas-phase reactions of ketene with halogenated and unsaturated carboxylic acids, including formic acid, were found to be first order in ketene and in acid, and to yield the corresponding anhydride as the only product (refs. 162, 163): CH2=C=O + ZCOOH → CH3CO-O-COZ. Second-order rate constants were calculated from initial rates or from integrated second-order plots. The mean values of the rate constants for all reactions, measured at 155 °C, are listed in Table 51. A Taft plot of log k against σ* gave a ρ* value of 0.43 when the polyhalogenated and highly branched alkyl substituents were excluded. Consideration of the A factor (about 10⁶ l mol⁻¹ s⁻¹ for ketene-acid addition, which is the reverse of the anhydride decomposition, with an A factor around 10^[illegible]) and the close similarity of these reactions to ester pyrolysis favor a concerted but nonsynchronous six-membered cyclic transition state, as portrayed in Scheme 3. The effect of substituents at the γ-carbon suggested that electron-withdrawing groups Z tend to stabilize the transition state. This interesting second-order gas-phase reaction seems to show that the main factor for a possible correlation is σ_F. However, we note with concern that this gas-phase process was studied at a temperature well below the boiling point of some
Scheme 3. Six-membered cyclic transition state for the addition of ketene to ZCOOH, giving CH3CO-O-COZ.    (87)
Table 51. Kinetic Parameters for ZCOOH + CH2=C=O Pyrolysis, at 155 °C

Z                     10^[illegible] k, s⁻¹
(1) H                 8.98
(2) CH3               3.83
(3) CH3CH2CH2         4.21
(4) CH3CH2CH2CH2      4.73
(5) (CH3)2CH          5.88
(6) (CH3)3C           8.35
(7) (CH3)3CCH2        6.11
(8) CH2=CH            6.35
(9) (CH3)2C=CH        6.65
(10) CH≡C             15.9
(11) C6H5             12.2
(12) ClCH2CH2         9.85
(13) ClCH2            20.7
(14) Cl2CH            158.0
(15) Cl3C             7500
(16) FCH2             13.2
(17) F2CH             40.4
(18) F3C              316.0
(19) CH3CHCl          18.0

Ea (kJ mol⁻¹) and log A (l mol⁻¹ s⁻¹) are legible for three entries only, whose rows cannot be identified: 208.8 and 6.10; 212.2 and 6.24; 200.1 and 6.19.
of the substrates, even though the work was carried out at subatmospheric pressure. Therefore, it is possible that some k values are unreliable.

ACKNOWLEDGMENTS

G. C. thanks I.V.I.C. and C.S.I.C. for partial support. M. M. is grateful to Professors Y. Tsuno, M. Fujio, and S. Kobayashi for their helpful suggestions. R. N. and J.-L. M. A. acknowledge financial support from the Spanish D.G.I.C.Y.T. through grants PB93-0286-C02-02 and PB93-0142-C03-01.
REFERENCES

1. Hammett, L. P. Physical Organic Chemistry, 2nd ed.; McGraw-Hill: New York, 1970.
2. Johnson, C. D. The Hammett Equation; Cambridge University Press: New York, 1973.
3. Hine, J. Structural Effects on Equilibria in Organic Chemistry; Wiley: New York, 1975.
4. Shorter, J. Correlation Analysis in Chemistry. Recent Advances; Chapman, N. B.; Shorter, J., Eds.; Plenum Press: New York, 1978, Chap. 2.
5. Charton, M. Correlation Analysis in Chemistry. Recent Advances; Chapman, N. B.; Shorter, J., Eds.; Plenum Press: New York, 1978, Chap. 5.
6. Hansch, C. Correlation Analysis in Chemistry. Recent Advances; Chapman, N. B.; Shorter, J., Eds.; Plenum Press: New York, 1978, Chap. 9.
7. Exner, O. Correlation Analysis in Chemistry. Recent Advances; Chapman, N. B.; Shorter, J., Eds.; Plenum Press: New York, 1978, Chap. 10.
8. (a) Shorter, J. Correlation Analysis of Organic Reactivity; Research Studies Press, John Wiley and Sons: Chichester, 1982. (b) Shorter, J. In The Chemistry of Double Bonded Functional Groups; Patai, S., Ed.; Wiley: Chichester, 1997, Supplement A3, Part 1, Chap. 3.
9. Arnett, E. M.; Scorrano, G. Adv. Phys. Org. Chem. 1976, 13, 83.
10. Maccoll, A. Chem. Rev. 1969, 69, 33.
11. Smith, G. G.; Kelly, F. W. Prog. React. Kinet. 1971, 8, 75.
12. (a) Chuchani, G. In The Chemistry of Halides, Pseudo-Halides and Azides; Patai, S.; Rappoport, Z., Eds.; Wiley: Chichester, 1995, Supplement D, Part 2, Chap. 19. (b) Egger, K. W.; Cocks, A. T. In The Chemistry of the Carbon-Halogen Bond; Patai, S., Ed.; Wiley: Chichester, 1973, Chap. 10.
13. Taft, R. W.; Topsom, R. D. Prog. Phys. Org. Chem. 1987, 16, 1.
14. Gal, J.-F.; Maria, P.-C. Prog. Phys. Org. Chem. 1990, 17, 159.
15. Hansch, C.; Leo, A.; Taft, R. W. Chem. Rev. 1991, 91, 165.
16. Charton, M. In Advances in Quantitative Structure-Property Relationships; JAI Press: Greenwich, CT, 1996, Vol. 1, pp. 171-219.
17. Charton, M. Prog. Phys. Org. Chem. 1981, 13, 119.
18. Taft, R. W.; Koppel, I. A.; Topsom, R. D.; Anvia, F. J. Am. Chem. Soc. 1990, 112, 2047.
19. Koppel, I. A.; Mishima, M.; Stock, L. M.; Taft, R. W.; Topsom, R. D. J. Phys. Org. Chem. 1993, 6, 685.
20. Bromilow, J.; Brownlee, R. T. C.; López, V. O.; Taft, R. W. J. Org. Chem. 1979, 44, 4766.
21. Taft, Jr., R. W.; Ehrenson, S.; Lewis, I. C.; Glick, R. E. J. Am. Chem. Soc. 1959, 81, 5352.
22. Yukawa, Y.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1959, 32, 971.
23. Brown, H. C.; Okamoto, Y. J. Am. Chem. Soc. 1958, 80, 4979.
24. Yukawa, Y.; Tsuno, Y.; Sawada, M. Bull. Chem. Soc. Jpn. 1966, 39, 2274.
25. Tsuno, Y.; Fujio, M.; Takai, Y.; Yukawa, Y. Bull. Chem. Soc. Jpn. 1972, 45, 1519.
26. Tsuno, Y.; Fujio, M. Chem. Soc. Rev. 1996, 25, 129.
27. Mishima, M.; Arima, K.; Inoue, H.; Usui, S.; Fujio, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1995, 68, 3199.
28. Mishima, M.; Mustanir; Fujio, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1996, 69, 2009.
29. Mishima, M.; Usui, S.; Fujio, M.; Tsuno, Y. Nippon Kagaku Kaishi 1989, 1262.
30. Harrison, A. G.; Houriet, R.; Tidwell, T. T. J. Org. Chem. 1984, 49, 1302.
31. Kebarle, P. Ann. Rev. Phys. Chem. 1965, 28, 495. For a recent review, see, e.g., Kebarle, P. In Techniques for the Study of Ion-Molecule Reactions; Farrar, J. M.; Saunders, Jr., W., Eds.; Wiley: New York, 1988, Chap. 5.
32. Bowers, M. T.; Aue, D. H.; Webb, H. M.; McIver, R. T. J. Am. Chem. Soc. 1971, 93, 4314. For a recent review, see, e.g., Freiser, B. In Techniques for the Study of Ion-Molecule Reactions; Farrar, J. M.; Saunders, Jr., W., Eds.; Wiley: New York, 1988, Chap. 2.
33. Laukien, F. H.; Allemann, M.; Bischofberger, P.; Grossmann, P.; Kellerhals, H. P.; Kofel, P. In Fourier Transform Mass Spectrometry. Evolution, Innovation and Applications; Buchanan, M. V., Ed.; ACS Symposium Series 359; American Chemical Society: Washington, DC, 1987, Chap. 5.
34. Fehsenfeld, F. C.; Ferguson, E. E. J. Chem. Phys. 1973, 59, 6272.
35. Paul, G. J. C.; Kebarle, P. J. Am. Chem. Soc. 1991, 113, 1148.
36. Taft, R. W.; Anvia, F.; Gal, J.-F.; Walsh, S.; Capon, M.; Holmes, M. C.; Hosn, K.; Oloumi, G.; Vasanwala, R.; Yazdani, S. Pure Appl. Chem. 1990, 62, 17.
37. Gal, J.-F.; Maria, P.-C. Prog. Phys. Org. Chem. 1990, 17, 159.
38. Maria, P.-C.; Gal, J.-F.; de Franceschi, J.; Fargin, E. J. Am. Chem. Soc. 1987, 109, 483.
39. Alcamí, M.; Mó, O.; Yáñez, M.; Anvia, F.; Taft, R. W. J. Phys. Chem. 1990, 94, 4796.
40. Abboud, J.-L. M.; Notario, R.; Santos, L.; López-Mardomingo, C. J. Am. Chem. Soc. 1989, 111, 8960.
41. Eberlin, M. N.; Kotiaho, T.; Shay, B. J.; Yang, S. S.; Cooks, R. G. J. Am. Chem. Soc. 1994, 116, 2457.
42. Data mostly determined in Prof. Taft's laboratory and disseminated through important compilations: (a) Lias, S. G.; Liebman, J. F.; Levin, R. D. J. Phys. Chem. Ref. Data 1984, 13, 695. (b) Lias, S. G.; Bartmess, J. E.; Holmes, J. L.; Levin, R. D.; Liebman, J. F.; Mallard, W. G. NIST Reference Database 19A, Standard Reference Data, NIST, Gaithersburg, MD, 2089, USA. Computerized version 1.1, 1989.
43. Alcamí, M.; Mó, O.; Yáñez, M.; Abboud, J.-L. M. J. Phys. Org. Chem. 1991, 4, 177.
44. Pearson, R. G. J. Org. Chem. 1989, 54, 1423.
45. (a) Decouzon, M.; Ertl, P.; Exner, O.; Gal, J.-F.; Maria, P.-C. J. Am. Chem. Soc. 1993, 115, 12071. (b) Gallo, R.; Roussel, C.; Berg, U. In Advances in Heterocyclic Chemistry; Katritzky, A. R., Ed.; Academic Press: New York, 1988, Vol. 43, p. 173. (c) Berg, U.; Gallo, R.; Klatte, G.; Metzger, J. J. Chem. Soc., Perkin Trans. 2 1980, 1350.
46. Abboud, J.-L. M.; Notario, R.; Sola, M.; Bertran, J. Prog. Phys. Org. Chem. 1993, 19, 1.
47. Taft, R. W.; Koppel, I. A.; Topsom, R. D.; Anvia, F. J. Am. Chem. Soc. 1990, 112, 2047.
48. Wiberg, K. B.; Ochterski, J.; Streitwieser, A. J. Am. Chem. Soc. 1996, 118, 8291.
49. Koppel, I. A.; Taft, R. W.; Anvia, F.; Zhu, S.-Z.; Hu, L.-Q.; Sung, K.-S.; DesMarteau, D. D.; Yagupolskii, L. M.; Yagupolskii, Y. L.; Ignat'ev, N. V.; Kondratenko, N. V.; Volkonskii, A. Yu.; Vlasov, V. M.; Notario, R.; Maria, P.-C. J. Am. Chem. Soc. 1994, 116, 8291.
50. Koppel, I. A.; Taft, R. W.; Anvia, F.; Kondratenko, N. V.; Yagupolskii, L. M. J. Org. Chem. USSR 1992, 28, [page illegible].
51. Kondratenko, N. V.; Popov, V. I.; Radchenko, O. A.; Ignat'ev, N. V.; Yagupolskii, L. M. Zh. Org. Khim. 1986, 22.
52. Yagupolskii, L. M. Aromatic and Heterocyclic Compounds with Fluorine-Containing Substituents; Naukova Dumka: Kiev, 1988, pp. 260-262 (in Russian).
53. Notario, R.; Herreros, M.; El Hammadi, A.; Homan, H.; Abboud, J.-L. M.; Forfar, I.; Claramunt, R. M.; Elguero, J. J. Phys. Org. Chem. 1994, 7, 657.
54. Sharma, R. B.; Sharma, D. K. S.; Hiraoka, K.; Kebarle, P. J. Am. Chem. Soc. 1985, 107, 3747.
55. Abboud, J.-L. M.; Hehre, W. J.; Taft, R. W. J. Am. Chem. Soc. 1976, 98, 6072.
56. Fu, E. W.; Dymerski, P. P.; Dunbar, R. C. J. Am. Chem. Soc. 1976, 98, 337.
57. Abboud, J.-L. M.; Notario, R.; Ballesteros, E.; Herreros, M.; Mó, O.; Yáñez, M.; Elguero, J.; Boyer, G.; Claramunt, R. J. Am. Chem. Soc. 1994, 116, 2486.
58. Abboud, J.-L. M.; Castaño, O.; Della, E. W.; Herreros, M.; Müller, P.; Notario, R.; Rossier, J.-C. J. Am. Chem. Soc. 1997, 119, 2262.
59. Mishima, M.; Inoue, H.; Fujio, M.; Tsuno, Y. Tetrahedron Lett. 1989, 30, 2101.
60. Mishima, M.; Usui, S.; Fujio, M.; Tsuno, Y. Nippon Kagaku Kaishi 1989, 1269.
61. Mishima, M.; Fujio, M.; Tsuno, Y. Mem. Fac. Sci., Kyushu Univ., Ser. C 1988, 16, 207.
62. Mishima, M.; Fujio, M.; Tsuno, Y. Mem. Fac. Sci., Kyushu Univ., Ser. C 1984, 14(2), 365; Mem. Fac. Sci., Kyushu Univ., Ser. C 1985, 15(1), [page illegible].
63. Mishima, M.; Arima, K.; Usui, S.; Fujio, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1995, 68, 3199.
64. (a) Mishima, M.; Inoue, H.; Fujio, M.; Tsuno, Y. Tetrahedron Lett. 1989, 30, 2101. (b) Mishima, M.; Inoue, H.; Itai, S.; Fujio, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1996, 69, 3273.
65. Mishima, M.; Nakamura, H.; Nakata, K.; Fujio, M.; Tsuno, Y. Chem. Lett. 1994, 1607.
66. Mishima, M.; Inoue, H.; Fujio, M.; Tsuno, Y. Tetrahedron Lett. 1990, 31, 685.
67. Mishima, M.; Ariga, T.; Fujio, M.; Tsuno, Y.; Kobayashi, S.; Taniguchi, H. Mem. Fac. Sci., Kyushu Univ., Ser. C 1988, 16(2), 217; Chem. Lett. 1992, 1085.
68. Mishima, M.; Ariga, T.; Matsumoto, T.; Kobayashi, S.; Taniguchi, H.; Fujio, M.; Tsuno, Y.; Rappoport, Z. Bull. Chem. Soc. Jpn. 1996, 69, 445.
69. Kobayashi, S.; Matsumoto, T.; Taniguchi, H.; Mishima, M.; Fujio, M.; Tsuno, Y. Tetrahedron Lett. 1993, 34, 5903.
70. Matsumoto, T.; Koga, K.; Kobayashi, S.; Mishima, M.; Tsuno, Y.; Rappoport, Z. Gazz. Chim. Ital. 1995, 125, 611.
71. Streitwieser, A.; Hammond, H. A.; Jagow, R. H.; Williams, R. M.; Jesaitis, R. G.; Chang, C. J.; Wolf, R. J. Am. Chem. Soc. 1970, 92, 5141.
72. Fujio, M.; Akasaka, I.; Susuki, S.; Mishima, M.; Tsuno, Y. J. Phys. Org. Chem. 1990, 5, 449.
73. Nakata, K.; Fujio, M.; Saeki, Y.; Mishima, M.; Tsuno, Y.; Nishimoto, K. J. Phys. Org. Chem. 1996, 9, 561.
74. Mishima, M.; Terasaki, T.; Ariga, T.; Fujio, M.; Tsuno, Y. Mem. Fac. Sci., Kyushu Univ., Ser. C 1989, 17(1), 159.
75. Mishima, M.; Tsuno, Y.; Fujio, M. Chem. Lett. 1990, 2277.
76. See, e.g., Fort, R. C. In Carbonium Ions; Olah, G. A.; Schleyer, P. v. R., Eds.; Wiley: New York, 1972, Vol. IV, Chap. 32.
77. (a) See, e.g., Bingham, R. C.; Schleyer, P. v. R. J. Am. Chem. Soc. 1971, 93, 3189. (b) Schleyer, P. v. R. In Cage Hydrocarbons; Olah, G. A., Ed.; Wiley: New York, 1990, Chap. 1.
78. Müller, P.; Millin, D. Helv. Chim. Acta 1991, 74, 1808.
79. Mishima, M.; Fujio, M.; Tsuno, Y. Tetrahedron Lett. 1986, 27, 939.
80. Mishima, M.; Fujio, M.; Tsuno, Y. Tetrahedron Lett. 1986, 27, 951.
81. Mishima, M.; Tsuno, Y.; Fujio, M. Chem. Lett. 1990, 2281.
82. Abboud, J.-L. M.; Mó, O.; de Paz, J. L. G.; Yáñez, M.; Esseffar, M.; Bouab, W.; El-Mouhtadi, M.; Mokhlisse, R.; Ballesteros, E.; Herreros, M.; Homan, H.; López-Mardomingo, C.; Notario, R. J. Am. Chem. Soc. 1993, 115, 12468.
83. Notario, R.; Castaño, O.; Herreros, M.; Abboud, J.-L. M. J. Mol. Struct. (THEOCHEM) 1996, 371, 21.
84. Decouzon, M.; Exner, O.; Gal, J.-F.; Maria, P.-C.; Waisser, K. J. Phys. Org. Chem. 1994, 7, 511.
85. See, e.g., Molina, M. T.; Yáñez, M.; Mó, O.; Notario, R.; Abboud, J.-L. M. In The Chemistry of Functional Groups. Supplement A3: The Chemistry of Double-Bonded Functional Groups; Patai, S., Ed.; John Wiley: Chichester, 1997, Chap. 23.
86. Yates, K.; Stewart, R. Can. J. Chem. 1959, 37, 644.
87. Stewart, R.; Yates, K. J. Am. Chem. Soc. 1958, 80, 6355.
88. Calculated using data from refs. 86, 87, and (a) Edwards, J. T.; Chang, H. S.; Yates, K.; Stewart, R. Can. J. Chem. 1960, 38, 1518, and (b) Stewart, R.; Yates, K. J. Am. Chem. Soc. 1960, 82, 4059.
89. Bagno, A.; Lovato, G.; Scorrano, G. J. Chem. Soc., Perkin Trans. 2 1993, 1091.
90. (a) Tsuno, Y.; Kusuyama, Y.; Sawada, M.; Fujii, T.; Yukawa, Y. Bull. Chem. Soc. Jpn. 1975, 48, 3337. (b) Fujio, M.; Adachi, T.; Shibuya, Y.; Murata, A.; Tsuno, Y. Tetrahedron Lett. 1984, 25, 4557.
91. (a) Tsuno, Y.; Fujio, M.; Goto, M.; Murata, A.; Mishima, M. Tetrahedron 1987, 43, 307. (b) Fujio, M.; Murata, A.; Fujiyama, R.; Mishima, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1990, 63, 1121.
92. Murata, A.; Goto, M.; Fujiyama, R.; Mishima, M.; Fujio, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1990, 63, 1129.
93. Murata, A.; Sakaguchi, S.; Fujiyama, R.; Mishima, M.; Fujio, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1990, 63, 1138.
94. Fujio, M.; Akasaka, I.; Susuki, S.; Mishima, M.; Tsuno, Y. Bull. Chem. Soc. Jpn. 1990, 63, 1146.
95. Fujio, M.; Funatsu, K.; Goto, M.; Seki, Y.; Mishima, M.; Tsuno, Y. Tetrahedron Lett. 1983, 24, 2177; Bull. Chem. Soc. Jpn. 1987, 60, 1091.
96. Nowlan, V.; Tidwell, T. T. Acc. Chem. Res. 1977, 10, 252.
97. (a) Modena, G.; Rivetti, F.; Scorrano, G.; Tonellato, U. J. Am. Chem. Soc. 1977, 99, 3392. (b) Simandoux, J. C.; Tork, B.; Hellin, M.; Coussemant, F. Bull. Soc. Chim. Fr. 1972, 4402. (c) Allen, A. D.; Chiang, Y.; Kresge, A. J.; Tidwell, T. T. J. Org. Chem. 1982, 47, 775. (d) Schubert, W. M.; Keeffe, J. R. J. Am. Chem. Soc. 1972, 94, 559. (e) Allen, A. D.; Rosenbaum, M.; Seto, N. O. L.; Tidwell, T. T. J. Org. Chem. 1982, 47, 4234. (f) Deno, N.; Kish, F. A.; Peterson, H. J. J. Am. Chem. Soc. 1965, 87, 2157. (g) Koshy, K. M.; Roy, D.; Tidwell, T. T. J. Am. Chem. Soc. 1979, 101, 357. (h) Bott, R. W.; Eaborn, C.; Walton, D. R. M. J. Chem. Soc. 1965, 384. (i) Noyce, D. S.; Schiavelli, M. D. J. Am. Chem. Soc. 1968, 90, 1020 and 1023.
98. (a) Taylor, R. The Chemistry of Acid Derivatives, Supplement B; Patai, S., Ed.; Wiley: Chichester, 1979, Chap. 15, pp. 859-914. (b) Holbrook, K. A. The Chemistry of Acid Derivatives, Supplement B; Patai, S., Ed.; Wiley: Chichester, 1992, Vol. 2, Chap. 12, pp. 703-746.
99. (a) Martin, I.; Chuchani, G. J. Phys. Chem. 1981, 85, 3902. (b) Gill, M. T.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1990, 1715.
100. Chuchani, G.; Martin, I.; Hernandez, J. A.; Rotinov, A.; Fraile, G.; Bigley, D. B. J. Phys. Chem. 1980, 84, 944.
101. Eaborn, C.; Mahmoud, F. M. S.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1982, 1313.
102. Hernandez, J. A.; Chuchani, G. Int. J. Chem. Kinet. 1983, 15, 205.
103. Gonzalez, N.; Martin, I.; Chuchani, G. J. Phys. Chem. 1985, 89, 1314, and references cited therein.
104. (a) Rotinov, A.; Chuchani, G. Int. J. Chem. Kinet. 1984, 16, 1261. (b) Taylor, R. J. Chem. Soc., Perkin Trans. 2 1975, 1025.
105. (a) Garcia de Sarmiento, M. A.; Dominguez, R. M.; Chuchani, G. J. Phys. Chem. 1980, 84, 2531. (b) Blake, P. G.; Shraydeh, B. F. Int. J. Chem. Kinet. 1981, 13, 463. (c) Smith, G. G.; Jones, D. A. K. J. Org. Chem. 1967, 28, 3496.
106. Reikonnen, N.; Martin, I.; Chuchani, G.; Lubinkowski, J. Int. J. Chem. Kinet. 1985, 17, 337.
107. Smith, G. G.; Mutter, L.; Todd, G. R. J. Org. Chem. 1977, 42, 44.
108. Taylor, R. J. Chem. Soc., Perkin Trans. 2 1978, 1255.
109. Chuchani, G.; Martin, I.; Rotinov, A.; Dominguez, R. M.; Morris, D. G.; Shepherd, A. G. Int. J. Chem. Kinet. 1988, 20, 145.
110. Sicher, J. Angew. Chem., Internat. Ed. 1972, 11, 200, and references cited therein.
111. Dakubu, M.; Holmes, J. L. J. Chem. Soc. (B) 1971, 1040.
112. Chuchani, G.; Hernandez, J. A.; Morris, D. G.; Shepherd, A. G. J. Chem. Soc., Perkin Trans. 2 1982, 917.
113. de Burgh Norfolk, S.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1976, 280.
114. Taylor, R.; Smith, G. G.; Wetzel, W. H. J. Am. Chem. Soc. 1962, 84, 4817.
115. Taylor, R.; Smith, G. G. Tetrahedron 1963, 19, 937.
116. (a) Smith, G. G.; Lum, K. K.; Kirby, J. A.; Pospasil, J. J. Org. Chem. 1969, 34, 2090. (b) Smith, G. G.; Yates, B. L. J. Org. Chem. 1965, 30, 434.
117. Taylor, R. J. Chem. Soc. (B) 1971, 622.
118. Taylor, R. J. Chem. Soc. (B) 1971, 1412.
119. Clyde, E.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1973, 1632.
120. Clyde, E.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1975, 1463.
121. Taylor, R. J. Chem. Soc., Perkin Trans. 2 1978, 955.
122. Clyde, E.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1977, 679.
123. Amin, H. B.; Taylor, R. Tetrahedron 1978, 267.
124. Taylor, R. J. Chem. Soc. (B) 1968, 1397.
125. Taylor, R. J. Chem. Soc. (B) 1971, 2382.
126. Taylor, R.; David, M. R.; McOmie, J. F. W. J. Chem. Soc., Perkin Trans. 2 1972, 162.
127. Smith, G. G.; Kirby, J. A. J. Heterocycl. Chem. 1971, 8, 1101.
128. Amin, H. B.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1978, 1053.
129. Amin, H. B.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1979, 624.
130. August, R.; Davis, C.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1986, 1265.
131. Clyde, E.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1977, 1537.
132. Clyde, E.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1977, 1541.
133. Amin, H. B.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1979, 228.
134. Smith, G. G.; Jones, D. A. K.; Brown, D. R. J. Org. Chem. 1963, 28, 403.
135. Smith, G. G.; Jones, D. A. K. J. Org. Chem. 1963, 28, 3496.
136. Al-Awadi, N. A.; Al-Bashir, R. F.; El Dusouqui, O. M. E. Tetrahedron 1990, 46, 2911.
137. Smith, G. G.; Yates, B. L. Can. J. Chem. 1965, 43, 702.
138. Amin, H. B.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1975, 1802.
139. Al-Awadi, N. A.; Al-Bashir, R. F.; El Dusouqui, O. M. E. Tetrahedron 1990, 46, 2903.
140. Amin, H. B.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1978, 1095.
141. Hoefnagel, A. J.; Monshouwer, J. C.; Snorn, E. C. G.; Wepster, B. M. J. Am. Chem. Soc. 1973, 95, 5350.
142. Chuchani, G.; Martin, I.; Rotinov, A.; Hernández, J. A.; Reikonnen, N. J. Phys. Chem. 1984, 88, 1563.
143. Dominguez, R. M.; Rotinov, A.; Chuchani, G. J. Phys. Chem. 1986, 90, 6277.
144. Chuchani, G.; Rotinov, A.; Martin, I.; Avila, I.; Dominguez, R. M. J. Phys. Chem. 1985, 89, 4134.
145. Chuchani, G.; Rotinov, A.; Dominguez, R. M.; Martin, I. Int. J. Chem. Kinet. 1987, 19, 781.
146. Chuchani, G.; Dominguez, R. M. J. Phys. Chem. 1989, 93, 203.
147. Chuchani, G.; Martin, I.; Rotinov, A.; Dominguez, R. M. Int. J. Chem. Kinet. 1990, 22, 1249.
148. Bridge, M. R.; Davies, D. H.; Maccoll, A.; Rose, R. A.; Stephenson, B.; Banjoko, O. J. Chem. Soc. (B) 1968, 805.
149. Amin, H. B.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1978, 1090, and references cited therein.
150. Smith, G. G.; Jones, D. A. K.; Taylor, R. J. Org. Chem. 1963, 28, 3547.
151. Al-Awadi, N.; Taylor, R. J. Chem. Soc., Perkin Trans. 2 1986, 1581.
152. Taylor, R.; Thorne, M. P. J. Chem. Soc., Perkin Trans. 2 1976, 799.
153. Miyazaki, K. Tetrahedron Lett. 1968, 2793.
154. Voorhees, K. J.; Smith, G. G. J. Org. Chem. 1971, 36, 1755.
155. Smith, G. G.; Voorhees, K. J. J. Org. Chem. 1970, 35, 2182.
156. Viola, A.; Proverb, R. J.; Yates, B. L.; Larrahondo, J. J. Am. Chem. Soc. 1973, 95, 3609.
157. Viola, A.; MacMillan, J. H.; Proverb, R. J.; Yates, B. L. J. Am. Chem. Soc. 1971, 93, 6967.
158. Taylor, R. Int. J. Chem. Kinet. 1991, 23, 241.
159. Chuchani, G.; Alvarez, J.; And, G.; Martin, I. J. Phys. Org. Chem. 1991, 4, 399, and references cited therein.
160. Chuchani, G.; Martin, I. React. Kinet. Catal. Lett. 1993, 51, 233.
161. Chuchani, G.; Dominguez, R. M.; Rotinov, A.; Martin, I. React. Kinet. Catal. Lett. 1991, 45, 291.
162. Blake, P. G.; Vayjooee, M. H. B. J. Chem. Soc., Perkin Trans. 2 1976, 1533.
163. Blake, P. G.; Vayjooee, M. H. B. J. Chem. Soc., Perkin Trans. 2 1976, 988.
164. Blake, P. G.; Craggs, A.; Vayjooee, M. H. B. J. Chem. Soc., Perkin Trans. 2 1976, 986.
THE PREDICTION OF MELTING POINT
John C. Dearden
I. Introduction
   A. Relevance to Chemical Assessment
   B. The Basis of Fusion
II. Factors Controlling Melting Point
III. Measurement of Melting Point
IV. Estimation Methods
   A. Historical
   B. Recent Work
V. Conclusions
Acknowledgments
References
Advances in Quantitative Structure Property Relationships, Volume 2, pages 127-175. Copyright © 1999 by JAI Press Inc. All rights of reproduction in any form reserved. ISBN: 0-7623-0067-1

I. INTRODUCTION

Melting point is the temperature at which a solid fuses to become a liquid. Melting points are almost invariably reported as measured at atmospheric pressure, although they vary with pressure. A more precise definition of melting point is that it is the temperature at which pure solid is in equilibrium with pure liquid at 1 atmosphere pressure. Glasstone points out that, from the Clausius-Clapeyron equation, substances whose liquid has a greater density than the solid at the melting point should show a lowering of melting point with increasing pressure; for example, dT/dP for water is −0.0074 °C atm⁻¹. For most substances, however, solid density is greater than liquid density, and so dT/dP is positive; thus, ethanoic acid has a value of +0.025 °C atm⁻¹.

Freezing point is the temperature at which a liquid solidifies, and for all practical purposes may be considered identical to melting point. Two points may be noted here. First, a liquid may freeze to give a solid of a certain crystalline form that is metastable. On standing, that form may revert to a more stable form, which could well have a different melting point. An example is tetrachloromethane, which solidifies at −28 °C to a face-centered cubic structure. On further cooling, this transforms to a rhombohedral structure, which melts at −23 °C. Second, Berry has pointed out that melting and freezing behavior depends on the size of aggregates or clusters of molecules or atoms. Small clusters (up to a few hundred atoms) can coexist as solids and liquids, and show different melting and freezing points. However, under normal circumstances, clusters are much larger than this (e.g., 10^[illegible] molecules), and the difference between melting and freezing point is then less than 0.001 °C.

A. Relevance to Chemical Assessment

Since melting point is probably the most widely and easily determined physicochemical property, the question must be asked as to the need to be able to predict it. Surprisingly, there are some compounds reported in the literature for which no experimentally determined melting point is given. In addition, new compounds submitted to Government agencies for approval may have no reported melting point, and it could well be more convenient to make an estimate of that property than to request the manufacturer to supply it, particularly if the substance submitted is not a single pure compound.
Third, it is of interest to have an estimate of a compound's melting point before it is synthesized, in order to be aware of, for example, handling and disposal problems.

Melting point also has great relevance to the prediction of solubility. Yalkowsky and Valvani showed that aqueous solubility (S_aq) could be predicted from a knowledge of the octanol-water partition coefficient (K_ow) and melting point (mp):

$\log S_{aq} = -\log K_{ow} - 0.01\,\mathrm{mp}$    (1)

This was later modified to take into account compounds that are liquid at room temperature (25 °C):

$\log S_{aq} = -\log K_{ow} - 0.01\,(\mathrm{mp} - 25) + 0.8$    (2)
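Equation 2 is simple enough to evaluate directly, and a short sketch may make the relative weight of its two terms clearer. The function below is an illustration only: its name is introduced here, the input values are hypothetical, and it assumes (following the stated purpose of the modification) that the melting point term is set to zero for compounds that melt below 25 °C.

```python
def log_aqueous_solubility(log_kow: float, mp_celsius: float) -> float:
    """Eq. 2: log S_aq = -log K_ow - 0.01 (mp - 25) + 0.8.

    Assumption of this sketch: for compounds that are liquid at 25 deg C
    the melting point term is taken as zero.
    """
    melting_term = max(mp_celsius - 25.0, 0.0)
    return -log_kow - 0.01 * melting_term + 0.8


# Hypothetical compounds (log K_ow, mp in deg C); not data from the chapter
print(log_aqueous_solubility(3.0, 125.0))   # solid: both terms contribute
print(log_aqueous_solubility(3.0, -10.0))   # liquid: partition term only
```

On this form of the equation, a 100 °C rise in melting point lowers the estimated log S_aq by one unit, the same effect as a one-unit rise in log K_ow.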
Meylan and Howard found a similar relationship using a very large training set of 1450 compounds. There are now several good methods available for the calculation of log K_ow, the best of which are probably those of Hansch and Leo, Rekker and Mannhold, and Meylan and Howard; all three have been computerized. Thus, if melting point can also be reliably estimated, then aqueous solubility can be calculated without the need for any experimental measurements.

Solubility controls biological activity and toxicity in that, if a compound is only poorly soluble, its concentration in the aqueous environment may be too low for it to exert a biological effect, and also because aqueous solubility allows a compound to be transported to the active site within an organism. The solubility of a compound can legitimately be regarded as a partitioning of the compound between its crystal lattice and the solvent. If the forces holding the molecule in the crystal are high, then the solubility will be low. For the same reason the melting point will be high, since melting point is a measure of the energy required to disrupt the crystal lattice. For example, uracil (mp > 300 °C) shows no toxicity even in saturated solution; however, its 1,3-dimethyl derivative (mp 122 °C) is appreciably toxic. Similarly, anthracene (mp 217 °C) is nontoxic, while its isomer phenanthrene (mp 97 °C) is quite toxic. Melting point also affects the toxicity of mixtures. If two or more high-melting compounds form a liquid when mixed (eutectic effect), the mixture is generally more toxic than the individual compounds, probably because solubility is increased.

B. The Basis of Fusion
A crystalline solid is generally thought of as a rigid structure, but in fact entropy (a measure of disorder) is zero only at a temperature of absolute zero. Above that temperature, disorder occurs because of atomic and molecular vibrations, and even rotations, within the crystal, and these increase with temperature. It might therefore be expected that melting should occur over a temperature range. However, at the same time it has to be recognized that a molecule is held in a crystal lattice by intermolecular forces, and melting point can be defined as the temperature at which the vibrational forces are sufficient to overcome the forces of attraction holding the molecules together. Considered thus, it is not surprising that fusion occurs at a specific temperature. It is not appropriate here to discuss fusion in detail, and readers are referred elsewhere. Partington has reviewed many early theories of fusion. At the melting point (an equilibrium) the free energy of melting (ΔG_m) is zero. Hence the basic thermodynamic relation,

$\Delta G = \Delta H - T\,\Delta S$    (3)

becomes,

$T_m = \Delta H_m / \Delta S_m$    (4)

where ΔH_m and ΔS_m are the enthalpy and entropy, respectively, of melting.
Now the entropy of boiling was shown by Trouton to be approximately constant for many liquids, so that the boiling point is controlled almost exclusively by the enthalpy change. This is not the case with melting. Abramowitz and Yalkowsky state that the entropy of fusion has three components: positional, expansion, and rotational. Positional entropy of fusion relates to the change from ordered arrangement within the crystal to more or less random arrangement in the melt. For most small molecules, this term is relatively constant at 2-3 eu (entropy units). Entropy of expansion is found to be 1-3 eu for most rigid aromatic compounds; molecules that are highly eccentric in shape will have a large increase in volume on melting and thus will have a high entropy of expansion. The largest contribution to the entropy of fusion comes from rotational entropy, which is related to molecular symmetry: the higher the symmetry, the lower is the rotational entropy of fusion, and the higher is the melting point. There is also an increased enthalpy of fusion from a higher packing efficiency, since molecules are closer to each other and therefore intermolecular attraction is greater. Rotational entropy of fusion is in the range 7-11 eu for most rigid compounds, and is thus the predominant component of entropy of fusion.

Yalkowsky has discussed the estimation of entropy of fusion of organic compounds in detail. More recently, Chickos et al. have devised a group additivity method for the calculation of entropies of fusion of organic compounds. They report good agreement with experiment, the average error being 1.77 eu for monosubstituted compounds and 2.0 eu for multisubstituted compounds. The method is simple to use, and if applied to melting point prediction could result in improved accuracy.
Abramowitz and Yalkowsky use a rotational symmetry number, σ, defined as the number of ways that a molecule can be orientated to give indistinguishable images; for example, benzene has σ = 12, and toluene (methylbenzene) has σ = 2. They calculate the total entropy of fusion as,

$\Delta S_m = 13.5 - R \ln \sigma$    (5)

where R is the gas constant (cal mol⁻¹ deg⁻¹). Hence,

$T_m = \Delta H_m / (13.5 - 4.6 \log \sigma)$    (6)

where T_m = melting point (K). Dannenfelser et al. have evaluated σ for 362 organic compounds and several elements and inorganic compounds; calculated and observed entropies of melting are also reported. Abramowitz and Yalkowsky also introduced a term to account for the entropy of expansion. They define eccentricity as maximum molecular length divided by mean molecular diameter, and the entropy of expansion term is obtained as:
A.
(EXPAN) = ^ ^ 2 ^ ^ ^ '"P^ ^ 6(volume)
^'^
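The symmetry relations of Eqs. 5 and 6 can be sketched numerically. The following is a minimal illustration in Python, assuming Eq. 6 is read as $\Delta H_m / T_m = 13.5 - 4.6 \log \sigma$ and taking R = 1.987 cal mol$^{-1}$ K$^{-1}$; the function names and the enthalpy value used in the example are hypothetical and serve only to show the arithmetic.

```python
import math

R_CAL = 1.987  # gas constant in cal mol^-1 K^-1 (assumed value)


def entropy_of_fusion(sigma: int) -> float:
    """Total entropy of fusion (eu) from Eq. 5: 13.5 - R ln(sigma)."""
    if sigma < 1:
        raise ValueError("rotational symmetry number must be >= 1")
    return 13.5 - R_CAL * math.log(sigma)


def melting_point(delta_h_cal_per_mol: float, sigma: int) -> float:
    """Melting point (K), assuming Eq. 6 means delta_H / T_m = delta_S."""
    return delta_h_cal_per_mol / entropy_of_fusion(sigma)


if __name__ == "__main__":
    # Symmetry numbers quoted in the text: benzene 12, toluene 2.
    for name, sigma in [("benzene", 12), ("toluene", 2)]:
        print(name, round(entropy_of_fusion(sigma), 2), "eu")
    # Hypothetical enthalpy of fusion, for illustration only.
    print(round(melting_point(2350.0, 12), 1), "K")
```

For the same enthalpy of fusion, the more symmetrical molecule has the smaller entropy of fusion and therefore the higher calculated melting point, which is the trend described above.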
Tsakanikas and Yalkowsky^^ introduced a term, $4.6 \log \phi$, to account for conformational flexibility; $\phi$ is defined as ^^^''^'^'^^ where n is the total number of carbon atoms in the chain, B is the total number of branches and T is the number of t-butyl groups. This expression is a refinement of the simple ln n term first proposed by Huggins^^ and introduced by Flory and Vrij."^^ The effect of conformational flexibility, which increases the entropy of fusion and reduces packing efficiency in the crystal, is to lower melting point, as can be seen in the example of 4-methylaniline (mp 44 °C) and 4-ethylaniline (mp −5 °C). Dannenfelser and Yalkowsky,^^ taking into account both rotational symmetry and molecular flexibility, have calculated entropies of fusion for 949 compounds, with an average error of 12.5 J K$^{-1}$ mol$^{-1}$.
II. FACTORS CONTROLLING MELTING POINT

We have already seen that melting point is controlled by both enthalpic and entropic forces. Enthalpic forces include ionic, dipole, dispersion and hydrogen bonding forces, while entropic forces include positional, expansion, rotational, and conformational flexibility effects. Table 1 lists examples of each type. Ionic bonding clearly has a very pronounced effect on melting point, since the forces holding the crystal together are the same strong forces that hold individual ions together. The effect of dispersion forces is exemplified by the aromatic hydrocarbons; the greater the molecular size, the greater is the total attractive force between molecules. Intramolecular hydrogen bonding deserves particular mention: when all the hydrogen bonding ability of a compound is "tied up" internally, as is the case with 2-nitrophenol, then the melting point is close to that of a similar non-hydrogen bonded compound.
Table 1. Some Examples of Compounds Displaying the Various Factors Affecting Melting Point

Compound                                       Melting Point (°C)
SnBr2 (ionic)                                  215.5
SnBr4 (covalent)                               31
Ethylbenzene (nonpolar)                        -95
Anisole (polar)                                -33
3-Nitrophenol (H-bonded)                       97
3-Nitroanisole (non-H-bonded)                  39
2-Nitrophenol (intramolecularly H-bonded)      45
Benzene (one ring)                             5.5
Naphthalene (two fused rings)                  80
Anthracene (three fused rings)                 217
1,3-Dichlorobenzene (unsymmetrical)            -25
1,4-Dichlorobenzene (symmetrical)              53
Dimethylterephthalate (one-carbon chain)       140
Diethylterephthalate (flexible chain)          44
2-Methyl-2-butanol (low eccentricity)          -12
1-Pentanol (high eccentricity)                 -78.5

III. MEASUREMENT OF MELTING POINT

At first sight, nothing seems simpler than measurement of the melting point of a substance. The usual method is to place a few milligrams of sample in a capillary tube, and heat the sample slowly until melting occurs. Partington^"^ states, however, that this method does not give accurate results, and quotes the necessity to use at least 20 g of material! Two main reasons may be adduced for errors in the capillary method: too-rapid heating, and failure to correct for the length of thermometer mercury column projecting above the heated block of the melting point apparatus. In addition, oxidation, hygroscopicity and CO2 absorption can sometimes be a problem, and Skau et al.^^ describe an apparatus that obviates such difficulties. Furniss et al.^^ give extensive details of the capillary method, pointing out the need for purity in all melting point determinations; a compound is generally considered "pure" if its melting range is <0.5 degrees. Skau et al.-^^ also give much experimental detail, and mention that the amount of material used, the heating rate, the capillary diameter, and even the type of glass used can affect the observed melting point. Heating can be in either an electric furnace or a heating bath such as the Thiele apparatus. Heating should be at about 1 degree per minute so as not to overshoot the melting point. Furniss et al.^^ give the stem correction for the exposed length of thermometer as $0.00016\,N\,(t_1 - t_2)$, where $N$ is the length of exposed mercury thread in degrees, $t_1$ is the observed temperature on the thermometer scale and $t_2$ is the mean temperature of the exposed mercury thread (determined by an auxiliary thermometer placed alongside with its bulb at the middle of the exposed thread).

The Kofler hot bench^^ is designed to allow the rapid determination of melting point. It is heated electrically at one end, and thus provides a graduated temperature gradient. Crystals are sprinkled on to the bench and moved along by a lancet until they melt. It is useful for compounds that decompose on gradual heating. The microscope hot-stage apparatus is essentially an electrically heated block on a microscope stage. It is of particular value for determining the melting point of a very small amount of material, such as a single crystal. For anisotropic substances, the melting point is best observed with a polarizing microscope, since the temperature at which the color disappears is the true melting point.
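The stem correction quoted from Furniss et al. is simple enough to compute directly. The sketch below, in Python, assumes that the correction is added to the observed reading (the usual convention for an emergent mercury thread, though the text does not say so explicitly); the function and argument names are illustrative only.

```python
def stem_correction(n_degrees: float, t_observed: float,
                    t_thread_mean: float, k: float = 0.00016) -> float:
    """Stem correction 0.00016 * N * (t1 - t2), in degrees.

    n_degrees     -- length N of exposed mercury thread, in scale degrees
    t_observed    -- temperature t1 read on the thermometer
    t_thread_mean -- mean temperature t2 of the exposed thread
    """
    return k * n_degrees * (t_observed - t_thread_mean)


def corrected_reading(n_degrees: float, t_observed: float,
                      t_thread_mean: float) -> float:
    """Observed reading plus stem correction (assumed sign convention)."""
    return t_observed + stem_correction(n_degrees, t_observed, t_thread_mean)


if __name__ == "__main__":
    # Hypothetical reading: 100 degrees of thread exposed, 150 observed,
    # exposed thread at a mean of 40.
    print(round(stem_correction(100, 150.0, 40.0), 2))
    print(round(corrected_reading(100, 150.0, 40.0), 2))
```

With these hypothetical inputs the correction is somewhat under 2 degrees, which shows why neglecting it is listed among the main sources of error in the capillary method.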
It is reported^^ that, with care, melting points can be determined within ±0.02 °C, even with the capillary method. Ford and Timmins^^ have discussed the use of differential thermal analysis (DTA) and differential scanning calorimetry (DSC) in the determination of melting point. With care, it is possible to determine melting points by this method to ±0.05 degrees. Of particular interest is the fact that DSC can be used to determine the enthalpy of melting, and to indicate the purity of a compound.

Partington^^ has discussed the methods available for the determination of high-temperature melting points. In the change of form method the solid is heated until it is perceived visually to flow. This method is unsatisfactory if the liquid is very viscous. An alternative method is to place the sample on a strip of platinum foil, which is then heated, if necessary in a protected environment. Melting is again observed visually. Accuracy is estimated at ±5 degrees. Various methods for determining the melting points of metals include: using them to make an electrical connection, which is broken when the metal melts; and allowing a rod, connected to a lever arm, to rest on the sample, so that the lever moves when the metal melts.

Skau et al.^^ have drawn attention to the fact that some melting points can be considered to be anomalous. For example, many substances exist in more than one crystalline form, each of which will probably have a different melting point. Thus rhombic sulphur melts at 112.8 °C, and monoclinic sulphur melts at 119.25 °C. Dynamic isomerism exists between, for example, a pair of tautomers or geometric isomers. On melting either form, an equilibrium mixture is formed. Thus cis-benzaldoxime melts at 35 °C, whereas the trans form melts at 130 °C. The equilibrium mixture melts at 27.7 °C. Many substances form liquid crystals upon melting, and will thus display a melting range. 
4-Methoxycinnamic acid transforms to the liquid crystalline state at 172.1 °C, and into the clear liquid state at 187.3 °C. Substances that sublime below their melting point (e.g. chloranil) are best handled in a sealed capillary that is totally immersed in the heating bath. Substances that tend to explode on heating should not be examined in a capillary, but should be floated on a liquid metal surface or on a cover glass floating on the metal; a safety screen must be used.
IV. ESTIMATION METHODS

A. Historical

Given the ease of measurement of melting points, and their obvious dependence on molecular structure, it is not surprising that the relationships between the two have been the subject of study for well over a century. Carnelley,^^ for example, stated that "of two or more isomeric compounds, those whose atoms are the more symmetrically and the more compactly arranged melt higher than those in which
the atomic arrangement is asymmetrical or in the form of long chains." Partington^^ cites a large number of such studies up to 1949. However, the relationships reported are qualitative, and thus are of little or no value for the purpose of prediction of melting point. Probably the earliest attempt to predict melting points was that of Mills,^^ who proposed the following expression for the prediction of melting points in homologous series:

$$\mathrm{mp} = \frac{\beta\,(x - c)}{1 + \gamma\,(x - c)} \tag{8}$$
where x = number of CH2 groups in the chain and $\beta$, $\gamma$, and c are constants. Mills found for many series that melting points could be predicted to within 1 or 2 degrees; he devised different equations for odd- and even-numbered chains, following the observation of Baeyer-^^ that there was an alternation in melting point between adjacent members of an homologous series. Mills pointed out that for "infinite" members of a series, his equation reduced to mp = $\beta/\gamma$, and he was thus able to calculate such melting points. This is an outstanding work for the period, and far ahead of its time. Kipping^^ pointed out that for ketones, with two alkyl chains, a number of isomers existed with different melting points, whereas Mills' equation yielded only one prediction. Longinescu^'* derived the following expression for the melting points of organic compounds, r^=8.37DV^
(9)
where D = density and n = number of atoms in molecule, although he used the equation to calculate the number of atoms per molecule from the melting point, rather than vice versa. Tsakalotos,^^ observing that the melting points of odd and even chain number (n) alkanes alternated, commented that this was a function of molecular symmetry, and proposed the equation,

$$\Delta T_m = \frac{85 - 0.01882\,(n - 1)^2}{n - 1} \tag{10}$$
where $\Delta T_m$ is the incremental rise in melting point between adjacent homologues. The average error in predicted melting point for eight alkanes from C^^ to C24 was 0.4 degrees. A number of early workers, recognizing that molecular vibrations were the cause of crystal lattice disruption on melting, attempted to correlate melting points with vibrational frequencies. Lindemann^^ derived Eq. 11 for monatomic substances and simple inorganic compounds, T^ = KMa^^\^
(11)
where M = atomic weight (relative atomic mass), a = elasticity coefficient, and v = vibrational frequency. However, Lindemann used his equation to predict vibrational frequencies and not melting points. Starting with Lindemann's formula, Robertson^^ obtained, KV^'W m-
(12)
^
where A^ = a constant, and s = mean specific heat from absolute zero to the melting point. He pointed out that for higher members of a series, density and specific heat are approximately constant, so that Eq. 12 becomes: T^ = K'M^'\^
(13)
He commented that under such conditions \) is probably a simple function of M, and found that the empirical equation, T^ = K''M''^
(14)
gave good results for a series of amides of fatty acids. Prud'homme^^ proposed a "rule of three temperatures" whereby he observed that for a large number of compounds,

$$T_m + T_b = T_c \tag{15}$$

where $T_b$ = normal boiling point (K) and $T_c$ = critical temperature (K). Lorenz and Herz^^ found that the ratio $T_m/T_b$ ranged between 0.3 and 1.0, the average for organic compounds being 0.5839. Lyman"^^ used a set of 12 very diverse "benchmark" chemicals to test the Lorenz and Herz method; boiling points of some of the compounds were estimated. He found an average error of 27.1 degrees, or 10.5%. Taft and Stareck"^^ explored this relationship further, and also that between $T_m$ and $T_c$; they observed wide variation in $T_m/T_b$, in contrast to the almost constant value of $T_b/T_c$. As explained earlier, this is due to the entropic contribution to melting point.

It has long been noted that there is often alternation in melting points between homologues containing odd and even numbers of carbon atoms.-^^ Table 2 shows two separate relationships for alkanoic acids. Malone and Reid"^^ found, for alkyl esters of 3,5-dinitrobenzoic acid, a higher melting point for secondary alcohol esters with an odd number of carbon atoms, while the reverse was true for normal alcohol esters. It seems likely that the phenomenon is a function of crystal packing; Partington"^^ has briefly discussed alternative hypotheses. Beacall"^ showed that melting point increased with the density of the solid.

Following the work of Lindemann"^^ and Robertson,^^ numerous studies have been reported of the variation of melting point with molecular weight or chain length in homologous series. Garner et al.'^^''*^ devised equations for the melting points of both odd and even carbon chain monobasic fatty acids; their equation for even-numbered acids from C10 to C24 was

$$T_m = \frac{1.030\,n - 3.61}{0.002652\,n - 0.0043} \tag{16}$$

and the average prediction error for eight such acids was 1.7 degrees. Garner et al."*^ proposed a similar equation for the melting points of even-numbered normal paraffins:

$$T_m = \frac{0.6085\,n - 1.75}{0.001491\,n + 0.00404} \tag{17}$$

Table 2. Correlations between Molecular Weight and Melting Point for Some Homologous Normal Series^a,b

Series                              A        B       Minimal No. of Carbon Atoms for Equation to Apply
Alkanes                             0.0038   2.337   15
1-Alkenes                           0.0040   2.333   10
1-Alkanols                          0.0040   2.185   6
Alkanals                            0.0039   2.245   4
Alkanoic acids (odd)                0.0047   2.143   5
Alkanoic acids (even)               0.0053   2.070   6
Methyl esters of alkanoic acids     0.0031   2.332   8
Alkanoic acid chlorides             0.0034   2.400   6
Alkane nitriles                     0.0035   2.258   5

Notes: ^a From ref. 25. ^b Log M = A · mp + B
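The behaviour of Eqs. 16 and 17, including their convergence temperatures, can be checked with a few lines of Python. This is a sketch under a stated assumption: the denominator of Eq. 17 is taken as 0.001491 n + 0.00404 (plus sign), because only that form returns values in the observed range for long-chain paraffins; the function names are illustrative.

```python
def garner_even_acid_tm(n: int) -> float:
    """Eq. 16: melting point (K) of even-numbered monobasic fatty acids."""
    return (1.030 * n - 3.61) / (0.002652 * n - 0.0043)


def garner_even_paraffin_tm(n: int) -> float:
    """Eq. 17 (K), with the denominator sign assumed to be '+'."""
    return (0.6085 * n - 1.75) / (0.001491 * n + 0.00404)


def convergence_temperature(numerator_slope: float,
                            denominator_slope: float) -> float:
    """Limit of (a*n + b) / (c*n + d) as n -> infinity, i.e. a / c."""
    return numerator_slope / denominator_slope


if __name__ == "__main__":
    for n in (10, 16, 24):
        print("acid C%d" % n, round(garner_even_acid_tm(n), 1), "K")
    for n in (20, 30, 40):
        print("paraffin C%d" % n, round(garner_even_paraffin_tm(n), 1), "K")
    print("limits:",
          round(convergence_temperature(1.030, 0.002652), 1),
          round(convergence_temperature(0.6085, 0.001491), 1))
```

Because both expressions are ratios of linear functions of n, each tends to a finite limit given by the ratio of the two slopes, which is what allows such equations to be extrapolated to chains of infinite length.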
Above C20, Eq. 17 gave very good predictions, with an average error (for 14 compounds from C20 to C^Q) of 0.6 degrees, but was considerably less accurate for shorter chain homologues. It may be noted that Eqs. 16 and 17, like those of Mills-^^ and Tsakalotos,^^ give convergence temperatures for chains of infinite length (as observed in practice) and thus can be used to predict the melting points of polymers. Garner et al."^^ and Timmermans'*^ have pointed out that such convergence temperatures should be identical for all series; this is because, at infinite chain length, the influence of the terminal group should be negligible."^^ In practice, differences are observed,^^ clearly indicating that topology and/or interaction capability of the terminal group exert an appreciable effect. Austin^ ^ proposed, for the higher members of an homologous series, a correlation with molecular weight:

$$\log M = A\,T_m + B \tag{18}$$

where A and B are constants. He showed graphically that this relationship does indeed hold for a wide range of homologous series, although slopes and intercepts
vary. Skau et al.^^ have determined the constants A and B for a large number of homologous series, and some of these are listed in Table 2. It should also be noted that Eq. 18 tends to fail for compounds with very long chains, for, as has been pointed out above, Garner et al."^^ and Timmermans"^^ have suggested that the melting points of any homologous series should approach a common limit of about 120 °C as the chain length increases (cf. also Mills^^). This effect has also been taken into account by Lovell and Hibbert^^ who proposed the equation,

$$\mathrm{mp} = \frac{P}{a + bP} \tag{19}$$
where P is the number of repeating units in the chain. Huggins,^^ using the equation developed by Garner et al.,^^ pointed out that at large values of n, $1/T_m \propto 1/n$, and demonstrated the validity of this for normal paraffins. Merckel^^ devised several equations for the correlation of melting points of normal paraffins, depending on the chain length. For the range C10H22 to C22H46 he proposed,

$$\log T_m = 2.5896 - \frac{1.870}{n - 1} \tag{20}$$
which gave, for eight such compounds, an average prediction error of 1.3 degrees. Meyer and van der Wyk^"^ developed the following equation for the melting points of normal paraffins:

$$T_m = \frac{n}{0.002395\,n + 0.0171} \tag{21}$$
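To see how the two functional forms differ in practice, Eqs. 20 and 21 can be evaluated side by side. The Python sketch below assumes that the logarithm in Eq. 20 is to base 10 and that both equations return temperatures in kelvin; the function names are illustrative.

```python
def merckel_tm(n: int) -> float:
    """Eq. 20: log10(T_m) = 2.5896 - 1.870 / (n - 1); returns T_m in K."""
    return 10.0 ** (2.5896 - 1.870 / (n - 1))


def meyer_van_der_wyk_tm(n: int) -> float:
    """Eq. 21: T_m = n / (0.002395 n + 0.0171); returns T_m in K."""
    return n / (0.002395 * n + 0.0171)


if __name__ == "__main__":
    print(" n   Eq. 20   Eq. 21")
    for n in (12, 16, 20, 22):
        print("%2d   %6.1f   %6.1f"
              % (n, merckel_tm(n), meyer_van_der_wyk_tm(n)))
```

Within the C12 to C22 range the two expressions agree to within a few degrees, but they extrapolate differently: Eq. 21 tends to 1/0.002395, about 417.5 K, at infinite chain length, whereas Eq. 20 tends to 10^2.5896, a lower limit.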
For 12 compounds from C12 to C^Q, with both odd- and even-numbered chains, the average prediction error was 1.3 degrees. Moullin^^ used an approach similar to that of Austin^ ^ for the melting points of paraffins and fatty acids. For the former, his equation was: log(2n-2)-1.14
(22)
mp = 5.8 X 10"^ For all normal paraffins from C3 to C^^ the average prediction error from this equation was 4.0 degrees, with the greatest errors for the short chain (
00. Seyer et al.^^ used Moullin's equation to predict the melting points of n-paraffins from C^ to C^Q, with an average prediction error of 0.6 degrees; the greatest discrepancy found was 2.5 degrees for Cg^H^^. Etessam and Sawyer^^ devised a correlation between melting point and molecular weight for both odd- and even-numbered normal paraffins above C22:

$$T_m = \frac{414.5\,M}{M + 94.4} \tag{23}$$
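The convergence built into Eq. 23 can be made explicit by dividing numerator and denominator by M. The short derivation below is a worked illustration only; $T_\infty$ is a symbol introduced here for the limiting value.

```latex
T_m = \frac{414.5\,M}{M + 94.4}
    = \frac{414.5}{1 + 94.4/M},
\qquad
T_\infty = \lim_{M \to \infty} T_m = 414.5\ \mathrm{K},
\qquad
T_\infty - T_m = \frac{414.5 \times 94.4}{M + 94.4}.
```

The last expression shows that the shortfall from the limiting temperature falls off inversely with M + 94.4, so each added methylene group raises the melting point by less than the one before it.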
For 26 compounds from C22 to C^Q the average prediction error was 0.3 degrees, and the form of the equation allows for convergence as n → ∞.
Gray^^ derived, from thermodynamic considerations, an equation of the form mp = A − B/(n − 1) for the melting points of normal paraffins. He had to use different constants A and B for different chain-length ranges, and for odd- and even-length chains up to C20. For example, for the range C21 to C36, he used the equation:

$$\mathrm{mp} = 123.4 - \frac{1659}{n - 1} \tag{24}$$
The mean error in predicted melting point for the 16 compounds was, remarkably, only 0.2 degrees, with none being more than 1.0 degree in error. Smittenberg and Mulder^^ proposed that the values (x) of many physical properties of members of homologous series varied according to the empirical equation,

$$x = x_\infty + \frac{a}{n + b} \tag{25}$$
where n is the number of carbon atoms in the chain. Fortuin^^ examined the application of this equation to, inter alia, the melting points of n-alkanes. He derived an equation similar in form to Eq. 25, and showed graphically that an excellent straight line correlation was obtained between m, the number of methylene groups in a chain, and the reduced melting point (defined as $T_m/(T_\infty - T_m)$, where $T_\infty$ is the melting point of the infinite-chain polymer). He also stated (without giving any derivation) that according to classical thermodynamics it may then be derived that m and the melting point $T_m$ are related as follows,
m+2
J
I
n
(26)
where A is a constant. A graphical plot of log (m + 2)/m against $1/T_m$ gave an excellent straight line relationship, but no predicted or measured melting points were quoted. Keyes^^ used Lindemann's^^ work (Eq. 11) to relate the thermal conductivity of insulating crystals to melting point, density, and atomic weight. Benko^^ deduced the following relationship between melting point and molecular weight, _ 1600 Mn
(27)
^ s
where n = number of atoms in the molecule, and $V_b$ = molar volume at the boiling point. Benko gave no indication of how well his equation predicted melting point. However, Gold and Ogle^^ reported that for 225 organic compounds, the average error in prediction of melting point by Benko's method was 2.36%, with 95% confidence limits of ±41.75 degrees. This compared with an average error of 3.11% (95% confidence limits ±52.63 degrees) for 154 organic compounds using Prud'homme's rule of three temperatures,"'^ and 11.98% (±40.89 degrees) for 48
polar organic compounds and 10.73% (±56.25 degrees) for 26 non-polar organics using the method of Lorenz and Herz.^^ However, Gold and Ogle's method of calculating average percentage error is open to question since they reported several negative values, which leads one to suspect that they might have averaged the signed percentage errors rather than their moduli; this would lead to, for example, an error of +12% and one of −12% averaging to 0% error. Broadhurst^ devised a simple equation for the prediction of melting point of long-chain normal paraffins,

$$T_m = \frac{414.3\,(n - 1.5)}{n + 5.0} \tag{28}$$
where n = number of atoms in chain. The equation was claimed to apply to all normal paraffins above C44H90, and for 14 such compounds from C44 to C100 the error was never greater than 0.5 degrees. The following year Flory and Vrij^^ found that the inclusion of the term $R \ln n$ (where n = number of carbon atoms in chain), a term first suggested by Huggins,^^ helped to account better for the entropy of fusion, by allowing for chain flexibility, and they claimed that its inclusion in the general form of Eq. 28, namely,

$$T_m = \frac{K\,(n + a)}{n + R \ln n + b} \tag{29}$$
gave predicted melting points within 0.5 degrees for straight-chain paraffins from C11 to C100. Later Broadhurst,^^ using Flory and Vrij's correction, was able to predict the melting points of 33 n-alkanes with a standard deviation of only 0.3 degrees. His 7^ value was found to be 417.8 K, in reasonable agreement with the experimentally observed melting point of 411.8 K. Grigor'ev and Pospelov^^ were the second group of workers, after Flory and Vrij,^^ to introduce entropic considerations into the prediction of melting point. They devised the following equation for polycyclic aromatic hydrocarbons. K_r2] T^ = 273.16 + 1.54(M - 108.8) + 52.S(n - 2) K7 - L M
(30) J
where M = molecular weight, n = number of axes of rotation, K = incremental increase in molecular weight in homologous series, and L = "mechanical" rigidity of molecule, calculated as number of groups of molecular components in molecule minus number of groups of molecular components in first member of homologous series. (It is not clear how this value was arrived at, as the values of L quoted for benzene, naphthalene, anthracene, tetracene and pentacene are respectively 0, 0, 0, 3 and 4.) For benzene, biphenyl and 13 fused-ring polycyclic aromatic hydrocarbons, the average error in the predicted melting point was only 2.1 degrees. Strangely, pentacene was poorly predicted (error = 164 K). This is a remarkable piece of work, which does not seem to have achieved any recognition. Eaton^^ published an interesting variation on the relationship of melting point to molecular weight. He observed that for many homologous series of compounds the
product of solid range S (i.e. 273 + mp) and liquid range L (bp − mp) was proportional to molecular weight:

$$L \times S = aM + b \tag{31}$$

This rearranges to the following expression:

$$\mathrm{mp} = \frac{(\mathrm{bp} - 273) \pm \sqrt{(\mathrm{bp} - 273)^2 - 4\,(-273\,\mathrm{bp} + aM + b)}}{2} \tag{32}$$

The author commented that the negative root is generally correct. For 10 compounds of widely differing structure, the method yielded an average error in predicted melting point of 3.6 degrees. A method based on atomic contributions was devised by Wachalewski,^^

$$T_m = \frac{\sum_i n_i U_i}{D} \tag{33}$$
where $n_i$ is the number of atoms of type i, $U_i$ is the contribution of atom i to melting point, and D is the average diameter of the molecule. Wachalewski reported predictions for 85 hydrocarbons and halogenated hydrocarbons, with an average error of 31.5 degrees. The method is discussed in more detail later, in the section on the group contribution approach.

B. Recent Work

Hydrocarbons

Hydrocarbons, and particularly alkanes, are among the simplest organic compounds, and their intermolecular interactions are not complicated by dipolar and hydrogen bonding forces. The prediction of their melting points has long been studied, as indicated in the previous section, and it continues to be a focus of attention. The alternating nature of the melting points of odd- and even-numbered carbon chains led Syunyaeva^^ to postulate that the melting points of alkanes were related to the C-C-C bond angle $\theta$. It is not clear how the bond angles are obtained.
V7'm = 0.498 + ^ ^ 1 • 10-2 fQ^ gygj^ tXkmes
(34)
n-2
f
%/T^-
0.578 + ^ ^ 1 • 10-2 fQj Q^jj alkanes
(35)
$$\theta = 118 - 0.12\,n \quad \text{for even alkanes} \tag{36}$$

$$\theta = 151 - 0.65\,n \quad \text{for odd alkanes} \tag{37}$$
where n = number of carbon atoms in the chain. Syunyaeva claims that these equations give $T_m$ values that differ from experimental values by no more than one degree, for 10 < n < 20. Mackay et al.,^^ in work carried out for the U.S. Environmental Protection Agency, developed a rather cumbersome equation for the prediction of melting points of low-volatility hydrocarbons and halogenated hydrocarbons.
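
$$T_m = T\left\{\frac{-(4.4 + \ln T_b)\left[1.803\left(\dfrac{T_b}{T} - 1\right) - 0.803 \ln \dfrac{T_b}{T}\right] - \ln P}{6.8} + 1\right\} \tag{38}$$
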
where $T_b$ = normal boiling point and P = vapor pressure at environmental temperature T. The authors comment that it is possible to develop similar equations for other classes of compounds. Seybold et al.^^ used topological indices and ad hoc descriptors to model the melting points of halogenated hydrocarbons. Only fair correlations could be obtained, the best being

$$T_m = -183.0\,(1/{}^1\chi^v) + 299.0 \tag{39}$$

n = 22, r = 0.791, s = 26.5
where ${}^1\chi^v$ is the first-order valence molecular connectivity,^^ n is the number of compounds used, r is the correlation coefficient, and s is the standard error of the estimate. They also observed a reasonable correlation (r = 0.745) between melting point and the enthalpy of vaporization, although this is not so good as that found by Westwell et al.^^ (vide infra) for a wider range of compounds. Hanson and Rouvray'^'^ also investigated the use of topological indices, including Balaban and Wiener indices, to correlate the melting points of normal alkanes. The simplest correlations were obtained using the carbon number index n^. For example, using data for C11-C43 (melting from the hexagonal solid) and for C44-C390 (melting from the orthorhombic solid) they obtained: 2430 /
i^
(40) 1 - - + 411.6 n n = 49 r = 1.000 5 = 0.82 Hanson and Rouvray claim that their equations give accurate predictions of the melting points of normal alkanes. Although they do not quantify that statement, the excellent correlation coefficients and standard errors obtained bear out the claim. Adler and Kovaci6-Beck^^ correlated the melting points of normal alkanes with the topological indices of Wiener and of RandiC. Using the Wiener index W they
71 =
obtained, for CJ5-C5Q alkanes:
r = 188.066 W^^^^
(41)
n not given
r = 0.989
Using the RandiC index % they found, for
s not given
C^Q-C^Q
alkanes:
^ ^ X "" 8.403 X 10'^ + 2.395 x 10"^ % n not given
r = 0.989
(42)
s not given
These correlations are clearly not so good as those of Hanson and RouvrayJ"* but nonetheless they model normal alkane melting points quite well. Needham et al./^ using connectivity indices and ad hoc descriptors, were not able to model well the melting points of normal and branched alkanes. Their best equation was: mp = 39 - 2192 (1/^x) + 859 (1/^x) - 94 \ n = 56
r = 0.755
+ 291 ^%, + 77 %
(43)
5 = 23.8
It is not clear whether this poor correlation results from inclusion of branched alkanes or from inappropriate transforms of the indices used. The authors comment that melting point is associated with a shape-dependent dimension that is not well modeled by the indices examined. Pogliani^^ investigated the prediction of melting point using linear combinations of connectivity indices derived by a trial-and-error composition procedure. Using a set of 56 alkanes, he was unable to obtain a correlation with r better than 0.507. However, by breaking the data into subsets differing by their degree of substitution, good correlations were obtained; the subsets were not defined. For example, one subset of 17 compounds was well-correlated (r = 0.949, s = 14.0) by a combination of 6 molecular connectivity indices. Tsakanikas and Yalkowsky^^ took into account the flexibility and symmetry of alkanes in attempting to predict their melting points. They defined a molecular flexibility number $\phi$ (see section on the basis of fusion) and also a rotational symmetry number $\sigma$, defined as the number of orientations of a molecule that are indistinguishable from a reference orientation. They developed the following non-linear equation:

$$T_m = \frac{370.8\,(\mathrm{CH}) + 1188.7\,(\mathrm{CH_2}) + 944.6\,(\mathrm{CH_3}) - 817.4}{31.9 + 5.8 \log \phi - 11.4 \log \sigma} \tag{44}$$

n = 102, r = 0.972, s = 19.5
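The structure of Eq. 44, a group-count sum divided by a symmetry- and flexibility-dependent term, can be sketched as follows. This Python example assumes the reading of Eq. 44 in which the group contributions form the numerator and 31.9 + 5.8 log φ − 11.4 log σ the denominator, with base-10 logarithms; φ is supplied by the caller and is not calculated here, and the function name and inputs are hypothetical.

```python
import math


def tsakanikas_yalkowsky_tm(n_ch: int, n_ch2: int, n_ch3: int,
                            phi: float, sigma: int) -> float:
    """Melting point (K) from Eq. 44 as read above (an assumption).

    n_ch, n_ch2, n_ch3 -- counts of CH, CH2 and CH3 groups
    phi                -- molecular flexibility number (supplied, not derived)
    sigma              -- rotational symmetry number
    """
    numerator = 370.8 * n_ch + 1188.7 * n_ch2 + 944.6 * n_ch3 - 817.4
    denominator = 31.9 + 5.8 * math.log10(phi) - 11.4 * math.log10(sigma)
    return numerator / denominator


if __name__ == "__main__":
    # A six-carbon straight chain (4 CH2, 2 CH3); phi values are illustrative.
    for phi, sigma in [(1.0, 1), (1.0, 2), (10.0, 2)]:
        tm = tsakanikas_yalkowsky_tm(0, 4, 2, phi, sigma)
        print("phi=%.1f sigma=%d -> %.1f K" % (phi, sigma, tm))
```

The example shows the two opposing trends discussed in the text: raising σ lowers the denominator and raises the predicted melting point, whereas raising φ does the reverse.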
The effect of flexibility arises in normal as well as branched alkanes, and yet excellent predictions of melting points of the former were made by Hanson and
Rouvray^"^ without considering flexibility. This suggests that symmetry factors are of greater importance in governing melting point. Somayajulu'^^ modified the equations of Kreglewski^^ and Kreglewski and Zwolinski,^^ developed for the prediction of boiling points and critical temperatures. For hydrocarbons above a certain chain length he proposed the equation:

$$\ln\,(T_\infty - T_m) = a - b\,n^{1/25} \tag{45}$$

Somayajulu states that "it is quite amazing that this function fits the data of alkanes even from the carbon number 16 to a deviation within 1 K." Table 3 gives the values of a and b, and the standard deviation, for different series of hydrocarbons. The standard deviations reported in Table 3 are extremely low and mean that Somayajulu's method gives extremely accurate predictions. It would be of interest to know how successful the method would be with lower homologues (cf. the recent work of Riazi and Al-Sahhaf,^^ vide infra).

Cherqaoui et al.^^ have applied the neural network approach to the prediction of melting points of all 150 alkanes up to C10. Using a standard feed-forward network with one hidden layer of neurons, they obtained a correlation coefficient of 0.978 for the correlation between observed and predicted melting points, with a standard error of 8.1 degrees; the descriptors used were the embedding frequencies of eight substructures. Linear regression analysis using the same descriptors gave r = 0.865 and a standard error of 25.8 degrees. The approach looks promising, and it will be interesting to see if it can be applied successfully for the prediction of melting points of compounds containing heteroatoms.

Riazi and Al-Sahhaf^^ used an equation of almost identical form to that of Somayajulu^^ (Eq. 45) for the prediction of melting points of various homologous series of hydrocarbons:

Table 3. Constants of Equation 45 for Various Series of Hydrocarbons^ Series
a
b
Standard Deviation
n^
n-Alkanes Cycloalkanes
24.7120 30.35974
17.79905 22.57216
0.947
31
1.92
31
1 -Alkylcyclopentanes
27.16582
19.80791
0.341
22
1-Alkylcyclohexanes 1-Alkenes
25.58733
21.11261
0.438
25
29.29506
19.13557
0.009
21
1-Alkynes
26.42416
19.32058
0.637
15
1-Alkyl benzenes
28.71740
21.18813
0.148
16
— —
25
1-Alkylnaphthalenes
25.15359
18.06739
2-Alkylnaphthalenes
26.00394
18.80971
Notes: *From ref. 78. ^n = no. of carbon atoms above which Eq. 45 is applicable.
25

$$\ln\,(T_\infty - T_m) = a - b\,M^{c} \tag{46}$$

Table 4 lists the constants of Eq. 46 and the standard errors of prediction, for the various series. It can be seen by comparison with Table 3 that the predictions from Eq. 46 are not as good as those of Somayajulu from Eq. 45, although Eq. 46 covers in general lower members of each series, for which predictions would not be expected to be so accurate.

Table 4. Constants of Equation 46 for Various Series of Hydrocarbons^a

Series                   T_inf (K)   a         b         c       Standard Deviation   Compound Range
n-Alkanes                397         6.5096    0.14187   0.470   1.5                  C5-C40
n-Alkylcyclopentanes     370         6.52504   0.04945   0.667   1.2                  C7-C41
n-Alkylcyclohexanes      360         6.55942   0.04681   0.7     1.3                  C7-C20
n-Alkylbenzenes          375         6.53599   0.04912   0.667   0.88                 C9-C42

Note: ^a From ref. 81.

Todeschini and coworkers have recently developed a new class of descriptors called weighted holistic invariant molecular (WHIM) descriptors, and have applied these to the prediction of various physicochemical properties including melting point, as well as to the prediction of biological activities. WHIM descriptors are 3D molecular indices that contain information about the whole 3D molecular structure in terms of size, shape, symmetry, and atom distribution. They are calculated from x,y,z coordinates, usually of a minimum energy conformation, within different weighting schemes. The descriptors are invariant to roto-translation and represent interpretable properties of the molecules. The melting points of 79 polyaromatic hydrocarbons were modeled by six WHIM descriptors with r = 0.946 and s = 36.9 degrees;^^ this was later^"^ modified to r = 0.942 and s = 35.1 degrees with only four descriptors. Marano and Holder^^ devised a general equation for predicting physical properties of homologous series. They used this to predict the melting points of n-paraffins and n-olefins, for which the equation took the form. T = 418.07 - 6.288 x iQ^Q-^-^^^in-nf'''
(47)
where n is the number of carbon atoms in the molecule, and $n_0$ = 0.341 for n-paraffins and 2.081 for n-olefins. The RMS error was reported as 0.0435 degrees for n-paraffins with n ranging from 9 to 100, and 0.0446 degrees for n-olefins with n ranging from 10 to 40. The constant of 418.07 in Eq. 47 is the melting point of a hydrocarbon with $n = \infty$. Thus the melting point predictions by the Marano and Holder method are extremely accurate; the authors gave no indication of how well their method would predict the melting points of non-hydrocarbons, although they
used it to predict the boiling points of a number of other classes such as alcohols, haloalkanes, and alkanoic acids.
In summary, it is clear that it is now possible to predict the melting points of some classes of hydrocarbons, especially those with straight alkyl chains, with high accuracy. The presence of branching, unsaturation and rings reduces the accuracy, and further work is needed to model such features better.

Organic Compounds Other Than Hydrocarbons
Approaches Utilizing Physicochemical and Structural Parameters. In 1980 Cramer developed a novel approach to property prediction. He postulated that the variation of a number of physical properties, including melting point, could be correlated with two parameters, B (bulk) and C (cohesiveness), obtained from factor analysis of a range of molecular characteristics. A further three parameters (D, E, F), also derived from the factor analysis, could be used to effect further improvement in the prediction of the properties. The melting points of 112 compounds of diverse structure were correlated as follows:

    $\mathrm{mp} = 77.2 + 295\,B + 194\,C - 443\,D + 339\,E - 406\,F$    (48)
    n = 112    r = 0.851    s = 39.6
The correlation is not particularly good, and is considerably worse than those for the prediction of other properties by the same method, thus emphasizing the difficulties involved with the prediction of melting point. When Cramer used Eq. 48 to predict the melting points of compounds not in the training set, considerably poorer correlations were obtained. Even for compounds bearing appreciable similarity to those in the training set, the correlation coefficient for n = 38 was only 0.727. The method does not therefore seem to offer much hope for the accurate prediction of melting point.
Charton and Charton compared the melting points of pairs of series of compounds, arguing that by so doing many of the variables affecting melting point were held constant or approximately so. They looked both at series in which the main part of the molecule was held constant and the substituent varied, and at series in which the main part of the molecule was varied and the substituent held constant. For example, the correlation between melting points of nonyl (No) compounds and undecyl (Ud) compounds was:

    $T_m(\mathrm{No}) = 1.10\,T_m(\mathrm{Ud}) - 47.5$    (49)
    n = 10    r = 0.989    s = 2.61
However, for some series comparisons, correlation was very poor; e.g. for C5H5 and C6H5CH2, r = 0.577. A series with substituent Br correlated with a series with substituent Cl as follows:

    $T_m(\mathrm{Br}) = 0.954\,T_m(\mathrm{Cl}) + 19.5$    (50)
    n = 15    r = 0.980    s = 11.6
Again, for some pairs of substituents (e.g. COOH and Cl) correlation was poor. Thus this approach to the prediction of melting point is both restrictive in its application and, in some cases, of poor predictive ability.
Dearden and Rahman carried out a QSAR study of the melting points of 41 anilines, using stepwise regression to select the best parameters from a large number that were generated, including an indicator variable for the presence of hydrogen bonding. They obtained the following equation,

    $T_m = 276 + 117\,H_D + 122\,F + 37.2\,{}^3\chi^v + 19.8\,H_A + 39.2\,I_4 + 47.5\,R - 24.9\,L$    (51)
    n = 41    r = 0.961    s = 20.5
where $H_D$ and $H_A$ = indicator variables for hydrogen bond donor and acceptor ability respectively, F and R = Swain-Lupton field and resonance parameters respectively, ${}^3\chi^v$ = third-order valence molecular connectivity, $I_4$ = indicator variable for 4-substitution, and L = Sterimol length parameter. Dearden modified Eq. 51 by using hydrogen bonding parameters supplied by Abraham, and reported the following correlation:

    $T_m = 329 + 182\,\alpha - 38.2\,\pi + 8.91\,\mathrm{MR} - 62.2\,B_2 - 26.6\,I_3$    (52)
    n = 42    r = 0.941    s = 24.6
In Eq. 52, $\alpha$ is the Abraham hydrogen bond donor parameter (it is interesting to note that this parameter alone had a correlation coefficient of 0.819 with the melting points of the 42 anilines); $\pi$ is the Hansch hydrophobic substituent constant, its negative sign suggesting that it is modeling polarity; MR is molar refractivity, a measure of molecular bulk; $B_2$ is one of the Sterimol width parameters which, together with $I_3$, an indicator variable for 3-substitution, probably reflects shape. The compounds used in this study included several capable of intramolecular hydrogen bonding; in such cases the intermolecular hydrogen bond parameter was taken to be an arbitrary 50% of its normal value.
Murugan et al. predicted the melting points of 141 pyridines and piperidines reasonably well (r = 0.912, s not given) using six undefined descriptors obtained from the CODESSA software. Further unpublished work by Dearden has shown that QSAR-type equations can be developed for other classes of compound, but that the significant parameters are not necessarily the same in each case. Thus it is unlikely that a single, comprehensive QSAR equation can be developed to predict the melting points of all compounds, or even those of broad groups such as single-ring aromatics.
The work of Somayajulu has already been referred to in connection with hydrocarbons. He also applied Eq. 45 to the prediction of melting points of
homologous series of other classes of compound such as 1-haloalkanes, alkanols, alkanals, alkanones, alkanoic acids and esters, dialkyl ethers, thiols, alkylamines, and alkanenitriles. In all cases he took the limiting melting point (i.e. for $n = \infty$) to be 414.6 K, but nonetheless achieved very good predictions, with standard deviations (SD) generally less than 1 degree. For example, for 1-alkanols the values of a and b in Eq. 44 are 24.11107 and 17.55276, SD 0.736, and for n-alkanoic acids a and b are 20.89539 and 14.85653, SD 1.10. Excellent as these results are, they have such limited applicability that the method must be regarded as being of only marginal interest.
Mason and Bernstein reported that in some homologous series of salts a pronounced discontinuity in melting point can occur as chain length increases, which they attributed to a change in space group symmetry that depends on the relative sizes of cation and anion. This clearly makes the prediction of melting point in such series much more difficult.
A rather different QSAR approach to the prediction of melting point was developed by Abramowitz and by Abramowitz and Yalkowsky. These workers examined a series of rigid, non-hydrogen bonding aromatic compounds. Starting from the basis of the relationship between melting point and boiling point proposed by Lorenz and Herz, they recognized the controlling influence of three geometric factors, namely molecular symmetry, molecular eccentricity, and ortho substitution. They used the rotational symmetry term $\sigma$ developed by Tsakanikas and Yalkowsky, and defined SIGMAL = $\log \sigma$. They also defined a molecular eccentricity term EXPAN to account for the entropy of expansion (see Eq. 7). Finally, observing that ortho interactions involve steric forces that are important in structure-property correlations, they introduced a parameter (ORTHO) as the number of functional groups that are ortho to another group. Thus ORTHO for 2,3,5-trichlorotoluene is 3.
They obtained the following correlation:

    $\mathrm{mp} = 0.772\,\mathrm{bp} + 110.8\,\mathrm{SIGMAL} + 11.56\,\mathrm{ORTHO} + 31.9\,\mathrm{EXPAN} - 234.4$    (53)
    n = 85    r = 0.938    s = 22.8
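As an illustration of how Eq. 53 is applied, the following minimal Python sketch evaluates the correlation for a single compound. The function name and the example inputs are hypothetical; the logarithm in SIGMAL is assumed to be to base 10, and temperatures are assumed to be in whatever units the original correlation used, which are not restated here.

```python
import math


def melting_point_eq53(bp, sigma, ortho, expan):
    """Evaluate Eq. 53 for one compound.

    bp    : boiling point (units as in the original correlation)
    sigma : rotational symmetry number
    ortho : number of groups that are ortho to another group
    expan : eccentricity term EXPAN
    """
    sigmal = math.log10(sigma)  # SIGMAL = log(sigma); base 10 is an assumption
    return 0.772 * bp + 110.8 * sigmal + 11.56 * ortho + 31.9 * expan - 234.4


# Hypothetical inputs, chosen only to exercise the formula.
estimate = melting_point_eq53(bp=500.0, sigma=2, ortho=3, expan=0.5)
print(f"Estimated mp: {estimate:.1f}")
```

With a standard error of 22.8 degrees, such an estimate should be read as a rough guide rather than a precise value.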
Abramowitz and Yalkowsky then applied this approach to the correlation of the melting points of polychlorinated biphenyl (PCB) congeners. They found that they were able to dispense with the bp term, but this was replaced by a term CL indicating the number of chlorine atoms in a compound. They also needed to incorporate an indicator INTER reflecting the hindered rotation of the phenyl rings due to substitution in the ortho positions; any compound having ortho chlorines in both rings has INTER = 1, and all other compounds have INTER = 0. The best correlation obtained in this study was:

    $\mathrm{mp} = 14.9\,\mathrm{CL} + 117.6\,\mathrm{SIGMAL} + 6.0\,\mathrm{ORTHO} + 410\,\mathrm{EXPAN} - 26.0\,\mathrm{INTER} + 200$    (54)
    n = 58    r = 0.91    s = 22.1
It may be noted here that INTER, reflecting nonplanarity of the aromatic rings of hindered PCBs, is in effect another symmetry term. Thus SIGMAL and INTER could perhaps be combined into a more comprehensive symmetry parameter.
Yalkowsky et al. have extended this approach to a dataset of 979 non-hydrogen bonding organic compounds, and obtained a number of correlations, of which Eq. 55 is the best,

    $T_m = -105 + 0.876\,T_b + 97.9\,\mathrm{SIGMAL} - 5.00\,\mathrm{RIGID2} - 8.02\,\mathrm{FLEX2} + 8.03\,\mathrm{BRANCH2}$    (55)
    n = 979    r = 0.899    s = 35.5
where RIGID2 is the number of large atoms (i.e. all atoms save hydrogen and fluorine) that make up the rigid and planar portion of a molecule, with the values for bromine, sulfur, and iodine multiplied by 2, 2, and 4, respectively; FLEX2 is the number of nonring $sp^3$ large atoms that have some degree of conformational freedom and are not included in RIGID2, but does not include atoms branching from or adjacent to aliphatic chains; BRANCH2 is the number of large atoms that are not included in either RIGID2 or FLEX2.
It is also pertinent to comment that Eqs. 53 to 55 relate only to non-hydrogen bonding compounds, which could restrict the application of this approach, since hydrogen bonding is known to be an important force in controlling melting point. This restriction could, of course, be overcome by the inclusion of an appropriate hydrogen bonding parameter. However, Tesconi and Yalkowsky comment that hydrogen bonding does not appear to affect the entropy of fusion markedly.
Bhattacharjee et al. devised a new index termed geometric volume $V_g$ (in effect, the unit molecular volume in space) with which to correlate a number of properties of halomethanes. Their correlation for melting point was:

    $\mathrm{mp} = 117.79\,V_g - 343.78$    (56)
    n = 9    r = 0.962    s = 30.4
There is no indication of whether this parameter has wide applicability.
Topological indices were used by Medic-Saric et al. to model the melting points of a series of 3-(phthalimidoalkyl)-pyrazolin-5-ones. The best correlation that they obtained was:

    $\mathrm{mp} = -53.752\,{}^1\chi^v + 101.167\,I + 229.157$    (57)
    n = 11    r = 0.944    s = 14.95
where ${}^1\chi^v$ = first-order valence molecular connectivity, and I = an information-theoretic index based on the Shannon relation. Although the correlation is good,
this is probably because the substituents are largely alkyl groups, whose properties are generally modeled well by topological indices.
Charton and Charton used their intermolecular force equation to predict the melting points of 178 ethenes, benzenes, and naphthalenes with a wide range of substituents, and obtained a correlation coefficient of 0.968 and a standard error of 24.0 degrees. Charton and Charton, using a variant of the intermolecular force equation which included a variable capable of accounting for the packing energy contribution of the alkyl group, were able to predict the melting points of 366 alkanes, substituted with a variety of polar and non-polar groups, with r = 0.958 and s = 17.9 degrees. Twenty-nine compounds had to be omitted from the correlation, six of which were carboxamides. Prediction of the melting points of a test set of 45 compounds yielded a mean error of 44.6 degrees. The same approach has been used to model the melting points of 21 2-substituted pyridines (r = 0.958, s = 24.8), 24 3-substituted pyridines (r = 0.943, s = 32.5), and 24 4-substituted pyridines (r = 0.924, s = 42.8).
Westwell et al. observed excellent correlations between melting point and thermodynamic parameters such as enthalpy of sublimation and enthalpy of vaporization for a wide range of both organic and inorganic compounds; however, the compounds selected were essentially rigid, having no more than two internal rotations. They also found a good correlation with boiling point:

    $\Delta H_{sub} = 0.188\,T_m + 0.522$    (58)
    n = 160    r = 0.95    s not given

    $\Delta H_{vap} = 0.166\,T_m - 3.99$    (59)
    n = 160    r = 0.93    s not given

    $T_b = 1.52\,T_m + 14$    (60)
    n = 160    r = 0.93    s not given
They attributed the good correlation of $T_m$ with $\Delta H$ to the fact that although the enthalpy of sublimation is the sum of the enthalpies of vaporization and of fusion, the last-named is relatively small, so that to a good approximation for rigid molecules, $\Delta H_{sub} \approx \Delta H_{vap}$.
Pogliani was able to model the melting points of 12 caffeine homologues very well (r = 0.982, s = 20.5) by a linear combination of six molecular connectivity indices. Pogliani also modeled the melting points of 20 amino acids, this time using newly developed fractional molecular connectivity indices. The best combination of parameters yielded a correlation with r = 0.872 and s = 22.5 degrees. This was later greatly improved to r = 0.964, s = 12.2 degrees.
Todeschini and Gramatica extended their WHIM approach beyond hydrocarbons when they modeled the melting points of 13 chlorobenzenes using two WHIM descriptors representing symmetry and size, with r = 0.966 and s = 18.1 degrees. Chiorboli et al. then applied the WHIM approach to the prediction of melting points of 51 halobenzenes and 38 halotoluenes, and compared the results with predictions made using topological descriptors. A correlation involving six WHIM descriptors had r = 0.928 and s = 22.3 degrees, while a correlation involving six topological descriptors had r = 0.925 and s = 22.9 degrees. Todeschini et al. correlated the melting points of 94 heterogeneous chemicals from the European Union Priority Chemicals List 1 with a WHIM descriptor and four other descriptors such as bond and group counts, a topological index and hydrophilicity (r = 0.896, s = 35.7 degrees). Using a different WHIM descriptor and five non-WHIM descriptors, the correlation was improved to r = 0.913, s = 32.8 degrees. No indication was given of the relative contributions of the WHIM and non-WHIM descriptors. It is still not clear how well the WHIM approach can model hydrogen bonding, which is a key factor controlling the melting points of many compounds.
A number of workers have reported correlations between melting point and solubility. Although such correlations have been concerned with the prediction of solubility rather than of melting point, they nevertheless offer another approach to the prediction of melting point. Yalkowsky and Valvani were the first to show that aqueous solubility could be modeled using melting point and the octanol-water partition coefficient (Eq. 1). Yalkowsky et al. showed that solubility in an ideal solvent should be rectilinearly related to solute melting point and observed such a relationship for the mole fraction solubility of a heterogeneous set of drugs in 1-octanol:

    $\log X = -0.012\,\mathrm{mp} + 0.26$    (61)
    n = 36    r = 0.92    s = 0.32
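The suggestion that solubility correlations offer a route to melting point can be made concrete by inverting Eq. 61. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the melting point is taken to be in the units used in the original correlation, and the inversion simply rearranges the fitted line without any statistical correction.

```python
def log_x_from_mp(mp):
    """Eq. 61: log mole fraction solubility in 1-octanol from melting point."""
    return -0.012 * mp + 0.26


def mp_from_log_x(log_x):
    """Algebraic inversion of Eq. 61 (an assumption, not a separately fitted model)."""
    return (0.26 - log_x) / 0.012


# Hypothetical measured solubility, chosen only for illustration.
print(round(mp_from_log_x(-1.54), 1))
```

Because the slope is small, the scatter of 0.32 log units in Eq. 61 corresponds to an uncertainty of roughly 0.32 / 0.012, or about 27 degrees, in a melting point recovered in this way.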
Rubino found that the aqueous solubilities of sodium salts of weakly acidic drugs could be correlated with melting point and stoichiometric number of water molecules in the crystal hydrate (N):

    $\log S = -0.004\,\mathrm{mp} - 0.100\,N + 1.691$    (62)
    n = 11    r = 0.964    s not given
Thomas and Rubino found a reasonable correlation between the aqueous mole fraction solubilities of the hydrochloride salts of some secondary amine drugs and their melting points:

    $\log S = -0.026\,T_m + 9.546$    (63)
    n = 8    r = 0.898    s not given
They commented that the existence of the correlation shown in Eq. 63 implies a constant entropy of fusion, but observed that such entropies ranged from 16 to 26 cal mol^-1 deg^-1 for the compounds examined; hence the correlation between solubility and melting point is not related solely to a dependence of solubility upon solid-state properties. A negative curvilinear relationship was observed by Anderson and Conradi between the solubility products of a series of six amine salts of the nonsteroidal antiinflammatory drug flurbiprofen and the melting points of the salts, although no mathematical correlation was reported. The relationship between melting point and solubility warrants further investigation, particularly as to whether a general relationship can be formulated which could aid both melting point and solubility studies.
Przezdziecki and Sridhar used melting point to predict viscosity of organic liquids, and Walters et al. have used melting point, rotational symmetry, and atomic contributions to estimate boiling point (n = 1419, r = 0.956, s = 28.1 degrees). Tesconi and Yalkowsky have suggested that such correlations could aid the design of compounds with, for example, low vapor pressure, or with a high melting point (low aqueous solubility) for sustained release. Horvath has briefly reviewed published work up to 1990 on the prediction of melting points of organic compounds.
The Group Contribution Approach. Equation 4 makes it clear that melting point is controlled by both the enthalpy and the entropy of fusion, and the approaches reviewed above have, implicitly or explicitly, taken both these factors into account. A simpler approach, ignoring the entropy of fusion, is to consider that the enthalpy of fusion, and hence melting point, are additive properties of the interactive contributions of the various parts of a molecule. Wachalewski was the first to adopt this approach.
He assumed additive atomic contributions $U_i$ to melting point, but divided the summed contributions by the average molecular diameter D (Å):

    $T_m = \sum_i U_i / D$

The atomic contributions $U_i$ were reported by Wachalewski as:

    Atom    U_i
    H       17.4
    N       107.4
    O       87.4
    F       76.8
    Cl      358.6
    Br      605.1
    I       977.1
    C       220.0
    S       275.0
The values reported are somewhat surprising. One would have expected atoms capable of participating in hydrogen bonding (N, O) to make a greater contribution
than, say, halogen atoms. Nevertheless, for 85 aliphatic and aromatic hydrocarbons and halogenated hydrocarbons, the average prediction error was 31.5 degrees. It is significant, however, that none of these compounds contained hydrogen bonding groups, and one would suspect that the method would not work well for hydrogen bonding compounds.
Group contributions (GC) to melting point were investigated in great detail by Joback and by Joback and Reid; the method was also outlined by Reid et al. Essentially these authors used multiple regression analysis to determine the contribution of a large number of groups to melting point. Table 5 lists some of the group contributions. Joback and Reid found that melting point could be calculated as:

    $T_m = 122.5 + \sum \mathrm{GC}$    (64)
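The additive scheme of Eq. 64 is easily sketched in Python. The example below is a minimal illustration that uses only a handful of the contributions listed in Table 5; the dictionary, the function name and the way the molecules are broken into groups are assumptions made here for illustration.

```python
# A few contributions copied from Table 5 (a partial listing).
TABLE5 = {
    "ring =CH-": 8.13,
    "ring =C<": 37.02,
    "-OH (phenol)": 82.83,
    "-Cl": 13.55,
}


def joback_tm(groups):
    """Eq. 64: T_m = 122.5 + sum of group contributions."""
    return 122.5 + sum(TABLE5[name] * count for name, count in groups.items())


# Phenol: five ring =CH-, one ring =C< and one phenolic -OH.
phenol = {"ring =CH-": 5, "ring =C<": 1, "-OH (phenol)": 1}
# Any dichlorobenzene isomer: four ring =CH-, two ring =C< and two -Cl.
dichlorobenzene = {"ring =CH-": 4, "ring =C<": 2, "-Cl": 2}

print(round(joback_tm(phenol), 2))
print(round(joback_tm(dichlorobenzene), 2))
```

Because the group counts are identical for the 1,2-, 1,3- and 1,4-dichlorobenzenes, a purely additive scheme of this kind returns the same estimate for all three isomers.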
Table 5. Group Contributions to Melting Point

    Group               Contribution
    Nonring
      >C<                  46.43
      =CH2                 -4.32
      =CH-                  8.73
      =C<                  11.14
      ≡CH                 -11.18
    Ring
      -CH2-                 7.75
      >CH-                 19.88
      >C<                  60.15
      =CH-                  8.13
      =C<                  37.02
    Halogen
      -F                  -15.78
      -Cl                  13.55
      -Br                  43.43
      -I                   41.69
    Oxygen
      -OH (alcohol)        44.45
      -OH (phenol)         82.83
      -O- (nonring)        22.23
      -O- (ring)           23.05
      >C=O (nonring)       61.20
      >C=O (ring)          75.97
      -CHO                 36.90
      -COOH               155.50
      -COO-                53.60
    Nitrogen
      -NH2                 66.89
      >NH (nonring)        52.66
      >NH (ring)          101.51
      >N- (nonring)        48.84
      -CN                  59.89
      -NO2                127.24

Note: From ref. 116.

Reid et al. reported that for 388 simple and complex organic compounds, the average error in $T_m$ was 23 degrees using the Joback and Reid method. As pointed out earlier, the use of group contributions alone ignores the entropic contribution to melting point. Yalkowsky and coworkers have incorporated the entropic contribution through the use of Eq. 5. Simamora and Yalkowsky examined 520 non-hydrogen bonding aromatic compounds for which they derived
enthalpic (group) and entropic contributions and hence calculated melting points from Eq. 6. They also calculated the melting points of the same compounds by the method of Joback and Reid. The standard error of the melting point was found to be 39.2 degrees using their approach, but was 83.4 degrees using the Joback and Reid approach. Simamora and Yalkowsky then applied the method to the 12 chlorobenzenes, which were not included in their training set but had been included in the Joback and Reid training set. Standard errors were found to be 26.9 degrees (Simamora and Yalkowsky method) and 31.2 degrees (Joback and Reid method).
Simamora et al. then extended their approach to a total of 1181 aromatic compounds, some of which were capable of hydrogen bonding, including intramolecular hydrogen bonding (IHB). Since the latter lowers melting point, they incorporated a number of terms to correct for this. One would expect such correction terms to have a negative coefficient, but they found the correction term for four-membered ring intramolecular hydrogen bonding to be positive; they explained this as being due to the fact that a group such as -NH2 in the 2-position of, say, pyridine results in the formation of a cyclic dimer involving intermolecular hydrogen bonding, rather than the formation of a four-membered intramolecularly hydrogen bonded ring which at best is highly strained. The authors reported the correlation as:

    $T_m = \sum n_i m_i / (13.5 - 4.6 \log \sigma)$    (65)
    n = 1181    r = 0.995    s = 36.7
where $m_i$ is the group contribution to the enthalpy of fusion. Simamora and Yalkowsky extended this approach to a total of 1709 aromatic compounds, and have increased the number of functional groups for which contributions to the enthalpy of fusion are available. The group contributions and intramolecular hydrogen bonding correction terms are given in Table 6. The correlation they reported is:

    $T_m = \sum n_i m_i / (56.5 - 19.1 \log \sigma)$    (66)
    n = 1709    r = 0.992    s = 37.5
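The role of the symmetry term in Eq. 66 can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the three contributions are taken from Table 6, the logarithm is assumed to be to base 10 (the coefficient 19.1 is close to 2.303R in J mol^-1 K^-1, which is consistent with that reading), the chlorine-bearing ring carbons are counted as "aromatic C", and the function and variable names are hypothetical.

```python
import math

# Enthalpic contributions m_i in J/mol, copied from Table 6.
M_I = {"aromatic CH": 1940, "aromatic C": 97, "Cl": 3400}


def tm_eq66(groups, sigma):
    """Eq. 66: T_m = sum(n_i * m_i) / (56.5 - 19.1 * log10(sigma))."""
    enthalpy = sum(M_I[name] * count for name, count in groups.items())
    entropy = 56.5 - 19.1 * math.log10(sigma)  # base-10 log is an assumption
    return enthalpy / entropy


# All dichlorobenzene isomers share the same groups and differ only in sigma.
dichlorobenzene = {"aromatic CH": 4, "aromatic C": 2, "Cl": 2}
tm_para = tm_eq66(dichlorobenzene, sigma=4)   # 1,4-isomer
tm_ortho = tm_eq66(dichlorobenzene, sigma=2)  # 1,2- or 1,3-isomer

print(round(tm_para, 1), round(tm_ortho, 1))
```

Unlike a purely additive scheme, the symmetry-dependent denominator separates the more symmetrical 1,4-isomer from the other two, which is the qualitative behavior the entropic term is meant to capture.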
Table 6. Group Contributions to Enthalpy of Fusion of Aromatic Compounds, and Correction Terms for Intramolecular Hydrogen Bonding

    Term                                             m_i (J mol^-1)
    aromatic N                                         3460
    aromatic C                                           97
    aromatic CH                                        1940
    aromatic C involved in biphenyl link              -2170
    F attached to sp^2 C atom                          1950
    Cl attached to sp^2 C atom                         3400
    Br attached to sp^2 C atom                         3900
    I attached to sp^2 C atom                          4400
    CH3 attached to sp^2 C atom                        2590
    NO2 attached to sp^2 C atom                        7320
    CN attached to sp^2 C atom                         8240
    NCS attached to sp^2 C atom                        8810
    SCN attached to sp^2 C atom                        4530
    OH attached to sp^2 C atom                         7110
    NH2 attached to sp^2 C atom                        6520
    CONH2 attached to sp^2 C atom                     15010
    COOH attached to sp^2 C atom                      14750
    CHO attached to sp^2 C atom                        7680
    SH attached to sp^2 C atom                         7700
    O attached to 2 sp^2 C atoms                      -5440
    S attached to 2 sp^2 C atoms                      -3870
    C=O attached to 2 sp^2 C atoms                     2160
    S=O attached to 2 sp^2 C atoms                       30
    SO2 attached to 2 sp^2 C atoms                     1150
    NH attached to 2 sp^2 C atoms                     -1750
    CH2 attached to 2 sp^2 C atoms                    -5120
    COO attached to 2 sp^2 C atoms                     -740
    O atom in a ring                                   2480
    S atom in a ring                                   2750
    C=O group in a ring                                4270
    S=O group in a ring                                7640
    SO2 group in a ring                               10870
    NH group in a ring                                 6900
    CH2 group in a ring                                2220
    COO group in a ring                                8950
    IHB in four-membered ring                          3950
    IHB in five-membered ring                         -1110
    IHB in six-membered ring                          -1820
    IHB in seven-membered ring                         -620
    2,2',6 and/or 6' substituent in biphenyl          -1160

Note: From ref. 120.

(N.B. Eq. 66 is identical to Eq. 65 save that energy units are in J mol^-1.) Simamora and Yalkowsky compared their predictions to those using the method of Joback and Reid; for 1678 aromatic compounds (the maximal number of their compounds for which Joback and Reid group contributions were available) they found the standard error of the melting point predictions to be 37.5 degrees by their method and 69.1 degrees by the method of Joback and Reid, thus clearly demonstrating the importance of including entropic contributions to melting point. Krzyzaniak et al. then applied the method to 596 aliphatic non-hydrogen bonding compounds, with the following result:

    $T_m = \sum n_i m_i / (56.5 - 19.2 \log \sigma + 9.2\,\tau)$    (67)
    n = 596    r = 0.988    s = 34.3
where $\tau$ is the effective number of torsional angles (a measure of flexibility) and is calculated as,

    $\tau = (\mathrm{SP3} + 0.5\,\mathrm{SP2} + 0.5\,\mathrm{RING}) - 1$    (68)
where SP3 is the number of nonring, nonterminal $sp^3$ atoms including NH, N, O, and S, SP2 is the number of nonring, nonterminal $sp^2$ atoms, and RING is the number of monocyclic or fused-ring systems in a molecule. If $\tau$ turns out to be negative, it is assigned a value of zero. Table 7 lists the group contributions and correction terms reported by Krzyzaniak et al. A comparison with the method of Joback and Reid for 59 compounds found a standard error of 34.4 degrees by their method, compared with 51.5 degrees by the Joback and Reid method.
An interesting development in the group contribution approach was reported by Constantinou and Gani. They devised basic (first-order) contributions ($C_1$) to give approximate predictions and second-order contributions ($C_2$) to give more accurate predictions. The second-order contributions are obtained from a consideration of the possible conjugation or resonance structures, the melting point being assumed to be a linear combination of the contributions of these conjugate forms. The method is claimed to capture differences between isomers. The melting point $T_m$ is calculated as,

    $\exp(T_m / t_{m0}) = \sum_i n_i C_{1i} + \sum_j n_j C_{2j}$    (69)
where $t_{m0}$ is a constant equal to 102.425 K, $n_i$ is the number of occurrences of a first-order group with contribution $C_{1i}$, and $n_j$ is the number of occurrences of a second-order group with contribution $C_{2j}$. Table 8 lists the first-order group contributions and Table 9 the second-order group contributions reported by Constantinou and Gani. For 312 unidentified organic compounds, Constantinou and Gani reported, using first-order contributions only, a standard error of 22.51 degrees; inclusion of second-order contributions reduced this to 18.28 degrees. Mean absolute errors were 17.39 and 14.03 degrees, respectively. The authors made a comparison with the Joback and Reid method, which yielded (presumably for the 312 compounds studied by Constantinou and Gani) a mean absolute error of 22.6 degrees.
It is impossible to say, on the information available, whether the Constantinou and Gani method yields superior predictions to those using the Yalkowsky approach. All that can be done is to examine the errors claimed by the two methods (Table 10).
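The two-level scheme can be sketched in Python as follows. This is a minimal illustration which assumes that Eq. 69 has the form exp(T_m / t_m0) = (sum of first-order terms) + (sum of second-order terms), so that T_m = t_m0 ln(sum); the contributions are a small selection from Tables 8 and 9, and the function name and the group assignments for the two example alkanes are assumptions made here.

```python
import math

T_M0 = 102.425  # K, the constant quoted for Eq. 69

# A few first-order (Table 8) and second-order (Table 9) contributions.
FIRST_ORDER = {"CH3": 0.4640, "CH2": 0.9246, "CH": 0.3557}
SECOND_ORDER = {"(CH3)2CH": 0.0381}


def cg_tm(first, second=None):
    """T_m = t_m0 * ln(sum n_i C_1i + sum n_j C_2j), assuming that form of Eq. 69."""
    total = sum(FIRST_ORDER[g] * n for g, n in first.items())
    if second:
        total += sum(SECOND_ORDER[g] * n for g, n in second.items())
    return T_M0 * math.log(total)


# n-Hexane: two CH3 and four CH2 groups, no second-order groups.
tm_hexane = cg_tm({"CH3": 2, "CH2": 4})

# 2-Methylpentane: first-order estimate, then with the (CH3)2CH correction.
groups_2mp = {"CH3": 3, "CH2": 2, "CH": 1}
tm_2mp_first = cg_tm(groups_2mp)
tm_2mp_second = cg_tm(groups_2mp, {"(CH3)2CH": 1})

print(round(tm_hexane, 1), round(tm_2mp_first, 1), round(tm_2mp_second, 1))
```

The second-order term enters inside the logarithm, so its effect on the estimate is a modest adjustment to the first-order value rather than a fixed number of degrees.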
Table 7. Group Contributions to Enthalpy of Fusion of Non-hydrogen Bonding Aliphatic Compounds, and Correction Terms
m_i (J mol^-1)
Term
W
X
-CH3
1953
-CH2-
2347
1806
>CH-
387
-600
>C<
129
-1155
-CH=
2448
2448
>C=
1603
1603
1449
=CH2
1456
=C=
1355
≡CH
2892
-C≡
1938
-F
2227
1777
-Cl
4478
2488
-Br
5285
3512
-I
6789
4950
-O-
3527
2921
-CH=O
7968
7078
-C(=O)-
7352
5332
-COO-
7254
-S-
4507
-NO2 -CN
6827
8271
-N=C=S
10547
-S-C≡N
9639
GEM VIC
1938
5067
-1106 -230
RING3
67
RING4
-22
RING5
-817
RING6+
-689
CHbridge
2125
Cbridge
6432
Notes: From ref. 121. A fragment is identified according to the hybrid state of the atoms to which it is bonded; two types of neighboring groups are considered, X and V, which represent $sp^3$ atoms and $sp^2$ or $sp$ atoms, respectively. GEM = no. of interactions among terminal electron-withdrawing groups on a single carbon atom; VIC = no. of interactions among terminal electron-withdrawing groups on adjacent atoms; RING3, RING4, RING5, RING6+ are the numbers of atoms in a 3-, 4-, 5- and >5-membered ring; CHbridge and Cbridge represent bridgehead carbons in fused ring systems.
Table 8. First Order Group Contributions to Melting Point

    Group         C_1i        Group                  C_1i
    CH3           0.4640      CH2N                   0.9544
    CH2           0.9246      ACNH2                 10.1031
    CH            0.3557      C5H3N                 12.6275
    C             1.6479      CH2CN                  4.1859
    CH2=CH        1.6472      COOH                  11.5630
    CH=CH         1.6322      CH2Cl                  3.3376
    CH2=C         1.7899      CHCl                   2.9933
    CH=C          2.0018      CCl                    9.8409
    C=C           5.1175      CHCl2                  5.1638
    CH2=C=CH      3.3439      CCl3                  10.2337
    ACH           1.4669      ACCl                   2.7336
    AC            0.2098      CH2NO2                 5.5424
    ACCH3         1.8635      CHNO2                  4.9738
    ACCH2         0.4177      ACNO2                  8.4724
    ACCH         -1.7567      CH2SH                  3.0044
    OH            3.5979      I                      4.6089
    ACOH         13.7349      Br                     3.7442
    CH3CO         4.8776      CH≡C                   3.9106
    CH2CO         5.6622      C≡C                    9.5793
    CHO           4.2927      Cl-(C=C)               1.5598
    CH3COO        4.0823      ACF                    2.5015
    CH2COO        3.5572      CF3                    3.2411
    HCOO          4.2250      COO                    3.4448
    CH3O          2.9248      CCl2F                  7.4756
    CH2O          2.0695      CClF2                  2.7523
    CH-O          4.0352      F (except as above)    1.9623
    FCH2O         4.5047      CONH2                 31.2786
    CH2NH2        6.7684      CON(CH3)2             11.3770
    CHNH2         4.1187      CH3S                   5.0506
    CH3NH         4.5341      CH2S                   3.1468
    CH2NH         6.0609
    CHNH          3.4100
    CH3N          4.0580

Notes: From ref. 122. A indicates an aromatic group.
Table 9. Second Order Group Contributions to Melting Point

    Group                                              C_2j
    (CH3)2CH                                           0.0381
    (CH3)3C                                           -0.2355
    CH(CH3)CH(CH3)                                     0.4401
    CH(CH3)C(CH3)2                                    -0.4923
    C(CH3)2C(CH3)2                                     6.0650
    3-membered ring                                    1.3772
    5-membered ring                                    0.6824
    6-membered ring                                    1.5656
    7-membered ring                                    6.9709
    CHn=CHm-CHp=CHk; k,n,m,p ∈ (0,2)                   1.9913
    CH3-CHm=CHn; m,n ∈ (0,2)                           0.2476
    CH2-CHm=CHn; m,n ∈ (0,2)                          -0.5870
    CH-CHm=CHn or C-CHm=CHn; m,n ∈ (0,2)              -0.2361
    Alicyclic side chain Ccyclic-Cm; m > 1            -2.8298
    CH3CH3 (c)                                         1.4880
    CHCHO or CCHO                                      2.0547
    CH3COCH2                                          -0.2951
    CH3COCH or CH3COC                                 -0.2986
    Ccyclic(=O)                                        0.7143
    ACCHO                                             -0.6697
    CHCOOH or CCOOH                                   -3.1034
    ACCOOH                                            28.4324
    CH3COOCH or CH3COOC                                0.4838
    COCH2COO or COCHCOO or COCCOO                      0.0127
    CO-O-CO                                           -2.3598
    ACCOO                                             -2.0198
    CHOH                                              -0.5480
    COH                                                0.3189
    CHm(OH)CHn(OH); m,n ∈ (0,2)                        0.9124
    CHm,cyclic-OH; m ∈ (0,1)                           9.5209
    CHm(OH)CHn(NHp); m,n,p ∈ (0,3)                     2.7826
    CHm(NH2)CHn(NH2); m,n ∈ (0,2)                      2.5114
    CHm,cyclic-NHp-CHn,cyclic; m,n,p ∈ (0,2)           1.0729
    CHm-O-CHn=CHp; m,n,p ∈ (0,2)                       0.2476
    AC-O-CHm; m ∈ (0,3)                                0.1175
    CHm,cyclic-S-CHn,cyclic; m,n ∈ (0,2)              -0.2914
    CHm=CHn-F; m,n ∈ (0,2)                            -0.0514
    CHm=CHn-Br; m,n ∈ (0,2)                           -1.6425
    ACBr                                               2.5832
    ACI                                               -1.5511

Notes: From ref. 122. A indicates an aromatic group; ∈ means within the range of. (c) This should probably be CH3CH2.

Table 10. Comparison of Prediction Error with that from Joback and Reid Method

    Compounds                        Method                     Ratio of Standard (s) or Absolute (a) Error
                                                                to that Given by Joback and Reid Method
    1678 aromatic                    Simamora and Yalkowsky     0.54 (s)
    596 aliphatic (non-H-bonding)    Krzyzaniak et al.          0.67 (s)
    312 unspecified                  Constantinou and Gani      0.62 (a)

Another group contribution method, using three levels of contribution, has been developed by Tu for hydrocarbons, and extended by Tu and Wu to a wide range of organic compounds. Their general equation is:

    $T_m = G_0 + \sum G_i + \sum G_j$    (70)
where $G_0 = \sum_{n=0}^{5} A_n N_C^{-n}$, $A_n$ is a constant, $N_C$ is the number of carbon atoms in a molecule, and n is an integer from 0 to 5. Values of $A_n$ are reported for each value of n. $G_i$ takes into account the contribution from molecular features such as side-chains, double and triple bonds, alicyclic rings, etc., and $G_j$ is the contribution of each functional group. For both $G_i$ and $G_j$ different levels (up to 4 for $G_i$ and up to 3 for $G_j$) are reported for each type of structure or functional group, and these are then summed. For a total of 1310 organic compounds, comprising 24 different chemical classes, Tu and Wu obtained a mean percentage error in the predicted melting point of 8.2%; they compared this with 13.9% by the Joback and Reid method. By comparison, Constantinou and Gani reported a percentage error for their melting point predictions for 312 compounds of 7.23%, against an 11.2% error for the same compounds by the Joback and Reid method. For 22 compounds not included in their training set Tu and Wu found an error of 4.7%. Again, it is difficult to say whether the method is superior to that of the Yalkowsky group or that of Constantinou and Gani.
The Yalkowsky group contribution method appears to this writer to be the simplest of the three methods. It is to be hoped that the group will extend their approach to include hydrogen bonded aliphatic compounds. Tesconi and Yalkowsky have reviewed the Yalkowsky approach in detail, as well as mentioning other published work on melting point prediction. It may be noted here that Chickos et al. have also developed a group contribution method for the calculation of enthalpies and entropies of fusion of organic compounds, although they have not applied their method to the prediction of melting point.
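Since the methods were tested on different compounds, the only common yardstick is the ratio of each method's error to the Joback and Reid error on the same data. The short Python sketch below recomputes those ratios from the figures quoted in the text; the dictionary layout and labels are illustrative choices made here.

```python
# (error of method, error of Joback and Reid on the same compounds),
# as quoted in the text; the kind of error differs from row to row.
reported = {
    "Simamora and Yalkowsky, standard error": (37.5, 69.1),
    "Krzyzaniak et al., standard error": (34.4, 51.5),
    "Constantinou and Gani, mean absolute error": (14.03, 22.6),
    "Constantinou and Gani, percentage error": (7.23, 11.2),
    "Tu and Wu, percentage error": (8.2, 13.9),
}

ratios = {label: round(own / joback, 2) for label, (own, joback) in reported.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio}")
```

All the ratios fall between roughly 0.5 and 0.7, which is why the quoted figures alone do not single out one method as clearly superior.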
Yalkowsky et al.^^^ have combined the group contribution approach with the calculation of properties that contribute to, for example, entropy changes; they term this the unified physical property estimation relationships (UPPER) approach. In this method melting point is obtained as the ratio (calculated enthalpy of fusion)/(calculated entropy of fusion) (Eq. 4); a total of 21 physical properties is calculated by the UPPER method. Yalkowsky et al.^^^ then applied the method to the calculation of properties of the 12 chlorobenzenes; melting points were predicted with a mean absolute error of 27 degrees (see also Simamora and Yalkowsky^^^).

Predictions using group contribution methods are very good, although the standard error is still rather high. The major concern here is for compounds that melt close to environmental temperatures, because for many purposes (e.g. pharmaceutical formulation and waste handling) it is important to know whether the compound is likely to be liquid or solid under normal conditions. Hence a major aim of future work must be to reduce the standard error of the prediction. It is likely that this will be achieved through improvement of entropic rather than enthalpic parameters.

Computerized Methods. Although the "Handbook of Chemical Property Estimation Methods"^^^ does not deal with the estimation of melting point, the on-line version, known as CHEMEST, does in fact do so.^^^ The program uses two methods for the calculation of melting point:

1. Melting point is estimated using the method of Lorenz and Herz,^39 which relates melting point to boiling point. If the boiling point is not available, it too has to be estimated using one of the methods available in CHEMEST.
2. Melting point is estimated using the method of Grain and Lyman,^^^ using the equation,

T^ = 0.474(10^-^/^) + bQ^ - 28    (71)
where ρ = liquid density at normal boiling point, and b = a constant related to chemical class (b = 0 for halogenated aliphatic hydrocarbons, 100 for other aliphatic compounds and halogenated aromatic and cyclic compounds (except perhalogenated), and 175 for other aromatic and cyclic compounds). The first term of the equation models boiling point. Lyman et al.^^^ reported that, for 11 compounds of widely varying structure, method 1 gave an average prediction error of 19.9 degrees, and method 2 one of 11.9 degrees. Such good prediction needs further validation.

Boethling et al.^^^ tested the ability of CHEMEST to predict melting point, inter alia. Using method 1 only, they used CHEMEST to calculate the melting points of 141 compounds comprising six chemical classes, most of which were liquids at room temperature. The mean prediction error was 36.2 degrees, but was considerably worse (61.6 degrees) for 16 carbamates. The physical state at 25 °C was incorrectly predicted for about 20% of the compounds. Boethling et al. also observed that the error (estimated minus observed melting point) decreased from about +70 degrees at -10 °C to about -75 degrees at 190 °C for a set of 34 phenols. The authors commented that a better method of melting point estimation generally applicable to neutral organics is much needed.
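The two figures of merit used in such tests, the mean prediction error and the fraction of compounds whose physical state at 25 °C is wrongly predicted, are simple to compute. The sketch below shows one way of doing so in Python. The melting points in it are hypothetical values invented for illustration (they are not data from Boethling et al.), and a compound is simply called "solid" if it melts above 25 °C and "liquid" otherwise.

```python
ROOM_TEMPERATURE_C = 25.0


def physical_state(melting_point_c, temperature_c=ROOM_TEMPERATURE_C):
    """Classify a compound as 'solid' if it melts above the given temperature."""
    return "solid" if melting_point_c > temperature_c else "liquid"


def evaluate(observed_c, predicted_c):
    """Return (mean absolute error, fraction with wrongly predicted state)."""
    pairs = list(zip(observed_c, predicted_c))
    mean_abs_error = sum(abs(pred - obs) for obs, pred in pairs) / len(pairs)
    wrong_state = sum(
        physical_state(obs) != physical_state(pred) for obs, pred in pairs
    )
    return mean_abs_error, wrong_state / len(pairs)


# Hypothetical observed and predicted melting points in degrees Celsius.
observed = [-30.0, 5.0, 20.0, 40.0, 120.0]
predicted = [-10.0, 35.0, 10.0, 15.0, 90.0]

mean_abs_error, misclassified = evaluate(observed, predicted)
print(f"mean absolute error = {mean_abs_error:.1f} degrees")
print(f"physical state wrongly predicted for {misclassified:.0%} of compounds")
```

The example illustrates why compounds melting near ambient temperature are the critical cases: an error of 25 to 30 degrees is enough to change the predicted state for the two compounds melting at 5 °C and 40 °C, but not for the one melting at 120 °C.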
Lynch et al.^^^ tested an automated version of CHEMEST called AUTOCHEM. This program automatically selects the most appropriate equation(s) for the prediction of a property. The melting points of 229 compounds comprising eight chemical classes, mostly solids at room temperature, were estimated and compared with experimental values. The mean melting point estimation error was 80.9 degrees, and the physical state of the compounds at 25 °C was incorrectly predicted for 15% of the training set. The authors comment that additional effort is needed to improve the accuracy of estimation.

CHEMEST is a mainframe program, but a PC version is now available. A PC-based program called QSAR^^^ also uses the Lorenz and Herz equation^39 for the prediction of melting point, so the results of Boethling et al.^^^ regarding method 1 in CHEMEST apply equally to this program.

A recent PC-based program^^^ called MPBPVP estimates melting point, boiling point, and vapor pressure for organic compounds using SMILES (simplified molecular input line entry system) entry. It calculates melting point by two methods, that of Lorenz and Herz^39 and a modification of that of Joback,^^^ and takes the arithmetic mean of the two; an unspecified weighting is also sometimes applied. It may be noted that the Lorenz and Herz method requires knowledge of the boiling point of the compound; MPBPVP allows for the input of a measured boiling point, but if one is not available it calculates it using an adaptation of the method of Stein and Brown.^^^ Using the same twelve "benchmark" chemicals as used by Lyman^40 to test the Lorenz and Herz^39 method, the program was found to give a mean error of predicted melting point of 25.4 degrees. This compares with the mean error of 27.1 degrees found by Lyman for the Lorenz and Herz method; it must be pointed out, however, that Lyman used mainly measured boiling points for the estimation of melting points, whereas the MPBPVP program was tested with no measured boiling points being used.

CambridgeSoft Corporation have recently released a program called ChemProp Pro^^^ which calculates a range of properties, including melting point, which is calculated using the method of Joback;^^^ the results of Boethling et al.^^^ regarding method 1 in CHEMEST clearly apply to this program also.

Polymers
Prediction of polymer melting points is a large and complex topic and it would not be appropriate here to deal with the subject in great detail. Hence a brief résumé of some of the important published work in this area is felt to be sufficient.

A number of early workers^^,^^,^^,^^ postulated that the melting points of any homologous series approach a limit as chain length tends to infinity. Timmermans^48 went so far as to propose a common limit of about 120 °C, and Somayajulu^78 used 141.4 °C. The concept of a common limit arises because at infinite chain length the contribution of different end groups could be expected to be negligible.^^ The work of Broadhurst^64,65 and Flory and Vrij^22 is particularly relevant, as it set the scene for much of the later work concerning polymers. These workers, using Eq. 29, were able to predict the melting point of polyethylene quite closely. Recent work on the melting point of normal alkanes by Hanson and Rouvray^74 showed a similar asymptotic relationship (Eq. 40), with the melting point of a normal alkane of infinite chain length being predicted to be 411.6 K (138.4 °C; observed value for polyethylene 138 °C). Flory^^^ derived thermodynamically the equation,

$$\frac{1}{T_m} - \frac{1}{T_m^0} = \frac{2R}{\bar{x}_n h_u} \qquad (72)$$

where R is the universal gas constant, $\bar{x}_n$ is the average degree of polymerization, $h_u$ is the heat of fusion per structural unit of the polymer, and $T_m^0$ is the melting point of the polymer of infinite chain length. He also extended his derivation to include copolymers, but gave no indication of how well his equations predicted the melting points of polymers. Eby^^^ derived a similar equation to that of Flory^^^ for the melting points of copolymers using an entirely different rationale based on the concept of the uncrystallizable component of the copolymer and the molecular ends entering the lattice as point defects. Like Flory, he gave no data to validate his equation.

Hay^^^ pointed out that analysis of the melting points of low molecular weight oligomers to determine the melting point of the infinitely extended chain in a perfectly ordered crystal is potentially a more exact method than analysis of observed melting points of high molecular weight polymers, which may be subject to considerable error. He modified Eq. 72 and Eq. 29^^ to:

$$T_m = T_m^0\left[1 - 2RT_m^0\,\frac{(\ln n)/n}{h_u}\right] \qquad (73)$$
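To give a feel for how Eqs. 72 and 73 behave, the following Python sketch evaluates both as a function of chain length. It reads Eq. 72 as 1/T_m - 1/T_m^0 = 2R/(x_n h_u) and Eq. 73 as T_m = T_m^0[1 - 2RT_m^0(ln n)/(n h_u)]. The infinite-chain melting point of 411.6 K is the value quoted above for normal alkanes; the heat of fusion of 4100 J/mol per structural unit is an assumed round figure chosen purely for illustration, and the function names are likewise illustrative.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def flory_tm(x_n, tm0, h_u):
    """Eq. 72 solved for T_m: 1/T_m = 1/T_m0 + 2R / (x_n * h_u)."""
    return 1.0 / (1.0 / tm0 + 2.0 * R / (x_n * h_u))


def hay_tm(n, tm0, h_u):
    """Eq. 73: T_m = T_m0 * [1 - 2 R T_m0 (ln n / n) / h_u]."""
    return tm0 * (1.0 - 2.0 * R * tm0 * (math.log(n) / n) / h_u)


TM0 = 411.6   # K, infinite-chain value quoted in the text
H_U = 4100.0  # J/mol per structural unit; assumed, for illustration only

for n in (20, 100, 1000, 100000):
    print(f"n = {n:6d}  Eq. 72: {flory_tm(n, TM0, H_U):6.1f} K"
          f"  Eq. 73: {hay_tm(n, TM0, H_U):6.1f} K")
```

Both expressions rise monotonically towards T_m^0 as the chain lengthens. Because Eq. 73 is linear in (ln n)/n, a plot of T_m against (ln n)/n extrapolates to T_m^0 at the intercept, which is the basis of the oligomer analysis advocated by Hay.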
Hay gave no predicted melting points, but graphical plots of $T_m$ versus (ln n)/n for polyethylene, poly(ethylene oxide), and poly(methylene oxide diacetate) gave good straight lines.

Buckley and Kovacs^^^ devised an equation for the melting points of poly(ethylene oxide) polymers,

T =7-|__£2l_l_^!^Z^ L'AH* \ AH*    (74)

where σ = surface free energy, ΔH* = molar enthalpy of melting per monomer unit, and L = crystal thickness. They found that a plot of $T_m$ against 1/L gave a straight line.

Van Krevelen^^^ devised a group contribution method for the prediction of melting points of a wide range of polymers. He proposed that,

$$T_m = Y_m / M \qquad (75)$$
where $Y_m$ is the molar melt transition function, and M is the molar weight of the structural unit. This in fact approximates to the general melting point equation $T_m = \Delta H_m / \Delta S_m$ (Eq. 4). The molar melt transition function is defined as,

$$Y_m = \sum_i Y_{m,i} + \sum Y_m(I_X) + \sum Y_m(\mathrm{ODD}) \qquad (76)$$

where $Y_{m,i}$ is the molar melt transition function for each group in the structural unit; $Y_m(I_X)$ is an interaction correction per polar group X, with $I_X$ itself defined as the number of main chain atoms in the polar group X divided by the number of chain atoms of this group plus those of the directly connected methylene chains; and $Y_m(\mathrm{ODD})$ is a correction (reduction) per odd methylene group adjacent to X. Van Krevelen gave values of these terms for a wide range of groups. The following example of the polyester of ethylene glycol and 11-(4-carboxyphenoxy)undecanoic acid serves to illustrate the method. The structural unit formula is:

[-(CH2)2-O-CO-C6H4-O-(CH2)10-CO-O-]

Interaction factors $I_X$ are:

for -COO- attached to (CH2)2     = 2/(2 + 2)        = 0.5
for -COO- attached to (CH2)10    = 2/(2 + 2 + 10)   = 0.143
for -O-                          = 1/(1 + 10)       = 0.091
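The three interaction factors follow directly from the definition of the interaction factor given above: chain atoms in the polar group, divided by those atoms plus the atoms of the directly connected methylene chains. The minimal Python sketch below reproduces them; the function name and argument layout are illustrative assumptions rather than anything prescribed by Van Krevelen.

```python
def interaction_factor(polar_chain_atoms, methylene_chain_lengths):
    """Chain atoms in the polar group divided by those atoms plus the
    atoms of the directly connected methylene chains."""
    return polar_chain_atoms / (polar_chain_atoms + sum(methylene_chain_lengths))


# Chain-atom counts taken from the worked example in the text.
factors = {
    "COO attached to (CH2)2": interaction_factor(2, [2]),
    "COO attached to (CH2)10": interaction_factor(2, [2, 10]),
    "O": interaction_factor(1, [10]),
}

for group, value in factors.items():
    print(f"{group}: {value:.3f}")
```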
There are no methylene chains with odd numbers of methylene groups, so $Y_m(\mathrm{ODD})$ = 0. Total $Y_m$ is calculated from data for the different structural groups given by Van Krevelen in Table 11. Van Krevelen stated that of nearly 800 polymers whose melting points are reported, about 75% gave calculated values that differed by less than 20 degrees from experimental values. He commented also that some of the experimental values for the other 25% are not fully reliable, and concluded that his method gave satisfactory predictions.

Wunderlich and Czornyj^^^ used the Broadhurst^^ and Flory and Vrij^22 approaches to predict the melting points of a range of polyethylene polymers, with remarkably accurate results. For 18 polymers with chain lengths from C^^ to infinity, the mean prediction error was only 0.6 degrees using Broadhurst's equation and 2.0 degrees using Flory and Vrij's equation. There is a certain irony in these results, since the Flory and Vrij equation contains an entropy term designed to improve predictive ability! These authors also introduced an equation relating melting point to the thickness t of the lamellae in the crystal:
Table 11. Calculation of Melting Point of Polyester of Ethylene Glycol and 11-(4-Carboxyphenoxy)undecanoic Acid by the Method of Van Krevelen^^^

Group                    Term                            Contribution
12 x CH2                 Y_m                                  68,400
1 x C6H4                 Y_m                                  50,000
(COO)1                   Y_m                                   5,000
                         Y_m(I_X) (0.5 x 30,000)              15,000
                         Y_m(ODD)                                  0
(COO)2                   Y_m                                   5,000
                         Y_m(I_X) (0.143 x 30,000)             4,300
                         Y_m(ODD)                                  0
-O-                      Y_m                                  -3,300
                         Y_m(I_X) (0.091 x 30,000)             3,000
                         Y_m(ODD)                                  0
COO-O interaction                                            -18,000
Asymmetry correction                                         -10,000
Total Y_m                                                    119,400

Molar weight of the structural unit = 348
Hence T_m = 119,400/348 = 343 K
Observed T_m = 338 K
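The arithmetic of Table 11 can be checked with a few lines of Python. The sketch below simply sums the group contributions and applies Eq. 75 (melting point = total molar melt transition function divided by the molar weight of the structural unit). One assumption should be noted: the second ester group is given the same base contribution (5,000) as the first, which is the reading that reproduces the printed total of 119,400. The zero-valued ODD corrections are omitted, and the labels are illustrative.

```python
# (label, contribution) pairs as read from Table 11; ODD corrections are zero.
CONTRIBUTIONS = [
    ("12 x CH2", 68_400),
    ("1 x C6H4", 50_000),
    ("(COO)1 base value", 5_000),
    ("(COO)1 interaction, 0.5 x 30,000", 15_000),
    ("(COO)2 base value (assumed equal to (COO)1)", 5_000),
    ("(COO)2 interaction, 0.143 x 30,000", 4_300),
    ("-O- base value", -3_300),
    ("-O- interaction, 0.091 x 30,000", 3_000),
    ("COO-O interaction", -18_000),
    ("asymmetry correction", -10_000),
]
MOLAR_WEIGHT = 348     # molar weight of the structural unit
OBSERVED_TM_K = 338    # observed melting point quoted with the table

y_m_total = sum(value for _, value in CONTRIBUTIONS)
predicted_tm_k = y_m_total / MOLAR_WEIGHT  # Eq. 75

print(f"total Y_m = {y_m_total}")
print(f"predicted T_m = {predicted_tm_k:.0f} K "
      f"(observed {OBSERVED_TM_K} K, "
      f"error {predicted_tm_k - OBSERVED_TM_K:+.0f} K)")
```

The predicted value lies about 5 degrees above the observed one, well inside the 20-degree band that Van Krevelen reported for about 75% of polymers.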
$$T_m = 414.2\,(1 - 0.627/t) \qquad (77)$$

They do not quote predicted melting points using this equation, but show graphically that a plot of $T_m$ against 1/t gives a good straight line (cf. the work of Buckley and Kovacs^^^).

Mekenyan et al.^50 used a graph-theoretical approach to the calculation of a range of physical properties, including melting point, of polymers. They developed a total of 29 equations for the prediction of melting points of various polymers. Their approach is based on the Wiener index, which is a topological index relating to the number of bonds between each pair of atoms in the molecule. Since this tends to infinity as N → ∞, the authors used a modification W of the Wiener index in their correlations. Two examples of their correlations are given below:

Polyethylene

$$\mathrm{mp} = 693.675 - 6439.3\,W + 11746.3\,W^2 \qquad (78)$$
n = 17    r = 0.996    s = 6.51
Polycapramide

$$\mathrm{mp} = 507.751 - 1297.3\,W + 3424.36\,W^2 \qquad (79)$$
n = 11    r = 0.909    s = 1.61

However, the method does not give good predictions of the melting points of infinite chain length polymers; for polyethylene it yields 123.3 °C, whereas the observed value is 138 °C.

Mandelkern and Stack^^^ have given an excellent critical discussion of the theoretical and experimental basis for determining the melting temperatures of long-chain molecules. They confirmed the validity of the Flory and Vrij approach,^22 and suggested that some earlier correlations (e.g. that of Wunderlich and Czornyj^^^) may not be as good as claimed because of errors in experimental melting points of very long-chain alkanes. Mandelkern and Stack pointed out that the Flory and Vrij approach is the correct one where molecular crystals are formed, but for real polymer chains of finite length, molecular crystals cannot be formed and a different analysis is required. They proposed the use of Flory's equation^^^ (Eq. 72), but stated that because the parameters involved are molecular weight-dependent, it is not possible to extrapolate to the melting point of the infinite chain length polymer using solely the melting temperatures of equilibrium crystallites formed by chains of finite length.

Cantor and Dill^^^ have pointed out that most liquid n-alkanes comprising 9-14 carbons freeze to a "rotator" phase a few degrees above the temperatures at which they fully crystallize. They developed a statistical mechanical theory to predict melting from the rotator phase, and, although they did not tabulate results, showed graphically that experimentally observed melting points were extremely close to their predicted values, from C^ to CggQ.

Starkweather,^^^ again using the Flory and Vrij approach,^22 was able to predict the melting points of perfluoroalkanes and poly(tetrafluoroethylene). Using differential scanning calorimetry, he calculated that a perfectly crystalline, chain-extended, monodisperse high polymer should melt at 347 °C, which compares well with an experimental value of 346 °C.
Copolymers present a rather different problem, and require a different approach. Frushour^^^ developed an equation to predict the melting points of polyacrylonitrile copolymers:

$$\frac{1}{T_m} - \frac{1}{T_m^0} = \sum_{i=1}^{n-1} K_i X_i \qquad (80)$$

where $T_m^0$ = melting point of the homopolymer, n = copolymer order, $X_i$ = mole fraction of the i-th monomer, and $K_i$ = corresponding melting point depression constant. Using this equation, Frushour was able to predict the melting points of a range of copolymers and terpolymers to better than 1 degree in most cases.

Tanaka^^^ (and references cited therein) has modeled the melting points of atactic polypropylene and propylene/ethylene copolymers,
using an equation whose parameters are the universal gas constant R, the heat of fusion per molar structural unit of the major component, the heat of transition per molar structural unit due to quasicrystals in the amorphous regions, the molar surface free energy at the ends of a crystal, a constant relating to the number and mean lengths of blocks composed of crystallizable units, and the crystal length. No comparisons of observed and predicted melting points were given by Tanaka.

Polikarpov et al.,^^^ studying polyorganocarbosilanes, devised the equation,

$$\frac{1}{T_m} = \frac{\sum_i K_i\,\Delta V_i}{\sum_i \Delta V_i} \qquad (82)$$

where $\Delta V_i$ = incremental volume of the i-th unit of the -SiR2-(CH2)3- repeat unit, $K_i = 18.5R/(zD_i)$, R = radius of atom in question, z = coordination number, and $D_i$ = bond length between atoms. For eight different polyorganocarbosilanes the average error of the predicted melting points was 7 degrees.

Carlier et al.^^^ used a new concept, the percentage of rigid chain length (PRCL), as a means of predicting the melting points of poly(aryl ether ketone)s and poly(aryl ether sulfone)s. For example, for poly(aryl ether ketone)s they obtained:

$$\mathrm{mp} = 9.7937\,\mathrm{PRCL} - 202.33 \qquad (83)$$
n = 10    r = 0.996    s not given
They point out that the enthalpy of melting is fairly constant due to the isomorphism of the diphenyl ether and diphenyl ketone groups, so variations in entropy of melting are largely responsible for the variation in melting point; it is this factor which is believed to be largely responsible for the rectilinear correlation observed in Eq. 83.

Tan and Rode^^^ investigated the relationship between the melting points of oligomethylenes and quantum chemical properties calculated using CNDO/2. They found an excellent correlation with the sums of charges on carbon (QC) and hydrogen (QH) atoms, respectively:

$$\mathrm{mp} = 484.4 + 278687.8 \sum_{i}^{n} QC_i/n + 503772.4 \sum_{i}^{2n+2} QH_i/(2n+2) \qquad (84)$$
n = 41    r = 0.999    s = 1.72

The average error in the predicted melting point for 22 compounds not in the training set was 0.60 degrees. Tan and Rode observed that the sums of charges correlated well with the number (n) of methylene groups in the oligomers, leading to Eq. 85:

$$\mathrm{mp} = 141.4 + 7918.6/n - 10535.2/(n + 1) \qquad (85)$$
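Eq. 85 is simple enough to evaluate directly, and doing so shows its asymptotic behavior. In the Python sketch below the function name is an illustrative choice, and the result is taken to be in degrees Celsius; that is an assumption, made because the limiting value of 141.4 coincides with the 141.4 °C limit attributed to Somayajulu earlier in this section.

```python
def tan_rode_mp(n):
    """Eq. 85: melting point of an oligomethylene with n methylene groups."""
    return 141.4 + 7918.6 / n - 10535.2 / (n + 1)


for n in (10, 20, 50, 100, 1000):
    print(f"n = {n:5d}   mp = {tan_rode_mp(n):7.2f}")
```

Both chain-length terms vanish as n grows, so the predicted melting point climbs steadily towards the constant term of 141.4.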
The authors claim that their method yields better predictions of the melting points of oligomethylenes than does that of Somayajulu.^78

Sumpter and Noid^^^ have used an unsupervised back-propagation neural network to predict the melting points of a range of polymers using descriptors defined by a combination of molecular connectivity indices, chemical composition and IUPAC nomenclature. For a set of 56 unspecified polymers selected from 11 different families, their method predicted melting points with a standard error of 21 degrees, compared with 26 degrees using partial least squares regression, 24 degrees using locally weighted regression, 26 degrees using ridge regression, 30 degrees using polynomial partial least squares regression, and 40 degrees using kernel regression. They reported the experimental melting point range of the polymers as 230-266 K, which seems incorrect.

Burkhardt et al.,^^^ using comparative molecular field analysis, were able to predict the melting points of 11 polypropylenes with steric descriptors of the metallocenes used in the polymerization process; a four-component correlation yielded r = 0.987 and s = 7.05 degrees.

It is hoped that these few examples of studies concerned with the prediction of melting points of polymers will have given a reasonable, albeit brief, overview of this field which, although important, is perhaps of only peripheral interest to the environmental chemist and pharmaceutical formulator.

Inorganic Substances
In contrast to the relatively large number of papers dealing with the prediction of melting points of organic compounds, there has been very little work done concerning inorganics. Gold and Ogle^63 examined the accuracy of three methods in predicting melting points of inorganic compounds. They reported mean percentage errors (± 95% confidence limits in degrees) as follows:

Method of Lorenz and Herz^39 for 42 compounds: 26.63% (± 111 degrees)
Method of Benko^62 for 35 compounds: 12.88% (± 77.94 degrees)
Method of Prud'homme^38 for 37 compounds: 6.26% (± 63.72 degrees)

However, as commented earlier, Gold and Ogle's method of calculating percentage error is open to doubt.

Wachalewski,^68 whose work on the prediction of melting point of organics has already been discussed (see Eq. 33), also applied his method to a series of 19 simple inorganic compounds, with an average error of 28.2 degrees.

Sharma,^^^ starting from thermodynamic theory, developed an equation for predicting the melting point of "simple nonpolar liquids," meaning essentially the inert gases,

$$T_m = T^{*}\,\delta\,(\tilde{V}^{1/3} - 1)/\tilde{V}^{4/3} \qquad (86)$$
where $T^{*}$ = characteristic temperature, $\delta$ = lattice distortion parameter, and $\tilde{V}$ = reduced volume. For four inert gases and nitrogen, the average error in predicted melting point was 6.5 degrees, with the error rising with atomic size. It is not known whether the method can be applied to inorganic nonpolar liquids.

An interesting series of studies has been carried out by Kutolin and coworkers. Kutolin et al.^^^ developed the following equation for the melting points of binary compounds of lanthanide rare-earth elements,

mp = 246.91^° + 48 E_F(X) - 578.1 m/n + 2482.2    (87)

where dff = number of electrons in ^2 orbit of the lanthanide rare-earth element, $E_F(X)$ = Fermi energy of the element X, m = number of atoms of X in molecule, and n = number of atoms of lanthanide rare-earth element in molecule. For 11 such compounds, the average error in predicted melting point was found to be 46.5 degrees. In an extension to this work, Kutolin and Kotyukov^^^ and Kutolin et al.^^^ used Chebyshev functions (orthogonal functions analogous to principal components) derived from electronic parameters such as Fermi energies to develop a series of equations for the melting points of binary compounds and sesquioxides of rare-earth elements. For 12 such compounds, the average error in predicted melting point was 100.3 degrees, suggesting that the method is not so satisfactory as that of Kutolin et al.^^^ referred to above. Kutolin et al.^^^ simplified their 1978 approach when they correlated the melting points of refractory metal dihydrides with a single electronic parameter, the Fermi energy level ($E_F$) of the metal dication:

$$T_m = 17.537\,E_F + 1570 \qquad (88)$$
For seven such dihydrides, the average error in predicted melting point was found to be 80.4 degrees, with a range of 1 to 199 degrees.

A number of workers have attempted to predict the melting points of superheavy elements. Keller et al.^^^ used the Lindemann equation^36 to predict such melting points. Kazragis et al.^^^,^^^ developed several equations for the prediction of melting points of metallic elements. For example, they developed Eq. 89 from the melting points of Mo, Tc, Ru, Rh, Pd, Ag, and Cd, and used it to predict the melting points of some superheavy elements.

$$T_m = 2888 + 15.334\,w^2 - 2.6124\,w^3 \qquad (89)$$

In Eq. 89, w = number of outer s and d electrons in an atom. The actual and predicted melting points for the training set elements are:

Element    T_m (observed)    T_m (predicted)
Mo         2893              2876
Tc         2473              2743
Ru         2523              2532
Rh         2236              2226
Pd         1827              1809
Ag         1235              1267
Cd         594               582
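The tabulated predictions can be regenerated from Eq. 89, which also serves as a check on the form of the equation. The Python sketch below reads Eq. 89 as a cubic in w with a squared and a cubed term, the reading that reproduces the predicted column. It assumes w = 6 for Mo rising by one per element to w = 12 for Cd, values inferred here from the ground-state electron configurations rather than stated in the text; the function and variable names are illustrative.

```python
def kazragis_tm(w):
    """Eq. 89 read as T_m = 2888 + 15.334 w^2 - 2.6124 w^3."""
    return 2888 + 15.334 * w**2 - 2.6124 * w**3


# element: (assumed w, observed T_m, predicted T_m as tabulated in the text)
TRAINING_SET = {
    "Mo": (6, 2893, 2876),
    "Tc": (7, 2473, 2743),
    "Ru": (8, 2523, 2532),
    "Rh": (9, 2236, 2226),
    "Pd": (10, 1827, 1809),
    "Ag": (11, 1235, 1267),
    "Cd": (12, 594, 582),
}

abs_errors = []
for element, (w, observed, tabulated) in TRAINING_SET.items():
    calculated = kazragis_tm(w)
    abs_errors.append(abs(calculated - observed))
    print(f"{element}: calculated {calculated:7.1f}, "
          f"tabulated {tabulated}, observed {observed}")

mean_abs_error = sum(abs_errors) / len(abs_errors)
print(f"mean absolute error vs. observed = {mean_abs_error:.1f} degrees")
```

With these assumptions the equation reproduces every tabulated prediction to within one degree. The fit to the observed values is dominated by technetium, whose predicted melting point is about 270 degrees too high.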
Another equation developed by Kazragis et al., but with no predictions given, is, 7'm = ^ e ° ' ' ' - u ' " ^ ' ' ' ' / W ' i i r
(90)
where the variables are the electron density in the conduction band, the ionic radius, the internuclear distance, and the ionic charge in the metallic state.

Kutolin et al.^^^ used electronic parameters to calculate the melting points of elements with such high atomic numbers that they were as yet undetected or had not had their melting points determined. They reported the following equation, although no evidence was offered for its validation on elements whose melting points are known,

mp = 0.80726 x-^ + 130.5505 X2X^ + 144.28273 x^x^ - 42.68037 x^x^ - 299.86699    (91)

where $x_1$ = atomic number, $x_2$ and $x_3$ = number of electrons in outer sublevels s and d respectively, $x_4$ = periodic table group of element, $x_5$ = quantum number for the M^ shell, and $x_6$ = magnetic quantum number. They reported the predicted melting points of 16 elements from atomic numbers 104 to 160.

Bonchev and Kamenska^^^ used the Shannon information index to predict the melting points of the 113-120 transactinide elements. They commented that their predictions were similar to those of Keller et al.^^^

Gomez et al.^^^ have attempted to predict the melting points of some face-centered cubic noble and transition metals using the calculated tight-binding potential. Their predictions can be described as only fair, having a mean error of 341.7 degrees for nine metals.

Li et al.,^^^ using neural networks, found that five descriptors (electronegativity difference, valence electron density difference, electron-atom ratio, metallic radius ratio, and average melting point of constituent elements) could model the melting points of AB-type intermetallic compounds with a mean error of 14.8 degrees for 11 such compounds.
Reddy et al.^^^ observed that the melting points of tetrahedral semiconductors could be correlated with the arithmetic mean of the nuclear effective charge ($\bar{z}$) on each of the atoms:

$$\mathrm{mp} = 4322 - 971\,\bar{z} \qquad (92)$$
no statistics given

For 18 such compounds Eq. 92 gave an average error of 86.6 degrees.

Bosi^^^ developed a theory for predicting the melting points of alkali metal halides and alkaline earth oxides. He derived the following equation:
T
zV
^
(93)
where $z^{+}$ and $z^{-}$ are the charges on the cation and anion respectively, ε is the dielectric constant, K is the Boltzmann constant, $r_a$ and $r_c$ are the anionic and cationic radii, respectively, $\Delta H_f$ is the latent heat of fusion, and $\Delta H_l$ is the lattice energy (latent heat of sublimation); these latent heats were obtained from the literature. For 20 alkali metal halides, the average error in prediction of melting point was 59 degrees, while for 5 alkaline earth oxides it was 136 degrees. However, the average percentage error was about the same for the two series, since the alkaline earth oxides have much higher melting points than do the alkali metal halides.

Kang et al.^^^ have used artificial neural networks and pattern recognition with chemical bond parameters to predict the melting point of CsClMnO4 as 677 °C, reportedly in agreement with experiment.

Horvath^^^ has briefly reviewed the prediction of melting points of inorganic compounds. It is apparent that there is as yet no consistent method for the prediction of melting points of inorganic compounds, even those of relatively simple composition. Clearly there is scope for much more work in this field, perhaps through the application of molecular orbital theory.
V. CONCLUSIONS

Melting point is a readily measurable property of importance in many ways. As such, its prediction has attracted much interest. Early quantitative work concentrated on hydrocarbons and homologous series, and numerous equations were developed relating melting point to chain length. The odd-even alternation in melting point was generally dealt with through the use of separate equations for odd and even chain lengths. Melting points of these compounds can now be predicted with high accuracy.

Hydrogen bonding is an important factor in melting point, and must be taken into account if good predictions are to be made. Few methods so far devised have incorporated hydrogen-bonding contributions; undoubtedly the best to date is the group contribution method of Simamora and Yalkowsky.^^^ Even their method has a rather high standard error of prediction, and further work is needed to reduce this.

Homopolymers represent the ultimate extrapolation of homologous series, and several of the equations devised to predict the melting points of homologous series have been used successfully to predict the melting points of polymers. Copolymers represent a more difficult problem, but there have been several reasonably successful attempts to predict their melting points, although generally on a more empirical basis.

Elements and inorganic compounds have come in for quite a lot of attention, and a number of different approaches have been used, based generally on functions relating to electronic structure. There is, however, still no general method available for the prediction of melting points of inorganics. No work appears to have been done on the estimation of melting points of metallorganic complexes.
ACKNOWLEDGMENTS

I am grateful to Prof. P.J. Duke and the late Prof. C. Silipo for translating some Russian and Italian texts respectively for me.
REFERENCES

1. Glasstone, S. Textbook of Physical Chemistry, 2nd edn.; D. Van Nostrand: New York, 1946, p 461.
2. Bean, V. E.; Wood, S. D. J. Chem. Phys. 1980, 72, 5838-5841.
3. Berry, R. S. Sci. Amer. 1990, 263 (2), 50-56.
4. Yalkowsky, S. H.; Valvani, S. C. J. Pharm. Sci. 1980, 69, 912-922.
5. Yalkowsky, S. H.; Banerjee, S. Aqueous Solubility: Methods of Estimation for Organic Compounds; Marcel Dekker: New York, 1992; p 62.
6. Meylan, W. H.; Howard, P. H.; Boethling, R. S. Environ. Toxicol. Chem. 1996, 15, 100-106.
7. Hansch, C.; Leo, A. Substituent Constants for Correlation Analysis in Chemistry and Biology; Wiley Interscience: New York, 1979, pp 18-43.
8. Rekker, R. F.; Mannhold, R. Calculation of Drug Lipophilicity; VCH: Weinheim, 1992.
9. Meylan, W. H.; Howard, P. H. J. Pharm. Sci. 1995, 84, 83-92.
10. Lipnick, R. L. In Practical Applications of Quantitative Structure-Activity Relationships (QSAR) in Environmental Chemistry and Toxicology; Karcher, W.; Devillers, J., Eds.; Kluwer Academic Publishers: Dordrecht, 1990, pp 281-293.
11. Mackay, D. Personal communication, 1990.
12. Partington, J. R. An Advanced Treatise on Physical Chemistry; Longmans Green & Company: London, 1952, Vol. 3, pp 545-555.
13. Trouton, F. Phil. Mag. 1884, 18, 54-57.
14. Abramowitz, R.; Yalkowsky, S. H. Pharm. Res. 1990, 7, 942-947.
15. Yalkowsky, S. H. Ind. Eng. Chem. Fundamentals 1979, 18, 108-111.
16. Chickos, J. S.; Hesse, D. G.; Liebman, J. F. J. Org. Chem. 1990, 55, 3833-3840.
17. Chickos, J. S.; Braton, C. M.; Hesse, D. G.; Liebman, J. F. J. Org. Chem. 1991, 56, 927-938.
18. Dannenfelser, R. M.; Surendran, N.; Yalkowsky, S. H. SAR QSAR Environ. Res. 1993, 1, 273-292.
19. Abramowitz, R.; Yalkowsky, S. H. Chemosphere 1990, 21, 1221-1229.
20. Tsakanikas, P. D.; Yalkowsky, S. H. Toxicol. Environ. Chem. 1988, 17, 19-33.
21. Huggins, M. L. J. Phys. Chem. 1939, 43, 1083-1098.
22. Flory, P. J.; Vrij, A. J. Am. Chem. Soc. 1963, 85, 3548-3553.
23. Dannenfelser, R.-M.; Yalkowsky, S. H. Ind. Eng. Chem. Res. 1996, 35, 1483-1486.
24. Partington, J. R. An Advanced Treatise on Physical Chemistry; Longmans Green & Company: London, 1952, Vol. 3, pp 461-462.
25. Skau, E. L.; Arthur, J. C.; Wakeham, H. In Physical Methods of Organic Chemistry, 3rd edn.; Weissberger, A., Ed.; Interscience Publishers: New York, 1959, Part 1, pp 287-334.
26. Furniss, B. S.; Hannaford, A. J.; Smith, P. W. G.; Tatchell, A. R. Vogel's Textbook of Practical Organic Chemistry, 5th edn.; Longman: Harlow, 1989, pp 240-236.
27. Ford, J. L.; Timmins, P. Pharmaceutical Thermal Analysis; Ellis Horwood: Chichester, 1989, pp 108-135.
28. Partington, J. R. An Advanced Treatise on Physical Chemistry; Longmans Green & Company: London, 1949, Vol. 1, pp 498-501.
29. Carnelley, T. Phil. Mag. Ser. 5 1882, 13, 112-130.
30. Partington, J. R. An Advanced Treatise on Physical Chemistry; Longmans Green & Company: London, 1952, Vol. 3, pp 462-463.
31. Mills, E. J. Phil. Mag. Ser. 5 1884, 17, 173-187.
32. Baeyer, A. Berichte 1877, 10, 1286-1288.
33. Kipping, F. S. J. Chem. Soc. 1894, 63, 465-468.
34. Longinescu, G. G. J. Chim. Phys. 1903, 1, 296-301.
35. Tsakalotos, D.-E. Compt. Rend. Acad. Sci. Paris II 1906, 143, 1235-1236.
36. Lindemann, F. A. Physik. Z. 1910, 11, 609-612.
37. Robertson, P. W. J. Chem. Soc. 1919, 115, 1210-1223.
38. Prud'homme, M. J. Chim. Phys. 1920, 18, 359-361.
39. Lorenz, R.; Herz, W. Z. Anorg. Allgem. Chem. 1922, 122 (2), 51-60.
40. Lyman, W. J. In Environmental Exposure from Chemicals; Neely, W. B.; Blau, G. E., Eds.; CRC Press: Boca Raton, FL, 1985, Vol. 1, pp 13-47.
41. Taft, R.; Stareck, J. J. Phys. Chem. 1930, 34, 2307-2317.
42. Malone, G. B.; Reid, E. E. J. Am. Chem. Soc. 1929, 51, 3424-3427.
43. Partington, J. R. An Advanced Treatise on Physical Chemistry; Longmans Green & Company: London, 1952, Vol. 3, pp 463-465.
44. Beacall, T. Rec. Trav. Chim. 1928, 47, 37-44.
45. Garner, W. E.; Madden, F. C.; Rushbrooke, J. E. J. Chem. Soc. 1926, 2491-2502.
46. Garner, W. E.; King, A. M. J. Chem. Soc. 1929, 1849-1861.
47. Garner, W. E.; Van Bibber, K.; King, A. E. J. Chem. Soc. 1931, 1533-1541.
48. Timmermans, J. Les Constantes Physiques des Composés Organiques Cristallisés; Masson et Cie: Paris, 1953, pp 256-273.
49. Powell, R. E.; Clark, C. R.; Eyring, H. J. Chem. Phys. 1941, 9, 268-273.
50. Mekenyan, O.; Dimitrov, S.; Bonchev, D. Eur. Polym. J. 1983, 19, 1185-1193.
51. Austin, J. B. J. Am. Chem. Soc. 1930, 52, 1049-1053.
52. Lovell, E. L.; Hibbert, H. J. Am. Chem. Soc. 1939, 61, 1916-1920.
53. Merckel, J. H. C. Proc. Roy. Acad. Amsterdam 1937, 40, 164-173.
54. Meyer, K. H.; van der Wyk, A. Helv. Chim. Acta 1937, 20, 1313-1320.
55. Moullin, E. B. Proc. Camb. Phil. Soc. 1938, 34, 459-464.
56. Seyer, W. F.; Patterson, R. F.; Keays, J. L. J. Am. Chem. Soc. 1944, 66, 179-182.
57. Etessam, A. H.; Sawyer, M. F. J. Inst. Petrol. 1939, 25, 253-262.
58. Gray, C. G. J. Inst. Petrol. 1943, 29, 226-234.
59. Smittenberg, J.; Mulder, D. Rec. Trav. Chim. 1948, 67, 813-825.
60. Fortuin, J. M. H. Rec. Trav. Chim. 1958, 77, 5-16.
61. Keyes, R. W. Phys. Rev. 1959, 115, 564-567.
62. Benko, J. Acta Chim. Hung. 1959, 27, 351-361.
63. Gold, P. I.; Ogle, G. J. Chem. Eng. 1969, 76 (1), 119-122.
The Prediction of Melting Point
173
64. Broadhurst, M. G. / Res. Nat. Bur. Stds. 1962, 66A, 241-249. 65. Broadhurst, M. G. J. Res. Nat. Bur Stds. 1966, 70A, 481-486. 66. Grigor'ev, S. M.; Pospelov, V. M. Sb. Nauchn. Tn, Ukr Nauchn.—Issled. Uglekhim. Inst. 1965, No.16, 153-173. 67. Eaton, E. O. Chem. Technol. 1971, 362-366. 68. Wachalewski, T. Postepy Fiz. 1970, 27, 403-412. 69. Syunyaeva, R. Z. Chem. Technol. Fuels Oils 1981, 77, 161-164. 70. Mackay, D.; Shiu, W. T.; Bobra, A.; Billington, J.; Chan, E.; Yeun, A.; Ng, C ; Szeto, F. U.S. Environmental Agency Report PB 82-230939; Athens, Georgia, 1982. 71. Seybold, P. G.; May, M. A.; Gargas, M. L. Acta Pharm. Jugosl. 1986,36, 253-265. 72. Kier, L. B.; Hall, L. H. Molecular Connectivity in Structure-Activity Analysis; Research Studies Press: Letchworth, 1986, pp 1-24. 73. Westwell, M. S.; Searle, M. S.; Wales, D. J.; Williams, D. H. J. Am. Chem. Soc. 1995, 777, 5013-5015. 74. Hanson, M. P.; Rouvray, D. H. In Graph Theory and Topology in Chemistry; King, R. B.; Rouvray, D. H., Eds.; Elsevier: Amsterdam, 1987, pp 201-208. 75. Adler, N.; Kova5ie-Beck, L. In Graph Theory and Topology in Chemistry; King, R. B.; Rouvray, D. H., Eds.; Elsevier: Amsterdam, 1987, pp 194-200. 76. Needham, D. E.; Wei, I.-C; Seybold, P G. J. Am. Chem. Soc. 1988, 770,4186-4194. 77. Pogliani, L. J. Phys. Chem. 1995, 99, 925-937. 78. Somayajulu, G. R. Int. J. Thermophys. 1990, 77, 555-572. 79. Kreglewski, A. Bull. Acad. Polon. ScL, Ser. Sci. Chim. 1961, 9, 163-167. 80. Kreglewski, A.; Zwolinski, B. J. J. Phys. Chem. 1961, 65, 1050-1052. 81. Riazi, M. R.; Al-Sahhaf, T. A. Ind Eng. Chem. Res. 1995, 34, 4145-4148. 82. Cherqaoui, D.; Villemin, D.; Kvasnicka, V. Chemom. Intell. Lab. Systems 1994, 24, 117-128. 83. Todeschini, R.; Gramatica, P.; Provenzani, R.; Marengo, E. Chemom. Intell. Lab. Systems 1995, 27, 221-229. 84. Todeschini, R.; Gramatica, P SAR QSAR Environ. Res. 1997, 7, 89-115. 85. Marano, J. J.; Holder, G. D. Ind. Eng. Chem. Res. 1997, 36, 1895-1907. 86. Cramer, R. D. /. Am. 
Chem. Soc. 1980,102, 1837-1849. 87. Cramer, R. D. J. Am. Chem. Soc. 1980,102, 1849-1859. 88. Charton, M.; Charton, B. I. In QSAR in Design ofBioactive Compounds; Kuchar, M., Ed.; J.R. Prous: Barcelona, 1984; pp 41-51. 89. Dearden, J. C ; Rahman, M. H. Mathl. Comput. Modelling 1988, 77, 843-846. 90. Verloop, A.; Hoogenstraaten, W; Tipker, J. In Drug Design; Ariens, E. J., Ed.; Academic Press: New York, 1976, Vol. 7, pp 165-207. 91. Dearden, J. C. Sci. Total Environ. 1991,109/110, 59-68. 92. Abraham, M. H. Personal communication, 1990. 93. Murugan, R.; Grendze, M. P.; Toomey, J. E.; Katritzky, A. R.; Karelson, M.; Lobanov, V.; Rachwal, P CHEMTECH1994, 24 (9), 17-23. 94. Mason, D.; Bernstein, J. Mol. Cryst. Liq. Cryst. 1994,242, 179-191. 95. Abramowitz, R., PhD. Thesis, University of Arizona, 1986. 96. Yalkowsky, S. H.; Krzyzaniak, J. E; Myrdal, P B. Ind Eng. Chem. Res. 1994, 33, 1872-1877. 97. Tesconi, M.; Yalkowsky, S. H. In Estimating Chemical Properties for the Environmental and Health Sciences: a Handbook of Methods; Boethling, R. S., Mackay, D., Eds.; Ann Arbor Press: Chelsea, MI, 1999, in press. 98. Bhattacharjee, S.; Rao, A. S.; Dasgupta, P Computers Chem. 1991, 75, 319-322. 99. Medic-Sarie, M.; Nickolie, S.; MatijeviC-Sosa, J. Acta Pharm. 1992,42, 153-167. 100. Charton, M.; Charton, B. I. Abstn 27th M.A.R.M., Am. Chem. Soc. 1993, 129-130. 101. Charton, M.; Charton, B. J. Phys. Org. Chem. 1994, 7, 196-206. 102. Charton, M. Personal communication, 1997.
174 103. 104. 105. 106. 107. 108. 109. 110. 111. 112. 113. 114. 115. 116. 117. 118. 119. 120. 121. 122. 123. 124. 125. 126. 127. 128. 129.
130. 131. 132. 133. 134. 135. 136. 137. 138. 139.
JOHN C. DEARDEN Pogliani, L. J. Phys. Chem. 1996,100, 18065-18077. Pogliani, L. Med. Chem. Res. 1997, 7, 380-393. Todeschini, R.; Gramatica, P. Quant. Struct.-Act. Relat. 1997,16,120-125. Chiorboli, C ; Gramatica, P; Piazza, R.; Pino, A.; Todeschini, R. SAR QSAR Environ. Res. 1997, 7, 133-150. Todeschini, R.; Vighi, M.; Finizio, A.; Gramatica, P SAR QSAR Environ. Res. 1997, 7, 173-193. Yalkowsky, S. H.; Valvani, S. C.; Roseman, T. J. J. Pharm. Sci. 1983, 72, 866-870. Rubino, J. T. J. Pharm. Sci. 1989, 78, 485-489. Thomas, E.; Rubino, J. Int. J. Pharm. 1996,130, 179-183. Anderson, B. D.; Conradi, R. A. J. Pharm. Sci. 1985, 74, 815-820. Przezdziecki, J.; Sridhar, T Am. Inst. Chem. Eng. J. 1985, 31, 333-335. Walters, A. E.; Myrdal, P B.; Yalkowsky, S. H. Chemosphere 1995, 31, 3001-3008. Horvath, A. L. Molecular Design: Chemical Structure Generation from the Properties of Pure Organic Compounds; Elsevier: Amsterdam, 1992, pp 144-157. Joback, K. G. S.M. Thesis, Massachusetts Institute of Technology, Cambridge, MA, 1984. Joback, K. G.; Reid, R. C. Chem. Eng. Comm. 1987, 57, 233-243. Reid, R. C ; Prausnitz, J. M.; Poling, B. E. The Properties of Gases and Liquids, 4th edn.; McGraw-Hill: New York, 1987, pp 25-26. Simamora, P; Yalkowsky, S. H. SAR QSAR Environ. Res. 1993,1, 293-300. Simamora, P; Miller, A. H.; Yalkowsky, S. H. J. Chem. Inf Comput. Sci. 1993, 33, 437-440. Simamora, P; Yalkowsky, S. H. Ind Eng. Chem. Res. 1994, 33, 1405-1409. Krzyzaniak, J. F; Myrdal, P B.; Simamora, P; Yalkowsky, S. H. Ind Eng. Chem. Res. 1995, 34, 2530-2535. Constantinou, L.; Gani, R. Am. Inst. Chem. Eng. J. 1994,40, 1697-1710. Tu, C.-H. J. Chinese Inst. Chem. Eng. 1994, 25,151-154. Tu, C.-H.; Wu, Y-S. J. Chinese Inst. Chem. Eng. 1996, 27, 323-328. Yalkowsky, S. H.; Dannenfelser, R.-M.; Myrdal, P.; Simamora, P.; Mishra, D. Chemosphere 1994, 28, 1657-1673. Yalkowsky, S. H.; Myrdal, P.; Dannenfelser, R.-M.; Simamora, P. Chemosphere 1994, 28, 1675-1688. Lyman, W. J.; Reehl, W. 
F; Rosenblatt, D. H. (Eds.). Handbook of Chemical Property Estimation Methods; McGraw-Hill: New York, 1982. Lyman, W. J.; Potts, R. G.; Magil, G. C. User's Guide to CHEMEST; Arthur D. Little: Cambridge, MA, 1984; pp 4.9.1-4.9.9. Grain, C. F; Lyman, W. J. Interim Report on Task 29 of Environmental Protection Agency Contract No. 68-01-6271; U.S. Environmental Protection Agency, Office of Toxic Substances: Washington DC, 1983. Boethling, R. S.; Campbell, S. E.; Lynch, D. G.; LaVeck, G. D. Ecotoxicol. Environ. Saf 1988, 15, 21-30. Lynch, D. G.; Tirado, N. F; Boethling, R. S.; Huse, G. R.; Thom, G. C. Sci. Total Environ. 1991, 109/110, 643-648. Hunter, R.; Faulkner, L.; Culver, F; Hill, J. QSAR, Structure-Activity Based Chemical Modeling and Information Software; Montana State University: Bozeman, Montana, 1985. Syracuse Research Corporation MPBPVP PC-based program ver. 1.25; Syracuse, NY, 1997. Stein, S. E.; Brown, R. L. J. Chem. Inf Comput. Sci. 1994, 34, 581-587. CambridgeSoft Corporation ChemPropPro PC-based program; Cambridge, MA, 1998. Hory, P J. J. Chem. Phys. 1949,17, 223-240. Eby, R. K. J. Appl. Phys. 1963, 34, 2442-2445. Hay, J. N. J. Polym. Sci., Polym. Chem. Ed. 1976,14, 2845-2852. Buckley, C. P; Kovacs, A. J. Colloid Polym. Sci. 1976, 254, 695-715.
The Prediction of Melting Point
175
140. Van Krevelen, D. W. Properties of Polymers: Their Estimation and Correlation with Chemical Structure, 2nd edn.; Elsevier: Amsterdam, 1976, pp 112-127. 141. Wunderlich, B.; Czomyj, G. Macwmols. 1977,10, 906-913. 142. Mandelkem, L.; Stack, G. M. Macromols. 1984, 77, 871-878. 143. Cantor, R. S.; Dill, K. A. Macromols. 1985,18, 1875-1882. 144. Starkweather, H. W. Macromols. 1986,19, 1131-1134. 145. Frushour, B. G. Polym. Bull. 1984,11, 375-382. 146. Tanaka, N. Sen-i Gakkaishi 1986, 42, T606-T609. 147. Polikarpov, V. M.; Matukhina, E. V.; Polyakov, Yu. P.; Matveichev, P. M.; Ushakov, N. V.; Bespalova, N. B.; Razumovskaya, I. V.; Antipov, E. M. Vysokomol. Soedin., Sen A 1991, 33, 1088-1092. 148. earlier, V.; Devaux, J.; Legras, R.; McGrail, R T. Macromols. 1992, 25, 6646-6650. 149. Tan, T. T. M.; Rode, B. M. J. Polym. Sci.: Part B: Polymer Phys. 1996, 34, 2139-2143. 150. Sumpter, B. G.; Noid, D. W. J. Thermal Anal. 1996,46, 833-851. 151. Burkhardt, T. J.; Murata, M.; Vaz, R. J. Macromol. Symp. 1995, 89, 321-333. 152. Sharma, B. K. Indian J. Phys. 1979, 53B, 174-182. 153. Kutolin, S. A.; Vashukov, I. A.; Kotyukov, V. I. Izvest. Akad. Nauk SSSR, Neorg. Mater 1978,14, 215-218. 154. Kutolin, S. A.; Kotyukov, V. I. Izvest. Akad. Nauk SSSR, Neorg. Mater. 1979, 75, 96-99. 155. Kutolin, S. A.; Kotyukov, V. I.; Komarova, S. N.; Smimova, E. G.Zhur Fiz. Khim. 1980,54,35-39. 156. Kutolin, S. A.; Smimova, E. G.; Komarova, S. N. Zhur Fiz. Khim. 1982, 56, 2799-2802. 157. Keller, O. L.; Burnett, J. L.; Carlson, T. A.; Nestor, C. W. J. Phys. Chem. 1970, 74, 1127-1134. 158. Kazragis, A. Document deposited with VINITI (All-Union Institute of Scientific and Technical Information), VINITI No. 1223-78,1978 (C.A. 91:112659). 159. Kazragis, A.; Bergman, G. A.; Raudeliuniene, A.; Liksiene, R. Document deposited with VINITI (All-Union Institute of Scientific and Technical Information), VINITI No. 1398-79, 1979 (C.A. 92:203820). 160. Kutolin, S. A.; Kotyukov, V. I.; Kotlevskaya, N. L. 
Zhur Fiz. Khim. 1980, 54, 633-637. 161. Bonchev, D.; Kamenska, V. J. Phys. Chem. 1981, 85, 1177-1186. 162. Gomez, L.; Dobry, A.; Diep, H. T Phys. Rev. B1997,55,6265-6271. 163. Li, C ; Guo, J.; Qin, P; Chen, R.; Chen, N. J. Phys. Chem. Solids 1996, 57, 1797-1802. 164. Reddy, R. R.; Kumar, M. R.; Rao, T. V. R.; Ahammed, Y. N. J. Phys. Chem. Solids 1994, 55, 523-524. 165. Bosi, L. G. Fis. 1987, 28, 265-268; Phys. Status Solidi A 1987,101, Kl 11-Kl 14. 166. Kang, D. S.; Wang, X. Y; Li, C. H.; Zhau, Q. B.; Liu, H. L.; Chen, N. Y Acta Chim. Sinica 1997, 55, 463-466.
This Page Intentionally Left Blank
THE APPLICATION OF THE INTERMOLECULAR FORCE MODEL TO PEPTIDE AND PROTEIN QSAR
Marvin Charton
I. Introduction
   A. Intermolecular Forces
   B. The Intermolecular Force (IMF) Equation
   C. Side-Chain Effect Composition
   D. The IMF Equation for Peptide and Protein Bioactivity
II. The Bioactivity Mechanism
   A. Transport
   B. Receptor-Substrate Binding
   C. Chemical Reaction
III. Peptide Bioactivities
   A. Types of Structural Variation in Peptides
   B. Oxytocin Analogue Uterotonic Inhibitors of Oxytocin
   C. Peptide Renin Inhibitor QSAR
IV. Protein Bioactivities
   A. Limitation of the Model in Protein QSAR
   B. Types of Protein Bioactivity Data Sets
   C. Human Growth Hormone (hGH)
   D. Subtilisin BPN'
   E. Hirudin
   F. L. casei Thymidylate Synthase
   G. T. thermophilus Glutamyl-tRNA Synthase
   H. Rat Trypsin
   I. Human Growth Hormone II
V. The IMF Method as a Bioactivity Model
   A. Peptide and Protein Bioactivities
   B. The Hansch-Fujita Model
VI. Appendix: Statistics Reported for the Correlations
Abbreviations
References

Advances in Quantitative Structure Property Relationships, Volume 2, pages 177-252. Copyright © 1999 by JAI Press Inc. All rights of reproduction in any form reserved. ISBN: 0-7623-0067-1
I. INTRODUCTION

A. Intermolecular Forces

Many phenomena depend on the difference in intermolecular forces between initial and final states. Examples are partition, distribution, and solubility; phase changes such as melting and boiling; chromatographic properties such as retention times in gas chromatography, relative flow rates in paper and thin-layer chromatography, and capacity factors in high-performance liquid chromatography; charge-transfer and hydrogen-bonding complex formation; and bioactivities. In the Hansch-Fujita method of modeling bioactivity the most important parameter is a measure of hydrophobicity-lipophilicity such as log P, where P is the partition coefficient, or log k', where k' is the high-performance liquid chromatography capacity factor. These quantities are composite parameters that depend on intermolecular force differences. Composite parameters represent two or more different structural effects; pure parameters represent a single effect. In modeling bioactivities and other properties, hydrophobicity or lipophilicity parameters can be replaced by parameters that represent intermolecular forces. This method has been successfully applied to the properties and bioactivities of amino acids, peptides, and proteins, and to opiate receptor binding of 4'-substituted naloxone phenylhydrazones. Here we present a detailed description of the application of the method to some examples of peptide and protein bioactivities in order to show how to use it.

B. The Intermolecular Force (IMF) Equation

$Q_X$ is a measurable quantity of interest that varies with molecular structure; $\varepsilon$ is the intermolecular force energy; X is the variable structural feature; and i and f indicate the initial and final states. Then:

$$Q_X = \varepsilon_f - \varepsilon_i = \Delta\varepsilon \qquad (1)$$
The intermolecular forces and their parameterization are summarized in Table 1.
Intermolecular Force Parameterization

Parameterization of the intermolecular forces described in Table 1 results in the inter/intramolecular force (IMF) equation. In its most general form [3] it is,

$$Q_X = L\sigma_{lX} + D\sigma_{dX} + R\sigma_{eX} + M\mu_X + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + B_{DX} n_{DX} + B_{AX} n_{AX} + S\psi_X + B^{o} \qquad (2)$$

where:

• $\sigma_{lX}$ is the localized electrical effect parameter. It is identical to the $\sigma_I$ and $\sigma_F$ constants.
• $\sigma_{dX}$ is the intrinsic delocalized electrical effect parameter.
• $\sigma_{eX}$ is the electronic demand sensitivity electrical effect parameter.
• $\alpha$ is a polarizability parameter. It is defined by the equation,

$$\alpha \equiv \frac{MR_X - MR_H}{100} \qquad (3)$$

where $MR_X$ and $MR_H$ are the group molar refractivities of X and H, respectively. Many other polarizability parameters can be used; they all have the dimensions of volume and are highly collinear with one another.
• $n_H$ and $n_n$ are hydrogen bonding parameters; $n_H$ is equal to the number of OH or NH bonds in X, and $n_n$ is equal to the number of lone pairs on O or N atoms in X. This parameterization is deficient in that it accounts for the probability of hydrogen-bond formation but not for the intensity of the interaction. It nevertheless frequently gives reasonable results.
Table 1. Intermolecular Forces and the Quantities Upon which They Depend

Intermolecular Force                      Quantity
Molecule-molecule
  Hydrogen bonding (hb)                   E_hb
  Dipole-dipole (dd)                      Dipole moment
  Dipole-induced dipole (di)              Dipole moment, polarizability
  Induced dipole-induced dipole (ii)      Polarizability
  Charge transfer (ct)                    Ionization potential, electron affinity
Ion-molecule
  Ion-dipole (Id)                         Ionic charge, dipole moment
  Ion-induced dipole (Ii)                 Ionic charge, polarizability

Note: Abbreviations for intermolecular forces are in parentheses.
• $i$ is the ionic charge parameter. It takes the value 1 when the substituent is ionized and 0 when it is not.
• $n_D$ and $n_A$ are charge transfer parameters; $n_D$ is 1 when X can act as an electron donor and 0 when it cannot; $n_A$ is 1 when X can function as an electron acceptor and 0 when it cannot.
• $\psi$ is a steric effect parameterization.

Steric Effect Parameterizations
There are several possible parameterizations of the steric effect. Steric effects depend on the position in the side chain, and the parameterization must account for this. The simplest parameterization is monoparametric. An example of such a parameter is $\upsilon$, a composite steric parameter based on van der Waals radii ($r_V$) that emphasizes the steric effect at the first atom of the side chain. Thus:

$$S\psi = S\upsilon \qquad (4)$$

Monoparametric steric parameters have a fixed dependence on side-chain position; this is why they are composite. The side chain is numbered starting with the atom that is bonded to the rest of the amino acid residue. Accounting for steric effects anywhere in the side chain requires additional parameters. This is feasible only when a sufficiently large data set is available. There are four multiparametric models to choose from:

1. The simple branching (SB) equation:

$$S\psi = \sum_{i=1}^{m} a_i n_i \qquad (5)$$

This model accounts for the steric effect at each atom i of the side-chain skeleton (longest chain) by counting the number of branches $n_i$ (atoms other than H) bonded directly to it. With amino acids, peptides, and proteins it is generally unnecessary to go further than the third skeletal atom in the parameterization. The SB equation uses the pure parameters $n_1$, $n_2$, and $n_3$. The model applies only to skeletal atoms that have, or are assumed to have, tetrahedral geometry. It assumes that the effect of all branching atoms attached to a skeletal atom is the same. Because nonequivalent conformations exist, this assumption is only a crude first approximation. Another problem associated with the SB equation is that a high degree of collinearity with $\alpha$ is generally found.

2. The extended branching (EB) equation:

$$S\psi = \sum_{i=1}^{m} \sum_{j=1}^{3} a_{ij} n_{ij} \qquad (6)$$
This method distinguishes between the first, second, and third branches on a tetrahedral atom at the expense of many more parameters. Few peptide or protein data sets are large enough to permit its use.

3. A hybrid model that combines the $\upsilon$ steric parameter with the simple branching equation:

$$S\psi = S\upsilon + \sum_{i=1}^{m} a_i n_i \qquad (7)$$

4. The segmental model:

$$S\psi = \sum_{i=1}^{m} S_i \upsilon_i \qquad (8)$$

where $\upsilon_i$ is the steric parameter of the smallest face of the i-th segment of the side chain. The i-th segment consists of the i-th atom of the longest chain and all the groups attached to it.

C. Side-Chain Effect Composition

For comparing structural effects in different data sets we make use of the percent contribution of each independent variable in the regression equation, $C_i$, defined as,

$$C_i = \frac{100\,|a_i x_i|}{\sum_{k} |a_k x_k|} \qquad (9)$$

where $a_i$ is the regression coefficient of the i-th independent variable, $x_i$ is its value for the reference residue, and the sum runs over all independent variables in the regression equation. His is the reference side chain in these studies. It was chosen because it has a value other than zero for each parameter in the correlation equation. Comparisons of side-chain structural contributions therefore refer to those of the His side chain.

D. The IMF Equation for Peptide and Protein Bioactivity

A dependence on charge transfer interactions is rarely found in modeling the properties and bioactivities of amino acids, peptides, and proteins. Amino acid side chains are bonded to an sp3-hybridized carbon atom; therefore no terms in $\sigma_d$ or $\sigma_e$ are necessary. The amino acid moiety has a large dipole moment, making the term in $\mu$ unnecessary. The IMF equation then takes the form:

$$Q_X = L\sigma_{lX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + S\psi_X + B^{o} \qquad (10)$$
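As an illustration of Eqs. 9 and 10, the short Python sketch below treats the IMF equation as a linear model and computes percent contributions for a reference residue. It is a sketch under stated assumptions rather than the author's procedure: the function names are invented for this example, the coefficients are those of the L-subset correlation reported later in this chapter (Eq. 17), the Tyr parameters are read from Table 3, and the His values (sigma_l = 0.08, n_H = 1, upsilon = 0.70) are assumed here because they are not tabulated in this section.

```python
"""Sketch of Eq. 10 (linear IMF model) and Eq. 9 (percent contributions)."""


def imf_response(coefficients, parameters, intercept):
    """Evaluate sum(a_i * x_i) + intercept for one side chain (form of Eq. 10)."""
    return sum(a * parameters[name] for name, a in coefficients.items()) + intercept


def percent_contributions(coefficients, reference_parameters):
    """Eq. 9: C_i = 100 |a_i x_i| / sum_k |a_k x_k| for the reference residue."""
    terms = {
        name: abs(a * reference_parameters[name])
        for name, a in coefficients.items()
    }
    total = sum(terms.values())
    return {name: 100.0 * term / total for name, term in terms.items()}


# Regression coefficients and intercept as printed for Eq. 17 (L subset).
coefficients = {"sigma_l": -8.40, "n_H": -0.420, "upsilon": -2.89}
intercept = 2.51

# ASSUMPTION: His side-chain parameters are not listed in this section.
his = {"sigma_l": 0.08, "n_H": 1, "upsilon": 0.70}
# Tyr side-chain parameters as read from Table 3.
tyr = {"sigma_l": 0.03, "n_H": 1, "upsilon": 0.70}

contributions = percent_contributions(coefficients, his)
predicted_tyr = imf_response(coefficients, tyr, intercept)

for name, value in contributions.items():
    print(f"C({name}) = {value:.1f}")
print(f"calculated side-chain contribution for Tyr = {predicted_tyr:.3f}")
```

With the assumed His values the sketch gives contributions of 21.6, 13.5, and 64.9, which agree with the C values reported with Eq. 17; this supports reading Eq. 9 as a ratio of absolute values.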
II. THE BIOACTIVITY MECHANISM

In order to justify the application of the IMF model to bioactivities, it is necessary to consider the mechanism of bioactivity. The mechanism given here is a modification of that proposed by McFarland. The bioactivity is considered to be dependent on one or more of the following steps: transport, receptor interaction, and chemical reaction.

A. Transport
The bioactive substance (bas) enters the organism at some point. It then moves through an aqueous phase to a receptor (rcp) site with which it is to interact. This movement may involve diffusion through the medium or random binding to a biopolymer molecule, such as a plasma protein, that carries it. During transport the bas is likely to cross one or more biomembranes. The crossing of a biomembrane begins with the transfer of the bas from the initial aqueous phase $\phi_1$ to the anterior membrane surface (ams). It then proceeds to the posterior membrane surface (pms) either by diffusion or by binding to a lipid-soluble membrane carrier molecule (mcm) that transports it. The bas is then transferred from the surface to a second aqueous phase $\phi_2$. Each step in this process is equivalent to a transfer from one phase to another and is therefore a function of the difference between the intermolecular forces involving medium and bas in the initial and final phases.

B. Receptor-Substrate Binding
The interaction between receptor and substrate occurs in two stages: recognition and tight complex formation.

Recognition

The rcp must distinguish the substrate from all of the other chemical species present in the medium that surrounds it. The rcp consists of some number of functional groups attached to a molecular framework that is part of a biopolymer. These functional groups have a particular orientation in space. To be recognized, a substrate must have functional groups that are capable of interacting with those of the rcp and that have the proper spatial arrangement to do so. Recognition results in the formation of a loose substrate-receptor complex (bas--rcp) bound by intermolecular forces. The interactions involved in recognition are directed. Examples of strongly directed interactions are hydrogen bonding and salt bridge formation. Recognition therefore depends on the difference between the intermolecular forces involving both the bas and the rcp with the aqueous phase, and the intermolecular forces between substrate and receptor in the loose complex.
Tight Complex Formation
Conformational changes occur in the substrate and/or the receptor that maximize the intermolecular forces between the two. This results in an increase in binding energy that accompanies the formation of a tight complex, bas-rcp. The process is a function of the difference in intermolecular forces between the initial loose complex and the final tight complex, and of the difference in conformational energy between the initial and final conformations of bas--rcp and bas-rcp, respectively.

C. Chemical Reaction

The tight complex proceeds along a reaction coordinate to a transition state (bas=rcp)‡ that decomposes into a receptor-product complex (rcp-prd) by the formation and/or cleavage of covalent bonds. The rcp-prd complex then dissociates into solvated receptor and solvated product. The overall mechanism is summed up in Scheme 1. Each step in the sequences described above involves a difference in intermolecular forces between an initial and a final state. The IMF equation was designed to model such differences. It should therefore be capable of modeling bioactivities.
III. PEPTIDE BIOACTIVITIES

A. Types of Structural Variation in Peptides

Peptides can undergo substitution at one or more of several different sites. The types of substitution are:

1. One amino acid residue is replaced by another at a given position in the peptide. This is represented by $\mathrm{Aax}^i$, where Aax is the residue with side chain X and i is its position in the peptide.
2. Substitution at the amino terminus of a linear peptide is represented by $X^N$.
3. Substitution at the carboxyl terminus is represented by $X^C$.
1. bas(φ1) ⇌ bas-plp ⇌ bas-ams
   bas(φ2) ⇌ bas-pms ⇌ bas-mcm
2. bas(φ2) ⇌ bas--rcp ⇌ bas-rcp
3. bas-rcp ⇌ (bas=rcp)‡ ⇌ rcp-prd ⇌ rcp + products

Scheme 1. Abbreviations: bas, bioactive substrate; φ, phase; plp, plasma protein; mcm, membrane carrier molecule; ams, anterior membrane surface; pms, posterior membrane surface; bas--rcp, loose substrate-receptor complex; bas-rcp, tight substrate-receptor complex; (bas=rcp)‡, transition state; rcp-prd, receptor-product complex.
4. Substitution at the nitrogen atom of a peptide bond is represented by $X^{i,j}$, where i and j are the positions of the residues attached to the atom undergoing substitution.
5. One or more amino acid residues in a peptide may be replaced by groups other than α-amino acids. This is represented by $X^{i,j,\ldots}$, where i, j, ... designate the positions of the residues that are being replaced.
6. The H atom bonded to the α C atom of the i-th residue may be replaced by a substituent R. This is represented by $R^i$.
7. Chiral substitution, in which an amino acid of the normal configuration is replaced by its enantiomer, is designated $C^i$.

Consider the peptide 1a and its derivative 1b:

Ser-Ala-Thr-His-Asp-Arg-Phe-Ile-Val-Tyr   1a

tBuOCO2NHSer-NHCMe2(C=O)-thr-His-(Et)-Asp-Arg-Aax-NHCH2CH=CHCH2CO-Tyr-CO2Ph   1b

The $X^N$ substituent is the tBuOCO2 group, the $X^C$ substituent is the OPh group, the $X^{4,5}$ substituent is the Et group, the NHCH2CH=CHCH2CO group is $X^{8,9}$, the side chain of the amino acid in position 7 is variable, R substitution occurs at position 2, and $C^3$ substitution occurs at position 3.

B. Oxytocin Analogue Uterotonic Inhibitors of Oxytocin
$pA_2$ values ($A_2$ is the IC50 or ID50) were studied for 155 structural analogues of oxytocin that inhibit the action of oxytocin on isolated rat uterus in the absence of magnesium. Generally these substrates were nonapeptides substituted at all positions except 5; some of them carried a terminal substituent as well.

Free-Wilson Analysis
The data set was first subjected to a Free-Wilson analysis, thus determining the side-chain effects at positions 1 and 2. The Free-Wilson equation is,

$$pA_2 = \sum_{p} a_p + a_0 \qquad (11)$$

where $a_p$ is the contribution of the side chain in position p to $pA_2$ and $a_0$ is the contribution of the invariant part of the substrate. Then,

$$\Delta_i = \sum_{j} \sum_{p} a_{jp} x_{ijp} + a_{ri} \qquad (12)$$

where $\Delta_i$ is the difference between $pA_2$ for the i-th substrate and the algebraic mean of the $pA_2$ values for the data set; $a_{jp}$ is the contribution to the activity of the j-th side chain in position p; $x_{ijp}$ takes the value 1 when side chain j is in position p in the i-th substrate and 0 otherwise; and $a_{ri}$ is a residual representing the deviation of the data point from the line. The sum of all the side-chain contributions j at position p is normalized by the equation:

$$\sum_{j} a_{jp} = 0 \qquad (13)$$
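The bookkeeping behind a Free-Wilson analysis can be sketched in a few lines of Python. The data below are hypothetical, the function names are invented for this illustration, and the normalization is assumed to take the common zero-sum form at each position. For a balanced toy set such as this one the least-squares contributions reduce to group means of the mean-centred activities; a real data set with unequal occupancies requires a full least-squares regression on the indicator matrix.

```python
"""Toy Free-Wilson set-up: indicator variables and mean-centred activities."""
from statistics import mean

# HYPOTHETICAL data: (side chain at position 1, side chain at position 2, pA2).
substrates = [
    ("Cys", "Tyr", 7.2),
    ("Pen", "Tyr", 7.6),
    ("Cys", "Phe", 7.0),
    ("Pen", "Phe", 7.4),
]


def centred_activities(data):
    """Delta_i: activity of substrate i minus the mean activity of the set."""
    grand_mean = mean(row[-1] for row in data)
    return [row[-1] - grand_mean for row in data]


def indicator_columns(data):
    """One (position, side chain) pair per indicator variable, sorted."""
    return sorted(
        {(p, sc) for row in data for p, sc in enumerate(row[:-1], start=1)}
    )


def design_matrix(data, columns):
    """Entry is 1 when the side chain occupies that position, else 0."""
    return [[1 if row[p - 1] == sc else 0 for p, sc in columns] for row in data]


def balanced_contributions(data):
    """Group means of Delta_i.

    These equal the least-squares estimates only for a balanced design with
    a zero-sum normalization at each position (an assumption of this sketch).
    """
    deltas = centred_activities(data)
    result = {}
    for p, sc in indicator_columns(data):
        members = [d for row, d in zip(data, deltas) if row[p - 1] == sc]
        result[(p, sc)] = mean(members)
    return result


columns = indicator_columns(substrates)
matrix = design_matrix(substrates, columns)
contributions = balanced_contributions(substrates)

for key in columns:
    print(key, round(contributions[key], 2))
```

In this toy set the contributions at each position sum to zero, and adding the position-1 and position-2 contributions of a substrate recovers its mean-centred activity, because the hypothetical data were constructed to be exactly additive.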
Use of the Free-Wilson method gave the side-chain contributions $pA_{2,1}$ and $pA_{2,X}$ reported in Table 2.

Table 2. $pA_{2,1}$ and $pA_{2,X}$ Values

$pA_{2,1}$ values ($X^1$, $pA_{2,1}$, $S_{pA_2}$): AcCys, -0.28, 0.13; AcPen^a, -1.98, 0.21; BaCys, -0.10, 0.22; Bta, -0.22, 0.16; CmCys, -0.38, 0.10; Cys, -0.14, 0.06; Dpe, 0.23, 0.07; GlyCys, -0.58, 0.22; Mep, 0.41, 0.08; MgCys, -0.16, 0.22; Mma, -0.75, 0.21; Mmp, -0.31, 0.08; Mpa(O)^a, -0.82, 0.26; MsCys, 0.09, 0.16; Pen, 0.18, 0.08; pen^a, -0.59, 0.21; PvCys, -0.17, 0.22; SarCys, -0.91, 0.22; TgCys, -1.86, 0.23; Mpp, 0.41, -.

$pA_{2,X}$ values (Aax, $pA_{2,X}$, $S_{pA_2}$): Dbt, -0.33, 0.16; Ile, -0.45, 0.23; Leu, -0.13, 0.17; leu, -1.52, 0.24; Phe, -0.11, 0.10; phe, -0.46, 0.18; Phe(4-Ab), -0.21, 0.22; Phe(4-Et), 0.30, 0.14; phe(4-Et), 0.88, 0.12; phe(F5), -0.74, 0.21; Phe(4-Me), 0.30, 0.15; Phe(4-Pa), -0.29, 0.22; trp, -0.01, 0.13; Tyr, -0.24, 0.06; tyr, -0.05, 0.25; Tyr(Bu), -0.69, 0.16; Tyr(Et), 0.17, 0.07; tyr(Et), 0.82, 0.14; Tyr(3-I), 0.02, 0.16; Tyr(Me), 0.28, 0.07; Tyr(3-Me), -0.22, 0.21; tyr(3-NO2), -0.73, 0.21.

Note: ^a Excluded from the correlation.

Structural Dependence of the Side-Chain Contributions

Substitution at position 1 of the peptide involved three different sites, as the amino acid residue at this position has the form ZNHCRX(C=O), where X and Z are substituents and R may be H or Me. If a substituent is part of a disulfide bridge it is considered to be the X group. If only one substituent is present it is considered to be the X group. Note that not all of the substitutions at position 1 involve amino acids. None of the X groups had OH or NH bonds or, except for Mpa(O), lone pairs on O or N atoms; and no X group was likely to ionize. Thus, the X group parameterization was $\sigma_l$, $\alpha$, and $\upsilon$. The Z group required all of the IMF parameters. The R group was accounted for by the parameter $n_{Me}$, which took the value 1 when R was Me and 0 otherwise. All parameters used are given in Table 3. The correlation equation was:

$$pA_{2,1} = L_X\sigma_{lX} + A_X\alpha_X + S_X\upsilon_X + L_Z\sigma_{lZ} + A_Z\alpha_Z + S_Z\upsilon_Z + H_1 n_{HZ}$$
Table 3. Amino Acid Side-Chain Parameters for Groups in Positions 1 and 2

Columns for Residue(1): σ_lX, α_X, υ_X, σ_lZ, α_Z, υ_Z, n_HZ, n_nZ, i_Z, n_Me
Residue(l) AcCys
0.12
0.128
0.62
0.28
0.139
0.50
1
3
0
0
AcPen
0.09 0.12
0.221
1.24
0.139
0
0.208
1 1
0
0.62
0.50 0.50
3
0.128
0.28 0.35
-0.01 0.12
0.093 0.128
0.56 0.62
3 0 4
0 0
0 0
0.12
0.128
1
0 1
0
Cys Dpe
0.09
0
0
GlyCys
BaCys Bta CmCys
Mep
0.000
0.00
0
0.108 0.044
0.50
0.62
0.00 0.23 0.17
0.35
3 2
0.221
1.25
0.00
0.000
0.00
0
0
0.12
0.128
0.62
0.30
0.173
0.50
3
4
0
0
0.09
0.318
1.95
0.00
0.000
0.00
0
0
0
0
0
MgCys
0.12
0.128
0.62
0.32
0.334
0.50
3
0
0
0
Mma Mmp
0.27
0.082
0.60
-0.01
0.046
0.52
0
0
0
1
0.12
0.128
0.62
-O.01
0.046
0.52
0
0
0
1
Mpa
0.12
0.128
0.32
0.00
0.000
0.00
0
0
0
0
Mpa(O) MsCys Mpp Pen
0.21 0.12 0.11
0.126 0.128
0.77 0.62
0.00 0.42
0.000 0.162
0.00 0.80
0 1
0
0 0
0 0
0.339 0,221
1.50 1.24
0.00 0.17
0.000 0.044
0.00 0.35
0 2
5 0 1
0 1
0 0
0.128
0.62
0.28
0.279
0.50
1
0
0
0.128 0.128
0.62 0.62
0.30 0.32
0.219 0.533
0.50 0.50
2 5
0 0
0 0
PvCys SarCys TgCys
0.09 0.12 0.12 0.12
3 4 10

Residue(2)     σ_lX     α_X      υ_X     n_HX   n_nX   i_X
Dbt             0.06    0.456    0.70    1      2      1
Ile            -0.01    0.186    1.02    0      0      0
Leu            -0.01    0.186    0.98    0      0      0
Phe             0.03    0.290    0.70    0      0      0
Phe(4-Ab)       0.04    0.503    0.70    1      3      0
Phe(4-Et)       0.03    0.383    0.70    0      0      0
Phe(F5)         0.12    0.285    0.70    0      0      0
Phe(4-Me)       0.03    0.336    0.70    0      0      0
Phe(4-Pa)       0.04    0.470    0.70    1      3      0
Trp             0.00    0.409    0.70    1      0      0
Tyr             0.03    0.298    0.70    1      2      0
Tyr(Bu)         0.03    0.489    0.70    0      2      0
Tyr(Et)         0.03    0.391    0.70    0      2      0
Tyr(3-I)        0.04    0.427    0.70    1      2      0
Tyr(Me)         0.03    0.344    0.70    0      2      0
Tyr(3-Me)       0.03    0.344    0.70    1      2      0
Tyr(3-NO2)      0.06    0.360    0.70    1      6      1
$$\qquad {} + H_2 n_{nZ} + I i_Z + B_{Me} n_{Me} + B^{o} \qquad (14)$$
The best regression equation was obtained on exclusion of the data points for AcPen, Mpa(O), and pen; it is,

$$pA_{2,1} = 2.99(\pm0.932)\sigma_{lX} + 1.42(\pm0.667)\sigma_{lZ} - 3.42(\pm1.10)\alpha_Z - 0.317(\pm0.0967)n_{HZ} + 0.140(\pm0.0609)n_{nZ} + 0.416(\pm0.253)i_Z - 0.514(\pm0.198) \qquad (15)$$

$100R^2$, 86.08; F, 11.34; $S_{est}$, 0.251; $S^{o}$, 0.477; n, 18.
Statistics obtained for a regression equation are reported directly below it throughout this work. The r values given are those zeroth-order partial correlation coefficients that are significant at confidence levels equal to or greater than 90.0%. The C values show the composition of the substituent effect for the reference residue, His. The pen replacement was not included in the correlation because it differs in configuration from the other members of the set. The failure of the Mpa(O) replacement to fit the model may be due to the inability of the model to account for the hydrogen-bonding capacity of the sulfonyl group. The failure of the AcPen replacement to fit the model may be due to the inability of the model to account for the hydrogen bonding of the carbonyl group. The results suggest that the Z group is involved in binding by ii (dispersion) interactions, and the X group by hydrogen bonding and van der Waals (vdW) (dd, di, ii) interactions. The R group has no apparent effect on the activity.

Substitution at position 2 involves only the replacement of one amino acid residue by another. Again, the parameters used are reported in Table 3. The correlation equation used was:

$$pA_{2,X} = L\sigma_{lX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + S\upsilon_X + B^{o} \qquad (16)$$
Substitution at this position is complicated, however, by the inclusion in this data set of both D- and L-amino acid residues. The configuration was parameterized by an indicator variable that took the value 1 for the D configuration and 0 for the L configuration. The argument for this approach is that a difference in configuration should result only in a difference in the tightness of the tight complex, which might be expressed by a constant. This approach was totally unsuccessful, so the difference in binding between enantiomers is not constant. The data set was therefore separated into a D subset and an L subset, and these subsets were separately correlated with Eq. 16. On exclusion of the data points for Phe and Tyr(Bu), the best regression equation obtained for the L set is,

$$pA_{2,L\text{-}X} = -8.40(\pm3.37)\sigma_{lX} - 0.420(\pm0.0624)n_{HX} - 2.89(\pm0.530)\upsilon_X + 2.51(\pm0.478) \qquad (17)$$
$100R^2$, 92.40; $A100R^2$, 90.71; F, 32.44; $S_{est}$, 0.0889; $S^{o}$, 0.338; n, 12. r: $\sigma_l$, $i$, 0.896. $C_{\sigma l}$, 21.6; $C_{nH}$, 13.5; $C_{\upsilon}$, 64.9.

As $\alpha$ and $\upsilon$ are highly collinear (r = 0.800), the dependence on $\upsilon$ may indicate the presence of both steric effects and polarizability. Furthermore, there is little variation in the steric parameter within the data set. Thus the substituent is probably involved in vdW interactions and hydrogen bonding as well as exerting a steric effect. The best regression equation for the D enantiomers is,

$$pA_{2,D\text{-}X} = 9.54(\pm2.24)\alpha_X - 0.964(\pm0.473)i_X - 3.21(\pm0.737) \qquad (18)$$

$100R^2$, 79.74; $A100R^2$, 76.36; F, 9.839; $S_{est}$, 0.434; $S^{o}$, 0.569; n, 8. $C_{\alpha}$, 69.5; $C_i$, 30.5.

It seems likely, though not certain, that ii and Ii interactions are involved.

C. Peptide Renin Inhibitor QSAR
Much has been published on the bioactivities of peptide renin inhibitors. They are of interest in the treatment of hypertension. The data sets studied are peptide analogues of human angiotensinogen, 2, and of the aspartyl proteinase inhibitor pepstatin, 3. The data are reported in Table 4. Sets 51 and 58-62 involve residues 8, 9, 10, and 11 of angiotensinogen. The residues Leu^10-Val^11 in these sets are replaced by a nonpeptide structural unit, an example of X^ substitution. Residue 8 may undergo replacement by either another residue or a nonpeptide fragment. Residue 9 may vary, and there may also be X^N or X^C substitution (see Table 4). Sets 53-57 involve residues 8 through 12 of angiotensinogen, with residues 10 and 11 replaced by statine and X^C substitution at residue 12. Set 52 consists of derivatives of pepstatin in which residue 1 is Phe or Trp, residue 2 varies, and both X^N and X^C substitution occur.

H2N-Asp^1-Arg^2-Val^3-Tyr^4-Ile^5-His^6-Pro^7-Phe^8-His^9-Leu^10-Val^11-Ile^12-His^13-PRN    2

Iva-Val^1-Val^2-Sta-Ala-OH    3
The effect of the side chain of an amino acid residue on the bioactivity of the peptide is given by Eq. 10. If the peptide is substituted at its amino (X^N) or
Table 4. Data Used in the Correlations of Peptide Renin Inhibitors
PPB51 IC50 (nM), Boc-Phe-Aax^9-NHCH(cHx)CHOHCH2SOnAk, human renal renin, substrate: pure human angiotensinogen, maleate buffer (pH 6.0)^a. Aax^9, n, Ak, IC50; His, 0, cHx, 4.0; His, 0, iBu, 6.5; His, 0, iPr, 4.0; His, 0, Me, 10; His, 2, cHx, 2.5; His, 2, iPr, 2.0; His, 2, Me, 40; Ala, 0, iPr, 9.9; Ala, 2, iPr, 70; Leu, 2, iPr, 4.0; Phe, 2, iPr, 30; Thr, 2, iPr, 8.0; Ser, 2, iPr, 40; Hse, 2, iPr, 20; (Bzl)Thr, 2, iPr, 6.0; (BzlO2C)Orn, 2, iPr, 60; (BzlO2C)Lys, 2, iPr, 100; (Ac)Lys, 2, iPr, 300.

PPB52 -log IC50, X^N-Aax^1-Aax^2-Sta-Ala-Sta-CO2R, enriched human plasma renin, substrate: endogenous angiotensinogen. X^N, Aax^2, -log IC50; Boc, His, 7.57; Boc, Cpg, 7.43; Boc, Nva, 7.29; Boc, Val, 7.19; Boc, Phg, 7.03; Boc, Ser(Et), 6.95; Boc, Nle, 6.92; Boc, Chg, 6.82; Boc, Ser(Bzl), 6.76; Boc, Phe, 6.72; Boc, Thg, 6.60; Boc, Ser(Pym), 6.50; Boc, Gln, 6.50; Boc, Met, 6.50; Boc, Ser, 6.36; Boc, Cha, 6.31; Boc, Asn, 5.92; Boc, Tyr(Bzl), 5.85; Boc, Asn(Ph), 5.85; Boc, Met(O2), 5.00; Boc, Trp, 6.92; Iva, Phe, 6.50; Iva, Nva, 7.38; Iva, Nle, 7.55^•; Ac, Phe, 5.85; Ac, Nva, 6.59; Boc, Nle, 7.52S-; Cbz, Phe, 6.00^•; Cbz, Phe, 5.46; Cbz, Trp, 5.85; Cbz, Val, 6.75; Cbz, Val, 6.82^; Boc, Trp, 6.82"^; Boc, His, 6.85"^.

PPB53-57 Boc-Phe-His-Sta-Leu-NHW. 53, IC50 (nM), hog kidney renin; 54, IC50 (nM), human plasma renin; 55, Ki (nM), purified human kidney renin, substrate: angiotensinogen, radioimmunoassay; 56, Ki (nM), purified human kidney renin, substrate: synthetic tetradecapeptide, radioimmunoassay; 57, Ki (nM), purified human kidney renin, substrate: synthetic tetradecapeptide, fluorimetric assay^e. W, IC50(53), IC50(54), Ki(55), Ki(56), Ki(57); CH2Ph, 35, 26, 70, 55, 18; CH2CH2Ph, 170, 164, 120, 350, 36; (-)CHPhCH2Ph, 23, 1326, 700, 68, 28; (+)CHPhCH2Ph, 6.9, 151, 280, 38, 29; CH2C6H4OMe-4, 6.7, 33, 12, 100, -; CH2C6H4Cl-4, 5.0, 81, 280, 50, 290; (-)CHMePh, 22, 21, 36, 20, 27; (+)CHMePh, 14, 49, 100, 6, 19; (-)CHMe(1-C10H7), 14, 51, 140, 13, 0.98; (+)CHMe(1-C10H7), 11, 484, 600, 230, 130; CHMeCH(OH)Ph, 480, 178, 220, 36, 67; CHPhCH(OH)Ph^f, 90, 134, 110, 0.20, 0.12; CHPhCH(OH)Ph^g, 24, 841, 350, 47, 28; (CH2)5NCH2Ph, 280, 127, 97, 0.04, 0.064; 4-(10,11-dihydro-5H-dibenzo[a,d]cycloheptenyl)^h, 5.8, 569, 320, 520, 120.

PPB58 IC50 (µM), purified human renal renin, substrate: pure angiotensinogen, Boc-Phe-Aax^9-NHCH(CH2Ak)CHOH-CH2CH2W-Z^i. Aax^9, Ak, W, Z, IC50; Ala, iPr, CO, iPe, 2.4; Ala, iPr, CHOH, iPe, 3.8; Ala, iPr, S, iPe, 5.5; Ala, iPr, S, CH2CH2Ph, 4.2; Ala, iPr, S, iBu, 4.1; Ala, iPr, S, iPr, 4.8; Ala, iPr, SO, iPe, 5.2; Ala, iPr, SO2, iPe, 2.4; Ala, iPr, SO2, CH2CH2Ph, 1.8; Ala, iPr, SO2, iBu, 3.2; Ala, iPr, SO2, iPr, 1.6; His, cHx, SO2, iPr, 0.0076; His, cHx, SO2, Et, 0.10; Ala, cHx, SO2, iPr, 0.076; Ala, cHx, SO2, Et, 0.14; Leu, cHx, SO2, iPr, 0.014; Phe, cHx, SO2, iPr, 0.020.

PPB59 IC50 (nM), purified human renal renin, substrate: pure angiotensinogen, Z^N(C=O)-Phe-Aax^9-NHCH(CH2Ak)CHOH-CH2(C=CH2)(C=O)-NHZ^j. Z^N, Aax^9, Ak, Z, IC50; tBuO, Ala, cHx, iPe, 10; tBuO, Ala, cHx, iBu, 10; tBuO, Ala, iPr, iPe, 200; tBuO, Ala, iPr, iBu, 400; tBuO, Ala, iPr, CHOHiBu, 6000; tBuO, Ala, cHx, CH2cHx, 50; tBuO, Ala, cHx, CH2CH2Ph, 150; tBuO, Ala, cHx, Me, 50; tBuO, Ala, cHx, CH2CH2NMe2, 400; tBuO, Ala, cHx, CH2CMe2NMe2, 25; tBuO, His, cHx, iPe, 1.5; tBuO, Leu, cHx, iPe, 4; tBuO, His, cHx, iBu, 3; tBuO, His, cHx, CH2CMe2NMe2, 5; tBuO, Phe, cHx, CH2CMe2NMe2, 8.5; EtO, His, cHx, iPe, 3; EtO, Leu, cHx, iBu, 5; tBuCH2, His, cHx, Me, 2; Me, His, cHx, iPe, 4; EtO, Leu, cHx, CH2CMe2NMe2,
PPB60 IC50 (nM), Boc-Phe-His-NHCH(CH2cHx)CHOH-CH2CZ^1Z^2(C=O)-NHAk, purified human renal renin, substrate: pure angiotensinogen^' . Z^1, Z^2, Ak, IC50; OH, Me, iPe, 5.5; Me, OH, iPe, 50; OH, CH2N3, iBu, 1; CH2N3, OH, iBu, 20; OH, iBu, iBu, 30; OH, CH2Cl, iBu, 0.8; CH2Cl, OH, iBu, 20; OH, CH2NH2, iBu, 15; CH2NH2, OH, iBu, 35.

PPB61 IC50 (µM), Boc-Phe-Aax^9-NHCH(CH2iPr)CHOH-CH2WZ, purified human renal renin, substrate: purified angiotensinogen, maleate buffer (pH 6.0). Aax^9, WZ, IC50; His, SPh, 0.96; Ala, SPh, 8; Ala, SCH2Ph, 10; Ala, SCH2CH2Ph, 4.5; Ala, S(CH2)3Ph, 1; Ala, SiPr, 0.7; Ala, SiBu, 1.5; Ala, SiPe, 1.5; Ala, StBu, 3; Ala, ScHx, 0.8; Ala, ScPe, 1; Ala, OiPr, 7; Ala, CH2iPr, 2; Ala, OiBu, 7; Ala, CH2iBu, 3.5; Ala, SO2cHx, 2; His, OiPr, 1.5; His, CH2iPr, 1.5; His, OiBu, 0.65; His, CH2iBu, 0.60; His, SiPr, 0.081; His, SO2iPr, 0.20; His, SO2iBu, 0.35; His, SO2iPe, 0.50; His, SO2cHx, 0.090; His, ScHx, 0.035.

PPB62 IC50 (nM), XZCH(C=O)-Aax^9-NHCH(CH2cHx)CHOH-CH2SOnAk, purified human renal renin, substrate: pure angiotensinogen, maleate buffer (pH 6.0)^. X, Z, Aax^9, n, Ak, IC50; BzlOCH2, tBuO2CNH, His, 2, iPr, 75; BzlOCHMe, tBuO2CNH, His, 2, iPr, 5.5; BzlOCHMe, EtO2CNH, His, 2, iPr, 20; 4-MeOC6H4CH2, tBuO2CNH, His, 2, iPr, 3.0; PhO, H, His, 0, cHx, 430; Bzl, Bzl, His, 0, cHx, 20; Bzl, Bzl, His, 2, cHx, 40; Bzl, Bzl, Leu, 2, iPr, 25; Bzl, Bzl, His, 2, iPr, 70; BzlOCHMe, NH2, His, 2, iPr, 300; PhCH2, tBuCH2CONH, His, 2, iPr, 3.0; PhCH2, EtO2CNH, His, 2, iPr, 5.0; BzlOCHMe, iPrO2CNH, His, 2, iPr, 10; PhCH2, tBuO2CNH, His, 2, iPr, 2.0; PhCH2, tBuO2CNH, His, 2, cHx, 2.5; PhCH2, tBuO2CNH, His, 0, cHx, 4.0; PhCH2, tBuO2CNH, His, 0, iPr, 4.0.

Notes: ^a Ref. 1. ^b Ref. 2; Aax^1 is Phe and R is Me unless otherwise noted. ^c R is H. ^d Aax^1 is Trp. ^e Ref. 3. ^f Erythro. ^g Threo. ^h Not included in the correlation. ^i Ref. 4. ^j Ref. 5. ^k The group in italics is behind the plane of the paper while that in boldface is in front of the plane of the paper. ^l Ref. 6.
carboxy (X^C) terminus as well, additional terms are required in the IMF equation. It is also necessary to parameterize any structural variations that occur in the X^ units. The sets studied were correlated with an appropriate form of the IMF equation. The parameter values used for amino acid side chains are given in Table 5; those used to parameterize X^, X^, and X^ substitution are given in Tables 6 and 7. For each data set the best regression equation obtained and the appropriate statistics are reported. The statistics reported are described fully in Appendix 1.

Structural Effects in Angiotensinogen Derivatives

The structure of the angiotensinogen derivatives studied is summarized in Table 8. In sets 51, 58, and 59, Leu^10 is replaced by the fragment NHCH(CH2Ak)CHOH (where Ak = alkyl), the side chain of residue 9 is varied, and Val^11 is replaced by X^11. In set 51, X^11 is CH2SOnAk', and Ak is constant and equal to cyclohexyl. X^11 is parameterized by $n_O$, which is equal to the number of O atoms bonded to the
Table 5. Amino Acid Side-Chain Parameters for the IMF Equation

Aax          sigma_l   alpha    n_H   n_n   i    upsilon
Ala          -0.01     0.046    0     0     0    0.52
Asn           0.06     0.134    2     3     0    0.76
Asn(Ph)       0.10     0.377    1     3     0    0.76
Cha          -0.01     0.303    0     0     0    0.97
Chg           0.00     0.257    0     0     0    0.87
Cpg          -0.01     0.214    0     0     0    0.71
Gln           0.05     0.180    2     3     0    0.68
His           0.08     0.230    1     1     1    0.70
Hse           0.06     0.108    1     2     0    0.77
Leu          -0.01     0.186    0     0     0    0.98
(Ac)Lys       0.01     0.323    1     3     0    0.68
(Cbz)Lys      0.01     0.568    1     5     0    0.68
Met           0.04     0.221    0     0     0    0.78
Met(O2)       0.11     0.217    0     4     0    1.01
Nle          -0.01     0.186    0     0     0    0.68
Nva          -0.01     0.139    0     0     0    0.68
(Cbz)Orn      0.04     0.522    1     5     0    0.68
Phe           0.03     0.290    0     0     0    0.70
Ser           0.11     0.062    1     2     0    0.53
(Bzl)Ser      0.11     0.352    0     2     0    0.62
(Et)Ser       0.11     0.155    0     2     0    0.61
(Pym)Ser      0.11     0.328    0     3     0    0.62
Thg           0.19     0.230    0     0     0    0.57
Thr           0.09     0.108    1     2     0    0.70
(Bzl)Thr      0.09     0.398    0     2     0    0.71
(Bzl)Tyr      0.03     0.588    0     2     0    0.70
(Me)Tyr       0.03     0.344    0     2     0    0.70
Trp           0.00     0.409    1     0     0    0.70
Val           0.01     0.140    0     0     0    0.76
sulfur atom, $\upsilon_{Ak'}$, which accounts for the steric effect of Ak', and $\alpha_{Ak'}$, which represents its polarizability. The correlation equation is:

$$Q_X = L\sigma_{lX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + S\upsilon_X + B_O n_O + A_{Ak'}\alpha_{Ak'} + S_{Ak'}\upsilon_{Ak'} + B^o \tag{19}$$

The best regression equation is:

$$\log \mathrm{IC}_{50} = -7.13(\pm 3.08)\sigma_{lX} + 0.766(\pm 0.261)n_{HX} - 0.636(\pm 0.236)i_X - 3.48(\pm 1.83)\alpha_{Ak'} - 1.87(\pm 0.944)\upsilon_{Ak'} + 3.04(\pm 0.724) \tag{20}$$
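As a worked illustration of how Eq. 20 is applied, the following sketch evaluates it for one member of set 51 (Aax^9 = His, Ak' = iPr). It rests on several assumptions: the five regression terms are read as sigma_lX, n_HX, i_X, alpha_Ak', and upsilon_Ak'; the His values are those listed in Table 5 and the isopropyl values those listed in Table 6; and the names `EQ20_COEFFS` and `predict_log_ic50` are hypothetical helpers introduced only for this example.

```python
# Sketch: evaluate Eq. 20 (set 51) for a single compound.
# Assumption: the regression terms are sigma_lX, n_HX, i_X, alpha_Ak', upsilon_Ak'.

EQ20_COEFFS = {
    "sigma_lX": -7.13,
    "n_HX": 0.766,
    "i_X": -0.636,
    "alpha_Ak": -3.48,
    "upsilon_Ak": -1.87,
}
EQ20_INTERCEPT = 3.04


def predict_log_ic50(params, coeffs=EQ20_COEFFS, intercept=EQ20_INTERCEPT):
    """Return the intercept plus the sum of coefficient * parameter value."""
    return intercept + sum(coeffs[name] * params[name] for name in coeffs)


# His side chain (Table 5) combined with Ak' = iPr (Table 6).
his_ipr = {
    "sigma_lX": 0.08,
    "n_HX": 1,
    "i_X": 1,
    "alpha_Ak": 0.140,
    "upsilon_Ak": 0.76,
}

log_ic50 = predict_log_ic50(his_ipr)
ic50_nM = 10 ** log_ic50
print(f"log IC50 = {log_ic50:.3f}, IC50 = {ic50_nM:.1f} nM")
```

Under these assumptions the calculated IC50 is close to 5 nM; Table 4 lists 4.0 nM (n = 0) and 2.0 nM (n = 2) for the corresponding compounds. Because n_O does not appear in Eq. 20, both compounds receive the same calculated value.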
Table 6. Other Parameter Values Used in the Correlations^{a,b}
Group'
tBu02CNH
0.28
0.306
1
3
0
0.50
Et02CNH
0.28
0.214
1
3
0
0.50
H
0
0
0
0
0
CH2Ph
0.03
0.290
0
0
0
0.70
NH2 tBuCH2CONH
0.17
0.044
1
1
0.35
0.28
0.332
2 1
3
0.50
iPrCONH
0.28
0.240
1
3
0 0
tBuO
0.28
0.206
0
2
0
0.50 1.22
COiPe
0.30
0.288
0
2
0
0.50
CHOHiPe
0.09
0.294
1
2
0
CH2\Pr
0.01
0.140
0
0
0
0.53 0.52
CH2iBu
-0.01 0.27
0.186
0
0
0
0.52
OiPr
0.160
0
2
0
0.32
OIBu
0.28
0.206
0
2
0
0.32
SiPe
0.26
0.314
0
0
0
0.60
SCH2CH2Ph
0.26 0.26
0.488 0.268
0 0
0
0
0.60
0
0
0.26
0.222
0
0
0
0.60 0.60
5tBu
0.26
0.268
0
0
0
0.60
ScPe ScHx SPh 5CH2Ph S(CH2)3Ph
0.26 0.32 0.31
0 0 0
0 0
0 0 0
0.60 0.60 0.60
0.26
0.292 0.339 0.333 0.376
0
0
0
0.60
0.468 0.314
0
0
0
0.60
SOiPe
0.26 0.54
2
S02iPe S02CH2CH2Ph
0.58 0.58
0.311 0.415
0 0 0
4 4
0 0 0
0.66 1.03 1.03
502iBu
0.265
0
4
0
1.03
S02iPr
0.58 0.57
0.219
0
4
0
1.03
S02Et SO2CHX
0.59 0.57
0.172
0
4
0
0.336
0
4
0
1.03 1.03
CI
0.47
0
0
0
0.55
NH2 N3
0.17
0.050 0.044
2
1
1
0.35
0.43
0.092
0
1
0
0.35
Me
-0.01
0.046
0
0
0
0.52
Et IPr
-0.01
0.093
0
0
0
0.56
0.01
0.140
0
0
0
0.76
iBu
-0.01
0.186
0
0
0
0.98
tBu
-0.01
0.186
0
0
0
IPe
-0.01
0.232
0
0
0
SiBu SiPr
0
0
1.24 0.68 {continued)
Table 6. (Continued)
Group^' cHx Ph
a
^/
0.257
Continued "H
"n
0 0 0 0 2
/
t)
0 0 0
0.87 0.57 0.97 0.70 0.71 1.56
CH2CH2Ph
0.00 0.12 -0.01 0.02
CHOHiBu
0.09
0.336 0.248
0 0 0 0 1
CH2CH2NMe2
0.03
0.237
0
1
0 0 1
CH2CMe2NMe2
0.03
0.331
0
1
1
CH2CHX
0.243 0.303
0.68
Notes: ^a WZ groups are shown with W in italics. For these groups the $\upsilon$ value reported is for W alone. ^b Nonstandard abbreviations: c, cyclo; Pe, pentyl; Hx, hexyl; Pn, phenylene; Nh, naphthyl.
100R\ 71.43; AlOO/?^ 62.64; F, 6.001; 5,^^, 0.391; 5°, 0.655; n, 18; r^ji a^, «^, 0.570; a^, i, 0.533; a, n^, 0.731; n^, n„, 0.490; a^^' ^^ ^-^^'^^«//»i* 0.495. C^, 15.2; C„^, 20.3; C,-, 16.8; C^, 34.8; C^^, 13.0. A plot of log IC5o^^,^ against log 1050^,^8 ^^ given in Figure 1. Before interpreting these results it is necessary to understand what is represented by the i parameter in this data set. The only set members with an ionic side chain are those for which Aax^ is His. On the basis of percent inhibitions that were reported for Orn and Lys in position 9 and IC5Q values that have been determined for their Cbz derivatives it seemed that the effect of the His side chain is not due to its charge. The / parameter in Eqs. 19 and 20 is actually functioning as an indicator
Table 7. Parameters for Sets 53-57 CHZ^Z^^
ZG,Z
^w
CH2Ph CH2CH2Ph
0.12
0.290
0.03
0.336
CHPhCH2Ph
0.15
0.580
CH2PnOMe-4
0.11
0.356
CH2PnCI-4
0.15
0.338
^"HZ
^^nZ
^'z
^1W
^2W
0
0 0
1
1
0
0 0
0
0
0
1 2
1 2
0
2
0
1
0
0
0
1
1 1
CHMePh
0.11
0.336
0
0
0
2
1
CHMe(l-Nh)
0.13
0.496
0
0
0
2
1
CHMeCHOHPh
0.09
0.392
1
2
0
2
2
CHPhCHOHPh
0.22 0.02
0.590 0.547
1
2
2
0
1
0 1
3 0.74
c[(CH2)5N]CH2Ph-1
1.5
Notes: ^a Nonstandard abbreviations: c, cyclo; Pe, pentyl; Hx, hexyl; Pn, phenylene; Nh, naphthyl. ^b These parameters apply to both stereoisomers.
^3W
2b
.b .b 2" 1
Table 8. Structures of Angiotensinogen Derivatives
Set
X^
RpP'
/\ax^^
AkiRpi^^f
Rpl''
)f
CH2SOnAI<
—
51
Boc
Phe
Aax
cHx
58^ 59^ 60^ 61
Boc Z^CO Boc Boc
Phe Phe Phe Phe
Aax Aax His His/Ala
iPr/cHx(CH2CH2WZ) iPr/cHx(CH2(C=CH2)CO) NHZ^ NHAk cHxCH2C(OH)XCO iPr CH2WZ
62
—
Z^Z^HCO His
cHx
CH2SOnAk
53^
Boc
Phe
iPr
CH2CO
His
— — LeuW
Notes: ^Rpl» may be either X'^Aax or Z^Z^CHCO. ^Aax^ = Ala, Leu, Phe, His. ^RpP^ is NHCH(CH2Al<)CHOH. ^Z = Me, iPr, iBu, CH.Ph. ^Z^ = OtBu, OEt, Me, tBuCH,; ^W = CH2Z. ^W = NHCHZ^Z^.
variable for the presence or absence of a His side chain. Since the other contributions of the His side chain were already accounted for in the parameterization, it seems not unreasonable that $i$ in this case may actually represent a charge transfer acceptor contribution of the imidazole ring. The greatest effects of structural variation in set 51 result from the side chain of the residue in position 9. Varying the alkyl group in X^11 had little effect.

In set 59, X^11 is equal to CH2(C=CH2)(C=O)NHZ. The alkyl group of X^10 may be either iPr or cHx. Aax^9 is limited to four different residues, of which only His is capable of ionization and of hydrogen bond formation. For His the parameters $n_H$, $n_n$, and $i$ all take the value 1. These parameters were therefore combined into the single parameter $n_{Hni}$. Structural variation in X^N was modeled by the parameters ^ZN' ^ZN' ^^^ ^ZN'. Variation in the alkyl group of X^10 was accounted for by the
Figure 1. Calculated vs. observed IC50 values (set 51).
variable $n^*$, which takes the value 0 when Ak is iPr and 1 when it is cHx. The effect of structural variation of X^C was accounted for by the parameters $\sigma_{lZ}$, $\alpha_Z$, $n_{nZ}$, and $\upsilon_Z$. As only one of the Z groups studied had an OH or NH bond, the $n_H$ parameter was not included in the correlation equation, which took the form:
+ H^n^^ + S^y)^ + A^ + H^n^^ + S^y}^ + 5^
(21)
The best regression equation obtained is:

$$\log \mathrm{IC}_{50} = 0.680(\pm 0.170)n_{HniX} - 2.11(\pm 0.497)\upsilon_X - 0.989(\pm 0.235)n^* + 13.5(\pm 3.08)\sigma_{lZ} - 0.499(\pm 0.225)\upsilon_Z + 4.07(\pm 0.385) \tag{22}$$
$100R^2$, 91.34; $A100R^2$, 89.03; F, 29.53; $S_{est}$, 0.328; $S^0$, 0.352; n, 20. $r_{ij}$: $\sigma_l$, $\alpha$, 0.779; $\sigma_l$, $n_{Hni}$, 0.977; $\alpha$, $n_{Hni}$, 0.689; $\alpha$, $\upsilon$, 0.695; $\alpha$, $\upsilon_{XN}$, 0.578; $\alpha$, $\alpha_{XN}$, 0.478; $\alpha_{XN}$, $\upsilon_{XN}$, 0.951. $C_{n_{Hni}}$, 17.5; $C_{\upsilon}$, 38.0; $C_{n^*}$, 25.4; $C_{\sigma_l Z}$, 10.4; $C_{\upsilon Z}$, 8.73. A plot of log IC50,calc against log IC50,obs is given in Figure 2. As was the case for the $i$ parameter in set 51, the $n_{Hni}$ parameter in this data set is actually an indicator variable which takes the value 1 when Aax^9 is His and 0 otherwise. The very extensive collinearities in this data set make interpretation of the results difficult.

Set 58 differs from set 59 in having a constant X^N group, a different variable fragment for X^11, and a different type of X^C substituent. The correlation equation takes the form:
Figure 2. log IC50,calc vs. log IC50,obs (set 59).
(23)
The best regression equation obtained is:

$$\log \mathrm{IC}_{50} = -1.97(\pm 0.764)\alpha_X - 0.479(\pm 0.125)n_{HniX} - 1.29(\pm 0.417)\upsilon_X - 1.34(\pm 0.106)n^* - 0.164(\pm 0.0442)n_{OW} + 1.44(\pm 0.206) \tag{24}$$
100R\ 99.10; AlOO/?^ 98.80; F, 243.2; Sesu 0.125; 5°, 0.118; n, 17. r^: awz, awz, 0.913; awz, i)vv, 0.849; a/wz, M*, 0.541; awz, AX*, 0.675; vw, now, 0.896; now, n\ 0.572; a, n*, 0.733; i), «*, 0.657; a, «//„/, 0.624; a, i), 0.773. Ca, 13.0; CnHni, 13.7; Cu, 25.7; Cn*, 38.3; C^ow, 9.38. A plot of IC5Q^,^|^ against IC^Q^^^ is shown in Figure 3. The excellent fit of the model to this data set must be fortuitous as 5^^^ is less than the experimental error of the bioactivities. The structural features in order of decreasing importance are the side chain X of the amino acid residue in position 9, the nature of the alky 1 group attached to X^^^, and the group W of X^^^. There was no observable effect of structural variation of the X^ group. This may very well be due to collinearities. In set 61 the amino acid residue in position 9 is either Ala or His, the alkyl group in the side chain of Z^^^ is iPr throughout the entire data set, and X^^^ is CH2WZ. Structural variation in position 9 is represented by the variable n^.^ which takes the value 1 when the residue is His and 0 when it is Ala. The parameters c^^^, a^^, t)^, D^, and n^yy model the WZ group of X^^^ The correlation equation is:
Figure 3. log IC50,calc vs. log IC50,obs (set 58).
The best regression equation obtained is:

$$\log \mathrm{IC}_{50} = -0.878(\pm 0.181)n_{His} - 1.25(\pm 0.477)\upsilon_W + 0.118(\pm 0.0692)n_{nW} + 1.06(\pm 0.279) \tag{26}$$
$100R^2$, 60.69; $A100R^2$, 57.27; F, 11.32; $S_{est}$, 0.418; $S^0$, 0.682; n, 26. nf ciwz, Vw, 0.734; a/wz, rinw, 0.117; awz, 'Ow, 0.363; \)w, rinWy 0.618; AZ///^, rinw, 0.399. $C_{n_{His}}$, 58.0; $C_{\upsilon W}$, 26.3; $C_{n_n W}$, 15.6. A plot of log IC50,calc against log IC50,obs is given in Figure 4. The results obtained are poor but in agreement with those for the other sets studied. The side chain of the residue in position 9 exerts the major structural effect.

Structural variation in set 60 involves X^11 and X^C. The latter takes the form NHAk, where Ak is either isopentyl or isobutyl. This is accounted for by the $n^*$ parameter, which takes the values 1 and 0 respectively. The effect of the group Z in X^11 is modeled by the parameters $\sigma_{lZ}$, $\alpha_Z$, and $\upsilon_Z$. A further indicator parameter accounts for the possible formation of an intramolecular hydrogen bond of Z with the OH group of the residue 11 replacement. It takes the value 1 when there is such a bond and 0 when there is not. The configuration of the atom to which the CH2Z and OH groups are attached is represented by the parameter $n_{cf}$, which takes the value 1 for the configuration with the larger IC50 value and 0 for its epimer or enantiomer. As the number of data points in this set is very small and the number of variables required to model it is large, the results are only qualitative at best. The correlation equation has the form:
Figure 4. log IC50,calc vs. log IC50,obs (set 61).
Qbas = B*n + L^a^^ + A ^ 2 + H^^^n^^ + S^D^ + 8^/1^^+ BT
(27)
The best regression equation is:

$$\log \mathrm{IC}_{50} = -2.42(\pm 0.528)\sigma_{lZ} + 0.984(\pm 0.454)\upsilon_Z + 1.01(\pm 0.198)n_{cf} + 0.786(\pm 0.210) \tag{28}$$
$100R^2$, 88.65; $A100R^2$, 84.86; F, 13.01; $S_{est}$, 0.282; $S^0$, 0.452; n, 9. $r_{ij}$: $\alpha_Z$, $\upsilon_Z$, 0.795; $\alpha_Z$, $n^*$, 0.715; $\upsilon_Z$, $n^*$, 0.828. $C_{\sigma_l Z}$, 23.2; $C_{\upsilon Z}$, 19.5; $C_{n_{cf}}$, 57.3. Figure 5 is a plot of log IC50,calc against log IC50,obs. Configuration seems to play the major role in determining bioactivity in this set. The nature of the alkyl group that is part of the X^C substituent appears to have no effect.

The effect of substitution for the amino acid residue in position 8, either by varying its side chain or by replacing it completely with a non-amino acid fragment, has also been studied (set 62). All of these substitutions have the form XZCH(C=O). When the substitution is an amino acid, X is its side chain. When the substitution is not an amino acid, the X group is the larger of the two groups attached to CH. Substitution at position 11 by a non-amino acid replacement, X^11, also occurs in this data set. It has the form CH2SOnAk, where Ak is isopropyl or cyclohexyl. Thus the correlation equation is: Qbas = ^X^XX + ^Z^Z
+ ^X^X
+ ^Z^Z
+ ^ 1 Z ^ / / Z + ^^IX^rOC + ^ 2 Z ^ n Z
^S^v^^S^v^^B^n^
(29)
+ b*n*^b'
Figure 5. log IC50,calc vs. log IC50,obs (set 60).
The best regression equation obtained is:

$$\log \mathrm{IC}_{50} = -5.99(\pm 1.80)\sigma_{lX} - 4.31(\pm 0.600)\sigma_{lZ} + 7.96(\pm 1.42)\alpha_X - 4.44(\pm 1.03)\alpha_Z - 16.9(\pm 3.47)\upsilon_X + 12.68(\pm 2.47) \tag{30}$$
$100R^2$, 94.61; $A100R^2$, 92.65; F, 35.11; $S_{est}$, 0.213; $S^0$, 0.294; n, 16. $r_{ij}$: $\sigma_{lX}$, $\alpha_Z$, 0.761; $\sigma_{lX}$, $n_{nX}$, 0.547; $\sigma_{lX}$, $\upsilon_X$, 0.846; $\sigma_{lZ}$, $n_{HZ}$, 0.936; $\sigma_{lZ}$, $n_{nZ}$, 0.756; $\sigma_{lZ}$, $n^*$, 0.515; $\alpha_X$, $n_{nX}$, 0.736; $\alpha_X$, $n_{HZ}$, 0.584; $\alpha_X$, $n^*$, 0.542; $\alpha_Z$, $n_{nX}$, 0.481; $n_{HZ}$, $n_{nZ}$, 0.589; $n_{HZ}$, $n^*$, 0.528. $C_{\sigma_l X}$, 2.95; $C_{\sigma_l Z}$, 6.61; $C_{\alpha X}$, 17.3; $C_{\alpha Z}$, 7.44; $C_{\upsilon X}$, 65.6. A plot of log IC50,calc against log IC50,obs is given in Figure 6. The very good fit obtained for this data set is probably due to chance. The overwhelming predominance of the steric effect is suspicious, as the extent of the range of $\upsilon_X$ is very small. The very large number of collinearities among the variables is a further cause for suspicion. It is likely, however, that steric and polarizability effects are dominant in structural variations at position 8 and that variation of the alkyl group in X^11 has no observable effect on the bioactivity.

Also studied were IC50 and $K_i$ values determined for peptides having the structure Boc-Phe-His-(3S,4S)-Sta-Leu-X^C, where Sta is the amino acid 3-hydroxy-4-amino-6-methylheptanoic acid, 4.

H2NCH(iBu)CH(OH)CH2CO2H    4
Figure 6. log IC50,calc vs. log IC50,obs (set 62).
Sta is thought by some to be a replacement for two amino acid residues. The fragment NHCH(iBu)CHOH is indeed equivalent to Leu; the remaining fragment, CH2(C=O), needs one more atom other than hydrogen in the chain of three atoms that is the skeletal group of a peptide residue in order to be a valid replacement of an amino acid residue. The substrates in these data sets are therefore roughly equivalent to pentapeptide angiotensinogen analogues in which Sta replaces Leu^10-Val^11. The structural variation is in X^C, which can be written in the form NHW, where W is CHZ^1Z^2. The structural effects are modeled by $\Sigma\sigma_{lZ}$, $\Sigma\alpha_Z$, $\Sigma n_{HZ}$, $\Sigma n_{nZ}$, $n_1$, $n_2$, and $n_3$. For four of the X^C substituents in the data sets studied, the carbon atom bonded to nitrogen is chiral. Values of the bioactivity have been determined for both enantiomers in three cases and for the meso form and the racemate in the fourth case. The configurational isomerism is accounted for by the $n_{cf}$ parameter defined above. The use of this parameter in a data set which contains both chiral and achiral members is based on the assumption that the achiral group will prefer a conformation analogous to that of one of the two chiral isomers and will therefore behave in the same way as that isomer, while the activity of the other chiral isomer will be either greater or less throughout the data set. Although one of the substituents studied, 4-amido-1-benzylpiperidine, can ionize, it was not necessary to include parameterization to account for this. The correlation equation used for sets 53-57 is:
The best regression equation for IC50 values obtained with hog kidney renin (set 53) is:

$$\log \mathrm{IC}_{50} = -11.7(\pm 1.56)\Sigma\sigma_{lZ} + 1.68(\pm 0.259)\Sigma n_{HZ} - 0.233(\pm 0.108)\Sigma n_{nZ} + 0.312(\pm 0.136)n_{cf} + 0.404(\pm 0.178)n_3 + 1.97(\pm 0.243) \tag{32}$$
100R\ 93.05; AlOO/?^ 89.96; F, 21.42; 5est, 0.210; 5^, 0.349; n, 14. r,>: Sa, n2, 0.746; 2a, ns, 0.702; Za, m, 0.598; Sa, ^2, 0.643; Ea, m, 0.777; Zn//, Z/in, 0.792; In//, ^2, 0.828; Znn, m, 0.577; n2, «3, 0.861. Cia, 26.9; Ci„//, 42.9; C^n. 11.9; C„,/, 7.98; Cn,. 10.3 while for IC5Q values obtained with human plasma renin (set 54) the best equation, which resulted on exclusion of the data point for (-)NHCHPhCH2Ph, is: log IC50 = -2.71(±1.52)Za^ + 4.79(±0.964)Za2-
0.255(±0.106)IAZ„2
-0.727(±0.154)n^^-0.775(±0.193)ni + 1.35(10.273)^2 - 2.06(±0.462)n3 + 2.80(±0.402)
(33)
100R\ 93.30; AlOO^^ 86.60; F, 9.950; 5est, 0.191; 5^, 0.417; n, 13. nf. Sa, m, 0.741; Za, n3, 0.723; 2a, m, 0.575; Za, AZ2, 0.626; Za, ns, 0.745; Zn//, Zrt„, 0.787; Zn//, AX2, 0.882; Z«//, «3,0.567; Zn^, ^3, 0.640; n2, ns, 0.882. Ci^a. 2.53; Cza, 19.4; C^nn, 5.28; Cnc/, 7.53; G p 16.0; C^^, 27.9; Cn3, 21.3. Plots of IC5o^^|^, against IC5QOJ,5 are given in Figures 7 and 8. Sets 53 and 54 differ in their structural effect dependence. The former set is independent of a, n^, and AZ2- It is largely a function of hydrogen bonding as is shown by the sum of the C^^ and C^^ values. The latter set is dependent on polarizability and on all three branching steric parameters. The sum of the C^, C^j, C^2' ^^^ ^ns values amounts to well over 80% of the total structural effect. If this result is not an artifact resulting from the extensive collinearities among the variables and/or the small size of the data set it may be due to either the difference in the enzyme used, the difference in the substrate used in the assay, or both of these factors. Both sets showed a dependence on configuration of about the same magnitude. Support for a difference in the behavior of the enzymes as the cause comes from the results obtained on combining sets 53 and 54 by the Omega method [4,22]. This technique can be used to combine data sets with related biocomponents into a single set. If the relationship between the biocomponents is not close enough the method fails. The biocomponent parameter co was defined from the log IC5Q values for the compound with W equal to 10,ll-dehydro-5H-dibenzo[a,d]cycloheptenamido. This data point was not included in the correlations of sets 53 and 54 due to difficulty in its parameterization. The correlation equation is identical to Eq. 19 except for the addition of a term in 0(0. 
The best regression equation obtained for the combined set gives results that are very much poorer than those obtained independently for sets 53 and 54, showing that combination of the data sets is not justified.
Figure 7. log IC50,calc vs. log IC50,obs (set 53).

Figure 8. log IC50,calc vs. log IC50,obs (set 54).
The correlation equation for the three sets of $K_i$ values is the same as that used for sets 53 and 54. The best regression equation obtained for set 55 is:

$$\log K_i = 3.05(\pm 1.28)\Sigma\alpha_Z - 0.500(\pm 0.152)\Sigma n_{nZ} + 0.869(\pm 0.347)n_2 - 1.14(\pm 0.586)n_3 + 1.28(\pm 0.365) \tag{34}$$
$100R^2$, 67.32; $A100R^2$, 57.52; F, 4.636; $S_{est}$, 0.325; $S^0$, 0.713; n, 14. $r_{ij}$: see set 53. $C_{\Sigma\alpha}$, 23.6; $C_{\Sigma n_n}$, 19.7; $C_{n_2}$, 34.3; $C_{n_3}$, 22.4. For set 56:

$$\log K_i = -12.8(\pm 3.24)\Sigma n_{HZ} + 12.1(\pm 3.15)n_2 - 12.4(\pm 3.25)n_3 + 2.06(\pm 0.664) \tag{35}$$
$100R^2$, 61.76; $A100R^2$, 54.81; F, 5.384; $S_{est}$, 0.774; $S^0$, 0.732; n, 14. $r_{ij}$: see set 53. $C_{\Sigma n_H}$, 25.9; $C_{n_2}$, 49.0; $C_{n_3}$, 25.1. For set 57:

$$\log K_i = -1.87(\pm 0.659)\Sigma n_{nZ} + 3.29(\pm 1.31)n_2 - 3.75(\pm 1.59)n_3 + 1.99(\pm 0.807) \tag{36}$$
$100R^2$, 48.38; $A100R^2$, 38.06; F, 2.812; $S_{est}$, 0.912; $S^0$, 0.863; n, 13. nf "Lciz, m, 0.747; Ea/z, ns, 0.702; Zaz, ni, 0.569; Eaz, ni, 0.629; Saz, ns, 0.769; IriH, I^nn, 0.950; En//, «2, 0.824; Zn„, AZ2, 0.731; m, ns, 0.857. $C_{\Sigma n_n}$, 26.6; $C_{n_2}$, 46.8; $C_{n_3}$, 26.6.

Figure 9. log K_i,calc vs. log K_i,obs (set 55).

Plots of log K_i,calc against log K_i,obs are shown in Figures 9-12. As sets 56 and 57 involve the same enzyme and substrate and differ only in the method of assay, we have combined them by means of the Zeta method into a single set (set 56, 57). The $K_i$ values for the set member for which W is the 10,11-dihydro-5H-dibenzo[a,d]cycloheptenamido group were used to define $\zeta$. The correlation equation used for this set is obtained from Eq. 18 by adding a term in $\zeta$. The best regression equation obtained is:
Figure 10. log K_i,calc vs. log K_i,obs (set 56).

Figure 11. log K_i,calc vs. log K_i,obs (set 57).
$$\log K_i = -11.2(\pm 2.20)\Sigma n_{HZ} - 0.684(\pm 0.355)n_{cf} - 0.712(\pm 0.400)n_1 + 10.9(\pm 2.13)n_2 - 11.2(\pm 2.19)n_3 + 3.64(\pm 0.858) \tag{37}$$
100R\ 63.24; A100^^ 56.56; F, 7.227; 5est, 0.760; 5°, 0.687; n, 27. ny. Zo/z, Ixxz, 0.453; 5kxz, Sn/zz, 0.498; ECT/Z, AH, 0.746; So(z, na, 0.702; Eaz, m, 0.584; Zaz, "2, 0.637; Zaz, ns, 0.773; I^nnz, ^nz, 0.863; En//z, «2, 0.826; In,a:, «2, 0.646; ni, m, 0.479; n2, «3,0.860. C„HZ, 24.2; C„c/, 1.48; C„„ 3.08; Q , 47.0; C„3, 24.3.
Figure 12. log K_i,calc vs. log K_i,obs (set 56, 57).
Although the results are poor, they are probably at least qualitatively reliable. Most of the structural effect is due to some combination of steric effects and polarizability. The rest is due to hydrogen bonding. There may also be a dependence on configuration.

Structural Effects in Pepstatin Derivatives
Values of -log IC50 have been reported for a set of 34 pepstatin analogues varying in the amino acid residue in position 2 and, to a lesser extent, in the amino acid residue in position 1, in X^N, and in X^C. The effect of the side chain on the residue in position 2 is represented by the IMF equation. X^N substitution is modeled by the variables ^XN' ^nXN> ^^^ ^XN-. X^C substitution involving the replacement of OMe by OH is accounted for by the variable $n_{OH}$, which takes the value 1 when X^C is OH and 0 when it is not. The replacement of Phe by Trp in position 1 is represented by the parameter $n_1$, which takes the value 1 when the residue in position 1 is Trp and 0 when it is not. Thus the correlation equation is:

$$Q_X = L\sigma_{lX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + S\upsilon_X + \cdots$$

The best regression equation obtained is:

$$-\log \mathrm{IC}_{50} = -7.51(\pm 1.29)\sigma_{lX} - 1.87(\pm 0.552)\alpha_X - 0.314(\pm 0.107)n_{HX} + 0.898(\pm 0.254)i_X - 3.49(\pm 0.659)\upsilon_X + 1.20(\pm 0.238)\upsilon_{XN} + 8.64(\pm 0.524) \tag{39}$$
$100R^2$, 75.26; $A100R^2$, 70.84; F, 13.69; $S_{est}$, 0.341; $S^0$, 0.558; n, 34. nf. ax, nnx, 0.568; nnx. rinx, 0.442. $C_{\sigma_l X}$, 9.78; $C_{\alpha X}$, 7.00; $C_{n_H X}$, 5.11; $C_{iX}$, 14.6; $C_{\upsilon X}$, 39.7; $C_{\upsilon XN}$, 23.8. A plot of log IC50,calc against log IC50,obs is given in Figure 13. Again the only residue in the data set with a charged side chain is His. The $i$ parameter is therefore acting as an indicator variable for the presence of His. The results show that steric effects dominate the bioactivity in this set.

A subset of set 52 in which only the side chain of the residue in position 2 varies was studied (set 52a). Boc and OMe are the X^N and X^C substituents respectively throughout the subset. The correlation equation is the IMF equation. The best regression equation was obtained with the inclusion of the data point for His; it is:

$$-\log \mathrm{IC}_{50} = -3.36(\pm 1.58)\sigma_{lX} - 0.211(\pm 0.0590)n_{nX} + 0.997(\pm 0.316)i_X - 2.65(\pm 0.660)\upsilon_X + 8.07(\pm 0.503) \tag{40}$$
Figure 13. log IC50,calc vs. log IC50,obs (set 52).
$100R^2$, 79.65; $A100R^2$, 76.06; F, 15.66; $S_{est}$, 0.306; $S^0$, 0.517; n, 21. nf Gix, n„x, 0.463; mx, n„x, 0.452. $C_{\sigma_l X}$, 8.07; $C_{n_n X}$, 6.33; $C_{iX}$, 29.9; $C_{\upsilon X}$, 55.7. Figure 14 shows a plot of log IC50,calc against log IC50,obs. The point for His was included in order to make comparisons with the results for the whole set. Again steric effects are the predominant factor in determining bioactivity in the set. The only differences in behavior between the subset and the whole set with regard to the effect of the side chain on the residue in position 2 are the lack of dependence of the former on polarizability and its dependence on $n_n$ rather than $n_H$.
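The C values quoted with each equation can be reproduced from the regression coefficients and the parameter values of a reference side chain. The sketch below does this for Eq. 40 with His as the reference, assuming that each C value is the absolute size of a term expressed as a percentage of the summed absolute sizes of all terms. That definition is inferred from the numbers rather than quoted from Appendix 1, and the names used are hypothetical.

```python
# Sketch: percent contribution (C value) of each term of Eq. 40 for His.
# Assumption: C_k = 100 * |a_k * x_k| / sum_j |a_j * x_j|, evaluated for a
# reference side chain (His here).

EQ40_COEFFS = {
    "sigma_lX": -3.36,
    "n_nX": -0.211,
    "i_X": 0.997,
    "upsilon_X": -2.65,
}

# His side-chain parameter values as listed in Table 5.
HIS_PARAMS = {"sigma_lX": 0.08, "n_nX": 1, "i_X": 1, "upsilon_X": 0.70}


def percent_contributions(coeffs, params):
    """Return the percent contribution of every term in the equation."""
    magnitudes = {name: abs(coeffs[name] * params[name]) for name in coeffs}
    total = sum(magnitudes.values())
    return {name: 100.0 * size / total for name, size in magnitudes.items()}


c_values = percent_contributions(EQ40_COEFFS, HIS_PARAMS)
for name, value in c_values.items():
    print(f"C({name}) = {value:.2f}")
```

The results agree with the reported values of 8.07, 6.33, 29.9, and 55.7 to within rounding, which supports both the assumed definition and the assignment of the four terms in Eq. 40.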
Figure 14. log IC50,calc vs. log IC50,obs (set 52a).
In the majority of the data sets studied the number of data points is insufficient to permit the determination of reliable QSAR. The results obtained for these data sets are best regarded as suggestive. What justifies the decision to analyze the data? A major function of correlation analysis is obtaining useful information concerning the effect of structural variation on measurable properties from experimental results. If the quality or quantity of the data does not permit the determination of QSAR, they can nevertheless be used to obtain semiquantitative structure-activity relationships (SQSAR). These are useful as indications of the direction for further investigation.

Design of Angiotensinogen Analogues
The only set which includes some meaningful degree of substitution at X^ shows no effects of it on bioactivity. For X^^ or Aax^ the activity of the inhibitor is increased (IC50 is decreased) by incrementation in a^^ a^^' ^z» ^^^ % ' ^^ ^^ decreased by decrementation in a^^^ (set 62). Variation of the amino acid side chain Aax^ in position 9 was extensively studied in set 51 and to a very much lesser extent in sets 58 and 59. Activity is increased by incrementation of G^^ (set 51), i)^ (sets 58, 59), % (set 58), and by the presence of His in this position. It is decreased by incrementation of AZ^ (set 51). Structural variation in X^^^ was limited to the substitution of cyclohexyl for isopropyl. Both sets 58 and 59 show that replacement of isopropyl by cyclohexyl increases activity. As this replacement results in incrementation of both $\upsilon$ and $\alpha$, the reason for the increase in activity is unknown. It may be due to steric effects, to polarizability, or to some combination of them. A study of the four substrates in which isopropyl is replaced by pentyl, isopentyl, 3-pentyl, and t-pentyl, all of which have the same value of $\alpha$ but range in $\upsilon$ from 0.68 to 1.63, would answer this question. If all of these substrates exhibit the same activity, the structural effect observed is due to polarizability. If there are large differences in activity, the cause is steric effects. If there are small differences, the structural effect probably involves both steric and polarizability contributions. For X^^^ incrementing $\alpha$ and $\upsilon$ for the alkyl group attached to sulfur in set 51 increases activity, as does increasing the number of O atoms in the fragment W in set 58. In set 61 incrementation of D^ and decrementation of AZ^^ increases activity. In set 60 activity is increased by the incrementation of a^^ ^^^ decrementation of For X^ in set 59, activity is increased by incrementation of D^ and decrementation of o^2- ^^^ 58 shows no dependence on structural variation in X^.
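The proposed test with the four pentyl isomers (equal polarizability, different steric parameter) amounts to a three-way decision on the spread of the measured activities. A minimal Python sketch of that decision follows. The two cut-off values and the function name are hypothetical, since the text gives no numerical definition of "same", "small", or "large" differences.

```python
def classify_structural_effect(activities, same_tol=0.1, large_gap=1.0):
    """Interpret the spread of activities (e.g. -log IC50) over the pentyl series.

    same_tol and large_gap are hypothetical thresholds in log units.
    """
    spread = max(activities) - min(activities)
    if spread > large_gap:
        return "steric"                       # large differences in activity
    if spread > same_tol:
        return "steric and polarizability"    # small differences
    return "polarizability"                   # essentially the same activity


# Hypothetical -log IC50 values for pentyl, isopentyl, 3-pentyl, t-pentyl.
print(classify_structural_effect([7.00, 7.05, 7.02, 6.98]))
```

In practice the thresholds would have to be chosen relative to the experimental error in the activity measurements.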
With regard to the results for structural variation in sets 54-57, activity is increased by incrementation of the hydrogen-bonding parameters $n_H$ and $n_n$ and the branching steric parameter $n_3$, and by decrementation of the branching steric parameter $n_2$.
Parameterization of Configuration
Modeling configuration with the configurational variable n, which takes the value 1 for achiral groups and for the configuration with the larger value of Q and the value 0 for the enantiomeric or epimeric configuration, has had some success. The underlying assumption in this parameterization is that the factors which maximize Q for one chiral configuration will be reflected in the achiral group, which can arrange itself spatially so as to bind as strongly as possible to the receptor site. The enantiomeric configuration will be so arranged in space as to bind less strongly. As soon as this parameter is introduced into the model, any achiral groups present in the set must be assigned values of either 0 or 1, and they are therefore automatically assigned to the same category as one of the two configurations of the chiral groups. Note that the rationale for this method is that the activity-determining step involves binding to a receptor. The validity of this parameterization requires further testing before it can be regarded as generally reliable.
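The assignment rule described above can be written out as a short Python sketch. The group names, the Q values, and the function name are hypothetical and serve only to illustrate the rule: achiral groups and the more active member of each enantiomeric or epimeric pair receive 1, the less active member receives 0.

```python
def assign_configuration_indicator(q_by_group, chiral_pairs):
    """Return {group: 0 or 1} following the rule described in the text.

    q_by_group   : observed activity Q for every group in the data set
    chiral_pairs : (config_a, config_b) tuples of enantiomeric/epimeric groups
    """
    # Every group starts at 1; this is the value kept by achiral groups.
    indicator = {group: 1 for group in q_by_group}
    for pair in chiral_pairs:
        # The configuration with the smaller Q in each pair is set to 0.
        weaker = min(pair, key=q_by_group.get)
        indicator[weaker] = 0
    return indicator


# Hypothetical data: one achiral group and one enantiomeric pair.
q_values = {"Gly": 5.1, "L-Ala": 6.3, "D-Ala": 4.8}
indicators = assign_configuration_indicator(q_values, [("L-Ala", "D-Ala")])
print(indicators)  # {'Gly': 1, 'L-Ala': 1, 'D-Ala': 0}
```

Note that the achiral group ends up in the same category as the more active configuration, which is exactly the assumption whose validity the text says still needs testing.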
IV. PROTEIN BIOACTIVITIES

A. Limitation of the Model in Protein QSAR

It is important to review the disadvantages of the IMF model, particularly as applied to proteins. They include:

1. The protein data set is almost always restricted to the 19 structurally similar amino acid residues normally found in protein, termed the protein basis set (PBS). The IMF equation for proteins requires from six to nine independent variables, depending on the choice of steric effect parameterization. This results in two problems. (a) The data set has a low value, 1-2, of the ratio $N_{DF}/N_V$, where $N_{DF}$ is the number of degrees of freedom in the data set and $N_V$ is the number of independent variables. The possibility of a chance correlation is fairly large. (b) There are collinearities among the independent variables which are inherent in the PBS. No method for avoiding them is available. This makes interpretation of the regression equation more difficult.

2. In general, residues substituted at the α-amino N atom require two additional independent variables to account for their structural effect. The proline type of residue requires one. As its inclusion in the data set adds only one DF, the model cannot be applied to data sets which include it.

B. Types of Protein Bioactivity Data Sets
There are two major types of protein bioactivity data sets:

1. Sets in which substitution occurs at a single position in the protein. In this type of data set a residue occupying a given position in the wild-type (native) protein that is known to be involved in the bioactivity is replaced by a number of other residues, and the activities of these mutant proteins are determined. The activities can be correlated with an appropriate form of the IMF equation. An example of this type of data set is the determination by Alber and coworkers of the relative activities of phage T4 lysozymes substituted at position 86.

2. Residues which are known to be part of a receptor in a bioactive protein are substituted for and the activities of the resulting mutants are determined. In correlating this type of data set it is necessary to assume that only structural effects on the receptor site are of importance and therefore that the effect of the side chains of nearby residues on that of the residue that has undergone substitution is negligible. This is a reasonable approximation if only interactions between the receptor and the bioactive species need be considered. The parameterization of the IMF equation in this case must represent the difference in effect between the side chain in the wild-type and that in the mutant. An example of this type of substitution is the report of Fersht et al. on kinetically determined ΔΔG values for the binding of ATP and of Tyr to tyrosyl-tRNA synthetase. A variation on this type of study is that of Cunningham and Wells, which involves alanine substitution of each residue in the protein.

C. Human Growth Hormone (hGH)
The alanine scanning mutagenesis method has been applied to residues that are within the three regions of hGH said to be involved in receptor recognition. They are helix 1 (residues 2-19), loop (residues 54-74), and helix 4 (residues 167-191). Binding dissociation constants, $K_d$, were determined for each of the mutant human growth hormone-human growth hormone liver receptor site complexes substituted at residues which are part of the receptor. In two cases the alanine-substituted mutant could not be prepared and a different substitution product was used in its place. In this type of protein bioactivity study, only those residues which are either part of or interact with a receptor site need be considered. Residues which are not involved in binding act as a skeletal group to which the active residues are attached. It is assumed that their side chains take no part in the observed bioactivity. Cunningham and Wells have considered all residues for which the ratio of $K_{d,\mathrm{mutant}}$ to $K_{d,\mathrm{wild\text{-}type}}$ is greater than 4 to be possibly involved in binding. There are 13 such values (Table 9). Correlation of this data set requires a modified form of the IMF equation:

$$Q_X = L\sigma_l^{\Delta} + A\alpha^{\Delta} + H_1 n_H^{\Delta} + H_2 n_n^{\Delta} + I i^{\Delta} + S\upsilon^{\Delta} + B_1 n_1^{\Delta} + B_2 n_2^{\Delta} + B_3 n_3^{\Delta} + B^0 \qquad (41)$$
Table 9. $K_d$ (nM) for Ala Substitutions of Receptor Residues in Human Growth Hormone (hGH);^a Other Substitutions Are Shown

Leu6, 0.95; Phe10, 2.0; Phe54, 1.5; Glu56, 1.4; Ile58, 5.6; Ser62, 0.95; Arg64, 7.11; Glu66, 0.71; Gln68, 1.8; Lys70, 0.82; Asp171, 2.4; Lys172, 4.6; Glu174, 0.075; Thr175Ser, 5.9; Phe176, 5.4; Arg178Asn, 2.9; Ile179, 0.92; Cys182, 1.9; Val185, 1.5.

Note: ^a Ref. 6. Sets 12 and 13 include all substitutions with $K_d$ > 1.4 with the exception of Phe176. Set 14 includes Met14, Ser62, and Glu66 as well.
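The selection of data points from Table 9 can be illustrated with a short Python sketch. It assumes that the cut-off in the table note is applied as $K_d \geq 1.4$ nM, which is an assumption made here because only then does Glu56 (1.4 nM) count among the 13 values mentioned in the text and leave 12 points once Phe176 is excluded. The variable names are illustrative.

```python
import math

# K_d values (nM) as listed in Table 9.
kd_nM = {
    "Leu6": 0.95, "Phe10": 2.0, "Phe54": 1.5, "Glu56": 1.4, "Ile58": 5.6,
    "Ser62": 0.95, "Arg64": 7.11, "Glu66": 0.71, "Gln68": 1.8, "Lys70": 0.82,
    "Asp171": 2.4, "Lys172": 4.6, "Glu174": 0.075, "Thr175Ser": 5.9,
    "Phe176": 5.4, "Arg178Asn": 2.9, "Ile179": 0.92, "Cys182": 1.9,
    "Val185": 1.5,
}

THRESHOLD_NM = 1.4  # assumed to be inclusive, see the lead-in above

# Residues regarded as possibly involved in binding.
candidates = [res for res, kd in kd_nM.items() if kd >= THRESHOLD_NM]

# The Phe176Ala mutant is excluded from the correlated set.
correlated = [res for res in candidates if res != "Phe176"]

# The dependent variable used in the correlations is log K_d.
log_kd = {res: math.log10(kd_nM[res]) for res in correlated}

print(len(candidates), len(correlated))  # 13 12
```

The resulting 12 values of log $K_d$ lie between roughly 0.15 and 0.85.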
where the superscript Δ indicates the difference between the parameter value of the final side chain and that of the initial side chain. Thus, when the amino acid residue $\mathrm{Aax}^i$ is replaced by the residue $\mathrm{Aax}^f$ the value of $v^{\Delta}$ is given by:

$$v^{\Delta} = v_f - v_i \qquad (42)$$

where v is one of the independent variables in Eq. 3 while $v_i$ and $v_f$ are its values for the initial and final side chains, respectively. Values of $v^{\Delta}$ for use with Eq. 41 are given in Table 10. The best regression equations were obtained on the exclusion of the data point for the Phe176Ala mutant. They are:

$$\log K_d = -3.79(\pm 0.940)\,\sigma_l^{\Delta} + 0.138(\pm 0.0423)\,n_H^{\Delta} - 0.240(\pm 0.134)\,n_1^{\Delta} + 0.360(\pm 0.141)\,n_2^{\Delta} - 0.635(\pm 0.164)\,n_3^{\Delta} + 0.856(\pm 0.196) \qquad (43)$$

$100R^2$, 82.53; $A100R^2$, 72.55; F, 5.670; $S_{est}$, 0.146; S°, 0.591; n, 12

or:

$$\log K_d = -2.50(\pm 0.987)\,\sigma_l^{\Delta} + 0.152(\pm 0.0518)\,n_H^{\Delta} - 0.330(\pm 0.132)\,n_3^{\Delta} + 0.586(\pm 0.0996) \qquad (44)$$

$100R^2$, 63.70; $A100R^2$, 55.63; F, 4.679; $S_{est}$, 0.182; S°, 0.738; n, 12

For each of these equations the F test is significant at the 95.0% CL. Plots of log $K_{d,calc}$ against log $K_{d,obs}$ are given in Figures 15 and 16. $C_i$ values for Eqs. 43 and 44 are given in Table 11. Binding of mutant hormones to the hGH receptor is a function of hydrogen bonding, and possibly of van der Waals interactions and steric effects as well. Values of ΔQ, the difference between the observed and calculated values of log $K_d$, for the residues not represented by Eq. 44 are reported in Table 12. Three of the residues in Table 12, Met14, Ser62, and Glu66, have ΔQ values which are small
Table 10. Amino Acid Side-Chain Parameter Difference Values^a
Aax Lys6
a*
^H
*
/
D
*
*
*
^1
"2
"3
2
0 1
0.00 0.04
0.140 0.244
0
0 0
0 0
0.46
0
0.18
1 1
Met14
0.05
0.175
0
0
0
0.26
1
1
Phe54
0.04
0.244
0
0
0
0.18
1
1
PhelO
Glu56
0.08
0.105
1
4
1
0.16
1
1
Ile58
0.00
0.140
0
0
0
0.50
2
0
Ser62
0.12
0.016
1
2
0
0.01
1
Asn63 Arg64
0.07
0.088 0.245
0 1
Glu66
0.08 0.06
0.105 0.134
3 3 4
0.24
0.05
2 4
1
0.16
1 1 1
0 1 1
2
0
0.16
1
0.173
2
0.059
1
1 1
0.16 0.24
1 1
1 1
Aspl 71
0.01 0.16
3 1 4
Lysl 72
0.01
0.173
2
1
1
0.16
1
0 1
Glu174
0.08
0.140
1
4
1
0.16
1
1
-0.02
0.046
0
0
0
0.17
1
0.04
0.244
0
0
0
0.18
1
-0.02
0.157
2
0
0 1
0
0
2
0
0
Gln68 Lys70
Thrl 75Ser Phel76 Argi 78Asn
1
0.16
Ile179 Cys182
0
0.140
0
0
1 -0.08 0 0.50
0.13
0
0
0
Vail 85
0.02
0.082 0.094
0
0
0
Note:
0.10 0.24
0
0
0
0
0
1
1
2
0
These values are for correlations with Eq. 10 and its variants. Correlation of the logarithms of the entire data set of $K_d$ values with Eq. 10 did not give significant results. Exclusion of the Phe176Ala mutant gave the best regression equations.
enough to suggest that they can be combined with the members of set 9. Correlation of the combined set with Eq. 41 gives the regression equation:

$$\log K_d = -4.29(\pm 0.971)\,\sigma_l^{\Delta} + 2.04(\pm 0.963)\,\alpha^{\Delta} + 0.157(\pm 0.0416)\,n_H^{\Delta} - 0.329(\pm 0.145)\,n_1^{\Delta} + 0.377(\pm 0.148)\,n_2^{\Delta} - 0.939(\pm 0.187)\,n_3^{\Delta} + 0.787(\pm 0.216) \qquad (45)$$

$100R^2$, 85.96; $A100R^2$, 78.16; F, 8.161; $S_{est}$, 0.158; S°, 0.513; n, 15

A plot of log $K_{d,calc}$ against log $K_{d,obs}$ is given in Figure 17. The major difference between Eqs. 44 and 45 is that the latter shows some dependence on polarizability
Figure 15. hGH: log $K_{d,calc}$ vs. log $K_{d,obs}$.
whereas the former does not. It is important to recognize, however, that there is a strong collinearity between the polarizability parameter $\alpha^{\Delta}$ and the steric parameter $n_3^{\Delta}$. It is quite likely that the $n_3^{\Delta}$ term in Eqs. 44 and 45 represents polarizability at least in part. Correlation matrices for Eqs. 44 and 45 are set forth in Table 13. The coefficients of the other independent variables in Eq. 45 show no significant difference from those in Eq. 44. ΔQ values calculated from Eq. 45 for the remaining residues are also given in Table 12. As the calculated values for Lys70 and Phe176 are between three and four standard deviations away from the observed values, these residues may simply be outliers. These residues do not seem to be on the binding surface of hGH. Their effect is probably due to conformational changes which affect
Figure 16. hGH: log $K_{d,calc}$ vs. log $K_{d,obs}$.
Table 11, C. values for Set 9, Equations 43 and 44^ Eq.
43
44
Eq.
43
44
Col Ca
19.9
31.9
^"3
37.1
46.7
0
0
Cimf
27.9
53.3
QH
8.05
21.4
c
72.0
46.7
c
CN Cfvip lOOR^
53.3
72.8
46.7
82.53
63.70
Cn,
14.0
C.L.(F)
95.0
95.0
Cn,
20.9
0 0 0 0 0
27.9
Q
0 0 0
Ci
Note: ^a It is clear that binding of mutant hormones to the hGH receptor is a function of both imf and steric effects. The difference between the observed and calculated values of log $K_d$, ΔQ, for the residues not represented by Eq. 12 are reported in Table 10.
the binding surface. The remaining residues are apparently behaving very differently from those which obey Eq. 45, although they do seem to lie on the binding surface.

D. Subtilisin BPN'

In the studies of Wells and coworkers, structural variation occurs at positions 156 and 166 in the enzyme and at position 1 of the substrates, which are peptides having the structure succinyl-L-Ala-L-Ala-L-Pro-L-Aax-4-nitroanilide. Nine different residues have been studied in position 166, and three in position 156 (S166 and S156, respectively) of the enzyme. Four different residues have been studied in position 1 (P1) of the peptide. The data are reported in Table 14. Parameters used in the correlations are given in Table 15.

Substitution at S166

In view of the more extensive information at position 166 we have concentrated our effort on structural effects at that position. The effect of substitution at S156 is
Table 12. ΔQ Values Calculated from Equations 44 and 45

Aax       Eq. 44    Eq. 45
Leu6      -1.33     -1.52
Met14     -0.28
Ser62     -0.32
Asn63     -0.93     -0.99
Glu66     -0.32
Lys70     -0.67     -0.61
Glu174    -1.29     -1.12
Phe176     0.54      0.50
Ile179    -0.77     -0.83
Figure 17. hGH: log $K_{d,calc}$ vs. log $K_{d,obs}$.
Table 13. Correlation Matrices for Equations 44 and 45'
<^\ a
a
"H
"n
/
D
"i
"2
"3
0.260
0.037
0.572
0.121
0.022
0.013
0.215
0.213
0.380
0.019
0.563
0.082
0.110
0.032
0.069
0.257
0.348
0.140
0.052
0.083
0.165
0.505
0.734
0.165 "H
"n
0.288
0.082
0.119
0.129
0.578
0.730
0.540
0.690
0.356
0.414
0.289
0.552
0.533
0.668
0.355
0.377
0.167
0.378
0.693
0.031
0.129
0.539
0.232
0.637
0.087
0.116
0.373
0.147
0.369
0.486
0.239
0.371
0.267
0.414
0.244
0.333
0.869
0.417
0.470
0.803
0.493
0.239
0.120
0.543
0.098
0.490
i D
ni
"2
0.478 0.592
Note: ^Upper values are for Eq. 12, lower values for Eq. 14.
Table 14. Values of log ($k_{cat}/K_m$) (first) and log ($1/K_m$) (second) for Subtilisin BPN' Modified in Positions 156 and 166 Interacting with the Peptide Substrate Succinyl-L-Ala-L-Ala-L-Pro-L-Aax-4-Nitroanilide in 0.1 M Tris-HCl (pH 8.6) at 25 °C^a
Enzyme 156 Glu
Gin
Ser
Position 166
Glu
Substrate Gin Residue Met
Lys
Asp
ND, ND
3.02, 2.56
3.81,2.93
4.21,3.18
Glu
ND, ND 1.62,2.22
3.86, 3.28 5.02,3.97
4.48, 3.69
Asn
3.06, 2.91 3.85,3.14
4.25, 3.07
Gin
1.20,2.12
4.36, 3.64
5.54,4.52
4.10,3.15
Met
1.20,2.3
3.89,3.19
5.64, 4.83
4.70, 3.89
Ala
ND, ND
4.34, 3.55
4.90, 3.24
Gly(wt)
1.54,2.29
3.95, 3.43
5.65, 4.46 5.15,4.04
Arg
2.91,3.30
4.26, 3.50
5.32,4.22
3.19,3.06
Lys
4.09, 4.25
4.70, 3.88
6.15,4.45
4.23, 2.93
Asp
1.30,1.79 2.79, 2.98 2.04, 2.72
3.40, 3.08 4.71,4.17
5.03, 3.98 5.48, 4.32
4.41,3.22
Gly Asn
5.95, 4.86 5.97, 4.68
4.60,3.13
3.03, 2.40 3.75, 2.74
Lys
4.82, 4.66
4.51,3.76 4.64, 3.68
Asp
1.23,2.13
3.41,3.09
4.67, 3.68
3.23, 2.75 4.24, 3.07
Gly Asn
2.59, 2.92
4.38, 3.79 4.57, 3.82
5.77,4.73
3.37,2.70
1.91,2.78
5.72, 4.64
3.68, 2.80
Lys
4.21,4.40
4.84, 3.94
6.16,4.90
3.73, 2.84
Note: ^a Sets 1E, 2E, AaxP1 = Glu; Sets 1Q, 2Q, AaxP1 = Gln; Sets 1M, 2M, AaxP1 = Met; Sets 1K, 2K, XaaP1 = Lys; Sets 1, 2, 5, 6, 11, 12 include all data points; Sets 3, 4, 7, 8, Aax156 = Glu; Sets 9, 10, Aax166 = Gly or Asn. wt, wild-type. Data from ref. 8.
Table 15. Parameter Values Used in Correlations: Parameter Values for Subtilisin BPN'

Aax    σ_l     α       n_H   n_n   i    υ_1    υ_2    υ_3
Asp    0.15    0.105   1     4     1    0.52   0.50   0.32
Glu    0.07    0.151   1     4     1    0.52   0.52   0.50
Asn    0.06    0.134   2     3     0    0.52   0.50   0.32
Gln    0.05    0.180   2     3     0    0.52   0.52   0.50
Met    0.04    0.221   0     0     0    0.52   0.52   0.60
Ala   -0.01    0.046   0     0     0    0.52   0      0
Gly    0       0       0     0     0    0      0      0
Arg    0.04    0.291   4     3     1    0.52   0.52   0.52
Lys    0       0.219   2     1     1    0.52   0.52   0.52
Ser    0.11    0.062   1     2     0    0.52   0.52   0
accounted for by the internal parameter $\zeta_{156}$. This allows us to study a data set in which substitution occurs at both S156 and S166 but it does not tell us anything about the nature of the substituent effect at four different residues in peptide position 1 (P1) with the correlation equation:
The $\zeta_{156}$ values were defined from the log ($k_{cat}/K_m$) and log ($1/K_m$) values for the enzyme with Aax166 = Gly and the substrate with AaxP1 = Met. The coefficients of and statistics for the best regression equations obtained for log ($k_{cat}/K_m$) with Glu, Gln, Met, and Lys at P1, respectively (sets 1E, 1Q, 1M, and 1K), are given in Table 16, as are those for the log ($1/K_m$) values (sets 2E, 2Q, 2M, and 2K). The correlation matrices (matrices of the zero-th order partial correlation coefficients) for correlations with Eq. 46 are given in Tables 17 and 18. Significant correlations were obtained for all of these sets and the results certainly indicate trends. Due to the small number of data points in each set, however, no definitive conclusions can be drawn. We have therefore combined these data sets into larger sets. Our initial approach was to use a second internal parameter, $\zeta_{P1}$, for representing the effect of substitution in the peptide substrate. This parameter was defined from log ($k_{cat}/K_m$) and log ($1/K_m$) values for the enzyme with Xaa156 = Glu and Xaa166 = Gly. The correlation equation used is:

$$Q_X = L\sigma_{lX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + S_1\upsilon_{1X} + S_2\upsilon_{2X} + S_3\upsilon_{3X} + Z_{156}\zeta_{156} + Z_{P1}\zeta_{P1} + B^0 \qquad (47)$$
The coefficients of and statistics for the best regression equations obtained for log ($k_{cat}/K_m$) (set 1) and for log ($1/K_m$) (set 2) are given in Table 16. The data points excluded from these sets as outliers are indicated in Table 19. The correlation matrix for correlations of sets 1 and 2 with Eq. 47 is given in Table 20. It seems clear from the $C_i$ values that the effects of substitution at position 156 of the enzyme and P1 of the substrate dominate the structural effects on the values of both log ($k_{cat}/K_m$) and log ($1/K_m$); together they account for 88.7% of the effect. In an attempt to learn more about the effect of substitution at position 166 of the enzyme, values of log ($k_{cat}/K_m$) and log ($1/K_m$) for enzymes in which Xaa156 is Glu were correlated with the equation:

$$Q_X = L\sigma_{lX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + S_1\upsilon_{1X} + S_2\upsilon_{2X} + S_3\upsilon_{3X} + Z_{P1}\zeta_{P1} + B^0 \qquad (48)$$
Again, the coefficients of and statistics for the best regression equations obtained for log ($k_{cat}/K_m$) (set 3) and for log ($1/K_m$) (set 4) are reported in Table 16. The correlation matrix for correlations of sets 3 and 4 with Eq. 48 is given in Table 21; outliers are reported in Table 19. The difference between sets 3 and 4 and sets 1 and
Table 16. Results of Correlationsa Hi
SH 1
-1 7.3
5.38
0.192
-1 3.8
5.1 6
0.229 0.1 70
-1.26
1Q 1M 1K
1.34 0.81 9
-0.539
0.0849
0.352
0.091 0
-0.401
2E
-1 2.6
3.24
-0.406 0.985 0.202
0.0929 0.135 0.0677
4.06
0.933
0.241 -0.387 0.266
1.71
0.71 0
L
SL
2Q 2M 2K 1 U 5
5h2
SA
1E
Set
2 3 4 5 6 7
-3.45 -7.85 -3.10
1.05
A
H2
8 9 10
-5.31
Set
Sl
1E
5.85
1.97
1Q
1.44
0.600
1M 1K 2E
1.20 1.84
0.543 0.509
1.69
0.0624
0.396
0.1 82
-0.967 -0.21 5
0.110 0.0451
0.842
0.161
0.0758 0.0790 0.0934
-0.291 0.1 10 -0.367
0.051 9 0.0344 0.675
0.481
0.1 52
4.231
0.1 30
0.422
0.143
-0.350
0.1 06 0.452
0.193
0.275
0.0835
-0.268
0.0569 4.264
0.1 29
0.273
0.108
-0.31 6
0.0865
s3
ss3
1'
3.21
1.50
0.71 3
1.51
1.69
0.767
SSI
sz
ss2
5.01
1.20
SI
0.281
2.1 0 1.01
i
1.35
56
SZI 56
0.699
0.497
0.520
0.232
1.07 -0.848 0.581
0.311 0.326 0.254
-
L P1
%p156
(continued
Table 16. Continued Set
2Q 2M 2K
2
CD
51
SS1
1.11
0.451
1 2 3 4 5 6 7 8 9 10
1.09
0.558
Set
Hip,
52
ss2
53
553
1'
56
0.566 0.823
5z156
-0.469
0.228 0.237 0.1 76
0.727 0.429
0.327 0.201
0.331
SHl P1
-.'1
SHZPl
'P 1
SIP1
ZP1
'ZPl56
0.822 1.04 1.30
0.0628 0.0902 0.181
0.352
0.114
0.1 94
B"
SBO
1002
702
1E
-1.51
2.73
94.47
91.01
1Q 1M 1K
-1.50 -0.374 8.30
1.27 1.71 1.79
90.44 85.69 70.65
86.10 80.92 66.45
2E 24 2M
-0.1 92 1.23 -0.773
0.1 72 0.980 1.04
96.65 69.01 84.54
94.55 64.58 79.39
2K
4.79
0.775
77.36
72.14
1
-3.1 6 -1.80
1.83 0.944
77.81 74.25
76.25 72.41
2
3 4
5 co
5 6
0.230
7 8 9
0.403
10
-0.196
Set 1E
F
1Q 1M 1K 2E 24 2M 2K 3 4 5 6 7
0.1 05 0.1 52
0.0903 sest
19.93 15.77 13.18 10.43 33.61 9.648 12.03 10.25 18.40
0.394 0.230 0.320 0.340 0.230 0.263 0.267 0.1 97
7.703
0.496
42.31 27.68
8
19.45 19.47
9
82.55
10
37.22
0.774 0.623 0.403 0.638 0.378 0.469 0.332
-0.807 2.1 6
0.736 0.442 0.1 87 0.828
-0.565
0.0570
-1.16
0.1 68
5.53
-0.227 -0.665 -0.269 -0.481 -0.117
0.0354 0.0828 0.0454 0.0638 0.0476
-0.882 -0.972 -0.590 -1.51 -1.11
0.107 0.240 0.137 0.202 0.146
-2.87 5.39 4.09 5.64 4.43
9 0.333 0.403 0.470 0.620 0.259 0.637 0.489 0.566 0.620
n
14 17 17 17 14 17 17 17 32
0.782 0.484 0.530
30 63
0.51 5
33
0.552 0.359
32 24 24
0.427
62
Cot
0 0 0 0 0 0 0
0 0 25.4 0 6.59 0 26.7 nd nd
0.258 0.169 0.1 66 0.1 35
66.34
64.02
47.06 78.77 75.1 2
43.1 3 77.31 72.90
78.27
75.1 7
74.26 88.72 84.81
71.50 88.20 83.36
ca
CnH
Cnn
ci
C",
19.0 22.3 0
8.00 7.1 8 4.29 5.90 7.40 5.74 3.63 10.1 7.1 5
7.51 4.73 4.89 0 7.25 6.10 4.38 2.87 5.94
8.02 0 4.82
34.8 12.7 14.6 26.7
0
17.5 0 0 19.6 0 0 0 7.63 0 19.5 0 0
0
0
11.0
10.7
0 10.4
0
0
6.32 0
7.25 0 0 18.3 0
0 0 16.7 0 0 0
0
6.51 0
0 0
0
12.0 0
0
0
0 0
0
0
0
0
nd nd (continued)
Table 16. Continued Set
cu2
1E
0 0 0 0 37.7 0 0 0 0 0 0 0 nd nd nd
1Q 1M 1K 2E 2Q 2M 2K
N
N 0
3 4 5 6 7 8 9 10 Set
11
nd L
6.05
CU,
c(T56
CLPl
CnHPl
CnnPl
CIPl
0
22.8 25.0
nd nd nd nd nd nd nd nd
nd nd nd nd nd nd nd nd nd nd
nd nd nd nd nd nd nd nd nd nd
nd nd nd nd nd nd nd nd nd nd
22.6
46.4
6.11 25.3 16.9 24.2
23.5 37.0 37.0 75.8
8.20
78.1
28.1 0
71.4
0 0 0 0 0 0
67.4 23.9 88.2 68.0 67.4 44.5
0 0 0
32.2 0 49.6
nd nd nd nd SL
3.40
12 13
nd nd nd nd
36.2 56.5
nd nd nd nd nd nd
15
6.1 3
3.66
13.7
SA
Hl
SH 1
-1 8.5
5.63
0.999
0.220
0.220 -1 3.1
5.26
0.698 0.255 0.887 0.235
A
14 16
9.22 0 15.3 0 0
-1 4.9
5.90
H2
'nX166
nd nd nd nd nd nd nd nd nd nd nd nd nd nd 0 0
5h2
-0.863
0.1 76
0.0640
-0.264
0.0374
0.1 75
-0.553
0.0898
0.0500
-0.239
0.0344
0.234 0.0527
-0.761 -0.223
0.1 81 0.0371
1 0.27
5 0.148
Set
h,
k!
Sl
SSl
s 3
SS3
11
2.37
0.658
4.25
1.64
12 13 14
0.529 2.1 2
0.31 3 0.660
3.02
1.58
15 16
1.78
52
SS2
3.66
0.71 9
56
1'
%156
0.404
0.1 69
0.375
0.1 71
0.379
0.1 77
1.84
Hlpl
'H1 P1
0.373
0.0842
0.1 99 -0.1 62
0.0809 0.0631
0.390
0.0909
Set
H2P1
SIP1
"ii
snii
B"
SBO
1002
A1002
11
-0.660
0.0467
-1.18
0.1 26
1.38
0.180
5.39
12
-0.248
0.0326
-0.747
0.0964
1.01
0.1 31
2.55
0.1 69 0.724
88.79 79.74
86.72 77.64
13
-0.545
0.0434
-1.19
0.1 29
14 15
-0.1 61 -0.661
0.0631 0.0506
-0.750 -1.23
0.1 01 0.1 38
0.981 0.721
0.1 32 0.1 03
5.38 2.83
0.1 74 0.725
87.76 79.10
86.01 76.94
1.28
0.197
-0.245
0.0349
-0.759
0.1 03
0.91 0
0.142
5.40 2.77
0.1 80
16
87.21 76.96
84.86 74.78
Set
F
11 12
38.1 8 32.05
'HZPl
'P 1
sest
9
0.472 0.372
0.371 0.481
65 65
2.98 0
21 .o 0
0 0 3.53 0
20.5 0
n
13
43.81
0.484
14 15
30.82 32.74
0.378 0.488
0.380 0.488 0.396
65 65
16
28.95
0.379
0.511
59
59
Cd
ca
19.8 0
n'
H
6.1 5 4.21 5.96
0.752 Cnn
Ci
C"1
5.32 5.05 4.72
1.66 0 0
14.6 10.1 18.1
5.88
5.49
6.39 5.28
5.48
0 0 0
12.8 0
5.02
0
(continued)
Table 16. Continued ~
N N p3
Set
CU,
CnHPl
CnnPl
C,PI
11
0
26.1
0
2.30
4.07
7.28
12
0 25.8
42.3 0
0 1.70
4.75
14.3
13
0 0
4.65
10.2
14
0
0
47.3
3.73
3.71
17.3
15
0
26.4
0
2.81
4.76
16
0
0
46.7
0
5.51
CU3
CS156
~
8.87 17.0
un'
8.52 19.3 8.37 16.6 9.20 20.4
Note: ^a L, A, H, . . . are the regression coefficients of the best regression equation; $S_L$, $S_A$, $S_H$, . . . are their standard errors; $100R^2$ is the percent of the variation of the data accounted for by the regression equation; $A100R^2$ is the previous quantity adjusted for the number of independent variables; $S_{est}$ is the standard error of the estimate, a measure of the error to be expected in a value of the dependent variable that is estimated from the regression equation; S° is the previous quantity divided by the root mean square of the data; n is the number of data points in the set; $C_i$ represents the percent of the data accounted for by the i-th independent variable when a reference substituent is used. nd means not determined.
Table 17, Correlation Matrices for Sets 1E and 2E^ ^
1
«
^H
^n
'
^1
^2
^3
.011
.102
.863
.147
.481
.449
.094
1
.755
.253
.515
.821
.842
.960
.547
.444
.669
.675
.612
1
.274
.644
.620
.278
1
.452
.462
.480
1
.999
.882
1
.901 1
^156
.194 .180 .300 .279 .167 .155 .099 .092 .151 .140 .091 .084 .107 .099 .266 .247 1
Oj a "H
Hn
i ^1
1)2
Cl56
Note: ^Values in boldface are for set 2E only. The other values in this column are for set 1E.
Table 18. Correlation Matrices for Sets 1Q, 1M, 1K, 2Q, 2M, and 2K a
^/ 1
"H
/
^n
^1
^2
^3
.023
.120
.867
.311
.434
.517
.186
1
.769
.267
.489
.732
.821
.940
1
.478
.401
.559
.675
.626
1
.440
.575
.671
.383
1
.436
.530
.521
1
.834
.745
1
.910 1
^156
.090 .085 .192 .180 .036 .034 .000 .000 .051 .048 .168 .157 .042 .040 .168 .157 1
^1
a "H
"n
i
^2
^3
Cl56
Note: ^a Values in boldface are for sets 2Q, 2M, and 2K only. The other values in this column are for sets 1Q, 1M, and 1K.
Table 19. Outliers in Sets 1 Through 8
Set
Substrate
Enzyme 1
2
3
156 Gin
166 Lys
PI Glu
Gin Ser Glu Gin Ser
Lys Lys Lys
Lys Lys Glu
Lys Lys
Glu GluLys
Ser
Lys
Lys
Arg
Glu
Set
Enzyme
Substrate
4
156 Glu
166 Lys
PI Lys
5
Glu Glu Gin
Gin Met Lys
Met Met Lys
Gin
Lys
Glu
Glu
Lys
Glu
Gin Ser
Lys Lys
Glu Gly
Glu
Lys
Glu
6
Lys 8
Table 20, Correlation Matrices for Sets 1 and 2^
^/ 1 1
a .063 .081 1 1
"H
.147 .160 .762 .760 1 1
^
/
^/
.865 .865 .301 .315 .517 .527 1 1
.344 .369 .465 .455 .390 .383 .456 .474 1 1
.478 .491 .750 .750 .576 .574 .613 .621 .430 .427 1 1
^2
^3
.545 .562 .823 .822 .669 .667 .689 .700 .502 .498 .866 .866 1 1
.217 .236 .941 .941 .614 .610 .399 .413 .487 .477 .773 .773 .908 .908 1 1
Note: ^Values in boldface are for set 2, other values are for set 1.
^)56
.149 .211 .261 .176 .086 .022 .041 .118 .032 .073 .174 .056 .078 .021 .229 .141 1 1
Cp; .017 .057 .021 .059 .042 .009 .029 .021 .049 .131 .045 .078 .013 .034 .012 .061 .080 .104 1 1
^1
a "H
"n
1 ^1
^^2
Cl56
2 is that the residue in S156 is held constant in the former pair while it is allowed to vary in the latter. Although the results obtained for sets 3 and 4 are statistically significant, they leave much to be desired. An alternative parameterization of the structural effects at P1 of the peptide substrate was therefore considered. The effects can of course be represented by the IMF equation. For the four residues studied, steric effects are essentially constant; thus no steric parameterization is necessary. Both their electrical effects and their polarizabilities are essentially constant. It follows then that we should be able to account for their effect by means of the $n_H$, $n_n$, and $i$ parameters. The entire sets of log ($k_{cat}/K_m$) and log ($1/K_m$) values were correlated with the equation:

(49)

Coefficients of and statistics for the best regression equations obtained for the log ($k_{cat}/K_m$) values (set 5) and for the log ($1/K_m$) values (set 6) are given in Table 16. Outliers are reported in Table 19. The correlation matrix for correlations of sets 5 and 6 with Eq. 49 is given in Table 22. This parameterization was also applied to log ($k_{cat}/K_m$)
Table 21, Correlation Matrices for Sets 3 and 4^ Oj
a
1 1
.101 .122 1 1
"H
.173 .171 .676 .751
^n
.789 .796 .273 .350 .577 .570 1 1
/'
^/
^2
^3
^Pi
.280 .325 .425 .490 .470 .505 .484 .542 1 1
.374 .393 .665 .651 .410 .420 .448 .487 .313 .343 1 1
.549 .581 .824 .810 .575 .595 .614 .673 .441 .488 .714 .710 1 1
.270 .305 .916 .905 .478 .535 .362 .444 .391 .460 .651 .649 .925 .927 1 1
.100 .130 .158 .139 .170 .108 .057 .104 .048 .065 .052 .045 .069 .081 .101 .120 1 1
^Values in boldface are for set 4 only. The other values in this column are for set 3.
^1
a "H
"n
1 ^1
1)2
Cl56
and log ($1/K_m$) values for enzymes with Xaa156 = Glu. The correlation equation used is:

The coefficients of and the statistics for the best regression equations for the log ($k_{cat}/K_m$) values (set 7) and for the log ($1/K_m$) values (set 8) are set forth in Table 16. The correlation matrix for correlations of sets 7 and 8 with Eq. 50 is given in Table 23. Outliers are again reported in Table 19. The results obtained for sets 5 and 6, and in particular for sets 7 and 8, are indeed an improvement over those for sets 1, 2, 3, and 4.

Substitution at S156

Finally, to determine whether anything may be learned about the effect of substitution at position 156 of the enzyme, we have correlated log ($k_{cat}/K_m$) and log ($1/K_m$) values for enzymes with Aax166 = Gly or Asn with the equation:

$$Q_X = A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + B_{X166} n_{X166} + H_{1P1} n_{HP1} + H_{2P1} n_{nP1} + I_{P1} i_{P1} + B^0 \qquad (51)$$
in which the parameter $n_{X166}$ takes the value 1 when Xaa166 is Asn and 0 when it is Gly. In parameterizing the effect of substitution at position 156, we have noted that for the three residues studied $\upsilon_1$ is constant and $\upsilon_2$ and $\sigma_l$ nearly so. Steric effects occurring at atoms past the second atom of the side chain were assumed to be negligible. The coefficients of and the statistics for the best regression equations obtained for the log ($k_{cat}/K_m$) values (set 9) and the log ($1/K_m$) values (set 10) are reported in Table 16. The correlation matrix for correlations of sets 9 and 10 carried out with Eq. 51 is given in Table 24. On examining the data points which were excluded from the correlations as outliers, we note that of the total of 17 such outliers in sets 1 through 8, Lys is in position 166 in 14 cases and Arg in one case. Thus 15 of the 17 outliers have ionic groups attached to two or more methylene groups in the side chain. No Asp substitution at this position occurred in any outlier. Seven of the 17 outliers had a Glu residue in position P1 of the substrate while six had Lys in this position; thus 13 of the 17 outliers had ionic groups attached to two or more methylene groups in this position. In position 156 of the enzyme one-third of the 17 outliers, about six, should be Glu; seven Glu residues were in this position. There does not seem to be any preference for ionic groups in this position. It seems likely that the model is incomplete, and that an additional parameter is required to account for the interaction of ionic side chains on Glu, Lys, and Arg residues in position 166 with ionic side chains on Glu and Lys in position P1. Since interactions between opposite
Table 22. Correlation Matrices for Sets 5 and 6a (J/
1
1
a
nti
"n
.046 .063 1 1
.136 .147 .763 .762 1 1
.866 .865 .287 .301 ,507 .517 1 1
I
,321 ,344 ,475 .465 ,396 .390 .438 .456 1 1
N N
u
u1
u2
.466 .478 ,750 .750 .578 .576 .605 .613 ,434 .430 1 1
.530 .545 .824
.823 ,671 .669 ,679 .689 .509 .SO2 .867 .866 1 1
u3
h56
nHPl
nnPl
'Pl
.199 .217 .942 .941 .617 .614 .385 .399 .495 .487 .773 .773 .908 -908 1
.1 74 .178 ,031 .142 .110 .003 .150 .097 ,122 .lo3 .113 .041 .180 .037 ,108 .110 1 1
.013
,010 .043 ,012 .044 .029
.015 .033 .015 .033 .007 .004
1
Note: aValues in boldface are for set 6, other values are for set 5.
.009 .015 .010 .014 .001 .012 .003 ,004 .022
,002 .014 .009
.006 .012 .010 .161 .076 1 1
.005 .019
.014 .039 .lo2 ,035 .062 ,008 .025 .006
.046 . I 56 .loo ,440 .457 1 1
.ooo .011 ,057 .078 .039 .048 ,008 .019 ,018 .035 ,116 .077 .322 .349 .271 ,222 1 1
01
a "H
"n 1
u1 u2 u3
c156 "HPl
""PI
'Pl
Table 23. Correlation Matrices for Sets 7 and 8^a
/
^/
a
''H
^n
1 1
.092 .118 1 1
.155 .173 .706 .704
.781 .779 .293 .311 .577 .591 1 1
.270 .319 .457 .443 .503 .498 .494 .530 1 1
^1
^HPI
^nPI
'Pl
.372 .390 .656 .654 .405 .402 .452 .462 .319 .313 1 1
.015 .005 .016 .009 .015 .010 .012 .006 .013 .025 .010 .014 1 1
.068 .025 .073 .041 .068 .048 .054 .029 .058 .117 .044 .063 .452 .482 1 1
.052 .019 .056 .031 .052 .037 .042 .022 .045 .088 .034 .048 .346 .365 .228 .189 1 1
Note: ^Values in boldface are for set 8 only. The other values are for set 7.
charges will be attractive and those between like charges will be repulsive, we have defined the ionic interaction parameter, $n_{ii}$, as taking the value 1 when the interaction is between unlike charges, -1 when it is between like charges, and 0 when no interaction is present. We have correlated sets 11 and 12 with the equation:
Table 24. Correlation Matrices for Sets 9 and 10

        α     nH    nn    i     nX166 nHP1  nnP1  iP1
α       1
nH      .690  1
nn      .724  .000  1
i       .282  .500  .866  1
nX166   .000  .000  .000  .000  1
nHP1    .000  .000  .000  .000  .000  1
nnP1    .000  .000  .000  .000  .000  .381  1
iP1     .000  .000  .000  .000  .000  .302  .316  1
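The assignment rule for the ionic interaction parameter defined in the text can be sketched in a few lines. This is only an illustration: the function name `ionic_interaction` and the charge table are hypothetical, and the sketch assumes that Lys and Arg carry a positive charge, that Glu carries a negative charge, and that Asp is treated as non-interacting, as the text concludes for position 166.

```python
# Charges of the ionizable side chains considered in the text.
# Assumption: Lys and Arg are cationic, Glu is anionic. Asp is deliberately
# left out because the text concludes it does not take part in the interaction.
SIDE_CHAIN_CHARGE = {"Lys": +1, "Arg": +1, "Glu": -1}


def ionic_interaction(res166, res_p1):
    """Return 1 for unlike charges, -1 for like charges, 0 for no interaction."""
    q166 = SIDE_CHAIN_CHARGE.get(res166, 0)
    q_p1 = SIDE_CHAIN_CHARGE.get(res_p1, 0)
    if q166 == 0 or q_p1 == 0:
        return 0  # at least one side chain carries no interacting ionic group
    return 1 if q166 != q_p1 else -1


if __name__ == "__main__":
    for pair in [("Lys", "Glu"), ("Glu", "Glu"), ("Asp", "Lys"), ("Gly", "Lys")]:
        print(pair, ionic_interaction(*pair))
```

Sets 13 and 14 of the text correspond to a different table, in which Asp would be given a charge of -1 so that Asp166-Lys(P1) yields 1 and Asp166-Glu(P1) yields -1.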
The coefficients of and statistics for the best regression equations obtained for sets 11 and 12 are given in Table 16. The correlation matrix for correlations of sets 11 and 13 with Eq. 52 is given in Table 25 and that for sets 12 and 14 in Table 26. Plots of $\log (k_{cat}/K_M)_{calc}$ against $\log (k_{cat}/K_M)_{obs}$ and of $\log (1/K_M)_{calc}$ against $\log (1/K_M)_{obs}$ are shown in Figures 18 and 19, respectively. The results obtained are a dramatic improvement over our earlier attempts, particularly in view of the fact that no data points were excluded from these sets. Though these results are excellent they do not prove that the interionic interaction requires a side chain with the structure $(CH_2)_nI$, where $I$ is an ionizable group and $n$ is greater than two. In order to provide further evidence on the validity of the conclusion that the Asp side chain in position 166 is not involved in interionic interactions with side chains in P1, the correlations with Eq. 16 were again carried out after assigning $n_{ii}$ values of 1 to Asp166-Lys(P1) combinations and -1 to Asp166-Glu(P1) combinations. Again, the coefficients of and the statistics for the best regression equations obtained for $\log (k_{cat}/K_M)$ (set 13) and for $\log (1/K_M)$ (set 14) are set forth in Table 15. The correlation matrix for sets 11 and 13 differs from that for sets 12 and 14 only for the zeroth order partial correlation coefficients of other variables with $\zeta_{156}$ and $n_{ii}$. These matrices are reported in Tables 25 and 26. Although the difference is small the results for sets 11 and 12 are indeed better than those for sets 13 and 14. Taken together with the fact that Asp166 does not appear as an outlier in any correlation, it seems that this residue probably does not interact significantly with ionic side chains in P1 of the substrate.

Validity of the Model
In order to provide a further test of the model we have excluded six data points from sets 11 and 12, giving sets 15 and 16, respectively. The data points excluded were chosen to provide a wide range of side chain structure at positions S156, S166, and P1. The results of the correlations for the best regression equations are given in Table 15. The coefficients of the regression equations for sets 15 and 16 are in very good agreement with those for sets 11 and 12. The major differences are that the borderline dependence on $i$ observed in set 11 has disappeared in set 15, while the borderline dependence on $\upsilon_1$ in set 12 has disappeared in set 16. We believe that in both cases this is due to the smaller number of degrees of freedom in sets 15 and 16. These results strongly support the validity of the model. It is also of interest to determine how well the model can predict new values of $\log (k_{cat}/K_M)$ and $\log (1/K_M)$. Calculated values of these quantities obtained from sets 15 and 16, together with the differences $\Delta$ between the observed and calculated values, are reported in Table 27. The results for the data points not included in the correlations are given in boldface. The results show that it is possible to make
Table 25. Correlation Matrices for Sets 11 and 13 (a)

        σl    α     nH    nn    i     υ1    υ2    υ3    ζ156  nHP1  nnP1  iP1   ni, nii
σl      1
α       .015  1
nH      .115  .766  1
nn      .867  .262  .489  1
i       .278  .493  .408  .407  1
υ1      .444  .751  .581  .590  .441  1
υ2      .503  .825  .675  .660  .515  .868  1
υ3      .166  .944  .623  .360  .511  .774  .908  1
ζ156    .164  .038  .114  .145  .129  .108  .184  .114  1
nHP1    .005  .005  .008  .006  .006  .006  .034  .004  .162  1
nnP1    .021  .024  .036  .026  .024  .028  .015  .017  .156  .415  1
iP1     .016  .018  .028  .020  .019  .022  .011  .013  .122  .323  .274  1
ni      .014  .008  .012  .053  .045  .020  .024  .034  .009  .159  .295  .043  1
nii     .072  .005  .020  .085  .072  .032  .036  .025  .014  .021  .089  .070  1

Note: (a) Values in the row headed ni are for set 13, those in the row headed nii are for set 11; all other values are for both sets.
Table 26. Correlation Matrices for Sets 12 and 14 (a)

        σl    α     nH    nn    i     υ1    υ2    υ3    ζ156  nHP1  nnP1  iP1   ni, nii
σl      1
α       .015  1
nH      .115  .766  1
nn      .867  .262  .489  1
i       .278  .493  .408  .407  1
υ1      .444  .751  .581  .590  .441  1
υ2      .503  .825  .675  .660  .515  .868  1
υ3      .166  .944  .623  .360  .511  .774  .908  1
ζ156    .153  .117  .008  .083  .116  .032  .046  .089  1
nHP1    .005  .005  .008  .006  .006  .006  .034  .004  .068  1
nnP1    .021  .024  .036  .026  .024  .028  .015  .017  .116  .415  1
iP1     .016  .018  .028  .020  .019  .022  .011  .013  .091  .324  .274  1
ni      .014  .008  .012  .053  .045  .020  .024  .034  .033  .159  .295  .043  1
nii     .072  .005  .020  .085  .072  .032  .036  .025  .053  .021  .089  .070  1

Note: (a) Values in the row headed ni are for set 14, those in the row headed nii are for set 12; all other values are for both sets.
Figure 18. log (k_cat/K_M)calc vs. log (k_cat/K_M)obs, set 11.
reasonable predictions of $\log (k_{cat}/K_M)$ and $\log (1/K_M)$ from the regression equations for sets 11 and 12.

The Effect of Substitution at Position 166

The predominant structural effects on $\log (k_{cat}/K_M)$ values resulting from substitution at position 166, based on the results for set 11, are due to polarizability and to steric effects resulting from the first and third segments of the side chain. These effects account for 21.0 and 40.7%, respectively, of the overall structural effect. Hydrogen bonding accounts for another 11.5%. There is a borderline dependence on $\sigma_l$ and on $i$. The results obtained for substitution at this position contribute most
Figure 19. log (1/K_M)calc vs. log (1/K_M)obs, set 12.

Table 27. Values of Q_calc and Δ (a)
S156 Glu
" II II
Gin Ser Glu
" " Gin Ser Gin Ser Glu
"
S166
PI
log^KtA
Asp Glu Asn Gin Asp
Lys
3.68 4.34 4.25 3.97 3.68 3.68 4.19 4.47 4.28 4.28 4.28 4.25 4.25 2.96 3.70 3.70 3.70 4.79 4.17 5.36 5.27 4.79 4.79 5.30 5.58 5.39 5.39 5.39 5.36 5.36 5.36 5.98 6.09 6.09 3.60 2.97 4.17 3.89 3.60 3.49 4.11 4.38 4.19 4.19
II
Met Ala Gly
" M
Asn
" Arg Lys
Gin
"
Ser Glu
II
II II II
Gin Ser Glu II II
Gin Ser Gin Ser Glu
" Gin Ser Glu II II II
Gin Ser Glu
Asp Glu Asn Gin Asp
Met
II
Met Ala Gly
" II
Asn
" Arg Lys II
'• Asp Glu Asn Gin Asp
"
'• "
Met Ala Gly
Gin
"
Gin
A 0.53 0.14 0.00 0.13 0.73 0.56 0.51 0.43 0.32 1.25 0.91 0.50 0.57 0.23 0.53 0.47 0.03 0.98 0.31 0.34 0.27 0.24 0.12 0.34 0.07 0.24 0.09 0.38 0.59 0.36 0.04 0.17 0.12 0.07 0.58 0.09 0.32 0.47 0.20 0.08 0.22 0.04 0.24 0.52
logK^ 2.64 3.55 3.09 3.09 2.74 2.90 3.29 3.24 3.29 3.40 3.56 3.20 3.36 2.65 2.63 2.74 2.89 3.64 3.64 4.10 4.75 3.75 3.90 4.30 4.30 4.30 4.30 4.30 4.20 4.36 4.57 4.97 4.65 4.81 2.90 2.90 3.36 3.36 3.01 3.56 3.56 3.56 3.56 3.67
A 0.54 0.14 0.02 0.06 0.48 0.17 0.60 0.48 0.16 1.00 0.86 0.46 0.56 0.41 0.30 0.01 0.05 0.71 0.36 0.13 0.23 0.23 0.22 0.53 0.16 0.26 0.02 0.43 0.66 0.28 0.35 0.52 0.03 0.09 0.34 0.05 0.22 0.28 0.07 0.47 0.37 0.01 0.13 0.50 {continued)
Table 27. Continued
S156 Ser Gin Ser Glu
" Gin Ser Glu II
Gin Ser Glu
" Gin Ser Gin Ser Glu II
Gin Ser Note:
SI66
P1
" Asn
" Arg Lys II II
Asn Gin Asp
" Met Gly
" " Asn
" Arg Lys
" "
Glu
logk^jK^ 4.19 4.17 4.17 4.16 4.90 4.90 4.90 1.90 1.63 1.21 1.33 1.85 1.93 1.93 1.93 1.90 1.90 3.19 3.92 3.77 3.92
A 0.19 0.34 0.40 0.10 0.20 0.26 0.06 0.28 0.43 0.09 0.10 0.65 0.39 0.86 0.66 0.14 0.01 0.28 0.17 1.05 0.29
log K^ 3.82 3.47 3.62 3.83 3.81 3.92 4.07 2.36 2.36 2.45 2.16 2.56 2.56 2.67 2.82 2.47 2.62 3.74 3.72 4.26 3.98
A 0.03 0.29 0.20 0.33 0.07 0.24 0.13 0.14 0.24 0.66 0.03 0.26 0.27 0.31 0.10 0.25 0.16 0.44 0.53 0.40 0.42
Note: (a) Values in boldface are for data points which were excluded from sets 15 and 16.
to the overall effect. The results obtained for the $\log (1/K_M)$ values in set 12 show the largest dependence on $n_{ii}$, with substitution at position 166 and at P1 having about the same magnitude. Sets in which substitution at P1 is constant (sets 2E, 2Q, 2M, 2K) show a dependence on hydrogen bonding, as does set 6; this accounts for about 9.3% of the overall effect. There is a borderline dependence on steric effects at the first side chain segment.

The Effect of Substitution at Position 156
There is certainly a dependence on substitution at position 156 for $\log (1/K_M)$; there may be a dependence for $\log (k_{cat}/K_M)$ as well. Sets 9 and 10 suggest, however, that there is no dependence on polarizability, hydrogen bonding, or ionic side chains at this position. This may be due to an error in the assumption that steric effects at the second and third segments of the side chain are negligible, an error in the assumption of a constant electrical effect, or both. As the study of structural effects at position 156 involves only three residues no conclusion can be reached.
The Effect of Substitution at P1

Structural effects resulting from substitution at P1 in the substrate have an important effect on both $\log (k_{cat}/K_M)$ and $\log (1/K_M)$. The correlations through set 12 show that $n_H$ is significant while $i$ is the major variable, accounting for well over 50% of the structural effect. In the case of $\log (k_{cat}/K_M)$ there may also be a significant dependence on $n_n$. Due to the small number of residues studied these results must be considered at best semiquantitative.

Salt Bridge Formation

What is most striking about these results is the important contribution of ion-ion interactions between Lys, Arg, and Glu side chains in position 166 and in P1. Asp side chains in this position and Glu side chains in position 156 both seem to have little or no effect.

E. Hirudin

Values of the inhibition constant $K_i$ for the inhibition of thrombin by substituted recombinant hirudins (r-hir) in which Val1 and/or Val2 were replaced by other residues were determined by Wallace and coworkers and are reported in Table 28. They have been correlated with the equation,

$$Q_X = L\,\Sigma\sigma_l^{\Delta} + A\,\Sigma\alpha^{\Delta} + H_1\,\Sigma n_H^{\Delta} + H_2\,\Sigma n_n^{\Delta} + I\,\Sigma i^{\Delta} + S\,\Sigma\upsilon^{\Delta} + B^o \tag{53}$$

where the superscript $\Delta$ indicates that the value of the independent variable is the difference between the value for the side chain X and the value for the side chain of Val, the residue in that position in the wild-type. Thus:

$$v^{\Delta} = v_{X^f} - v_{X^i} \tag{42}$$

where $v$ is an independent variable, $X^f$ designates the side chain of the residue Aax in the substituted protein, and $X^i$ that of the side chain in the wild-type or unsubstituted protein. The sum of the variables for the residues at positions 1 and 2 was used as the parameter. Had the substitution at positions 1 and 2 been parameterized separately the number of data points would have been insufficient to permit any
Table 28. K_i Values for the Inhibition of Thrombin by Hirudin Modified at the N-Terminal Positions (a)

Set PRB21. Xaa1, Xaa2, K_i: Val, Val (wt), 0.231; Ile, Ile, 0.099; Phe, Phe, 0.238; Leu, Leu, 9.91; Ser, Ser, 175; Lys, Lys, 152; Gly, Gly, 694; Glu, Glu, 57000; Leu, Val, 0.235; Val, Leu, 10.3; Glu, Val, 295; Val, Glu, 248.

Note: (a) Data from ref. 9.
Table 29. Parameter Values for Recombinant Hirudins

Xaa1,Xaa2      Σσl^Δ    Σα^Δ     ΣnH^Δ   Σnn^Δ   Σi^Δ   Συ^Δ
Val,Val (a)    0        0        0       0       0      0
Ile,Ile        -0.04    0.092    0       0       0      0.52
Phe,Phe        0.04     0.300    0       0       0      -0.12
Leu,Leu        -0.04    0.092    0       0       0      0.44
Ser,Ser        0.20     -0.156   2       4       0      -0.46
Lys,Lys        -0.02    0.158    4       2       2      -0.16
Gly,Gly        -0.02    -0.280   0       0       0      -1.52
Glu,Glu        0.12     0.022    2       8       2      -0.16
Leu,Val        -0.02    0.046    0       0       0      0.22
Val,Leu        -0.02    0.046    0       0       0      0.22
Glu,Val        0.06     0.011    1       4       1      -0.76
Val,Glu        0.06     0.011    1       4       1      -0.76

Note: (a) Recombinant hirudin.
analysis. The parameter values used in the correlations are reported in Table 29. The best regression equation obtained is:

$$\log K_i = 0.520(\pm 0.100)\,\Sigma n_n^{\Delta} - 1.38(\pm 0.500)\,\Sigma\upsilon^{\Delta} + 0.249(\pm 0.310) \tag{54}$$

100R², 81.14; Adj. 100R², 79.25; F, 19.36; S_est, 0.866; S°, 0.501; n, 12; C_Σnn, 86.2; C_Συ, 13.8. The correlation matrix for Eq. 53 is reported in Table 30. Figure 20 shows a plot of log K_i,calc against log K_i,obs.
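As a rough check on how Eq. 54 is applied, the sketch below evaluates it for four of the hirudins, using the summed lone-pair and steric parameter differences as they are read from Table 29 and the observed $K_i$ values from Table 28. The function and variable names are illustrative only, and the standard errors of the coefficients are ignored.

```python
import math

# Coefficients of Eq. 54 as printed (standard errors omitted).
COEF_NN = 0.520       # coefficient of the summed lone-pair parameter difference
COEF_STERIC = -1.38   # coefficient of the summed steric parameter difference
INTERCEPT = 0.249


def log_ki_calc(sum_nn_delta, sum_steric_delta):
    """Calculated log K_i from Eq. 54."""
    return COEF_NN * sum_nn_delta + COEF_STERIC * sum_steric_delta + INTERCEPT


# name: (sum of n_n differences, sum of steric differences, observed K_i)
# Values as read from Tables 28 and 29.
EXAMPLES = {
    "Val,Val (wt)": (0, 0.00, 0.231),
    "Ser,Ser": (4, -0.46, 175.0),
    "Gly,Gly": (0, -1.52, 694.0),
    "Glu,Glu": (8, -0.16, 57000.0),
}


def residuals():
    """Return {name: (calculated, observed, observed - calculated)} in log units."""
    out = {}
    for name, (nn, steric, ki) in EXAMPLES.items():
        calc = log_ki_calc(nn, steric)
        obs = math.log10(ki)
        out[name] = (calc, obs, obs - calc)
    return out


if __name__ == "__main__":
    for name, (calc, obs, diff) in residuals().items():
        print(f"{name:14s} calc={calc:6.3f} obs={obs:6.3f} diff={diff:6.3f}")
```

The differences printed by the sketch are of the same order as the reported standard error of the estimate, which is the most that can be expected from a two-parameter equation.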
The structural effect of substitution at positions 1 and 2 of hirudin is almost entirely due to the hydrogen bonding parameter $n_n$, though steric effects make a significant contribution. It must be noted however that there is significant collinearity between $\Sigma n_n$ and both $\Sigma i$ and $\Sigma\sigma_l$. We therefore cannot exclude the possibility that there are significant contributions from dipole-dipole interactions and ion-dipole interactions as well as hydrogen bonding interactions in which the hirudin residue supplies the lone pair. As there is collinearity between $\Sigma\alpha$ and $\Sigma\upsilon$ we cannot exclude the possibility of a small contribution from polarizability. It cannot be large because the steric effect term accounts for only about a seventh of the overall substituent effect. This result is in accord with the conclusion of Wallace et al. that replacement of the two N-terminal amino acids in r-hir by polar amino acids resulted in an increase in the inhibition constant.

Table 30. Correlation Matrix for Equation 53

        Σσl    Σα     ΣnH    Σnn    Σi     Συ
Σσl     1      0.254  0.395  0.758  0.237  0.232
Σα             1      0.043  0.165  0.157  0.677
ΣnH                   1      0.593  0.809  0.144
Σnn                          1      0.739  0.126
Σi                                  1      0.045
Συ                                         1

Figure 20. log K_i,calc vs. log K_i,obs (hirudin).

F. L. casei Thymidylate Synthase
Climie et al. have reported $k_{cat}$ values for the conversion of deoxyuridine monophosphate (dUMP) to thymidylate monophosphate (TMP) with 5,10-methylene tetrahydrofolate (CH2H4folate) as the reagent, catalyzed by mutants of L. casei thymidylate synthase in which Val316, the C-terminal residue, is substituted. Also reported were $K_m$ values for the interaction of dUMP and CH2H4folate with the mutants. These values are given in Table 31. They were correlated with the IMF equation in the form,

$$Q_X = L\sigma_{lX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + S\upsilon_X + B_1 n_{1X} + B_2 n_{2X} + B_3 n_{3X} + B^o \tag{55}$$

which uses the composite parameterization of the steric effect. The best regression equation obtained for $k_{cat}$ is:

$$\log k_{cat,X} = -1.85(\pm 0.848)\alpha_X - 0.287(\pm 0.0635) n_{HX} + 2.96(\pm 0.396)\upsilon_X - 0.461(\pm 0.160) n_{2X} - 1.46(\pm 0.234) \tag{56}$$

100R², 88.16; Adj. 100R², 85.62; F, 24.20; S_est, 0.264; S°, 0.405; n, 18. r_ij: α, υ, 0.486; α, n2, 0.622; υ, n2, 0.605. C_α, 13.1; C_nH, 8.85; C_υ, 63.8; C_n2, 14.2. A plot of log k_cat,calc against log k_cat,obs is given in Figure 21. The steric effect of the side chain in position 316 seems to be the major factor in determining the activity of a mutant. This may involve the ease of formation of the final ternary complex. The dependence on polarizability is in accord with binding involving ii (dispersion) interactions between the mutant side chain and the β and γ carbon atoms of the Thr residue with which it is in contact. Correlation of $K_m$ values for CH2H4folate with Eq. 55 gave as the best regression equation:

$$\log K_{m,\mathrm{CH_2H_4folate}} = 0.155(\pm 0.0463) n_n - 0.516(\pm 0.119) n_1 + 2.47(\pm 0.147) \tag{57}$$

100R², 65.41; Adj. 100R², 63.25; F, 14.18; S_est, 0.263; S°, 0.644; n, 18. C_nn, 23.2; C_n1, 76.8. Plots of log K_m,calc against log K_m,obs are shown in Figures 22 and 23. Although the fit is poor the F test shows that the results are significant at the 99.9% confidence level. Again, the effect of the mutant side chain is largely steric, with some contribution from hydrogen bonding. There is no dependence on polarizability, however. Correlation of $K_m$ values for dUMP with Eq. 55 gave as the best regression:

$$\log K_{m,\mathrm{dUMP}} = -1.56(\pm 0.562)\alpha - 0.233(\pm 0.0666) i - 1.14(\pm 0.249)\upsilon + 0.403(\pm 0.0735) n_1 + 0.260(\pm 0.0739) n_2 + 0.222(\pm 0.0992) n_3 + 0.694(\pm 0.0902) \tag{58}$$

100R², 81.22; Adj. 100R², 73.39; F, 7.927; S_est, 0.0951; S°, 0.554; n, 18. r_ij: α, υ, 0.486; α, n2, 0.622; α, n3, 0.801; υ, n1, 0.669; υ, n2, 0.605; n2, n3, 0.487. C_α, 15.7; C_i, 10.2; C_υ, 35.2; C_n1, 17.7; C_n2, 11.4; C_n3, 9.76.

Figure 21. log k_cat,calc vs. log k_cat,obs (L. casei thymidylate synthase).

Figure 22. log K_m,calc vs. log K_m,obs, CH2H4folate (L. casei thymidylate synthase).

Figure 23. log K_m,calc vs. log K_m,obs, dUMP (L. casei thymidylate synthase).
Table 31. Values of k_cat and K_m for L. casei Thymidylate Synthase

Aax, k_cat (s^-1), K_m(CH2H4folate) (μM), K_m(dUMP) (μM): Val (wt), 5.5, 14, 2.9; Ile, 3.8, 35, 2.2; Leu, 1.3, 84, 1.7; Phe, 1.3, 65, 2.2; Thr, 1.2, 140, 3.5; Cys, 1.1, 77, 1.6; Ala, 0.81, 370, 1.2; Met, 0.65, 120, 2.5; His, 0.55, 50, 1.6; Ser, 0.54, 180, 1.7; Asn, 0.39, 170, 1.4; Gln, 0.32, 280, 3.1; Tyr, 0.29, 170, 2.4; Glu, 0.15, 830, 2.5; Lys, 0.12, 85, 1.2; Trp, 0.050, 300, 1.5; Arg, 0.020, 130, 1.5; Gly, 0.030, 380, 5.6.
The effect of the mutant side chain is once more primarily steric, with an important contribution from polarizability. In view of the small range of the side chain effect the fit of the model is surprisingly good.

G. T. thermophilus Glutamyl-tRNA Synthase
Nurek and coworkers have reported $K_m$ values for the interaction of T. thermophilus glutamyl-tRNA synthase with tRNA^Glu, Glu, and ATP (sets tRNA, G, and ATP, respectively). Also reported were $k_{cat}$ values. The data are presented in Table 32. They were correlated with the equation,

$$Q_X = L\sigma_l^{\Delta} + A\alpha^{\Delta} + H_1 n_H^{\Delta} + H_2 n_n^{\Delta} + I i^{\Delta} + S_1\upsilon_1^{\Delta} + S_2\upsilon_2^{\Delta} + S_3\upsilon_3^{\Delta} + B^o \tag{59}$$

which uses the segmental steric effect parameterization. Zeroth order partial correlation coefficients are given with the other statistics beneath the regression equations. The best regression equations for the $K_m$ values are, for tRNA:

$$\log K_m = -1.04(\pm 0.211) i^{\Delta} - 1.16(\pm 0.496)\upsilon_2^{\Delta} + 0.614(\pm 0.177) \tag{60}$$

100R², 73.89; A100R², 71.52; F, 14.15; S_est, 0.378; S°, 0.583; n, 13; C_i, 61.2; C_υ2, 38.8; ra'.a, 0.704; ra\n". 0.910; r^^xy^, 0.540; ray, 0.523; ra^, 0.500; rn",i, 0.803; r„v» 0.665; ruV» 0.569.

For G:

$$\log K_m = -0.495(\pm 0.154) n_H^{\Delta} - 0.157(\pm 0.0597) n_n^{\Delta} + 1.10(\pm 0.292) i^{\Delta} + 1.50(\pm 0.566)\upsilon_1^{\Delta} - 1.95(\pm 0.507)\upsilon_2^{\Delta} + 1.839(\pm 0.131) \tag{61}$$

100R², 80.85; A100R², 71.27; F, 5.910; S_est, 0.212; S°, 0.596; n, 13; C_nH, 13.3; C_nn, 4.22; C_i, 29.6; C_υ1, 23.0; C_υ2, 29.8. The r_ij values for Eq. 61 are the same as those of Eq. 60.

And for ATP on the exclusion of the data point R358Q:

$$\log K_m = -2.42(\pm 0.448)\alpha^{\Delta} + 0.157(\pm 0.0598) i^{\Delta} - 0.462(\pm 0.145)\upsilon^{\Delta} + 1.467(\pm 0.0691) \tag{62}$$

100R², 82.09; A100R², 78.11; F, 12.22; S_est, 0.103; S°, 0.518; n, 12; C_α, 57.0; C_i, 16.1; C_υ, 26.9; ra»,a, 0.703; ra\n^ 0.910; rc\^,^ 0.537; ra,n^ 0.521; r a ^ , 0.520; rn",i, 0.788; rnW, 0.668; rt^V^ 0.552.

Plots of log K_m,calc against log K_m,obs are given in Figures 24-26. Steric effects and ionic interactions are present in all three data sets.

Figure 24. log K_m,calc vs. log K_m,obs, tRNA (T. thermophilus glutamyl-tRNA synthase).

Figure 25. log K_m,calc vs. log K_m,obs, Glu (T. thermophilus glutamyl-tRNA synthase).

Figure 26. log K_m,calc vs. log K_m,obs, ATP (T. thermophilus glutamyl-tRNA synthase).
Correlation of the $k_{cat}$ values with Eq. 59 gave the best results on the exclusion of S276A and S299A. The regression equation is:

$$\log k_{cat} = 2.00(\pm 0.533)\alpha^{\Delta} - 1.80(\pm 0.604)\upsilon_2^{\Delta} + 2.23(\pm 0.597)\upsilon_3^{\Delta} + 0.426(\pm 0.111) \tag{63}$$

100R², 89.27; A100R², 86.59; F, 19.41; S_est, 0.284; S°, 0.411; n, 11; C_α, 16.7; C_υ2, 37.2; C_υ3, 46.1; ra\a, 0.735; ra\/, 0.695; ra,/, 0.553; r„v» 0.524; rn"j, 0.782; r„v, 0.595; rnV» 0.639; rx,v, 0.739. Unlike the correlations with $K_m$ there is no dependence on ionic interactions; like the $K_m$ correlations there is a dependence on steric effects. A plot of log k_cat,calc against log k_cat,obs is given in Figure 27.

H. Rat Trypsin

Corey and Craik have reported $K_m$ and $k_{cat}$ values for the hydrolysis of Z-GlyProArg-(7-amino-4-methylcoumarin) by rat trypsins substituted at positions 57, 102, and 195 at pH 8.0 and pH 10.1. Their data are reported in Table 33. It was assumed that at pH 10.1 the ionization of His was suppressed. Thus, the values of $i$ for His are 1 at pH 8.0 and 0 at pH 10.1. As the substitutions at positions 102 and 195 were invariably D102N and S195A they are represented by the indicator variables $n_{102}$ and $n_{195}$, which take the value 1 when substitution has occurred and 0 when it has not. The correlation equation used has the form:

$$Q_X = L\sigma_l^{\Delta} + A\alpha^{\Delta} + H_1 n_H^{\Delta} + H_2 n_n^{\Delta} + I i^{\Delta} + S\upsilon^{\Delta} + B_{102} n_{102} + B_{195} n_{195} + B^o \tag{64}$$
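The indicator variables for positions 102 and 195 can be read directly off the mutant labels used in Table 33. The following is a minimal sketch of that bookkeeping; the function name and the label-splitting convention are assumptions made for illustration, not part of the original treatment.

```python
def indicator_variables(mutant):
    """Return (n102, n195) for a label such as 'H57A/D102N/S195A' or 'wt'.

    Assumes substitutions in a label are separated by '/', as in Table 33,
    and that positions 102 and 195 are only ever substituted as D102N and S195A.
    """
    substitutions = set(mutant.split("/"))
    n102 = 1 if "D102N" in substitutions else 0
    n195 = 1 if "S195A" in substitutions else 0
    return n102, n195


if __name__ == "__main__":
    for label in ["wt", "H57A", "D102N", "H57K/D102N", "H57A/D102N/S195A", "S195A"]:
        print(f"{label:18s} n102, n195 = {indicator_variables(label)}")
```

Substitution at position 57, in contrast, is described by the difference parameters of Eq. 64 rather than by an indicator variable, so it does not appear in the returned pair.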
Figure 27. log k_cat,calc vs. log k_cat,obs (T. thermophilus glutamyl-tRNA synthase).
Values of r_ij significant at the 90% confidence level or greater are given below the regression equations. The best regression obtained for $k_{cat}$ at pH 8.0 is:

$$\log k_{cat} = -5.31(\pm 2.18)\sigma_l^{\Delta} - 0.618(\pm 0.121) n_H^{\Delta} + 1.94(\pm 0.365) i^{\Delta} - 0.738(\pm 0.221) n_{195} - 0.0998(\pm 0.126) \tag{65}$$

100R², 84.73; A100R², 79.64; F, 11.10; S_est, 0.281; S°, 0.498; n, 13; C_σl, 12.1; C_nH, 16.3; C_i, 51.4; C_n195, 19.6; rcj\n^ 0.827; rc\i, 0.679; ra,n, 0.706; ra,/, 0.568; rn'',/, 0.713; r„v, 0.723.

and at pH 10.1 is:

$$\log k_{cat} = 9.71(\pm 4.44)\sigma_l^{\Delta} + 27.0(\pm 6.22)\alpha^{\Delta} - 1.35(\pm 0.404) n_H^{\Delta} - 7.16(\pm 2.44)\upsilon^{\Delta} + 2.714(\pm 0.558) \tag{66}$$

100R², 71.74; A100R², 62.33; F, 5.078; S_est, 0.857; S°, 0.678; n, 13; C_σl, 6.08; C_α, 34.6; C_nH, 9.42; C_υ, 49.9; rcy\n% 0.830; ra',/, 0.759; ra,m 0.689; ra,/, 0.545; r „ v , 0.518; r,//,/, 0.695; rnn,/, 0.764; rn%•^^ 0.529.

Table 32. Values of K_m and k_cat for T. thermophilus Glutamyl-tRNA Synthase

X^i pos X^f, K_m(tRNA^Glu) (μM), K_m(Glu) (μM), K_m(ATP) (μM), k_cat (s^-1): wt, 2.73, 12.0, 23.0, 2.39; D160A, 172.4, 81.5, 41.7, 0.659; S276A, 24.7, 12.9, 46.1, 0.945; E282A, 422.4, 166, 72.3, 1.06; S299A, 2.70, 12.7, 58.1, 0.00727; L300S, 6.10, 28.6, 77.5, 1.36; W312Y, 21.0, 8.00, 65.4, 1.87; W312C, 3.43, 131, 132, 0.0312; R317Q, 40.7, 83.8, 36.2, 3.13; R349Q, 59.1, 53.3, 27.5, 1.28; R350Q, 21.5, 32.1, 53.1, 0.957; R358Q, 27.5, 103, 112, 3.03; R426Q, 55.0, 45.2, 39.8, 2.76.

wt, wild-type.

Figure 28. log k_cat,calc vs. log k_cat,obs, pH 8 (rat trypsin).

Figure 29. log k_cat,calc vs. log k_cat,obs, pH 10 (rat trypsin).

Plots of log k_cat,calc against log k_cat,obs are shown in Figures 28 and 29. It is clear that the results at pH 8.0 are very different from those at pH 10.1. There is no dependence on either polarizability or steric effects at pH 8.0; at pH 10.1 they represent more than 80% of the overall structural effect. Correlation of the $K_m$ values with Eq. 64 gave at pH 8.0 the regression equation:
$$\log K_m = 0.132(\pm 0.0503) n_H^{\Delta} + 0.529(\pm 0.114) n_{102} + 1.331(\pm 0.0691) \tag{67}$$

100R², 71.74; A100R², 62.33; F, 5.078; S_est, 0.857; S°, 0.678; n, 13; C_σl, 6.08; C_α, 34.6; C_nH, 9.42; C_υ, 49.9; rcj\n% 0.827; ra,n, 0.706; r^",/, 0.695; rn",u 0.794.

At pH 10.1:

$$\log K_m = -3.80(\pm 1.31)\sigma_l^{\Delta} - 5.05(\pm 1.69)\alpha^{\Delta} + 0.450(\pm 0.117) n_H^{\Delta} + 1.67(\pm 0.710)\upsilon^{\Delta} + 0.619(\pm 0.149) n_{102} + 0.929(\pm 0.177) \tag{68}$$

100R², 78.58; A100R², 69.06; F, 5.870; S_est, 0.265; S°, 0.612; n, 14; C_σl, 8.53; C_α, 23.2; C_nH, 11.2; C_υ, 41.6; C_n102, 15.4; ra\n", 0.797; ra,n, 0.686; rn% 0.675; r;,",/, 0.787.

Exclusion of the data point for D102N gives much improved results:

$$\log K_m = -2.92(\pm 1.01)\sigma_l^{\Delta} - 3.18(\pm 1.41)\alpha^{\Delta} + 0.353(\pm 0.117) n_H^{\Delta} + 1.13(\pm 0.0555)\upsilon^{\Delta} + 0.699(\pm 0.113) n_{102} + 1.097(\pm 0.143) \tag{69}$$

100R², 88.08; A100R², 82.13; F, 10.35; S_est, 0.194; S°, 0.470; n, 13; C_σl, 8.69; C_α, 19.3; C_nH, 11.7; C_υ, 37.2; C_n102, 23.1. The r_ij values are the same as those for Eq. 67.

Plots of log K_m,calc against log K_m,obs are shown in Figures 30-32. As the coefficients of Eq. 69 are not significantly different from those of Eq. 68 but the fit is much improved, the data point D102N is an outlier. Though $K_m$ at both pH 8.0 and pH 10.1 is a function of $n_H$ and $n_{102}$, at the higher pH it is highly dependent on polarizability and steric effects.

Figure 30. log K_m,calc vs. log K_m,obs, pH 8.0 (rat trypsin).

Figure 31. log K_m,calc vs. log K_m,obs, pH 10.1 (rat trypsin).

Figure 32. log K_m,calc vs. log K_m,obs, pH 10.1 (rat trypsin).
Table 33. Values of k_cat and K_m for Rat Trypsin

X^i pos X^f, k_cat (min^-1) (pH 8.0), K_m (μM) (pH 8.0), k_cat (min^-1) (pH 10.1), K_m (μM) (pH 10.1): wt, 3200, 18, 2700, 19; H57A, 0.054, 17, 0.11, 20; H57L, 0.075, 20, 0.16, 21; H57D, 0.78, 13, 0.71, 17; H57E, 0.69, 21, 0.63, 25; H57K, 0.83, 41, 5.2, 48; H57R, 0.017, 67, 0.65, 160; D102N, 1.3, 4.2, 140, 13; H57A/D102N, 0.17, 87, 7.5, 130; H57D/D102N, 0.18, 62, 0.48, 130; H57K/D102N, 0.41, 18, 6.2, 130; H57L/D102N, 0.13, 41, 4.9, 230; H57A/D102N/S195A, 0.038, 89, 0.041, 170; S195A, 0.079, 41, 0.057, 45.
I. Human Growth Hormone II

Cunningham and coworkers determined EC50 values for the dimerization of a labeled human growth hormone (hGH) mutant, S257C-AF, by other hGH mutants. S257C-AF was prepared by reacting the thiol group of the Cys at position 257 (the terminal position) with 5-iodoacetamidofluorescein. The data set is reported in Table 34. Also reported are values of the ratio EC50,mut/EC50,wt (r_mut/wt), which gives a comparison of mutant activity to that of the wild-type. The ratio is used to identify residues that are involved in the dimerization and are therefore part of the receptor site. A value of r_mut/wt greater than or equal to 2 is considered to indicate a receptor site residue. The EC50 values for mutants bearing such residues were correlated with the equation:

$$Q_X = L\sigma_l^{\Delta} + A\alpha^{\Delta} + H_1 n_H^{\Delta} + H_2 n_n^{\Delta} + I i^{\Delta} + S_1\upsilon_1^{\Delta} + S_2\upsilon_2^{\Delta} + S_3\upsilon_3^{\Delta} + B^o \tag{70}$$

Figure 33. log EC50,calc vs. log EC50,obs (human growth hormone).

The best regression equation is:
Table 34. Values of EC50 for Human Growth Hormone

X^i pos X^f, EC50, r_mut/wt: F1A, 2.9, 5; I4A, 30, 55; I6A, 1.4, 3; R8A, 1.8, 3; R19A, 0.92, 2; Y111A, 1.0, 2; K115A, 0.84, 2; D116A, 3.1, 6; E118A, 0.96, 2; E119A, 1.1, 2.
$$\log EC_{50,X} = 5.57(\pm 0.985)\upsilon_1^{\Delta} + 0.0850(\pm 0.0779) \tag{71}$$

100R², 80.01; F, 32.01; S_est, 0.222; S°, 0.500; n, 10. A plot of log EC50,calc against log EC50,obs is given in Figure 33. Dimerization seems to be dependent on the difference in steric effect of the first segments of the initial and final side chains in the mutant.

V. THE IMF METHOD AS A BIOACTIVITY MODEL

A. Peptide and Protein Bioactivities
The peptide bioactivity models described in this work include all types of peptide substitution except that at the N atom of the peptide bond. The protein bioactivity models described include those involving substitution at one or two positions and those involving the substitution, at positions that are part of the receptor site, of one or more different residues. The models of peptide and protein bioactivities presented here, combined with those reported previously for amino acids, peptides, and proteins, provide support both for the specific application of the IMF method to amino acids, peptides, and proteins and for its general application to all bioactivities.

B. The Hansch-Fujita Model
It has been shown that if all the necessary pure parameters are included in the composite parameters and if enough of them are used, then a model constructed from composite parameters is completely equivalent to one which uses pure parameters in representing the data. The only advantage in using pure parameters is the ease of interpretation of the results. In its use of lipophilicity parameters such as log P, log k', or π, the Hansch-Fujita (HF) model uses composite parameters. The HF model often requires, in addition to transport parameters, the use of electrical effect, steric effect, and polarizability parameters and occasionally dipole moment parameters as well. These parameters are needed because the composition of a particular transport parameter may not be the same as that of a particular type of bioactivity. This is not surprising. The probability that all biomembranes and all receptor sites will require the same pure parameter composition is extremely small. This conclusion is supported by the review of Seydel and coworkers. The addition of electrical, steric, and polarizability terms adjusts the parameter composition to that of the bioactivity being studied. To illustrate the point let us consider a typical HF correlation equation:

$$\log ba_X = T\tau_X + \rho\sigma_X + A^{*}MR_X + S\upsilon_X + B^o \tag{72}$$
where $ba$ is the bioactivity; $\sigma$ is a composite electrical effect parameter of the Hammett type; $\tau$ is a transport parameter such as log P, π, or log k'; MR is the group molar refractivity; $\upsilon$ is a steric parameter; and $T$, $\rho$, $A^{*}$, $S$, and $B^o$ are coefficients. As was noted above $\sigma$ is given by the expression:

$$\sigma_X = l\sigma_{lX} + d\sigma_{dX} + r\sigma_{eX} + h \tag{73}$$

Equation 3 gives:

$$MR_X = 100(\alpha_X + 0.0103) = 100\alpha_X + 1.03 \tag{74}$$

Equation 8 gives:

$$\upsilon_X = S_1^{*}\upsilon_{1X} + S_2^{*}\upsilon_{2X} + S_3^{*}\upsilon_{3X} + S_0^{*} \tag{75}$$

$\tau$ is given by the equation:

$$\tau_X = L\sigma_{lX} + D\sigma_{dX} + R\sigma_{eX} + A\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + M\mu_X + S_1\upsilon_{1X} + S_2\upsilon_{2X} + S_3\upsilon_{3X} + B^o \tag{76}$$
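Substituting Eqs. 73 through 76 into Eq. 72 (with the coefficient $T$ taken as absorbed into the coefficients of Eq. 76) results in:

$$\log ba_X = (L + \rho l)\sigma_{lX} + (D + \rho d)\sigma_{dX} + (R + \rho r)\sigma_{eX} + (A + 100A^{*})\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + M\mu_X + (S_1 + SS_1^{*})\upsilon_{1X} + (S_2 + SS_2^{*})\upsilon_{2X} + (S_3 + SS_3^{*})\upsilon_{3X} + B^o + \rho h + 1.03A^{*} + SS_0^{*} \tag{77}$$

which may be rewritten as:

$$\log ba_X = L'\sigma_{lX} + D'\sigma_{dX} + R'\sigma_{eX} + A'\alpha_X + H_1 n_{HX} + H_2 n_{nX} + I i_X + M\mu_X + S_1'\upsilon_{1X} + S_2'\upsilon_{2X} + S_3'\upsilon_{3X} + B^{o\prime} \tag{78}$$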
This is a form of the IMF equation. Thus, based on the success of the HF model, bioactivity is a function of the difference in intermolecular forces between initial and final states. This does not mean that transport parameters should not continue to be used in modeling bioactivities. It simply provides an explanation of the manner in which they work. It is vital to recognize that any combination of pure and/or composite parameters which has the correct composition will serve to quantitatively describe a phenomenon. It is not necessary to use transport parameters. Bioactivities can be correlated directly either with the IMF equation or with any convenient combination of pure and composite parameters.
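The algebra behind the passage from the HF equation to the IMF form can be checked numerically. The sketch below represents each composite parameter as a linear combination of pure parameters plus a constant, collapses the HF expression into a single set of pure-parameter coefficients, and confirms that both routes give the same value. All numerical coefficients are invented purely for illustration, the helper names are hypothetical, and the transport coefficient $T$ is kept explicit (setting it to 1 corresponds to treating it as absorbed into the transport-parameter coefficients).

```python
import random

PURE = ["sigma_l", "sigma_d", "sigma_e", "alpha", "n_H", "n_n", "i", "mu",
        "upsilon_1", "upsilon_2", "upsilon_3"]


def linear(coeffs, const):
    """A composite parameter: pure-parameter coefficients plus a constant."""
    return {"coeffs": dict(coeffs), "const": const}


def evaluate(comp, x):
    """Value of a composite parameter for the pure-parameter values x."""
    return sum(c * x[name] for name, c in comp["coeffs"].items()) + comp["const"]


def combine(weighted, const):
    """Collapse sum_k w_k * composite_k + const into one linear form."""
    coeffs = {name: 0.0 for name in PURE}
    total = const
    for weight, comp in weighted:
        for name, c in comp["coeffs"].items():
            coeffs[name] += weight * c
        total += weight * comp["const"]
    return linear(coeffs, total)


# Illustrative (made-up) compositions following the forms of Eqs. 73-76.
sigma = linear({"sigma_l": 1.0, "sigma_d": 0.5, "sigma_e": 0.1}, 0.0)
molar_refractivity = linear({"alpha": 100.0}, 1.03)
upsilon = linear({"upsilon_1": 0.6, "upsilon_2": 0.3, "upsilon_3": 0.1}, 0.0)
tau = linear({name: 0.1 * (k + 1) for k, name in enumerate(PURE)}, 0.2)

# Illustrative HF coefficients: transport, electrical, MR, steric, intercept.
T, RHO, A_STAR, S, B0 = 0.8, 1.2, 0.01, -0.5, 0.3

imf_form = combine([(T, tau), (RHO, sigma), (A_STAR, molar_refractivity), (S, upsilon)], B0)


def hf_value(x):
    """log ba computed directly from the composite (HF-type) parameters."""
    return (T * evaluate(tau, x) + RHO * evaluate(sigma, x)
            + A_STAR * evaluate(molar_refractivity, x) + S * evaluate(upsilon, x) + B0)


if __name__ == "__main__":
    rng = random.Random(0)
    x = {name: rng.uniform(-1.0, 1.0) for name in PURE}
    print(hf_value(x), evaluate(imf_form, x))
```

The collapsed coefficients play the role of the primed coefficients in the rewritten equation; for example, the coefficient of the polarizability parameter is the transport contribution plus 100 times the MR coefficient.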
VI. APPENDIX: STATISTICS REPORTED FOR THE CORRELATIONS

1. 100R², the percent of the variance of the data accounted for by the regression equation.
2. A100R², the 100R² value corrected for the number of independent variables in the correlation equation. The difference between 100R² and A100R² serves as a measure of the quality of the model. The smaller the difference the better the model.
3. F, the value of the F test, which is a measure of the goodness of fit of the model.
4. S_est, the standard error of the estimate.
5. S°, the standard error of the estimate divided by the root mean square of the data. This is also useful as a measure of the goodness of fit of the model.
6. n, the number of data points in the set.
7. r_ij, the zeroth order partial correlation coefficients. They serve as a test for collinearity among the parameters. Values of r_ij are given only for those pairs of parameters which exhibit extensive collinearity.
Statistics 1 through 5 may be used as a measure of the goodness of fit of the model for a given data set; all of these except S_est may be used in comparing the goodness of fit of one data set with that of another.
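Two of these statistics can be recomputed from 100R², the number of data points, and the number of independent variables. The sketch below does so for three of the regression equations reported in this chapter. The formulas are assumptions inferred from the reported numbers, not definitions stated in the text: F is taken as the usual regression F ratio, and the adjusted value is reproduced with the factor (n - 1)/(n - k), where k is the number of independent variables, rather than the more common (n - 1)/(n - k - 1).

```python
def adjusted_100r2(r2_percent, n, k):
    """Adjusted 100R^2, assuming the correction factor (n - 1)/(n - k)."""
    r2 = r2_percent / 100.0
    return 100.0 * (1.0 - (1.0 - r2) * (n - 1) / (n - k))


def f_statistic(r2_percent, n, k):
    """F ratio for a regression with k independent variables and an intercept."""
    r2 = r2_percent / 100.0
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# (equation, 100R^2, n, k, reported adjusted 100R^2, reported F)
REPORTED = [
    ("Eq. 54", 81.14, 12, 2, 79.25, 19.36),
    ("Eq. 56", 88.16, 18, 4, 85.62, 24.20),
    ("Eq. 57", 65.41, 18, 2, 63.25, 14.18),
]

if __name__ == "__main__":
    for name, r2, n, k, adj_reported, f_reported in REPORTED:
        print(f"{name}: adjusted {adjusted_100r2(r2, n, k):.2f} (reported {adj_reported}), "
              f"F {f_statistic(r2, n, k):.2f} (reported {f_reported})")
```

Agreement to within rounding for these three equations supports the assumed formulas, but it does not establish that they are the ones actually used for every correlation in the chapter.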
ABBREVIATIONS

hb        hydrogen bonding
dd        dipole-dipole
di        dipole-induced dipole
ii        induced dipole-induced dipole
ct        charge transfer
Id        ion-dipole
Ii        ion-induced dipole
vdW       van der Waals
IMF       intermolecular force
Ab        bromoacetamino
Cm        carbamoyl
Mg        maleoylglycine
(O)       sulfoxide
Pv        pivaloyl
Cha       cyclohexylalanine
Cpg       cyclopentylglycine
Cbz       benzyloxycarbonyl
Boc       t-butoxycarbonyl
Nle       norleucine
Orn       ornithine
Thg       2-thienylglycine
Pe        pentyl
Pn        phenylene
c         cyclo
Ak        alkyl
Aax       amino acid with side chain X
Bta       butanoic acid
Dbt       3,5-dibromotyrosine
Dpe       deaminopenicillamine
Mep       β-mercapto-β,β-diethylpropionic acid
Mma       α-mercapto-α,α-dimethylacetic acid
Mmp       β-mercapto-β,β-dimethylpropionic acid
Mpa       β-mercaptopropionic acid
Pen       penicillamine
Ba        bromoacetyl
Phe(F5)   pentafluorophenylglycine
Ms        mesyl
Pa        propionylamino
Tg        triglycyl
Chg       cyclohexylglycine
Hse       homoserine
Bzl       benzyl
(O2)      sulfone
Nva       norvaline
Pym       pyridylmethyl
Sta       3-hydroxy-4-amino-6-methylheptanoic acid
Hx        hexyl
Nh        naphthyl
X^        replacement